Big data doesn’t refer to high fives or towering twelves. It refers to lots of data. I mean, really lots, as in exabytes. An exabyte is ten to the eighteenth power bytes, or a billion billion, otherwise known as one quintillion. (If you’re wondering what’s even bigger than exabytes, check out zettabytes and yottabytes.)
The primary goal of handling big data is to capture all the data available in a given computing scenario and then analyze it to find identifiable business, behavior, or other patterns. Apparently, every day, we create 2.5 quintillion bytes of data, much of it unstructured, in everything from blogs, photos, documents, and social media sites, to the data obtained from tasks like purchasing transaction records or the information gleaned from satellites that stream images of the earth.
Much of the data has resulted from increasing Internet usage, plus the connection of billions of devices to the Internet. Every time you add another device to your network, you’re making big data even bigger.
Amazingly, most of the data (as much as 90 percent of it) has been created in the last two years, and the total will continue to grow by leaps and bounds. What’s making it the “in” thing is that tools are becoming available to analyze it beyond what business intelligence tools have been able to do. So it’s a means to better understand customers, develop new products, cut operational costs, gain competitive advantage, analyze DNA, plan transportation systems, and generate important insights about our world and beyond.
When people describe big data, they talk about three factors: volume, variety, and velocity.
- Volume doesn’t refer to a specific quantity of data but rather a volume of data too large for a relational database. It’s necessary to break up the data, analyze subsets, and then regroup the results to produce the output.
- Variety refers to the many different types of data (documents, images, tweets, etc.) that may be pertinent in a given analysis.
- Velocity refers to data that arrives quickly and often must be processed quickly, such as for emergency response systems or stock market transactions. And that very velocity can vary according to where it’s coming from and who is generating it.
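To make the volume point concrete, here is a minimal sketch of the "break up, analyze subsets, regroup" idea in Python. It is an illustration only, not how any particular big data tool works: the purchase records, the store names, and the helper functions split_into_chunks, analyze_chunk, and combine are all hypothetical, and a real system would spread the subsets across many machines rather than loop over them in one process.

```python
from collections import Counter

def split_into_chunks(records, chunk_size):
    """Break the full record list into subsets of at most chunk_size items."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]

def analyze_chunk(chunk):
    """Analyze one subset on its own: count purchases per store."""
    return Counter(store for store, _amount in chunk)

def combine(partial_counts):
    """Regroup the per-subset results into one overall result."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total

# Hypothetical sample data: (store, amount) pairs standing in for a huge data set.
purchases = [
    ("grocer", 12.50), ("bookshop", 30.00),
    ("grocer", 8.25), ("cafe", 4.75),
    ("grocer", 20.00), ("cafe", 3.50),
]

chunks = split_into_chunks(purchases, chunk_size=2)
totals = combine(analyze_chunk(chunk) for chunk in chunks)
print(totals.most_common())
```

Because each subset is analyzed independently, the middle step is the part that could be handed out to separate machines, with only the small per-subset counts sent back to be combined.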
Needless to say, the brave new world of big data is not without concerns, such as security (will the data fall into the wrong hands?), data storage (all that data has to go somewhere), and personalization (such as credit card companies that use your purchase history to adjust your credit limit based on where you’ve shopped).
It’s expected that big data will open the way for new careers. Although conventional titles and standard qualifications don’t yet exist, companies are creating positions for roles such as data scientist. These jobs are for people who tend to possess math or statistics backgrounds, as well as backgrounds in artificial intelligence, natural language processing, or data management. Apparently, you can’t do much with big data without data scientists.
Before you apply for a position as a data scientist, be aware that some pundits are already speculating about its demise because of the possibility that the role will be replaced by tools, automation, and startups that offer what they call “data science as a service.” Don’t give up your day job just yet.
Naomi Karten is a writer and speaker who draws from her background in both psychology and IT. Naomi's recent books are Presentation Skills for Technical Professionals and Changing How You Manage and Communicate Change. Readers have described her newsletter, Perceptions and Realities, as lively, informative, and a breath of fresh air. Naomi is a regular columnist for StickyMinds.com.