How Big a Deal is Big Data in The Cloud?
What I also noticed in the Cloud conversations I was having with technical experts across the industry was that two terms – Big Data and Analytics – often dominated the dialogue.
The fact that these terms seemed indivisible from the wider Cloud conversations gave me my theme: how important are Big Data and Analytics for the Cloud journey? And what must developers focus on to ensure that they deliver Big Data and Analytics benefits that continue to meet the ever-advancing requirements of their Cloud users? Here are my thoughts …
Cloud, Big Data, Analytics and the Four Vs
The first point to make is that Big Data and Analytics can only really exist in their recognised form because of the Cloud: they depend on the availability of vast and multiple data sources – the very thing the Cloud, by its nature, delivers.
Essentially, there was no such thing as what we now call Big Data until Cloud came along, hence Big Data, Analytics and Cloud are natural bedfellows from the word go.
It’s also worth noting that the result of that liaison has been a fruitful one: Big Data has become “Fast Data” – actionable insights derived in seconds by use of advanced analytics from volumes of information that would have been impenetrable in anything like the same timescale ten years ago.
But nothing stands still, and the critical opportunity for Big Data and Analytics developers to continue to contribute to the Cloud journey will be largely decided, in my view, by their ability to extract actionable business insights from the very specific challenges identified by data scientists as the “Four Vs”: Volume, Velocity, Veracity, Variety.
Here’s a quick overview of each, with some opinion from me as to where I believe the key sticking points could lie:
Volume: the scale of data.
The sheer volume of data we produce is mind-blowing. Industry analyst IDC predicts that the digital data we create, capture, replicate and consume will grow from approximately 40 zettabytes of data in 2019 to 175 zettabytes in 2025 (one zettabyte equals one trillion gigabytes).
Almost all that data growth will be in or via the Cloud. Which Big Data and Analytics tools will be able to make sense of it?
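To give those IDC figures some context, the implied annual growth rate can be worked out with a few lines of arithmetic. This is a back-of-the-envelope illustration based on the numbers quoted above, not a figure from the IDC report itself:

```python
# Back-of-the-envelope growth rate implied by the IDC forecast above.
start_zb, end_zb = 40, 175   # zettabytes in 2019 and 2025
years = 2025 - 2019

# Compound annual growth rate: (end / start) ** (1 / years) - 1
cagr = (end_zb / start_zb) ** (1 / years) - 1
print(f"Implied annual growth: {cagr:.1%}")
```

In other words, the forecast implies the world's data growing by well over a quarter every single year – a pace that any Analytics tooling has to keep up with.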
Velocity: the speed of data.
Every 60 seconds, 204 million emails are sent, 216,000 Instagram posts are uploaded and 72 hours of footage are posted to YouTube, carried on a global tide of Internet traffic that reaches 50,000 GB per second.
Data processing developers face significant challenges in keeping up with the dual demands of data volume and data velocity.
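To translate the per-minute figures quoted above into per-second terms – simple arithmetic on those same numbers, purely for illustration:

```python
# Convert the per-minute figures quoted above into per-second rates.
per_minute = {
    "emails": 204_000_000,               # emails sent per minute
    "instagram_posts": 216_000,          # Instagram posts per minute
    "youtube_footage_seconds": 72 * 3600 # 72 hours of video per minute
}

for name, count in per_minute.items():
    print(f"{name}: {count / 60:,.0f} per second")
```

That works out to millions of emails, thousands of posts and over an hour of new video footage arriving every second – which is the velocity problem in a nutshell.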
Veracity: the certainty of data.
“Garbage in, garbage out,” goes the saying – and international research findings have indicated that 84% of global CEOs are concerned about the quality of the data they base their decisions on. The revenue loss caused by poor decision-making is estimated by some industry analysts to be as high as 30%! (Source: KPMG/Forbes)
What are developers doing to build data quality control into their Big Data and Analytics processing routines?
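As one hypothetical illustration of what building quality control into a processing routine might look like, consider a simple validation gate that quarantines records failing basic veracity checks before they reach the analytics stage. The field names and rules here are invented for the example; this is a minimal sketch, not a production pattern:

```python
# A minimal sketch of a data-quality gate: records failing basic checks
# are quarantined rather than fed into downstream analytics.
# The field names and validation rules are invented for illustration.

def is_valid(record: dict) -> bool:
    """Basic veracity checks: required fields present, values plausible."""
    if not record.get("customer_id"):
        return False
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        return False
    return True

def quality_gate(records):
    """Split a batch into clean records and quarantined ones."""
    clean, quarantined = [], []
    for rec in records:
        (clean if is_valid(rec) else quarantined).append(rec)
    return clean, quarantined

batch = [
    {"customer_id": "C1", "amount": 19.99},
    {"customer_id": "", "amount": 5.00},     # missing ID -> quarantine
    {"customer_id": "C3", "amount": -2.50},  # negative amount -> quarantine
]
clean, quarantined = quality_gate(batch)
print(len(clean), len(quarantined))  # 1 clean, 2 quarantined
```

The point is less the specific rules than where they sit: quality checks applied before analysis, so that the 84% of CEOs above have one less reason to distrust their dashboards.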
Variety: the diversity of data.
Some 90% of data generated is unstructured, including sources like tweets, photos, customer and service calls, as well as video, images and documents. Nor should we forget that the IoT generates masses of real-time data from sensors.
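One hedged sketch of what making sense of such mixed data could look like: route each payload to a type-specific handler, and flag – rather than silently drop – anything that has no handler yet. The payload kinds and handler functions here are invented purely for illustration:

```python
# A minimal sketch of routing heterogeneous payloads to type-specific
# handlers, so unstructured sources aren't simply ignored.
# The payload kinds and handlers are invented for illustration.

def handle_text(payload):
    return f"text:{len(payload.split())} words"

def handle_image(payload):
    return f"image:{payload['width']}x{payload['height']}"

def handle_sensor(payload):
    return f"sensor:{payload['reading']}"

HANDLERS = {"text": handle_text, "image": handle_image, "sensor": handle_sensor}

def route(item):
    """Dispatch a (kind, payload) pair to its handler; flag unknown kinds."""
    kind, payload = item
    handler = HANDLERS.get(kind)
    return handler(payload) if handler else f"unhandled:{kind}"

mixed = [
    ("text", "a short tweet"),
    ("image", {"width": 640, "height": 480}),
    ("sensor", {"reading": 21.5}),
    ("video", {}),  # no handler yet -> flagged, not dropped
]
print([route(i) for i in mixed])
```

The design choice worth noting is the explicit "unhandled" bucket: it makes visible exactly which slices of the data an app is failing to exploit, rather than letting them vanish.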
Can developers’ apps really make business sense of all this heterogeneous, unstructured information – or do they simply ignore some of it? Of course, it’s also worth mentioning that Big Data is actually something of a misnomer anyway!
The value in such data lies not in its sheer extent and weight, but in the targeted insights that Analytics can mine from it. In fact, the value we get from Big Data only manifests itself once the outputs have been made small!