• Nomtha Ngumbela - Research Analyst at Effectus

A few words on a big subject: Big Data

Big Data is a massive and complex subject. In this article we touch on a few key aspects of understanding what it is, where it comes from and one of the ways in which we are investing in it.


The term ‘Big Data’ has been at the center of the digital revolution conversation, but its meaning and application are often misunderstood. At its most basic, Big Data refers to large volumes of structured and unstructured data from traditional and non-traditional sources, which are gathered, processed and utilized to drive insights that in turn enhance the decision-making process and strengthen the user’s short- to long-term strategy. According to IDC, by 2021 at least 50% of global GDP will be digitized, and growth in every industry will be driven by digitally enhanced offerings, operations, and relationships.

Traditional data is structured: it is clearly defined and relatively easy to analyse. Examples include direct sales and supply chain data, customers’ banking records, or something as simple as a client’s name or height. This information is extracted from traditional sources such as online forms, company spreadsheets and medical records.

However, the expansion of the internet and the interconnectedness of the new global community have given rise to non-traditional or alternative sources of data. These include trails of usable information left by internet users on social media platforms, data from wireless sensors and machine-to-machine interactions, and company exhaust (data footprints). The data produced is often unstructured because it grows organically and often uncontrollably, as with customers’ online reviews of products and services, text messages, video imaging, or machine-generated data from heat sensors that require no human intervention.

Four Factors of Big Data


The International Data Corporation predicts that by the end of 2020 there will be around 40 zettabytes of digital data, a 300-fold increase since 2005. What is a zettabyte? Well, to put it into perspective, a zettabyte is 1,000 exabytes and is often compared to all the grains of sand on all the world’s beaches!
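To make the scale concrete, the short sketch below works through the arithmetic behind the figures quoted above. It assumes decimal (SI) units, and the 2005 baseline it prints is simply derived from the 40-zettabyte forecast and the 300-fold growth figure; it is not a separately sourced number, and the variable names are illustrative only.

```python
# Decimal (SI) storage units, expressed in bytes.
EXABYTE = 10**18
ZETTABYTE = 10**21

forecast_2020_zb = 40   # forecast quoted in the article
growth_factor = 300     # "300 times since 2005", as quoted in the article

# How many exabytes make up one zettabyte.
exabytes_per_zettabyte = ZETTABYTE // EXABYTE

# Implied 2005 volume: derived from the two quoted figures (an assumption,
# not an independently reported value).
implied_2005_zb = forecast_2020_zb / growth_factor
implied_2005_eb = implied_2005_zb * exabytes_per_zettabyte

print(f"1 ZB = {exabytes_per_zettabyte} EB")
print(f"Implied 2005 volume: about {implied_2005_eb:.0f} EB")
```

In other words, if both quoted figures hold, the world's digital data in 2005 would have amounted to a little over a tenth of a single zettabyte.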

Data Demand 2019-2025

Most of this new data will not be produced by people; rather, it will be a collaborative effort between data networks, machines, and businesses.

For example, think of sensors: health and fitness smartwatches use them to monitor heart rate and track GPS position during exercise. The information is then sent to systems where the data is mined to produce coherent information about the intensity of a workout or the average pace at which the individual performed.
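As a rough illustration of how raw sensor readings become a workout summary, here is a minimal sketch. The `Sample` record, the `summarize` function and the readings themselves are hypothetical and invented for this example; real devices use their own formats and far more sophisticated processing.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One hypothetical reading sent by the watch."""
    elapsed_s: float       # seconds since the workout started
    distance_km: float     # cumulative distance derived from GPS
    heart_rate_bpm: int    # heart rate at the moment of the reading


def summarize(samples):
    """Reduce a list of readings to average pace and average heart rate."""
    if not samples:
        raise ValueError("no samples recorded")
    total_min = samples[-1].elapsed_s / 60
    total_km = samples[-1].distance_km
    if total_km <= 0:
        raise ValueError("no distance covered")
    avg_hr = sum(s.heart_rate_bpm for s in samples) / len(samples)
    return {
        "distance_km": total_km,
        "pace_min_per_km": total_min / total_km,
        "avg_heart_rate_bpm": avg_hr,
    }


# Made-up readings taken every ten minutes of a 30-minute run.
workout = [
    Sample(0, 0.0, 95),
    Sample(600, 2.0, 150),
    Sample(1200, 4.0, 160),
    Sample(1800, 6.0, 155),
]

summary = summarize(workout)
print(summary)
```

Four readings are enough to show the idea; in practice a watch may record a reading every second, which is how a single workout turns into thousands of data points.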

On the other hand, as economies begin to reopen and consumers head to retail outlets and malls, sensor-enabled technologies such as automatic doors, industrial-sized sanitizer dispensers and check-out conveyors will become essential tools for ensuring that social distancing is maintained and human-to-human transmission is contained. Additionally, these small consumer safety systems will provide businesses with invaluable data on spending and visitation patterns. These are just two use cases of sensors and the vast amounts of data they produce, which foster greater convenience for consumers and powerful insights for firms.

With millions, if not billions, of data points being generated daily, seasonally, or even on an event-triggered basis, how do