The ABC's of Big Data

Big Data is too important a topic to miss out on as an investor. This article will get you up to speed and will provide you with a foundation from which to expand your knowledge. Moreover, I will provide some accompanying data to grasp the potential of Big Data and to see how this trend is related to other technological (mega) trends. As with every new trend, understanding the challenges and threats is key to making sound investment decisions. At this point in time, Big Data may still be somewhat hyped, but this is bound to change. In the long run, the potential of Big Data will be beyond anyone’s imagination.

The basics

The amount of data increases exponentially. Virtually everything we do on the internet these days is recorded, saved and stored. There is tremendous value in understanding and analysing this data for any company.

However, there is so much data that traditional computers simply do not have the computing power to process it all. To understand Big Data, it is important to elaborate on this limitation, which is typically illustrated by the three V’s (see figure 1).

 Figure 1: The Three V's of Big Data

The first ‘V’ refers to the exponentially growing volume of data. Variety is the second ‘V’, which poses a much bigger challenge than volume. The variety of data has evolved from plain text to images, audio, video, locations and sensor data. The sheer number of different data types has made the data increasingly demanding to analyse. Lastly, velocity is playing an increasingly important role. Nowadays, managers demand real-time modelling of changes to sales campaigns, which puts enormous pressure on current systems. These developments will increase the need for companies to constantly renew their computer systems in order to stay competitive.

Now that we have established the essentials of Big Data, it is time to analyse how this trend relates to all the other technological developments. The easiest way to understand this is to visualize Big Data as a large umbrella, under which all these technological trends are placed. In my previous article, I elaborated on one of these trends, the Internet of Things. It is estimated that the IoT will encompass thirty billion connected devices in 2020, which will generate tremendous amounts of data. This data will contain precious information about its users, which will fuel the urge to analyse and visualize everything in order to exploit the full potential of Big Data.

The potential

Big Data has the potential to be extremely valuable for companies. Experts even argue that data is, or should be, one of the most valuable assets of a company, and that data must be seen and handled as a core product instead of a supplement. This can be done by creating a separate supply chain for data. The input to this chain is the raw data, which needs extensive processing with Big Data tools, analytics apps and visualization software. The data steadily gains value as it moves through the supply chain, and the eventual output consists of valuable insights into a company’s processes (see figure 2).

Figure 2: processing Big Data  -  Source: Sogeti

However, I do have to stress that the efficacy of this approach depends completely on the business you are in, as seen in figures 3 and 4. The health care sector is ripe for massive cost reductions, driven by real-time screening of patients and more efficient inventory management. McKinsey estimated that Big Data makes it possible for health-care spending to drop by $300-$450 billion (12%-16% of total baseline in 2011), which McKinsey called a conservative estimate.

Figure 3: Potential of Big Data per sector

Moreover, Big Data can be useful for a cluster of sectors, including retail, industrial and air transportation, as can be seen in figures 3 and 4. Big Data will enable more efficient marketing, better in-store placement of products, personalized offers, a better match between supply and demand and real-time status reports on warehouses.

Figure 4: Forecasted productivity increase by Big Data

Big Data will lower costs for the government, the utility sector and the agricultural sector. Big Data can also enable closer monitoring of markets to identify sales opportunities, help recruit staff, forecast sales, predict wear and maintenance, detect fraud, estimate financial risks, improve existing products and develop innovative new ones. In short, the possibilities are endless. I therefore approach this overload of potential the same way Big Data itself is handled: by analysing and visualizing it (see figures 3, 4 and 5).

Figure 5: Forecasted impact of Big Data

Big Data today

The term Big Data has been around for a few years now and we are getting used to it. However, this trend has not peaked yet. Wearables are gaining popularity but still have a long way to go. Only when the IoT and the capacity to analyse and visualize large data sets improve can Big Data really gain pace. This pace refers to different stages of development, as can be seen in figure 6.

Figure 6: Development of Big Data - Source: Nationaal Big Data Congres 2014

In the world of Big Data, Apache Hadoop (Hadoop) cannot be ignored. Hadoop is an open-source software framework for storing and processing demanding data sets. Open source means, in essence, that the software is free to use and that a global community builds on the framework to develop it further. This seems to work, since key players like Yahoo and Facebook use Hadoop. Hadoop has several interesting modules, of which I would like to elaborate on one: MapReduce (MR). MR is a module for processing and generating large data sets based on a parallel, distributed algorithm that runs many analyses simultaneously, instead of just one at a time. It first categorizes the data into groups (mapping) and then summarizes the groups into a concise result (reducing). This allows large data sets to be processed much faster. By doing so, Hadoop fuels the data revolution and enables the growth of this industry.
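To make the map and reduce steps more concrete, here is a minimal sketch in plain Python. It is not Hadoop's actual API: the sample sales records, the function names and the use of a local thread pool in place of a cluster of machines are all assumptions chosen for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample records: (sector, sale amount).
SALES = [
    ("retail", 120.0),
    ("health", 80.0),
    ("retail", 30.0),
    ("transport", 50.0),
    ("health", 20.0),
]

def split_into_chunks(records, n_chunks):
    """Split the input so that each worker gets its own slice."""
    return [records[i::n_chunks] for i in range(n_chunks)]

def map_chunk(chunk):
    """Map step: emit one (key, value) pair per record."""
    return [(sector, amount) for sector, amount in chunk]

def shuffle(mapped_chunks):
    """Group all emitted values by key, across all chunks."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_groups(groups):
    """Reduce step: summarize each group into a single total."""
    return {key: sum(values) for key, values in groups.items()}

def map_reduce(records, n_workers=2):
    """Run the map step on several chunks at once, then reduce."""
    chunks = split_into_chunks(records, n_workers)
    # Threads stand in for the separate machines of a real cluster.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        mapped = list(pool.map(map_chunk, chunks))
    return reduce_groups(shuffle(mapped))

if __name__ == "__main__":
    print(map_reduce(SALES))
```

Each chunk is mapped independently of the others, which is what makes it possible to spread the work over many machines. Only the final grouping and summarizing needs to see the combined output.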

Lastly, I want to shed some light on the hype around Big Data. Gartner developed a tool called the “Gartner Hype Cycle” to assess which stage of the hype cycle technological trends are in (figure 7 shows the 2014 edition). In the first stage of development, only the ‘geeks’ paid attention to Big Data. Soon after, the public joined and Big Data got some serious publicity. As always, this is starting to fade. As figure 7 indicates, we are now leaving the peak of inflated expectations and entering the trough of disillusionment. I strongly believe that Big Data will endure this phase, after which the full benefits can be reaped. The obvious question is when the next phases will be reached. Researchers estimate that this will take at least five to ten years.

Figure 7: The Gartner Hype Cycle 2014  -  Source: Gartner

Challenges

As with every trend, an investor should also analyse the downside potential.

In a recent report, Gartner predicted that 80% of all Fortune 500 companies will be unable to use Big Data as a competitive advantage. Big Data is perceived as something essential, but whether it really is depends entirely on the business. Big Data is costly because it requires constant reinvestment to upgrade systems. In short, for some companies Big Data may prove a total waste of scarce resources. Moreover, many managers see Big Data as a new strategy. While this may sometimes be the case, I’d rather regard it as a complement to an existing strategy.

Some ethical issues have to be taken into consideration too. When at some point in time all this data is flowing through the cloud, privacy becomes virtually nonexistent. Governments will be able to track citizens so closely that criminals can be arrested before they even commit a crime. That is why privacy issues are to be monitored closely. Security, in turn, is a persistent problem of the cloud. Businesses worry that valuable information will leak out due to a security breach, and they are therefore reluctant to go all the way in this trend. Both issues need serious attention in order for Big Data to reach the next step.

As the data supply keeps growing, the pressure on hardware and software will keep increasing. Software engineers will need to design programs suited to the scale of Big Data. High-grade algorithms and data visualization software need to be developed in order to harness its full potential.

Cutting-edge hardware is needed to support and process all the data. At present, this limits the scope of Big Data. However, multiple factors indicate that computing power is still increasing (see figure 8). It is expected that in 5 to 10 years, ‘traditional’ computers will match the supercomputers of today. These computers will then possess sufficient capacity to process, analyse and visualize the data, thus lifting the current limits on the scope of Big Data.

Figure 8: The development of computing power  -  Source: AMD

Conclusion

This article aims to explain some of the basic concepts of Big Data, and is intended to serve as a basis for further research on this topic. Big Data poses many questions. Not only does the amount of data increase exponentially (volume), but its diversity is also much larger (variety) and the speed at which it must be processed is enormous (velocity). This ocean of data does present benefits, as the potential is endless and many sectors will be able to realize immense cost savings. Frameworks like Hadoop enable the world to fully exploit Big Data and to develop it. However, Big Data is not quite there yet. Challenges like security, ethical issues, the growing data supply and the hype around Big Data all need serious attention. Still, I believe that with proper policy making and persistent development, Big Data can have far-reaching consequences for our society. So, to conclude with a famous quote by Peter Sondergaard of Gartner, "Information is the oil of the 21st century, and analytics is the combustion engine".




Karsten Niemeijer

Karsten is currently studying finance and is a member of Risk. He organised the Risk Finance Symposium and invests on his own account. He specializes in equities, technology and emerging markets.

This article provides opinions and information, but does not contain recommendations or personal investment advice to any specific person for any particular purpose. Do your own research or obtain suitable personal advice. You are responsible for your own investment decisions.
