What is Big Data?

Big Data has been around for some time, but there is still no exact definition of what it really is. It continues to evolve as the main driver of digital transformation efforts that include data science, artificial intelligence, and the Internet of Things (IoT).

Big Data began with the constant bombardment of data that has been generated since day one of the digital era. Reasons behind this exponential explosion include the rise of computers, the World Wide Web, and technology that captures information from the real world and converts it to digital data.

Sources of Big Data

Data is generated whenever a person goes online, uses a smartphone equipped with GPS, updates friends through chat or social media platforms, or shops online. Users leave digital footprints with almost every digital transaction.

There is also a rapid increase in the amount of data generated by machines. Data is generated and shared when home devices communicate with each other or with their home servers. Industrial machinery in plants and factories also gathers and transmits data through its sensors. Streets may also see self-driving cars that transmit real-time data.

How Does Big Data Work?

Big Data operates on the concept that the more information is known about something, the better the insights and predictions that can be made about what will happen in the future. Analyzing these data points makes it possible to establish relationships that help people make better decisions.

This process is done by building models based on the data that is collected, running simulations, tweaking the values of data points, and assessing how these changes affect the results. Advances in analytics technology allow the process to be automated. Technology can run millions of simulations, tweaking all possible variables, until it provides a possible solution to, or at least an insight into, the problem in question.
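The loop described above (build a model from collected data, run simulations, tweak a value, compare results) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the pricing scenario, the function names, and the numbers are all hypothetical and do not come from any real system.

```python
# Toy sketch: every name and number here is hypothetical, for illustration only.

def fit_demand_model(observations):
    """Build a model from collected (price, demand) pairs:
    demand = intercept + slope * price, fitted by least squares."""
    n = len(observations)
    mean_price = sum(p for p, _ in observations) / n
    mean_demand = sum(d for _, d in observations) / n
    cov = sum((p - mean_price) * (d - mean_demand) for p, d in observations)
    var = sum((p - mean_price) ** 2 for p, _ in observations)
    slope = cov / var
    intercept = mean_demand - slope * mean_price
    return intercept, slope


def simulate_profit(price, intercept, slope, unit_cost):
    """Run one simulation: predicted profit at a given price."""
    demand = max(0.0, intercept + slope * price)
    return (price - unit_cost) * demand


def sweep_prices(observations, unit_cost, candidate_prices):
    """Tweak one variable (price) across many runs and keep the best result."""
    intercept, slope = fit_demand_model(observations)
    results = {
        price: simulate_profit(price, intercept, slope, unit_cost)
        for price in candidate_prices
    }
    best_price = max(results, key=results.get)
    return best_price, results


# Made-up history of (price, units sold) pairs.
observations = [(10, 100), (12, 90), (14, 80), (16, 70)]
best_price, results = sweep_prices(
    observations, unit_cost=6, candidate_prices=range(8, 25)
)
print(best_price, results[best_price])
```

A real analytics system would vary many variables at once and run far more simulations, but the shape of the loop is the same.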

Data in Unstructured Form

Most data that is generated is unstructured, so it cannot be reduced to simple tables with rows and columns. Much of this data is in the form of videos and pictures, such as photographs uploaded to social media platforms or satellite images. To make sense of such data, Big Data utilizes machine learning and artificial intelligence. Machines can be taught to identify what the data represents, which enables them to detect patterns in it.

How is Data Analyzed?

Data can be analyzed through different processes. One of these is stream processing, which works in real time on a continuous stream of data composed of individual units. Real-time processors are characterized by in-memory computing, meaning the data is held in the memory of the cluster.
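As a rough sketch of the idea, the Python example below handles a stream one unit at a time and keeps its working state in memory. It assumes a single process rather than a cluster, and the event source `click_stream` and its page names are hypothetical stand-ins for a live feed.

```python
from collections import Counter


def click_stream():
    """Hypothetical stand-in for a live source: yields one event at a time."""
    for page in ["home", "cart", "home", "checkout", "home"]:
        yield {"page": page}


def count_pages(events):
    """Process each event as it arrives, keeping the running state in memory.

    Yields a snapshot of the page counts after every event, so a result is
    available immediately instead of after the whole data set is collected.
    """
    counts = Counter()
    for event in events:
        counts[event["page"]] += 1
        yield dict(counts)


snapshots = list(count_pages(click_stream()))
print(snapshots[-1])
```

The generator never stores the events themselves, only the running counts, which is what lets this style of processing keep up with a stream that has no end.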

Central to stream processing is streaming analytics: the capacity to continuously compute statistical analytics on data as it moves through the stream. It permits monitoring, management, and real-time analytics of data streamed live. Software like Apache Kafka provides efficient data streaming for applications and messaging systems.

Streaming analytics consists of knowing about and acting upon events that happen to a business at any given time. Because streaming analytics runs continuously, companies must act quickly on the analytics data before it loses its value. This data can come from mobile phones, the Internet of Things (IoT), market data, and Web clickstreams. When data loses its value, the result can be added costs, including a reduced competitive edge, an inability to make sound decisions, decreased productivity, potential legal action, reputational damage, business risks, and operational and administrative concerns.
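One way to picture statistics being computed while the data moves is a running mean and variance that update with each new value, without storing the stream. The sketch below uses Welford's online algorithm in plain Python. The class name `RunningStats` and the sample readings are hypothetical, and the code does not use Kafka or any other streaming platform.

```python
class RunningStats:
    """Running mean and sample variance via Welford's online algorithm."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        """Fold one new value from the stream into the statistics."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        """Sample variance of the values seen so far (0.0 with fewer than two)."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


stats = RunningStats()
# Made-up sensor readings arriving one at a time.
for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(reading)
    # The statistics are current after every single reading.

print(stats.count, stats.mean, stats.variance)
```

Because each update needs only the previous count, mean, and sum of squared deviations, the statistics can be read at any moment, which matches the need to act on streamed data before it loses its value.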


Big Data is constantly evolving as a field in itself. A basic understanding of what it is can help you use it more effectively in your everyday life, wherever it may apply.


By: Kevin Faber