IT Chronicles (2019)

Big data has grown tremendously in recent years, reaching an unprecedented position: data now attracts more attention and is used in more ways than traditional tabular or structured data ever was. However, this growth is also creating unprecedented new challenges, including epistemic ones.

Research on Big data has identified challenges that are not only technological but also epistemological, related, for instance, to establishing theoretical and conceptual frameworks for scaling inference and machine learning algorithms. The sampling paradigm has also changed under Big data: more data does not inherently remove sampling bias, and sheer volume may hide sampling errors of different types [1]. Big data is also biased toward analyzing things that are common, but often falls short when analyzing things that are less common, as Marcus and Davis wrote. This leads to overlooking rare yet important entities or items, per the expression, failing to find the sought-after needle in the haystack.
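The point that more data does not remove sampling bias can be illustrated with a small simulation (a hypothetical sketch; the population rate and the undercoverage factor are invented for illustration). If the collection process systematically under-samples a rare class, enlarging the sample only tightens the estimate around the wrong value; it never corrects it.

```python
import random

random.seed(0)

# Hypothetical population: 1% of items belong to the rare class we care about.
TRUE_RARE_RATE = 0.01

def biased_sample(n, rare_rate, undercoverage=0.1):
    """Draw n items where the collection process picks up rare items
    only 10% as often as it should -- a simple model of sampling bias."""
    return [1 if random.random() < rare_rate * undercoverage else 0
            for _ in range(n)]

for n in (1_000, 1_000_000):
    s = biased_sample(n, TRUE_RARE_RATE)
    estimate = sum(s) / n
    # The estimate converges to ~0.001 (the biased rate), not to 0.01:
    # a thousand times more data did not reduce the bias.
    print(f"n={n:>9,}  estimated rare rate = {estimate:.4f}")
```

Increasing n from one thousand to one million shrinks the random error of the estimate, but the systematic gap between the estimated rate and the true 1% remains untouched, which is exactly the distinction between sampling error and sampling bias.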

There are concerns that these epistemic challenges, if not addressed, may slow progress in innovation and delay the development of future Big data applications.

“The trouble is, we don’t have a unified, conceptual framework for addressing questions of data complexity…Big data without a “big theory” to go with it loses much of its usefulness, potentially generating new unintended consequences.” Geoffrey West (2013)

An urgent and persistent need for more people

While there is a need for more concepts to address big data's epistemic challenges, there is also an urgent need for a new breed of Big-data-savvy professionals to catch up with the technology, which is taking major leaps ahead [2]. The 2011 McKinsey Global Institute report predicted a 50 to 60 percent gap between the supply of and demand for people with deep analytical talent. In the USA alone, the report estimated a shortage of 1.5 million people by 2018 to help turn data into insights. According to the 2015 Canadian Information and Communications Technology Council (ICTC) report titled “Big Data & the Intelligence Economy,” the supply of data scientists and related professionals is also well below today's industry demand [3]. Similar skills gaps have been reported in Europe, where the shortage of skilled big data professionals is affecting the labor market [4, 5].

This new e-book, “Subtle Challenges of Big Data,” sets out to address big data's challenges, and its epistemic challenges in particular. I believe that we need to tackle big data's epistemic challenges alongside its technological ones, in both the public and private sectors, and to close the gaps in both skills and concepts so as not to be caught in the middle of the rogue waves of big data.

Figure: Trend of data engineers in the USA, based on information posted on LinkedIn. Source: Stitch, Inc. (2017), The State of Data Engineering [6].

This eBook focuses on the epistemic challenges of Big Data.

It also highlights the importance of the interplay between data and theory, and between a priori and a posteriori knowledge in analytics, in light of cognitive analytics.


[1] Fan, J., Han, F., & Liu, H. (2014). Challenges of Big Data Analysis. National Science Review, 1(2), 293–314.

[2] Miller, S., & Hughes, D. (2017). Burning Glass Technologies.

[3] Canadian Information and Communications Technology Council (ICTC) (2015). Big Data & the Intelligence Economy.

[4] Smith, E. Big Data: Labour Market. Blog post, Social Sciences.

[5] European Data Forum (2015). A recap from the BDE perspective.

[6] Stitch, Inc. (2017). The State of Data Engineering.

OperAI develops IoT devices with embedded math and AI solutions to speed up and streamline operational processes at the edge of the cloud.