Trends in Big Data Technologies and Analytics — Opening Statement
CUTTER BUSINESS TECHNOLOGY JOURNAL VOL. 30, NO. 10/11
Cloud and fog, swarm and bees, chef and colors. Are these the latest trends in big data? Really? When we survey the multiplicity of acronyms encompassing big data — AI, ML, IoT, SaaS, IaaS, and even AaaS — the mind boggles in a way that puts Brownian motion to shame. And when we couple this with the subjectivity of the meaning of big data, we are left wondering where to start when surveying big data’s latest trends. Regardless, we must discuss the array of new technologies, processes, frameworks, and applications in big data to fully understand the industry’s trajectory, evaluate its new developments, and ensure ongoing value to business.
Big data itself is a trend. Numerous businesses with vast amounts of data are questioning how they can gain value from big data. The volume of data easily extends into petabytes; its velocity is high as humans and machines compete to generate large amounts of data within the shortest time; and its variety is wide, comprising structured and unstructured text, graphics, audio, video, and data generated by machine sensors. Furthermore, this challenging mass of data loses value if its meaning is not discovered and acted upon quickly.
These challenges — from a business perspective — are further exacerbated by the term “big data” itself, which means different things to different people depending on perspective. For some, big data is a gateway to new opportunities; for others, it’s a means to manage risks; for still others, big data is a tool to improve business sustainability. Often, big data is associated with two keywords: analytics and technologies. Analytics comprises a selection of statistical techniques that yield descriptive, predictive, and prescriptive results, while trends in technologies comprise devices (especially Internet of Things [IoT] devices at both the consumer and the industrial level) and storage mechanisms (notably the cloud and its variants). Spanning both analytics and technologies are the machine learning (ML) algorithms that dive deep into the oceans of data to pull out the pearls of insight that can provide immense value to decision makers. But how can such analytics and their accompanying technologies help make sense of the immensity of big data? What are the challenges and risks associated with such initiatives? This issue of Cutter Business Technology Journal explores various angles on big data with a focus on the trends in predictive analytics, ML, IoT, and the cloud.
Strategic Use of Big Data Technologies and Analytics
Predictive analytics is the application of various statistical techniques to big data to make predictions about the future. The study of historical and transactional data helps identify patterns that can predict risks and opportunities. The evolution of predictive functions closely accompanies the corresponding technical advances that enable cheap storage of vast amounts of structured and unstructured data and its fast processing. Various domains, including healthcare, education, airlines, fraud detection, and weather forecasting, make use of predictive analytics. Each application is based on a probability score that can be incorporated into the decision-making process.
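To make the idea of a probability score concrete, here is a minimal sketch in Python. It assumes a logistic regression model trained by simple gradient descent, which is only one of many possible techniques; the data set, the feature names (late payments, account age), and the 0.5 decision threshold are all hypothetical and chosen purely for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Learn weights from historical records with stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the linear term
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def score(weights, bias, x):
    """Return the predicted probability of the outcome for one new record."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical history: (late payments, account age in years) -> defaulted (1) or not (0)
history = [(0, 5), (1, 4), (0, 3), (4, 1), (5, 2), (3, 1)]
outcomes = [0, 0, 0, 1, 1, 1]

weights, bias = fit_logistic(history, outcomes)

# Score a new customer and feed the probability into a simple decision rule.
risk = score(weights, bias, (4, 2))
decision = "review manually" if risk >= 0.5 else "approve"
print(f"risk score: {risk:.2f} -> {decision}")
```

The point of the sketch is the last step: the model produces a number between 0 and 1, and the business chooses how that number enters the decision, for example through a threshold it can tune to its own appetite for risk.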
Machine learning is closely linked to predictive analytics and uses computing capabilities to “teach and learn.” Instead of simply following a static algorithm, computing algorithms “come alive” by modifying and enhancing logic based on the identification of complex and dynamically changing patterns. In essence, machine learning is one route to providing computers with artificial intelligence (AI). While AI is challenging, advances in big data technologies make ML relatively easy to execute — making it possible for businesses to incorporate such learning in their decision-making processes. Machine learning enables the development of self-learning algorithms that transcend fixed, static programming. The application of ML in business complements the use of predictive analytics (e.g., by helping identify sales and marketing priorities based on learnings from a dynamically changing business environment, by detecting network intrusions based on learnings from existing and ongoing intrusions, or by optimizing supply chains based on continuously changing parameters).
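The contrast between a fixed rule and logic that adjusts itself to changing patterns can be sketched in a few lines of Python. The example below is far simpler than a production ML model: it assumes a detector that keeps a running mean and variance of what it has seen (using Welford's incremental update) and flags readings that fall far outside that range. The class name, the traffic readings, and the multiplier k=3 are all hypothetical.

```python
import math


class AdaptiveThreshold:
    """Flags readings far from the running mean, which is updated on every reading."""

    def __init__(self, k: float = 3.0):
        self.k = k        # how many standard deviations count as unusual
        self.n = 0        # number of readings seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Fold one new reading into the running statistics (Welford's method)."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def is_anomaly(self, x: float) -> bool:
        """Compare a reading against what has been learned so far."""
        if self.n < 2:
            return False  # not enough history to judge
        std = math.sqrt(self.m2 / (self.n - 1))
        return abs(x - self.mean) > self.k * std


detector = AdaptiveThreshold()

# Hypothetical requests-per-second readings during normal operation.
for reading in [100, 102, 98, 101, 99, 103, 97, 100]:
    detector.update(reading)
flagged_before = detector.is_anomaly(250)

# The business environment changes: higher traffic becomes part of normal operation.
for reading in [250, 248, 252, 251, 249, 250, 247, 253]:
    detector.update(reading)
flagged_after = detector.is_anomaly(250)

print(flagged_before, flagged_after)
```

A static rule such as "flag anything above 200" would keep raising alarms after the shift. Here the same reading of 250 is flagged before the detector has seen the new traffic level and accepted afterward, because the logic has been updated by the data rather than by a programmer.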
Then we have the Internet of Things, which is rapidly emerging as a transformational paradigm in which several entities (“things”) have the capabilities to sense or control — and to communicate and interact over the Internet — to achieve an objective. Many IoT-based products and services are now available and being deployed in manufacturing, retail, healthcare, smart cities, and more. The potential of the IoT is both transformative and disruptive. It can radically transform business processes, government policies, and individual behavior. For instance, a wearable device like the Fitbit (by sensing certain parameters of interest, analyzing them, and integrating the results with other apps) can change a person’s behavior — making him or her climb more steps than usual or adopt other wellness practices. The mere deployment of IoT-based devices, however, would have only limited value unless both the business and the user benefit by deriving improved insights from the massive volume/variety of data from the IoT. Big data technologies (e.g., Hadoop/HDFS, Spark, NoSQL) help manage IoT data. To gain value from the huge amount of data an IoT application generates, we need to turn that data into meaningful and actionable insights through appropriate data analytics.
Finally, the cloud represents shared resources and services (e.g., infrastructures, platforms, software, and analytics). Cloud computing enables users to connect to a vast network of computing resources, data, and servers that reside somewhere else, usually on the Internet, rather than locally. The actual execution of the applications and the analytics also occurs on the cloud. Thus, cloud computing obviates the need to install software and analytical applications locally. As a result, computing becomes a utility, wherein analytical applications are available on demand. The distributed nature of this approach to computing allows for numerous opportunities. For example, capacity planning is easily sidestepped since big data analytics can be offered as a service on the cloud, enabling businesses to plug it into their decision-making processes. Plus, the cloud enables far better collaboration between systems than locally hosted software and data allow.
In This Issue
For this issue, we were fortunate to have received a plethora of contributions from experts — both researchers and practitioners — who not only highlight the trends in the vast and ever-expanding field of big data but also demonstrate the applicability of their thoughts through techniques, experiments, and practical discussions. I firmly believe that a careful read of this issue will open your eyes to the many possibilities and challenges that exist in the domain of big data.
We begin this issue with an article by Greg Smith, Michael Papadopoulos, Andreas Macek, and Andrea Solda, who offer some insightful thoughts on how machine learning can help extract value from big data. The authors begin their discussion by “illustrating the limitations of current methods and human intellect across the 4 Vs (volume, velocity, variety, and veracity)” and the barriers that can block the extraction of the 5th V (value). Their article further highlights some excellent ML use cases at cutting-edge companies.
Next, Santhosh Ravindran and Fiona Nah explore utilizing prescriptive analytics to enhance business processes, focusing on ML algorithms. They highlight how “the application of prescriptive analytics in business operations can help to not only optimize production processes and automate workflows for business processes, but also analyze, predict, and position products.” While clarifying how “the foresight offered by prescriptive analytics enables organizations to make major decisions in a short time frame with greater accuracy,” Ravindran and Nah rightfully direct our attention to the importance of embedding such analytics carefully and iteratively within business processes.
In our third article, Cutter Fellow Vince Kellen sheds light on the significant impact of big data in his domain of expertise: education. He highlights the “multiple roles of higher education in the evolution of big data, including creation of large data sets in research; predictions of data growth; techniques for storing, managing, curating, compressing, and computing with big data; engineering of the hardware underlying big data systems; and analysis of big data using ML, neuromorphic computing, and AI.” Kellen then examines how universities can use big data to improve teaching and learning and describes the challenges involved. He concludes by offering suggestions for strategy development as universities incrementally apply big data to their core enterprise, education.
Next, Andrew Guitarte describes the paradigm of big data analysis and how it “shifts from manual to automated, dependent to autonomous, isolated to context-aware, product-driven to needs-based, batch to real-time, and static to streaming.” This paradigm shift is especially important in the context of automated wealth advisory services, where the organization of otherwise unstructured data is vital. His article takes us into the realms of business capability architecture (BCA) and presents a cognitive and heuristics-based emergent financial management (aka CHEF) tool that can be used effectively in the emergent decision-making processes of bank employees, shareholders, and their customers.
In our fifth article, Cole Lyman and I touch upon a crucial challenge in the big data space: identifying the right questions to ask of it. As data continues to explode, not only are businesses struggling to find answers to business questions, but they often cannot even determine what questions to ask of their data. Thus, we discuss a practical experiment on classifying genetic data using a colored de Bruijn graph and show the application of this technique in the business world.
Next, John Collins and Sunita Lodwig focus on the challenges of security in the Agile deployment of big data applications in the cloud. Security issues can make or break the deployment of otherwise complete solutions in the cloud. The authors begin by introducing the topic of “microsegmentation” — which allows “public cloud-based infrastructure as a service (IaaS) providers to offer software-centric or software-only solutions.” The value of this discussion to business lies in the opportunity to quickly deploy secure models for
Once we start digging deeper into the mines of big data, the challenge of making sense out of unstructured data (e.g., text, graphics, audio, and video) comes to the fore. Clustering large amounts of unstructured data into relatively smaller chunks based on some similarity is at the crux of unstructured data analytics. Here’s where “swarm intelligence” can play a role — an application that drives neural networks for clustering unstructured data. In our seventh contribution, Tad Gonsalves and Yasuaki Nishimoto offer an excellent example of swarm intelligence via a model that learns to classify Web documents. We can easily apply this same algorithm to business, medicine, defense, and supply chains, to name a few other areas.
Next up comes an article by Matthew Ganis and Frank Coloccia, who eloquently explain that analytics is not a perfect science. Indeed, the authors judiciously argue that analytics is more an art than a science. They demonstrate their analytical approach with a direct and practical example of using Twitter data in real time to understand the sentiments (positive or negative) of trade show participants and to gain insights from them.
Finally, Giti Javidi, Ehsan Sheybani, and Lila Rajabion present a convincing argument for redistributing the cloud. They suggest that the cost and time associated with transmitting data to the cloud, processing it there, and returning the results to the IoT devices are critical factors. Fog computing, as the authors explain, both enhances and complements the cloud by bringing the processing closer to a cluster of IoT devices, resulting in faster analytics.
I am personally delighted to have had the opportunity to fulfill the role of Guest Editor for this issue of Cutter Business Technology Journal. I have learned a great deal from this endeavor, and I’m sure reading this collection of articles will enhance your knowledge of big data, too.