Shine the light on your dark data

August 15, 2016

What does this blog post cover?

  1. What is dark data?
  2. Not another BI buzzword: read about a business scenario in which dark data was important
  3. What does dark data have to do with my analytics strategy?

What is dark data?

With today's powerful data discovery tools, companies have more options than ever with regard to the complexity, amount, and origin of the data they analyze. Indeed, it seems that many companies just can't get enough. Companies searching for more data to analyze have two options: expand data discovery into the world of external data, or search for insights hidden in data that already exists within the company but is not currently being tapped for analysis. That untapped internal data is known as dark data.

Many are excited by the idea of big data – it’s new! It’s exciting! It’s sexy! But here’s why you should instead first focus on bolstering your current BI strategy with the dark data that exists within your company.

For the record, TARGIT Decision Suite supports integrating and combining both data from your data warehouse and external data from almost any data source imaginable. If you want external data, you got it. But first we want to be sure you’re placing enough value on the useful data your company is producing.

Not another buzzword

Let’s say you’re a manufacturing company that is humming along with your business processes. You have a solid BI and analytics strategy set in place that delivers the daily reports and analyses you need to measure, monitor, and make decisions. But then something happens. A quality issue has suddenly put you at risk of losing a valuable customer. Your existing analysis flags the problem: One of your production lines is less effective than the others. But your analysis doesn’t give you the information you need to figure out why.

It’s easier to figure out how to harness the unused data you already have than to dive into the world of big data, 99.9 percent of which is irrelevant to your business in the first place. That’s not to say harnessing internal data doesn’t involve a little trial and error and experimentation. That’s the BI lifecycle. In particular, the practice of sandbox analytics within the BI lifecycle is designed to help bring dark data into the light.


With sandbox analytics, a small group of BI users experiment with potentially useful data. If that data does, in fact, prove valuable, only then is it distributed for greater use throughout the organization. The ability to play with big data sets and analyze them on top of what’s already in the data warehouse encourages employees to think strategically without the need to pull in IT. This is bimodal BI.

Let’s go back to the example of that manufacturing company that doesn’t have insight into its quality issue. There are a number of possible reasons for a sudden dip in the quality of the product reaching the customer. Perhaps it’s a worker satisfaction issue that is causing an increase in mistakes. Could it be a shift time issue? Would this have to do with the hours of the shift itself or the manager of that shift? Can it be narrowed down by employee? What about the supply chain?

With these hunches in mind, it’s time to dig into existing data to see what, if anything, supports these hypotheses. Let’s go with the inkling that a drop in quality might have to do with shifts. The current dashboard that displays employee shift data only includes the hours per week that each employee has worked, and doesn’t tell you which shift those hours correspond to. The company’s HR system tracks this, but that type of data isn’t currently set up in an existing data model for analysis. Not to worry.

A comprehensive data discovery tool allows users to pull in data that isn't already available for analysis and mash it up with the data already being used in current reports and analyses. Mashing up the times each employee on this particular production line clocks in and out each day with the data already used to analyze shifts across the company shows that these employees are alternating between day and evening shifts much more frequently than employees on other teams.
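The mash-up described above can be sketched outside of any particular BI tool. The Python below is only an illustration under stated assumptions: the record layout, the employee and team names, the sample clock-in times, and the 14:00 cutoff between day and evening shifts are all hypothetical, and none of it reflects how TARGIT Decision Suite works internally. It labels each clock-in as a day or evening shift, counts how often each employee's shift type changes from one clock-in to the next, and averages that count per team.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical HR clock-in records: (employee_id, team, clock-in timestamp).
# All names and values here are made up for illustration.
CLOCK_INS = [
    ("e1", "line_a", "2016-08-01T06:00"), ("e1", "line_a", "2016-08-02T15:00"),
    ("e1", "line_a", "2016-08-03T06:00"), ("e1", "line_a", "2016-08-04T15:00"),
    ("e2", "line_a", "2016-08-01T06:00"), ("e2", "line_a", "2016-08-02T06:00"),
    ("e2", "line_a", "2016-08-03T15:00"), ("e2", "line_a", "2016-08-04T06:00"),
    ("e3", "line_b", "2016-08-01T06:00"), ("e3", "line_b", "2016-08-02T06:00"),
    ("e3", "line_b", "2016-08-03T06:00"), ("e3", "line_b", "2016-08-04T06:00"),
    ("e4", "line_b", "2016-08-01T15:00"), ("e4", "line_b", "2016-08-02T15:00"),
    ("e4", "line_b", "2016-08-03T15:00"), ("e4", "line_b", "2016-08-04T06:00"),
]


def shift_label(clock_in: str) -> str:
    """Classify a clock-in as 'day' or 'evening' (assumed cutoff: 14:00)."""
    hour = datetime.fromisoformat(clock_in).hour
    return "day" if hour < 14 else "evening"


def switches_per_employee(records):
    """Count shift-type changes between consecutive clock-ins per employee."""
    by_employee = defaultdict(list)
    for employee, team, stamp in records:
        by_employee[(team, employee)].append(stamp)

    counts = {}
    for key, stamps in by_employee.items():
        # ISO timestamps in one format sort chronologically as plain strings.
        labels = [shift_label(stamp) for stamp in sorted(stamps)]
        counts[key] = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return counts


def average_switches_by_team(records):
    """Average the per-employee switch counts within each team."""
    per_team = defaultdict(list)
    for (team, _employee), count in switches_per_employee(records).items():
        per_team[team].append(count)
    return {team: sum(v) / len(v) for team, v in per_team.items()}


if __name__ == "__main__":
    # With the sample data above: line_a averages 2.5 switches, line_b 0.5.
    for team, avg in sorted(average_switches_by_team(CLOCK_INS).items()):
        print(f"{team}: {avg:.1f} shift switches per employee")
```

In a real sandbox, the same comparison would run against the HR system's actual clock-in export rather than a hard-coded list, and the shift cutoff would follow the company's own shift definitions.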

This type of shift switching seems not only to affect productivity, but likely also causes general fatigue and overall employee dissatisfaction, which is the likely root cause of this dip in overall effectiveness. The data points to a negative correlation: the more often employees switch shifts, the lower both employee satisfaction and production quality tend to be. This previously dark data can be moved into an existing data model now or at a later time if so desired.
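One way to put a number on a relationship like this is a Pearson correlation coefficient, where a value near -1 means one measure tends to fall as the other rises. The sketch below is a generic illustration, not the method used in the scenario: the per-team figures are invented for the example, and the names `SWITCHES_PER_WEEK` and `QUALITY_SCORE` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-team figures, invented purely for illustration:
# average shift switches per employee per week, and a quality score (0-100).
SWITCHES_PER_WEEK = [0.5, 1.0, 1.5, 2.5]
QUALITY_SCORE = [98, 96, 95, 90]

r = pearson(SWITCHES_PER_WEEK, QUALITY_SCORE)

if __name__ == "__main__":
    # A strongly negative r supports the hunch; it does not prove causation.
    print(f"correlation between shift switching and quality: {r:.2f}")
```

A strong negative coefficient is a reason to keep investigating a hunch, not proof that shift switching causes the quality drop, which is why the scenario goes on to monitor the fix with a scorecard.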

Having identified the problem, the company can now fix it and, of course, monitor the progress with BI. From here, a scorecard is created to monitor hours, shifts, and product quality across all teams. Now that a root cause is in focus, new best practices can be applied across all production teams to improve even teams that are already high-performing.

Dark data and your analytics strategy

The process of uncovering internal dark data requires minimal hand-holding from the IT specialists within the company, as BI users can create an experimental environment on their own with the right tools. This is significantly more cost-effective than creating a full development cycle around what might be a bunk hypothesis. If a hunch doesn’t prove useful in “sandbox mode,” the hypothesis can be rejected and users can move on to the next one quickly, without tying up many resources.

This is the OODA Loop (observe, orient, decide, act) in action. The company switched from analytical mode to decision and to action, and right back to analytical. The cycle continues and company health improves overall. Uncovering a little bit of dark data in a relatively short period of time – days, as opposed to the weeks or months it would realistically take a business analyst to create a potentially valuable analysis with new, external data – adds tremendous value to the BI project as a whole.

This guide will help you understand the OODA Loop better so you too can integrate an increasing number of fast decision cycles into your company’s strategic initiatives: The Better BI Strategy.

Shining light on the right dark data can elucidate more than you might originally think. So next time you need to go digging for more data, first consider flipping on your flashlight.


Mads C. Brink Hansen

BI Project Manager
Throughout my many years of working in the BI world, I've been involved in a large array of BI projects, from small to large, from self-service to centrally implemented BI. I've served as developer, architect, and project manager. In addition to being a BI practitioner, I am an external lecturer in ..