One of the most important steps in a successful BI implementation is selecting the type of BI solution design that works best for your company. Whereas the traditional solution builds BI on top of a data warehouse, today’s modern BI and Analytics strategies need not only the option of the data warehouse BI power center, but also one for a self-service approach. A truly robust solution will offer both.
Just as BI is not a one-size-fits-all solution for those who use it, neither is solution design. That’s why it’s critical to understand the difference between each of the options and how to determine which one is right for your company.
In traditional BI strategies, data from disparate systems – CRM, ERP, and other data sources – is loaded into a data warehouse. There can be hundreds, if not thousands, of data tables within a data warehouse. Marketing data, customer information, sales numbers, finance data, and so on are all stored here.
From there, BI developers select which dimensions and measures they would find most useful for analyses and consolidate those into pre-defined online analytical processing (OLAP) cubes. The BI platform then pulls from these cubes to process reports and analytics on company metrics.
You can also connect the BI solution directly to the data sources. In this relational setup, measures and dimensions are calculated on the fly (in real time) from the underlying tables, instead of being read as pre-aggregated values from the cube.
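To make the difference between the two routes concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the sales rows, the `build_cube` and `query_on_the_fly` helpers, and the `"*"` wildcard are all hypothetical, and a real OLAP engine is far more sophisticated than a dictionary of totals.

```python
from collections import defaultdict

# Hypothetical detail rows as they might sit in a warehouse table:
# (region, product, revenue)
SALES = [
    ("North", "Widgets", 120.0),
    ("North", "Gadgets", 80.0),
    ("South", "Widgets", 200.0),
    ("South", "Gadgets", 50.0),
    ("North", "Widgets", 30.0),
]


def build_cube(rows):
    """Pre-aggregate revenue for every (region, product) cell plus roll-ups."""
    cube = defaultdict(float)
    for region, product, revenue in rows:
        cube[(region, product)] += revenue
        cube[(region, "*")] += revenue   # roll-up over all products
        cube[("*", product)] += revenue  # roll-up over all regions
        cube[("*", "*")] += revenue      # grand total
    return dict(cube)


def query_on_the_fly(rows, region="*", product="*"):
    """Scan the detail rows at query time instead of reading a stored cell."""
    return sum(
        revenue
        for r, p, revenue in rows
        if region in ("*", r) and product in ("*", p)
    )


cube = build_cube(SALES)
north_from_cube = cube[("North", "*")]
north_on_the_fly = query_on_the_fly(SALES, region="North")
```

Both routes return the same total for the same slice. The trade-off is that the cube answers with a simple lookup but has to be rebuilt when new rows arrive, while the on-the-fly query sees new rows immediately but does the work at query time.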
Data warehouses are robust repositories that guarantee that everyone has the same version of data to analyze and share throughout the company. A data warehouse maintains high data quality and consistency, as data is restructured for excellent query performance within the cubes.
The data warehouse ensures that no matter where the information is coming from, it will be delivered to information consumers consistently from a single common data model.
While a data warehouse setup is extremely powerful, it’s also highly cumbersome. In order to maintain quality, all new data must be tested and verified before being moved into the data warehouse. This is the territory of highly skilled technical information designers. Changing or adjusting anything within the data warehouse is a lot like opening up the hood of a car and working on the engine. You shouldn’t do it if you’re not a specialist, because if you do, it’s easy to create significant problems.
This makes experimentation difficult. Data prototyping is typically given lower priority than fixing problems or data discrepancies, so from a practical standpoint, experimental projects regularly get bumped from the IT data warehouse queue. Clearly, this isn’t very helpful when decisions need to be made on the fly.
A self-service analytics tool like TARGIT’s updated Data Discovery module is an add-on to a traditional BI platform, and combines the benefits of the secure data warehouse with the modern flexibility and experimentation users want and need. This tool serves as a mixing pot, allowing you to pull in external data and combine it with the data that exists inside the data warehouse in a single report or analysis. With this type of tool, sandbox analytics and data prototyping can be done outside of the data warehouse and, if found valuable, added to the data warehouse and pre-defined data models later by IT.
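As a rough illustration of that mixing-pot idea, the Python sketch below combines governed warehouse figures with a file a business user brings in, without writing anything back to the warehouse. Everything in it is an assumption for illustration: the in-memory SQLite table stands in for the warehouse, the CSV text stands in for a spreadsheet export, and `load_targets` and `mash_up` are hypothetical helpers. It does not show how TARGIT’s module works internally.

```python
import csv
import io
import sqlite3

# Stand-in for the governed warehouse: an in-memory SQLite table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (region TEXT, revenue REAL)")
warehouse.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 230.0), ("South", 250.0), ("West", 90.0)],
)

# Stand-in for an external file a user pulls in (for example a spreadsheet export).
EXTERNAL_CSV = """region,target
North,200
South,300
"""


def load_targets(text):
    """Read the external file into a {region: target} lookup."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["region"]: float(row["target"]) for row in reader}


def mash_up(conn, targets):
    """Combine warehouse revenue with external targets; the warehouse is only read."""
    query = "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
    report = []
    for region, revenue in conn.execute(query):
        target = targets.get(region)  # None when the external file has no row
        attainment = None if target is None else round(revenue / target, 2)
        report.append((region, revenue, target, attainment))
    return report


report = mash_up(warehouse, load_targets(EXTERNAL_CSV))
```

The West region has no target in the external file, so it shows up in the report with empty target and attainment values rather than being silently dropped. If the prototype proves valuable, the targets could later be modelled properly in the warehouse by IT.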
This is a smart move for organizations – especially enterprise companies – that have dozens of ERP systems from which to pull data. Companies need the time and resources to manage a data warehouse, but benefit from assurance of data quality at all times.
The primary difference between traditional and modern BI is the ability for users to pull in external data themselves and perform analytics on it without major hand-holding from IT. This is self-service BI.
In order to accommodate this external data, a modern BI platform is structurally designed differently. Unlike a data warehouse, which stores data on physical or virtual disks, many companies today choose an in-memory database that primarily relies on memory (RAM) for data storage. Reading and writing from memory is significantly faster than reading and writing from disks, so queries running on an in-memory database – even on big or external data – will generally be faster.
Much like a data warehouse model, in-memory databases store data from disparate systems such as the ERP and CRM. The difference, though, is that an in-memory database can also pull data directly from sources such as open source frameworks and data lakes.
This approach to BI leads to faster insights and flexibility. Work process speeds are significantly increased, as users can load data on-demand and mash it up with existing data sources. Most self-service tools are also generally more user-friendly than a typical data warehouse setup. It doesn’t take significant training to acquire the skills needed to work directly in the system.
Most importantly, self-service systems allow for near real-time experimentation with data outside the data warehouse. That can be anything from Excel spreadsheets living on a desktop to big data repositories such as Hadoop.
This is incredibly useful not only for individual business users but for departments who want to build models on their own.
Because it is fairly easy for BI users to get their hands dirty with a self-service BI model, it’s significantly easier for something to go wrong. Without a strict, documented method for acquiring and maintaining proper data quality, an in-memory/self-service system can quickly become chaotic.
Gaps in data from exclusions or inconsistencies can skew analyses. That’s because self-service tools lack the ability to build in data validation. To do so would eliminate the “self-service” aspect of this model.
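A small Python sketch shows how easily such gaps and inconsistencies skew a number. The rows, the expected month list, and the two helper functions are hypothetical and exist only for illustration. March is missing, and the region is spelled two different ways.

```python
# Hypothetical monthly revenue loaded by a business user: (month, region, revenue).
# March is missing, and "North" / "north" are spelled inconsistently.
ROWS = [
    ("2024-01", "North", 100.0),
    ("2024-02", "North", 110.0),
    ("2024-04", "north", 130.0),
]
EXPECTED_MONTHS = ["2024-01", "2024-02", "2024-03", "2024-04"]


def find_gaps(rows, expected_months):
    """Return the expected months that have no row at all."""
    seen = {month for month, _, _ in rows}
    return [month for month in expected_months if month not in seen]


def find_inconsistent_labels(rows):
    """Group region labels that differ only by case or surrounding whitespace."""
    variants = {}
    for _, region, _ in rows:
        variants.setdefault(region.strip().lower(), set()).add(region)
    return {key: sorted(names) for key, names in variants.items() if len(names) > 1}


gaps = find_gaps(ROWS, EXPECTED_MONTHS)
label_issues = find_inconsistent_labels(ROWS)

# A naive filter on the exact label quietly leaves out the April row.
north_total_naive = sum(rev for _, region, rev in ROWS if region == "North")
north_total_cleaned = sum(rev for _, region, rev in ROWS if region.strip().lower() == "north")
```

The naive total misses the April figure because of the spelling difference, and neither total includes March because the row was never loaded. Nothing in the data itself flags either problem, which is why someone has to go looking for it.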
As such, self-service is a significantly more agile method, but it is also significantly more fragile. Small- to mid-sized companies in particular are often intimidated by a self-service application option because they worry about struggling to enforce the necessary protocols and best practices.
A closed loop model solves the issue of a chaotic environment and ensures data governance in a way that straight self-service cannot. Closed loop BI feeds information back into itself, which allows users to see if there are inconsistencies in data.
Sandbox analytics allows users to prototype new data securely without the help of IT. That data then is distributed for user testing where inconsistencies are discovered, or it is sent on to production.
From there, more sandboxing is inspired and the cycle of experimentation and data discovery begins again. This closed loop system allows users to verify data before it goes into production, without technical assistance. Quality assurance increases and users take greater ownership of their data.
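The loop described above can be sketched as a tiny state machine in Python. This is only a sketch under assumptions: the stage names (`sandbox`, `user_testing`, `production`), the `Dataset` class, and the two functions are hypothetical and are not taken from any particular BI product.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    stage: str = "sandbox"
    issues: list = field(default_factory=list)
    history: list = field(default_factory=lambda: ["sandbox"])


def _move(dataset, stage):
    """Record every stage change so the loop stays traceable."""
    dataset.stage = stage
    dataset.history.append(stage)


def submit_for_testing(dataset):
    """A user shares a sandbox prototype with colleagues for testing."""
    if dataset.stage != "sandbox":
        raise ValueError("only sandbox prototypes can be submitted for testing")
    _move(dataset, "user_testing")


def review(dataset, issues_found):
    """Testers either report inconsistencies (back to sandbox) or approve (on to production)."""
    if dataset.stage != "user_testing":
        raise ValueError("only datasets in user testing can be reviewed")
    dataset.issues = list(issues_found)
    _move(dataset, "sandbox" if dataset.issues else "production")


targets = Dataset("regional_targets")
submit_for_testing(targets)
review(targets, ["March figures missing"])  # inconsistency found, back to sandbox
submit_for_testing(targets)
review(targets, [])                         # no issues left, on to production
```

In this run the dataset goes around the loop once before reaching production, and its history records each step along the way.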
Additionally, most data gap problems can be solved with a little bit of duct tape, so to speak. Quick fixes with Excel can be pulled in directly and used to close gaps until a better way to submit data to the source system is implemented and/or corrected. Because loading data is so fast, users are incentivized to fix problems quickly. In comparison, any type of change within a data warehouse is tedious and time-consuming.
This approach is best suited to companies looking for a flexible, user-friendly approach to BI who want to reap the benefits of BI right away and don’t necessarily already have a data warehouse in place.
The importance and power of the self-service BI approach are clear. However, I’m not recommending that companies with an existing data warehouse scrap their data warehouse system to go entirely in-memory/self-service. Rather, I suggest a fix that is truly the best of both worlds for those completely new to BI and those working to evolve their current strategy: a bimodal BI environment.
If your company already has a data warehouse in place, adopting a bimodal BI strategy will bridge the gap between the two solution designs. The integrity of data will be maintained and the agility of a modern system will come into play. Bimodal BI also gives users a certain amount of self-sufficiency with a data warehouse system that typically involves hand-holding from IT.
A bimodal BI strategy should not only facilitate traditional business operation – the classic data warehouse – but also discovery and innovation. To do this, a tool such as TARGIT’s Data Discovery module is needed on top of the traditional BI platform, as I mentioned above.
This add-on tool makes it possible to pull in that external data and combine it with data already in the data warehouse. This allows for data prototyping and experimentation outside of the data warehouse with the speed and flexibility of an in-memory solution. It removes IT from the process of new data incorporation and puts complex and external data exploration into the hands of everyday business users.
In addition to TARGIT’s data discovery tool, Microsoft also offers a solution that bridges the gap between traditional and modern solution systems with their Analysis Services Tabular models. This is an in-memory database that eliminates the need for complex ETL processes as data can be loaded directly into it.
Unlike TARGIT Data Discovery, putting together a tabular model is still a task for a technical resource, so this is not truly a self-service tool. The advantage though, is that it enables the benefits of modern BI with the speed of running in-memory but still maintains full data integrity.