Business intelligence has always presented a tricky “Catch-22” scenario. Businesses need to analyze information in near real-time so they can react quickly while there’s still time to seize upon revenue opportunities or to rectify operational inefficiencies. But businesses also need accurate and precise analytical information. Otherwise, the actions they take may go in the wrong direction.
Traditional relational data warehouses solve part of the problem. They excel at delivering accurate, precise information. They are also strong when it comes to governance, security and protecting data integrity.
Traditional data warehouses are built upon routines that extract data from multiple sources, cleanse and transform the data into a structured format, and then load the data for generating reports. Non-technical users can easily query the warehouse databases to find the information they need to make business decisions.
However, traditional data warehouses also take a long time to build, because the extraction, transformation and loading (ETL) of multiple sources requires complex coding. If a business needs immediate access to new sources of information, building the required dimensional database may take too long for the business to take full advantage of a new opportunity or to correct operational issues.
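To give a sense of what the extract, cleanse, transform and load steps described above involve, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a description of any particular product: the two CSV sources, the table names `dim_customer` and `fact_order`, and the in-memory SQLite database standing in for the warehouse are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source extracts; real sources would be files, APIs or databases.
CRM_CSV = """customer_id,name,region
1, Alice Smith ,east
2,Bob Jones,WEST
"""

ORDERS_CSV = """order_id,customer_id,amount
100,1,250.00
101,2,not_available
102,1,75.50
"""


def extract(text):
    """Extract: read raw rows from one source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform_customers(rows):
    """Cleanse: trim stray whitespace and normalize region names."""
    return [
        (int(r["customer_id"]), r["name"].strip(), r["region"].strip().lower())
        for r in rows
    ]


def transform_orders(rows):
    """Cleanse: drop orders whose amount is not a valid number."""
    clean = []
    for r in rows:
        try:
            amount = float(r["amount"])
        except ValueError:
            continue  # reject the bad row instead of loading it
        clean.append((int(r["order_id"]), int(r["customer_id"]), amount))
    return clean


def load(conn, customers, orders):
    """Load: write the cleansed rows into structured warehouse tables."""
    conn.execute(
        "CREATE TABLE dim_customer "
        "(customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT)"
    )
    conn.execute(
        "CREATE TABLE fact_order "
        "(order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)", customers)
    conn.executemany("INSERT INTO fact_order VALUES (?, ?, ?)", orders)
    conn.commit()


def run_etl():
    """Run the three steps end to end against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    customers = transform_customers(extract(CRM_CSV))
    orders = transform_orders(extract(ORDERS_CSV))
    load(conn, customers, orders)
    return conn


def revenue_by_region(conn):
    """The kind of report a business user might run once the load is done."""
    return conn.execute(
        "SELECT c.region, SUM(o.amount) "
        "FROM fact_order o JOIN dim_customer c ON c.customer_id = o.customer_id "
        "GROUP BY c.region ORDER BY c.region"
    ).fetchall()


if __name__ == "__main__":
    print(revenue_by_region(run_etl()))
```

Even in this toy version, every source needs its own cleansing rules and the table structure has to be decided before any report can run, which is where much of the build time goes.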
There’s also the question of exactly how the business will utilize a particular set of data. Managers and end users may not be able to clarify in advance how they want to use a data set until they see the actual information that’s generated. But if that can’t happen until the warehouse is built, the data models that developers create may not meet the needs of the business, and the coding may have to start all over again.
Traditional data warehouses have typically also operated on-premises, which means they require time to scale compute resources. So if there’s a sudden spike in incoming data, the data warehouse servers may not be able to handle the workload.
Another factor that comes into play is unstructured data—ranging from social media to e-mail, word processing documents, flat files, search results, videos, photos, audio files, presentations, and website pages. In today’s digital economy, businesses now often need to analyze the content within these data types. But these files typically can’t be processed by traditional databases.
Enter the “modern data warehouse” as a front-end complement to traditional data warehouses. Modern data warehouses have emerged in recent years in response to changes in data sources, particularly the growth of unstructured data from sources such as social media. Other contributing factors include the volume and complexity of Big Data, the increased demand for analytics from business users who want to see data more quickly, and technology improvements in cloud storage and analytics.
Modern data warehouses utilize “data lakes” that can take in both structured and unstructured data in raw format. This approach can be deployed faster because it initially postpones the
While modern warehouses may lack the level of data preciseness and accuracy that will ultimately be required, they do enable business users to quickly see patterns in the data that’s available—through layered transformations that run during the
The modern data warehouse thus serves as an initial collection resource where preliminary analysis can be performed on the “rivers” of data that flow in—hence the name data lake. From there, the lake can feed the data into a traditional data warehouse so that over time, the business also gets the data accuracy and preciseness it needs.
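The flow described above, where raw data lands in the lake first, is explored as-is, and is later fed into the traditional warehouse, can be sketched roughly as follows. This is only an illustration under stated assumptions: the Python list standing in for lake storage, the JSON record layout, the `channel` field, and the function names are all hypothetical, and a real data lake would sit on cloud object storage with dedicated query tools.

```python
import json
from collections import Counter


def ingest(lake, raw_line):
    """Store incoming data in raw form, with no validation or modeling."""
    lake.append(raw_line)


def parse(raw_line):
    """Interpret a raw line at read time; return None if it is not a JSON object."""
    try:
        record = json.loads(raw_line)
    except json.JSONDecodeError:
        return None
    return record if isinstance(record, dict) else None


def explore_channels(lake):
    """Preliminary analysis: a quick count of what kinds of data have arrived."""
    counts = Counter()
    for line in lake:
        record = parse(line)
        if record is None:
            counts["unparsed"] += 1
        else:
            counts[record.get("channel", "unknown")] += 1
    return counts


def promote_orders(lake):
    """Later step: pick out only well-formed order rows for the warehouse."""
    rows = []
    for line in lake:
        record = parse(line)
        if record is None or record.get("channel") != "orders":
            continue
        amount = record.get("amount")
        if record.get("order_id") is not None and isinstance(amount, (int, float)):
            rows.append((record["order_id"], float(amount)))
    return rows


if __name__ == "__main__":
    lake = []  # hypothetical stand-in for raw lake storage
    for line in [
        '{"channel": "orders", "order_id": 1, "amount": 20.0}',
        '{"channel": "social", "text": "love the new product"}',
        '{"channel": "orders", "order_id": 2, "amount": "n/a"}',
        "free-form email body, not JSON",
    ]:
        ingest(lake, line)
    print(explore_channels(lake))
    print(promote_orders(lake))
```

The point of the design is the ordering: nothing is rejected on the way in, so business users can look for patterns right away, and the stricter rules are applied only when rows are promoted to the warehouse.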
Modern data warehouses also lend themselves to deployments in the cloud. This enables them to scale more easily with less IT involvement so businesses can efficiently ingest Big Data sources and accommodate spikes in data feeds. And in the cloud, you typically only pay for the compute resources you consume, thus potentially reducing data warehousing costs.
As the scenarios presented above illustrate, modern data warehouses are not a replacement for traditional data warehouses. Rather, they are a complement that allows businesses to benefit from the best of both worlds:
By combining modern and traditional data warehouses, your business can conduct BI in a hybrid mode, where you can leverage the rapid deployment and scalability of a modern data warehouse alongside the accuracy of a traditional warehouse.
You also have the flexibility to approach data analysis incrementally; instead of spending a lot of time, effort and cost up-front on hardware, integrations, BI tools and database structures for a complete data warehouse, you can start with just one or two use cases and generate value from the data sooner.
Ultimately, you will become more agile in reacting to changes in business conditions. You will also generate information that’s on-target so you can manage business operations more effectively.