The Optimized Data Stack: Strategies for building a budget-conscious data stack

By Demilade Agboola, Senior Analytics Engineer @ Data Culture

Building a robust data infrastructure is a complex initiative for organizations, and it can easily blow initial budgets out of the water. Ongoing use cases and changing needs can seem overwhelming or almost impossible to tackle; with the right planning, however, the investment will prove its return. Companies that go on to optimize their data infrastructure ensure that they are getting far more out than they are putting in. By visualizing current data stack costs, leveraging open source solutions, improving data warehouses, and putting a focused effort into forecasting future data needs, data teams can get in front of modern data stack cost optimization.

Data and the growth of data

While the growing availability of data corresponds to an increase in its utility across businesses, a major concern for business teams is the potentially high cost of setting up the infrastructure needed to gain insights into their operations, customers, and markets. These added insights can enable leaders to make better decisions, backed by objective data, that improve customer experience, resource allocation, and business growth while also increasing the chances of the business surviving a harsh economic climate. Such insights should not, however, come at an unexpected cost to the company.

The use of the modern data stack

The concept of the modern data stack (MDS) has grown because of its reduced technical requirements, scalability, flexibility, improved efficiency, support for collaboration, and improved security. The technical expertise required to set up an MDS, compared to traditional data stacks, is no longer a barrier to entry. Plug-and-play tools demand less technical skill and knowledge, and they enable business stakeholders to engage in data initiatives as well. Storage and computing resources are available through cloud-based services, which makes it easier to share data and collaborate with partners, facilitates the exchange of ideas and insights, and removes on-premise hurdles. Combined, these factors make implementing a data stack possible for teams that understand the potential offered by the added ease of access.

The Modern Data Stack (source: Fivetran)

Cost of the data stack

Many of the tools and technologies used in the modern data stack are proprietary software products that require organizations to purchase multiple licenses for use. Some business needs require migrating legacy systems to cloud infrastructure that can store and process large amounts of data, which can also be expensive. These migrations also require skilled personnel to design, implement, and maintain the new systems. Labor costs that come with these implementations can include salaries and benefits for employees, as well as the cost of training and professional development, all of which add to the initially assumed total cost of the effort.

The Optimized Data Stack

Measuring Costs

Collecting the necessary metadata from the tools in the stack and funneling it into the data warehouse to estimate the total cost of ownership is a great first step. This data can then be fed into a business intelligence (BI) tool like Tableau or Power BI, which will visualize the rolled-up costs for the relevant business and data owners. An accessible cost dashboard builds confidence and alignment within the company, because surprise surges in cost don’t appear at the end of the fiscal period.
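As a rough illustration of the roll-up described above, here is a minimal Python sketch. It assumes the cost metadata has already been exported into simple records; the tool names, months, dollar amounts, and the 25% surge threshold are all made up for the example and do not come from any particular vendor's billing format.

```python
from collections import defaultdict

# Hypothetical cost records, as they might look once each tool's billing
# metadata has been loaded. Every name and amount here is invented.
cost_records = [
    {"tool": "warehouse", "month": "2023-01", "usd": 1200.0},
    {"tool": "warehouse", "month": "2023-02", "usd": 1900.0},
    {"tool": "ingestion", "month": "2023-01", "usd": 500.0},
    {"tool": "ingestion", "month": "2023-02", "usd": 520.0},
    {"tool": "bi", "month": "2023-01", "usd": 300.0},
    {"tool": "bi", "month": "2023-02", "usd": 300.0},
]


def monthly_totals(records):
    """Sum cost per month across all tools (the rolled-up view)."""
    totals = defaultdict(float)
    for record in records:
        totals[record["month"]] += record["usd"]
    return dict(sorted(totals.items()))


def flag_surges(records, threshold=0.25):
    """List (tool, month, growth) where month-over-month growth exceeds threshold."""
    by_tool = defaultdict(lambda: defaultdict(float))
    for record in records:
        by_tool[record["tool"]][record["month"]] += record["usd"]

    surges = []
    for tool, months in by_tool.items():
        ordered = sorted(months.items())
        # Compare each month with the one directly before it.
        for (_, previous), (month, current) in zip(ordered, ordered[1:]):
            if previous > 0 and (current - previous) / previous > threshold:
                surges.append((tool, month, round((current - previous) / previous, 2)))
    return surges


print(monthly_totals(cost_records))
print(flag_surges(cost_records))
```

In practice the same aggregation would more likely live in the warehouse or the BI tool itself; the point is only that a per-tool, per-month view makes a jump like the hypothetical warehouse bill visible well before the fiscal period closes.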

Open Source Tools

Open-source tools can reduce licensing costs, but security is one of the main concerns with an open-source solution, because the codebase is public and any vulnerabilities in the software are visible to anyone who reads it. Open-source software may also lack some of the flexibility that paid tools offer out of the box. Ample research is required to determine the advantages and disadvantages of whatever tool is selected, and that research then needs to be weighed against the use case. As an example, if a BI tool is only used by a limited number of people, there might be no advantage to paying $20,000+ for it when a free BI tool, despite slightly less functionality, provides similar value and still meets the overall business requirements.
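The BI example above comes down to simple per-user arithmetic, sketched below in Python. Only the $20,000 license figure comes from the text; the user count, hosting cost, maintenance hours, and hourly rate are hypothetical placeholders that a team would replace with its own numbers.

```python
def annual_cost_per_user(license_usd, hosting_usd, maintenance_hours,
                         hourly_rate_usd, users):
    """Total yearly cost of a tool divided by the number of people using it."""
    total = license_usd + hosting_usd + maintenance_hours * hourly_rate_usd
    return total / users


# Paid BI tool: the $20,000 license from the example, used by 10 people
# (the user count is an assumption).
paid = annual_cost_per_user(
    license_usd=20_000, hosting_usd=0, maintenance_hours=0,
    hourly_rate_usd=0, users=10,
)

# Free BI tool: no license, but assumed self-hosting and upkeep costs.
free = annual_cost_per_user(
    license_usd=0, hosting_usd=1_200, maintenance_hours=40,
    hourly_rate_usd=75, users=10,
)

print(f"paid: ${paid:,.0f} per user, free: ${free:,.0f} per user")
```

A free tool is rarely free to operate, so including hosting and maintenance time keeps the comparison honest. With a much larger user base or heavier upkeep, the same calculation could tip toward the paid option.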

Evaluating Data Warehousing

Optimizing data warehousing from the start also allows for future scalability: as your data grows, the impact on monthly invoices is smaller than it would otherwise be. Pipelines can run continuously for years, so inefficiencies that are left unchecked cascade and snowball. If data teams don’t set up these pipelines carefully at the start, or never revisit them, data warehousing alone can eat through yearly budgets in just months.
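To make the snowball effect concrete, the following Python sketch projects a warehouse bill that grows by a fixed percentage each month and reports when cumulative spend passes the annual budget. The starting cost, growth rate, and budget are illustrative assumptions, and real warehouse pricing is rarely this regular.

```python
def months_until_budget_exhausted(first_month_cost, monthly_growth, annual_budget):
    """Return the month (1-12) in which cumulative spend first exceeds the
    annual budget, or None if the budget lasts the whole year."""
    spent = 0.0
    cost = first_month_cost
    for month in range(1, 13):
        spent += cost
        if spent > annual_budget:
            return month
        # Next month's bill grows as data volume and pipeline runs grow.
        cost *= 1 + monthly_growth
    return None


# Hypothetical numbers: a $5,000 first bill against a $60,000 yearly budget.
print(months_until_budget_exhausted(5_000, 0.20, 60_000))  # unchecked growth
print(months_until_budget_exhausted(5_000, 0.00, 60_000))  # flat spend
```

Under these assumptions, 20% monthly growth exhausts the budget in month 7, while flat spend lasts the full year. That gap is the argument for reviewing pipelines early rather than after the invoices arrive.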

Forecasting Future Needs

Forecasting data needs requires constant research: pooling information from teams across business lines and continually learning about the ever-changing data landscape, so that the data stack stays as cost-efficient as possible. It also helps to monitor tooling prices (both increases and decreases), especially for tools that are new to the market.

Implementing, testing, and revising your data strategy is integral to proving its value over time. There’s never a one-size-fits-all approach, and many levers can be pulled as you take on internal conversations about strategy for your data infrastructure. You don’t have to start from scratch: leverage what you have and improve upon your data initiatives with a focused and organized effort.

At Data Culture, we are passionate about meeting organizations where they are in their data maturity, helping unlock the value of their data, and strategizing optimal ways to implement, measure, and gain business value from Modern Data Stacks. From assisting in growing out data teams to guiding data infrastructure projects, our team of experts is positioned to assist organizations in navigating the data landscape to meet internal data goals that drive business decisions.

We help organizations build data capabilities and get value from their data.