Scaling Data-Driven Decision Making
Around the world, companies are investing in ways to use data to inform decisions about their business. The amount of data available to businesses is exploding, and there is a huge upside for companies that learn to apply the resulting insights to drive greater business value.
However, most companies are only reaching a fraction of the potential value from these insights because not everyone in the company actually has access to the insights.
Why is this? There is a disconnect between technical teams, who look at and analyse the data, and non-technical stakeholders (aka everyone else) within an organisation. This is largely because the tools used are very technical, and converting results to more readable formats for a wider audience remains a persistent hurdle. The result is a lot of one-to-one sharing via emails and presentations rather than larger-scale publishing.
How then can we effectively communicate insights across the organisation? The answer is a central, formalised knowledge management system. Take Airbnb, for example: they built their own open-source tool to tackle this problem and help the company scale its ability to make decisions using data.
In this article, I will argue that every company needs to do the same thing: facilitate better sharing and organising of knowledge to drive discussion and learning across the wider organisation, and ultimately become more data driven.
- First, I will go through why we share insights and the difference between discovery and sharing.
- Second, I will discuss the challenges that exist with current data-science tooling.
- Finally, we will look at an example business before and after it adopts such a central knowledge hub.
Disclaimer: I am one of the co-founders at Kyso, which is our solution to the issue of undiscovered knowledge from data, a central hub for technical reports. Naturally, I would love for you to use our platform, but I am going to make the general argument anyway as to why you should set up some system for sharing insights — regardless of what that system is.
Why We Share & Discovery vs Sharing
If the goal of collecting data, modelling, and making predictions is to help everyone to contribute to organisational business objectives, then everyone needs access to the insights to make informed data-driven decisions.
How can companies ensure this happens across the entire organisation? A good way is to implement a system in which the process of creating reports from data analyses is as seamless as possible. In this way, the power of their insights is easily discoverable by other relevant agents throughout the business. Let’s call this democratising insights.
We share results to turn insights into action. But if sharing only happens upon request instead of through a one-to-many system, business agents across the wider organisation will be less effective in their respective roles. Why? Because the data insights relevant to their positions are not being communicated effectively to them. We want to be driving analytics-based actions across all areas of the business.
A major point of distinction we’ve made here is the difference between sharing and discovery. We all share analyses, typically one-to-one. We screen-share, email, make presentations, and discuss results on Slack or over a coffee.
But these insights get siloed within all these different sub-groups. Alice shares with Bob, but the chain stops there. Allowing people to discover your work matters because someone else, someone Alice hadn’t even thought of, could benefit from these privately shared insights.
There is a huge benefit in different types of stakeholders across the business being able to discover these results and use them for their own work. Discovery is a key issue that has already given rise to lots of different tools, from general knowledge hubs like Notion & Confluence to platforms for more technical content like GitHub.
And therein lies the problem.
Challenges With Data-Science Tooling
The tools that data scientists use, such as Jupyter & R notebooks, are used only by technical members of the business, and the platforms they use for sharing, discovery, and reproducibility, such as GitHub, are typically restricted to these technical stakeholders.
There is a need for something that bridges the gap, that connects the workflow of the data team to a general reporting system for the entire company.
As mentioned in the opening section, Airbnb built their own solution, which they open-sourced. Stripe also built an internal tooling system. While it was mostly for their technical teams, it was developed with the same objective: sharing results in a central knowledge repository.
The goal of central knowledge management systems like these is to get more & more stakeholders to better use and interact with data insights, and to facilitate wider communication of these insights across the business.
Let’s imagine a scenario that many of you will be familiar with — an example organisation that would benefit from a centralised knowledge hub, specifically for data insights.
ACME Inc. is an online SaaS company, running a subscription-based model, charging per seat (or user) for teams using their product. They are an online messaging application, similar in some regards to Slack. The CEO, James, wants to get a better overview of their customers’ behaviour & asks Sarah, a data scientist, to make a report.
She starts working in a Jupyter notebook, importing company data from MongoDB, and plots out some key analytics on how users are using the product.
- She plots a simple histogram of team size vs the number of teams.
- She also graphs the number of posts (think articles on Notion or Confluence) by team vs the team size, and the same for the number of comments.
- Sarah discovers that there is a super-linear dependence of the number of posts on team size. Teams of >400 are using the platform proportionally more. The result is the same for the number of comments made.
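To make the idea of a super-linear dependence concrete, here is a minimal sketch of the kind of check Sarah might run in her notebook. It is not her actual analysis: the `teams` records are invented for illustration, the helper names `size_histogram` and `scaling_exponent` are hypothetical, and fitting a straight line to log(posts) against log(team size) is just one common way to estimate how usage scales. A fitted slope above 1 would suggest that posts grow faster than team size.

```python
import math
from collections import Counter

# Hypothetical (team_size, number_of_posts) records, invented for this sketch.
# They are NOT data from ACME Inc. or from the article.
teams = [
    (5, 12), (10, 30), (20, 75), (50, 260),
    (100, 640), (200, 1700), (400, 4300), (800, 11000),
]


def size_histogram(sizes, edges):
    """Count how many teams fall into each [low, high) size bucket."""
    counts = Counter()
    for size in sizes:
        for low, high in zip(edges, edges[1:]):
            if low <= size < high:
                counts[(low, high)] += 1
                break
    return counts


def scaling_exponent(points):
    """Least-squares slope of log(posts) against log(team_size).

    A slope above 1 suggests posts grow faster than team size (super-linear).
    """
    xs = [math.log(size) for size, _ in points]
    ys = [math.log(posts) for _, posts in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


histogram = size_histogram([size for size, _ in teams], [1, 10, 100, 1000])
exponent = scaling_exponent(teams)

for (low, high), count in sorted(histogram.items()):
    print(f"teams with {low} to {high - 1} members: {count}")
print(f"estimated scaling exponent: {exponent:.2f}")
```

The same function could be applied to comment counts. In a real notebook the records would come from the company database rather than a hard-coded list, and a plot would usually accompany the fitted exponent.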
If there is no central place to share this analysis:
Sarah just emails the report to the CEO, who reads it — and that’s it. Sarah might discuss the results with some of her immediate colleagues. The CEO might inform the board or investors. But the insights generated don’t get widely shared within the company because there is no discovery mechanism in place.
What if ACME Inc. does have a central hub?
With so many teams now working remotely, perhaps they’ve even been compelled to set one up. Sarah posts the report to the internal knowledge hub (Notion, Confluence, or Kyso, for example). Everyone in the company can now discover and read it.
- Mary from the product team comes across the report. Sarah & Mary have no direct connection, but Mary discovers it on the hub. Because she is on the product engineering team, the dependence Sarah found is really interesting to Mary and relevant to her team’s goals. Why does this relationship exist? Perhaps because, as the size of a team increases, so too does the number of connections. Or perhaps it is a political issue: the larger a company is, the more likely a VP of engineering could be pushing internal usage of the app. This insight is clearly valuable to Mary and her team.
- Patrick, from the Sales & Marketing department, also discovers the report. Because Patrick has a fixed monthly marketing budget, he may now decide to focus on leads with larger teams, or those that have the potential to grow larger. Because the company charges per seat, this is clearly a relevant insight for Patrick.
- How about Barbara from the infrastructure team? Maybe the cost of providing their service is not flat, and she is worried that servicing more large companies will be much too expensive. Either they need to make their systems much more efficient, or she might make the argument internally to focus on smaller teams.
So we not only have a central system for sharing analyses but this system is now also driving communication between different departments. This is an example of a typical company and how moving from no system for curation to having a central place for results can transform how decisions get driven within the company.
All of the different stakeholders in different positions, using slightly different methods, are sharing and communicating their reports and projects on one unified platform.
When data is made available to all the relevant stakeholders (which is everyone in the company) with the right tools to turn the data into actionable insights, the total understanding of the value of data and analytics increases in the organisation. Everybody can start contributing to business value through more data-driven decision making at scale.
In connecting the work of data scientists to the everyday decisions made around the business, data teams will no longer be the bottleneck in the organisation, but rather the drivers of better business decisions. In doing so, they also create a better understanding in the organisation of what analytics can do for the business. New projects, for which subject matter expertise is required, will run much more smoothly and quickly.
We commented at the beginning of this article on how companies today are gathering a lot of data across various business applications. Making this data available to their data engineers, scientists, and analysts is the first step to scaling data-driven decision making. The second step involves empowering business users with these results.
Only companies that truly understand both aspects will succeed in enabling their organisations to make data-driven decisions at scale and continuously improve. Putting analytics in the hands of the many will sustain or improve their competitive edge.