‘Our data just isn’t ready for AI yet. Maybe in a year or two.’
We speak to a lot of data people about how AI tooling helps unpick the problem of generating and providing company-wide data insights, and they usually offer some version of the above. Their biggest worry is their data simply isn’t mature enough for enhancement.
Questioning readiness for AI tooling is understandable. Some data teams see their terminology and warehouse structure as too nuanced or inconsistent, and some opt to try to build a solution themselves - as difficult as that can be (learn more about in-house building here).
It’s natural to be cautious, but a message to ‘data doubters’: Your organisation and your data are more ready than you think.
Data maturity can be calculated in a few ways. Still, convention breaks it down into the following stages:
Data Exploration: The organisation collects data but primarily uses it in a very ad-hoc, exploratory manner with limited structured analysis.
Data-Informed: Data is systematically analysed to support decision-making, though it is often used alongside intuition or past experience.
Data-Driven: Decisions are consistently made based on data insights, with a focus on measurable outcomes and data as the primary guiding force.
Data-Transformed: Data insights are fully automated and integrated into most business processes, enabling real-time decision-making and continuous innovation driven with advanced analytics.
Anecdotally speaking, only 1% or 2% of companies reach that final stage of data maturity. Within a ‘data-transformed’ org, generating insights is automated at this point, and the impact of data on strategic goals is closely tied and easily demonstrated.
The tiny proportion who manage to get there demonstrates how difficult it is.
‘Data maturity’ can refer to everything from processes or governance to data quality and utilisation. Data teams often struggle to achieve the highest impact stuff; for example, enabling self-service insight amongst the non-technical team.
What does ‘mature data’ look like?
Some of the features of mature data could be seen as the ‘bread and butter’ of a well-oiled data team, things you’ve already implemented and understand.
A centralised Data Warehouse or data lake and a clear ETL process (Extract, transform, load) to ensure new data is clean and updated regularly.
Planned measures to maintain data quality and governance, like regular profiling and cleansing, to maintain accuracy. Some clear guidelines on data ownership and compliance (e.g., GDPR), and Metadata management should be in place, too.
Using cloud services as part of a scalable infrastructure and secure data-sharing processes.
An established data team with some AI and ML understanding and wider company alignment around producing more data-driven insights.
A clear understanding of responsibility towards ethical data management and risk mitigation during third-party/AI tooling adoption.
Mature data functions use SQL with really clear comments, CTEs (common table expressions), version control (git), plus explicit conventions and joins throughout to model their data. These additions make it much easier for LLMs to decipher relevant intent behind queries and map natural language into code more effectively. Something like this:
Conversely, less mature data might lack the same clarity, missing aggregation, formatting or readability. Joins might be more ambiguous, and integers like dates might be stored poorly, and require manual updates.
Below is a particularly unpleasant example of data that is difficult for humans and AI alike, lacking almost any relational structure or adaptive syntax to be understood in context:
If all your structured data doesn’t look like the first example, remove your head from your hands. All is not lost.
Don’t let perfect be the enemy of good
Your data warehouse and practices may still evolve, but there are several ways to enhance data using AI without achieving 100% uniformity across your data.
Incremental AI Adoption | Start small and scale up as data maturity improves. You can focus on specific business areas where data quality is good and introduce LLMs here first. Examples include customer support logs, sales data, or inventory management.
Leverage AI to Enhance Data Maturity | Use cloud-based AI tools to do the hard work. It isn’t just for analysis; some tools will allow you to create clean datasets and simplify their management - enabling complex tasks like natural language querying on additional, new datasets over time.
Adopt User-Friendly Tools | Implement AI tools that translate natural language queries into SQL, rather than plugging away, trying to upskill non-technical people on traditional BI platforms they don’t engage with.
Implement Feedback Loops | AI tools can identify areas where data quality needs enhancement and adjust data collection and management processes accordingly. Easy-to-use tools also mean more useful feedback from business users, who can actually manipulate their own data without supervision.
Risk Management | Understand the limitations imposed by your less mature data. Then, ensure you have contingency plans to mitigate these risks. Any third-party applications should have relevant data protections of their own. We made sure Fluent was SOC2 compliant for this very reason.
Accept Imperfection | Perfect data is rare; aiming for “good enough” while striving for continuous improvement is the key to introducing AI into an organisation.
With this approach in place, a business can experience AI’s compounding benefits, such as carrying the burden of lower-value tasks like ad-hoc querying, improving data cleanliness, and enriching decision-making.
Work with Fluent
Put data back into the conversation. Book a demo to see how Fluent can work for you.
29 November 2024
How to build trust with natural language querying
21 November 2024
What's the biggest challenge facing data teams today?
Stay up to date with the Fluent Newsflash
Everything you need to know from the world of AI, BI and Fluent. Hand-delivered (digitally) twice a month.