14 October 2024

How To Train An LLM For Peak Performance

Liam Garrison

Imagine this scenario: you're a Junior Analyst in your first week at a new job. Your Slack pings, and it's the Head of Sales asking, "Can you tell me how revenue is trending in the Southwest region, month on month?"

Ad-hoc queries like this come in countless forms; they’re important pieces of insight, but they force analysts to constantly context-switch. Using LLMs and AI as part of a business intelligence strategy - specifically to lighten the workload of answering ad-hoc questions - is increasingly attractive. However, LLMs aren’t infallible, and their ‘trust issues’ can stem from little more than one unsatisfactory interaction with a senior stakeholder.

Business users lose faith in LLMs that:

  • Offer inaccurate or incomplete answers

  • Frequently hallucinate

  • Can’t explain themselves; the ‘black-box’ perception

  • Struggle with edge case queries

  • Can’t flag when they don’t know something

Some technical teething troubles are expected with any new AI tool (just as they would be with a newly-hired analyst). However, not all errors are the fault of the LLM itself; many can be mitigated through the way the model is trained on relational data.

Starting From Day One

You wouldn’t expect a new hire to instantly know every nuance of a business on their first day. It takes training to build context and familiarity. The same principle applies to an LLM.

Your data team needs to establish a well-defined semantic layer of valuable metrics. Like the training programmes used to onboard a junior analyst, these metrics help the LLM know where to look, how to calculate, and, critically, when it doesn’t know something. An LLM trained on these metrics, and their associated dimensions and views, can immediately handle a variety of related queries. More complex metrics can be introduced once the LLM has mastered the ‘most-asked’ ones.

The more accurate and precise your metrics, the quicker the LLM can be given autonomy to converse with business users. It’s similar to a junior analyst knowing where to find an answer, rather than having to calculate each answer themselves. 
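A semantic-layer entry can be pictured as a small record pairing a metric’s name with its calculation, its dimensions, and its view. A minimal sketch in Python - the field names and SQL expression here are illustrative, not Fluent’s actual schema:

```python
# Hypothetical semantic-layer entry for a revenue metric.
# Field names and the SQL expression are illustrative only.
REVENUE_TOTAL = {
    "name": "revenue.total",                 # how the LLM refers to the metric
    "description": "Sum of completed order revenue",
    "sql": "SUM(orders.amount)",             # the predefined calculation
    "dimensions": ["region", "order_date"],  # how the metric may be segmented
    "view": "orders",                        # the underlying data source
}
```

With entries like this in place, the LLM looks up a calculation rather than inventing one, and can decline questions that reference no known metric.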

Think in simple terms

Every business user thinks and communicates using concepts like revenue, churn, or average delivery times - not in code. As a data team, you need to translate these frequently-used concepts into a metrics layer your LLM can digest. The metrics layer allows for more abstract, accurate and understandable data querying than traditional text-to-SQL, which is more open to interpretation.

Here’s a simple example of how an LLM uses a metrics layer to generate an answer for month-on-month revenue. In this case, it’s using MQL (Fluent’s metrics querying language) to reach the answer. The query rests on three main components, explained below:

{
  "metrics": ["revenue.total"],
  "groupBy": ["region", "order_date.month"],
  "order": [
    {"field": "order_date.month", "sort": "desc"},
    {"field": "region", "sort": "asc"}
  ]
}
  • Measure: revenue.total - The measure represents a predefined calculation (likely the sum of revenue) that has been centralised within the MQL setup and stored as predefined SQL.

  • Dimensions: region and order_date.month - Dimensions in an MQL query correspond to columns in the data table. They are used to segment or group the metric, breaking the data down into categories or time periods.

  • View - The view isn’t explicitly named in this query, but it would be the underlying data source that holds the revenue, region, and order_date information.

Unlike more verbose SQL queries, a metrics layer abstracts the complexity and limits the LLM’s ‘freedom’ to interpret queries, ultimately leading to better performance. Here are a few more examples of commonly-asked queries constructed in MQL.

Customer Retention & Churn 

"How many customers have churned since July?" 

This query identifies customers who haven’t engaged in the last three months, which the business defines as churned. Training the LLM to recognise this metric allows it to autonomously respond to a number of churn-related questions.

{
  "metrics": ["customers.churned"],
  "timeframe": {
    "relative": {
      "past": "3 months"
    }
  },
  "groupBy": [],
  "filter": [
    {
      "field": "customers.last_order_date",
      "operator": "<",
      "value": "current_date - interval 3 months"
    }
  ]
}

Delivery Performance

“Who’s our quickest delivery driver?”

This query shows the average delivery time for shipments, grouped by carrier and ranked in ascending order.

{
  "metrics": ["shipments.avg_delivery_time"],
  "groupBy": ["carrier"],
  "order": [
    {
      "field": "shipments.avg_delivery_time",
      "sort": "asc"
    }
  ],
  "limit": null
}
  • shipments.avg_delivery_time: The measure in the query - in this case, the calculation of the average delivery time for shipments.

  • carrier: The dimension by which the average delivery time is grouped, meaning you will see the average delivery time per carrier.

  • shipments: The underlying view from which the data is pulled. In SQL terms, this would be similar to a table or dataset containing fields like avg_delivery_time and carrier. As noted above, views in MQL are often not named explicitly but are inferred from the metric definitions.

Turn natural ambiguity into scalable productivity

Business users will inevitably phrase their questions in different ways, so teaching the model to recognise synonyms - such as “revenue” being used interchangeably with “sales” or “income” - lets the LLM tackle multi-step questions without needing a pre-written query for each exact scenario. If someone asks, "How much revenue was lost to customer churn in the last 6 months?”, a trained LLM should be able to generate a query combining both revenue and churn metrics.
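One way to picture this is a thin normalisation step that maps business phrasings onto canonical metric names before a query is built. A minimal sketch in Python, assuming a hand-maintained synonym map - the names below are illustrative, not part of MQL:

```python
# Hypothetical synonym map: business phrasings -> canonical metric names.
SYNONYMS = {
    "sales": "revenue.total",
    "income": "revenue.total",
    "revenue": "revenue.total",
    "churn": "customers.churned",
    "churned customers": "customers.churned",
}

def canonical_metrics(phrases):
    """Resolve user phrasings to canonical metrics, de-duplicated in order."""
    resolved = []
    for phrase in phrases:
        metric = SYNONYMS.get(phrase.lower().strip())
        if metric and metric not in resolved:
            resolved.append(metric)
    return resolved

# "How much revenue was lost to customer churn in the last 6 months?"
# touches both a revenue metric and a churn metric:
query = {
    "metrics": canonical_metrics(["revenue", "churn"]),
    "timeframe": {"relative": {"past": "6 months"}},
}
```

Keeping the synonym map in the semantic layer, rather than in prompts, means the data team can extend the LLM’s vocabulary without retraining anything.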

The data team managing the LLM has the easier job of updating and editing existing metrics and gradually introducing new ones, rather than trying to diagnose incorrect answers. Meanwhile, the LLM serves business users accurate, consistent results, or flags to the data team that it can’t.

TL;DR

A metrics layer serves as the fast-track training that an LLM needs to deliver the most impact in a business. Once trained on your metrics, your definitions and your data, the LLM can serve as a self-service analytics tool that either provides answers based on predetermined logic or flags queries it cannot resolve.

Day-to-day, the reliability of a new AI tool helps build trust with business users. Knowing when to respond - and when to indicate it doesn’t have an answer - is how an AI tool solidifies trust with the data team.

If this sounds like a problem your data team is facing today, or you want to learn more about how solutions like Fluent can impact your business data culture, join us on the 28th October for our webinar: Introducing AI to Your Data Stack: Quick Wins and Key Learnings.

© 2024 Artickl Ltd. All rights reserved.
