The pitfalls and practical realities of using generative AI in your analytics workflow

Over the last few months, we’ve heard much about how generative AI is set to change digital marketing. As consultants, we work with brands to harness technology for innovative marketing, so we quickly delved into the potential of ChatGPT, the most buzzworthy large language model-based chatbot on the block. Now, we see how generative AI can act as an assistant by generating initial drafts of code and visualizations, which our experts refine into usable materials.

In our view, the key to a successful generative AI project is for the end user to have a clear expectation of the final output, so any AI-generated materials can be edited and shaped. The first principle of using generative AI is that you should not trust it to provide completely correct answers to your queries.

ChatGPT answered just 12 of 42 GA4 questions right.

We decided to put ChatGPT to the test on something our consultants do regularly — answering common client questions about GA4. The results were not that impressive: Out of the 42 questions we asked, ChatGPT only provided 12 answers we’d deem acceptable and send on to our clients, a success rate of just 29%.

A further eight answers (19%) were “semi-correct.” These either misinterpreted the question and provided a different answer to what was asked (although factually correct) or had a small amount of misinformation in an otherwise correct response.

For example, ChatGPT told us that the “Other” row you find in some GA4 reports is a grouping of many rows of low-volume data (correct) but that the instances when this occurs are defined by “Google machine learning algorithms.” This is incorrect. There are standard rules in place to define this.

Limitations of ChatGPT’s knowledge — and its overconfidence

The remaining 52% of answers were factually incorrect and, in some cases, actively misleading. The most common reason is that ChatGPT does not use training data beyond 2021, so many recent updates are not factored into its answers. 
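The three shares quoted for the test can be checked with quick arithmetic. The sketch below is only a sanity check: the counts of 12 and 8 come from the results above, while the count of incorrect answers is an assumption, inferred as the remainder of the 42 questions.

```python
# Tally of the 42 answers. The "incorrect" count is inferred as the remainder
# (42 - 12 - 8); only the percentage for that group is quoted in the text.
TOTAL = 42
counts = {"acceptable": 12, "semi-correct": 8}
counts["incorrect"] = TOTAL - sum(counts.values())

# Share of each group, rounded to whole percentages.
shares = {label: round(100 * n / TOTAL) for label, n in counts.items()}

for label, n in counts.items():
    print(f"{label}: {n} of {TOTAL} ({shares[label]}%)")
```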

For example, Google only officially announced the deprecation of Universal Analytics in 2022, so ChatGPT couldn’t say when this would be. In this instance, the bot did at least caveat its answer with this context, leading with “…as to my knowledge cut off is in 2021…”

However, some of the remaining questions were answered wrongly with a worrying amount of confidence. For instance, the bot told us that “GA4 uses a machine learning-based approach to track events and can automatically identify purchase events based on the data it collects.”

While GA4 does have auto-tracked “enhanced measurement” events, these are generally defined by listening to simple code within a webpage’s metadata rather than through any machine learning or statistical model. Furthermore, purchase events are certainly not within the scope of enhanced measurement.

As our GA4 test demonstrated, the limited “knowledge” held within ChatGPT makes it an unreliable source of facts. But it remains a very efficient assistant: it can provide first drafts of analyses and code that cut the time an expert needs to complete a task.

It cannot replace the role of a knowledgeable analyst who knows the type of output they are expecting to see. Instead, time can be saved by instructing ChatGPT to produce analyses from sample data without heavy programming. From this, you can obtain a close approximation in seconds and instruct ChatGPT to modify its output or manipulate it yourself.

For example, we recently used ChatGPT to analyze and optimize a retailer’s shopping baskets. We wanted to analyze average basket sizes and understand the optimal size to offer free shipping to customers. This required a routine analysis of the distribution of revenue and margin and an understanding of variance over time. 

We instructed ChatGPT to review how basket sizes varied over 14 months using a GA4 dataset. It then suggested some initial SQL queries for further analysis within BigQuery and some data visualization options for the insights it found.

While the options were imperfect, they offered useful areas for further exploration. Our analyst adapted the queries from ChatGPT to finalize the output. This reduced the time for a senior analyst working with junior support to create the output from roughly three days to one day.
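To give a sense of what a first-draft basket analysis might look like, here is a minimal Python sketch. It is not the queries ChatGPT produced or the ones our analyst finalized. The order values are made-up sample data, the function names (monthly_summary, threshold_profile) are hypothetical, and the 20% "just below" band is an arbitrary assumption. It also looks only at basket value and leaves margin out.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical sample: (order month, basket value). Not data from the project.
orders = [
    ("2023-01", 18.0), ("2023-01", 42.5), ("2023-01", 27.0), ("2023-01", 61.0),
    ("2023-02", 22.0), ("2023-02", 35.0), ("2023-02", 48.0), ("2023-02", 55.5),
    ("2023-03", 30.0), ("2023-03", 39.0), ("2023-03", 44.0), ("2023-03", 72.0),
]

def monthly_summary(rows):
    """Return {month: (mean, median)} of basket values."""
    by_month = defaultdict(list)
    for month, value in rows:
        by_month[month].append(value)
    return {m: (mean(v), median(v)) for m, v in sorted(by_month.items())}

def threshold_profile(rows, threshold, nudge_band=0.2):
    """Share of baskets at or above the threshold, and share just below it.

    'Just below' means within nudge_band (20% by default) of the threshold,
    an illustrative proxy for baskets that free shipping might push over it.
    """
    values = [value for _, value in rows]
    qualifying = sum(v >= threshold for v in values)
    near = sum(threshold * (1 - nudge_band) <= v < threshold for v in values)
    return qualifying / len(values), near / len(values)

summary = monthly_summary(orders)
for month, (avg, mid) in summary.items():
    print(f"{month}: mean {avg:.2f}, median {mid:.2f}")

for candidate in (40, 50, 60):
    at_or_above, just_below = threshold_profile(orders, candidate)
    print(f"threshold {candidate}: {at_or_above:.0%} qualify, {just_below:.0%} just below")
```

A draft like this is a starting point for the analyst to correct and extend, for example by adding margin per basket or by running the same logic over the full dataset in BigQuery.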

Automating manual tasks and saving time

Another example is using it to automate more manual tasks within a given process, such as quality assurance checks for a data table or a piece of code that has been produced. This is a core aspect of any project, and flagging discrepancies or anomalies can often be laborious.

However, using ChatGPT to validate a piece of code of more than 500 lines that combines and processes multiple datasets, checking that it is error-free, can be a huge time saver. In this scenario, a manual review that would normally have taken two hours could be completed within 30 minutes.

Final QA checks still need to be performed by an expert, and the quality of ChatGPT’s output is highly dependent on the specific parameters you set in your instructions. However, a task that has very clear parameters and has no ambiguity in the output (the numbers either match or don’t) is ideal for generative AI to handle most of the heavy lifting. 
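A check where "the numbers either match or don’t" can be sketched in a few lines of Python. This is an illustration of the kind of unambiguous QA task described above, not the check we ran. The reconcile function, the table names and the revenue figures are all hypothetical.

```python
def reconcile(expected, actual, tolerance=0.0):
    """Compare two {key: number} tables and list every discrepancy."""
    issues = []
    for key in sorted(expected.keys() | actual.keys()):
        if key not in actual:
            issues.append((key, "missing from output"))
        elif key not in expected:
            issues.append((key, "unexpected in output"))
        elif abs(expected[key] - actual[key]) > tolerance:
            issues.append((key, f"expected {expected[key]}, got {actual[key]}"))
    return issues

# Hypothetical daily revenue totals: source system vs. processed table.
source_totals = {"2023-05-01": 1200.0, "2023-05-02": 980.5, "2023-05-03": 1430.0}
processed_totals = {"2023-05-01": 1200.0, "2023-05-02": 985.5, "2023-05-04": 1100.0}

issues = reconcile(source_totals, processed_totals)
for key, problem in issues:
    print(f"{key}: {problem}")
```

An empty list means the two tables agree. Anything else gives the expert a short list of rows to inspect instead of the whole table.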

Treat generative AI like an assistant rather than an expert

The progress made by ChatGPT in recent months is remarkable. Simply put, we can now use conversational English to request highly technical materials that can be used for the widest range of tasks across programming, communication and visualization.

As we’ve demonstrated above, the outputs from these tools need to be treated with care and expert judgment to make them valuable. A good use case is driving efficiencies in building analyses in our everyday work or speeding up lengthy, complex tasks that would normally be done manually. We treat the outputs skeptically and use our technical knowledge to hone them into value-adding materials for our clients.

While generative AI, exemplified by ChatGPT, has shown immense potential in revolutionizing various aspects of our digital workflows, it is crucial to approach its applications with a balanced perspective. There are limitations in accuracy, particularly concerning recent updates and nuanced details. 

However, as the technology matures, the potential will grow for AI to be used as a tool to augment our capabilities and drive efficiencies in our everyday work. We think the focus should be less on generative AI replacing the expert and more on how it can improve our productivity.

The bottom line is clear: ChatGPT and other LLM-based AI tools will become more and more common in our daily routines. That said, it’s important to have professionals managing your content and taking care of your analytics workflow.