Based on Image by Temel from Pixabay

New AI agent tools have made automating data analysis remarkably simple. Combined with a visualisation platform like Streamlit, they make it straightforward to create a visually compelling business reporting app.

In this tutorial, we will focus on analysing customer feedback. This analysis and a good understanding of customer sentiment are essential to ensure products meet expectations and that quality concerns are addressed. Products perceived as poor quality simply won’t sell.

However, analysing customer feedback can involve processing a large volume of unstructured data—a task that can be daunting. This is where large language models excel, making it possible to extract meaningful insights from this data.

Our first step is to clarify the objective.

Imagine we have a list of customer feedback messages about products from an online retailer. Our goal is to generate an executive report that highlights top-performing products, identifies issues with underperforming ones, and includes compelling visualizations of customer sentiment.

We are going to use CrewAI to orchestrate a handful of AI agents that will structure the data, analyse it, create an insightful report and finally, build a Streamlit app to present the results.

It will look like the screenshot below.

First, we need to define precisely what we want to do.

I have a list of fictional customer feedback messages about products from an online retailer. The messages are in Markdown format. I want to generate an executive report that highlights products that perform well, identifies issues with those that perform badly and includes some interactive charts on customer sentiment. And I want all of this to be in an interactive web app.

So, let’s define the steps we need to take.

Defining the process

Here is the list of steps:

  1. Convert the raw customer data into a structured version - probably CSV. This should make further analysis easier, and it is something I can use directly in the resulting app.

  2. Calculate the sentiment of each customer message and add this to the structured data.

  3. Write a report that summarises the data in a table and includes sections that identify the best-performing products, those with quality problems and general issues that need to be addressed. The report should be formatted as Markdown.

  4. Create a Streamlit app that displays the report in one column and some interactive charts that show customer satisfaction in a second column.

  5. Make the structured customer feedback data available as a table in the app.
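To make step 3 concrete, here is a stdlib-only Python sketch (my own illustration, not part of the pipeline - the real version delegates this work to the LLM) of the kind of per-product aggregation the summary table needs. The column names mirror those we will ask the agents to produce; the rows are invented:

```python
from collections import defaultdict

# Invented sample rows mirroring the CSV columns the agents will produce
rows = [
    {"Product": "Blue Jacket", "Overall_Rating": 5, "Sentiment": "Positive"},
    {"Product": "Blue Jacket", "Overall_Rating": 2, "Sentiment": "Negative"},
    {"Product": "Red Scarf",   "Overall_Rating": 4, "Sentiment": "Positive"},
]

# One summary entry per product: all ratings plus sentiment counts
summary = defaultdict(lambda: {"ratings": [], "Positive": 0, "Neutral": 0, "Negative": 0})
for row in rows:
    entry = summary[row["Product"]]
    entry["ratings"].append(row["Overall_Rating"])
    entry[row["Sentiment"]] += 1

for product, entry in summary.items():
    avg = sum(entry["ratings"]) / len(entry["ratings"])
    print(f"{product}: avg {avg:.1f}, "
          f"{entry['Positive']} positive, {entry['Negative']} negative")
```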

As with my previous article, AI for BI: Building a Business Information Report with CrewAI and OpenAI, I used CrewAI and OpenAI to build this software. The reasons are the same: it's not that they are necessarily the best technical solutions for this problem, but they are convenient, easy to use, and they do the job.

I used a Jupyter Notebook to write the analysis code, so most of the code you see below can be written directly into a notebook cell in the order it is presented. (To create a standalone app, simply concatenate the cells.)

The reporter program in Streamlit is separate and is created by the AI. You will find all of the code and data in my GitHub repo.

As we will use the OpenAI API, you will need an API key, which should be accessible as an environment variable. If it is not yet stored there, you can run the following code block first to ensure it is.

# Omit this if your API key is already set as an environment variable

import os
os.environ["OPENAI_API_KEY"] = "your api key"

OpenAI will charge you for using their LLM. However, running the code below should cost no more than a few cents (but you MUST keep an eye on your usage in the OpenAI dashboard - things go wrong sometimes! It is also a good idea to set a monthly limit on your spending). The CrewAI software that we use is open source and costs nothing.

Let’s get on with the coding.

Writing the code

I’ve outlined the steps that we will go through above. These can be grouped into three stages: convert the raw data to CSV, write a report, and create a Streamlit app. We will define agents and tasks for each stage and save the results. So we’ll end up with three files: the structured data in a CSV file, the report in a Markdown file, and the Streamlit app itself.

There are two advantages to saving the intermediate files. The first is that, if we wish, we can run each of the three stages independently of the others. If we want to tweak the report’s structure, for example, there is no need to re-create the CSV file; we can use one we prepared earlier. And if the Streamlit app needs work, it can be done independently of the rest of the coding. The second advantage is that the resulting app can use these files.
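As a small illustration of running stages independently, here is a stdlib helper (my own addition, not part of the article's code) that guards a later stage so it only runs when the earlier stage's output files already exist:

```python
from pathlib import Path

def stage_ready(*files: str) -> bool:
    """Return True only if every intermediate file already exists."""
    return all(Path(f).exists() for f in files)

# e.g. only regenerate the report if stage one's CSV is in place:
# if stage_ready("./data/fb.csv"):
#     crew.kickoff(inputs={"csv_file": "./data/fb.csv"})
```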

This is how we shall proceed but, before we get to the main code, a bit of setting up is required.

We need to import stuff from CrewAI and set the LLM model. As I said before, CrewAI defaults to using the OpenAI API so, as long as we have an API key, all we need to do is set the model we want to use in the variable llm. I have also set the temperature to zero. This reduces the randomness for a more consistent response.

I’ve also set a flag, DEBUG. This is used to control the verbosity of the agent response. For debugging purposes, it’s a good idea to have a full response to see how the agents process requests. Set DEBUG to True for this.

from crewai import Agent, Task, Crew, LLM

llm = LLM(
    model="gpt-4o-mini",
    temperature=0.0
)

DEBUG = False

The next part of the code defines some file names. The first is the raw customer data file. The next two files will store intermediate data for the Streamlit app and the last one is the app itself.

# files

fb_raw =    "./data/clothes.md"

fb_csv =    "./data/fb.csv"
report_md = "./data/report.md"
st_app =    "./data/report.py"

Now we import the CrewAI tools required to read and write files and assign them to variables.

from crewai_tools import FileReadTool, FileWriterTool

file_read_tool = FileReadTool()
file_writer_tool = FileWriterTool()

That’s all the preamble code out of the way and we can start on the main program.

CrewAI apps have three essential components: agents, tasks and crews. An agent is the interface to the LLM; it has a particular purpose and can be provided with functions (tools) to help it. A task details something that needs to be done by an agent. Finally, a crew executes a list of tasks and agents and returns a result.

We will define three agents that map onto the stages outlined above. One to create and modify a CSV file, another to write a report in Markdown and a third to build the Streamlit app.

Each agent is provided with one or more tasks and these agent/task combinations will be managed by a crew.

CSV agent

The first agent’s task is to construct a data structure from the raw Markdown file of customer feedback, then calculate customer sentiment and add that to the data structure.

There is no strict distinction between the attributes of the agent and the task. The task has a description, and the agent has three descriptive attributes: role, goal and backstory. Each contributes towards a meaningful prompt that will be generated by CrewAI, but it is ultimately up to us how we use them.

My approach here is to be concise in the agent definition and leave a detailed description of what the agent has to do for the task.

Using this approach, I can use the same agent for more than one task, as we shall see.

Here’s my definition of the CSV agent.

csv_agent = Agent(
        role="Extract, process and record data",
        goal="""Extract data and organise as instructed. 
                The result MUST be valid CSV.""",
        backstory="""You are a data processing agent""",
        tools=[file_read_tool],
        llm=llm,
    )

And here is the first task.

create_CSV = Task(
    description=""" 
                Analyse the data provided in '{input_file}' - it is in
                Markdown format.
                Your output should be in CSV format. Respond without 
                using Markdown code fences.

                The data is about the range of items in an online shop.
                Following this is a set of messages from customers giving 
                feedback about the products that they have purchased.

                Your task is to:
                   Create a structured file in CSV format that records a 
                   list of all customer feedback messages.
                   Each item in the list should have its columns 
                   populated as follows.

                        "Product": insert the name of the item, 
                        "Overall_Rating": insert the rating as given by customer, 
                        "Issue": insert any issues identified - if no issue can be identified write 'None', 
                        "Review": insert the customer message 
                """,
    expected_output="A correctly formatted CSV data structure",
    agent=csv_agent,   
    tools=[file_read_tool]
)

The task is much more explicit and tells the agent exactly what to do. The result should be a CSV structure with four columns.

We also want a fifth column that holds the sentiment of each message. So as not to overcomplicate the prompts, I assign this to a second task.

add_sentiment = Task(
    description=""" 
                Analyse CSV data and calculate the sentiment of each 
                message in the 'Review' column. Add a new column to the 
                CSV that records that sentiment.
                Your output should be in CSV format. Respond without 
                using Markdown code fences.              
                """,
    expected_output="A correctly formatted CSV data file",
    agent=csv_agent,   
    output_file=fb_csv,
    tools=[file_read_tool]
)

Notice that in this task there is an extra attribute, output_file, which is set to the file name for the CSV file. This tells the task to place the result in that file automatically.

Having defined the tasks, we execute them.

crew = Crew(
    agents=[csv_agent],
    tasks=[create_CSV, add_sentiment],
    verbose=DEBUG,
)
result1 = crew.kickoff(inputs={'input_file': fb_raw})

In the crew, there is a list of agents and tasks that will be executed sequentially and when the crew is run, the input file name is set as a parameter. Notice that we have two tasks that use the same agent.

Once complete, we will have a CSV file with a structure something like this:
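The original screenshot is not reproduced here, but, based on the columns the two tasks define, the file should look roughly like this (the values are invented for illustration):

```csv
Product,Overall_Rating,Issue,Review,Sentiment
Blue Jacket,5,None,"Lovely jacket, fits perfectly",Positive
Red Scarf,2,Colour faded after one wash,"Disappointed - the colour ran",Negative
```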

Our next job is to write a report.

Report writing agent

We now have some nicely formatted data that has been enhanced with a customer sentiment assessment and saved in a file for later use.

Our next job is to attempt a deeper analysis of the data with a new agent.

Once again, the agent definition is short and the detail of the work to be done is in the task.

# Define agent
report_generator = Agent(
        role="Compile and present results",
        goal="""Deliver a polished, structured report on customer 
                satisfaction.
             """,
        backstory="""You are an agent that generates clear,
                     well-designed and professional reports""",
        tools=[file_read_tool],
        llm=llm,
    )

The task defines precisely what the report should look like and what it should contain.

create_report = Task(
    description="""
            Read the CSV data in '{csv_file}', create a summary report.
            The report must consolidate and summarize the customer 
            feedback, it should be in Markdown file format 
            (without ``` fences). 

            The report should be structured as follows:

                # Product review report

                ### Summary

                Insert a Markdown table with a row for every product.
                The table header should look like this:

                | Product | Average Rating | Number of reviews | Positive | Neutral | Negative |

                There should be a row for every product, like this:

                | insert the product name here 
                | insert the average of all the ratings for this product 
                | insert total number of reviews
                | insert number of positive reviews 
                | insert number of neutral reviews 
                | insert number of negative reviews |

                ### Insights

                #### Best performers

                insert a short report on the products with the best reviews

                #### Underperformers

                insert a short report on the products that are underperforming

                #### Issues

                insert a short report on what steps need to be taken to improve products and sales
    """,
    expected_output="""A Markdown report file""",
    agent=report_generator,
    output_file = report_md,
    tools=[file_read_tool]
)

CrewAI does a good job of producing a report, and the result is saved in report_md by executing the following crew.

crew = Crew(
    agents=[report_generator],
    tasks=[create_report],
    verbose=DEBUG,
)
result2 = crew.kickoff(inputs={'input_file':fb_raw, 'csv_file': fb_csv})

You can see the full report in my GitHub repo but below is a screenshot of the report part of the app so you can see the sort of thing that is produced.
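In case the screenshot does not render here, an invented fragment in the format the task specifies gives the flavour of the output (all numbers are made up):

```markdown
# Product review report

### Summary

| Product | Average Rating | Number of reviews | Positive | Neutral | Negative |
|---|---|---|---|---|---|
| Blue Jacket | 4.5 | 10 | 8 | 1 | 1 |
| Red Scarf | 2.3 | 6 | 1 | 1 | 4 |

### Insights

#### Best performers

The Blue Jacket stands out, with 8 of its 10 reviews positive...
```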

Next, we want to generate the app.

Streamlit app

I’ve tried to make the app as general as possible so that it will cope with different sets of feedback messages about different products.

Consequently, you only need to run this part of the code once. You’ll then have a Streamlit app that will read the CSV file and Markdown report that we generated earlier or, indeed, ones we create later.

I tackle the app generation in two parts, using two tasks but the same Streamlit-generating agent.

Here is the agent definition.

# Define agent
app_generator = Agent(
        role="Create or modify a Streamlit program",
        goal="""To deliver a valid Streamlit program in Python, with
                meaningful variable and function names.
                Ensure that the output strictly conforms to Python syntax.
                Do not include ``` fencing.
           """,
        backstory="""You are an agent that generates clear, well-designed
                     Streamlit programs""",
        tools=[file_writer_tool, file_read_tool],
        llm=llm,
    )

Now for the first task. It sets up the basic app and loads the CSV data as a Pandas dataframe. I had to tell it about a (now fairly old) update to the Streamlit API: an earlier version used the decorator st.cache, which is deprecated, and we should use st.cache_data to cache the CSV data instead. I was quite surprised at this, as the new decorator has been around for a while now. However, the problem was easily solved, as you can see.

create_app = Task(
    description="""
            Create a Streamlit app as follows:
            - set the display to wide format
            - include the pandas and plotly express libraries
            - create a pandas dataframe called "df" from the csv file 
              {csv_file} include all fields
            Note that st.cache is deprecated, use st.cache_data instead.

            Don't save the program to a file.

    """,
    expected_output="""A valid and syntactically correct Python program""",
    agent=app_generator,
    tools=[file_read_tool]
)

Now we have a basic program. In the next task, I give explicit instructions on how to add to this program to display the data and construct the charts.

add_content = Task(
    description="""
            Modify the Streamlit code as follows:
            - create two tabs, the first named "report_tab" the second 
              named "messages_tab"
            - in "messages_tab" load the dataframe, "df", in a st.table 
            - in report_tab create two columns of equal width called 
              "report_column" and "chart_column"
            - in the column "report_column" read the report from 
              {report_file} and display it in st.markdown
            - in the "chart_column" draw a bar chart of 'Product' over 
              'Overall_Rating'
            - in the column "chart_column" draw a histogram of 'Sentiment' 
              with the bars colored by 'Product'

    """,
    expected_output="A valid and syntactically correct Python program",
    output_file = st_app,
    agent=app_generator,
    tools=[file_read_tool]   
)         

As before, we run both tasks with a single agent.

crew = Crew(
    agents=[app_generator],
    tasks=[create_app, add_content],
    verbose=DEBUG,
)
result3 = crew.kickoff(inputs={'csv_file': fb_csv ,
                               'report_file': report_md})

That’s the final stage done. Run the Streamlit app; the result will be the screenshot we saw earlier. Note that with the way I have organised the folders you will need to run Streamlit from the parent folder thus:

`streamlit run data/report.py`

Final thoughts on CrewAI

On the whole, I was quite happy with the result. CrewAI is an easy framework to get to grips with although it has a few foibles. There are a lot of descriptive strings that don’t feature in rival products and, while I’m sure that they make sense to the good folk at CrewAI, I’m not entirely convinced that they are all necessary.

However, having said that, I do like the way we can separate a task from an agent, giving the agent a broad description of its capabilities and providing the details of what needs to be done in the task. The other framework I’ve been looking at recently (Swarm) doesn’t allow that.

I encountered a problem with CrewAI that I may have solved with this feature.

I was getting the following error when running one of the crews:

Tool Output:

Error: the Action Input is not a valid key, value dictionary.
 Error parsing LLM output, agent will retry: I did it wrong. 
Invalid Format: I missed the 'Action:' after 'Thought:'. 
I will do right next, and don't use a tool I have already used.

This is a little difficult to interpret.

This seemed to be something to do with the complexity of the task, because when I split the work into two tasks the problem disappeared.

However, I’m not entirely convinced that my complexity diagnosis was correct as I have been in touch with other developers who have found different solutions to what appears to be the same problem (lowering the temperature, for example - that didn’t work for me). At the time of writing, this is a live issue on the CrewAI GitHub repo.

Update: that bug is intermittent and has happened again with the code above. You should keep an eye on the output when running this code; it could loop infinitely and, if not stopped, you might end up with a large bill.

Final thoughts on the app

I was very explicit in defining the Streamlit app. You could argue that I was so explicit that I might as well have written the app myself. It’s a simple program, so there is some merit in that.

But I wanted to produce an end-to-end program that built the final product from the raw data. Perhaps I could have been a little vaguer with the program definition. Perhaps that would have resulted in a better program! But apart from the cache decorator problem, the AI produced no errors (and it types much quicker than me!) so I’m not complaining.

The final question is: why Streamlit? While it is a perfectly good framework for this sort of app, the real answer is because GPT-4o mini has the programming knowledge to use it!

Asking the LLM to use a newer, or less well-known, framework such as Mesop or Taipy would have produced poorer results. The LLM ‘knows’ less about these products and is more likely to make mistakes and hallucinate.


The code and data for this article can be found in this GitHub repo in the AI4BI-fb folder. The resulting charts and report are in the same folder.

All illustrations are by me, the author, unless otherwise stated.

Thanks for reading. If you liked this article please consider subscribing to my newsletter and you'll be notified of new content.
Most of my stuff is also on Medium (paywalled).
If the article was useful, please consider a contribution.