How I Analyzed My Finances With A Local AI
A personalized and secure approach to analyzing financial data, providing tailored insights and recommendations
As the spring season rolls in, bringing with it the tradition of house decluttering, I decided to use my Easter break to get a grip on my finances, cut costs, and formulate a better financial plan.
The idea of asking ChatGPT or sharing my confidential data with OpenAI, Google, or any other AI solution provider didn’t sit well with me. No way! So, I embarked on a project over my Easter break to create a tool I named “My Local AI Finance Insighter.”
True to its name, it runs entirely locally, can operate without an internet connection, and keeps your financial data secure.
My Local AI Finance Insighter first takes in my transaction data, analyzes my financial situation — including income and spending — and then offers personalized savings tips and suggests investment plans tailored to my lifestyle and financial goals. Check out the demo here.
Interested in having this tool running on your laptop? Follow this tutorial, and I’ll guide you step by step through the process of building your own Local AI Finance Insighter.
Disclaimer: The information provided here is solely for educational purposes and is not intended to be personal financial, investment, or other advice. All opinions are my own.
Table of Contents:
- An Overview of the Application, Its Design, and Development Stack: Overview of the Application and Architecture, Ollama, LangChain, Streamlit
- Pre-requisite and Dataset Preparation
- Building the Data Upload Interface and Transaction Data Processing Modules
- Building the Personal Finance Dashboard and Personalized Report Generation
- Closing thoughts
An Overview of the Application, Its Design, and Development Stack
Overview of the Application and Architecture
The application is built on Streamlit for its user interface and, at its core, leverages LangChain to interact with open-source LLMs served locally by Ollama.
Within this application, we’re tapping into some of the most advanced open-source models available, such as Mistral, as well as LLaVA to unlock multimodal features.
We will employ prompt engineering to steer the LLMs’ outputs toward a professional finance planner’s opinion on savings and investment recommendations.
Here’s what we will achieve in this project:
- Transaction Data Processing: Using LLM to process and categorize transaction data,
- Quantitative Analysis and Visuals: We will calculate your total income, expenses, and net savings rate from your transaction details. There will also be graphical visualizations for income and expense trend analysis, spending analysis, and much more.
- Interpreting Data With Multimodality: Using the multimodality feature to understand the graphs, uncover trends, and highlight key takeaways.
- Personalized Finance Plan: Using LLMs to generate personalized savings tips and investment recommendations based on your unique lifestyle and financial goals.
But before diving into the technical coding, here is the line-up of tools I used in this project; if you’re already familiar with them, feel free to skip ahead.
Ollama
Ollama is, for me, the best and easiest way to get up and running with open-source LLMs. It supports, among others, the most capable LLMs such as Llama 2, Mistral, and LLaVA; you can find the list of available models on ollama.ai/library. It’s available on macOS, Windows, and Linux.
LangChain
LangChain is an open-source framework built around LLMs. It dramatically simplifies the design and development of GenAI applications, including chatbots, summarization tools, and beyond.
The core idea of the library is to “chain” together different components to simplify complex AI tasks and create more advanced use cases around LLMs. It offers a great integration with Ollama served open source models.
Streamlit
Streamlit is an open-source framework that enables developers to create and share data applications fast. It allows the development of web-based user interfaces with only a few lines of Python code.
This level of simplicity makes Streamlit a perfect match for rapid prototyping and complex data dashboard projects.
Pre-requisite and Dataset Preparation
Installing Ollama
First, go to the Ollama download page, pick the version that matches your operating system, then download and install it.
With Ollama installed, open your terminal and enter the following commands to download the models and run them locally on your machine. In this project, we’ll use Mistral and LLaVA.
ollama serve
ollama pull mistral
ollama pull llava
ollama run mistral
ollama run llava
Preparing Your Dataset
For demo purposes, I’ll replace my personal transaction data with synthetic data, so I prompted ChatGPT to generate 1,000 transactions for me. Here’s the prompt you can use for your own testing if needed.
Generate a realistic dataset of 1,000 financial transactions for a young professional working in finance, residing in Europe, covering the period from January 2022 to December 2023. Ensure a balanced representation of income and expenses across various categories typical for this demographic. Include the following four columns:
Date: Transaction date (format: YYYY-MM-DD)
Name/Description: A unique and detailed description of each transaction (e.g., "Salary deposit," "Monthly rent payment," "Restaurant dinner with friends")
Expense/Income: Clearly indicate whether the transaction is an expense (e.g., "Expense") or income (e.g., "Income")
Amount(EUR): Transaction amount in Euros
The generated dataset should consist of four columns:
- Date: Date of the transaction
- Name/Description: A short description of the transaction. This field will be used by the LLM to categorize the transaction based on its primary nature.
- Expense/Income: Indicates if it’s an income or expense
- Amount(EUR): Amount in Euros
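As a quick sanity check of the expected format, here is a minimal sketch that loads a couple of hypothetical rows in this four-column shape with pandas (the sample values are made up):

```python
import pandas as pd
from io import StringIO

# A couple of hypothetical rows in the expected four-column format
sample_csv = """Date,Name/Description,Expense/Income,Amount(EUR)
2022-01-31,Salary deposit,Income,3500.00
2022-02-01,Monthly rent payment,Expense,1200.00
"""
df = pd.read_csv(StringIO(sample_csv))
print(df.columns.tolist())
# → ['Date', 'Name/Description', 'Expense/Income', 'Amount(EUR)']
```

If your bank export uses different column names, rename them to match before running the rest of the tutorial.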
Building the Data Upload Interface and Transaction Data Processing Modules
Installing dependencies
Now we need to install the dependencies for LangChain and Streamlit.
pip install langchain-community
pip install streamlit
Setting Up Data Upload Module
Let’s create a new Python file, “Upload.py”, and add the following code. Here we’ll:
- Import the necessary libraries
- Initialise the LLM model which will be used for transaction categorization
- Define Categories: Create a list of categories that cover various types of income and expenses. This will help the LLM categorize each transaction accurately.
import streamlit as st
import pandas as pd
from langchain_community.llms import Ollama
llm = Ollama(model="mistral")
categories = [
"Salary/Wages", "Investment Income", "Freelance Income", "Business Revenue","Rental Income","Housing", "Utilities","Groceries","Transportation","Insurance","Healthcare","Entertainment","Personal Care","Education","Savings/Investments","Loans/Debt","Taxes","Childcare","Gifts/Donations","Dining Out","Travel","Shopping","Subscriptions","Pet Care", "Home Improvement","Clothing","Tech/Gadgets", "Fitness/Sports",
]
categories_string = ",".join(categories)
Building the Transaction Categorization Function
- Categorize Transactions: Write a categorize_transactions function that takes in transaction names. We’ll use prompt engineering to guide the LLM’s output: we include the transaction names in a prompt and ask the LLM to categorize them according to our predefined list of categories. After receiving the LLM’s output, we organize and transform the data into a structured pandas DataFrame.
def categorize_transactions(transaction_names, llm):
    prompt = f"""Add an appropriate category to the following expenses.
Remember: the category should only be one of the following, and choose the single category from the list that is most relevant based on the transaction's primary purpose or nature: {categories_string}.\n
The output format should always be: transaction name - category. For example: Spotify #2 - Entertainment, Basic Fit Amsterdam Nld #3 - Fitness/Sports \n
Here are the transactions to be categorized: {transaction_names} \n"""
    print(prompt)
    filtered_response = []
    # Retry if the LLM output is not consistent
    while len(filtered_response) < 2:
        response = llm.invoke(prompt).split("\n")
        print(response)
        # Remove items that do not contain "transaction - category" pairs
        filtered_response = [item for item in response if "-" in item]
        print(filtered_response)
    # Put the results in a DataFrame
    categories_df = pd.DataFrame({"Transaction vs category": filtered_response})
    size_dif = len(categories_df) - len(transaction_names.split(","))
    if size_dif >= 0:
        categories_df["Transaction"] = transaction_names.split(",") + [None] * size_dif
    else:
        categories_df["Transaction"] = transaction_names.split(",")[: len(categories_df)]
    categories_df["Category"] = categories_df["Transaction vs category"].str.split("-", expand=True)[1]
    return categories_df
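To see what the parsing step does in isolation, here is a small standalone sketch using two made-up response lines (no model call involved):

```python
import pandas as pd

# Hypothetical LLM output lines, not real model responses
filtered_response = [
    "Spotify #2 - Entertainment",
    "Basic Fit Amsterdam Nld #3 - Fitness/Sports",
]
categories_df = pd.DataFrame({"Transaction vs category": filtered_response})
# Split on "-" and keep the part after it, stripping stray whitespace
categories_df["Category"] = (
    categories_df["Transaction vs category"].str.split("-", expand=True)[1].str.strip()
)
print(categories_df["Category"].tolist())
# → ['Entertainment', 'Fitness/Sports']
```

Note that splitting on "-" keeps only the first segment after the dash, so a category that itself contains a hyphen (like "Category-wise") would be truncated; our predefined list avoids hyphens for that reason.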
2. Process Data Function: Create a process_data function that processes the uploaded data file, categorizes transactions using categorize_transactions, and merges the categorized data back into a global DataFrame used for further analysis.
def hop(start, stop, step):
    for i in range(start, stop, step):
        yield i
    yield stop

def process_data(df: pd.DataFrame):
    unique_transactions = df["Name/Description"].unique()
    index_list = list(hop(0, len(unique_transactions), 30))
    # Initialise the categories_df_all DataFrame
    categories_df_all = pd.DataFrame()
    # Loop through the index_list, categorizing 30 unique transactions at a time
    for i in range(0, len(index_list) - 1):
        print(f"Looping: {i}")
        transaction_names = unique_transactions[index_list[i] : index_list[i + 1]]
        transaction_names = ",".join(transaction_names)
        categories_df = categorize_transactions(transaction_names, llm)
        categories_df_all = pd.concat(
            [categories_df_all, categories_df], ignore_index=True
        )
    # Further clean the data:
    # Drop NA values
    categories_df_all = categories_df_all.dropna()
    # Remove the numbering, e.g. "1. ", from the Transaction column
    categories_df_all["Transaction"] = categories_df_all["Transaction"].str.replace(
        r"\d+\.\s?", "", regex=True
    ).str.strip()
    new_df = pd.merge(
        df,
        categories_df_all,
        left_on="Name/Description",
        right_on="Transaction",
        how="left",
    )
    new_df.to_csv(f"data/{uploaded_file.name}_categorized.csv", index=False)
    return new_df
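The hop helper yields the batch boundaries used to slice the unique transaction names into chunks of 30 before each LLM call. A quick standalone check with a smaller, made-up total:

```python
def hop(start, stop, step):
    for i in range(start, stop, step):
        yield i
    yield stop

# 70 hypothetical unique transactions, processed in batches of 30
print(list(hop(0, 70, 30)))
# → [0, 30, 60, 70]
```

The trailing yield ensures the final partial batch (here, transactions 60 to 70) is not dropped.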
Creating the Streamlit Web App
- Set Up Your Streamlit Interface: Begin by setting the title for the web app and adding a file uploader widget. This allows users to upload their financial data.
st.title("📝 Load your financial data here")
uploaded_file = st.file_uploader("Upload your financial data", type=("txt", "csv", "pdf"))
2. Process Uploaded Data: Once a file is uploaded, read it into a pandas DataFrame and call the process_data function to categorize transactions.
if uploaded_file:
    with st.spinner("Processing data..."):
        file_details = {"FileName": uploaded_file.name, "FileType": uploaded_file.type}
        df = pd.read_csv(uploaded_file)
        df = process_data(df)
    st.markdown("Data processed : OK")
3. Run the Streamlit app with streamlit run Upload.py; you should see an interface like this:
Building the Personal Finance Dashboard and Personalized Report Generation
Now that all our transactions have been neatly sorted by the LLM Mistral, we’re ready to process the financial analysis, which consists of four steps:
- Quantitative Analysis: To get a clear picture of your financial health, we’ll first calculate annual income and expenses, identify where most of the money is going, and more.
- Visual Presentation: We’ll chart the transaction data to spot trends and patterns.
- Qualitative Analysis by LLM: We’ll take the key finance indicators and insights we’ve gathered (including those neat graphs) and feed them back to Mistral. With a carefully crafted prompt, we’ll ask it for a qualitative analysis of your financial situation.
- Personalized Savings Tips and Investment Recommendations: Based on the insights from both the quantitative and qualitative analyses, Mistral will then generate personalized advice for you.
Importing Libraries and Creating Financial Analysis Function
First, let’s create a new Python file, “Finance_Dashboard.py”, import the necessary Python libraries, and initialize the Ollama models.
import os
import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
from langchain_community.llms import Ollama
llm_llava = Ollama(model="llava")
llm = Ollama(model="mistral")
Then we’ll create a function called financial_analysis to analyze your financial data. It calculates yearly and monthly income and expenses and the savings rate, and identifies the top expense categories.
def financial_analysis(data: pd.DataFrame):
    key_figures = {}
    # Calculate average yearly total income and total expenses
    yearly_income = data.loc[data['Expense/Income'] == 'Income'].groupby('Year')['Amount(EUR)'].sum().mean()
    yearly_expenses = data.loc[data['Expense/Income'] == 'Expense'].groupby('Year')['Amount(EUR)'].sum().mean()
    # Identify the top expense categories
    top_expenses = data.loc[data['Expense/Income'] == 'Expense'].groupby('Category')['Amount(EUR)'].sum().sort_values(ascending=False)
    # Calculate average monthly income and expenses
    monthly_income = data.loc[data['Expense/Income'] == 'Income'].groupby(data['Date'].dt.to_period('M'))['Amount(EUR)'].sum().mean()
    monthly_expenses = data.loc[data['Expense/Income'] == 'Expense'].groupby(data['Date'].dt.to_period('M'))['Amount(EUR)'].sum().mean()
    # Determine the savings rate
    savings = yearly_income - yearly_expenses
    savings_rate = (savings / yearly_income) * 100 if yearly_income > 0 else 0
    key_figures['Average Annual Income'] = f"€{yearly_income:,.2f}"
    key_figures['Average Annual Expenses'] = f"€{yearly_expenses:,.2f}"
    key_figures['Annual Savings Rate'] = f"{savings_rate:.2f}%"
    key_figures['Top Expense Categories'] = {
        category: f"€{amount:,.2f}" for category, amount in top_expenses.head().items()
    }
    key_figures['Average Monthly Income'] = f"€{monthly_income:,.2f}"
    key_figures['Average Monthly Expenses'] = f"€{monthly_expenses:,.2f}"
    return key_figures
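To check the savings-rate arithmetic in isolation, here is the same groupby logic on a tiny made-up sample (one year, two months of data):

```python
import pandas as pd

# Hypothetical sample: €6,000 income and €2,200 expenses in 2022
data = pd.DataFrame({
    "Date": pd.to_datetime(["2022-01-31", "2022-02-28", "2022-01-15", "2022-02-15"]),
    "Expense/Income": ["Income", "Income", "Expense", "Expense"],
    "Amount(EUR)": [3000.0, 3000.0, 1000.0, 1200.0],
})
data["Year"] = data["Date"].dt.year
yearly_income = data.loc[data["Expense/Income"] == "Income"].groupby("Year")["Amount(EUR)"].sum().mean()
yearly_expenses = data.loc[data["Expense/Income"] == "Expense"].groupby("Year")["Amount(EUR)"].sum().mean()
savings_rate = (yearly_income - yearly_expenses) / yearly_income * 100
print(f"{savings_rate:.2f}%")
# → 63.33%
```

With €6,000 in and €2,200 out, savings of €3,800 against €6,000 of income gives a 63.33% rate, matching what the function would report.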
Creating Plotting Functions
Here we’ll implement functions to visualize your financial data: four plots covering income vs. expenses over time, the monthly savings rate, income sources, and category-wise spending.
def plot_income_vs_expense_over_time(df):
    st.markdown("1. Income vs Expense Over time")
    income_expense_summary = (
        df.groupby(["YearMonth", "Expense/Income"])["Amount(EUR)"]
        .sum()
        .unstack()
        .fillna(0)
    )
    income_expense_summary.plot(kind="bar", figsize=(10, 8))
    plt.title("Income vs Expenses Over Time")
    plt.ylabel("Amount (EUR)")
    plt.xlabel("Month")
    plt.savefig("data/income_vs_expense_over_time.png", bbox_inches="tight")
    st.pyplot(plt)

def plot_saving_rate_trend(data: pd.DataFrame):
    st.markdown("2. Monthly Saving Rate Trend")
    monthly_data = data.groupby(['YearMonth', 'Expense/Income'])['Amount(EUR)'].sum().unstack().fillna(0)
    monthly_data['Savings Rate'] = (monthly_data['Income'] - monthly_data['Expense']) / monthly_data['Income'] * 100
    fig, ax = plt.subplots()
    monthly_data['Savings Rate'].plot(ax=ax)
    ax.set_xlabel('Month')
    ax.set_ylabel('Savings Rate (%)')
    plt.savefig("data/saving_rate_over_time.png", bbox_inches="tight")
    st.pyplot(fig)

def plot_income_source_analysis(data: pd.DataFrame):
    st.markdown("3. Income Sources Analysis")
    income_sources = data[data['Expense/Income'] == 'Income'].groupby('Category')['Amount(EUR)'].sum()
    income_sources.plot(kind="pie", figsize=(10, 8), autopct="%1.1f%%", startangle=140)
    plt.title("Income Sources Analysis")
    plt.ylabel("")  # Hide the y-label as it's unnecessary for pie charts
    plt.savefig("data/income_source_analysis.png", bbox_inches="tight")
    st.pyplot(plt)

def plot_category_wise_spending_analysis(data: pd.DataFrame):
    st.markdown("4. Category-wise Spending Analysis")
    expenses_by_category = data[data['Expense/Income'] == 'Expense'].groupby('Category')['Amount(EUR)'].sum()
    expenses_by_category.plot(kind="pie", figsize=(10, 8), autopct="%1.1f%%", startangle=140)
    plt.title("Expenses Analysis")
    plt.ylabel("")  # Hide the y-label as it's unnecessary for pie charts
    plt.savefig("data/expense_category_analysis.png", bbox_inches="tight")
    st.pyplot(plt)
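The bar chart relies on groupby/unstack to pivot the data into one column per transaction type. Here is what that pivot produces on a hypothetical three-row sample:

```python
import pandas as pd

# Made-up rows to show the pivoted shape the bar chart consumes
df = pd.DataFrame({
    "YearMonth": ["2022-01", "2022-01", "2022-02"],
    "Expense/Income": ["Income", "Expense", "Income"],
    "Amount(EUR)": [3000.0, 1000.0, 3100.0],
})
summary = df.groupby(["YearMonth", "Expense/Income"])["Amount(EUR)"].sum().unstack().fillna(0)
print(summary)
# One row per month, with "Expense" and "Income" columns; the missing
# 2022-02 expense becomes 0 thanks to fillna(0)
```

Without fillna(0), a month with no recorded expenses would plot as NaN rather than as a zero-height bar.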
Loading Your Financial Data
Read your financial data from CSV files, process the data, and prepare it for analysis.
total_df = pd.DataFrame()
for root, dirs, files in os.walk("data"):
    for file in files:
        if file.endswith(".csv"):
            df = pd.read_csv(os.path.join(root, file))
            total_df = pd.concat([total_df, df], ignore_index=True)
total_df["Date"] = pd.to_datetime(total_df["Date"])
total_df["YearMonth"] = total_df["Date"].dt.to_period("M")
total_df["Year"] = total_df["Date"].dt.year
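The YearMonth column is a pandas monthly Period, which keeps the groupbys in chronological order. A quick illustration with made-up dates:

```python
import pandas as pd

# Hypothetical dates to show the derived YearMonth and Year columns
df = pd.DataFrame({"Date": ["2022-01-15", "2023-12-01"]})
df["Date"] = pd.to_datetime(df["Date"])
df["YearMonth"] = df["Date"].dt.to_period("M")
df["Year"] = df["Date"].dt.year
print(df["YearMonth"].astype(str).tolist())  # → ['2022-01', '2023-12']
print(df["Year"].tolist())  # → [2022, 2023]
```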
Setting Up Your Streamlit Dashboard
Use Streamlit to create a title for your dashboard, display the analysis results, and integrate the plotting functions.
st.title("My Local AI Finance Insighter")
st.markdown(
    "**A personalized and secure approach to analyzing financial data, providing insights and recommendations tailored to individual needs.**"
)
analysis_results = financial_analysis(total_df)
results_str = ""
# Loop through the results dictionary and flatten it into a string
for key, value in analysis_results.items():
    if isinstance(value, dict):
        # If the value is another dictionary, iterate over its sub-keys and values
        sub_results = ', '.join([f"{sub_key}: {sub_value}" for sub_key, sub_value in value.items()])
        results_str += f"{key}: {sub_results}\n"
    else:
        # For direct key-value pairs, simply concatenate
        results_str += f"{key}: {value}\n"
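Here is what the flattening loop produces for a small, made-up analysis_results dictionary (the figures are illustrative only):

```python
# Hypothetical analysis results, including one nested dictionary
analysis_results = {
    "Average Annual Income": "€45,000.00",
    "Top Expense Categories": {"Housing": "€12,000.00", "Travel": "€3,500.00"},
}
results_str = ""
for key, value in analysis_results.items():
    if isinstance(value, dict):
        # Nested dictionaries are joined into a single comma-separated line
        sub_results = ", ".join(f"{k}: {v}" for k, v in value.items())
        results_str += f"{key}: {sub_results}\n"
    else:
        results_str += f"{key}: {value}\n"
print(results_str)
# → Average Annual Income: €45,000.00
#   Top Expense Categories: Housing: €12,000.00, Travel: €3,500.00
```

This flat string is what later gets embedded into the Mistral prompt, so keeping it plain text (rather than a dict) makes the prompt easy to read for the model.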
st.subheader("Yearly Figures")
col1, col2, col3 = st.columns(3)
col1.metric(label="Average Annual Income", value=analysis_results['Average Annual Income'])
col2.metric(label="Average Annual Expenses", value=analysis_results['Average Annual Expenses'])
col3.metric(label="Savings Rate", value=analysis_results['Annual Savings Rate'])
# Display average monthly figures
st.subheader("Average Monthly Figures")
col1, col2 = st.columns(2)
col1.metric(label="Average Monthly Income", value=analysis_results['Average Monthly Income'])
col2.metric(label="Average Monthly Expenses", value=analysis_results['Average Monthly Expenses'])
# Display top expense categories in a table
st.subheader("Top Expense Categories")
expenses_df = pd.DataFrame(list(analysis_results['Top Expense Categories'].items()), columns=['Category', 'Amount'])
st.table(expenses_df)
with st.container():
    col1, col2 = st.columns(2)
    with col1:
        plot_income_vs_expense_over_time(total_df)
    with col2:
        plot_saving_rate_trend(total_df)
with st.container():
    col3, col4 = st.columns(2)
    with col3:
        plot_income_source_analysis(total_df)
    with col4:
        plot_category_wise_spending_analysis(total_df)
When you run the Streamlit app, you should see a dashboard like this:
Generating Personalized Finance Analysis and Recommendations
Finally, we’ll pass the previously generated quantitative and qualitative analysis to Mistral so it can create personalized advice for you!
with st.spinner("Generating reports ..."):
    total_response = ""
    for root, dirs, files in os.walk("data"):
        for file in files:
            if file.endswith(".png"):
                response = llm_llava.invoke(
                    f"Act as an expert finance planner and analyse the image: {os.path.join(root, file)}. You should give your insights extracted from the image and the key figures you see in it."
                )
                total_response += response
    total_response += f"\nHere are the user's key financial figures: {results_str}"
    st.write("---------------")
    st.markdown("**Finance analysis and budget planner**")
    summary = llm.invoke(
        f"You are a helpful and expert finance planner. Based on the following analysis: {total_response}, make a summary of the financial status of the user and suggest tips on savings. Highlight categories where the user can potentially reduce expenses and suggest an ideal savings rate based on their income and goals. Tailor these suggestions to fit the user's lifestyle and financial objectives. Use a friendly tone."
    )
    st.write(summary)
    st.write("---------------")
    st.markdown("**Investment tips**")
    if "user_answers_str" in st.session_state:
        user_investment_answer = st.session_state.user_answers_str
    else:
        user_investment_answer = ""
    investment_tips = llm.invoke(
        f"You are a helpful and expert finance planner. Based on the user's risk tolerance and investment goals, provide an overview of suitable investment options. Discuss the basics of stocks, bonds, mutual funds, ETFs, and other investment vehicles that align with their profile. Explain the importance of diversification and the role of risk management in investing. Offer to guide them through setting up a diversified investment portfolio, suggesting steps to get started based on their current financial situation. Use a friendly tone. Below are the user's investment objective and risk tolerance: {user_investment_answer}"
    )
    st.write(investment_tips)
The report is well-organized and seems coherent, but it’s longer than I’d prefer. To get a more concise output, we could refine the prompt further.
Closing Thoughts
And so, this is the Local AI-Powered Finance Insighter that we built together to better understand our financial health!
We’ve tapped into the power of generative AI, tailoring it to provide highly personalized advice. Doing it entirely locally is the standout feature of this dashboard: your financial data remains securely on your laptop.
Going Beyond
I was initially inspired to start this project by a video from Thu Vu on data analytics, in which she explained her use of a local LLM for categorizing transaction data. However, I discovered that LLMs, due to their creative nature, can struggle with consistent categorization, especially for large datasets! I had to tweak the code and prompts to get good results.
On the other hand, using generative AI and LLMs to produce insightful analysis, reports, and personalized tips proved very effective.
Another interesting feature could be adding a chatbot interface, allowing users to ask finance-related questions directly.
Hope you enjoyed this project! If you discover a better way to consistently categorize transactions, let me know in the comments!
Before you go! 🦸🏻♀️
If you liked my story and you want to support me:
- Throw some Medium love 💕(claps, comments and highlights), your support means the world to me.👏
- Follow me on Medium and subscribe to get my latest article🫶