avatarTyler Owen

Summary

This context discusses the implementation of tracing in a Python FastAPI application using OpenTelemetry for observability.

Abstract

The text describes the process of adding tracing to a Python FastAPI application using OpenTelemetry, which is a CNCF incubating project for observability. The author covers the installation of required dependencies, enabling auto-instrumentation, and adding logging to include trace and span IDs in the logs. They also discuss the use of SQLAlchemy instrumentation for database interactions and the creation of additional spans for better performance tracking. The author concludes by summarizing their experience with tracing and encourages readers to explore the topic further.

Bullet points

  • The context discusses the implementation of tracing in a Python FastAPI application using OpenTelemetry.
  • OpenTelemetry is a CNCF incubating project that supports various languages and aims to standardize observability.
  • The author uses a Python FastAPI application with SQLAlchemy for database access.
  • Required dependencies include opentelemetry-exporter-otlp, opentelemetry-instrumentation-fastapi, opentelemetry-instrumentation-logging, and opentelemetry-instrumentation-sqlalchemy.
  • Auto-instrumentation can be enabled by adding the necessary imports and configurations.
  • Adding logging with trace and span IDs is crucial for proper tracing.
  • SQLAlchemy instrumentation is added to track database interactions.
  • Additional spans can be created for better performance tracking.
  • The author encourages readers to explore tracing further.

Observability, Part 2 Tracing

Node Graph View from Grafana Tempo

My plan with this series initially was to cover many of the exporters that we are using to collect Prometheus metrics, which is where Part 1 started. However, I am going to pivot a little based on what I am currently playing with. I will get back to that mission shortly. However this stuff is too cool to wait.

For those that have read my previous articles, you know that I have been on an Observability mission lately. We have gone all in with the Grafana Stack (Grafana, Prometheus, Loki) and now are adding Tempo to the Mix. This will cover off all three areas of Observability that we are trying to tackle: Logs (Loki), Metrics (Prometheus) and now Tracing (Tempo).

I covered our usage of Loki briefly and will venture further down that road in the future. Part 1 and the Gitlab Prometheus exporter is shows a little of our Prometheus usage. Now to the final leg of the tripod, Tempo.

Tracing, A Quick Primer

For those of you that might not be familiar with Tracing or have heard of it but don’t know the jargon, this section is for you in an EIL5 way.

Technologies

There are multiple technologies out there that let you accomplish tracing from Zipkin to Jaeger to OpenCensus and may paid options. Many of these open source technologies have all decided or been absorbed by OpenTelemetry. OpenTelemetry (Otel for short) is an incubating project in the CNCF and has a lot of active development currently. Otel supports a very large variety of languages out of the box. Each language has a varying support for Logs, Metrics, Tracing. They are trying to standardize all 3 into common packages and support to make observability easy!

Terms

Trace — A trace is the start to finish activity that your application is doing. An example of this could be an API endpoint that is accessed.

Span — A span is part of a trace that could be an internal function or database query or other discrete activity that your program does to complete the action (aka trace)

Auto-Instrumentation — Otel provides Auto-Instrumentation of many different areas of code, but it really depends on the language that you are using.

Getting Started with a Python App

I have a Python FastAPI application that I used for this write-up. This application is an API that has a number of endpoints and uses SQLAlchemy for Database access.

I am running all this through a docker-compose.yaml file that has Postgres, Grafana, Tempo, Loki and my app in it.

Adding the Dependencies

You will need to install the OpenTelemetry modules on your machine, venv, container, etc. I personally use pip to install these and let it handle all of the other requirements, but these will get you going and will bring in the API and SDK for OpenTelemetry

  • opentelemetry-exporter-otlp
  • opentelemetry-instrumentation-fastapi
  • opentelemetry-instrumentation-logging
  • opentelemetry-instrumentation-sqlalchemy

Once those are installed you can add them to your code:

Enabling Auto-Instrumentation

This will get you a trace ID and a span or two but not as much detail as I had personally hoped for, but I will cover that in a section below!

Notice, that I am sending to the local Tempo in my docker-compose, so you may need to adjust as necessary for your Tempo installation or OtelCollector if you go that route.

The missing link for me personally

I thought hey, this is great it is all done, but it wasn’t for me personally. I never saw any traceIDs in my logs. The missing piece for me was making sure to “include logging” and then inside of my APIs, calling logging.info(“some text”).

Since we are using LoggingIntrumentation from Otel, it will add the Trace IDs and Span IDs to the log lines as seen below. The first line is the logging call and the 2nd line is the output of that.

logging.info(“Calling Config API”)

2022–03–29 15:01:27,658 INFO [root] [relationships.py:366] [trace_id=041de13035424b97cbd2dc81bbc2dbc5 span_id=ae96466432775bb7 resource.service.name=cloudapi] — Calling Config API

Without the logging lines (which I didn’t have by default) you will not see the trace_id.

Viewing Traces in Tempo

Once you have your trace ID you can then enter that ID into Tempo via Grafana. Navigate to the Explore Section of Grafana and select the Tempo Data Source and enter the Trace ID and hit “Run Query”

Query for a Trace ID

You can also have Loki identify trace IDs in the logs as well and provide a link directly to Tempo via Derived Fields. You will add this to the Loki data source as seen below.

Adding Support for SQLAlchemy

If you noticed in our includes section above we added support for SQLAlchemy with OpenTelemetry. To able this support, it is very simple, just add one line after you run the create_engine command (as shown by the engine= line below)

engine=sqlalchemy.create_engine(DB_URL)

SQLAlchemyInstrumentor().instrument(engine=engine,)

Now you will see a span for when your application talks to the Database via SQLAlchemy

Adding More Valuable Spans

The auto-instrumentation got some data flowing to Tempo and it was valuable, but I wanted to see spans for different functions that I was running so I can understand the performance impact of those and isolate where I need to attack from a performance perspective.

This is done by creating a new span and then wrapping your function call in a with statement.

You can add these wherever you want to add an additional span so you can anywhere you need.

Fully instrumented API Call

Summary

This was my first foray into tracing and all in all it only took a couple hours. The majority of that time was spent trying to figure out that I had to have a logging line inside the API that I wanted to instrument. Once I figured that gem out it was smooth sailing.

I am sure I am just scratching the surface here, but wanted to share what I had found thus far.

Hope you found this useful, if so, give me a follow and a clap!

Observability
Grafana
Fastapi
Tempo
Python
Recommended from ReadMedium