avatarDave Taubler

Free AI web copilot to create summaries, insights and extended knowledge, download it at here

3730

Abstract

"c846">Moreover, we might take this opportunity as a mini-break from writing all of this code. Writing an engineer-consumable message is an opportunity to show how creative we can be. So we might write something like we saw at the beginning of this article:</p><div id="1605"><pre>} catch <span class="hljs-comment">(IOException e)</span> { log.error<span class="hljs-comment">("Couldn't find the server. Guess it went on vacation!!! :)</span><span class="hljs-string">"); }</span></pre></div><p id="ab75">And so, we’ll have solved our second immediate need:</p><blockquote id="d0f2"><p><b><i>Finding an outlet to express our creativity</i></b></p></blockquote><p id="99ba">The problem is, logging is not intended to serve either of these purposes. Instead, its raison d’etre is to serve whoever will be troubleshooting the issues that arise after our code is deployed. So how, then, do we write logs that achieve <i>that</i> purpose?</p><h1 id="1cd9">Just the facts, man</h1><p id="f623">First, we need to stop being conversational. This is not the place for clever comments, or complete sentences. Instead, we should focus solely on providing the future engineer with the information they’ll need to troubleshoot.</p><p id="1aff">I’ve found that the most effective way to do this is to think in terms of two things:</p><ul><li>key/value pairs</li><li>searchable terms</li></ul><p id="573d">Let’s look at each of these.</p><h2 id="695c">Key/value pairs</h2><p id="b370">As discussed earlier, we’re often tempted to write our log messages as prose. This naturally lends itself to less useful information. Instead, we should write our log messages as a series of <i>key/value pairs</i>. For example:</p><div id="8d03"><pre>log.error<span class="hljs-params">("<span class="hljs-attr">action</span>=get-person <span class="hljs-attr">event</span>=server-failure <span class="hljs-attr">person-id</span>={id} <span class="hljs-attr">connection-status</span>={connection.status}")</span></pre></div><p id="2676">Why key/value fields? I’ve found that writing messages this way has a few advantages. Psychologically, it forces us to pause and consider all of the information that might be useful in our log message. It also results in a log message with the information displayed in easily-consumed manner. In addition, some log aggregators such as <a href="https://docs.splunk.com/Documentation/Splunk/8.2.0/Knowledge/Automatickey-valuefieldextractionsatsearch-time">Splunk</a> can be configured to automatically extract and index key/value pairs. This leads to our second point of focus:</p><h2 id="7bb8">Searchable terms</h2><p id="86c2">Often when diagnosing a production issue, we’ll need to search through our logs for information. If we’ve written prose-y message like the example at the beginning of this article, then our searches will look something like this:</p><div id="06a0"><pre><span class="hljs-attribute">index</span><span class="hljs-operator">=</span>my.service <span class="hljs-string">"Guess it went on vacation!!!"</span></pre></div><p id="b78d">This search might — serendipitously — uniquely identify that one specific log output (that is, unless we’ve used this same gag in other log messages.) In other cases, we might not be so lucky. If we get in the habit of writing the following:</p><div id="ce4c"><pre>} <span class="hljs-built_in">catch</span> (IOException e) { <span class="hljs-built_in">log</span>.<span class="hljs-built_in">error</span>(<span class="hljs-string">"Error taking to the server"</span>, e); }</pre></div><p id="8bc0">then a search for <code>"Error taking to the server"</code> might be a little too non-specific.</p><p id="a431">Instead, we want to be sure that each log message w

Options

ill allow us to uniquely query it. Fortunately, this comes almost for free if we write our messages as a series of key/value pairs. To isolate the example from earlier, our search might look like this:</p><div id="86ed"><pre><span class="hljs-attribute">index</span><span class="hljs-operator">=</span>my.service <span class="hljs-string">"action=get-person"</span> <span class="hljs-string">"event=server-failure"</span></pre></div><p id="f236"><i>(with or without the quotes, depending on whether our fields are automatically extracted by our log aggregator)</i></p><p id="ad34">Notice that I’m searching by both the <code><i>action</i></code> and the <code><i>event</i></code>. This points to another best practice with logging:</p><blockquote id="eb2f"><p><b>Provide a search term to tie related logs together.</b></p></blockquote><p id="9779">Often when troubleshooting issues, we need to search across multiple log messages. Most engineers are aware of the concept of <a href="https://www.oreilly.com/library/view/building-microservices-with/9781785887833/1bebcf55-05bb-44a1-a4e5-f9733b8edfe3.xhtml">correlation IDs</a>, which allow us to group log messages belonging to a single <i>request</i>. In a similar manner, we can tag log messages that belong to a single <i>flow</i> with a common key/value pair, so that we can easily query for all log messages that belong to that specific flow.</p><p id="9027">In our example, we would include <code>action=get-person</code> in all of the log messages that are associated with our <i>get-person</i> call. That way, a search for that term would bring up all logs associated with the initialization of that request, as well as all successes and failures. Providing the <code>event</code> key/value pair, then, allows us to narrow down our search to a particular event.</p><h2 id="d63e">A quick note about security</h2><p id="e78e">It’s worth pausing to discuss the security implications of logging. The approach outlined here encourages us to provide any potentially useful data in our log messages. We need to be sure, however, that we are not logging any sensitive information such as <a href="https://cipherpoint.com/solutions/pii-security-compliance/">PII</a> in our logs.</p><p id="cdc8">Ideally, your organization has a security team that can guide you with what is acceptable and unacceptable to log. If not, it’s something you’ll need to be aware of and—ideally—proactive in developing standards for the rest of your engineering organization.</p><h1 id="8796">The tail -f end</h1><p id="5e46">Writing log messages may not be rocket science. But neither should it be poetry. Instead, we should pause and put some thought into the messages that we write. Moreover, we should follow the guidelines presented in this article:</p><ul><li>Write log messages as a series of key/value pairs, rather than as prose</li><li>Ensure that our messages are easily queryable</li><li>Ensure that we include a key/value pair that allows us to correlate messages associated with a single action</li></ul><p id="3cd7">I’ve found that doing this turns writing log messages from a slog into an interesting, rewarding challenge. But more importantly, it will make life less frustrating for the poor, future engineer who needs to consume our logs. And remember, that future engineer might be us!</p><p id="cce3">Find this story useful? Want to read more? Just <a href="https://dt-23597.medium.com/subscribe">subscribe here</a> to get my latest stories sent directly to your inbox.</p><p id="2412">You can also support me and my writing — and get access to an unlimited number of stories — by <a href="https://dt-23597.medium.com/membership">becoming a Medium member today</a>.</p></article></body>

Stop Writing Creative Log Messages

Save the prose for your poetry

Photo by Marko Horvat on Unsplash

If you work long enough as a software engineer, you’ll see them. Those “funny” log messages sprinkled throughout your company’s codebase. A line written by someone who wanted to show that they’re a humorist as well as an engineer. Something like:

log.error(
  "Couldn’t find the server. Guess it went on vacation!!! :)" );

Maybe you’ll chuckle to yourself the first time you see it in code. But be assured, there will be no chuckling when you’re dealing with a production outage, and that is the data to which your hopes of a quick resolution are pinned.

At this point, the urge to find the engineer who wrote this message—and pop them in their smug, wanna-be comedian face—might be strong. But let’s be honest: we’ve all written log messages like this. Ones that—if not quite as “funny” — were just as frustratingly useless. Undoubtedly, if faced with seeing some our own messages in production, we’d wish we could slink back in time and rewrite them.

So why do we continue to be so glib with our log messages?

I contend that its simply because we focus on solving our own immediate needs. Instead, we should be solving the needs of the future engineer (who might turn out to be us!) who will be consuming our logs.

Solving our immediate needs

We tend to view writing logs as a necessary evil. As we write code, we’ll hit a point where we feel pressured to tell someone that something has happened. Often this point is prompted by catching an exception. For example, imagine writing code like the following:

var connection = getConnection();
try {
  var personResponse = connection.get("/persons/" + id);
  var b = validate(personResponse);
  return (b) ? personResponse : null;
} catch (IOException e) {
  //TODO do something
}

As experienced engineers, we innately know that we cannot leave that seventh line as a TODO comment. We need to handle it somehow.

So we write a log message. But let’s be honest. Composing a log message feels like a disruption from our “real” programming work. We want to get in and get out as speedily as possible. And so, we quickly type up a message:

} catch (IOException e) {
  log.error("Error taking to the server", e);
}

Knowing that we’ve met the minimum error-handling criteria, we get back to “real” programming. Thus, we’ll have solved our first immediate need:

Quickly absolving ourselves of the error

Moreover, we might take this opportunity as a mini-break from writing all of this code. Writing an engineer-consumable message is an opportunity to show how creative we can be. So we might write something like we saw at the beginning of this article:

} catch (IOException e) {
  log.error("Couldn't find the server. Guess it went on vacation!!! :)");
}

And so, we’ll have solved our second immediate need:

Finding an outlet to express our creativity

The problem is, logging is not intended to serve either of these purposes. Instead, its raison d’etre is to serve whoever will be troubleshooting the issues that arise after our code is deployed. So how, then, do we write logs that achieve that purpose?

Just the facts, man

First, we need to stop being conversational. This is not the place for clever comments, or complete sentences. Instead, we should focus solely on providing the future engineer with the information they’ll need to troubleshoot.

I’ve found that the most effective way to do this is to think in terms of two things:

  • key/value pairs
  • searchable terms

Let’s look at each of these.

Key/value pairs

As discussed earlier, we’re often tempted to write our log messages as prose. This naturally lends itself to less useful information. Instead, we should write our log messages as a series of key/value pairs. For example:

log.error("action=get-person event=server-failure person-id={id} connection-status={connection.status}")

Why key/value fields? I’ve found that writing messages this way has a few advantages. Psychologically, it forces us to pause and consider all of the information that might be useful in our log message. It also results in a log message with the information displayed in easily-consumed manner. In addition, some log aggregators such as Splunk can be configured to automatically extract and index key/value pairs. This leads to our second point of focus:

Searchable terms

Often when diagnosing a production issue, we’ll need to search through our logs for information. If we’ve written prose-y message like the example at the beginning of this article, then our searches will look something like this:

index=my.service "Guess it went on vacation!!!"

This search might — serendipitously — uniquely identify that one specific log output (that is, unless we’ve used this same gag in other log messages.) In other cases, we might not be so lucky. If we get in the habit of writing the following:

} catch (IOException e) {
  log.error("Error taking to the server", e);
}

then a search for "Error taking to the server" might be a little too non-specific.

Instead, we want to be sure that each log message will allow us to uniquely query it. Fortunately, this comes almost for free if we write our messages as a series of key/value pairs. To isolate the example from earlier, our search might look like this:

index=my.service "action=get-person" "event=server-failure"

(with or without the quotes, depending on whether our fields are automatically extracted by our log aggregator)

Notice that I’m searching by both the action and the event. This points to another best practice with logging:

Provide a search term to tie related logs together.

Often when troubleshooting issues, we need to search across multiple log messages. Most engineers are aware of the concept of correlation IDs, which allow us to group log messages belonging to a single request. In a similar manner, we can tag log messages that belong to a single flow with a common key/value pair, so that we can easily query for all log messages that belong to that specific flow.

In our example, we would include action=get-person in all of the log messages that are associated with our get-person call. That way, a search for that term would bring up all logs associated with the initialization of that request, as well as all successes and failures. Providing the event key/value pair, then, allows us to narrow down our search to a particular event.

A quick note about security

It’s worth pausing to discuss the security implications of logging. The approach outlined here encourages us to provide any potentially useful data in our log messages. We need to be sure, however, that we are not logging any sensitive information such as PII in our logs.

Ideally, your organization has a security team that can guide you with what is acceptable and unacceptable to log. If not, it’s something you’ll need to be aware of and—ideally—proactive in developing standards for the rest of your engineering organization.

The tail -f end

Writing log messages may not be rocket science. But neither should it be poetry. Instead, we should pause and put some thought into the messages that we write. Moreover, we should follow the guidelines presented in this article:

  • Write log messages as a series of key/value pairs, rather than as prose
  • Ensure that our messages are easily queryable
  • Ensure that we include a key/value pair that allows us to correlate messages associated with a single action

I’ve found that doing this turns writing log messages from a slog into an interesting, rewarding challenge. But more importantly, it will make life less frustrating for the poor, future engineer who needs to consume our logs. And remember, that future engineer might be us!

Find this story useful? Want to read more? Just subscribe here to get my latest stories sent directly to your inbox.

You can also support me and my writing — and get access to an unlimited number of stories — by becoming a Medium member today.

Programming
Software Development
Software Engineering
Logging
Microservices
Recommended from ReadMedium