Observability with Structured Logging
Problem Statement
Currently I am architecting end-to-end observability in a large-scale cloud transformation project. We started with foundational activities like analysing and standardizing application logs in a centralized logging platform, i.e. the Elastic Stack (ELKB: Elasticsearch, Logstash, Kibana, Beats). All the observability tools were set up properly, but teams were not using logs for debugging. When we dug in, we found that logs were generated with no clear purpose and were neither human- nor machine-readable. On top of that, they were consuming a lot of storage with unpredictable licensing costs. To tackle this situation, we took a ground-up approach to standardize logs and make them contextual rather than free-form text messages for our applications (mostly microservices), in order to get better debugging and visibility.
Before I jump into the solution and approach, let me give a quick tour of observability and logs -
Observability 101
Modern distributed applications are complex in nature because of the many inter- and intra-connected systems and out-of-process communications. Failure is inevitable, but we can create a fault-tolerant system and leverage observability with automation for fast recovery. Observability is therefore imperative to deliver a better product and customer satisfaction.
“Observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs.” — Wikipedia
It means that various "internal systems" like applications, VMs, containers, databases, networks, DNS, caches, cloud service provider (CSP) services, etc. generate lots of telemetry signals (like log events), which are captured by observability tools to produce output.
The key observability pillars are Logs (L), Metrics (M), and Traces (T); I remember them as MTL (~metal). We also need to consider other pillars like events, alerts, RUM, synthetic monitoring, anomaly detection, etc. Among these, one of the most powerful and sometimes ignored pillars is the log. It is a cornerstone of observability: when all other observability signals fail to provide comprehensive information during debugging and troubleshooting, a log can be your last resort.
Why Log?
Logs can be used for a wide range of reasons; I have mentioned some key purposes below -
What is a Log?
There are many things to consider in logging, and I have tried to capture some important facets of logging (see the mind map below).
NOTE: I am now going to discuss log formatting, which is essential (highlighted in yellow in the mind map as one of the important aspects of logging).
Solution Approach
In order to start with a comprehensive approach, we created log formatting ground rules and principles that every application team should be aware of and adhere to.
Log Format Ground Rules
1. The log should be context-rich and at least follow the 5 W's principle: who, what, when, where, and why. (Credit goes to Phil Wilkins in his "Logging in Action".)
2. Logs should be both human- and machine-readable (see the example after this list).
- Generated logs should have their fields indexed properly, so that log queries avoid full-text search and use the power of the backend search engine (like Elasticsearch for ELK).
- If we provide free-form text in log messages, it is sometimes difficult to understand the mapping between values and their corresponding fields.
3. Logs should be consistent in format.
4. Logs can be verbose, but they should be schemaless and flexible enough to ingest log events from any source.
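To make rules 1 and 2 concrete, here is a minimal sketch comparing a free-form message with the same event as a structured JSON log event. The event and the field names (user_id, order_id, error_code) are hypothetical, purely to illustrate the 5 W's mapping (when = @timestamp, where = logger/host, who/what/why = the contextual fields):

Free-form text:

2022-12-29 11:58:08 WARN Payment failed for user 42, order 9001, retrying

Structured JSON:

{
  "@timestamp": "2022-12-29T11:58:08.438+05:30",
  "level": "WARN",
  "logger_name": "com.example.payment.PaymentService",
  "message": "Payment failed, retrying",
  "user_id": "42",
  "order_id": "9001",
  "error_code": "GATEWAY_TIMEOUT"
}

With the structured form, a query like user_id:42 hits an indexed field instead of running a full-text search over the message.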
Now let's get into the approach and a demo of the concept.
Approach — Spring Boot Microservice Perspective
We have taken a structured logging approach for our microservices, which generate logs as JSON log events using logstash-logback-encoder.
SLF4J + Logstash + Logback
Note: Spring Boot's default logging backend is Logback, and it uses SLF4J as the logging facade to avoid coupling to a specific logging framework (Log4j, commons-logging, etc.).
For more information, I would recommend reading a nice article on structured logging.
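To show what this looks like in application code, here is a minimal sketch that logs through the SLF4J API and attaches key-value context using logstash-logback-encoder's StructuredArguments; the class and field names are hypothetical:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import static net.logstash.logback.argument.StructuredArguments.kv;

public class PaymentService {

    private static final Logger log = LoggerFactory.getLogger(PaymentService.class);

    public void charge(String userId, String orderId) {
        // Each kv() pair is rendered into the message text and, with the
        // LogstashEncoder configured below, also emitted as a top-level
        // JSON field, so user_id and order_id become indexed fields.
        log.info("Payment initiated {} {}", kv("user_id", userId), kv("order_id", orderId));
    }
}

This way the context travels with the event as fields rather than being interpolated into free-form text.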
Demo
Let's get to my favorite part - demo time!!
I have taken the ELK stack for log aggregation and a vanilla Spring Boot application as the log generation source. You can try other tools; feel free to share your experience.
Prerequisites -
1. Verify the Elastic stack is up and running
- Verify Elasticsearch -
Go to the browser and hit http://localhost:9200/
Check that a response like the below comes back -
{
  "name": "DESKTOP-2H9LU00",
  "cluster_name": "elasticsearch",
  "cluster_uuid": "4aGpqzZcQFe5JjRgaiqP6w",
  "version": {
    "number": "8.5.3",
    "build_flavor": "default",
    "build_type": "zip",
    "build_hash": "4ed5ee9afac63de92ec98f404ccbed7d3ba9584e",
    "build_date": "2022-12-05T18:22:22.226119656Z",
    "build_snapshot": false,
    "lucene_version": "9.4.2",
    "minimum_wire_compatibility_version": "7.17.0",
    "minimum_index_compatibility_version": "7.0.0"
  },
  "tagline": "You Know, for Search"
}
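Equivalently, assuming the default port and security disabled, you can check from a terminal:

curl http://localhost:9200/
curl http://localhost:9200/_cluster/health?pretty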
- Verify Kibana -
Open up another tab with http://localhost:5601/app/home
You will see the home page shown below -
- Verify the Logstash command-line process is running
2. Create/modify a Spring Boot application (any Java project) -
One can go to Spring Initializr and create a new Spring Boot project. If you have an existing project, you can simply add the dependency below. You can check the sample project I have used here.
<dependency>
    <groupId>net.logstash.logback</groupId>
    <artifactId>logstash-logback-encoder</artifactId>
    <version>5.2</version>
</dependency>
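If your build uses Gradle instead of Maven, the equivalent dependency declaration (same coordinates and version) would be:

implementation 'net.logstash.logback:logstash-logback-encoder:5.2'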
Add logback.xml to your project's classpath (for a Spring Boot project, under src/main/resources)
<configuration>
    <appender name="FILE" class="ch.qos.logback.core.FileAppender">
        <file>D:/elaticstack/logs/tests.log</file>
        <append>true</append>
        <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
    </appender>
    <root level="INFO">
        <appender-ref ref="FILE"/>
    </root>
</configuration>
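As a side note, if your service runs in a container and you prefer emitting JSON to stdout (for a Beat or agent to collect) rather than to a file, a minimal variant keeps the same encoder and swaps the appender:

<configuration>
    <appender name="JSON_CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
    </appender>
    <root level="INFO">
        <appender-ref ref="JSON_CONSOLE"/>
    </root>
</configuration>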
GitHub Sample project here
3. Change logstash.conf to read log files
Create a logstash.conf file in your favourite editor to read the local log file. You need to set the Spring Boot log file location in "path", the Elasticsearch host in "hosts", and an index name; here I have given "app-logs".
Here the Logstash json filter is important: it parses "message" and creates index fields. Also, skip_on_invalid_json lets the pipeline accept non-JSON log lines without erroring. Check the docs for more info.
input {
  file {
    path => "D:/elaticstack/logs/*.log"
    start_position => "beginning"
    # "NUL" disables sincedb tracking on Windows (use "/dev/null" on Linux/macOS)
    sincedb_path => "NUL"
  }
}
filter {
  json {
    source => "message"
    skip_on_invalid_json => true
  }
}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "app-logs"
  }
}
Keep the logstash.conf file inside the Logstash installation folder (where it was unzipped).
Start Logstash with the command below -
logstash -f logstash.conf
4. Generate a log
Start the Spring Boot application, which will generate some startup INFO logs in the file location mentioned in the logback.xml configuration (configuration -> appender -> file).
(As I mentioned "D:/elaticstack/logs/tests.log", that is where the logs are generated.)
Below are sample logs generated from my sample Spring Boot application. You can see the logs come out in JSON format with key-value pairs. You might wonder where those fields come from: they are generated by the encoder library by default, and you can customize them. Please see here for more info.
{"@timestamp":"2022-12-29T11:58:08.438+05:30","@version":"1","message":"Starting StructuredApplication using Java 19.0.1 with PID 14016 (D:\\elaticstack\\project\\structured\\structured\\target\\classes started by ADMIN in D:\\elaticstack\\project\\structured\\structured)","logger_name":"com.structured.logging.StructuredApplication","thread_name":"main","level":"INFO","level_value":20000}
{"@timestamp":"2022-12-29T11:58:08.476+05:30","@version":"1","message":"No active profile set, falling back to 1 default profile: \"default\"","logger_name":"com.structured.logging.StructuredApplication","thread_name":"main","level":"INFO","level_value":20000}
{"@timestamp":"2022-12-29T11:58:12.041+05:30","@version":"1","message":"Tomcat initialized with port(s): 8080 (http)","logger_name":"org.springframework.boot.web.embedded.tomcat.TomcatWebServer","thread_name":"main","level":"INFO","level_value":20000}
{"@timestamp":"2022-12-29T11:58:12.067+05:30","@version":"1","message":"Initializing ProtocolHandler [\"http-nio-8080\"]","logger_name":"org.apache.coyote.http11.Http11NioProtocol","thread_name":"main","level":"INFO","level_value":20000}
{"@timestamp":"2022-12-29T11:58:12.068+05:30","@version":"1","message":"Starting service [Tomcat]","logger_name":"org.apache.catalina.core.StandardService","thread_name":"main","level":"INFO","level_value":20000}
{"@timestamp":"2022-12-29T11:58:12.070+05:30","@version":"1","message":"Starting Servlet engine: [Apache Tomcat/10.1.4]","logger_name":"org.apache.catalina.core.StandardEngine","thread_name":"main","level":"INFO","level_value":20000}
{"@timestamp":"2022-12-29T11:58:12.328+05:30","@version":"1","message":"Initializing Spring embedded WebApplicationContext","logger_name":"org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/]","thread_name":"main","level":"INFO","level_value":20000}
{"@timestamp":"2022-12-29T11:58:12.332+05:30","@version":"1","message":"Root WebApplicationContext: initialization completed in 3707 ms","logger_name":"org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext","thread_name":"main","level":"INFO","level_value":20000}
{"@timestamp":"2022-12-29T11:58:13.882+05:30","@version":"1","message":"Exposing 1 endpoint(s) beneath base path '/actuator'","logger_name":"org.springframework.boot.actuate.endpoint.web.EndpointLinksResolver","thread_name":"main","level":"INFO","level_value":20000}
{"@timestamp":"2022-12-29T11:58:14.158+05:30","@version":"1","message":"Starting ProtocolHandler [\"http-nio-8080\"]","logger_name":"org.apache.coyote.http11.Http11NioProtocol","thread_name":"main","level":"INFO","level_value":20000}
{"@timestamp":"2022-12-29T11:58:14.228+05:30","@version":"1","message":"Tomcat started on port(s): 8080 (http) with context path ''","logger_name":"org.springframework.boot.web.embedded.tomcat.TomcatWebServer","thread_name":"main","level":"INFO","level_value":20000}
{"@timestamp":"2022-12-29T11:58:14.260+05:30","@version":"1","message":"Started StructuredApplication in 6.648 seconds (process running for 8.601)","logger_name":"com.structured.logging.StructuredApplication","thread_name":"main","level":"INFO","level_value":20000}
5. Verify logs in Kibana
Go to Discover.
Select the Kibana data view associated with the index (like below, where I have created a data view named "app-logs" for the corresponding "app-logs" Elasticsearch index).
You can see below that the log fields (left panel) are showing as index fields -
and a log document looks like the one below -
{
  "_index": "app-logs",
  "_id": "EotBZ4UBz6yAb2wxa0IA",
  "_version": 1,
  "_score": 0,
  "_ignored": [
    "event.original.keyword"
  ],
  "_source": {
    "thread_name": "main",
    "level": "INFO",
    "@timestamp": "2022-12-31T08:16:54.187Z",
    "message": "Started StructuredApplication in 9.239 seconds (process running for 11.241)",
    "event": {
      "original": "{\"@timestamp\":\"2022-12-31T13:46:54.187+05:30\",\"@version\":\"1\",\"message\":\"Started StructuredApplication in 9.239 seconds (process running for 11.241)\",\"logger_name\":\"com.structured.logging.StructuredApplication\",\"thread_name\":\"main\",\"level\":\"INFO\",\"level_value\":20000}\r"
    },
    "log": {
      "file": {
        "path": "D:/elaticstack/logs/tests.log"
      }
    },
    "level_value": 20000,
    "host": {
      "name": "DESKTOP-2H9LU00"
    },
    "@version": "1",
    "logger_name": "com.structured.logging.StructuredApplication"
  },
  "fields": {
    "thread_name.keyword": [
      "main"
    ],
    "event.original": [
      "{\"@timestamp\":\"2022-12-31T13:46:54.187+05:30\",\"@version\":\"1\",\"message\":\"Started StructuredApplication in 9.239 seconds (process running for 11.241)\",\"logger_name\":\"com.structured.logging.StructuredApplication\",\"thread_name\":\"main\",\"level\":\"INFO\",\"level_value\":20000}\r"
    ],
    "host.name.keyword": [
      "DESKTOP-2H9LU00"
    ],
    "level": [
      "INFO"
    ],
    "@version.keyword": [
      "1"
    ],
    "message": [
      "Started StructuredApplication in 9.239 seconds (process running for 11.241)"
    ],
    "@timestamp": [
      "2022-12-31T08:16:54.187Z"
    ],
    "level.keyword": [
      "INFO"
    ],
    "level_value": [
      20000
    ],
    "thread_name": [
      "main"
    ],
    "log.file.path": [
      "D:/elaticstack/logs/tests.log"
    ],
    "message.keyword": [
      "Started StructuredApplication in 9.239 seconds (process running for 11.241)"
    ],
    "@version": [
      "1"
    ],
    "logger_name": [
      "com.structured.logging.StructuredApplication"
    ],
    "host.name": [
      "DESKTOP-2H9LU00"
    ],
    "logger_name.keyword": [
      "com.structured.logging.StructuredApplication"
    ],
    "log.file.path.keyword": [
      "D:/elaticstack/logs/tests.log"
    ]
  },
  "ignored_field_values": {
    "event.original.keyword": [
      "{\"@timestamp\":\"2022-12-31T13:46:54.187+05:30\",\"@version\":\"1\",\"message\":\"Started StructuredApplication in 9.239 seconds (process running for 11.241)\",\"logger_name\":\"com.structured.logging.StructuredApplication\",\"thread_name\":\"main\",\"level\":\"INFO\",\"level_value\":20000}\r"
    ]
  }
}
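Because these are proper indexed fields rather than free text, Discover can filter on them directly. For example, a KQL query like the one below (using the field names produced by the encoder) narrows the view to our application's INFO logs:

level : "INFO" and logger_name : "com.structured.logging.StructuredApplication"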
Limitations
Structured logging might be easy to implement for your own applications, but it can be difficult for COTS and third-party products where we don't have control over the log format. In that case, we might need a lot of processing steps and configuration (like the grok filter plugin in Logstash) to convert logs into a JSON-like format.
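As a rough sketch of what that processing can look like, the Logstash filter below uses grok to parse a hypothetical third-party plain-text line such as "2022-12-29 11:58:08,123 ERROR Payment gateway timeout" into fields; the pattern must be adapted to the real product's format:

filter {
  grok {
    # Pull timestamp, level, and the rest of the line out of plain text
    match => { "message" => "%{TIMESTAMP_ISO8601:log_time} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
  }
}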
Final Thought
Observability is not about setting up the necessary logging, APM, and other tools and calling the job done. It needs a very thoughtful and iterative approach. To make logging useful and meaningful, one can start with structured logging and provide a self-describing format.
I hope the above approach provides a comprehensive logging format strategy and implementation guide. Please share your thoughts!!