Press [s] to read the speaker notes (will open a new window).
We will talk about logs…
… even if it's not “trendy”
… because they are valuable
… and often used inconsistently
Every "real-world" application has some kind of logging infrastructure.
Sometimes it's quick and simple (error_log), sometimes it's more sophisticated and configurable (Monolog).
Unfortunately, many applications use logging inconsistently. The logging infrastructure usually
grows organically and isn't thought through. It's rarely the most loved or fun part of development.
Lots of us log blindly, without thinking about who will read these logs or why.
In this talk, we will take some time to see how to make our logs more valuable (and why).
Log exploitation (with tools such as ELK stacks, Graylog, …) will not be addressed.
How can we write useful logs?
Definition: to log
“to make an official record of things that happen”
Macmillan Dictionary
First of all, let's define what we mean when we say "logs".
As we see in this definition, logs can be almost anything. Any event that is relevant
to explain the behavior of an application can be "logged" or saved somewhere for later
use.
What that tells us is that our logs might have different purposes and we have to take
that into account when we design our logging strategy.
Logging strategies
performance and resources
errors
user actions
business events
I think that logs can serve different purposes. The most common being:
* performance and resources: memory consumption, CPU/IO/network usage, external API calls and their response times, … Helps to monitor, debug and improve the general performance of the system.
* errors: exceptions, warnings, unexpected responses, … Helps detect and troubleshoot issues with the system or its environment.
* user actions: actions performed by your users. Useful to help users when they come across an issue.
* business events: some log entries can be transformed into business-related metrics and statistics. Can be used as a poor man’s BI tool.
Therefore, the first step to writing useful logs is to determine what they will be used for, and by whom.
Log, log, log!
Ensure that all the relevant events are logged!
… relevant for whom, and why?
In terms of logging strategy, we (TEA) are mainly interested in user actions on the platform and
potential errors.
For our needs, relevant events include business/user events such as
orders, refunds, API response times, errors, book integrations, …
To help you decide whether something is worth logging, ask yourself
the following question: why would logging this be useful, and to whom?
Exercise: ask yourself "what does my application log?"
Spoiler: if you want to answer "everything", you have a problem.
Log messages
Logs should be explicit.
Order "REF-42 " failed: payment gateway unreachable .
Write sentences describing the event and including the relevant details.
Think about the person who will read the log.
An order failed? Which one? Why? When?
A bare "order failed" message is almost useless, as it provides no information
that can be used to explain what happened or what action should be taken.
When developers write a log, they have in mind the context (in terms of code
and application flow) in which the log call is inserted. We often tend to
write log messages that rely on this context… which is NOT present anymore when
the log entry is read.
To avoid having cryptic logs emitted by the application, log messages have to
be as explicit as possible.
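The talk's examples are PHP/Monolog; as an analogy, here is a minimal sketch with Python's standard `logging` module (the logger name and helper function are illustrative):

```python
import logging

logger = logging.getLogger("orders")

def log_order_failure(order_ref: str, reason: str) -> str:
    # Spell out the "which" and the "why" in the message itself,
    # instead of a context-free "Order failed".
    message = f'Order "{order_ref}" failed: {reason}'
    logger.warning(message)
    return message
```

Calling `log_order_failure("REF-42", "payment gateway unreachable")` yields the explicit message from the slide, readable without any knowledge of the emitting code.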
Chronological logs
Logs should be timestamped.
Order "REF-42" failed on 12/12/17 at 14h30 : payment gateway unreachable.
The "when" question wasn't answered by the last log entry.
Timestamps help us correlate logs with other events (was there
a global outage in our banking service at that time?)
That's why we include dates in our log entries.
But there is still an issue here: the date format is ambiguous!
Human readable, machine parseable
Logs should be human readable and unambiguously parseable.
[2017-12-12T14:30:51.965Z] Order "REF-42" failed: payment gateway unreachable.
In order to avoid ambiguities, we will only use standard formats (ISO 8601 for instance)
and we will always indicate the date as the first information of the log entry (for consistency).
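A sketch of producing such an unambiguous prefix with Python's `datetime` (the helper name is mine):

```python
from datetime import datetime, timezone

def log_prefix(now: datetime) -> str:
    # ISO 8601, in UTC, with millisecond precision: unambiguous,
    # sortable, and parseable by virtually every tool.
    utc = now.astimezone(timezone.utc)
    return "[" + utc.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z]"
```

Every entry then starts with the same fixed-width, lexicographically sortable timestamp.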
Human readable, machine parseable
Logs should be human readable and unambiguously parseable.
[2017-12-12T14:30:51.965Z] Order {order_ref} failed: {reason} -- {"order_ref": "REF-42", "reason": "payment gateway unreachable"}
Let the tool used for log exploitation decide whether it's better to interpolate these
placeholders or not.
Also, in some languages string concatenation can be relatively expensive, especially
if there are a lot of logs.
For error reporting tools (Sentry, Airbrake, …), non-interpolated messages are easier
to aggregate and parse.
If you use tools like Logstash, the only way to extract the order ref and the reason from an
interpolated message is with regexes: it's slow (and can be complex, depending on the log entry to parse).
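By analogy, a tiny Python sketch of keeping the message template and its values separate (the `format_entry` helper is hypothetical):

```python
import json

def format_entry(message: str, context: dict) -> str:
    # Ship the raw template plus its values; the consumer
    # (a human, Logstash, an error tracker, …) decides whether
    # to interpolate the placeholders.
    return f"{message} -- {json.dumps(context)}"
```

For example, `format_entry("Order {order_ref} failed: {reason}", {"order_ref": "REF-42", "reason": "payment gateway unreachable"})` reproduces the entry shown on the slide.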
Human readable, machine parseable
JSON is fine.
NEVER write multi-line logs.
Using JSON is easy for both developers and machines. But really, any plain-text, normalized format will do.
Just ensure that you NEVER write multi-line log entries.
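As an illustration, a single-line JSON formatter sketched with Python's `logging` (Monolog ships a comparable `JsonFormatter`):

```python
import json
import logging

class JsonLineFormatter(logging.Formatter):
    """Render every record as exactly one JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # json.dumps escapes embedded newlines, so even a multi-line
        # message ends up on a single physical line.
        return json.dumps(payload)
```

Even a message containing a stack trace stays on one line, which keeps the output trivially parseable.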
Context matters
[2017-12-12T14:30:51.965Z] Order {order_ref} failed: {reason} -- {"order_ref": "REF-42", "reason": "payment gateway unreachable", "customer_id": 42, "payment_transaction_id": 4224}
Context matters
Never log sensitive data.
Passwords, tokens, emails, … (depending on your application, an email address can be considered sensitive data)
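One way to enforce this, sketched in Python (the key list is an assumption; adapt it to what counts as sensitive in your application):

```python
SENSITIVE_KEYS = {"password", "token", "email"}  # adjust to your application

def redact(context: dict) -> dict:
    # Mask sensitive values before the context ever reaches a log file.
    return {
        key: "[REDACTED]" if key in SENSITIVE_KEYS else value
        for key, value in context.items()
    }
```

Running every log context through such a filter centralizes the policy instead of relying on each call site to remember it.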
Divide and conquer: severity levels
Logs should be categorized by severity level.
[2017-12-12T14:30:51.965Z] [WARNING] Order {order_ref} failed: {reason} -- {"order_ref": "REF-42", "reason": "payment gateway unreachable"}
Not all log entries have the same importance or severity.
Some of them require an immediate action, whereas others
are just there to ease debugging and can be stripped in production.
Be sure to include this information in your log output and properly
configure your application depending on its environment.
Divide and conquer: severity levels
As defined by RFC 5424 (the Syslog protocol)
debug: Detailed debug information
info: Interesting events (user logs in, SQL queries, …)
notice: Normal but significant events
warning: Exceptional occurrences that are not errors
error: Runtime errors that do not require immediate action
critical: Critical conditions, action must be taken
alert: Action must be taken immediately
emergency: System is unusable
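For instance, a per-environment sketch with Python's `logging` (whose levels cover only a subset of RFC 5424; the environment names are illustrative):

```python
import logging

def configure_logging(environment: str) -> int:
    # Keep everything in development; strip debug noise in production.
    level = logging.DEBUG if environment == "dev" else logging.WARNING
    logging.getLogger("app").setLevel(level)
    return level
```

The same code emits the same log calls everywhere; only the configured threshold changes per environment.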
Divide and conquer: subsystems
Logs should be categorized by subsystem.
[2017-12-12T14:30:51.965Z] [WARNING] [metadata] Export generation failed: {reason} -- {"reason": "export directory not writable", "export_ref": 42, "export_directory": "/home/sftp/exports"}
tag the subsystem that emits the log
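In Monolog this tag is the "channel" name; a rough Python equivalent uses one named logger per subsystem and includes the name in the output format:

```python
import logging

# One logger per subsystem; its name is emitted with every entry.
formatter = logging.Formatter("[%(levelname)s] [%(name)s] %(message)s")
metadata_logger = logging.getLogger("metadata")
billing_logger = logging.getLogger("billing")
```

Filtering a log stream down to one subsystem then becomes a simple match on the bracketed tag.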
Actionable logs
[2017-12-12T14:30:51.965Z] [WARNING] [metadata] Export generation failed: {reason} -- {"reason": "export directory not writable", "export_ref": 42, "export_directory": "/home/sftp/exports", "_action": "The export could not be generated because the target directory is not writable. Fix the permissions on the directory and retry the export (`./bin/console tea:export --request=42 --retry`)"}
Correlate logs
[2017-12-12T14:30:51.965Z] [WARNING] [metadata] Export generation failed: {reason} -- {"reason": "export directory not writable", "export_ref": 42, "export_directory": "/home/sftp/exports"} {"request_id": "7d5e8092-a13d-45b8-adb4-b26c18806825"}
See Monolog's "processors" to identify the events associated with a given request.
Within the same application, but also across applications and servers (business
applications or infrastructure servers such as nginx).
For this, a UUID is fine. The earlier it is generated, the better.
I often include two sets of metadata in my log entries:
* the first represents the log context and is associated with the event itself;
* the other represents extra metadata automatically added to any event emitted by the application.
Monolog handles that nicely using "Processors": https://github.com/Seldaek/monolog/blob/master/doc/02-handlers-formatters-processors.md#processors
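A rough Python counterpart to such a processor is a `logging.Filter` that stamps a request id onto every record (the class is a sketch; the UUID below is the one from the slide):

```python
import logging
import uuid

class RequestIdFilter(logging.Filter):
    """Attach the same request_id to every record emitted while
    handling one request, like a Monolog processor would."""

    def __init__(self, request_id=None):
        super().__init__()
        # Generate the UUID as early as possible in the request's life.
        self.request_id = request_id or str(uuid.uuid4())

    def filter(self, record):
        record.request_id = self.request_id
        return True  # never drop the record, just enrich it
```

Attach it with `logging.getLogger("app").addFilter(RequestIdFilter())`; a formatter can then reference `%(request_id)s` to correlate all entries of one request.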
Summary
determine a logging strategy: what? for whom? why?
write explicit and unambiguous messages
use standard notations
include as much relevant context as you can