Quality Logging – Quality Ramblings

Over the past year I’ve moved company and experienced two very different approaches to logging, both of which I strongly dislike.

Value of logging

Some context on my opinions here. Throughout the majority of my career I have leant heavily on logs. This was most relevant when I spent 2 years in a role handling all support cases escalated to engineering. I really got to appreciate the value in being able to understand exactly WTF went wrong…

As well as leveraging them to try to understand the cause of issues, I’ve also found them extremely valuable in catching some of those “hidden” defects that can be a sign of nastiness. For example, as a developer I implemented a connection method that wasn’t keeping connections alive correctly… but it was reconnecting. If I tested purely as a customer via the front end, it was fine. When I looked at the logs, I realised how nasty things were.

Subsequently as a tester I love to check the logs, even if everything looks OK. What gremlins are lurking in silence?

Another great test with logging is looking for information that shouldn’t be available. If I stick everything on debug level then can I read user passwords in plain text? As a user can I discover your implementation and tech stack through stack traces? This is especially relevant for software where the logs live on a person’s computer.
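As a rough illustration of that kind of check, here is a minimal Python sketch that scans log lines for things that shouldn’t be there. The patterns, the `find_leaks` helper and the sample lines are all made up for the example; a real check would need patterns tuned to your own application and log format.

```python
import re

# Hypothetical patterns for illustration only; adjust to your own log format.
SENSITIVE_PATTERNS = {
    "password": re.compile(r"(password|passwd|pwd)\s*[=:]\s*\S+", re.IGNORECASE),
    "stack trace": re.compile(r"Traceback \(most recent call last\)"),
}


def find_leaks(log_lines):
    """Return (line_number, label) pairs for lines matching a sensitive pattern."""
    leaks = []
    for number, line in enumerate(log_lines, start=1):
        for label, pattern in SENSITIVE_PATTERNS.items():
            if pattern.search(line):
                leaks.append((number, label))
    return leaks


# Invented sample log lines, as they might look with debug level switched on.
sample = [
    "DEBUG login attempt user=alice password=hunter2",
    "INFO login succeeded user=alice",
    "ERROR Traceback (most recent call last):",
]

for number, label in find_leaks(sample):
    print(f"line {number}: possible {label} exposure")
```

Something like this won’t replace actually reading the logs, but it can flag the obvious offenders after a test run.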

Logging Woes

Sadly my experiences with logging have been less than ideal over the past 3 years.

In my previous role I was overwhelmed by logs: there were hundreds of exceptions within an hour of moderate usage, on top of thousands of other messages. If anything looked even slightly untoward at the point of implementation, the code threw an exception. In my current role I don’t have access to meaningful logs, if they even exist at all. I can perform an operation and it may or may not succeed, and I’ve no idea why.

Clearly logging everything that happens to a text file isn’t manageable, especially in a complex system. Even with good layering of your logging (e.g. fatal, error, warn, info, debug), I’ve been in scenarios where we ran debug-level logging for 24 hours to try to catch a scenario, but the output was so verbose that we couldn’t cover the full timespan.

One thing that I’ve liked is when logging can be customised, for example debug level on a specific class. Even better, if you’re using a dedicated logging solution then pointing at that and filtering after the fact is fantastic.
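To make the “debug on one specific area” idea concrete, here is a small sketch using Python’s standard `logging` module, where named loggers play the role that per-class loggers play in other stacks. The logger names (`app`, `app.connection`, `app.billing`) and the messages are invented for the example.

```python
import logging

# Send records to the console via a handler on the root logger.
logging.basicConfig(format="%(levelname)s %(name)s: %(message)s")

# Hypothetical logger names; in practice these often mirror module or class names.
logging.getLogger("app").setLevel(logging.WARNING)  # quiet default for the whole app
connection_log = logging.getLogger("app.connection")
billing_log = logging.getLogger("app.billing")

# Turn on debug output only for the area under investigation.
connection_log.setLevel(logging.DEBUG)

connection_log.debug("keep-alive missed, reconnecting")  # emitted
billing_log.debug("invoice recalculated")  # suppressed: inherits WARNING from "app"
billing_log.warning("invoice total looks negative")  # emitted
```

The rest of the application stays at its quieter level, so the debug output you care about isn’t drowned out by everything else.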

Takeaway Point

Logging is important, but don’t overuse it. The most important thing when it comes to logging is to make sure you are making considered decisions.

If you’ve added extra log messages to help with (dev) testing, remove them immediately afterwards. If you’re working on a new user flow and haven’t added any logging, ask yourself how you will debug it? Is it clear who has done what? When things go wrong, can you piece together enough to understand why?

And importantly… whether you’re developing or testing a feature, when the user (inevitably?) does something completely unexpected down the line, will you be able to quickly understand what happened?
