Metrics, Logs, and Traces in JavaScript Tools

Canonical URL: https://www.swyx.io/writing/js-tools-metrics-logs-traces/

I was listening to the Official AWS Podcast's episode on Observability and was struck by how much thought has been given towards improving tools for investigating when things go wrong.

I realized that we could probably have something to learn by applying this lens to the JavaScript developer experience.

Note: This post originally framed these data types in terms of Observability - That was a mistake born out of my confusion on the topic from reading multiple sources - Charity Majors weighed in with far more info on the difference. To be clear, I don't claim any authority whatsoever on the topic - I'm just writing down my learning in public, and have since removed all mention of Observability from this post.

Data Types 🔗 We can break down the data types discussed into metrics, logs, traces, and events. Metrics : Time series data, like CPU utilization

: Time series data, like CPU utilization Logs : Structured or semistructured bits of text emitted by the application

: Structured or semistructured bits of text emitted by the application Traces : A record of an API call that's made from one part of my application to another

: A record of an API call that's made from one part of my application to another Events: An indication of a state change of some type* *That last one is in a special category - we'll discuss that separately at the end. In JavaScript we tend to just mush all this into "stuff we console.log out", but I think we can try to be a bit more sophisticated about it. I thought I should list out what each of these maps to in my mind given my experience writing and teaching Node.js CLIs.

JavaScript Logging 🔗 In JS we are pretty good, sometimes too good, about console.log ging everything, but it isn't good enough to dump a bunch of irrelevant unstructured crap in the terminal or browser console. Logs: Streaming Logs 🔗 In Node, we should become a little more comfortable with Node streams - they seem alien at first but are actually pretty darn handy especially for memory efficient I/O. For example, we can output work logs and error logs with streams: let fs = require ( ' fs ' ) ; let writer = fs . createWriteStream ( ' applog.txt ' ) ; let errors = fs . createWriteStream ( ' errlog.txt ' ) ; writer . write ( ' hello world ' ) ; try { // something risky } catch ( err ) { errors . write ( err ) console . error ( err ) } // etc. Logs: Structuring Logs 🔗 If your logs have some structure but not too much info, a table might be appropriate: var table = new AsciiTable ( ' A Title ' ) table . setHeading ( '' , ' Name ' , ' Age ' ) . addRow ( 1 , ' Bob ' , 52 ) . addRow ( 2 , ' John ' , 34 ) . addRow ( 3 , ' Jim ' , 83 ) console . log ( table . toString ()) // .----------------. // | A Title | // |----------------| // | | Name | Age | // |---|------|-----| // | 1 | Bob | 52 | // | 2 | John | 34 | // | 3 | Jim | 83 | // '----------------' But be mindful of whether you need your logs to be grep/awk/sed friendly (or maybe you just need to dump some JSON, up to you - Bunyan helps you stream JSON to files). Maybe there are other tools for padding structured data with whitespace for logging, but I haven't come across them yet. Logs: Log Levels 🔗 I do have a strong opinion that you should not clutter the developer console with random logs from everywhere - but you should make it easy for yourself and others to turn more verbose logging on when needed. This is often addressed in CLI tools with a --verbose flag, but even that is not good enough. You will want to have different log levels of abstraction so that you can enable the developer to request the correct density of logs for the problem they are trying to face. Bunyan builds in the concept of Levels and this idea is apparently built into Rails. Syslog is a more formally designed standard for message logging with an established hierarchy of severity: Of course, as developers we will mostly surface levels 3-6, but spend the bulk of our time in level 7 - debugging. There are 2 tools I strongly recommend for Level 7 logging. Node has an inbuilt util.debuglog function: const util = require ( ' util ' ) ; const debuglog = util . debuglog ( ' foo ' ) ; debuglog ( ' hello from foo [%d] ' , 123 ) ; // If this program is run with NODE_DEBUG=foo in the environment // then it will output something like: // // FOO 3245: hello from foo [123] Whereas the aptly named debug tool takes this idea and adds timing output with pretty colors. var a = require ( ' debug ' )( ' worker:a ' ) , b = require ( ' debug ' )( ' worker:b ' ) ; function work () { a ( ' doing lots of uninteresting work ' ) ; setTimeout ( work , Math . random () * 1000 ) ; } work () ; function workb () { b ( ' doing some work ' ) ; setTimeout ( workb , Math . random () * 2000 ) ; } workb () ; Isn't that beautiful! You can control what shows by setting the DEBUG environment variable - which means you can arbitrarily make your program spit out logs for the feature you're focusing on without changing any code inside. This is infinitely scalable. SERIOUSLY, EVERYBODY SHOULD USE DEBUG !!!

JavaScript Traces 🔗 A record of an API call that's made from one part of my application to another Yup, you can add that to debug . If you care about readable stack traces, Node.js can be fairly scary with its impenetrable internals. Fortunately you can clean it up with Sindre Sorhus' clean-stack : const cleanStack = require ( ' clean-stack ' ) ; const error = new Error ( ' Missing unicorn ' ) ; console . log ( error . stack ) ; /* Error: Missing unicorn at Object.<anonymous> (/Users/sindresorhus/dev/clean-stack/unicorn.js:2:15) at Module._compile (module.js:409:26) at Object.Module._extensions..js (module.js:416:10) at Module.load (module.js:343:32) at Function.Module._load (module.js:300:12) at Function.Module.runMain (module.js:441:10) at startup (node.js:139:18) */ console . log ( cleanStack ( error . stack )) ; /* Error: Missing unicorn at Object.<anonymous> (/Users/sindresorhus/dev/clean-stack/unicorn.js:2:15) */ stack-utils seems to also do the same thing but I haven't tried it out yet. Sometimes you have to output something when your Node process ends, either gracefully or abruptly. node-cleanup can help you tie up any loose ends and do optional reporting to the developer. What other ideas do you have here? Let me know 😻 Recommended resources from readers: OpenTracing - Vendor-neutral APIs and instrumentation for distributed tracing

Thomas Watson — An introduction to distributed tracing

Events 🔗 According to Honeycomb: Events are built up over time, gaining context as they go. Events provide both human and computer readable information, are not aggregated at write time, and allow us to ask more questions and combine the data in different ways. Charity also contrasted Events vs Metrics, Logs, and Traces in her mega response thread: we need to stop firing off metrics and log lines ad hoc-like and start issuing a single arbitrarily-wide event -- one per service, per request -- containing the full context of that request. EVERYTHING you know about it, did in it, parameters passed in to it, etc. Anything that could help us find and identify the request later. Think of that request like a needle in a haystack, and you need to be able to swiftly locate every damn needle in the stack. You especially want to stuff in any kind of IDs. Userids, app id, shopping cart id, etc Then, when the request is poised to exit or error the service, you ship that blob off to your o11y store in one very-wide blob. (A maturely instrumented service usually has a few HUNDRED dimensions per event, and it will have one event for each service that the request hits.) So this is really a concept that you should build up using unique identifiers for events and a place to store and query events somewhere. Honestly I don't have a ton of experience creating Events, but if you need unique id's you can use uuid to generate some, and event object creation is up to you I guess. You can also use concordance to compare, format, diff and serialize any JavaScript value to create events with just diff data.