A Pragmatic Overview of Async Hooks API in Node.js

Published in

ITNEXT

10 min read, Dec 12, 2018

Recently I wrote a post about request id tracing in Node.js apps. The proposed solution was built around the cls-hooked library, which in turn uses node’s built-in Async Hooks API. So, I decided to get more familiar with async hooks. In this post I’m going to share my findings and describe some real-world use cases for this API.

Let’s start our journey with a short intro.

An Intro to Async Hooks API

Async Hooks is an experimental API available in Node.js starting from v8.0.0. So, despite being experimental, it has existed for about a year and a half and seems to have no critical performance issues or other serious bugs. Experimental status means that the API may change in non-backward-compatible ways in the future or may even be removed completely. But, considering that this API had a couple of not-so-lucky predecessors, process.addAsyncListener (<v0.12) and AsyncWrap (v6–7, unofficial), Async Hooks API is not the very first attempt and should eventually become a stable API.

The documentation of the async_hooks module describes the purpose of this module in the following fashion:

The async_hooks module provides an API to register callbacks tracking the lifetime of asynchronous resources created inside a Node.js application.
…
An asynchronous resource represents an object with an associated callback. This callback may be called multiple times, for example, the 'connection' event in net.createServer(), or just a single time like in fs.open(). A resource can also be closed before the callback is called. AsyncHook does not explicitly distinguish between these different cases but will represent them as the abstract concept that is a resource.

So, async hooks allow you to track any (well, almost any) asynchronous stuff that is happening in your node app. Events related to the registration and invocation of callbacks and native promises in your code can potentially be listened to via an async hook. In other words, this API allows you to attach listeners to lifecycle events of macrotasks and microtasks. Moreover, the API allows listening to low-level async resources from node’s built-in native modules, like fs and net.

The core Async Hooks API can be expressed with the following snippet (a shortened version of this snippet from the docs):

const async_hooks = require('async_hooks')
// ID of the current execution context
const eid = async_hooks.executionAsyncId()
// ID of the handle responsible for triggering the callback of the
// current execution scope to call
const tid = async_hooks.triggerAsyncId()
const asyncHook = async_hooks.createHook({
  // called during object construction
  init: function (asyncId, type, triggerAsyncId, resource) { },
  // called just before resource's callback is called
  before: function (asyncId) { },
  // called just after resource's callback has finished
  after: function (asyncId) { },
  // called when an AsyncWrap instance is destroyed
  destroy: function (asyncId) { },
  // called only for promise resources, when the `resolve`
  // function passed to the `Promise` constructor is invoked
  promiseResolve: function (asyncId) { }
})
// starts listening for async events
asyncHook.enable()
// stops listening for new async events
asyncHook.disable()

You can see that there are not so many functions in Async Hooks API and, in general, it looks quite simple.

The executionAsyncId() function returns an identifier of the current execution context. The triggerAsyncId() function returns an id of the resource that was responsible for calling the callback that is currently being executed (let’s call it a parent or trigger id). The same id(s) are also available in async hook’s event listeners (see the createHook() function).
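
To make these two functions a bit more tangible, here is a minimal sketch that records both ids at the top level of a script and inside a timer callback. It relies only on the built-in async_hooks module; the seen object is just an illustrative name. The concrete numbers depend on the Node.js version and on how the script is started, so no output is shown here.

```javascript
const asyncHooks = require('async_hooks')

// ids observed at the top level and, later, inside the timer callback
const seen = {
  topLevelEid: asyncHooks.executionAsyncId(),
  insideEid: null,
  insideTid: null
}

setTimeout(() => {
  // inside the callback the execution id belongs to the Timeout resource,
  // while the trigger id points to the scope that called setTimeout()
  seen.insideEid = asyncHooks.executionAsyncId()
  seen.insideTid = asyncHooks.triggerAsyncId()
  console.log(seen)
}, 0)
```

The trigger id seen inside the callback should match the execution id of the scope that registered the timer, which is the parent relation used throughout this post.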

You can use executionAsyncId() and triggerAsyncId() functions without creating and enabling an async hook. But, in this case, promise executions are not assigned async ids due to the relatively expensive nature of the promise introspection API in V8.
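
If you want to see promise tracking kick in, the following sketch may help. It assumes that enabling a hook with an empty init listener is enough to turn promise tracking on, which is how the docs describe it; the variable names are mine.

```javascript
const asyncHooks = require('async_hooks')

// an empty init listener is enough to turn on promise execution tracking
const hook = asyncHooks.createHook({ init () {} }).enable()

const topLevelEid = asyncHooks.executionAsyncId()
let eidInThen = null

Promise.resolve().then(() => {
  // with a hook enabled, the chained promise gets its own async id,
  // which becomes the execution id while this callback runs
  eidInThen = asyncHooks.executionAsyncId()
  hook.disable()
  console.log(`top level eid=${topLevelEid}, eid inside then()=${eidInThen}`)
})
```

Try removing the createHook() line: in that case the callback should no longer get an execution id of its own.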

Now, we’re going to focus on behavior of async hooks, as it’s not so obvious how and when callbacks in a created hook will be triggered. As the next step, we’re going to do some experiments with async hooks and learn how they work.

Let’s Play!

Before doing any experiments, we’re going to implement a very primitive async hook. It’ll be storing necessary metadata for the event on init invocation and outputting it into the console for all subsequent invocations. To minimize the console output, it also supports filtering by event types. Here it is:

const asyncHooks = require('async_hooks')
module.exports = (types) => {
  // will contain metadata for all tracked events
  const tracked = {}
  const asyncHook = asyncHooks.createHook({
    init: (asyncId, type, triggerAsyncId, resource) => {
      if (!types || types.includes(type)) {
        const meta = {
          asyncId,
          type,
          pAsyncId: triggerAsyncId,
          res: resource
        }
        tracked[asyncId] = meta
        printMeta('init', meta)
      }
    },
    before: (asyncId) => {
      const meta = tracked[asyncId]
      if (meta) printMeta('before', meta)
    },
    after: (asyncId) => {
      const meta = tracked[asyncId]
      if (meta) printMeta('after', meta)
    },
    destroy: (asyncId) => {
      const meta = tracked[asyncId]
      if (meta) printMeta('destroy', meta)
      // delete meta for the event
      delete tracked[asyncId]
    },
    promiseResolve: (asyncId) => {
      const meta = tracked[asyncId]
      if (meta) printMeta('promiseResolve', meta)
    }
  })
  asyncHook.enable()
  // note: console.log() may itself create async resources, so it's safer
  // to always pass a type filter when using this hook
  function printMeta (eventName, meta) {
    console.log(`[${eventName}] asyncId=${meta.asyncId}, ` +
      `type=${meta.type}, pAsyncId=${meta.pAsyncId}, ` +
      `res type=${meta.res.constructor.name}`)
  }
}

We’re going to use it as a module in our experiments, so let’s place it in a file called verbose-hook.js. Now, we’re ready for experiments. For the sake of simplicity, we’ll be mostly using Timers API (to be precise, the setTimeout() function) in our examples.

First, let’s see what happens for a single timer:

require('./verbose-hook')(['Timeout'])
setTimeout(() => {
  console.log('Timeout happened')
}, 0)
console.log('Registered timeout')

This script will produce the following output:

[init] asyncId=5, type=Timeout, pAsyncId=1, res type=Timeout
Registered timeout
[before] asyncId=5, type=Timeout, pAsyncId=1, res type=Timeout
Timeout happened
[after] asyncId=5, type=Timeout, pAsyncId=1, res type=Timeout
[destroy] asyncId=5, type=Timeout, pAsyncId=1, res type=Timeout

As we can see, the lifecycle of a single setTimeout operation is very simple and straightforward. It starts with a (synchronous!) call of the init listener when the async operation (timeout’s callback) is being added into the timers queue, or, in other words, when an async resource is created. Once the callback is about to be triggered, the before event listener is fired, followed by listeners for the after and destroy events when the callback has finished execution.

You may wonder, what will happen in case of nested operations? Let’s see:

require('./verbose-hook')(['Timeout'])
setTimeout(() => {
  console.log('Timeout 1 happened')
  setTimeout(() => {
    console.log('Timeout 2 happened')
  }, 0)
  console.log('Registered timeout 2')
}, 0)
console.log('Registered timeout 1')

This script will produce a longer output which looks similar to this one:

[init] asyncId=5, type=Timeout, pAsyncId=1, res type=Timeout
Registered timeout 1
[before] asyncId=5, type=Timeout, pAsyncId=1, res type=Timeout
Timeout 1 happened
[init] asyncId=11, type=Timeout, pAsyncId=5, res type=Timeout
Registered timeout 2
[after] asyncId=5, type=Timeout, pAsyncId=1, res type=Timeout
[destroy] asyncId=5, type=Timeout, pAsyncId=1, res type=Timeout
[before] asyncId=11, type=Timeout, pAsyncId=5, res type=Timeout
Timeout 2 happened
[after] asyncId=11, type=Timeout, pAsyncId=5, res type=Timeout
[destroy] asyncId=11, type=Timeout, pAsyncId=5, res type=Timeout

The output shows that nested async operations are directly correlated in Async Hooks API. The id of the root setTimeout operation (asyncId=5) acts as the parent (or trigger) id for the nested operation (asyncId=11). Another interesting thing shown in this output is that the destroy event for the root call happens before the nested destroy. That’s because the destroy listener is called after the resource corresponding to the async operation (the Timeout object in our case) is destroyed.

Another important thing to notice about the destroy event is that under certain conditions it might not be triggered at all. Here is what the official docs say:

Some resources depend on garbage collection for cleanup, so if a reference is made to the resource object passed to init it is possible that destroy will never be called, causing a memory leak in the application. If the resource does not depend on garbage collection, then this will not be an issue.

So, if you’re developing a library based on async hooks, you need to be thinking of possible memory leak issues that your library may introduce.
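
One possible way to reduce this risk is to store only primitive metadata and never keep a reference to the resource object itself. The sketch below illustrates the idea; createTracker is a hypothetical helper of mine, not a part of the API, and the approach is just one option under the assumption that you don’t need the resource object later.

```javascript
const asyncHooks = require('async_hooks')

// hypothetical helper: tracks metadata for the given resource types
function createTracker (types) {
  const tracked = new Map() // asyncId -> primitive metadata
  const hook = asyncHooks.createHook({
    init (asyncId, type, triggerAsyncId, resource) {
      if (!types.includes(type)) return
      // copy primitive fields only; holding on to `resource` here could
      // keep it from being garbage collected and destroy from firing
      tracked.set(asyncId, { type, triggerAsyncId })
    },
    destroy (asyncId) {
      tracked.delete(asyncId)
    }
  })
  hook.enable()
  return { tracked, hook }
}

const tracker = createTracker(['Timeout'])
setTimeout(() => {}, 0)
// init has already run synchronously, so the entry is in the map
const firstId = [...tracker.tracked.keys()][0]

setTimeout(() => {
  console.log('first timer still tracked?', tracker.tracked.has(firstId))
}, 50)
```

Keep in mind that for resources whose cleanup depends on garbage collection, like promises, entries may still stay in the map for a while, so a real library would likely need an additional cleanup strategy.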

How about doing some bad things now? Let’s try to create a timeout, then clear it right away and see what events will be registered by the async hook:

require('./verbose-hook')(['Timeout'])
clearTimeout(
  setTimeout(() => {
    console.log('Timeout happened')
  }, 0)
)
console.log('Registered timeout')

This example produces the following output:

[init] asyncId=5, type=Timeout, pAsyncId=1, res type=Timeout
Registered timeout
[destroy] asyncId=5, type=Timeout, pAsyncId=1, res type=Timeout

Despite being immediately cancelled, the timeout still creates an async resource in Async Hooks terminology. Thus, listeners for init and destroy events are still triggered. This example also shows that before and after events are not guaranteed to be called.

So far, we haven’t seen any promiseResolve events. That’s because we weren’t using any native promises in our examples. Let’s start with the most trivial example:

require('./verbose-hook')(['PROMISE'])
Promise.resolve()
console.log('Registered Promise.resolve')

This script outputs the following into the console:

[init] asyncId=5, type=PROMISE, pAsyncId=1, res type=PromiseWrap
[promiseResolve] asyncId=5, type=PROMISE, pAsyncId=1, res type=PromiseWrap
Registered Promise.resolve

Interestingly, in this example the promiseResolve listener is run synchronously during execution of the Promise.resolve() function. As the docs mention, the promiseResolve listener will be triggered when the resolve function passed to the Promise constructor is invoked (either directly or through other means of resolving a promise). And in our case the resolve function is called synchronously by Promise.resolve().

As another consequence, promiseResolve (and other listeners) will be triggered multiple times when promises are chained with then/catch calls. In order to illustrate this, let’s see the following example (this time we’ll be using Promise.reject to make the example a bit different from the previous one):

require('./verbose-hook')(['PROMISE'])
Promise.reject()
  .catch(() => console.log('Promise.reject callback'))
console.log('Registered Promise.reject')

This script produces the following output:

[init] asyncId=5, type=PROMISE, pAsyncId=1, res type=PromiseWrap
[promiseResolve] asyncId=5, type=PROMISE, pAsyncId=1, res type=PromiseWrap
[init] asyncId=8, type=PROMISE, pAsyncId=5, res type=PromiseWrap
Registered Promise.reject
[before] asyncId=8, type=PROMISE, pAsyncId=5, res type=PromiseWrap
Promise.reject callback
[promiseResolve] asyncId=8, type=PROMISE, pAsyncId=5, res type=PromiseWrap
[after] asyncId=8, type=PROMISE, pAsyncId=5, res type=PromiseWrap

As expected, we see a hierarchy of two async resources here. The first one (asyncId=5) corresponds to the Promise.reject() invocation, while the second one (asyncId=8) stands for the chained catch() call.

By this point, you should have an understanding of main principles behind the Async Hooks API. Don’t hesitate to do more experiments with other scenarios and types of events.

Now, we’re going to discuss some internal implementation details.

Diving a Bit Deeper

An important note. I’m using one of the latest commits from node’s master branch in all links below, so the internals may be different in past/future versions. Also there are no code snippets in this section, so feel free to follow the links if you’re interested in seeing the source code.

If you want to understand how Async Hooks are implemented by reading node sources, then the first thing to check is the async_hooks module itself. It defines the AsyncHook class which describes objects returned by the createHook() function, as well as the so-called JS Embedder API. The latter allows you to extend the AsyncResource class, so that lifetime events of your own resources will be processed by Async Hooks API.
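
To give a rough idea of what using the JS Embedder API may look like, here is a small sketch with a custom resource. The Job class and the HypotheticalJob type name are made up for this illustration; AsyncResource, runInAsyncScope(), emitDestroy() and asyncId() come from the async_hooks module.

```javascript
const { AsyncResource, createHook, executionAsyncId } = require('async_hooks')

// hypothetical resource type used only for this illustration
class Job extends AsyncResource {
  constructor () {
    super('HypotheticalJob') // emits the init event
  }

  run (callback) {
    // emits before, calls the callback, then emits after
    return this.runInAsyncScope(callback)
  }

  done () {
    this.emitDestroy() // schedules the destroy event
  }
}

const events = []
let jobId = null
createHook({
  init (asyncId, type) {
    if (type === 'HypotheticalJob') {
      jobId = asyncId
      events.push('init')
    }
  },
  before (asyncId) { if (asyncId === jobId) events.push('before') },
  after (asyncId) { if (asyncId === jobId) events.push('after') },
  destroy (asyncId) { if (asyncId === jobId) events.push('destroy') }
}).enable()

const job = new Job()
// inside the scope the execution id is the id of our resource
const sameScope = job.run(() => executionAsyncId() === job.asyncId())
job.done()
console.log(events.join(', '), sameScope)
```

Note that the listeners only push into an array instead of printing, which keeps the hook itself free of any async operations.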

If you continue to dive deeper, you’re going to find the internal/async_hooks module. This module is used by the public one and acts as a bridge between native code and JS part of the API. The C++ part of Async Hooks API is represented by the async_wrap native module, which also defines the public C++ Embedder API. The native API defines AsyncWrap and PromiseWrap classes that we’re going to be considering later. So, these three modules define the main part of Async Hooks API implementation.

Let’s consider a concrete example of the call chain that happens behind the scenes right before an init listener is triggered.

On the JS side, the AsyncResource class calls the emitInit() function in its constructor. This function is defined in the internal/async_hooks module. In turn, this function calls the emitInitNative() function of the same module. Finally, this function synchronously iterates over existing active hooks and calls init listeners for each of them. That’s why we have seen synchronous invocations of init listeners in our experiments.
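
The synchronous nature of init is easy to check without reading the sources. The following sketch records the order of events around a setTimeout() call; the order array is just an illustrative name.

```javascript
const asyncHooks = require('async_hooks')

const order = []
const hook = asyncHooks.createHook({
  init (asyncId, type) {
    if (type === 'Timeout') order.push('init')
  }
}).enable()

order.push('before setTimeout')
const timer = setTimeout(() => {}, 0)
order.push('after setTimeout')

// clean up before printing anything
clearTimeout(timer)
hook.disable()

// prints: before setTimeout -> init -> after setTimeout
console.log(order.join(' -> '))
```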

On the C++ side, i.e. for native async resources, the same emitInitNative() function is synchronously (from native code’s execution perspective) called in the async resource constructor. I don’t think it’s necessary to go through the whole chain of calls this time, so believe me (or check it yourself) that the call eventually happens in the EmitAsyncInit() function.

Expectedly, Async Hooks API (and namely AsyncWrap native class) is integrated into all standard Node.js modules. For example, you may find AsyncWrap in internals of native modules, like crypto and fs, and in many other places. As another example, the promiseResolve event listener is based on a hook for native promises.

In summary, any async stuff that is happening in your node app will be listenable via Async Hooks API. The only exception might be some 3rd-party native modules and wrappers around them. But for those you can still use the Embedder API to integrate them with async hooks.

Now, as we have a better understanding of async hooks internals and principles, let’s think of some real-world problems that we can solve with this API.

But What Are Async Hooks Good For?

The first use case for async hooks is, as we already know, Continuation-local storage (CLS), an async variation of the well-known Thread-local storage concept from the multi-threaded world. In short, the implementation in the cls-hooked library is based on tracking associations between a context object and a chain of async calls starting with an execution of a particular function or a lifecycle of an EventEmitter object. Async Hooks API is used here to listen to async operations and track contexts through the whole execution chain. Without this or a similar built-in API, you would have to deal with lots of monkey-patching for all async APIs in Node.js.
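
To illustrate the general idea (and not the actual cls-hooked implementation, which is more involved), here is a heavily simplified sketch of context propagation on top of async hooks. The runWithContext() and getContext() helpers and the HypotheticalContext type name are my own inventions for this example.

```javascript
const asyncHooks = require('async_hooks')

const contexts = new Map() // asyncId -> context object

asyncHooks.createHook({
  init (asyncId) {
    // a new resource inherits the context of the scope that created it
    const current = contexts.get(asyncHooks.executionAsyncId())
    if (current !== undefined) contexts.set(asyncId, current)
  },
  destroy (asyncId) {
    // note: for promises this depends on garbage collection
    contexts.delete(asyncId)
  }
}).enable()

// hypothetical helper: runs fn with ctx visible to everything it triggers
function runWithContext (ctx, fn) {
  const scope = new asyncHooks.AsyncResource('HypotheticalContext')
  contexts.set(scope.asyncId(), ctx)
  return scope.runInAsyncScope(fn)
}

// hypothetical helper: returns the context of the current execution scope
function getContext () {
  return contexts.get(asyncHooks.executionAsyncId())
}

runWithContext({ requestId: 'req-1' }, () => {
  setTimeout(() => {
    // the timer was created inside the scope, so it sees the same context
    console.log('in timer:', getContext())
  }, 0)
})
```

A production-ready library has to handle many more details (nested contexts, cleanup, EventEmitter binding), so treat this purely as a mental model.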

The second use case would be about profiling and monitoring. Async hooks in combination with the Performance Timing API may be used to profile your app. These two APIs allow you to collect information about async operations that are happening in the app and measure their duration. For the purpose of web app monitoring, one can build a middleware that gathers request handling statistics in a sampling manner, i.e. for a certain percentage of requests (not all of them), thus minimizing the performance impact on the app. This information can be written into a file or streamed to a standalone service and visualized in many ways later, e.g. as a flame graph.
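
As a very rough sketch of the profiling idea, the snippet below measures how long timer callbacks run by taking timestamps in the before and after listeners. It uses performance.now() from the perf_hooks module; the startedAt and durations names are illustrative, and a real profiler would cover more resource types and add sampling.

```javascript
const asyncHooks = require('async_hooks')
const { performance } = require('perf_hooks')

const startedAt = new Map() // asyncId -> timestamp of the before event
const durations = [] // collected { asyncId, ms } samples

const hook = asyncHooks.createHook({
  init (asyncId, type) {
    // only timers are measured in this sketch
    if (type === 'Timeout') startedAt.set(asyncId, null)
  },
  before (asyncId) {
    if (startedAt.has(asyncId)) startedAt.set(asyncId, performance.now())
  },
  after (asyncId) {
    const start = startedAt.get(asyncId)
    if (start != null) {
      durations.push({ asyncId, ms: performance.now() - start })
    }
  },
  destroy (asyncId) {
    startedAt.delete(asyncId)
  }
})
hook.enable()

setTimeout(() => {
  // simulate some synchronous work inside the callback
  const until = performance.now() + 5
  while (performance.now() < until) {}
}, 0)

setTimeout(() => {
  hook.disable()
  console.log(durations)
}, 50)
```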

As a real world example of a profiling tool built on top of Async Hooks API, I can name the Bubbleprof tool which is a part of Clinic.js. Once run, it traces async resources and does a human-readable visualization of the collected data. Check out this blog post to learn more about Bubbleprof.

Hopefully, this post gave you a better understanding of async hooks. As we have seen, Async Hooks API is a powerful out-of-the-box feature that allows you to solve real-world problems in a neat way.

P.S. If you know any other real-world use cases for async hooks, feel free to describe them in comments below. I’d really like to hear about those.


Written by Andrey Pechkurov