How Node.js handles I/O?

03 MARCH 2020 • 9 MIN READ

Grzegorz Bar

Everyone who has ever come across Node has also heard about its cross-platform, event-driven, non-blocking I/O solution and its capabilities.
In this article, I would like to explain what makes Node's approach so special and what exactly handles all that I/O complexity behind the red curtain.
Before we start and dig deeper into the rabbit hole, let's go through a bunch of definitions that will help us understand the entire design better.

Reactor Pattern

It is an event handling design pattern for taking care of service requests delivered concurrently to a service handler by one or more inputs. The service handler then demultiplexes the incoming requests and dispatches them synchronously to the associated request handlers.

Let's check how this definition fits Node. According to the documentation, two abstract entities are involved here: the Event Demultiplexer and the Event Queue. The first one receives an I/O request and delegates it to the appropriate handler. When the request has been processed, the demultiplexer adds the registered callback handler (the Event) to the event queue, where it waits to be executed.

The program that manages the entire process is called the Event Loop. An Event Loop is a single-threaded and semi-infinite loop. The reason why this is called a semi-infinite loop is that it actually quits at the point when there's no more work to be done. From the developer's perspective, this is where the program exits.
The above sounds quite simple, but given Node's cross-platform, non-blocking, and asynchronous nature, we haven't covered everything just yet.
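To make the three pieces described in this section more concrete, here is a toy model of a reactor. It is only a sketch: the class EventDemultiplexer, the function runEventLoop, and the request names are hypothetical, the "I/O" completes instantly, and none of this reflects how Node is actually implemented.

```javascript
// Toy model of the reactor pattern. All names are hypothetical.
class EventDemultiplexer {
    constructor(eventQueue) {
        this.eventQueue = eventQueue;
        this.pending = [];
    }

    // Register an I/O request together with the callback to run when it completes.
    register(request, callback) {
        this.pending.push({ request, callback });
    }

    hasPending() {
        return this.pending.length > 0;
    }

    // Pretend the operating system reports every pending request as completed,
    // then add one event per completed request to the event queue.
    poll() {
        const completed = this.pending;
        this.pending = [];
        for (const { request, callback } of completed) {
            this.eventQueue.push({ callback, result: `result of ${request}` });
        }
    }
}

// Semi-infinite loop: it keeps running only while there is work left.
function runEventLoop(demultiplexer, eventQueue) {
    while (demultiplexer.hasPending() || eventQueue.length > 0) {
        demultiplexer.poll();
        while (eventQueue.length > 0) {
            const event = eventQueue.shift();
            event.callback(event.result);
        }
    }
}

const eventQueue = [];
const demultiplexer = new EventDemultiplexer(eventQueue);
const log = [];

demultiplexer.register('read file', (result) => {
    log.push(result);
    // A handler may register more work, which keeps the loop alive.
    demultiplexer.register('send response', (r) => log.push(r));
});
demultiplexer.register('query database', (result) => log.push(result));

runEventLoop(demultiplexer, eventQueue);
console.log(log);
```

Once the last callback has run and nothing is pending, runEventLoop returns. That mirrors the point at which a real Node program exits.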

Event Demultiplexer

Now we know the responsibilities of event demultiplexer, but this is just an abstract concept described in the reactor pattern. In the real world, its implementation differs across systems and is known under different names:

  • epoll on Linux
  • kqueue on macOS
  • event ports on Solaris
  • IOCP on Windows

Node.js takes advantage of the low-level, non-blocking, asynchronous I/O facilities that these operating system mechanisms provide.
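As a small illustration, the list above can be expressed as a lookup keyed by process.platform. The helper demultiplexerFor is hypothetical and covers only the four systems named in this article. Node itself does not expose which mechanism it uses, so this is a reading aid rather than a way to inspect the runtime.

```javascript
// Mapping from Node's process.platform values to the mechanisms listed above.
const DEMULTIPLEXERS = new Map([
    ['linux', 'epoll'],
    ['darwin', 'kqueue'],      // macOS
    ['sunos', 'event ports'],  // Solaris
    ['win32', 'IOCP'],         // Windows
]);

// Hypothetical helper: returns the mechanism name for a platform string.
function demultiplexerFor(platform) {
    return DEMULTIPLEXERS.get(platform) || 'not covered in this article';
}

console.log(`${process.platform}: ${demultiplexerFor(process.platform)}`);
```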

I/O limitations

Operating systems and their I/O implementations are very complex, and some kinds of I/O are not fully supported in an asynchronous way. These gaps mostly concern the file I/O implementations provided by each system. It's worth mentioning that this also has an impact on some of Node's DNS functions.
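The sketch below calls two APIs from the affected categories: fs.stat (file I/O) and dns.lookup (which relies on the system resolver). It assumes that 'localhost' resolves on your machine and that the temporary directory exists; if not, the callbacks simply report an error. From the caller's point of view both calls remain asynchronous, so the synchronous code always finishes first.

```javascript
const dns = require('dns');
const fs = require('fs');
const os = require('os');

const order = [];

// File I/O: asynchronous for the caller even where the system call itself blocks.
fs.stat(os.tmpdir(), (err) => {
    order.push('fs.stat');
    console.log('fs.stat finished', err ? `with error ${err.code}` : 'successfully');
});

// DNS lookup through the system resolver.
dns.lookup('localhost', (err, address) => {
    order.push('dns.lookup');
    console.log('dns.lookup finished', err ? `with error ${err.code}` : `with ${address}`);
});

order.push('synchronous code');
console.log('synchronous code finished first');
```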

Thread Pool

To preserve complete asynchrony across platforms, Node uses a thread pool for those operations that cannot be covered by the system's asynchronous I/O. The thread pool also handles CPU-intensive operations (some crypto functions and the async zlib functions provided by Node) so that they do not block the event loop, which would stall the whole application.
This does not mean that the thread pool performs or delegates all the I/O for Node. The thread pool supports only those parts of system I/O that are not asynchronous. Node and the event loop are both single-threaded (they share the same thread during execution), but some of Node's functions (blocking operations) take advantage of multi-threading thanks to the thread pool. You can change the thread pool size from the default of 4 up to 1024 threads with the UV_THREADPOOL_SIZE environment variable. Take a look at this simple program and at how its results depend on the thread pool size:

const crypto = require('crypto');

const start = Date.now();

// Request 4096 random bytes asynchronously and log how long after
// startup the callback fired.
function logRandomBytesTime() {
    crypto.randomBytes(4096, (err) => {
        if (err) throw err;
        console.log('RandomBytes time: ', Date.now() - start);
    });
}

// Queue eight requests at once so that they compete for the pool's threads.
for (let i = 0; i < 8; i++) {
    logRandomBytesTime();
}

Let's run the above code:

➜  nodejs-example export UV_THREADPOOL_SIZE=4
➜  nodejs-example node thread-pool.js
RandomBytes time:  2
RandomBytes time:  5
RandomBytes time:  6
RandomBytes time:  6
RandomBytes time:  6
RandomBytes time:  6
RandomBytes time:  6
RandomBytes time:  6
➜  nodejs-example export UV_THREADPOOL_SIZE=5
➜  nodejs-example node thread-pool.js
RandomBytes time:  2
RandomBytes time:  5
RandomBytes time:  5
RandomBytes time:  5
RandomBytes time:  5
RandomBytes time:  6
RandomBytes time:  6
RandomBytes time:  6
➜  nodejs-example export UV_THREADPOOL_SIZE=10
➜  nodejs-example node thread-pool.js
RandomBytes time:  1
RandomBytes time:  4
RandomBytes time:  4
RandomBytes time:  5
RandomBytes time:  5
RandomBytes time:  5
RandomBytes time:  5
RandomBytes time:  5

This simple usage of crypto.randomBytes shows that the work is delegated to the thread pool and that the completion times change as the thread pool grows. Nevertheless, that does not mean that you can set the size to 1024 and completely forget about it.

➜  nodejs-example export UV_THREADPOOL_SIZE=1024
➜  nodejs-example node thread-pool.js
RandomBytes time:  3
RandomBytes time:  6
RandomBytes time:  6
RandomBytes time:  6
RandomBytes time:  6
RandomBytes time:  6
RandomBytes time:  6
RandomBytes time:  7

The execution time for size=1024 was longer than in the other examples because of hardware limitations and the memory allocated for the thread pool. The examples were run on a quad-core CPU with 2 threads per core, which gives 8 available threads.
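Generating 4096 random bytes takes so little time that the differences above are only a few milliseconds. A heavier task should make the effect of the pool size easier to see. The sketch below swaps in crypto.pbkdf2, which is my substitution rather than part of the original experiment. The helper timeKeyDerivations, the password, the salt, and the iteration count are all hypothetical values chosen for illustration.

```javascript
const crypto = require('crypto');

// Hypothetical helper: starts `count` key derivations at once and resolves with
// the elapsed milliseconds at which each of them finished.
function timeKeyDerivations(count, iterations = 100000) {
    const start = Date.now();
    const tasks = [];
    for (let i = 0; i < count; i++) {
        tasks.push(new Promise((resolve, reject) => {
            crypto.pbkdf2('password', 'salt', iterations, 64, 'sha512', (err) => {
                if (err) return reject(err);
                resolve(Date.now() - start);
            });
        }));
    }
    return Promise.all(tasks);
}

timeKeyDerivations(8).then((times) => {
    times.forEach((ms, i) => {
        console.log(`pbkdf2 #${i + 1} finished after ${ms} ms`);
    });
});
```

Run it with different values of UV_THREADPOOL_SIZE, as in the sessions above. With the default of 4 threads and 8 tasks, I would expect the tasks to finish in roughly two groups, but the exact numbers depend on your hardware.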

So far we’ve learned that Node takes advantage of system I/O and is supported by Thread Pool to achieve complete asynchrony, non-blocking behavior on heavy stuff, and even multi-threading. Is it Node that manages and handles all those things or maybe there’s some low-level library that pulls all those strings and only exposes the API for Node?

libuv

According to its documentation, libuv is a multi-platform support library with a focus on asynchronous I/O. It was primarily developed for use by Node.js, but it's also used by Luvit, Julia, pyuv, php-uv, and many more.

libuv implements the Reactor Pattern and provides an advanced implementation of the Event Demultiplexer together with a set of I/O processing APIs. Moreover, libuv provides the entire Event Loop and Event Queue mechanism.
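One way to observe the event loop and its queues from JavaScript is to schedule several callbacks from inside an I/O callback and record the order in which they run. This is a minimal sketch that relies on the ordering documented by Node; the use of fs.stat on the temporary directory as the I/O operation is an arbitrary choice.

```javascript
const fs = require('fs');
const os = require('os');

const order = [];

fs.stat(os.tmpdir(), () => {
    // We are now inside an I/O callback dispatched by the event loop.
    setTimeout(() => order.push('setTimeout'), 0);
    setImmediate(() => order.push('setImmediate'));
    process.nextTick(() => order.push('nextTick'));
});

// Print the recorded order once everything above has had time to run.
// Expected, per Node's documented behavior: nextTick -> setImmediate -> setTimeout
setTimeout(() => console.log(order.join(' -> ')), 100);
```

The callbacks do not run in the order in which they were scheduled, because each one waits in a different queue that the loop visits at a different point of its iteration.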

[Figure: libuv architecture]

Summary

libuv is a powerful low-level I/O engine with cross-platform support that provides complete asynchronous capabilities for anything that is built on top of it. Understanding how libuv's implementation of the Reactor Pattern works is crucial for everyone who wants to use Node.js efficiently.