Node.js concurrency model — a Java vs Javascript comparison

Tan Bui
6 min read · Feb 16, 2022

We know that JavaScript is single-threaded, with non-blocking, asynchronous I/O. But Node.js also supports clusters and child processes, so is it really single-threaded at all? Let’s work through that question by comparing it with a natively multi-threaded runtime, the JVM/Java.

Recap

1. Concurrent vs Parallel, from The Art of Concurrency

A system is said to be concurrent if it can support two or more actions in progress at the same time. A system is said to be parallel if it can support two or more actions executing simultaneously.

2. Node.js concurrency model in a nutshell

Forget about child processes for now; let’s assume a JS engine is running in a browser on a single thread.

Event-loop model

JS statements are pushed onto the call stack and executed one by one. When there is an async operation, it is handed off to run outside the main thread (the main thread continues with the next statement), and when it is done, its callback is placed in the Task Queue, where it waits to be picked up by the event loop and processed by the main thread. (Or you can just watch this legacy video.)

Note: setTimeout is a bit different from Promise-based async operations (e.g. fetch), but it is fine for demo purposes: in both cases the main thread moves on and the callback is scheduled later by the event loop. The difference is that a setTimeout callback goes to the Task Queue, while a Promise callback goes to the Microtask Queue.

3. Java concurrency model in a nutshell

Unless you explicitly create a new thread or use the Future APIs, a Java program runs on a single thread, with blocking I/O and synchronous execution. The “async operation” is a common concept in both languages: in JS it returns a Promise, which is similar to CompletableFuture in Java.
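To make the Promise comparison concrete, here is a minimal sketch of a CompletableFuture chain. The class and method names (PromiseLikeDemo, greetAsync) are made up for illustration; supplyAsync runs the work on the common pool and thenApply plays roughly the role of promise.then(...).

```java
import java.util.concurrent.CompletableFuture;

public class PromiseLikeDemo {
    // Hypothetical helper: computes a greeting off the calling thread,
    // then transforms the result, similar to promise.then(...) in JS.
    static CompletableFuture<String> greetAsync(String name) {
        return CompletableFuture
                .supplyAsync(() -> "Hello, " + name)
                .thenApply(String::toUpperCase);
    }

    public static void main(String[] args) {
        CompletableFuture<String> future = greetAsync("world");
        System.out.println("Main thread keeps running...");
        // join() blocks the caller until the result is available.
        System.out.println(future.join());
    }
}
```

Unlike a JS Promise, the caller here can block on the result with join(); in JS the main thread never blocks and only continues in a callback or after await.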

In this article, we discuss how Java and JavaScript handle multi-threading, shared memory, and thread communication.

Java multi-threading — the older first

In Java, we control child threads either manually or through a concurrency API such as ExecutorService. For example:

System.out.println("Start...");
new Thread(() -> System.out.println("I'm from another thread.")).start();
System.out.println("End...");

As we can see, for operations that take a long time or that can be executed in parallel, spawning a new thread to handle them can improve performance and avoids blocking the main thread.
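The same idea with ExecutorService instead of a hand-made Thread might look like the sketch below. ExecutorDemo and sumInBackground are illustrative names, not part of any library; the task is submitted as a Callable and its result is read back through a Future.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorDemo {
    // Hypothetical helper: sums 1..upTo on a background thread.
    static int sumInBackground(int upTo) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Integer> result = executor.submit(() -> {
                int sum = 0;
                for (int i = 1; i <= upTo; i++) {
                    sum += i;
                }
                return sum;
            });
            System.out.println("Task submitted, caller keeps running...");
            return result.get(); // blocks until the background task finishes
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown(); // lets the pool thread exit
        }
    }

    public static void main(String[] args) {
        System.out.println(sumInBackground(100)); // prints 5050
    }
}
```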

An example of fetching JSON data using Java
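As a rough sketch of what such a fetch can look like (not the author's original example), the code below uses java.net.http.HttpClient (Java 11+) with sendAsync. To stay self-contained it starts a throwaway local server with the JDK's built-in HttpServer; the /todo path and the JSON payload are made up for the demo.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;

public class FetchJsonDemo {
    // Hypothetical helper: serves a fixed JSON document locally and fetches it.
    static String fetchFromLocalServer() {
        byte[] json = "{\"id\":1,\"title\":\"demo\"}".getBytes(StandardCharsets.UTF_8);
        HttpServer server;
        try {
            // Port 0 asks the OS for any free port.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        server.createContext("/todo", exchange -> {
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, json.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(json);
            }
        });
        server.start();
        try {
            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/todo");
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
            // sendAsync returns immediately; the response arrives on another thread.
            CompletableFuture<String> body = client
                    .sendAsync(request, HttpResponse.BodyHandlers.ofString())
                    .thenApply(HttpResponse::body);
            System.out.println("Request sent, main thread is free...");
            return body.join(); // wait for the body only when it is needed
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchFromLocalServer());
    }
}
```

In a real program the URI would point at an actual JSON endpoint and the local server part would disappear.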

Shared memory

Multiple threads can read and write the same object in memory, which leads to concurrency issues such as race conditions. For example:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Concurrency {
    static int count = 0;

    public static void main(String[] args) {
        // Both tasks increment the same static field without synchronization.
        Runnable process = () -> {
            for (int i = 0; i < 100000; i++) {
                count++; // not atomic: read, add, write
            }
            System.out.printf("Thread %s: %d%n",
                    Thread.currentThread().getName(), count);
        };

        // Submit the Runnable directly; wrapping it in a Thread is unnecessary.
        ExecutorService executor = Executors.newFixedThreadPool(2);
        executor.submit(process);
        executor.submit(process);
        executor.shutdown();
    }
}
// Result (varies between runs)
// Thread pool-1-thread-1: 103830
// Thread pool-1-thread-2: 132853

One solution is to lock the critical section (i.e. the code block that could cause race conditions) so that it can only be executed by one thread at a time:

public class Concurrency {
    static Object lock = new Object();
    static volatile int count = 0;

    public static void main(String[] args) {
        Runnable process = () -> {
            for (int i = 0; i < 100000; i++) {
                synchronized (lock) {
                    count++;
                }
            }
            System.out.printf("Thread %s: %d%n",
                    Thread.currentThread().getName(), count);
        };
        ...

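And since count could be cached differently on different CPU cores, it’s better to mark it with the volatile keyword so that every read sees the latest value written to main memory. Note that volatile only guarantees visibility, not atomicity, so the lock around count++ is still needed. With both in place, the final count is always 200000 once the two threads have completed.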

An alternative is to use the atomic classes (e.g. AtomicInteger), which guarantee that updates are visible to all threads and that each read-modify-write happens as a single atomic operation. A full example of these two solutions.
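A minimal sketch of the atomic variant of the counter, assuming the same two-thread, 100000-increment setup as above. The class and method names (AtomicCounter, countWithAtomic) are illustrative only.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounter {
    // Hypothetical helper: increments a shared counter from several threads.
    static int countWithAtomic(int threads, int incrementsPerThread) {
        AtomicInteger count = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            executor.submit(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    count.incrementAndGet(); // atomic read-modify-write, no lock
                }
            });
        }
        executor.shutdown();
        try {
            // Wait for all submitted tasks before reading the final value.
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return count.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithAtomic(2, 100000)); // prints 200000
    }
}
```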

Thread communication — wait() and notify()

Besides shared memory, Java uses the Lock and Monitor synchronization constructs to control thread execution.

In the Concurrency class example above, the lock object is acquired and released automatically when a thread enters and leaves the synchronized (lock) {...} block. While a thread holds the monitor of the lock object, it can call lockObject.wait() to release the lock and suspend itself, so that a second thread can enter the block. The second thread later calls lockObject.notify() to wake the first one, which resumes once the second thread has released the lock.

An example of wait and notify.
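One way to sketch wait() and notify() is a one-slot mailbox: a consumer thread waits until a message arrives, and the producer notifies it after storing one. All names here (WaitNotifyDemo, take, put, handOff) are hypothetical, and this is not the author's original example.

```java
import java.util.concurrent.atomic.AtomicReference;

public class WaitNotifyDemo {
    private final Object lock = new Object();
    private String message; // guarded by lock

    // Blocks until a message is available, then removes and returns it.
    String take() throws InterruptedException {
        synchronized (lock) {
            while (message == null) { // loop guards against spurious wakeups
                lock.wait();          // releases the lock while suspended
            }
            String received = message;
            message = null;
            return received;
        }
    }

    // Stores a message and wakes one waiting thread.
    void put(String newMessage) {
        synchronized (lock) {
            message = newMessage;
            lock.notify(); // the waiter resumes after this block is left
        }
    }

    // Hands one message from the calling thread to a consumer thread.
    static String handOff(String text) {
        WaitNotifyDemo box = new WaitNotifyDemo();
        AtomicReference<String> seen = new AtomicReference<>();
        Thread consumer = new Thread(() -> {
            try {
                seen.set("consumer got: " + box.take());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        box.put(text);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(handOff("ping")); // prints "consumer got: ping"
    }
}
```

Checking the condition in a while loop means the result is the same whether put() runs before or after the consumer starts waiting.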

Good practices:

  • Pay attention to identifying the critical sections
  • Try to keep each critical section as small as possible
  • Avoid synchronizing a whole method or class; synchronize only the critical sections
  • Try to make use of supported APIs, e.g. atomic variables, concurrent maps.
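As an illustration of the last point, the sketch below counts words from several threads with ConcurrentHashMap.merge, which updates each entry atomically without an explicit lock. ConcurrentMapDemo and countWords are made-up names for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentMapDemo {
    // Hypothetical helper: every thread counts the whole list into one shared map.
    static Map<String, Integer> countWords(List<String> words, int threads) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (String word : words) {
                    // merge() is atomic per key on ConcurrentHashMap.
                    counts.merge(word, 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // "a" appears twice and 4 threads each count it, so a=8 and b=4.
        System.out.println(countWords(List.of("a", "b", "a"), 4));
    }
}
```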

Javascript multi-threading — the younger comes

To be clear, JavaScript is single-threaded when running in a browser (let’s ignore Web Workers for now). But when it comes to Node.js, JS can run multi-threaded.

A Node.js process is created when you run a Node.js program such as node app.js, or when a child process is spawned through the child_process or cluster modules. Each process has its own memory and resources. Simple examples of child process:

The cluster module makes it even easier to distribute the workload from the primary process to the worker processes. The Node.js Cluster API docs already explain it very clearly.

Notes:

  • Again, there is no shared memory between processes; communication is done by sending and receiving events via an IPC channel.
  • Node.js does not provide routing logic. It is therefore important to design an application such that it does not rely too heavily on in-memory data objects for things like sessions and login.
  • Node.js does not automatically manage the number of workers. It is the application’s responsibility to manage the worker pool based on its own needs.

Cluster child processes

An example of cluster primary process and worker processes:

Since creating new processes is expensive, worker threads are the recommended approach for multi-threading in Node.js: a worker only creates a thread inside the current process. Workers are useful for CPU-intensive operations; they do not help much with I/O-intensive work.

Like child_process or cluster, worker_threads communicate with the main thread by passing messages, here over a message channel (MessagePort) inside the same process rather than an IPC channel. Unlike them, however, worker_threads can share memory. They do so by transferring ArrayBuffer instances or sharing SharedArrayBuffer instances.

The simplest example is calculating Fibonacci numbers:

The CPU activity view shows no new thread, yet two CPU threads reach 100%.

Worker threads

Below is another example that uses SharedArrayBuffer to create shared memory between threads. Note that Atomics methods are used to prevent race conditions, so that each read or write on the array is performed as a single atomic operation.

Good practices:

  • Keep in mind that creating a Worker, even though it’s a lot cheaper than forking a process, can also use too many resources depending on your needs. In practice, use a pool of Workers for these kinds of tasks. Otherwise, the overhead of creating Workers would likely exceed their benefit. You can look for a generic pool implementation in TypeScript or use the workerpool npm module.
  • Node.js provides AsyncResource to provide proper async tracking of a worker pool.
  • workerpool also supports multi-threading in the browser using Web Workers. A simple demo of blocking UI vs non-blocking using workerpool (source code).

In Java, multi-threading looks natural and is easy to implement; the trade-off is that race conditions can happen easily and require careful attention. In Node.js, it is common to communicate between threads over a message channel, which reduces race conditions (although shared memory and atomic operations are also supported), but it requires explicit code to manage the thread pool and handle message events.
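For comparison, the message-passing style is also available in Java. The sketch below is only an analogy to postMessage-style communication, using two BlockingQueue instances as the "channel" so that the threads share no mutable state; MessagePassingDemo and askWorker are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class MessagePassingDemo {
    // Naive recursive Fibonacci, used here as the CPU-bound task.
    static long fib(int n) {
        return n < 2 ? n : fib(n - 1) + fib(n - 2);
    }

    // Hypothetical helper: sends n to a worker thread and waits for its reply.
    static long askWorker(int n) {
        BlockingQueue<Integer> requests = new LinkedBlockingQueue<>();
        BlockingQueue<Long> replies = new LinkedBlockingQueue<>();

        Thread worker = new Thread(() -> {
            try {
                int input = requests.take(); // wait for a message
                replies.put(fib(input));     // send the result back
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        try {
            requests.put(n);
            return replies.take(); // block until the worker answers
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(askWorker(20)); // prints 6765
    }
}
```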

Tan Bui

Software Engineer @Smartly.io, phototaker, naturelover.