Threaded versus event-driven architecture

Node.js's blistering performance is usually attributed to its asynchronous, event-driven architecture and its use of the V8 JavaScript engine. That's a nice thing to say, but what's the rationale for the statement?

The V8 JavaScript engine is among the fastest JavaScript implementations. As a result, Chrome is widely used not just to view website content, but to run complex applications. Examples include Gmail, the Google G Suite applications (Docs, Slides, and so on), image editors such as Pixlr, and drawing applications such as draw.io and Canva. Both Atom and Microsoft's Visual Studio Code are excellent IDEs that just happen to be implemented in Node.js and Chromium using Electron. That these applications exist and are happily used by a large number of people is testament to V8's performance. Node.js benefits from V8 performance improvements.

The normal application server model uses blocking I/O to retrieve data, and it uses threads for concurrency. Blocking I/O causes threads to wait on results. That causes churn between threads as the application server starts and stops the threads to handle requests. Each suspended thread (typically waiting on an I/O operation to finish) holds on to a full stack's worth of memory, increasing memory consumption overhead. Threads add complexity to the application server as well as server overhead.

Node.js has a single execution thread with no waiting on I/O or context switching. Instead, there is an event loop looking for events and dispatching them to handler functions. The paradigm is that any operation that would block or otherwise take time to complete must use the asynchronous model. Such functions are given a callback function (often an anonymous function) to act as the handler, or else (since the advent of ES2015 Promises) they return a Promise. The callback is invoked, or the Promise is settled, when the operation is complete. In the meantime, control returns to the event loop, which continues dispatching events.
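As a minimal sketch of the two styles, the following uses a timer to stand in for a slow operation. The names slowDouble and slowDoubleP are hypothetical, invented for this illustration; util.promisify is the standard Node.js helper that converts an error-first callback function into one that returns a Promise.

```javascript
const { promisify } = require('util');

// Hypothetical slow operation: a timer stands in for real I/O.
// It follows the Node.js convention of an error-first callback.
function slowDouble(value, callback) {
  setTimeout(() => {
    if (typeof value !== 'number') callback(new TypeError('value must be a number'));
    else callback(null, value * 2);
  }, 10);
}

// Promise-returning version of the same operation.
const slowDoubleP = promisify(slowDouble);

// Callback style: the handler runs once the operation completes.
slowDouble(21, (err, result) => {
  if (err) console.error(err);
  else console.log('callback result:', result);
});

// Promise style: the Promise settles once the operation completes.
slowDoubleP(21)
  .then(result => console.log('promise result:', result))
  .catch(err => console.error(err));
```

In both cases the call itself returns immediately, and the result is delivered later from the event loop.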

At the Node.js Interactive conference in 2017, IBM's Chris Bailey made a case for Node.js being an excellent choice for highly scalable microservices. Key performance characteristics are I/O performance, measured in transactions per second; startup time, because that limits how quickly your service can scale up to meet demand; and memory footprint, because that determines how many application instances can be deployed per server. Node.js excels on all those measures; with every subsequent release, each is either improving or remaining fairly steady. Bailey presented figures comparing Node.js to a similar benchmark written in Spring Boot, showing Node.js to perform much better. To view his talk, see https://www.youtube.com/watch?v=Fbhhc4jtGW4.

To help us wrap our heads around why this would be, let's return to Ryan Dahl, the creator of Node.js, and the key inspiration leading him to create Node.js. In his Cinco de NodeJS presentation in May 2010, https://www.youtube.com/watch?v=M-sc73Y-zQA, Dahl asked us what happens while executing a line of code such as this:

result = query('SELECT * from db'); 
// operate on the result

Of course, the program pauses at that point while the database layer sends the query to the database, which determines the result and returns the data. Depending on the query, that pause can be quite long; well, a few milliseconds, which is an eon in computer time. This pause is bad because that execution thread can do nothing while waiting for the result to arrive. If your software is running on a single-threaded platform, the entire server would be blocked and unresponsive. If instead, your application is running on a thread-based server platform, a thread context switch is required to satisfy any other requests that arrive. The greater the number of outstanding connections to the server, the greater the number of thread context switches. Context switching is not free because more threads require more memory per thread state and more time for the CPU to spend on thread management overhead.

Simply by using asynchronous, event-driven I/O, Node.js removes most of this overhead while introducing very little of its own.

Using threads to implement concurrency often comes with admonitions such as these: "expensive and error-prone", "the error-prone synchronization primitives of Java", or "designing concurrent software can be complex and error prone". The complexity comes from the access to shared variables and various strategies to avoid deadlock and competition between threads. The synchronization primitives of Java are an example of such a strategy, and obviously many programmers find them difficult to use. There's the tendency to create frameworks such as java.util.concurrent to tame the complexity of threaded concurrency, but some might argue that papering over complexity does not make things simpler.

Node.js asks us to think differently about concurrency. Callbacks fired asynchronously from an event loop are a much simpler concurrency model—simpler to understand, simpler to implement, simpler to reason about, and simpler to debug and maintain.

Ryan Dahl points to the relative access time of objects to understand the need for asynchronous I/O. Objects in memory are more quickly accessed (on the order of nanoseconds) than objects on disk or objects retrieved over the network (milliseconds or seconds). The longer access time for external objects is measured in zillions of clock cycles, which can be an eternity when your customer is sitting at their web browser ready to move on if it takes longer than two seconds to load the page.

In Node.js, the query discussed previously will read as follows:

query('SELECT * from db', function (err, result) { 
    if (err) throw err; // handle errors 
    // operate on result 
});

The programmer supplies a function that is called (hence the name callback function) when the result (or error) is available. Instead of a thread context switch, this code returns almost immediately to the event loop. That event loop is free to handle other requests. The Node.js runtime keeps track of the stack context leading to this callback function, and eventually an event will fire causing this callback function to be called.

Advances in the JavaScript language are giving us new options to implement this idea. The equivalent code looks like so when used with ES2015 Promises:

query('SELECT * from db') 
.then(result => { 
    // operate on result 
}) 
.catch(err => { 
    // handle errors 
});

And it looks as follows with an ES2017 async function:

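// await is only valid inside an async function (or at the top level of an ES module)
async function runQuery() {
    try {
        const result = await query('SELECT * from db');
        // operate on result
    } catch (err) {
        // handle errors
    }
}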

All three of these code snippets perform the same query written earlier. The difference is that the query does not block the execution thread, because control passes back to the event loop. By returning almost immediately to the event loop, it is free to service other requests. Eventually, one of those events will be the response to the query shown previously, which will invoke the callback function.

With the callback or Promise approach, the result is not returned as the result of the function call, but is provided to a callback function that will be called later. The order of execution is not one line after another, as it is in synchronous programming languages. Instead, the order of execution is determined by the order of the callback function execution.
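The ordering described above can be observed directly. In this small sketch, query is a hypothetical stand-in that uses a timer in place of a real database, and the order array simply records which line ran when.

```javascript
const order = [];

// Hypothetical asynchronous query: a timer stands in for the database.
function query(sql, callback) {
  setTimeout(() => callback(null, [{ id: 1 }]), 10);
}

order.push('before query');
query('SELECT * from db', (err, result) => {
  // Runs later, from the event loop, after the surrounding code has finished.
  order.push('callback');
  console.log(order.join(' -> '));
});
order.push('after query');
```

The line following the query call runs before the callback does, even though the callback appears earlier in the source text.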

When using an async function, the coding style looks like the original synchronous code example. The result is returned as the result of the function call, and errors are handled in a natural manner using try/catch. The await keyword integrates asynchronous results handling without blocking the execution thread. A lot is buried under the covers of the async/await feature, and we'll be covering this model extensively throughout the book.

Commonly, web pages bring together data from dozens of sources. Each one has a query and response as discussed earlier. Using asynchronous queries, each query can happen in parallel, where the page construction function can fire off dozens of queries—no waiting, each with their own callback—and then go back to the event loop, invoking the callbacks as each is done. Because it's in parallel, the data can be collected much more quickly than if these queries were done synchronously one at a time. Now, the reader on the web browser is happier because the page loads more quickly.
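Here is a rough sketch of that difference, assuming each data source takes about 50 milliseconds to respond. The function names and the source list are hypothetical; timers stand in for real queries, and Promise.all is the standard way to wait for a group of Promises.

```javascript
// Hypothetical data source: resolves with its name after `ms` milliseconds.
function fetchSource(name, ms) {
  return new Promise(resolve => setTimeout(() => resolve(name), ms));
}

// Start every query first, then wait for all of them together.
async function loadInParallel(sources) {
  const pending = sources.map(([name, ms]) => fetchSource(name, ms));
  return Promise.all(pending);
}

// Wait for each query to finish before starting the next.
async function loadOneAtATime(sources) {
  const results = [];
  for (const [name, ms] of sources) {
    results.push(await fetchSource(name, ms));
  }
  return results;
}

async function compare() {
  const sources = [['users', 50], ['posts', 50], ['ads', 50]];
  let start = Date.now();
  await loadInParallel(sources);
  const parallelMs = Date.now() - start;
  start = Date.now();
  await loadOneAtATime(sources);
  const serialMs = Date.now() - start;
  console.log(`parallel: ~${parallelMs} ms, one at a time: ~${serialMs} ms`);
  return { parallelMs, serialMs };
}

compare();
```

Note that "parallel" here means the waiting overlaps. The callbacks themselves still run one at a time on the single execution thread.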

Performance and utilization

Some of the excitement over Node.js is due to its throughput (the requests per second it can serve). Comparative benchmarks against similar applications, such as Apache, show Node.js delivering tremendous performance gains.

One benchmark going around is this simple HTTP server (borrowed from https://nodejs.org/en/), which simply returns a Hello World message directly from memory:

var http = require('http'); 
http.createServer(function (req, res) { 
  res.writeHead(200, {'Content-Type': 'text/plain'}); 
  res.end('Hello World\n'); 
}).listen(8124, "127.0.0.1"); 
console.log('Server running at http://127.0.0.1:8124/');

This is one of the simpler web servers that you can build with Node.js. The http object encapsulates the HTTP protocol, and its http.createServer method creates a whole web server, listening on the port specified in the listen method. Every request (whether a GET or POST on any URL) on that web server calls the provided function. It is very simple and lightweight. In this case, regardless of the URL, it returns a simple text/plain that is the Hello World response.
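To make the one-function-per-request model a little more concrete, here is a small sketch in which the handler echoes back the method and URL it was called with. The handler name and the echoed response are illustrative assumptions, and the listen call is left commented out so the snippet can be loaded without opening a port.

```javascript
const http = require('http');

// Called once per request, whatever the method or URL.
function handler(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end(`${req.method} ${req.url}\n`);
}

const server = http.createServer(handler);
// server.listen(8124, '127.0.0.1'); // uncomment to accept connections
```

Because the handler is an ordinary function, it can also be exercised directly with stand-in request and response objects.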

Ryan Dahl showed a simple benchmark (https://www.youtube.com/watch?v=M-sc73Y-zQA) that returned a 1-megabyte binary buffer; Node.js gave 822 req/sec, while Nginx gave 708 req/sec, an improvement of roughly 16% over Nginx. He also noted that Nginx peaked at four megabytes of memory, while Node.js peaked at 64 megabytes.

The key observation was that Node.js, running a JIT-compiled, high-level language, was about as fast as Nginx, built of highly optimized C code, while running similar tasks. That presentation was in May 2010, and Node.js has improved hugely since then, as shown in Chris Bailey's talk that we referenced earlier.

Yahoo! search engineer Fabian Frank published a performance case study of a real-world search query suggestion widget implemented with Apache/PHP and two variants of Node.js stacks (http://www.slideshare.net/FabianFrankDe/nodejs-performance-case-study). The application is a pop-up panel showing search suggestions as the user types in phrases, using a JSON-based HTTP query. The Node.js version could handle eight times the number of requests per second with the same request latency. Fabian Frank said both Node.js stacks scaled linearly until CPU usage hit 100%. In another presentation (http://www.slideshare.net/FabianFrankDe/yahoo-scale-nodejs), he discussed how Yahoo! Axis is running on Manhattan + Mojito and the value of being able to use the same language (JavaScript) and framework (YUI/YQL) on both frontend and backend.

LinkedIn did a massive overhaul of their mobile app using Node.js for the server-side to replace an old Ruby on Rails app. The switch let them move from 30 servers down to three, and allowed them to merge the frontend and backend team because everything was written in JavaScript. Before choosing Node.js, they'd evaluated Rails with Event Machine, Python with Twisted, and Node.js, choosing Node.js for the reasons that we just discussed. For a look at what LinkedIn did, see http://arstechnica.com/information-technology/2012/10/a-behind-the-scenes-look-at-linkedins-mobile-engineering/.

Most existing Node.js performance advice was written for older V8 versions that used the Crankshaft optimizer. The V8 team has completely dumped Crankshaft and replaced it with a new optimizer called TurboFan. For example, under Crankshaft, it was slower to use try/catch, let/const, generator functions, and so on. Therefore, common wisdom said not to use those features, which is depressing because we want to use the new JavaScript features, given how much they have improved the language. Peter Marshall, an engineer on the V8 team at Google, gave a talk at Node.js Interactive 2017 claiming that, under TurboFan, you should just write natural JavaScript. With TurboFan, the goal is for across-the-board performance improvements in V8. To view the presentation, see https://www.youtube.com/watch?v=YqOhBezMx1o.

A truism about JavaScript is that it's no good for heavy computation work, because of the nature of JavaScript. We'll go over some ideas related to this in the next section. A talk by Mikola Lysenko at Node.js Interactive 2016 went over some issues with numerical computing in JavaScript, and some possible solutions. Common numerical computing involves large numerical arrays processed by numerical algorithms that you might have learned in calculus or linear algebra classes. What JavaScript lacks are multi-dimensional arrays and access to certain CPU instructions. The solution he presented is a library to implement multi-dimensional arrays in JavaScript, along with another library full of numerical computing algorithms. To view the presentation, see https://www.youtube.com/watch?v=1ORaKEzlnys.

The bottom line is that Node.js excels at event-driven I/O throughput. Whether a Node.js program can excel at computational programs depends on your ingenuity in working around some limitations in the JavaScript language. A big problem with computational programming is that it prevents the event loop from executing and, as we will see in the next section, that can make Node.js look like a poor candidate for anything.

Is Node.js a cancerous scalability disaster?

In October 2011, software developer and blogger Ted Dziuba wrote a blog post (since pulled from his blog) titled Node.js is a cancer, calling it a scalability disaster. The example he showed for proof is a CPU-bound implementation of the Fibonacci sequence algorithm. While his argument was flawed, he raised a valid point that Node.js application developers have to consider the following: where do you put the heavy computational tasks?

A key to maintaining high throughput of Node.js applications is ensuring that events are handled quickly. Because it uses a single execution thread, if that thread is bogged down with a big calculation, Node.js cannot handle events, and event throughput will suffer.

The Fibonacci sequence, serving as a stand-in for heavy computational tasks, quickly becomes computationally expensive to calculate, especially for a naïve implementation such as this:

const fibonacci = exports.fibonacci = function(n) { 
    if (n === 1 || n === 2) return 1; 
    else return fibonacci(n-1) + fibonacci(n-2); 
}

Yes, there are many ways to calculate Fibonacci numbers more quickly. We are showing this as a general example of what happens to Node.js when event handlers are slow, and not to debate the best ways to calculate mathematical functions. Consider this server:

const http = require('http');
const url  = require('url');

// fibonacci is the function shown above

http.createServer(function (req, res) {
  const urlP = url.parse(req.url, true);
  res.writeHead(200, {'Content-Type': 'text/plain'});
  // Query values arrive as strings; convert to a number so that the
  // strict comparisons (===) inside fibonacci behave as intended
  const n = Number.parseInt(urlP.query['n'], 10);
  if (n >= 1) {
    const fibo = fibonacci(n);
    res.end('Fibonacci '+ n +'='+ fibo);
  } else {
    res.end('USAGE: http://127.0.0.1:8124?n=## where ## is the Fibonacci number desired');
  }
}).listen(8124, '127.0.0.1');
console.log('Server running at http://127.0.0.1:8124');

For sufficiently large values of n (for example, 40), the server becomes completely unresponsive because the event loop is not running, and instead this function is blocking event processing because it is grinding through the calculation.
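A rough way to observe this stall without a web server is to schedule a timer and then run the same kind of calculation. In the sketch below, measureTimerDelay is a hypothetical helper written for this illustration; it asks for a callback in 10 milliseconds and reports how long the callback actually took to arrive. The exact figure depends on the machine.

```javascript
// Naive Fibonacci, used here only as a stand-in for heavy computation.
function fibonacci(n) {
  if (n === 1 || n === 2) return 1;
  return fibonacci(n - 1) + fibonacci(n - 2);
}

// Ask for a timer callback in 10 ms, then hog the thread.
// Resolves with the time that actually passed before the timer fired.
function measureTimerDelay(n) {
  return new Promise(resolve => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled), 10);
    fibonacci(n); // synchronous: the event loop cannot run until this returns
  });
}

measureTimerDelay(35).then(elapsed => {
  console.log(`timer requested after 10 ms, fired after ${elapsed} ms`);
});
```

The timer cannot fire until the calculation returns, which is the same reason the server above cannot accept new requests while it is computing.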

Does this mean that Node.js is a flawed platform? No, it just means that the programmer must take care to identify code with long-running computations and develop solutions. These include rewriting the algorithm to work with the event loop, or rewriting the algorithm for efficiency, or integrating a native code library, or foisting computationally expensive calculations on to a backend server.
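Of these options, rewriting the algorithm for efficiency is the easiest to sketch. The following iterative version is one possible rewrite, not the only one; the name fibonacciLoop is made up for this illustration. It takes a number of steps proportional to n rather than growing exponentially.

```javascript
// Iterative Fibonacci: one pass, keeping only the last two values.
function fibonacciLoop(n) {
  let previous = 0;
  let current = 1;
  for (let i = 1; i < n; i++) {
    [previous, current] = [current, previous + current];
  }
  return n === 0 ? 0 : current;
}

console.log(fibonacciLoop(40));
```

With this version, a request for n=40 finishes in a handful of additions, so the event loop is barely delayed. Results lose precision once they exceed Number.MAX_SAFE_INTEGER.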

A simple rewrite dispatches the computations through the event loop, letting the server continue to handle requests on the event loop. Using callbacks and closures (anonymous functions), we're able to maintain asynchronous I/O and concurrency promises:

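const fibonacciAsync = function(n, done) {
    // Every branch must report its result through done()
    if (n === 0) done(0);
    else if (n === 1 || n === 2) done(1);
    else if (n === 3) done(2);
    else {
        // setImmediate defers each step to a later turn of the event loop,
        // so pending I/O events get a chance to run in between
        // (process.nextTick callbacks run before the loop continues)
        setImmediate(function() {
            fibonacciAsync(n-1, function(val1) {
                setImmediate(function() {
                    fibonacciAsync(n-2, function(val2) {
                        done(val1+val2);
                    });
                });
            });
        });
    }
};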

Because this is an asynchronous function, it necessitates a small refactoring of the server:

const http = require('http');
const url  = require('url');

// fibonacciAsync is the function shown above

http.createServer(function (req, res) {
  const urlP = url.parse(req.url, true);
  res.writeHead(200, {'Content-Type': 'text/plain'});
  // Query values arrive as strings; convert to a number so that the
  // strict comparisons (===) inside fibonacciAsync behave as intended
  const n = Number.parseInt(urlP.query['n'], 10);
  if (n >= 1) {
    fibonacciAsync(n, fibo => {
        res.end('Fibonacci '+ n +'='+ fibo);
    });
  } else {
    res.end('USAGE: http://127.0.0.1:8124?n=## where ## is the Fibonacci number desired');
  }
}).listen(8124, '127.0.0.1');
console.log('Server running at http://127.0.0.1:8124');
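The same idea can also be written with Promises and an async function. This is a sketch under assumptions, not the approach taken in the code above: fibonacciPromise and nextTurn are hypothetical names, and setImmediate is used to hand control back to the event loop before each recursive step.

```javascript
// Resolves on a later turn of the event loop, after pending I/O events
// have had a chance to run.
const nextTurn = () => new Promise(resolve => setImmediate(resolve));

async function fibonacciPromise(n) {
  if (n === 0) return 0;
  if (n === 1 || n === 2) return 1;
  await nextTurn(); // yield to the event loop before each recursive step
  const val1 = await fibonacciPromise(n - 1);
  const val2 = await fibonacciPromise(n - 2);
  return val1 + val2;
}

fibonacciPromise(10).then(fibo => console.log('Fibonacci 10=' + fibo));
```

The calculation takes longer overall because of the extra scheduling, but other events are serviced between steps instead of waiting for the whole computation.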

Dziuba's valid point wasn't expressed well in his blog post, and it was somewhat lost in the flames following that post: while Node.js is a great platform for I/O-bound applications, it isn't a good platform for computationally intensive ones.

Later in this book, we'll explore this example a little more deeply.

Server utilization, the business bottom line, and green web hosting

The striving for optimal efficiency (handling more requests per second) is not just about the geeky satisfaction that comes from optimization. There are real business and environmental benefits. Handling more requests per second, as Node.js servers can do, means the difference between buying lots of servers and buying only a few servers. Node.js potentially lets your organization do more with less.

Roughly speaking, the more servers you buy, the greater the cost, and the greater the environmental impact of having those servers. There's a whole field of expertise around reducing costs and the environmental impact of running web server facilities, to which that rough guideline doesn't do justice. The goal is fairly obvious—fewer servers, lower costs, and a reduced environmental impact through utilizing more efficient software.

Intel's paper, Increasing Data Center Efficiency with Server Power Measurements (https://www.intel.com/content/dam/doc/white-paper/intel-it-data-center-efficiency-server-power-paper.pdf), gives an objective framework for understanding efficiency and data center costs. There are many factors, such as buildings, cooling systems, and computer system designs. Efficient building design, efficient cooling systems, and efficient computer systems (data center efficiency, data center density, and storage density) can lower costs and environmental impact. But you can destroy those gains by deploying an inefficient software stack compelling you to buy more servers than you would if you had an efficient software stack. Alternatively, you can amplify gains from data center efficiency with an efficient software stack that lets you decrease the number of servers required.

This talk about efficient software stacks isn't just for altruistic environmental purposes. This is one of those cases where being green can help your business bottom line.