After recently studying and playing with Node.js, I realized that I had somehow picked up a couple of misperceptions that were wildly off base:
- Node.js is solely a platform for web development similar to one of the other popular server-side web development platforms, perhaps PHP or Ruby on Rails.
To clarify misperception one, while Node.js provides powerful support for web development, Node.js can be used to build any manner of application. Node.js applications may access file systems, networks, databases, and child processes. Just as with Python and Ruby scripts, you can run Node.js scripts from a command line and can pipe streams back and forth to other applications in the Unix style.
My second misperception turned out to be doubly wrong.
Of course, this is an old problem, and all modern operating systems offer a built-in solution—multi-threading. A multi-threaded server makes a system call to generate a new thread for each request that it receives. That thread handles the request to completion while other threads handle other requests. The OS manages all of these threads, sharing out slices of CPU cycles to each as it is ready.
Obviously, multi-threading still works as it has for decades, but it does have its problems when it comes to massively scalable web applications. Managing threads is not free. Each has its own associated process, registry values, program counter, call stack. As the number of threads increases, the OS spends more and more CPU cycles managing rather than running the threads. And with web applications, threads spend the vast majority of their time waiting for IO, whether that be calls to files, databases, or network services.
Node.js takes a radically different approach, avoiding the need for OS threads by simply refusing to wait. Rather than making blocking IO calls, wherein the thread stalls waiting for the call to return, almost all IO calls in Node.js are asynchronous, wherein the thread continues without waiting for the call to return. In order to handle the returned data, code in Node.js passes callback functions to each asynchronous IO call. An event loop implemented within Node.js keeps track of these IO requests and calls the callback when the IO becomes available. Managing the event loop costs less than managing multiple threads, as it only requires tracking events and callbacks rather than entire call stacks.
Each Node.js process is single threaded, but it can be scaled to multiple processes and multiple machines just as traditional multi-threaded servers.
One might also argue that Node.js as a platform for server-side web development is innovative in its lack of abstraction; the Node.js programmer handles http requests by forming http responses (what the http protocol is all about), rather than creating pages (PHP, JSP, Asp.Net) or writing models, views, and controllers (Ruby on Rails). Personally, I have a preference for lighter frameworks or at least frameworks that do not force me into certain patterns so I certainly find Node.js appealing despite my initial misperceptions.