Is Node.js a superhero? - codecentric AG Blog


In superhero comics there is a common theme: a hero/ine with superpowers and a weakness, for example kryptonite. The weakness is arguably more important than the power, because it makes the hero more human, more accessible to us. In our reality, strength and power almost always come at a cost. Sometimes, when we read about Node.js, we might get the impression of a superhero technology. But all its glorified powers come at a cost, too. We’re in real-life software development, right? As in comics, the hero/ine and his/her helpers should know about the weaknesses, so they can compensate and work around them. So it is in software development. Let’s discuss some powers of Node.js and what we have to pay for them.

To start, the official claim from the Node.js website:

“Node.js is a platform built on Chrome’s JavaScript runtime for easily building fast, scalable network applications. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, perfect for data-intensive real-time applications that run across distributed devices.”

It’s marketing, but this is what Node.js is about: handling great amounts of networking and data without consuming many resources or suffering from high latency. I want to split this discussion into three areas: Node.js as a technology, as a platform, and its ecosystem and community.

Node.js as a technology

Node.js was born as an attempt to create web servers quickly and easily, and the server is still its main and most important target. Even though we can create CLI tools, desktop applications or even control robots with Node.js, pretty much everything in its development is done to make it do the server thing as well as it can. And Node.js has become very good at it. A typical server in Node.js has a low memory footprint, starts blazingly fast and handles tons of connections simultaneously in a single process, while spinning the event loop again and again.

So what is the cost? First of all, it is a single process. Always. It uses one single CPU core at a time. And, due to V8, the JavaScript engine, the memory is capped at a maximum of about 1.7 GB. So we can’t scale vertically as we could with the JVM; instead we need to spin up more processes and use horizontal scaling to use a machine to its full capacity. Fortunately, there is the cluster module in Node.js core that helps us with that. It makes the processes share some resources, like sockets, and establishes an IPC channel between the master and the workers in the cluster. So the vertical scalability isn’t that great, but it is sufficient and can be worked around with horizontal scaling.

Another strength of the Node.js technology is the fact that one can create bindings to C or C++ libraries and provide access to them in JavaScript as a module. That way we can harness the power of these tools and add functionality implemented elsewhere. We can also offload some logic into C-land to do hard computations or utilize threads if we must. The downside of this is the same as with all C or C++ projects: we have to compile for a specific platform and/or bind to libraries which are probably not present on every platform supported by Node.js. But it turns out that servers run in strongly specified and controlled environments, so platform independence is not that important.

Node.js as a platform

I think the more interesting question for most of us is this: what does Node.js give us as an execution environment? Let’s look at the major points.

First of all, Node.js brings a dedicated module system implementation that roughly follows the CommonJS standard. This removes one of the major flaws of JavaScript itself, which lacks higher module abstractions. Node.js modules are basically bound to files, so one file equals one module. The module loader also works in a way that every module can have its own dependencies exclusively, without conflicting with the dependencies of other modules. Two modules can even depend on the same module in two different versions without any problems. And every module is a singleton by default.

This module system is very powerful. One cost is a bigger footprint on the hard drive, because every referenced version of a dependency, even the transitive ones, is installed in the project folder. However npm, the package manager, can help here a bit with the dedupe command. The other cost is caused by the module loader’s caching strategy. It caches modules by the absolute path of the containing file, so two different versions are imported independently. Usually this is a good thing, except if the module encapsulates a resource that should be a singleton for the whole system. To enforce that we have to go an extra mile and design for dependency injection. Personally, however, I haven’t seen this problem in practice yet.

Another strength of Node.js is its abstraction of concurrency. In Node.js every piece of code is practically concurrent. But we don’t have to think about this fact; most of the time it doesn’t matter, because every piece of application code runs exclusively and uninterrupted by the other pieces. The concurrency is handled by the event loop in a single thread. That doesn’t mean that Node.js is single-threaded. Node.js uses threads under the covers, but our code doesn’t know about that. However, it turns out that many developers consider the asynchronous, event-driven programming model a cost they need to pay. It is certainly a paradigm change, because dealing with continuation passing isn’t that common among our OO-influenced knowledge, even though it has been there forever in the form of the Observer pattern. But still, a function, or a closure, as an observer seems not that obvious. So different ways have been developed to work around this obstacle. The most prominent solutions are the async library, several promise implementations, the streamline.js preprocessor, and the stream module, probably the most underestimated part of the Node.js API.

The Node.js API is tiny and reduced, and as such is considered a strength by many Node.js developers. It should help us or stay out of the way. And mostly it does. To keep it like this, many feature requests have been rejected by the development team, who propose publishing these features as third-party modules instead. The general philosophy is to do as little as possible in the Node.js core and as much as possible in user land. It turns out that this has been an extremely successful approach. Node.js does a good job of empowering the community to provide the missing functionality, and there is obviously a great deal of diversity in the third-party space. This is good, but also a liability. In the Java world, the JSRs and API specs are a normative factor which defines how things should play together. There is no such ordering power in Node.js. The community votes with its feet and will, maybe, converge on a few competitors and practices eventually.

Last point on the platform, and this one might be controversial: JavaScript is a strength of this platform. It is one of the most used, loved and hated languages today. Despite its flaws it is a small and powerful language. It makes it easier to bring front-end developers to the server; in fact, many productive and respected community members are either full-stack web developers or pure front-end specialists. Also, many libraries developed for browsers first can be used in Node.js. JavaScript itself allows many concepts to be implemented very elegantly and expressively. Nevertheless there are some pitfalls, static typing might help in big teams, and some of us just can’t think about OO without classes. In these cases there are tools such as TypeScript, CoffeeScript, Google’s Closure Tools and JSHint.

Node.js ecosystem and community

The community of Node.js is probably its greatest strength. There are other technologies for asynchronous I/O and other ways to use JavaScript server-side. But none of them has such a big, vibrant and dedicated community. This is because the Node.js API and ecosystem are so empowering.

Node.js comes bundled with npm. It is this tool that created the infrastructure for the ecosystem as we know it today. With npm, everybody can publish a module in no time. The npm client comes with a lot of powerful features and can even replace a build tool for smaller projects. This package manager is just nice and simple. The other part of npm is the module repository, originally built as a thin wrapper around a CouchDB cluster. npm was and is a long-time community servant, but unfortunately enterprise needs were not that important at that time. The repository also failed to scale with the rest of the ecosystem. This will change. npm is now backed by the newly founded npm, Inc., which is working hard on rebuilding npm to make it stable and scalable. They also plan to offer features and services specifically for enterprises, such as private modules or managed copies (i.e. proxies).

To this day there is no explicit name-spacing in npm or the Node.js module system; there was no technical need for it. Until now a simulated name-space prefix, as in “connect-json” or “grunt-cli”, was sufficient for most people. Now, with upcoming public mirrors, private registries and private/public modules in the same repository, this topic has been revived. Since npm is the standard for dependency management, this is where name-spaces will come in. We’ll need to be able to specify a certain repository to find the right module and to distinguish it from a module in another repository. So the definition of a dependency in the package descriptor file has to be, and will be, redesigned.

With the support of npm, many developers have published ever more modules. Over 78k at this point. If one needs some functionality, there are probably a few modules to be found on npm. But first we have to find them. The cost of the magnitude and diversity of modules is poor discoverability. Even though npm, Inc. is working on that, and there are different ways to search for modules, it is up to the community to help with discoverability. The maintainer of a module can do different things to improve it, starting with well-selected keywords and a good “readme”, up to building a small sub-community that helps spread the knowledge about the module.
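The metadata side of this lives in the package descriptor. A minimal sketch of the relevant fields, with a made-up module name:

```json
{
  "name": "csv-line-parser",
  "version": "0.1.0",
  "description": "Streaming parser for single CSV lines",
  "keywords": ["csv", "parser", "stream", "parsing"],
  "license": "MIT"
}
```

Fields like "description" and "keywords" are what npm’s search matches against, so choosing them carefully is the cheapest discoverability improvement a maintainer can make.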

Another cost of the rapid growth and high diversity of the module landscape, as well as of the community itself, is that there are no real quality standards or selection mechanisms to help us make fast decisions. There are merely a few best practices, and common sense as our last resort. That may be enough for side projects and startups, but not for enterprises. The community needs to establish some standards for the quality of npm-hosted modules and at least make modules comparable in matters of license, presence of tests, documentation and release notes.

Conclusion

Node.js has experienced a lot of traction and acceptance among hobbyists, startups and a few big enterprises, as well as a lot of evangelism. Although it deserves all this love, we should not forget that Node.js isn’t the superhero it may seem to be. It has its costs, be it in the programming model or in the young community. But if we are aware of these costs, we are able to consider this technology in an educated way and decide objectively when and what to use it for, as another useful tool in our tool belt. Because at the end of the day, Node.js is very good in its domain.