Web Performance Optimization: The Infrastructure - codecentric AG Blog
In my previous blog post I described three key areas of web performance optimization (WPO), one of them being infrastructure, which covers all topics around server setup. In this post I describe it in detail.
About Content Delivery Networks
Wouldn’t it be great to have somebody hosting your content close to your customers? That is what Content Delivery Networks (CDNs) are for. So far mostly large companies with worldwide customers have used them, but they can be just as useful locally. CDN providers typically have very fast network connections and can reduce your IT spending. But you can also build something similar yourself. Hosting images on a separate, cookie-free subdomain reduces the data transferred, because fewer headers and no cookies are sent with the request for each image. Those subdomains can also point to specialized servers. As a simple example, an httpd could serve images from a RAM disk instead of having a Tomcat server read them out of an archive file. And you can use public CDNs such as the one Google offers for popular JavaScript libraries.
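To get a feeling for the header savings, here is a rough back-of-the-envelope sketch in Python. The header values, the cookie size, and the number of images per page are all made-up assumptions; measure your own site before drawing conclusions.

```python
from typing import Dict


def request_header_bytes(headers: Dict[str, str]) -> int:
    """Bytes used by the header lines on the wire, each sent as 'Name: value\\r\\n'."""
    return sum(len(f"{name}: {value}\r\n".encode("utf-8")) for name, value in headers.items())


# Hypothetical headers for a single image request (illustrative values only).
base_headers = {
    "Host": "static.example.com",
    "Accept": "image/*",
    "User-Agent": "ExampleBrowser/1.0",
}
# Assumed cookie of roughly 300 bytes, as a main site with sessions might set.
cookie = {"Cookie": "session=" + "a" * 200 + "; prefs=" + "b" * 100}

with_cookie = request_header_bytes({**base_headers, **cookie})
without_cookie = request_header_bytes(base_headers)
images_per_page = 40  # assumption: number of images requested per page view

saved = (with_cookie - without_cookie) * images_per_page
print(f"upstream bytes saved per page view: {saved}")
```

The absolute number is small, but it is sent upstream, where many connections are slowest, and it is repeated for every single image.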
Distributed Memory Caches are fast
Content hosting is one part of infrastructure. The other part is running your application with your business logic. Here you cannot prepare ready-made responses, but you have to make sure all requests can be answered, even when thousands of users are hitting your site. Once you outgrow a single server, things get complicated, but this can be countered with simple designs. A common problem when scaling Java applications is that session data is held per node, so every request of a user has to be routed to the same server and users cannot easily be shifted to different servers. This is called a “sticky session”. An attempt to fix this was session replication, which copies session data to other nodes so that they can take over the user. I strongly advise against this: it causes too much trouble and effort for a minimal advantage. It is much better to have stateless servers, which allow ramping computing power up and down with ease. The big question is: where should the state go? After all, we do need state.
Looking back, state was put into the session because the central data store, the “database”, was just too slow and did not scale easily either. So I am not recommending that you put session state into a traditional database. I am proposing to remove those from your architecture as well. The state-of-the-art solution to this dilemma is a so-called NoSQL database; many of these work in a distributed way and store data as key-value pairs. It is not rocket science, but simple. And current developments suggest that for this kind of workload the simple solution works out much better than a traditional RDBMS. Well-known NoSQL databases written in Java are HBase, the database of the Hadoop project, and Cassandra.
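How a key-value store can spread its data over several nodes may be sketched with consistent hashing. This is a simplified illustration, not the actual implementation of any particular product; the class `HashRing`, the node names, and the number of virtual points per node are assumptions made for the example.

```python
import bisect
import hashlib
from typing import List


class HashRing:
    """Minimal consistent-hash ring: each node owns many points on a circle."""

    def __init__(self, nodes: List[str], points_per_node: int = 100) -> None:
        ring = []
        for node in nodes:
            for i in range(points_per_node):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._hashes = [h for h, _ in ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        # The first point clockwise from the key's hash owns the key;
        # the modulo wraps around past the end of the circle.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring3 = HashRing(["node-a", "node-b", "node-c"])
ring4 = HashRing(["node-a", "node-b", "node-c", "node-d"])

keys = [f"user:{i}" for i in range(1000)]
moved = [k for k in keys if ring3.node_for(k) != ring4.node_for(k)]
print(f"{len(moved)} of {len(keys)} keys move when node-d joins")
```

The point of the design is that adding a node relocates only the keys the new node takes over, while all other keys stay where they are. That is what makes growing such a cluster comparatively cheap.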
Session information should be kept in memory in a distributed memory cache such as memcached. A great compilation of solution ideas can be found on nosql-database.org.
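The idea of stateless servers sharing one external session store can be sketched as follows. To keep the example runnable, `SessionStore` is a plain dictionary standing in for a real distributed cache; the class names, the `add_to_cart` method, and the JSON serialization are assumptions chosen for illustration.

```python
import json
import uuid
from typing import Dict, List


class SessionStore:
    """Stand-in for a shared cache such as memcached (here: just a dict)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, session_id: str) -> dict:
        raw = self._data.get(session_id)
        return json.loads(raw) if raw is not None else {}

    def set(self, session_id: str, session: dict) -> None:
        # Serialize, as a real cache would only hold bytes or strings.
        self._data[session_id] = json.dumps(session)


class AppServer:
    """Keeps no per-user state itself; everything lives in the shared store."""

    def __init__(self, name: str, store: SessionStore) -> None:
        self.name = name
        self._store = store

    def add_to_cart(self, session_id: str, item: str) -> List[str]:
        session = self._store.get(session_id)
        session.setdefault("cart", []).append(item)
        self._store.set(session_id, session)
        return session["cart"]


store = SessionStore()
server_a = AppServer("a", store)
server_b = AppServer("b", store)

sid = uuid.uuid4().hex
server_a.add_to_cart(sid, "book")       # first request lands on server a
cart = server_b.add_to_cart(sid, "pen")  # next request lands on server b
print(cart)  # ['book', 'pen']
```

Because neither server holds the cart, a load balancer is free to send each request to any node, and nodes can be added or removed without losing sessions.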
The reason for making your application stateless is that it allows easy scaling. When load goes up, the existing infrastructure usually reaches limits beyond which it still scales, but no longer linearly. In those situations it is advisable to start additional servers, ideally decoupled ones, so that you can separately start API servers, presentation servers, or logic servers. Dynamically starting servers, also close to your customers, is the real added value of “the cloud”. This is not possible with the complicated replication and failover solutions mostly deployed on companies’ internal systems.
Of course fast networking equipment and sensible physical distribution of servers also make sense, but they offer only little tuning potential. However, WPO leaders like Google have started to create new networking protocols like SPDY, build custom network adapters, or bend the rules set by RFCs to deliver a fast experience. One example is the so-called slow-start feature of TCP. Like many other protocols specified in RFCs, TCP was defined in the early days of networking and is still in use. At that time connectivity was poor, so the idea was to protect the network and both endpoints from being overloaded: a sender starts with only a small amount of data and increases it as the receiver acknowledges what it got. The amount of data that can be sent at the start is called the initial window and is described in RFC 3390. Sending more data up front saves round trips, which helps to get page loads below 100 ms. A good introduction to this discussion can be found in Ben Strong’s blog post about cheating on slow start.
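The effect of the initial window can be estimated with a simplified model. The sketch below uses the upper bound from RFC 3390, min(4*MSS, max(2*MSS, 4380 bytes)), and assumes that the window doubles every round trip, with no packet loss, no delayed ACKs, and no receiver limit. The function names, the response size, and the segment size of 1460 bytes are assumptions for illustration.

```python
import math
from typing import Optional


def initial_window_bytes(mss: int) -> int:
    """Upper bound for the initial window as given in RFC 3390."""
    return min(4 * mss, max(2 * mss, 4380))


def round_trips(response_bytes: int, mss: int = 1460,
                initial_segments: Optional[int] = None) -> int:
    """Round trips needed to deliver a response under idealized slow start."""
    if initial_segments is None:
        cwnd = initial_window_bytes(mss) // mss  # window in whole segments
    else:
        cwnd = initial_segments
    remaining = math.ceil(response_bytes / mss)  # segments still to send
    trips = 0
    while remaining > 0:
        remaining -= cwnd  # send one full window per round trip
        cwnd *= 2          # simplified: window doubles each round trip
        trips += 1
    return trips


page = 20_000  # assumed response size in bytes
print("RFC 3390 initial window:", round_trips(page), "round trips")
print("10 segments up front:   ", round_trips(page, initial_segments=10), "round trips")
```

In this model every saved round trip is worth one full network latency between client and server, which is why a larger initial window pays off most for small pages on high-latency connections.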
Those are just a few ideas of how operations could help improve the performance of web pages. While some aspects are limited by the application architecture, others could be offered by hosting providers as a premium service to their customers. While infrastructure is not our key competence at codecentric, we can help design applications that get the full benefit from infrastructure optimizations, and we can speak the language of operations teams to get performance improved on all sides. One of those sides is software architecture, which I will discuss in the next installment of this blog.
My WPO series: