System and software scalability
This is the third post in a series on software quality attributes. Unless scalability is your only concern, I recommend starting with the first post of the series.
The scalability quality attribute is something we are mainly concerned with for server-based systems. Scalability is the ability to handle more traffic, more concurrent users and more transactions per unit of time, and is therefore closely related to performance. It matters for your business because if things go the way you hope and your business attracts a lot of customers producing a huge number of business transactions, you want your system to be able to handle that increased load. Ideally you would handle the new load without a big effort. If you instead need to redesign the system to cope with the increased workload, you lose valuable time (business you would otherwise have had the opportunity to do), and the redesign work itself costs money.
When it comes to scaling a system to handle increased load, there are basically two ways of doing so: scaling up and scaling out.
Installing the system on a node (e.g. a server) with more power is referred to as scaling up. That could mean a faster processor with more parallelism, more RAM or a bigger hard drive. One common way of scaling up is simply upgrading the hardware of the machine the system currently runs on, for instance by adding extra RAM.
When scaling out you instead add nodes (e.g. servers) so that the system runs on several machines. Multiple machines, each with its own processing power, can easily outperform a single machine. In addition, you get failover as soon as you let multiple machines share the same workload rather than splitting distinct tasks between them: if one machine enters a failed state and cannot perform any tasks anymore, the other machines can continue to work, meaning that scaling out is also a way to improve the reliability of the system. Last but not least, scaling out is often cheaper than scaling up: one big machine is generally more expensive than a number of smaller machines that together provide equivalent or even higher processing power.
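The failover behavior described above can be sketched in a few lines. This is a minimal illustration, not a real load balancer: the `Node` and `LoadBalancer` names and the health flag are assumptions made up for the example, standing in for whatever health-checking your real infrastructure provides.

```python
from itertools import cycle

class Node:
    """A stand-in for one machine in the pool; names are illustrative."""
    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        return f"{self.name} handled {request}"

class LoadBalancer:
    """Round-robin dispatch that skips machines in a failed state."""
    def __init__(self, nodes):
        self.nodes = nodes
        self._ring = cycle(nodes)

    def dispatch(self, request):
        # Try each node at most once before giving up.
        for _ in range(len(self.nodes)):
            node = next(self._ring)
            if node.healthy:
                return node.handle(request)
        raise RuntimeError("no healthy nodes available")

nodes = [Node("node-a"), Node("node-b"), Node("node-c")]
lb = LoadBalancer(nodes)
nodes[1].healthy = False  # simulate one machine failing
results = [lb.dispatch(f"req-{i}") for i in range(4)]
print(results)
```

Because the machines share the workload rather than owning distinct tasks, the failure of node-b simply removes it from the rotation; the remaining machines keep serving requests.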
Scaling in the cloud
Cloud computing platforms such as those from Amazon, Google and Microsoft are designed with scaling out in mind. One reason for this deliberate design choice is that very large systems need huge amounts of processing power; some systems need so much that no single machine can satisfy their needs, making scaling out the only choice we have. These cloud platforms all make it easy to add more machines to support your system. Since you, as a customer of a cloud provider, do not own the hardware and other infrastructure, this creates another interesting opportunity: scaling in. If your business (permanently or temporarily) experiences a decreasing number of transactions, you no longer need all that processing power, which is another way of saying that you pay too much. To save money you can simply scale in, that is, remove some of the machines and run your system on fewer of them so that you pay less to the cloud provider.
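The scale-out/scale-in arithmetic behind this is simple. The sketch below assumes you have measured how much load one machine can handle; the capacity figure and request rates are purely illustrative numbers, not recommendations.

```python
import math

def desired_node_count(requests_per_second, capacity_per_node, min_nodes=1):
    """Nodes needed for a given load, never dropping below a floor.

    capacity_per_node is an assumed, measured per-machine throughput.
    """
    return max(min_nodes, math.ceil(requests_per_second / capacity_per_node))

# Traffic grows: scale out.
out = desired_node_count(950, capacity_per_node=200)   # 5 nodes
# Traffic drops: scale in and pay the cloud provider for less.
back_in = desired_node_count(150, capacity_per_node=200)   # 1 node
print(out, back_in)
```

Real autoscalers add hysteresis and warm-up delays on top of this so the fleet does not flap between sizes, but the core decision is this division.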
Software design can help or hinder scalability
Although scaling out may seem like the preferable option, it requires that you put more thought into your software design. Building your solution on top of a cloud computing platform does not by itself guarantee that your system will be scalable. Below I list some of the more important considerations for creating scalable software:
- State Handling state is important in any solution, but when designing for scalability it becomes paramount. Ideally your solution would have no state handling at all, apart from the permanent state you save in your data storages. In practice, though, many systems need some kind of state handling, e.g. for interacting with legacy systems. If you need state in your system, do not store it in the memory of one server: if you do, the requester must always send subsequent requests to the very same server that received the initial request and therefore holds the state in its memory. It is possible to handle this with so-called sticky sessions, but this mechanism is not always available, and if you are designing for the cloud you should avoid relying on it, since sticky sessions tie requests to specific machines and work against elastic scaling. Better alternatives are storing session state inside the messages themselves, a centralized state store (beware of the single point of failure) or a distributed state store where multiple easily accessible nodes share the burden of holding state data.
- Caching Caching the right data is vital for the performance and scalability of any highly utilized system. Having the most important bits of information readily available saves a mountain of resources and makes transactions flow smoothly. As with state data, centralized or distributed caching solutions are to be preferred.
- Data partitioning In data-intensive systems you may have to work with a lot of data to complete one task. To do this more efficiently you might want to look into partitioning your data. The design you are looking for is one that lets you work with different pieces of your data in parallel. For instance, if you often process invoice and customer data at the same time, you might put these kinds of data on different machines so that both can retrieve their data in parallel. Even with all the data on one machine you can benefit from putting different partitions on different disks, so that the disks can work in parallel on finding the data you seek.
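The state-handling point above, keeping session state out of server memory, can be sketched as follows. The dict-backed `SharedStateStore` is a stand-in for a real centralized or distributed store such as Redis or a database; the class and method names are invented for this example, not a real API.

```python
class SharedStateStore:
    """Stand-in for a centralized/distributed session state store."""
    def __init__(self):
        self._data = {}

    def load(self, session_id):
        return self._data.get(session_id, {})

    def save(self, session_id, state):
        self._data[session_id] = state

store = SharedStateStore()

def handle_request(server_name, session_id, item):
    """Stateless handler: any server can serve any request,
    because the session state lives in the shared store."""
    state = store.load(session_id)
    cart = state.setdefault("cart", [])
    cart.append(item)
    store.save(session_id, state)
    return server_name, cart

# Two different servers handle consecutive requests for one session;
# no sticky session is needed to keep the cart intact.
handle_request("server-1", "sess-42", "book")
_, cart = handle_request("server-2", "sess-42", "pen")
print(cart)
```

Because the handler holds nothing in process memory between requests, you can add or remove servers freely and route each request to whichever machine is least loaded.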
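The caching point above is usually implemented as the cache-aside pattern: check the cache first, fall back to the expensive backing store on a miss, then populate the cache. This is a simplified in-process sketch with an assumed TTL mechanism, standing in for a centralized or distributed cache product.

```python
import time

class Cache:
    """Minimal TTL cache; a stand-in for a shared caching service."""
    def __init__(self, ttl_seconds=60):
        self._entries = {}   # key -> (value, expires_at)
        self.ttl = ttl_seconds

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._entries[key]   # expired: treat as a miss
            return None
        return value

    def put(self, key, value):
        self._entries[key] = (value, time.monotonic() + self.ttl)

def fetch_product(product_id, cache, load_from_db, stats):
    """Cache-aside read: only hit the database on a cache miss."""
    cached = cache.get(product_id)
    if cached is not None:
        stats["hits"] += 1
        return cached
    stats["misses"] += 1
    value = load_from_db(product_id)   # the expensive call we avoid
    cache.put(product_id, value)
    return value

stats = {"hits": 0, "misses": 0}
cache = Cache()
db = lambda pid: {"id": pid, "name": f"product-{pid}"}  # fake database
fetch_product(7, cache, db, stats)  # miss: goes to the database
fetch_product(7, cache, db, stats)  # hit: served from the cache
print(stats)
```

Every hit is a database round trip saved, which is exactly the "mountain of resources" the bullet refers to.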
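The data-partitioning idea can be sketched with hash partitioning: rows are spread over N partitions (think separate machines or disks) by a key, and a query fans out to all partitions in parallel before merging the results. The partition count and the invoice data are illustrative assumptions, and threads here stand in for machines working concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_PARTITIONS = 4
# Each dict stands in for the storage on one machine or disk.
partitions = [dict() for _ in range(NUM_PARTITIONS)]

def partition_for(key):
    """Route a key to a partition by hashing it."""
    return hash(key) % NUM_PARTITIONS

def insert(key, row):
    partitions[partition_for(key)][key] = row

def scan_partition(part, predicate):
    """The work each partition does independently of the others."""
    return [row for row in part.values() if predicate(row)]

for i in range(100):
    insert(f"invoice-{i}", {"id": i, "amount": i * 10})

# Fan the scan out over all partitions at once, then merge.
wanted = lambda r: r["amount"] >= 900
with ThreadPoolExecutor(max_workers=NUM_PARTITIONS) as pool:
    chunks = pool.map(scan_partition, partitions,
                      [wanted] * NUM_PARTITIONS)
    results = [row for chunk in chunks for row in chunk]
print(len(results))
```

On a single machine the threads buy little for a pure in-memory scan, but the shape is the point: when each partition sits on its own disk or machine, the same fan-out lets all of them search simultaneously.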