A cluster is a group of servers that can logically expose themselves as a highly available and capable super-server. And you need clusters because the success of your business is rooted in your ability to provide your customers the products and services they need when they need them.

So what do groups of servers have to do with customers? To answer that question, let’s dig into how clusters came to be.

Why we need clusters

One of the fundamental problems with traditional, single-point deployment systems — your average on-premises enterprise system — is the danger of a system-wide failure. You’ve deployed servers that are fully capable of running your application, but you get a sudden power outage, lose connectivity to a hardwired system, and your entire business stalls.

The customer that was in the middle of paying their bill online gets an error message and isn’t sure whether the payment went through or not. They press the refresh button over and over again, finally calling the customer service line to complain.

Clusters offer a reliable platform for applications that strive to be fault tolerant and highly available — aka there when your customer needs them.

Clusters as backup

One way to address this critical issue is to employ backup servers that act as an emergency substitute for the primary deployment — not quite as advanced as what we’re talking about today but a first step in handling the possibility of system failures. We’ll call them Clusters Lite. In this set-up, a business has, say, 3 servers, all with the same deployment. But only one of those servers is exposed to the customer. The other 2 servers operate essentially as backup in a different location. If the primary server goes down, the business can expose one of the other servers to the customer.
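As a rough illustration of the backup arrangement described above, the sketch below picks which server to expose to the customer. The `Server` class, the server names, and the `pick_exposed_server` function are hypothetical and exist only for this example; a real setup would rely on health checks and DNS or network switching rather than an in-memory flag.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical stand-in for one deployed server."""
    name: str
    healthy: bool = True


def pick_exposed_server(servers):
    """Return the first healthy server in priority order (primary first)."""
    for server in servers:
        if server.healthy:
            return server
    raise RuntimeError("no healthy server available")


# One primary and two backups, all with the same deployment.
servers = [Server("primary"), Server("backup-1"), Server("backup-2")]

exposed_before = pick_exposed_server(servers).name  # primary is up

servers[0].healthy = False  # simulate the primary going down
exposed_after = pick_exposed_server(servers).name   # first backup takes over
```

Only one server handles customers at any moment, which is why the two backups must be kept synchronized with the primary even though they sit idle.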

The system is now fault tolerant, but it is also very expensive. Businesses have to manage the cost of running 3 identical deployments plus the cost of keeping the 2 backups fully synchronized to whatever is happening on the primary server.

Clusters enabling load balancing

An option for limiting the need for ongoing supplemental synchronization is to have the 3 servers run in the cluster at the same time, keeping the data pool common. The cluster includes one interface service exposed to the customer that balances the load between 3 application servers, choosing whichever is most available at the time.
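One possible reading of "most available" is "handling the fewest requests right now." The sketch below takes that interpretation as an assumption; the `LeastLoadedBalancer` class and the `app-1` to `app-3` names are made up for illustration and do not correspond to any particular product.

```python
class LeastLoadedBalancer:
    """Hypothetical interface service that routes to the least busy server."""

    def __init__(self, server_names):
        # Number of in-flight requests per application server.
        self.active = {name: 0 for name in server_names}

    def acquire(self):
        """Pick the server with the fewest in-flight requests and count the new one."""
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        """Mark one request on the given server as finished."""
        self.active[name] -= 1


balancer = LeastLoadedBalancer(["app-1", "app-2", "app-3"])

# Three requests arrive; each lands on a different idle server.
first_three = [balancer.acquire() for _ in range(3)]

# The request on app-2 finishes, so app-2 is the least busy again.
balancer.release("app-2")
next_pick = balancer.acquire()
```

Because all three servers share a common data pool, any of them can serve any request, which is what makes this kind of routing possible.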

So now businesses have a system that is both fault tolerant and highly available, but running clusters in an on-premises system is still incredibly expensive because each of those servers is a separate piece of hardware, generally in different locations. And, of course, in the real world, a system often requires significantly more than 3 servers.

Clusters in the cloud

We’ve established that using clusters can make an application fault-tolerant, highly available, and scalable, but we need to address the big downside of clusters — infrastructure cost.

Enter the cloud, where virtualization means a business isn’t paying for each server as a separate piece of hardware.

Combining the power of clusters with the inherent flexibility of the cloud provides fault tolerance, high availability, and scalability without the enormous expense associated with running clusters in an on-premises system.

Every cloud platform can provide load balancing, so designing a cluster in the cloud involves creating the rules that a load balancer will follow when interfacing with customers. For instance, a load balancer (what we referred to as the customer interface server in an on-premises system) might have access to 10 servers (or “instances,” in the cloud). Each provides an identical service, so the load balancer’s task is to receive customer requests and balance those across the servers in the cluster.

The scaling parameters a developer sets tell the load balancer what its highs and lows should be. For instance, the parameters could tell the load balancer that it should run a minimum of 5 servers at all times, but that if CPU utilization reaches 70%, the load balancer should add another server. If utilization drops below 50%, it should drop a server. It should never exceed 10 servers but should send a notice if CPU utilization reaches 70% with 10 servers in use.
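The example rules above can be written as a small decision function. This is only a sketch: the constant names and the `scaling_decision` function are hypothetical, and each cloud platform expresses such rules in its own configuration format rather than in application code.

```python
MIN_SERVERS = 5        # never run fewer than this
MAX_SERVERS = 10       # never run more than this
SCALE_UP_CPU = 70.0    # add a server at or above this CPU utilization (%)
SCALE_DOWN_CPU = 50.0  # drop a server below this CPU utilization (%)


def scaling_decision(current_servers, cpu_percent):
    """Return (new_server_count, notify) for the rules in the text."""
    if cpu_percent >= SCALE_UP_CPU:
        if current_servers < MAX_SERVERS:
            return current_servers + 1, False
        # Already at the ceiling: keep the count and send a notice instead.
        return current_servers, True
    if cpu_percent < SCALE_DOWN_CPU and current_servers > MIN_SERVERS:
        return current_servers - 1, False
    return current_servers, False


busy = scaling_decision(5, 75.0)        # add a server
saturated = scaling_decision(10, 70.0)  # at the maximum, notify
quiet = scaling_decision(7, 40.0)       # drop a server
floor = scaling_decision(5, 40.0)       # already at the minimum
```

The gap between the 50% and 70% thresholds keeps the cluster from adding and dropping servers repeatedly when utilization hovers around a single value.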

So back to that initial question — what does all of this have to do with customers?

Why clusters matter for your business

Imagine a dad. For months, he’s been keeping his eye out for sales on a motorized ride-on convertible for his four-year-old daughter. He’s already imagining what her face is going to look like when she sees it on Christmas morning even though he can’t quite figure out how he’s going to pay for it.

Then he sees a Black Friday ad for his local toy store. From 5:00 am – 6:00 am on the morning after Thanksgiving, he can get the perfect convertible for half the price. He’s ecstatic. He sets his alarm and arrives at the toy store at 5:45 am, but the doors are locked. He sees people inside. In fact, someone is holding the exact car he wants, but no matter how many times he pulls on the doors, he can’t get them open.

Finally, a store employee tells him through the glass that the store has reached capacity, and they can’t let any more people in until at least 25 customers leave. By the time the dad gets inside 20 minutes later, all the cars are gone and the deal is over. He vows to never again give that store his business.

The “doors” for an online retailer aren’t really doors, but when a retailer doesn’t have enough availability, customers are locked out all the same. Imagine if a quarter of the time you shopped at Amazon, their server crashed and you weren’t able to complete your purchase. You’d quit shopping there.

Unpredictability of traffic is one of the major issues that any production system faces. If the system isn’t running with clusters, the business had better anticipate every traffic increase and decrease accurately. Otherwise, it will either be unavailable to its customers during a high-demand period or pay for far more server capacity than it needs during a low-demand period.

With a cluster that can scale itself in the cloud, if a thousand new users come on, no problem. Your system will add a couple more nodes, and your customers never know the difference. If one server goes down, the system automatically mends itself by replacing that server.
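The self-mending behavior can be pictured as a loop that compares what is running with what should be running. The sketch below is a simplified assumption of how that might look; `reconcile`, the node names, and the ID generator are hypothetical, and real platforms handle this through their own managed services.

```python
import itertools


def reconcile(instances, desired_count, new_ids):
    """Drop failed instances and launch replacements until the desired count is met.

    instances maps an instance ID to True (healthy) or False (failed).
    """
    healthy = {iid: True for iid, ok in instances.items() if ok}
    while len(healthy) < desired_count:
        # "Launching" here is just registering a fresh ID as healthy.
        healthy[next(new_ids)] = True
    return healthy


# Hypothetical ID source for newly launched nodes: node-4, node-5, ...
new_ids = (f"node-{n}" for n in itertools.count(4))

cluster = {"node-1": True, "node-2": False, "node-3": True}  # node-2 has gone down
healed = reconcile(cluster, desired_count=3, new_ids=new_ids)
```

After the call, the failed node is gone and a new one has taken its place, so the cluster is back at its desired size without anyone stepping in.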

Customers don’t care about fault tolerance, availability, or scalability. But they do care about successfully completing the task they came to your site to complete. Clusters mean they won’t be disappointed.