Google Cloud Platform has a great feature that few people use, partly because it is little known, but mainly because it is hard to design a system architecture that can take advantage of it. This feature is preemptible instances. How does it work? Simple: you get a virtual machine like any other, except that it will be shut down unexpectedly within 24 hours and will occasionally be unavailable for short periods. The advantage: preemptible instances cost less than 50% of the price of a regular machine.
Usually, people use this kind of machine for servers that run workers or asynchronous jobs, the kind of application that does not need 24/7 availability. In my case, I was able to use preemptible instances for my internal API, an application that does need 24/7 availability. This internal API can't go offline, so I solved the unavailability problem by running many servers in parallel behind an HAProxy load balancer. In three basic steps, I reduced my API infrastructure cost by almost 50%.
Step 1 – Set up the client to be fault tolerant
My code is written in Scala. Basically, I made the client repeat a request whenever it fails. This is necessary because, even though the API machines are behind the load balancer, the load balancer takes some time (seconds) to notice that a specific machine is down, so it occasionally sends requests to unavailable machines. The client code snippet is:
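    // Params, api, LOG and codeForSuccess are defined elsewhere in the application;
    // an implicit ExecutionContext must be in scope for the Future callbacks.
    def query(params: Params, retries: Int = 0): Unit = {
      val response = api.query(params)
      response.onSuccess { case _ => codeForSuccess() }
      response.onFailure {
        case x =>
          LOG.error(s"Failure on try $retries of API request: ${x.getMessage}")
          Thread.sleep(retries * 3000) // this sleep is optional
          query(params, retries + 1)   // there could be a maximum number of retries here
      }
    }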
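The comment in the snippet above mentions that there could be a maximum number of retries. A minimal sketch of that idea follows, assuming the API call returns a `scala.concurrent.Future`. The names `BoundedRetry`, `withRetries`, and `flakyQuery` are hypothetical and not part of the original client, and the optional sleep between attempts is left out to keep the example short.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BoundedRetry {
  // Runs `call`; on failure, runs it again until it succeeds or
  // `maxRetries` retries have been used. `call` is by-name, so each
  // attempt issues a fresh request. Once the retries are exhausted,
  // the last failure is returned to the caller.
  def withRetries[A](maxRetries: Int, attempt: Int = 0)(call: => Future[A])(
      implicit ec: ExecutionContext): Future[A] =
    call.recoverWith {
      case _ if attempt < maxRetries =>
        withRetries(maxRetries, attempt + 1)(call)
    }
}

object BoundedRetryDemo {
  def main(args: Array[String]): Unit = {
    implicit val ec: ExecutionContext = ExecutionContext.global
    val calls = new AtomicInteger(0)

    // Hypothetical stand-in for api.query: fails twice, then succeeds.
    def flakyQuery(): Future[String] =
      if (calls.incrementAndGet() < 3) Future.failed(new RuntimeException("backend down"))
      else Future.successful("ok")

    val result = Await.result(BoundedRetry.withRetries(maxRetries = 5)(flakyQuery()), 5.seconds)
    println(s"$result after ${calls.get} calls") // prints "ok after 3 calls"
  }
}
```

Capping the retries means a client stops hammering the load balancer when every backend is down, and the caller sees the final error instead of waiting forever.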
Step 2 – Put all servers behind a load balancer
I created an HAProxy config file that I can update automatically from the list of servers returned by the gcloud command line. Here is the script that rewrites the HAProxy config file with a list of all servers that have a specific substring in their names:
    #!/bin/bash
    SERVER_SUBSTRING=playax-fingerprint

    # Current config without the previously generated server lines
    EMPTY_FILE=$(grep -v "$SERVER_SUBSTRING" /etc/haproxy/haproxy.cfg)

    # One "server" line per matching instance: strip the "true" preemptible
    # flag, squeeze repeated spaces, then take the fourth column
    NEW_LINES=$(gcloud compute instances list | grep "$SERVER_SUBSTRING" \
      | sed 's/true//g' | sed 's/ [ ]*/ /g' | cut -d" " -f4 \
      | awk '{print "    server playax-fingerprint" $NF " " $NF ":9000 check inter 5s rise 1 fall 1 weight 1"}')

    echo "$EMPTY_FILE" > new_config
    echo "$NEW_LINES" >> new_config
    sudo cp new_config /etc/haproxy/haproxy.cfg
    sudo ./restart.sh
The restart script reloads the HAProxy configuration without any outage.
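To make the generated entries easier to picture, here is a small Scala sketch that renders the same kind of `server` line the awk command prints. It is an illustration rather than part of the original setup: `HaproxyServerLines`, `serverLine`, and the sample addresses are hypothetical, and it assumes the value extracted by the script is each instance's internal IP.

```scala
object HaproxyServerLines {
  // One backend entry per address, with the same health-check options
  // as the shell script: check every 5s, a single failed check marks
  // the server down, a single successful check brings it back.
  def serverLine(prefix: String, ip: String, port: Int = 9000): String =
    s"    server $prefix$ip $ip:$port check inter 5s rise 1 fall 1 weight 1"

  def main(args: Array[String]): Unit = {
    // Hypothetical internal addresses, for illustration only.
    val ips = List("10.240.0.2", "10.240.0.3")
    ips.map(ip => serverLine("playax-fingerprint", ip)).foreach(println)
  }
}
```

The `inter 5s rise 1 fall 1` options are what keep the window of failed requests short: a preempted machine is taken out of rotation after one missed check.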
Step 3 – Create an instance group for these servers
By creating an instance template and an instance group, I can easily add servers to or remove servers from the infrastructure. The preemptible setting is on the instance template page in the Google Cloud console.
- Create an instance template with the preemptible option checked
- Create an instance group that uses that template
One very important warning: you need to plan your capacity so that 20% of your servers can be down at any time (remember that preemptible instances are occasionally unavailable). In my case, I had 20 servers before using the preemptible option. With the preemptible option on, I changed the group to 25 servers.
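The capacity rule above can be written as a small calculation. This is a sketch under the assumption that "20% down" means the remaining 80% must still cover the original load; `CapacityPlan` and `groupSize` are hypothetical names.

```scala
object CapacityPlan {
  // Smallest group size such that, with `percentDown` percent of the
  // instances unavailable at once, at least `required` are still serving.
  def groupSize(required: Int, percentDown: Int): Int = {
    require(required >= 0 && percentDown >= 0 && percentDown < 100)
    val percentUp = 100 - percentDown
    (required * 100 + percentUp - 1) / percentUp // integer ceiling division
  }

  def main(args: Array[String]): Unit =
    println(groupSize(required = 20, percentDown = 20)) // prints 25
}
```

Integer arithmetic is used on purpose, so the result does not depend on floating-point rounding.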
| | Before | After |
| --- | --- | --- |
| Servers | 20 | 24 |
| Cost per server per hour | $0.07 | $0.03 |
| Total cost per hour | $1.40 | $0.72 |
| Total cost per month | $1,008 | $518 |
Price reduction: $490 or 48.6%
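The figures in the table can be reproduced with a few lines of Scala. The sketch assumes a 30-day month of 24-hour days, which is what the monthly totals imply; `CostComparison` and `monthlyCost` are hypothetical names, and `BigDecimal` is used to avoid floating-point noise in the prices.

```scala
object CostComparison {
  val HoursPerMonth: Int = 24 * 30 // assumption: a 30-day month

  // Monthly cost of `servers` machines billed at `hourlyPrice` each.
  def monthlyCost(servers: Int, hourlyPrice: BigDecimal): BigDecimal =
    hourlyPrice * servers * HoursPerMonth

  def main(args: Array[String]): Unit = {
    val before  = monthlyCost(20, BigDecimal("0.07")) // regular instances
    val after   = monthlyCost(24, BigDecimal("0.03")) // preemptible instances
    val saved   = before - after
    val percent = (saved / before * 100).setScale(1, BigDecimal.RoundingMode.HALF_UP)
    println(s"before=$before after=$after saved=$saved ($percent%)")
  }
}
```

Rounded to whole dollars, the results match the table: $1,008 before, $518 after, and a saving of $490, or 48.6%.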
Graphs of server usage over one day (note how many outages there are, yet the application ran perfectly):