Scaling down stateful applications gracefully on Google Cloud

by Anmol Sethi   Last Updated July 04, 2018 04:00 AM

Let's say I have a multiplayer game. The client connects to one of my game servers and opens a persistent WebSocket connection, over which it sends and receives messages to play the game. A game has no fixed duration, so the lifetime of the client's WebSocket connection is also unbounded.

I have 5 game servers, each on its own Google Cloud instance, all running near maximum capacity until traffic begins to decrease and one instance becomes under-utilized.

What I want is for all new traffic to the under-utilized instance to stop, and for the instance to be deleted only once it's done with all of its existing connections. It should get all the time it needs to finish those connections, so that existing game sessions are not disturbed.

If traffic increases again and a new instance is needed while the under-utilized one hasn't shut down yet, I want traffic to be routed to it again and for it not to be terminated.
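To make the behavior I'm after concrete, here is a minimal sketch of the lifecycle as a small state machine. It is only a model of what I want, not anything Google Cloud provides: `GameServer`, `State`, `start_draining`, and `cancel_draining` are hypothetical names I made up for illustration, and the connection tracking is simulated in memory.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"          # accepts new connections
    DRAINING = "draining"      # keeps existing connections, refuses new ones
    TERMINATED = "terminated"  # no connections left, safe to delete


class GameServer:
    """Hypothetical model of one game server instance (not a Google Cloud API)."""

    def __init__(self, name):
        self.name = name
        self.state = State.ACTIVE
        self.connections = set()

    def accepts_new_connections(self):
        return self.state is State.ACTIVE

    def connect(self, client_id):
        if not self.accepts_new_connections():
            raise RuntimeError(f"{self.name} is not accepting connections")
        self.connections.add(client_id)

    def disconnect(self, client_id):
        self.connections.discard(client_id)
        # A draining server is deleted only once its last session ends.
        if self.state is State.DRAINING and not self.connections:
            self.state = State.TERMINATED

    def start_draining(self):
        # Stop new traffic; terminate right away only if nothing is connected.
        if self.state is State.ACTIVE:
            self.state = State.DRAINING if self.connections else State.TERMINATED

    def cancel_draining(self):
        # Traffic picked up again before the server finished draining.
        if self.state is State.DRAINING:
            self.state = State.ACTIVE
            return True
        return False
```

In this model, scaling down calls `start_draining()` on the under-utilized server, and scaling up first tries `cancel_draining()` on any draining server before creating a new instance.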

Is this possible on Google Cloud? If not, is there any way to approximate this? Should I have a different architecture for my servers?
