Scaling your Network - Effective Hosting, HAProxy, Smart Design

Preface

With the average Minecraft server having under 1 player online at peak time, very few servers ever run into trouble building their infrastructure. The ones that do, however, often have to slog through the mud before they manage to build something stable. This blog post explains some of the infrastructure designs used in the industry, with examples for minigame networks, game mode networks, and hybrids, along with the pros and cons of each solution.

The Hosting Choice

Since hosting is the root of your entire infrastructure, making the right choice here can mean the difference between sustaining tens of thousands of players and barely reaching 1,000. There are a few kinds of hosting I want to touch upon, along with my personal picks in terms of providers.

  • Cloud Hosting: With cloud hosting becoming more prevalent each day, many minigame networks have moved to it to reduce costs and scale quickly with size. Since many networks are concentrated in one timezone (Gamster included), much of their capacity sits idle outside peak hours, so this choice can cut hosting costs by about 30-40% compared to dedicated. It also means minigame networks no longer need to purchase expensive hardware ahead of an expected large inflow of players, since the automatic cloud deployment system takes care of all the work. While this may be a dream solution for minigame networks, it is less viable for servers that mainly focus on game modes like Survival, Factions, etc., which are difficult to convert to run on the cloud. Favorite providers include: OVH, Hetzner, DigitalOcean.

  • Dedicated Hosting: Dedicated is the second solution used by networks worldwide. With providers like OVH (through its SYS subsidiary) offering capable dedicated servers at low cost, including DDoS protection and effective support (if you call them), it's the easiest solution, as it doesn't require developing a cloud deployment system for your network. Dedicated hosting is ideal for servers with stable player counts or networks hosting game modes like Survival, Factions, etc. One upside is that providers with poor DDoS protection often have the best pricing for the hardware provided! Favorite providers include OVH subsidiary SYS, Hetzner, and MYLOC subsidiary WebTropia.

Solving inherent hosting difficulties:

  • Low cloud RAM: OpenJ9 is a JVM with lower memory usage (~30-40%) at the cost of CPU power (10-15%). It has been optimized for running in the cloud, but it can also be used outside of cloud environments; Gamster uses a hybrid OpenJ9 and HotSpot approach. More info is available in TUX's blog post.
  • Unreliable DDoS Protection: Since most hosting providers struggle when you are under attack, the best option is to route your traffic through something you can trust, such as OVH. Providers like TCPShield can help you with that, or you can develop your own in-house solution. I've explained a bit about the former in a previous blog post, and I'll provide some images to explain some of the proposed infrastructure schemes. Avoid using reverse DNS or pointing DNS at your unprotected machines, as this leaks your infrastructure details and leaves you susceptible to denial of service. For OVH, buy a secondary IP and run an HAProxy instance on your main IP that points to a closed, internal bungee on the second IP. OVH will drop TCP connections to backend servers hosted on other providers when DDoSed. With 2 IPs, one of them closed and unreachable by the attacker, the second IP will never disconnect from your backend servers, whereas proxying directly remains susceptible to DDoS, even on OVH Game!
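To put the OpenJ9 memory figure from the list above into perspective, here is a rough sketch of how many game server instances might fit on one cloud machine. The host size, per-instance memory, and reserved headroom are hypothetical values chosen for illustration, not measurements, and the 10-15% CPU cost is not modeled.

```python
def instances_per_host(host_ram_mb, instance_mb, memory_saving=0.0, reserved_mb=1024):
    """Estimate how many server instances fit in a host's RAM.

    memory_saving is the fraction of memory saved per instance
    (0.30 corresponds to the lower end of the ~30-40% range).
    reserved_mb is a hypothetical allowance for the OS and tooling.
    """
    usable_mb = host_ram_mb - reserved_mb
    per_instance_mb = instance_mb * (1 - memory_saving)
    return int(usable_mb // per_instance_mb)


# Hypothetical 16 GB host running instances that need 2 GB each without the saving.
baseline = instances_per_host(16384, 2048)
with_saving = instances_per_host(16384, 2048, memory_saving=0.30)
print("without saving:", baseline)
print("with 30% saving:", with_saving)
```

Under these assumed numbers the same machine holds a few more instances, which is where the saving on low-RAM cloud plans would come from.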

Multiple BungeeCord Instances

Once you pass a certain point (~500 players), it's a good idea to start using multiple BungeeCord instances so your server won't flop when your player base spikes. Good solutions include RedisBungee (which has performance issues during huge player base spikes) or developing your own in-house solution that syncs player counts via ping packets and implements cross-bungee sync wherever required. Because multiple A records mean your BungeeCord instances WILL NOT be properly load balanced, a good idea is to run an HAProxy cluster in front of them to ensure an almost perfect player spread across your bungee servers. Please note this will consume MORE bandwidth.
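The difference in player spread can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: multiple A records are approximated as each player landing on an arbitrary proxy, the balancer is assumed to use a least-connections policy (HAProxy offers one as `balance leastconn`, but the policy choice here is mine), and players never disconnect.

```python
import random


def spread(counts):
    """Difference between the busiest and the least busy proxy."""
    return max(counts) - min(counts)


def assign_random(players, proxies, rng):
    """Approximates multiple A records: each player lands on an arbitrary proxy."""
    counts = [0] * proxies
    for _ in range(players):
        counts[rng.randrange(proxies)] += 1
    return counts


def assign_least_connections(players, proxies):
    """Approximates a balancer that always picks the proxy with the fewest players."""
    counts = [0] * proxies
    for _ in range(players):
        target = counts.index(min(counts))
        counts[target] += 1
    return counts


rng = random.Random(42)  # fixed seed so the run is repeatable
dns_counts = assign_random(2000, 4, rng)
lb_counts = assign_least_connections(2000, 4)
print("A records (random):", dns_counts, "spread:", spread(dns_counts))
print("least connections: ", lb_counts, "spread:", spread(lb_counts))
```

Real DNS behaves worse than this random model, since resolvers cache answers and send whole groups of players to the same address.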

Infrastructure Examples

The following infrastructure is designed to achieve the lowest pricing possible by making low-cost hosting providers usable. It uses HAProxy instances running on OVH, with a backend of cheap yet DDoS-susceptible machines. Cloud can also be added into the mix if desired.

PROS

  • Easy to implement
  • Cheap in comparison to other alternatives

CONS

  • Consumes a lot of bandwidth: almost double compared to players connecting directly to the bungee, since the traffic is proxied twice (PLAYER <-> HA <-> BUNGEE vs PLAYER <-> BUNGEE).
  • Doesn't scale so well when you have tens of thousands of players.
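The bandwidth point in the cons above comes down to simple arithmetic: every player stream crosses one extra hop on your side. A minimal sketch, assuming a hypothetical average of 100 kbps per player (an illustrative figure, not a measurement) and counting only the player-facing path, not the traffic between bungee and the game servers:

```python
def edge_bandwidth_mbps(players, kbps_per_player, hops):
    """Total bandwidth carried across the given number of proxy hops, in Mbps.

    hops=1 models PLAYER <-> BUNGEE,
    hops=2 models PLAYER <-> HA <-> BUNGEE.
    """
    return players * kbps_per_player * hops / 1000


PLAYERS = 5000
KBPS_PER_PLAYER = 100  # hypothetical average, adjust to your own measurements

direct = edge_bandwidth_mbps(PLAYERS, KBPS_PER_PLAYER, hops=1)
proxied = edge_bandwidth_mbps(PLAYERS, KBPS_PER_PLAYER, hops=2)
print("direct:", direct, "Mbps")
print("via HAProxy:", proxied, "Mbps")
```

This treats the doubling as an upper bound; the factor of two holds regardless of the per-player figure you plug in.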

[Figure: Smart DDoS Prevention]

The following infrastructure is designed for larger networks (5000+ players) that can save a lot by removing HAProxy from the mix. It is currently untested by me but is most probably what Hypixel is using. DNS service providers with smart queries are generally on the more expensive end, but once you rack up tens of thousands of players, it will reduce networking costs significantly, since instead of player <-> HA <-> bungee you only have player <-> bungee.

[Figure: Smart DNS Solution]

PROS

  • Cheapest solution. Cloudflare with a watcher service on your end can be used for A record balancing at virtually $0/month.
  • Highly effective at huge player counts (5000+)
  • Almost no extra bandwidth used.

CONS

  • Your bungees themselves need to be protected against DDoS, rather than just your HAProxy instances.
  • Removes some of the proxying features that HAProxy can offer.
  • Load balancing is only rough, since not all resolvers honor the TTL.
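The watcher service mentioned in the pros could work roughly as follows: check each bungee, then publish only the healthy, least loaded ones as A records. The sketch below covers only the selection step. The `Bungee` record, the `select_a_records` helper, the capacity figures, and the documentation-range IPs are all hypothetical, and the step that pushes the result to the DNS provider is left out on purpose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bungee:
    ip: str
    healthy: bool   # result of your own health check (hypothetical)
    players: int    # current player count
    capacity: int   # player count at which the proxy is considered full


def select_a_records(bungees, max_records=3):
    """Return the IPs to publish: healthy, not full, least loaded first."""
    candidates = [b for b in bungees if b.healthy and b.players < b.capacity]
    candidates.sort(key=lambda b: b.players / b.capacity)
    return [b.ip for b in candidates[:max_records]]


fleet = [
    Bungee("203.0.113.10", True, 450, 500),
    Bungee("203.0.113.11", True, 120, 500),
    Bungee("203.0.113.12", False, 0, 500),    # down, never published
    Bungee("203.0.113.13", True, 500, 500),   # full, never published
    Bungee("203.0.113.14", True, 300, 500),
]
records = select_a_records(fleet, max_records=2)
print(records)
```

Because resolvers do not all honor the TTL, a watcher like this can only steer new connections approximately, which matches the rough balancing listed in the cons.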