Scalability and Building For Massive Multiplayer Audiences
A common question developers ask us is "Can HeroEngine support X players...?"
This is, however, the wrong question, as it tries to reduce an extremely complicated subject to a yes-or-no answer, rendering the answer effectively meaningless. The answer is yes, absolutely: HeroEngine can handle X players for any value of X (please see the previous sentence).
HeroEngine has a number of features specifically designed to support the demands of a Massive Multiplayer Game including sophisticated systems for:
However, using HeroEngine does not negate the need for your designers and programmers to:
- design the game world that supports massive populations
- design game play that supports massive populations
- provide sufficient content for massive populations
- utilize the engine's features
- architect systems that are asynchronous, parallelizable, and that perform caching, lazy evaluation, etc. as appropriate (designers think of an area server instance as geometry with NPCs, shops, quests, etc. As programmers, we know an area server instance is a unit of simulation, and we have control over spinning up additional processes, called areas, each of which can be used to run any code we need; see System Areas.)
- write good code
- No engine, no programming language, no technology can make a bad N^2 algorithm anything other than a bad N^2 algorithm.
- Cluster - A HeroEngine cluster is composed of multiple machines including a database and 1...N supporting machines called "world servers" (meaning physical or virtual servers that run HeroEngine processes for a world)
- Server Group - A server group defines a conceptual set of physical machines (servers) that provides the physical resources upon which HeroEngine processes are spun up. HeroEngine's flexible architecture allows world configurations to specify specific server groups for the different services it provides. Server groups may be, but need not be, shared between worlds.
- Server - ambiguously used, this could mean: a physical server, a virtualized instance, or a game world occasionally referred to by players as a "shard"
- Shard - a game world instance that arbitrarily splits the total player population into a manageable size generally for reasons of adequate population density for a feeling of a "massive" game while maintaining accessibility to content and system responsiveness
- World - Alternate word for "shard", with identical meaning
- Area Instance - an area instance is a Play or Edit instance of an area. Each area instance contains a complete copy of the data describing the geometry and game-play elements for a particular area. Any given area instance runs in a single process on the server; this can be thought of as a unit of simulation.
- Instance Slices - an instance slice is an area instance spun up to distribute load for a specific region of geometry by dividing players between the slices and proxying information between them so that players can interact with each other. Each slice is a play instance of the same area. (not to be confused with area instances for different areas connected seamlessly together).
N-squared (N^2) Algorithms
N-squared (hereafter N^2) algorithms are algorithms whose run-time cost grows in proportion to the square of the number of items (N) being processed: doubling N quadruples the work. In massive multiplayer games, N^2 algorithms are almost always a massive performance problem.
For example, take the average movement packet in a Massive Multiplayer Online Game like World of Warcraft. When a character moves, the server has to send an update to everyone who is aware of the character, so if 100 players are in the area and all are aware of each other, then when one player moves, updates must be sent to the other 99. In isolation, that is not so bad: 99 movement packets are sent out to the individual clients. Now, what happens when all 100 players are moving?
Each client must receive updates for all of the other characters, resulting in roughly 10,000 movement packets (100 x 99 = 9,900) being sent from the server. What if we are sending movement packets 10 times per second? Now we are talking about roughly 100,000 packets that must be generated and sent out each and every second, and that is JUST for movement and no other game information!
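The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch of the worst case described in the text (every player aware of every other player, one packet per observer per update); the function name is made up for this example and is not part of HeroEngine.

```python
def movement_packets_per_second(players, updates_per_second):
    """Worst case: every player is aware of every other player."""
    # Each moving player triggers one packet to each of the other players.
    packets_per_update_round = players * (players - 1)
    return packets_per_update_round * updates_per_second


# Ten times the players costs roughly a hundred times the packets.
for n in (10, 100, 1000):
    print(n, "players ->", movement_packets_per_second(n, 10), "packets/second")
```

For 100 players at 10 updates per second this gives 99,000 packets per second, which the text rounds to 100,000.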
MMO game engines, and HeroEngine is no exception, deal with these kinds of issues through a variety of strategies including:
- awareness systems
- bandwidth shaping / prioritization
- efficient replication mechanisms
- game-play area design
- game system architecture
- GOOD CODE!
It bears repeating...no engine, no programming language, no technology can make an N^2 algorithm anything other than an N^2. Game logic must take advantage of an engine's features to choose the right trade-offs for the game design.
Engines whose marketing departments claim their engine solves the problem of N (for a large value of N) players who all decide to move to the same game location are selling you snake oil; don't buy any. There is no magic "Easy Button"; the laws of physics still apply.
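To illustrate why awareness systems and game-play area design appear in the list of strategies above, the sketch below counts how many (observer, subject) pairs need updates when awareness is limited to a radius. It is a generic uniform-grid lookup written for this example, not HeroEngine's actual awareness implementation; the function name and the positions are assumptions.

```python
import math
from collections import defaultdict


def aware_pairs(positions, radius):
    """Count ordered (observer, subject) pairs within `radius` of each other.

    With a grid cell size equal to the radius, only the 3x3 block of cells
    around a character has to be searched for candidates.
    """
    grid = defaultdict(list)
    for index, (x, y) in enumerate(positions):
        grid[(int(x // radius), int(y // radius))].append(index)

    pairs = 0
    for (cx, cy), members in grid.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                # .get() avoids creating empty cells while iterating.
                for other in grid.get((cx + dx, cy + dy), ()):
                    ox, oy = positions[other]
                    for me in members:
                        if me == other:
                            continue
                        x, y = positions[me]
                        if math.hypot(x - ox, y - oy) <= radius:
                            pairs += 1
    return pairs


# 100 characters standing on the same spot: everyone sees everyone.
crowded = [(0.0, 0.0)] * 100
# The same 100 characters in 10 groups of 10, far apart from each other.
spread = [(1000.0 * group, 0.0) for group in range(10) for _ in range(10)]

print(aware_pairs(crowded, 50.0))
print(aware_pairs(spread, 50.0))
```

The crowded case needs 9,900 updates per round while the spread-out case needs 900. The awareness system does not make the crowded case cheaper; it only pays off when the game design keeps players from piling into one spot.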
Production vs Development Clusters
A production HeroEngine Cluster is typically composed of a database server capable of handling many worlds (often referred to as "shards" or "servers" by players), with a supporting cast of 1...N physical machines called "world" servers (in a server group) upon which the "world" processes (such as area server instances) are run. Because HeroEngine is a process oriented architecture, expanding cluster compute capacity is simply a matter of adding another physical server to the server group; processes (such as area server instances) will start spinning up on the new machine.
A development cluster, on the other hand, is generally quite different, often running fewer and less capable servers than a production cluster. Additionally, development clusters are often (as is the case in HeroCloud) run virtualized to provide operational and budgetary flexibility. Beyond that, development worlds have processes that utilize significant CPU time performing tasks that are not generally necessary in production (such as nav mesh generation on the PathMaker server).
Consequently, one cannot look at a development world and say this is how HeroEngine will perform in a production environment. It is like comparing apples and go-karts: they both might roll downhill, but beyond that there is very little commonality between them.
Processes: Areas as Units of Simulation
Game designers, players, and artists think of areas as neatly connected geography filled with a variety of content to entertain the user.
As programmers, we know that areas are simply units of simulation. In HeroEngine, HeroScript has complete control over how, when and why units of simulation (area instances, aka processes) are spun up. That means you have the power to distribute processing in arbitrary ways, taking advantage of HeroEngine's load balancing mechanisms to distribute processing across the cluster. One of the ways to do this is through the use of System Areas, which provide a basic framework for a distributed service where requests for processing are distributed in a parallelizable fashion.
The key here is that you want the processes that ARE actually involved in game logic that occurs synchronously (combat, for example) to spend their time doing those things, while offloading other types of processing to other processes so as to eliminate their impact on the user.
MMORPG systems that lend themselves naturally to this type of architecture include the in-memory representation of quests, chat, guilds, auctions and just about any other kind of service.
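As a rough illustration of distributing requests across several instances of one service, the sketch below routes each request to an instance chosen by a stable hash of a key (for example a guild id), so that all requests for the same key land on the same instance. This is a generic Python model of the idea only; `ServicePool` and its methods are hypothetical names and do not represent the HeroScript System Areas API.

```python
import zlib


class ServicePool:
    """Hypothetical stand-in for several instances of one game service."""

    def __init__(self, name, instance_count):
        self.name = name
        # One pending-request queue per service instance (process).
        self.queues = [[] for _ in range(instance_count)]

    def instance_for(self, key):
        # crc32 is stable across runs, unlike Python's built-in str hash.
        return zlib.crc32(key.encode("utf-8")) % len(self.queues)

    def submit(self, key, request):
        """Queue a request on the instance that owns `key`; return its index."""
        index = self.instance_for(key)
        self.queues[index].append(request)
        return index


guild_service = ServicePool("guild", instance_count=4)
for guild_id in ("guild:1", "guild:2", "guild:3", "guild:1"):
    where = guild_service.submit(guild_id, {"op": "roster", "guild": guild_id})
    print(guild_id, "-> instance", where)
```

Keying by guild keeps one guild's state on one instance, which avoids cross-instance coordination for that guild. Other services might key by player, by auction house, or by quest.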
Getting a meaningful answer to "How many...?" requires understanding the resource requirements per user, including:
- How much CPU time is required...
- distributed between all of the services in the cluster?
- in the main unit of simulation (area instance)? (This value probably varies based on the "normal" activities for the area instance)
- How much RAM is required...
- distributed between all of the services in the cluster?
- in the main unit of simulation (area instance)?
- Are the game systems architected in a fashion that allows subsets of user data to be loaded independently?
- How much database I/O is required?
- How much storage space on disk is required?
- How much bandwidth is required?
- How many hits to the website...
- on average?
- How many billing transactions...
- on average?
- What is acceptable latency...
- per system?
- HeroEngine is a process oriented architecture which deals with the issue of server-side scaling by spinning up additional processes to service load.
Examples of the process oriented architecture are found in the externally facing services to which a client connects: the Dude Server and the Repository Server, which handle, respectively, all of the traffic to and from internal processes (such as area instances) and the file requests that populate the Local Repository Cache. Each of those processes is managed by a Director and handles approximately 1000 connected users; when additional capacity is required, the Director spins up additional processes to handle it.
At the game play level of things, the fundamental unit of simulation is an area instance. Each area instance runs in a separate server process called an Area Server, which processes events using ASIO. The Area Server processes are dynamically allocated to physical hardware servers based on a load-balancing scheme (least-burdened heuristic). Area instances are spun up dynamically based on arbitrary game logic implemented in HeroScript.
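A least-burdened heuristic can be sketched in a few lines. The version below is a simplified model, assuming load is a single number per world server and that each new process adds a known cost; the real HeroEngine load balancer's inputs and behavior are not documented here, and the server names and numbers are invented.

```python
def pick_least_burdened(server_loads):
    """Return the name of the server with the lowest current load."""
    return min(server_loads, key=server_loads.get)


def place_processes(server_loads, process_costs):
    """Assign each new process to whichever server is least loaded right now."""
    loads = dict(server_loads)  # work on a copy, leave the input untouched
    placement = []
    for cost in process_costs:
        target = pick_least_burdened(loads)
        loads[target] += cost
        placement.append(target)
    return placement, loads


# Loads as percent CPU in use (hypothetical numbers).
current = {"world-1": 50, "world-2": 20, "world-3": 40}
placement, after = place_processes(current, [10, 15, 30])
print(placement)
print(after)
```

Adding a fourth, empty server to `current` would attract the next processes automatically, which mirrors the text's point that expanding capacity means adding machines to the server group.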
An Area Server process is a monolithic process with a single-threaded HeroScript Virtual Machine. Area Servers can communicate with any other Area Server through asynchronous remote function calls from HeroScript and Replication. Because any number of Area Server processes can be spun-up, utilizing as much hardware as available, the broad-phase scalability is limited mostly by switching fabric bandwidth and the database back-end. The utilization of these resources is contingent on your particular game implementation, so care must be taken during the design phase. Further, the capacity of the switching fabric and the database store depends on your datacenter build-out.
Narrow-phase scalability is constrained by the fact that a given Area process can become over-burdened and utilize too much CPU while processing game events, character movement packets, game mechanics, AI and so forth. An over-burdened server can have heavy-load processes migrated/configured to less burdened physical hardware. However, once a heavy-load process utilizes the entire resources of a physical piece of hardware, there can be no further automatic load-balancing steps taken.
Particularly taxing game mechanics can be engineered (or re-engineered after they are discovered to be a bottleneck) into "services" by use of System Areas. A System Area is a process, created from HeroScript, that manages any type of processing that can run asynchronously to any particular geographical area. In this way, it provides a "service" to the game system. Examples include the management of guilds, quest assignment, and perhaps even combat resolution. Because System Areas can be instanced many times to account for load, and each process can run on any physical hardware available, this provides a high degree of scalability.
Like all systems that trade off communication for parallelism, there is ultimately a limit at which the communication processing itself is more burdensome than the processing it was attempting to parallelize. This is also the fundamental problem in dynamic load balancing systems which, in their effort to create greater and greater parallelism, simply explode the resources required to manage the proxy traffic. Therefore careful game and system design will always be a big part of any scalability strategy. HeroEngine can provide a solid foundation and toolset for this, but it is still important to realize that the game-specific implementations will be key to effective scaling.
Designing for Scalability
- Just because HeroEngine can handle N players per area instance, does not mean that the game design or content supports that number of players.
It is critical to coordinate game design with the technical realities of massive numbers of players and their behavior. How many users can participate in a given area instance will be largely determined by the game mechanics that exist in your game design. For example, extremely complex combat logic in a high-frequency combat-heavy game design will necessitate the splitting of the world into smaller areas so that the load can be properly balanced.
Any single area instance can easily handle a thousand users in one game implementation, but maybe only handle a couple hundred in another. There is no way to determine this ahead of time, since HeroEngine is entirely generic and doesn't restrict how you might use it. If you write inefficient scripts, your results will be worse than if you write efficient ones. If you are smart about how much AI processing creatures perform, then you will be able to handle more creatures in a given area than if you aren't. And so on.
The best way to deal with this is to identify the key aspects of your game design that are absolute requirements and which are flexible. Make sure to leave room for trade-offs, because in MMOs there are hard-and-fast physical limitations. Server hardware is only so fast at any given generation, database systems can process only so many transactions, etc. And remember that the "illusion" of doing a lot is better than actually doing a lot. This is, of course, the Black Art of MMO design.
Let's examine the classic first day of release crush, where 100,000 pre-ordered customers log in at the same time. Being cautious developers on release date, let's assume we reasonably decided to spin up more than sufficient server capacity, allocating roughly 3000 players per shard (on day one, expecting the leading edge to advance through the content, leaving room for additional players logging in on subsequent days and eventually growing to a total server population of 5,000-10,000).
Great, our world is plenty large for 3000 players to roam around. The only problem is, the 3000 players are not going to roam around the whole world...they are all concentrated in the first zones for content level 1-10. If we have designed content where a particular uniquely named monster must be killed to complete a quest, we end up with a line of hundreds of players all trying to kill the same creature. This is a classic example of design and content that does not scale.
Instancing is the technique of taking a portion of the game content (i.e. a dungeon or zone) and spinning up many copies of it into which a subset of the total player population is placed. There are advantages and disadvantages to instancing, but one of the most common complaints from designers and players is that they want the game experience to be totally fluid and not break the players' immersion in the game.
Continuing our narrative of release day...
One of the classic ways to scale the initial player experience is to instance the first X hours of game content. Using instancing in this fashion allows the developer to design for and control the initial experience and content utilization to an appropriate number of players. Instancing also provides the benefit of splitting players into separate units of simulation, reducing the number of players the server (area instance process) and the client must handle.
Dynamically Scaling Systems
Well designed systems can be architected to scale with the population. For example, a spawner system could take into account the number of players and a design goal of the player not having to wait more than 20 seconds after the end of their current engagement to fight another creature. By contrast, a spawner system that just spawns a new creature every minute if the previous creature is dead would not scale to supply content for players exceeding the explicit number of creatures placed by level builders.
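One way to express the population-aware spawner described above is to derive the spawn interval from the number of active players and the wait-time goal. The model below is deliberately simple and rests on assumptions made for this example: every player fights one creature at a time, an average fight lasts `avg_kill_seconds`, and the 60 second value is kept as the slowest allowed interval. None of these names come from HeroEngine.

```python
def spawn_interval_seconds(active_players, avg_kill_seconds,
                           max_wait_seconds=20.0, slowest_interval=60.0):
    """Seconds between spawns needed to keep the average wait at the goal.

    Each player consumes one creature per (fight + wait) cycle, so the area
    must supply active_players creatures in that same span of time.
    """
    if active_players <= 0:
        return slowest_interval
    interval = (avg_kill_seconds + max_wait_seconds) / active_players
    # Never spawn more slowly than the fixed-timer baseline.
    return min(interval, slowest_interval)


for players in (0, 1, 10, 100):
    print(players, "players ->", spawn_interval_seconds(players, 30.0), "s between spawns")
```

A fixed one-minute timer corresponds to the `active_players <= 0` branch here, which is why it stops supplying enough creatures as soon as a second player shows up.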
Another example might be collect X quests (4 feathers, and 2 acorns), which could be implemented as objects in the area or unique client-only representations that are generated specifically for each player who has the quest. The design decision here is, does it matter if one player sees an acorn on their client but an observing player did not? Probably not.
Architecting Systems for Scalability
- Coming soon...
How Many Players can HeroEngine Support on...
- Well...gee after all of that can we please just get a number?
I mentioned that this is the wrong question to ask...never-the-less here are some numbers. Keep in mind the actual numbers can differ by orders of magnitude based on:
- type of game
- game design
- level design
- skill of the team
- physical hardware
- resource budget per player
...a Production Cluster?
HeroEngine is designed for Massive Multiplayer games, easily accommodating tens of thousands, hundreds of thousands or even millions of players in a cluster. HeroEngine's architecture allows the number of compute units to be trivially expanded on the fly by adding new world servers to a server group. The database providing HeroEngine's persistence layer runs on Oracle, which has a vast array of options for increasing its I/O throughput and transaction processing.
Much like a cluster, an individual shard could be expected to accommodate thousands, tens of thousands or hundreds of thousands of players. For a classic MMORPG like World of Warcraft, the numbers are probably thousands to tens of thousands per world/shard.
At the world level for MMORPGs, the limiting factors are more often on the content and game system design levels. How many players does a world need to feel "massive" while not overwhelming the content or impacting game system responsiveness?
For social games, the numbers are probably in the hundreds of thousands or millions of players per world/shard.
...a Physical Server?
A physical server is simpler: based on your budget for a player's resource usage, it is possible to calculate on average how many players a single physical machine can handle. For example, a typical MMORPG such as World of Warcraft might budget anywhere from 3-10 MHz worth of processing distributed throughout all of the services consumed by a player. If the budget for your game assumes 10 MHz per player (which even for an MMORPG is high), a dual-package 2 GHz quad-core machine (2 x 4 x 2000 MHz = 16,000 MHz) could be expected to handle roughly 1,600 players per physical server.
Now, in reality the budget per player is probably lower, and the current sweet spot for commodity hardware is for CPUs with higher clock speeds, more packages, and more cores per package, so one could reasonably expect to get higher numbers.
For a social game, where the budget per player is very small in comparison to an MMORPG, one could expect that tens or hundreds of thousands of players might be serviced by a single world server.
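The calculation above is simple enough to write down directly. The sketch below treats total clock rate as the capacity measure, exactly as the text does; that is a crude approximation, and the function name is made up for this example.

```python
def players_per_server(packages, cores_per_package, core_mhz, budget_mhz_per_player):
    """Players one machine can host under a per-player CPU budget in MHz."""
    total_mhz = packages * cores_per_package * core_mhz
    return int(total_mhz // budget_mhz_per_player)


# Dual-package, quad-core, 2 GHz machine from the text.
print(players_per_server(2, 4, 2000, 10))  # high end of the budget range
print(players_per_server(2, 4, 2000, 3))   # low end of the budget range
```

At 10 MHz per player this gives 1,600 players, and at 3 MHz per player a little over 5,300, which shows how sensitive the answer is to the per-player budget.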
...an Area Instance?
- Ok...process oriented and an area server instance is a process. What if we ask how many players can be serviced by an area server instance? Maybe that will give us an answer we can use...
Unfortunately, this again is not the right question because there are too many variables that are entirely under your team's control and could differ on an area instance by area instance basis based on the typical activities of players there.
For example, I personally have run 5000 AI controlled characters in an area server instance (a process). Great! That means we can have 5000 players in an area server instance easily, right?
It is all a matter of game design, architecture and coding. Additionally, it depends on the hardware and the CPU budget for a user. For example, you can define the CPU budget on a per user basis to be 10 megahertz. If you have hardware that runs at 3 gigahertz, approximately 300 users could exist on the same area server (a physical server could be expected to run an area server on each of its cores at that load). However, some of the scripting functions, such as broadcast chat, do not scale linearly, so the actual capacity of the server would be somewhat less. If you have a game implementation that offloads some of the CPU budget to other servers, then more users will fit.
The number of players a game can support in a "shard" is more often a question of where is the proper balance point for a world to feel populated (i.e. massive) but not over populated such that there is insufficient content for the players (competition for resources such as "spawns") or systems (e.g. auctions) are no longer sufficiently responsive.
The example of running thousands of AI controlled characters was based on the following choices/factors for the test; modifying any of them has the potential to change the number by orders of magnitude in either direction.
- dedicated commodity server hardware ~3 years old (as opposed to a development cluster in HeroCloud where your worlds run in a shared virtualized environment). Technically, this test only utilized a single core since it was performed in a single process.
- characters were placed in groups of 100 with sufficient separation that each character would only receive awareness events for up to 99 others entering/leaving awareness
- AI behavior was simple wandering using pathfinding
- I write really good code
If we alter the parameters to place 5000 players within awareness range of each other, ignoring the fact that the client will not be able to render 5000 typical animated MMO characters simultaneously, the overhead from awareness events would reduce the processing that could be spent on other things. Depending on the choices I made when defining the DOM's replication parameters, I might pump out more data in movement messages than can be pushed through a physical server's network interface(s) because of everyone being aware of everyone (remember the N^2 algorithm discussion from earlier?).
An individual area server instance (i.e. process) can be expected to handle hundreds or thousands of players depending on what the resource demands are per character for the typical activities performed there. For example, a "housing" area server instance (Player Housing Tutorial) where players generally sit around doing little to nothing and generally are not aware of each other could reasonably be expected to handle thousands or even tens of thousands of players. A battle ground area instance where PvP combat events run might only handle a few hundred (though here the issue will probably be more of a client one, depending on the topology represented and the number of characters a given client simultaneously observes).
So it's quite reasonable that the answer varies, with some area instances of your game handling hundreds of players while others handle thousands. Of course, if you have a game where each player controls 1000 AI characters, the area server instance might only handle tens of players.
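To put rough numbers on the network concern above, the sketch below estimates outbound movement traffic when every player is aware of every other player. The 40 bytes per update and 10 updates per second are illustrative assumptions chosen for this example, not HeroEngine figures; real values depend on the replication parameters you define.

```python
def movement_bandwidth_bps(players, updates_per_second, bytes_per_update):
    """Outbound bits per second if every player is aware of every other."""
    updates_sent = players * (players - 1) * updates_per_second
    return updates_sent * bytes_per_update * 8


for n in (100, 1000, 5000):
    gbit = movement_bandwidth_bps(n, 10, 40) / 1e9
    print(f"{n} mutually aware players -> {gbit:.2f} Gbit/s")
```

Under these assumptions 100 mutually aware players need about 0.03 Gbit/s, while 5000 need about 80 Gbit/s for movement alone, far more than a typical server network interface can carry.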
What Question Should I Be Asking?
The real question is: based on my game's resource budget (how much CPU, RAM, database I/O, persisted data size, etc. is required on average per player), how many physical servers are required to service X players?
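That question can be turned into a small calculation: for each budgeted resource, divide total demand by what one server supplies, and the largest result is the number of servers needed. The sketch below assumes demand scales linearly with player count, which the earlier N^2 discussion shows is optimistic. Apart from the 10 MHz CPU budget and the 16,000 MHz server taken from the earlier example, the resource names and figures are invented for illustration.

```python
import math


def servers_required(players, per_player, per_server):
    """Return (server_count, bottleneck_resource) for a per-player budget."""
    needed = {}
    for resource, budget in per_player.items():
        needed[resource] = math.ceil(players * budget / per_server[resource])
    bottleneck = max(needed, key=needed.get)
    return needed[bottleneck], bottleneck


per_player = {"cpu_mhz": 10, "ram_mb": 2, "bandwidth_kbps": 20}
per_server = {"cpu_mhz": 16000, "ram_mb": 32768, "bandwidth_kbps": 1000000}

count, limit = servers_required(100000, per_player, per_server)
print(count, "servers, limited by", limit)
```

With these numbers CPU is the bottleneck, so lowering the per-player CPU budget is what would reduce the server count. With a different budget the limiting resource could just as easily be RAM or bandwidth.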
The good news is...in HeroCloud we handle the servers leaving you to concentrate on making a game, which is after all what you probably wanted to do anyway.