Mobile Game Scaling Optimization: Architecting Cities for 1M+ Concurrent Players

Published on May 21, 2026

In a nutshell

Master mobile game scaling optimization by building high-concurrency city infrastructure. Learn netcode, asset streaming, and server sharding.

Every multiplayer developer knows the exact moment their mobile game architecture cracks. You design a sprawling, beautiful urban environment. You test it locally with 10 simulated clients, and the build runs at a flawless 60 FPS. Then you push it to a live environment with 1,000 concurrent players crowding into the central plaza. Within seconds, low-end Android devices hard-crash due to Out-Of-Memory (OOM) exceptions, iOS Jetsam aggressively kills your application, and your dedicated server's CPU spikes to 100% as it attempts to calculate network replication for thousands of overlapping entities.

When building a mobile MMO or a large-scale open world designed to support millions of active users, you cannot rely on out-of-the-box engine defaults. Mobile hardware has strict thermal throttling and hard memory caps (often limiting your game to less than 2GB of usable RAM on mid-range devices). Simultaneously, your server must handle dense player clusters without buckling.

Achieving true mobile game scaling optimization requires a three-pillar approach: aggressive spatial partitioning on the server, ruthless memory management on the client, and distributed backend architecture to handle the sheer volume of connections. In this step-by-step tutorial, we will break down exactly how to architect large-scale cities for mobile platforms.

Step 1: Server-Side Spatial Partitioning

The fundamental enemy of server performance in massive multiplayer games is the O(N²) problem. If your server loops through every player to check their distance against every other player to determine who needs network updates, the math scales catastrophically. 100 players require 10,000 distance checks per tick. 1,000 players require 1,000,000 distance checks. At a 30Hz server tick rate, that is 30 million checks per second.

To solve this, we must implement spatial hashing (or a grid/quadtree system). By dividing the city into a logical grid, each player checks network relevance only against entities in its current cell and the eight surrounding cells. This replaces the O(N²) all-pairs scan with an O(1) cell lookup per player followed by a small local check, as long as no single cell becomes overcrowded.
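
To make the arithmetic concrete, here is a minimal C++ sketch that reproduces the figures above and compares them with a grid-based check. The function names and the `avgNeighbours` parameter are illustrative assumptions; the neighbour count in particular is a guess you would replace with a measured value.

```cpp
#include <cstdint>

// Distance checks per tick when every player is tested against every player.
// Uses N^2 to match the rounded figures in the text (the exact count is N*(N-1)).
std::uint64_t BruteForceChecksPerTick(std::uint64_t players)
{
    return players * players;
}

std::uint64_t BruteForceChecksPerSecond(std::uint64_t players, std::uint64_t tickRateHz)
{
    return BruteForceChecksPerTick(players) * tickRateHz;
}

// With a grid, each player only tests the entities found in its 3x3 neighbourhood.
// avgNeighbours is an ASSUMED average, not a measured value.
std::uint64_t GridChecksPerSecond(std::uint64_t players,
                                  std::uint64_t avgNeighbours,
                                  std::uint64_t tickRateHz)
{
    return players * avgNeighbours * tickRateHz;
}
```

With 1,000 players at 30Hz, the brute-force path costs 30,000,000 checks per second. If each player sees roughly 40 neighbours, the grid path costs 1,200,000.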

Implementing a Spatial Hash Grid (C# Example)

Here is a compact implementation of a 2D spatial hash grid in C#. It uses Unity's Vector3, Vector2Int, and Mathf types; for Godot (via C#) or a custom backend server, swap in the equivalent vector and math types. It lets you manage entity proximity without looping through the entire world state.

using System.Collections.Generic;
using UnityEngine;

public class SpatialHashGrid
{
    private readonly float _cellSize;
    private readonly Dictionary<Vector2Int, HashSet<uint>> _grid;

    public SpatialHashGrid(float cellSize = 50f)
    {
        _cellSize = cellSize;
        _grid = new Dictionary<Vector2Int, HashSet<uint>>();
    }

    // Convert a world position to a grid coordinate
    private Vector2Int GetCellCoordinate(Vector3 position)
    {
        return new Vector2Int(
            Mathf.FloorToInt(position.x / _cellSize),
            Mathf.FloorToInt(position.z / _cellSize)
        );
    }

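    // Add or update an entity's position in the grid.
    // For a first-time insert, pass the same value for oldPosition and newPosition.
    public void UpdateEntityPosition(uint entityId, Vector3 oldPosition, Vector3 newPosition)
    {
        Vector2Int oldCell = GetCellCoordinate(oldPosition);
        Vector2Int newCell = GetCellCoordinate(newPosition);

        // Leave the old cell only when the entity actually crossed a cell boundary
        if (oldCell != newCell && _grid.TryGetValue(oldCell, out HashSet<uint> oldSet))
        {
            oldSet.Remove(entityId);
            if (oldSet.Count == 0)
            {
                _grid.Remove(oldCell); // avoid accumulating empty cells
            }
        }

        if (!_grid.TryGetValue(newCell, out HashSet<uint> newSet))
        {
            newSet = new HashSet<uint>();
            _grid[newCell] = newSet;
        }

        // Always register in the current cell; HashSet.Add is a no-op if already present.
        // (Previously a brand-new entity was never added unless its cell changed.)
        newSet.Add(entityId);
    }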

    // Retrieve all entities in the immediate vicinity (9 cells)
    public List<uint> GetEntitiesInProximity(Vector3 position)
    {
        List<uint> nearbyEntities = new List<uint>();
        Vector2Int centerCell = GetCellCoordinate(position);

        // Loop through the 3x3 grid around the player
        for (int x = -1; x <= 1; x++)
        {
            for (int y = -1; y <= 1; y++)
            {
                Vector2Int cellToCheck = new Vector2Int(centerCell.x + x, centerCell.y + y);
                if (_grid.TryGetValue(cellToCheck, out HashSet<uint> entitiesInCell))
                {
                    nearbyEntities.AddRange(entitiesInCell);
                }
            }
        }

        return nearbyEntities;
    }
}

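By routing your network replication logic through GetEntitiesInProximity, your server only calculates exact distances for the players in the nine surrounding cells rather than the whole world, drastically reducing CPU load and allowing one instance to handle thousands of concurrent players spread across the map. When a crowd packs into a single cell, the candidate list grows with it, which is why the next step limits what is actually sent.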
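
The grid lookup returns candidates, not a final answer: a 3x3 block of 50-unit cells can include entities well over 50 units away. A minimal C++ sketch of the follow-up exact check might look like this. `Vec2`, `FilterByRadius`, and the `positions` table are hypothetical names introduced for illustration, not part of the C# class above.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Vec2 { float x; float z; };

// Keep only the candidates whose exact distance to the viewer is within `radius`.
// `candidates` stands in for the IDs returned by the grid lookup;
// `positions` is a hypothetical entity-ID -> position table.
std::vector<std::uint32_t> FilterByRadius(
    const Vec2& viewer,
    const std::vector<std::uint32_t>& candidates,
    const std::unordered_map<std::uint32_t, Vec2>& positions,
    float radius)
{
    const float radiusSq = radius * radius; // compare squared values, no sqrt needed
    std::vector<std::uint32_t> relevant;
    relevant.reserve(candidates.size());
    for (std::uint32_t id : candidates)
    {
        const auto it = positions.find(id);
        if (it == positions.end()) continue; // entity no longer exists
        const float dx = it->second.x - viewer.x;
        const float dz = it->second.z - viewer.z;
        if (dx * dx + dz * dz <= radiusSq) relevant.push_back(id);
    }
    return relevant;
}
```

The output preserves the order of the candidate list, so sorting by priority (if you need it) can happen in a separate pass.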

Step 2: Network Interest Management

Even with spatial hashing solving the server's CPU bottleneck, you still have a bandwidth problem. Mobile networks (4G/5G) are inherently unstable, prone to high jitter, and have strict bandwidth limitations. Sending full state for 50 nearby players every tick can saturate a weak mobile connection, leading to packet loss, rubber-banding, and extreme desyncs.

Interest Management (or Network Relevancy) is the practice of prioritizing what gets sent over the network. A player 2 meters away engaged in a firefight requires 30 updates per second. A player more than 50 meters away walking down a different street only needs 2 updates per second.
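
Independent of any engine, the tiering idea can be sketched in a few lines of C++. The 50 m and 100 m thresholds and the 30/2 updates-per-second rates are tuning assumptions taken from this article's figures, and `UpdatesPerSecond` and `ShouldSendOnTick` are hypothetical helper names.

```cpp
#include <cstdint>

// Distance tiers in meters. These are tuning assumptions, not engine defaults.
constexpr float kFarTierMeters = 50.0f;
constexpr float kMaxRelevantMeters = 100.0f;

// Updates per second for a given viewer-to-entity distance. 0 means "do not send".
int UpdatesPerSecond(float distanceMeters)
{
    if (distanceMeters > kMaxRelevantMeters) return 0;
    if (distanceMeters > kFarTierMeters) return 2;
    return 30;
}

// Decide whether to send on this server tick, given a fixed tick rate.
// An entity in the 2-per-second tier on a 30Hz server is sent every 15th tick.
bool ShouldSendOnTick(std::uint64_t tick, int tickRateHz, float distanceMeters)
{
    const int rate = UpdatesPerSecond(distanceMeters);
    if (rate <= 0) return false;
    const int interval = (rate >= tickRateHz) ? 1 : tickRateHz / rate;
    return tick % static_cast<std::uint64_t>(interval) == 0;
}
```

In a real server you would likely offset each entity's send tick (for example by its ID) so that far-tier updates do not all land on the same tick.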

Overriding Network Relevancy (Unreal Engine C++ Example)

In Unreal Engine, you can take control of this by overriding the IsNetRelevantFor function. This allows you to aggressively cull network traffic by distance, and the example below pairs it with distance-tiered update rates (a line-of-sight test could be added in the same place).

bool ACityPlayerCharacter::IsNetRelevantFor(const AActor* RealViewer, const AActor* ViewTarget, const FVector& SrcLocation) const
{
    // 1. Always relevant to ourselves
    if (RealViewer == this || ViewTarget == this)
    {
        return true;
    }

    // 2. Calculate squared distance (avoids a square root)
    const float DistanceSquared = FVector::DistSquared(SrcLocation, GetActorLocation());

    // 3. Absolute cull distance: 10,000 units (cm) = 100 meters, squared
    const float MaxRelevancyDistSq = 10000.0f * 10000.0f;
    if (DistanceSquared > MaxRelevancyDistSq)
    {
        return false;
    }

    return Super::IsNetRelevantFor(RealViewer, ViewTarget, SrcLocation);
}

// 4. Dynamic network update frequency based on distance.
// IsNetRelevantFor is const and runs once per viewer, so it cannot write
// NetUpdateFrequency, which is a single per-actor value shared by all viewers.
// Instead, call this helper (a hypothetical method you declare in the class header)
// from non-const server code, passing the squared distance to the NEAREST viewer.
void ACityPlayerCharacter::UpdateNetFrequency(float NearestViewerDistSq)
{
    const float FarTierDistSq = 5000.0f * 5000.0f; // 50 meters
    NetUpdateFrequency = (NearestViewerDistSq > FarTierDistSq)
        ? 2.0f    // 2 updates a second
        : 30.0f;  // 30 updates a second
}

By scaling your NetUpdateFrequency dynamically based on distance, you can cut server outbound bandwidth substantially. Reductions upwards of 70% are achievable when most nearby entities fall into the far tier, which preserves the player's mobile data plan and helps prevent latency spikes.
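
How much you save depends on the mix of near and far entities. The following C++ sketch is a back-of-the-envelope estimate only; the function names are hypothetical and `bytesPerUpdate` is an assumed average payload size, not a measured one.

```cpp
// Outbound bytes per second to ONE client under a two-tier update scheme.
// bytesPerUpdate is an ASSUMED average payload per entity update.
double OutboundBytesPerSecond(int nearEntities, int farEntities,
                              double nearHz, double farHz, double bytesPerUpdate)
{
    return (nearEntities * nearHz + farEntities * farHz) * bytesPerUpdate;
}

// Fraction of bandwidth saved relative to sending every entity at nearHz.
double SavingsFraction(int nearEntities, int farEntities, double nearHz, double farHz)
{
    const double flat = (nearEntities + farEntities) * nearHz;
    const double tiered = nearEntities * nearHz + farEntities * farHz;
    return flat > 0.0 ? 1.0 - tiered / flat : 0.0;
}
```

For example, 10 near entities at 30Hz and 40 far entities at 2Hz with 32-byte updates come to 12,160 bytes per second per client, versus 48,000 if all 50 were sent at 30Hz. That is a saving of roughly 75%, and the figure shrinks as more entities move into the near tier.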

Step 3: Client-Side Memory Limits and Asset Streaming

Servers have plenty of RAM; mobile phones do not. An iPhone 13 has 4GB of unified memory. The iOS operating system typically reserves around 1.5GB to 2GB of that. Your game must fit entirely within the remaining 2GB footprint. If you load an entire large-scale city into memory at once, the OS will instantly terminate the application.

To survive this environment, your city must be chunked and streamed asynchronously.

  • Hierarchical Level of Detail (HLODs): Instead of rendering 50 individual buildings in a distant city block (amounting to 3,000 draw calls), you must bake that entire city block into a single static mesh with a unified texture atlas. This reduces the draw calls for distant geometry from thousands down to exactly one.
  • Addressable Asset Systems: Never use hard references in your primary data assets. If a player spawns in District A, the client should use asynchronous loading (e.g., Unity's Addressables or Unreal's PrimaryAssetLabels) to download or load only the textures and meshes required for District A. District B must be rigorously purged from RAM.
  • Texture Compression: Rely exclusively on ASTC (Adaptive Scalable Texture Compression) for mobile. It allows for highly variable block footprints, giving you granular control over memory vs. visual quality on a per-texture basis.
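
To see what the ASTC block footprint does to your memory budget, here is a small C++ sketch that computes the size of a single mip level. It relies on one property of the format: every ASTC block occupies 128 bits (16 bytes) regardless of its footprint. The function name `AstcLevelBytes` is a hypothetical helper.

```cpp
#include <cstdint>

// Bytes for one mip level of an ASTC-compressed texture.
// Every ASTC block is 16 bytes, so larger block footprints
// (6x6, 8x8, ...) mean fewer bytes per pixel at lower quality.
std::uint64_t AstcLevelBytes(std::uint32_t width, std::uint32_t height,
                             std::uint32_t blockW, std::uint32_t blockH)
{
    const std::uint64_t blocksX = (width + blockW - 1) / blockW;  // round up
    const std::uint64_t blocksY = (height + blockH - 1) / blockH; // round up
    return blocksX * blocksY * 16;
}
```

A 2048x2048 texture takes 4,194,304 bytes at 4x4, 1,871,424 bytes at 6x6, and 1,048,576 bytes at 8x8, compared with 16,777,216 bytes as uncompressed RGBA8. A full mip chain adds roughly one third on top of each figure.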

Step 4: Distributed Backend Architecture and Server Sharding

A massive metropolis cannot run on a single physical machine. When designing an MMO-scale city, the world must be physically divided across multiple server instances (shards or nodes). When a player crosses a bridge from the Downtown Node to the Slums Node, their client connection and world state must seamlessly hand off between two completely different server processes.
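
One detail that is easy to miss is what happens when a player lingers on the bridge. A minimal C++ sketch, assuming a simplified one-dimensional split between two nodes, shows how a hysteresis margin can stop ownership from flapping. `NodeBoundary`, `NextOwnerNode`, and the node numbering are illustrative assumptions.

```cpp
// Hypothetical 1D split: positions left of boundaryX belong to node 0,
// positions right of it to node 1. The margin creates a dead zone so a
// player standing on the boundary does not bounce between server processes.
struct NodeBoundary
{
    float boundaryX;
    float margin;
};

int NextOwnerNode(const NodeBoundary& b, int currentNode, float playerX)
{
    if (currentNode == 0 && playerX > b.boundaryX + b.margin) return 1;
    if (currentNode == 1 && playerX < b.boundaryX - b.margin) return 0;
    return currentNode; // inside the margin: keep the current owner
}
```

With a boundary at x = 1000 and a margin of 25, a player owned by node 0 is only handed off after passing x = 1025, and is only handed back after dropping below x = 975.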

Building this yourself requires Kubernetes clusters with a game-server orchestrator such as Agones, a shared store such as Redis to pass player state between server nodes, and custom UDP load balancers for seamless connection handoffs. Designing this robustly so that players don't lose items during the transition is a massive undertaking, easily 4-6 months of dedicated DevOps work for a senior engineering team.

If you don't properly handle the RPC queues and database writes during these handoffs, you will inevitably run into state corruption. We have previously covered the mechanics of fixing the Unreal Engine RPC replication issue breaking your states, and those exact same principles apply directly to spatial handoffs across server nodes.
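
One common guard against this kind of corruption is an ownership epoch (sometimes called a fencing token): every handoff increments a counter, and the store rejects writes carrying an old value. This is a swapped-in illustration rather than a technique the article prescribes, and the C++ sketch below uses a hypothetical in-memory `PlayerStore` in place of a real database.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical player record; `epoch` increases on every ownership handoff.
struct PlayerRecord
{
    std::uint64_t epoch = 0;
    int ownerNode = 0;
    int gold = 0;
};

// In-memory stand-in for the shared player-state store.
class PlayerStore
{
public:
    // Transfer ownership and return the new epoch the owner must present.
    std::uint64_t Handoff(std::uint32_t playerId, int newOwner)
    {
        PlayerRecord& r = records_[playerId];
        r.ownerNode = newOwner;
        return ++r.epoch;
    }

    // Accept a write only from the current owner holding the current epoch.
    bool WriteGold(std::uint32_t playerId, int node, std::uint64_t epoch, int gold)
    {
        const auto it = records_.find(playerId);
        if (it == records_.end()) return false;
        PlayerRecord& r = it->second;
        if (r.ownerNode != node || r.epoch != epoch) return false; // stale writer
        r.gold = gold;
        return true;
    }

    int Gold(std::uint32_t playerId) const
    {
        const auto it = records_.find(playerId);
        return it == records_.end() ? 0 : it->second.gold;
    }

private:
    std::unordered_map<std::uint32_t, PlayerRecord> records_;
};
```

If the old node flushes a queued write after the handoff, it still holds the previous epoch, so the write is refused instead of overwriting the new node's state.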

This is where platform solutions shine. With horizOn, these high-concurrency backend services, real-time database syncs, and dedicated server orchestrations come pre-configured. Instead of spending your runway architecting and debugging Kubernetes networking rules, you can focus strictly on building out your city's gameplay loops and client optimizations.

Best Practices for Mobile City Worldbuilding

To ensure your city scales flawlessly to millions of total users while maintaining high frame rates on budget devices, adhere strictly to these architectural rules:

  1. Aggressive Instance Pooling: Never use Instantiate() or SpawnActor for transient objects like vehicles, pedestrians, or projectiles during gameplay. Mobile CPUs choke heavily on memory allocation and garbage collection. Pre-warm object pools during the loading screen and cycle them continuously.
  2. Texture Atlasing for City Blocks: Draw calls are a leading cause of frame drops on mobile. Mobile GPUs are tile-based renderers, and each draw call adds CPU and driver overhead before the GPU does any work. Combine the textures of all generic street props (trash cans, benches, streetlights) into a single large texture atlas so the props can share one material. This allows the engine to batch the rendering of hundreds of props into a single draw call.
  3. Strict Polycount Budgets per Chunk: Enforce hard limits. A single mobile city chunk (e.g., a 100x100 meter area) should ideally stay under 300,000 visible triangles. Rely heavily on normal maps rather than raw geometry to simulate architectural details.
  4. Implement Server-Side Hibernation: Running a dedicated server for a massive city where 80% of the map is currently empty is a fast track to bankrupting your studio. You need aggressive instance management, drawing inspiration from the Fortnite server optimization hibernation proposal to spin down idle grid cells and wake them up instantly when a player approaches.
  5. Decouple Collision from Visual Mesh: Never use complex visual meshes for server-side collision calculations. The server should only understand the city as a series of low-poly primitive shapes (boxes, capsules, spheres). This keeps server memory footprints minimal and physics calculations sub-millisecond.
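
As an illustration of the first rule, here is a minimal fixed-capacity object pool in C++. It is a sketch under stated assumptions: `ObjectPool` and `Projectile` are hypothetical names, the pooled type must be default-constructible, and the pool is not thread-safe.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical pooled type for illustration.
struct Projectile
{
    float x = 0.0f;
    float z = 0.0f;
    bool active = false;
};

// Fixed-capacity pool: all objects are allocated once up front ("pre-warmed"),
// then handed out and returned with no further heap allocation.
template <typename T>
class ObjectPool
{
public:
    explicit ObjectPool(std::size_t capacity) : items_(capacity)
    {
        free_.reserve(capacity);
        for (std::size_t i = capacity; i > 0; --i) free_.push_back(i - 1);
    }

    // Returns nullptr when the pool is exhausted instead of allocating.
    T* Acquire()
    {
        if (free_.empty()) return nullptr;
        const std::size_t index = free_.back();
        free_.pop_back();
        return &items_[index];
    }

    // The pointer must have come from Acquire() on this same pool.
    void Release(T* item)
    {
        free_.push_back(static_cast<std::size_t>(item - items_.data()));
    }

    std::size_t Available() const { return free_.size(); }

private:
    std::vector<T> items_;          // storage, never resized after construction
    std::vector<std::size_t> free_; // indices of currently unused slots
};
```

Returning nullptr on exhaustion is a deliberate choice here: it forces the caller to decide whether to skip the spawn or recycle the oldest object, rather than silently allocating mid-frame.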

Common Pitfalls to Avoid

  • The RPC Flooding Trap: Developers often trigger server-to-client Remote Procedure Calls (RPCs) for visual effects (like a spark emitting from a car crash). Do not do this. The server should only replicate the car's state (e.g., bIsCrashed = true). The client should independently observe this state change via an OnRep/property hook and trigger the spark VFX locally. This saves massive amounts of network bandwidth.
  • Leaking Memory on Zone Transitions: When streaming out a city chunk on mobile, ensure you are actively forcing garbage collection or manually unloading the asset bundles. If you leave even a few megabytes of orphaned textures in memory every time a player moves between zones, the footprint keeps growing until the OS kills the app, often after as little as 20 minutes of gameplay.
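
The state-replication pattern from the first pitfall can be sketched outside any engine. In this C++ example the client mirrors a single replicated flag and triggers its effect locally when the flag flips. `ReplicatedCar`, `ApplyReplicatedState`, and `onCrashed` are hypothetical names standing in for an engine's OnRep machinery.

```cpp
#include <functional>

// Client-side mirror of a replicated car. The server only sends `crashed`;
// the client notices the change and fires the VFX locally.
class ReplicatedCar
{
public:
    // Hypothetical hook the game assigns to spawn sparks, play audio, etc.
    std::function<void()> onCrashed;

    // Called whenever a state update for this car arrives from the server.
    void ApplyReplicatedState(bool crashed)
    {
        const bool wasCrashed = crashed_;
        crashed_ = crashed;
        // Edge-triggered: fire only on the false -> true transition,
        // so repeated or duplicate updates do not replay the effect.
        if (crashed_ && !wasCrashed && onCrashed) onCrashed();
    }

    bool IsCrashed() const { return crashed_; }

private:
    bool crashed_ = false;
};
```

Because the effect is derived from state, a client that joins late or misses a packet still ends up with the correct `crashed` value, which a one-shot RPC cannot guarantee.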

Conclusion

Achieving true mobile game scaling optimization is a balancing act. It requires fighting for every megabyte of client RAM, strictly regulating network relevancy, and distributing server load across scalable backend nodes. By implementing spatial hashing, dynamic update frequencies, and asynchronous asset streaming, you can build massive, living cities that run smoothly even on years-old mobile hardware.

However, building the scalable infrastructure to route thousands of concurrent connections and manage seamless server handoffs is often harder than building the game itself. Ready to scale your multiplayer backend without the DevOps nightmare? Try horizOn for free or check out the API docs to see how we handle high-concurrency architecture out of the box.


Source: Designing Large-Scale Mobile Game Cities: Production, Optimization, & Worldbuilding Expertise
