Real-World Buildings Without Breaking the Bank
How we replaced expensive API calls and a 40GB database with 4.6GB of compressed files on object storage — serving millions of real-world places for pennies.
Why Real-World Buildings Matter
One of the things that makes Retropolis special is that the game map is the real world. When you walk down your street, you see actual restaurants, shops, and landmarks — and you can purchase them as in-game buildings that generate revenue for your gang.
This isn't a gimmick. It's the core engagement loop. When a coffee shop you pass every morning becomes your building in the game, it creates a connection that no procedurally generated map can replicate. Players check their buildings while commuting, compete for ownership of local landmarks, and discover places in their own city they didn't know existed.
Every building in Retropolis has a type — food, shop, bar, office, culture, or station — that determines its revenue profile and strategic value. Bars generate more at night. Stations are high-traffic. Culture buildings are rare and prestigious. This means the real-world character of a neighborhood directly shapes the in-game economy around it.
But here's the thing: to make this work, we need access to millions of real-world places, globally, with categories, coordinates, and names. And that's where it gets expensive.
The Problem: Place Data at Scale
Location-based games need a source of truth for what exists in the real world. The industry standard is Foursquare's Places API — the same data that powers Swarm, Mapbox, and countless other location-aware apps.
The challenge? API calls add up fast. Every time a player opens their map, we need to fetch nearby places. Multiply that by thousands of concurrent players across different cities, and you're looking at millions of API requests per month. At scale, this becomes one of the biggest line items on the bill.
The Neopolis Days: A 40GB Postgres Monster
Before Retropolis, I worked on a similar location-based game called Neopolis. We hit this exact problem and came up with a solution: download the entire Foursquare Open Source Places dataset, index it with H3 (Uber's hexagonal spatial indexing system), and serve it from a self-hosted PostgreSQL database.
It worked. Kind of.
The database was 40GB and running on a dedicated machine. We wrote a Go service in front of it that handled spatial lookups using H3 cell indices. The performance was solid — H3 is incredibly efficient for "find everything near this point" queries — but the operational costs were brutal:
- A beefy Postgres instance running 24/7 just for place lookups
- Backup and replication overhead for a massive dataset
- Schema migrations on a 40GB table are not fun
- Updating the dataset (Foursquare releases monthly) meant a full re-import
The data wasn't even changing during gameplay. These are static locations — a restaurant doesn't move. We were paying database costs for what was essentially a read-only lookup table.
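The lookup pattern itself is worth sketching. The toy below uses a quantized lat/lng grid as a stand-in for H3 (the real service used Uber's H3 hexagons; `cell_of` and `neighborhood` are hypothetical helpers, not H3 API), but the shape of the query is the same: compute the cell for a point, expand to a ring of neighbors, then do one dictionary lookup per cell instead of a distance scan over every row.

```python
from collections import defaultdict

# Toy stand-in for H3: a quantized lat/lng grid. Real H3 cells are
# hexagons; the query shape is identical either way.
def cell_of(lat: float, lng: float, res: float) -> tuple[int, int]:
    return (int(lat // res), int(lng // res))

def neighborhood(cell: tuple[int, int]) -> list[tuple[int, int]]:
    """Square-grid analogue of H3's grid_disk (k-ring) expansion."""
    i, j = cell
    return [(i + di, j + dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)]

def build_index(places, res):
    index = defaultdict(list)
    for p in places:
        index[cell_of(p["lat"], p["lng"], res)].append(p)
    return index

def nearby(index, lat, lng, res):
    # One O(1) dict lookup per cell instead of scanning every place.
    return [p for c in neighborhood(cell_of(lat, lng, res))
            for p in index.get(c, [])]

places = [
    {"name": "Cafe A", "lat": 48.8566, "lng": 2.3522},
    {"name": "Bar B", "lat": 48.8570, "lng": 2.3530},
    {"name": "Shop C", "lat": 35.6762, "lng": 139.6503},  # Tokyo — shouldn't match Paris
]
index = build_index(places, res=0.01)
print([p["name"] for p in nearby(index, 48.8567, 2.3525, 0.01)])  # → ['Cafe A', 'Bar B']
```

This is why H3 shines for "what's near this point": the expensive geometry is done once at index time, and each query is a handful of hash lookups.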
The New Approach: H3 + Object Storage
For Retropolis, we kept what worked (H3 indexing) and threw out what didn't (the database).
The key insight: if the data is static and read-heavy, it belongs on object storage, not in a database.
Here's the architecture:
Step 1: Process and Partition
We run a pipeline that takes the Foursquare dataset, filters and categorizes each place into our six building types, computes its H3 cell index, and groups places by their H3 parent cell at a coarser resolution. Each group gets compressed into a single gzipped JSON file.
The entire world — millions of places — compresses down to 4.6GB of .json.gz files.
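A minimal sketch of that partitioning step, with two loud caveats: `parent_key` is a toy lat/lng grid standing in for an H3 parent cell at a coarse resolution, and the `<key>.json.gz` file layout is illustrative, not the production naming scheme.

```python
import gzip
import json
from collections import defaultdict
from pathlib import Path

COARSE_RES = 0.1  # stand-in for the coarse H3 parent resolution

def parent_key(lat: float, lng: float) -> str:
    # Toy grid key; the real pipeline uses the place's H3 parent cell.
    return f"{int(lat // COARSE_RES)}_{int(lng // COARSE_RES)}"

def partition(places, out_dir: Path) -> list[Path]:
    groups = defaultdict(list)
    for p in places:
        groups[parent_key(p["lat"], p["lng"])].append(p)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for key, group in groups.items():
        path = out_dir / f"{key}.json.gz"
        # One gzipped JSON file per coarse cell — the unit uploaded to R2.
        with gzip.open(path, "wt", encoding="utf-8") as f:
            json.dump(group, f)
        written.append(path)
    return written

places = [
    {"name": "Cafe A", "type": "food", "lat": 48.8566, "lng": 2.3522},
    {"name": "Bar B", "type": "bar", "lat": 48.8570, "lng": 2.3530},
    {"name": "Shop C", "type": "shop", "lat": 35.6762, "lng": 139.6503},
]
files = partition(places, Path("cells"))
print(sorted(f.name for f in files))  # one file per coarse cell: Paris + Tokyo
```

Gzip is a good fit here because JSON place records are highly repetitive (same keys in every object), which is a large part of how the whole world fits in 4.6GB.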
Step 2: Serve from the Edge
Those files sit on Cloudflare R2 (S3-compatible object storage). When a player opens their map, the server computes which H3 cells are visible, fetches the corresponding files, and caches them in Redis.
Because R2 has zero egress fees and the files are served from Cloudflare's global edge network, a player in Tokyo gets the same latency as a player in Paris. No database connection pools, no query planning, no connection limits.
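The server-side mapping from viewport to files can be sketched as below — again with a toy grid in place of H3 (real code would derive the covering cells with an H3 library), and with a hypothetical `places/<cell>.json.gz` key format.

```python
COARSE_RES = 0.1  # must match the resolution the pipeline partitioned at

def cells_for_viewport(min_lat, min_lng, max_lat, max_lng):
    # Every coarse cell the bounding box touches. The real server does
    # the equivalent with H3 cells covering the visible map area.
    i0, i1 = int(min_lat // COARSE_RES), int(max_lat // COARSE_RES)
    j0, j1 = int(min_lng // COARSE_RES), int(max_lng // COARSE_RES)
    return [(i, j) for i in range(i0, i1 + 1) for j in range(j0, j1 + 1)]

def object_key(cell):
    # One object on R2 per coarse cell (illustrative key format).
    return f"places/{cell[0]}_{cell[1]}.json.gz"

# A viewport over central Paris touches only a handful of files:
cells = cells_for_viewport(48.83, 2.25, 48.95, 2.41)
print(len(cells), object_key(cells[0]))  # → 6 places/488_22.json.gz
```

Because the mapping is pure arithmetic, there is no query planner and no index to consult — the "query" is just a list of object keys to fetch.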
Step 3: Cache Aggressively
Once a cell's places are loaded from R2, they're cached in Redis. Subsequent requests for the same area are served entirely from memory. The cache TTL is short enough that we could update the dataset without coordination, but long enough that hot areas (city centers) almost never hit R2 at all.
The Tradeoffs
No architecture is perfect. Here's what we gained and what we gave up:
What we gained:
- Cost: Object storage is orders of magnitude cheaper than a managed database for read-only data. R2 egress is free.
- Latency: Edge-served files + Redis cache means sub-10ms responses for cached cells.
- Operational simplicity: No database to manage, no replicas, no migrations, no connection pooling.
- Scaling: R2 handles any request volume. No need to vertically scale a database.
- Updates: Deploying a new dataset is just uploading new files. No downtime, no migration.

What we gave up:
- No ad-hoc queries: We can't run `SELECT * FROM places WHERE name LIKE '%pizza%'` anymore. The data is pre-partitioned by location, period.
- Fixed spatial resolution: The H3 cell size is baked into the file structure. Changing it means regenerating everything.
- Cold starts: The first request for a rarely-visited area has to fetch from R2. In practice this is ~50-100ms — not bad, but noticeable compared to a cached hit.
For a game where 99% of place lookups are "what's near this coordinate," these tradeoffs are overwhelmingly in our favor.
Built Places: Where Postgres Still Shines
Static place data lives on R2, but once a player purchases a building, that ownership data goes into PostgreSQL. This is dynamic, relational data — who owns what, building levels, active auctions — and a relational database is the right tool for it.
The separation is clean: R2 for "what exists in the world," Postgres for "what players have done with it." The server merges both views when responding to map requests, filtering out already-built locations from the static pool.
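That merge is simple enough to show. A hedged sketch, assuming illustrative field names (`id`, `place_id`, `owner`) rather than the actual schema:

```python
def merge_views(static_places: list[dict], built: list[dict]) -> list[dict]:
    # static_places: the read-only pool from R2.
    # built: ownership rows from Postgres for the same area.
    built_ids = {b["place_id"] for b in built}
    # Built locations are removed from the static pool so they can't be
    # purchased twice, then appended with their ownership data.
    available = [p for p in static_places if p["id"] not in built_ids]
    return available + built

static_places = [
    {"id": "p1", "name": "Cafe A", "type": "food"},
    {"id": "p2", "name": "Bar B", "type": "bar"},
]
built = [{"place_id": "p2", "name": "Bar B", "owner": "gang-42", "level": 3}]
print([p["name"] for p in merge_views(static_places, built)])  # → ['Cafe A', 'Bar B']
```

Because the static side is immutable, the merge only ever needs the small, per-area set of built rows from Postgres — the database stays tiny no matter how large the world is.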
The Numbers
| | Neopolis (old) | Retropolis (new) |
|---|---|---|
| Storage | 40GB Postgres | 4.6GB compressed on R2 |
| Monthly cost | ~$50-100/mo (managed DB) | ~$1-2/mo (R2 storage) |
| Latency (cached) | ~5ms | ~2ms (Redis) |
| Latency (cold) | ~15ms (DB query) | ~50-100ms (R2 fetch) |
| Update process | Full re-import, downtime | Upload new files, zero downtime |
| Scaling | Vertical (bigger DB) | Effectively unlimited (edge CDN) |
What's Next
We're continuing to refine the pipeline — better category mapping, smarter filtering, and potentially integrating additional data sources. The beauty of this architecture is that the serving layer doesn't care what generates the files. As long as the format stays the same, we can swap out or combine data sources without touching the game server.
If you're building a location-based game and wrestling with place data costs, the takeaway is simple: treat static geodata as files, not rows. Your database will thank you.