Niantic Is Training a Giant ‘Geospatial’ AI on Pokémon Go Data

Niantic is harnessing the power of crowdsourced data from Pokémon Go players and its Scaniverse app to develop a “large geospatial model” capable of mapping the world in 3D with near-centimeter precision.

Much like how ChatGPT and other large language models were trained on text scraped from the internet, Niantic’s approach relies on millions of real-world images captured by users.

Each scan provides details about angles, lighting conditions, and physical surroundings, building a comprehensive dataset from which a model can infer missing perspectives of roads, buildings, or landmarks.

By merging these scans into a single, unified AI model—rather than multiple smaller models keyed to individual locations—Niantic believes it can recreate or “fill in” unseen areas based on what it has already learned, not unlike how a person uses experience to predict what’s around the corner.
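The distinction between many location-specific models and one unified model can be illustrated with a deliberately simple toy (this is not Niantic's actual method; the coordinates, the "height" feature, and the linear fit are all illustrative assumptions): a lookup table keyed to exact places has nothing to say about an unvisited spot, while a single model fit across every scan can interpolate it.

```python
import numpy as np

# Toy illustration (not Niantic's method): a single model fit across all
# locations can predict a coordinate it has never seen, whereas a
# per-location lookup table cannot.

# Simulated "scans": (latitude, longitude) -> an observed scene feature.
coords = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
heights = np.array([1.0, 2.0, 3.0, 4.0])  # follows 1 + 2*lat + 1*lon

# Per-location approach: data keyed to individual places.
local_models = {tuple(c): h for c, h in zip(coords, heights)}
unseen = (0.5, 0.5)
print(unseen in local_models)  # False: no data for this exact spot

# Unified approach: one linear model over the whole dataset.
X = np.hstack([coords, np.ones((len(coords), 1))])  # [lat, lon, bias]
w, *_ = np.linalg.lstsq(X, heights, rcond=None)
predicted = float(np.array([unseen[0], unseen[1], 1.0]) @ w)
print(round(predicted, 2))  # 2.5 — interpolated for the unseen location
```

A real geospatial model would replace the linear fit with a large neural network and the scalar feature with full 3D geometry, but the "fill in unseen areas from shared structure" idea is the same.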

This robust geospatial understanding could enhance augmented reality (AR) experiences, informing more accurate placement of digital characters or objects, and may prove equally valuable for robotic systems, autonomous vehicles, and navigation services.

The potential applications go beyond gaming. Niantic’s Visual Positioning System (VPS) already powers immersive AR by matching players’ cameras to known data about specific places. With a single foundation model, the AI would be free to generalize patterns across an entire global dataset, offering more flexibility for developers to craft dynamic AR environments or guide robots in unfamiliar settings.
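Visual positioning of the kind the article describes can be sketched as descriptor matching: compare features extracted from the player's camera against a database of features tagged with known places. The sketch below is a hypothetical simplification (random vectors stand in for image descriptors; the place names and the `localize` helper are invented for illustration), not Niantic's VPS internals.

```python
import numpy as np

# Hypothetical sketch of visual positioning: match a query image's
# feature descriptors against location-tagged descriptors and vote.
rng = np.random.default_rng(0)

# Database: one descriptor per row, with the place each was captured at.
db_descriptors = rng.normal(size=(6, 8))
db_places = ["fountain", "fountain", "statue", "statue", "cafe", "cafe"]

def localize(query_descriptors):
    """Vote for the place whose stored descriptors are nearest to the query's."""
    votes = {}
    for q in query_descriptors:
        dists = np.linalg.norm(db_descriptors - q, axis=1)
        place = db_places[int(np.argmin(dists))]
        votes[place] = votes.get(place, 0) + 1
    return max(votes, key=votes.get)

# A query "image" whose descriptors are noisy copies of the statue's.
query = db_descriptors[2:4] + rng.normal(scale=0.05, size=(2, 8))
print(localize(query))  # statue
```

A foundation model changes what happens when matching fails: instead of returning nothing for an unmapped place, it can fall back on patterns generalized from the global dataset.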

Although such maps are typically proprietary and fragmented, Niantic’s crowdsourced approach may offer a path toward a more open, expansive digital recreation of the physical world. Still, collecting and using real-world data raises significant privacy concerns, especially since 3D scans can capture sensitive details about people’s homes and workplaces. While Niantic notes that its scanning feature is strictly voluntary, other companies may not maintain the same level of transparency.

Furthermore, some researchers question whether an architecture designed for language can directly apply to robotics or spatial data without significant modifications. Even so, as AR gains momentum and new devices, like advanced smartphones or AR glasses, proliferate, gathering large-scale real-world data will only become easier. The result could be an internet-like dataset of physical spaces, supporting AI that’s as fluent in the real world as today’s large language models are with text, fundamentally expanding how we interact with digital content, robots, and each other.