Safe social media practices include not posting photos that showcase personal information such as license plate numbers, street names, or house numbers. But what if I told you that generative AI could still find a way to locate you — just from your photo’s background?
As generative AI developments continue, new use cases are being identified. Now, graduate students at Stanford University have developed an application that can identify a location from a Google Street View panorama or even from a single image.
The project, called Predicting Image Geolocations (PIGEON), can in most cases accurately determine a specific location simply by looking at a Google Street View image of it.
PIGEON can predict the country pictured with 92% accuracy, and it can pinpoint a location within 25 kilometers of the target location in over 40% of its guesses, according to the preprint paper.
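To make the "within 25 kilometers" figure concrete, here is a minimal sketch of how such a metric could be computed. It is not taken from the PIGEON code: the function names and the sample coordinates are hypothetical, and the distance is the standard haversine great-circle formula.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def fraction_within(guesses, targets, radius_km=25.0):
    """Share of guesses that land within radius_km of their matching target."""
    hits = sum(
        haversine_km(g[0], g[1], t[0], t[1]) <= radius_km
        for g, t in zip(guesses, targets)
    )
    return hits / len(targets)

# Hypothetical data: one guess close to Paris, one guess far from New York.
targets = [(48.8566, 2.3522), (40.7128, -74.0060)]
guesses = [(48.9, 2.4), (34.0522, -118.2437)]
share = fraction_within(guesses, targets)
```

In this toy example, one of the two guesses falls inside the 25-kilometer radius, so the share is one half. The "over 40%" reported in the paper is a figure of the same kind, computed over PIGEON's actual guesses.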
To put that in perspective, PIGEON ranked within the top 0.01% of players of GeoGuessr, a game in which users guess where a Google Street View photo was taken. That game served as the genesis for this project.
PIGEON also beat one of the world’s best professional GeoGuessr players, Trevor Rainbolt, in a series of six matches, streamed online with more than 1.7 million views.
So how exactly does PIGEON work?
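The students leveraged CLIP, a neural network developed by OpenAI that learns to connect text and images from natural-language descriptions. CLIP can then be applied to a new classification task simply by supplying the names of the visual categories to be recognized.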
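The idea of classifying an image by naming the categories can be illustrated with a toy sketch. This is not PIGEON's method or CLIP's actual API: the three-dimensional vectors below are made up for illustration, whereas a real CLIP model would produce much larger embeddings from its image and text encoders. The sketch only shows the matching step, which picks the label whose text embedding is most similar to the image embedding.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_classify(image_embedding, label_embeddings):
    """Return the label whose text embedding best matches the image embedding."""
    return max(
        label_embeddings,
        key=lambda label: cosine(image_embedding, label_embeddings[label]),
    )

# Made-up embeddings standing in for the output of a text encoder.
label_embeddings = {
    "a street view photo in Japan": [0.9, 0.1, 0.0],
    "a street view photo in Brazil": [0.1, 0.9, 0.1],
    "a street view photo in Kenya": [0.0, 0.2, 0.9],
}
# Made-up embedding standing in for the output of an image encoder.
image_embedding = [0.2, 0.8, 0.2]

prediction = zero_shot_classify(image_embedding, label_embeddings)
```

Because the categories are supplied as plain text, new labels can be added without retraining the classifier, which is the property described above.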
Then, inspired by GeoGuessr, the students trained PIGEON on a dataset of 100,000 original, randomly sampled GeoGuessr locations. For each location, they downloaded a set of four images spanning the entire “panorama,” for a total of 400,000 images.
That training set is small compared with those of other AI models. For reference, OpenAI’s popular image-generating model, DALL-E 2, was trained on hundreds of millions of images.
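The dataset arithmetic can be sketched in a few lines. One assumption to flag: the article says only that four images span the panorama, so the evenly spaced compass headings below (one every 90 degrees) are an illustrative guess, and `panorama_headings` is a hypothetical helper rather than part of the project's code.

```python
NUM_LOCATIONS = 100_000      # sampled locations, per the article
IMAGES_PER_LOCATION = 4      # images per panorama, per the article

def panorama_headings(n_views=IMAGES_PER_LOCATION, start_deg=0.0):
    """Evenly spaced compass headings (degrees) covering a full 360-degree view.

    Even spacing is an assumption made for illustration.
    """
    step = 360.0 / n_views
    return [(start_deg + i * step) % 360.0 for i in range(n_views)]

headings = panorama_headings()                        # four headings, 90 degrees apart
total_images = NUM_LOCATIONS * IMAGES_PER_LOCATION    # size of the training set
```

With four views per location, the 100,000 locations yield the 400,000 training images cited above.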
The students also worked on a separate model called PIGEOTTO, which was trained on over four million photos derived from Flickr and Wikipedia to identify a location from a single image as input.
PIGEOTTO achieved impressive results on image geolocalization benchmarks, outperforming previous state-of-the-art results by up to 7.7% in city accuracy and 29.8% in country accuracy, according to the paper.
The paper addresses the ethical considerations associated with this model, including the benefits and risks. On one hand, image geolocalization has many positive use cases such as autonomous driving, visual investigations, and simply satisfying curiosity about where a photo was taken.
However, the most blatant risk is the violation of privacy. As a result, the students decided not to release the model weights publicly and released only the code for academic validation, according to the paper.