AI Photo Geolocation

131

15d

by hubraumhugo

hhh

14d

FYI this site is keeping everything you upload in a Google storage bucket, which was unauthenticated up until a little bit ago. (Full disclosure, it's my tweet.)

https://twitter.com/spuhghetti/status/1786033761341083731

Zeratoss

14d

thotDBSmash ??

Does this imply that the people behind the website specifically saved "juicy" user-uploaded images?

hhh

14d

no, it looks like it was a separate project, but stored in the same bucket. In the time I had access to the bucket (it's no longer public), it looks like they were scraping images from a dating site/app and each directory represented a profile.

ShamelessC

14d

That doesn’t sound super sketchy or anything.

mrbungie

14d

Yeah, just what we need. Scams on tinder based off LLMs + fake/stolen profile pics.

plorg

14d

I uploaded pictures of a couple of street corners and it confidently identified them as being in Texas and Florida, based on text that was not in the pictures and, in the second case, foliage in a scene that included only concrete. Although in fairness to the model, a parking lot may be the dominant ecosystem in Jacksonville.

Anyways, these pictures were from Iowa.

DougBTX

14d

Same, location identified by the architecture of the buildings in the background and the car's numberplate... of a car with no numberplate driving through a wood.

foobarbecue

15d

I gave it a picture from a bar in Austin. It nailed it, but with some interesting hallucinations in the description. The photo had a small Texas flag, but nobody was wearing cowboy hats, and there was nothing with "Austin" on it in the photo. Description was:

This photo was taken inside a bar. There are several clues that indicate this is Austin. First, there is a sign on the wall that says "Austin." Second, there is a Texas flag on the wall. Third, there are several people wearing cowboy hats, which is a common sight in Austin. The coordinates of this photo are

dylan604

14d

Is it possible that the sign that says Austin is on the wall and is known to the system but not visible in the actual photo?

fer

14d

Perhaps, but I've tried some rural landscapes without any signs and it came up with English signs as a hint pointing to England/Wales.

Even photos with signs in Irish were pointing to England, it's half funny, half offensive.

consumer451

15d

I uploaded a generic photo I took of a field, dandelions, and trees. It confidently stated Switzerland, and provided specific GPS coords.

Of course, it was entirely wrong.

Some level of confidence indication would make a system like this much more useful.

burkaman

14d

I think this is generative AI and it doesn't know how confident it is. So far it hasn't gotten any of my pictures right and it's made pretty bad guesses, a human could do better on most of them.

ubutler

14d

The LLM’s logits should be translatable into probabilities although I’m not too sure how meaningful those might be as models can sometimes be quite confident in entirely invalid predictions.
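For anyone wondering what "translatable into probabilities" means in practice, here is a minimal sketch, assuming you can read the per-step logits out of the model at all (the site exposes nothing like this, and the numbers below are made up). A softmax turns each step's logits into a distribution over tokens, and multiplying the chosen tokens' probabilities gives a probability for the whole answer string. That measures how likely the text is, not how likely the location is to be correct.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sequence_log_prob(step_logits, chosen_ids):
    """Sum the log-probabilities of the token chosen at each decoding step."""
    total = 0.0
    for logits, token_id in zip(step_logits, chosen_ids):
        total += math.log(softmax(logits)[token_id])
    return total

# Hypothetical logits for a 3-token vocabulary at two decoding steps.
step_logits = [[2.0, 1.0, 0.1], [0.5, 3.0, 0.2]]
chosen_ids = [0, 1]  # the tokens the model actually emitted (assumed)

probs = softmax(step_logits[0])
seq_prob = math.exp(sequence_log_prob(step_logits, chosen_ids))
print(probs, seq_prob)
```

A fluent but wrong answer can still score high here, so this number would need separate calibration before it could be shown as a confidence indicator.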

barrkel

14d

I don't think that's the right way to think about LLM logits. Fundamentally the logits represent probability of similarity with the text it's been trained on, given the current prefix. Mixed in with any correspondence with truth is not only tone, phrasing, dialect, language syntax, but also stuff like the likelihood that specific details are related to general concepts. Even if we're talking about a person with three legs, or a horse riding a man, it'll be hard for the LLM to not assign a fairly high probability to sentences that describe two legs, or a man riding the horse and not the other way around.

vineyardlabs

14d

I haven't done deep reading on LLM architectures and I don't really know if LLMs have logits in the traditional sense of a CNN or something, but I think the problem with this is that the LLM's logits would have absolutely no bearing on its confidence of the location being correct, only on its confidence that the tokens making up the answer it provides follow from the tokens that were encoded from the provided image, which isn't the same thing.

consumer451

14d

I can only imagine the quality of the systems being sold to governments, and people trusting them because "AI." I mean, "intelligence" is right in the name!

dylan604

14d

Intelligence is also in the name of the CIA, and that's pretty well understood as an oxymoron. Artificial is also in the name which seems much more apropos. It's clearly not real intelligence, it's purely artificial in the use of the word. I guess, Computer's Best Guess Simulating Intelligence To Low Intelligent Humans would be too on the spot and not as sexy of an acronym.

consumer451

14d

I actually have faith in the intelligence products produced by the modern CIA, much more than AI snake oil.

dylan604

14d

There's a difference in the products used to produce intelligence vs the analysis/centralization of it that the oxymoron comment goes towards.

alistairSH

14d

Tried it a few times, it's hit or miss.

- Rooftop bar in Viejo San Juan, PR was identified correctly, down to the intersection.

- Beach on the south coast of Vieques, PR was identified as Jamaica, so reasonably close for a non-descript tropical beach.

- Office building in Reston, VA which is fairly obvious (biggest/tallest building in the area) was identified as being in San Jose, CA.

- Train station in Staunton, VA was identified as somewhere in Massachusetts.

The attributes of the photo were mostly accurate, but were matched to an incorrect location.

sebys7

15d

1/3 was completely wrong, in the sense that the coordinates, country and city had nothing to do with it, but THEN the sources were other buildings from the actual country and city. 2/3 got the city and coordinates correct but got the country wrong; idk how that happened. 3/3 got country, city and coordinates correct. Pretty cool.

tgv

15d

Doxxing for dough. The ethics committee is out to lunch.

See this thread: https://news.ycombinator.com/item?id=40233248

ToucanLoucan

14d

In their defense, the AI hype folks have been ignoring the ethics community from jump. I'd leave too.

aaron695

14d

[dead]

salamo

14d

Neat demo. It seems like there may be a few things happening in tandem.

I uploaded a picture of a forest and it came back with visually similar images. So the first thing it might be doing is some kind of KNN, and if the pictures have location labels associated applying some sort of weighted average to determine GPS coordinates. This is pretty cool.

I also tried flipping the image horizontally, and it came back with the same images. So their embedding isn't based off of exact matches (good) and seems to be invariant to some basic transformations such as mirroring (good). It also seems like it's directly extracting visual features from the image. This can be done with something like BLIP[0].

Then I uploaded a screenshot from Magic School Bus. It still extracted information to guess the "location" of the cartoon (San Francisco, which is wrong). So that's probably how it works.

I also found the text output is similar in some ways with OpenGVLab InternViT [1]. So perhaps this or something like it is being used to extract features.

And of course there may be an LLM on top of these extracted features with some sort of prompt template. But I should add that the text explanation is the least useful part of the result, since it is unreliable and less informative than the "boring" similarity metrics above.

[0] https://huggingface.co/Salesforce/blip-image-captioning-larg...

[1] https://internvl.opengvlab.com/
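The nearest-neighbour guess described above could look roughly like this. It is only a sketch of the idea, not the site's actual pipeline: the embeddings are toy 2-D vectors, the index is three hypothetical labelled photos, and the function name `knn_geolocate` is made up. It takes the k most similar reference images by cosine similarity and averages their coordinates, weighted by similarity.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def knn_geolocate(query, index, k=3):
    """index is a list of (embedding, (lat, lon)) pairs with known locations."""
    scored = sorted(
        ((cosine(query, emb), loc) for emb, loc in index),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    # Clamp negative similarities so they cannot act as negative weights.
    weights = [max(sim, 0.0) for sim, _ in scored]
    total = sum(weights)
    if total == 0:
        return scored[0][1]  # nothing is similar: fall back to the top hit
    lat = sum(w * loc[0] for w, (_, loc) in zip(weights, scored)) / total
    lon = sum(w * loc[1] for w, (_, loc) in zip(weights, scored)) / total
    return lat, lon

# Hypothetical reference set: two photos near Austin, one near Zurich.
index = [
    ([1.0, 0.0], (30.27, -97.74)),
    ([0.9, 0.1], (30.30, -97.70)),
    ([0.0, 1.0], (47.37, 8.54)),
]
lat, lon = knn_geolocate([1.0, 0.05], index, k=2)
print(lat, lon)
```

Averaging raw latitude and longitude is only reasonable when the neighbours are close together; it breaks across the antimeridian, and when the neighbours are on different continents the average lands somewhere meaningless, which would be consistent with some of the odd pins people report in this thread.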

1970-01-01

15d

It's heavily biased and therefore easily tricked. I uploaded a photo from NYC.

     The graffiti on the wall is a clue that the photo was taken in Detroit. The vegetation in the background is also consistent with the climate of Detroit.

KeplerBoy

14d

It's also a ridiculously hard task.

ToucanLoucan

14d

Yeah maybe just don't do it then? If someone removes the EXIF data from a photo there's probably a reason for that, and assuming that's suspicious in some way is pretty ridiculous in a society that's supposedly all about personal freedom and the right to a fair trial.

I don't mean to be aggressive here but this seems like yet another tool that will be abused to shit by already powerful people to do even sketchier things.

KeplerBoy

14d

I agree, this project probably shouldn't exist. But oh well, here we are, and stuff like this can be built with reasonable effort. Scrape Google Street View and every EXIF-tagged image you can get your hands on and get training.

I have no idea where this is heading, but we aren't turning back.

surfingdino

15d

It is looking for distinct features in the photo and does a probabilistic match against a tagged dataset. The features that match best on the tagged photos in its dataset are used to construct output that looks like a plausible answer. Don't use it to plan your trip.

usaar333

14d

Yeah, gave it a photo of a beach on Lake Ontario with two Asian friends of mine in it. Guessed.. Japan

andoando

14d

I uploaded a photo of a screenshot of a chess game on chess.com

It identified it as the Golden Gate Bridge in San Francisco, saying the buildings in the background are also consistent with the architecture in San Fran.

sanxiyn

15d

This correctly identifies South Korean landmarks, like Diamond Bridge in Busan. Since I don't have encyclopedic knowledge of world landmarks -- I wouldn't be able to recognize Diamond Bridge-like landmarks in United States -- and nearly no one does either, that alone is quite useful.

jononor

14d

Google reverse image search will do that too, and has been working for 10+ years now?

sebzim4500

15d

Pretty cool. Correctly identifies different islands in the Galapagos based on the ground and the plants.

erksa

15d

This is close to an actual need I've managed to create for myself.

I do photography and I store those I want to share on nextcloud. In my selection and export process all metadata etc is stripped. But I realized too late that it also stripped out the geo-coordinates. No problem adding that in, but still have a laaarge amount of photos without geolocation data.

I'm too lazy to re-export all the older ones, so being able to run something like this on them would be perfect. I would be satisfied with a general area, roughly hitting the province/state it's taken in. It doesn't have to be accurate at all, it's more for my own geo grouping.

This site though goes bananas on firefox/mac. Flickering and font adjustments..

CaptainOfCoit

15d

> I'm too lazy to re-export all the older ones, so being able to run something like this on them would be perfect. I would be satisfied with a general area, roughly hitting the province/state it's taken in. It doesn't have to be accurate at all, it's more for my own geo grouping.

I don't think this is even close to being accurate enough to be used in this way. Out of ~10 images I uploaded it got one "correct" (right country, wrong city). Unless you want all your images to be geo-tagged "Somewhere, US", it's probably better to re-export/re-import with your original metadata.

erksa

14d

That's fair. I couldn't even get this to work, so I'm not looking at this implementation in particular. I was literally just wondering whether this would be viable as an approach, so it was fun to see something that tries to fit the bill!

solardev

14d

If you still have the original photos, maybe you can write a script to run both the originals and exports against a perceptual hash (so as to easily identify the correct original) and then just update the JPEG EXIF data of the exports?

https://github.com/JohannesBuchner/imagehash

Depending on specific formats, you should be able to read and edit metadata without having to reprocess the images. If the exports are named similarly to the originals, you don't even need to hash them.
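A rough sketch of the matching step, with loud caveats: to keep it dependency-free it implements a tiny average hash over grayscale pixel grids held in plain lists, standing in for what the linked imagehash library does on real image files. The file names, the 4x4 grids and the `max_distance` threshold are all hypothetical, and the step that copies the GPS tags from the matched original onto the export is left out.

```python
def average_hash(pixels):
    """Hash a small grayscale grid: one bit per pixel, set if above the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

def match_exports(originals, exports, max_distance=5):
    """Map each export name to the closest original name, or None if too far."""
    orig_hashes = {name: average_hash(px) for name, px in originals.items()}
    matches = {}
    for name, px in exports.items():
        h = average_hash(px)
        best = min(orig_hashes, key=lambda o: hamming(h, orig_hashes[o]))
        close_enough = hamming(h, orig_hashes[best]) <= max_distance
        matches[name] = best if close_enough else None
    return matches

# Toy data: a horizontal gradient, a vertical gradient, and a slightly
# brightened copy of the first standing in for a re-encoded export.
a = [[0, 50, 200, 250] for _ in range(4)]
b = [[0] * 4, [50] * 4, [200] * 4, [250] * 4]
a_export = [[p + 3 for p in row] for row in a]

matches = match_exports(
    {"a_original.jpg": a, "b_original.jpg": b},
    {"a_export.jpg": a_export},
)
print(matches)
```

Because the hash compares each pixel to the image's own mean, a uniform brightness shift leaves it unchanged, which is the property that makes this kind of hash tolerant of re-encoding. Heavy crops or rotations would still defeat it.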

wongarsu

15d

Seems to be about as accurate as a good geoguessr player on a time limit. Recognizable vistas are generally right down to the city, and even if there's only general architecture to go off it's often right to within a couple hundred kilometers.

The explanations are a bit hit and miss. Some are great and correctly describe the names of buildings in the picture, some are only vaguely related to the picture.

Ethically this is very questionable. Of course with enough dedication humans can do the same (e.g. Rainbolt has made a Youtube career out of this), but commoditizing this for every stalker around the world has some troubling implications.

surfingdino

15d

Ethics is absent in the minds of the people building and financing this. AI is about wholesale value extraction and destruction of competition done by the ecosystem of small startups repackaging AI APIs. Those APIs will be turned off once ad revenue starts flowing into the bank accounts of AI API providers.

coumbaya

15d

My backyard: Germany because there are trees and a fence (? Also, no). A picture of the farmer's market of my town: correctly assume France but confidently incorrect on the town and landmarks (off by 200-300km I'd say).

fragmede

14d

The question is, how does this do at GeoGuessr, where users are given a picture from Google Street View and are asked where it is in the world by clicking on a world map? Users get points based on how close their guess is, and the user with the most points after N rounds wins.

The best player in the world, Rainbolt, played against an AI out of Stanford, so I wonder how this one would do.

https://www.geoguessr.com/

https://www.youtube.com/watch?v=ts5lPDV--cU
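For comparing guesses like the ones in this thread ("off by 200-300km"), the distance part is straightforward: the haversine formula gives the great-circle distance between the guessed and true coordinates. A minimal sketch follows. The `score` function is an invented exponential decay used purely for illustration; it is not GeoGuessr's actual scoring formula, and `max_points` and `scale_km` are made-up parameters.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

def score(distance_km, max_points=5000, scale_km=1500.0):
    """Hypothetical distance-based score: full points at 0 km, decaying after."""
    return max_points * math.exp(-distance_km / scale_km)

# Example: a guess in Dallas for a photo actually taken in Austin.
austin = (30.2672, -97.7431)
dallas = (32.7767, -96.7970)
error_km = haversine_km(*austin, *dallas)
print(round(error_km), round(score(error_km)))
```

Running a fixed set of photos with known coordinates through the site and reporting the median `error_km` would give a more comparable number than the anecdotes here.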

amarcheschi

15d

I put in an image of Villa Isnard, Cascina, Pisa, and it was recognized as being from France. It is a villa with French architecture and olive trees. I then tried to upload another image from Pisa, with a building in Venetian Gothic style, and it was recognized as being in Venice. It can be deceived quite easily imho. It looks like it just searches for the corresponding architecture (maybe?) and the details surrounding it, but it doesn't search online. Villa Isnard is quite famous (at least, you get results online) and a Google Lens search would have found it.

j-bos

15d

Yeah, I would guess it's identifying elements in the photo, qualifying the likelihood of those combined elements in a particular location, then outputting the assumption. I posted a picture of a Texas lake seen from a private residence, and it correctly guessed Texas, but pushed it off by a couple hundred miles into an Austin golf course.

sanxiyn

15d

It seems to use many signals, at least according to its own explanation. For example it looks at road signs and license plates to identify countries.

voidUpdate

15d

I'm not convinced by the quality of this. I took some screenshots of street view, not including any icons, and it identified them as completely the wrong city. One of them included the name of the town on a bus stop, which it completely failed to identify, placing the picture across the county and also asserting that it contained features that it definitely didn't, such as thatched rooves (all rooves in the image were normal slate). I would maybe trust it to get me in the correct area of the country, but that's about it.

voidUpdate

15d

After a bit more testing, it could successfully identify Buckingham Palace and St Michael's Mount (although the location wasn't great), but a street overlooking a beach in Cornwall was marked as Wales, apparently including house numbers in Welsh and English (despite Welsh using English numerals). It seems to work somewhat ok if there is a clear image of an obvious and distinctive monument; otherwise it isn't particularly accurate.

tapland

15d

I got similar results, and that would be ok if it didn't sound so confident with the guess.

axegon_

15d

Assuming this is more of a proof of concept/prototype, that's not bad. It didn't get it[1] right, but at its core, the guess is not terrible: shift it 560km[2] south-east and you'd be bang on. I'll admit, I did set the bar a bit high.

[1] https://imgur.com/a/67t0TVt

[2] https://imgur.com/a/aspf8px

exar0815

15d

It told me the correct country, but the completely wrong city, and then began to describe a typical place in the style of the country - nothing of which was visible on the image.

boxed

15d

> Sweden

Good

> Rural area

good

> [pin in the center of Stockholm, the most urban area in Sweden]

ouch.. not so good.

poulpy123

14d

To be fair, they say they provide the coordinates for the city or the town, so if it's not too far from Stockholm I would count it as right.

boxed

14d

It wasn't. Not even close.

poulpy123

14d

It recognized 2/7 of the pictures I used. The two successes are a really well known place in Rome (the Roman Forum, although it got the arch wrong) and a small but very touristic city. It guessed the country right but was far from the place in 4 cases: landmarks were visible but they are not hugely touristic. In the last case the country was wrong, but it was a picture from my office with no landmark.

itslennysfault

14d

I've been learning deep learning, and I built a very toy version of this recently. It's really just a classifier that can (maybe) tell you if a photo was taken in one of the 5 cities I trained it on.

https://huggingface.co/spaces/itslenny/fastai-lesson-2-big-m...

karaterobot

14d

I had ChatGPT generate some selfies taken in various places, then ran it through this app. My assumption was that this app would do really well, since one model would identify the stereotypical features generated by the other model. It got 1/3. It nailed Minneapolis, it got Damascus, Syria wrong (said Amman, Jordan), and it got the Ballard neighborhood of Seattle wrong (said San Francisco).

orf

12d

> Explanation: The image is of a residential street in London. The architecture of the buildings is典型的英国风格, and the vegetation is consistent with that of London. The street is also relatively narrow, which is common in London.

Not sure about the mix of languages here. It’s correct, but not specific.

ethanholt1

14d

It was correctly able to identify several photos of my vacation to NC, down to the exact location where the photo was taken on the hiking trail. Pretty scary. Additionally, just to be sure, I used an EXIF data wiper to make sure it wasn’t pulling data from there and tried each photo in a separate Incognito instance. Still got it correct, all 3 times. Mind boggling.

teakie

14d

how?

boesboes

15d

It's not very accurate, but it seems consistent. However it quite often tells me 'this is in X because the language on the sign' when there are no signs at all. Or just now I got 'The house in the background is made of wood, which is a common building material in Finland.' with a photo of a lake. There is no house, there are trees though :)

glonq

14d

I gave it a snippet of Montevideo city skyline, and it responded with:

The photo was taken from a rooftop in Buenos Aires, Argentina. The photo shows a clear view of the city's skyline, including the iconic Obelisk of Buenos Aires. The buildings in the photo are characteristic of Buenos Aires' architecture and the vegetation is typical of the region.

abnry

14d

I took images directly from Google Images search and it got them wrong. But it was sort of directionally right. My local city hall it said was the courthouse in the same county. The local bridge was put into a wrong state.

Interestingly, it provided reference images and the images I posted were basically in the reference images.

timnetworks

14d

Googled 'vacation photo' and picked some off of Flickr. The locations matched the captions, State and Country (FLA and Cancun) correctly.

Obviously uploading a picture of a hot dog will waste compute on trying to figure out what kind of traffic the ketchup is, but it works with snapshots great (not stock photos)

elsadek

15d

Childish AI. I gave it two photos; both were totally wrong and hundreds of thousands of miles from the actual locations.

Don't recommend.

erkkonet

15d

Funny enough it was accurate all while citing items that were not in the picture (not even cropped out), like tall buildings and signs in a specific language. I'm sure there will be refined versions that are scarily accurate. Another OSINT tool for better or worse.

mightytravels

14d

Claude Vision can do that for you if you are building something similar. Had similar results with OpenAI.

https://docs.anthropic.com/claude/docs/vision

bigwheeler

14d

I uploaded a photo of a helicopter cockpit, flying up the Talkeetna River in Alaska, which was very out of focus in the background. It knew exactly where I was, and even mentioned I was in a helicopter. And that it was fall!

nirav72

15d

I uploaded a couple of pics of a beach in Turks and Caicos. It came back with a beach in the Bahamas. Not even close. But I suppose close enough geographically. Also, a pic taken from a stationary train in Chicago came back as NYC.

karma_pharmer

15d

FirebaseError: Installations: Create Installation request failed with error "400 INVALID_ARGUMENT: API key expired. Please renew the API key." (installations/request-failed).

Loranubi

15d

I uploaded a drone shot of New Taipei City and it gave me Taipei. Close enough. I don’t know if it was cheating though because the image had exif gps coords embedded…

The site worked fine on Firefox on iOS.

trebligdivad

15d

Hmm interesting; one hit, one probably in vaguely the right area; both from scans of ~40 year old photos. (As someone else noted, the site is rather broken on Firefox/Linux but does work.)

pphysch

14d

Why use an LLM for this? You'd definitely want a large model, but this seems like a more straightforward classification problem that doesn't require understanding of language.

dmd

15d

I'm blown away; it correctly identified a photo taken inside my house - just a picture of my kitchen - as being in eastern Massachusetts, just based on the architecture.

btasker

14d

I didn't have the same luck.

I gave it a photo from inside a house, you can see a person on the bed, and the white wall behind - that's it.

Obviously I wasn't expecting an accurate location, but

> This photo was taken in Los Angeles, California. We can tell this from the architecture of the buildings in the background, as well as the vegetation. The palm trees are a dead giveaway that this is Los Angeles.

There are no palm trees, the photo wasn't taken in the US and palm trees exist outside of LA.

I also fed a photo of some quite distinctive castle ruins. It mislocated that by 100s of miles.

CaptainOfCoit

15d

Maybe it does a lot better with photos related to the US. The training set probably contained mostly US-related images, as only one image out of ~10 taken in various European places was correctly guessed for me. Most of the guesses were places in the US, while none of the images I tried were from the US.

fer

14d

There's a certain training set bias. Most pictures from post-Soviet states land in Moscow for me.

Zambyte

14d

I wonder if it looks at EXIF data at all.

dmd

14d

I stripped date and geo info.
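For anyone who wants to rule out the EXIF explanation themselves before uploading, here is a minimal stdlib-only sketch of what a basic "EXIF wiper" does to a JPEG: walk the marker segments and drop every APP1 segment, which is where Exif (including GPS tags) and XMP metadata normally live. The function name `strip_app1` is made up, the parser assumes a well-formed file, and it ignores other places metadata can hide (other APPn segments, embedded thumbnails, non-JPEG formats).

```python
import struct

def strip_app1(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG byte string with all APP1 segments removed."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(jpeg[:2])
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy the rest as-is.
            out += jpeg[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if marker != 0xE1:  # keep everything except APP1
            out += jpeg[pos:pos + 2 + length]
        pos += 2 + length
    out += jpeg[pos:]
    return bytes(out)

# Synthetic stand-in for a real file: SOI, APP0, APP1 (fake Exif), scan, EOI.
soi = b"\xff\xd8"
app0 = b"\xff\xe0" + struct.pack(">H", 7) + b"JFIF\x00"
app1 = b"\xff\xe1" + struct.pack(">H", 12) + b"Exif\x00\x00GPS!"
scan = b"\xff\xda\x00\x02\x12\x34\xff\xd9"
fake_jpeg = soi + app0 + app1 + scan

cleaned = strip_app1(fake_jpeg)
print(len(fake_jpeg), len(cleaned))
```

If the site still returns the same location for the cleaned bytes, the answer is not coming from the embedded GPS tags, which matches what a couple of people in this thread report.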

Sporktacular

14d

It's funny that with a direct match to a precisely located photo in its database, it got the country right by comparing the architectural style, but still got the city wrong.

jimlawruk

15d

I put in a photo of a small lake near Truckee, CA, several miles from Lake Tahoe, and it reported it was Lake Tahoe. It was wrong, but I was impressed it was geographically very close.

iLoveOncall

15d

Now try with a photo of a lake that's on the other side of the world compared to Lake Tahoe and see if it doesn't also report it as Lake Tahoe.

ghastmaster

14d

> The photo was taken from a tall structure, possibly a fire tower.

It was way off on the state, but I am still impressed with that spot on description. It was taken from a fire tower.

bambax

15d

This is what happens when I try to scroll to read the results...?

https://i.imgur.com/ywc1Hn0.png

(Chrome on Windows)

Alifatisk

14d

Reminds me of a research paper that used AI to accurately pinpoint where a picture was taken. I had hoped this was it. But it’s better than nothing.

K0balt

14d

I uploaded a nondescript scenery photo with no non-natural cues from the Dominican Republic, it got the general area right within 50km.

kome

15d

Interesting concept, and it works somehow. But they definitely need better web developers. Very strange flickering, what the hell is that?

rnewme

14d

Even though it missed the town by a few kilometers, it also recognized my wife's dress and linked to the webstore for it.

underlogic

14d

I think it just uses EXIF data, then makes guesses for other photos from the same IP. Fake it till you make it.

mufty

15d

I’m not sure how this works under the hood. My initial observation is it does not work.

vel0city

14d

Uploaded a photo of the Bell Centre. Easy Habs logo on it. City location: Toronto.

stainablesteel

14d

it seems to be a rule of thumb that you can pick a subreddit that operated as some kind of service, ie r/whereisthis, and replace that entire apparatus with an ai of some kind

pt_PT_guy

15d

Is the website glitchy on firefox?

DominoTree

14d

Very - never seen anything explode quite like it

eru

15d

> I'm sorry but GeoSpy is not allowed to process this image. Please try again with a different image or contact support at info@graylark.io

That was with an image I took in London on my phone.

onemoresoop

14d

Entirely wrong result.

noashavit

14d

scary accurate

salade_pissoir

14d

This seems like another Hotdog/Not Hotdog business model.

lkramer

15d

The page is very broken for me (Firefox in Linux), locking up, flickering.

I did manage to get it to place a picture of a praying mantis I took in Japan to be from California...

eru

15d

I get the same flickering in Firefox on MacOS, but it managed to recognise my picture.

mhuffman

14d

Same, Firefox in linux, also tried with all extensions disabled and same thing. Just flickering with an "error" modal.

lordswork

15d

I got "an error has occurred"

ape4

15d

Me too - maybe it's overloaded
