I wrote some programs to go through 6 GB of OpenStreetMap data from http://metro.teczno.com so that I could extract a list of street names for an upcoming game project. The game will use procedural generation to create cities, so I needed a dataset of street names but couldn’t easily find one. I ended up creating this one and wanted to share it.
I did a lot of tweaking to remove duplicates. Each street name is on its own line, and you can just add “Rd”, “St”, “Ln”, “Ave”, “Blvd”, “Pkwy” or any other suffix to the end of it. The zip file contains one file of street names per city, plus an allstreets.txt that combines all of them into one file (with duplicates removed). Numbered streets have been removed (there is no “7th”, but there might be “Seventh”).
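If you want to turn the list into full street names, here is a minimal Python sketch of the "pick a name, add a suffix" idea. It is just one way to do it and not the code I used to build the dataset; the function names `parse_names` and `random_street` are made up for this example, and the suffix list is simply the one mentioned above.

```python
import random

# Suffixes mentioned in the post; extend this list as needed.
SUFFIXES = ["Rd", "St", "Ln", "Ave", "Blvd", "Pkwy"]


def parse_names(lines):
    """Return the non-blank lines with surrounding whitespace stripped.

    `lines` can be any iterable of strings, such as an open text file.
    """
    return [line.strip() for line in lines if line.strip()]


def random_street(names, rng=random):
    """Combine a randomly chosen base name with a randomly chosen suffix."""
    return f"{rng.choice(names)} {rng.choice(SUFFIXES)}"
```

Assuming allstreets.txt has been extracted next to your script, you could open it with `open("allstreets.txt", encoding="utf-8")`, pass the file object to `parse_names`, and then call `random_street` on the result as many times as you need. Passing a seeded `random.Random` instance as `rng` makes the output repeatable, which can be handy for procedural generation.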
The street data comes from Boston, Chicago, Leeds, London, Manchester, St. Paul, New York, Seattle, the San Francisco Bay Area, Sydney, and DC, so you can expect mostly Anglo names.
Then, after looking at the data for a while, I realized that these names could also be used as Anglo last names. I’ve removed any words that appear in a dictionary file I have, so some common last names like “Smith” or “Hunting” won’t show up. I would consider this list to be of moderate quality. Here’s the list:
I can’t vouch for the quality of the lists, but from a cursory inspection they seem quite serviceable. Enjoy!