Dev Notes 06: Street Names and Color Optimization

By . . Version 0.1.0

Dev Notes document daily progress towards a larger article. Discussion is preliminary.

This note is the first in an exploration of US streets named after US states. It's pretty common for a given state to name several streets after itself. For example, in San Francisco there is California Street, which has a glorious view (and its own Wikipedia article). However, it's not just California in CA. There's also a whole little section of San Francisco where all the streets are named after other states.

SF map image (screenshot via Mapbox)

Looking at San Francisco street names, there are 30 streets with state names (28 unique state names, with California and Washington each appearing twice). So unfortunately 22 states didn't make the San Francisco invite list.

Thinking about this more broadly, I was curious about exploring "State Street"s across the country. Which state gets the most shout-outs? Which state is the most "egotistical", naming streets after itself, and which is the most "humble", preferring instead to honor others? I set out to download every street in America and get to the bottom of this.

Downloading and Processing Data

As a data source I used OpenStreetMap. This is like Wikipedia, but for maps. People from around the world help keep street maps up to date, and the data is then later adapted into products like Mapbox (as one data source among many).

The map data for the entire country is quite large to download at once, so instead I started state by state. Delaware is a reasonably sized state that was more manageable than California, so I started there.

Processing The Data

OpenStreetMap data comes as a large sequence of nodes and references. My first step was processing it to extract the streets. OpenStreetMap data has what are called "Ways", which represent lines such as roads or boundaries such as lake shores. Some processing was needed to pull out the ways and handle cases where the same street was split into multiple "Ways" (e.g., because it split at an intersection where the speed limit changed).
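As a rough illustration of this step, here is a minimal sketch that reads OSM XML, keeps only the ways tagged as roads, and groups split segments under one street name. It is not the exact pipeline used for these notes. The sample XML is hand-written for the example, the names named_streets, state_of, and STATES are hypothetical, and grouping purely by name is a simplifying assumption (two unrelated streets that share a name in different towns would be merged).

```python
import io
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Tiny hand-written OSM XML extract, for illustration only (not real data).
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="39.7400" lon="-75.5500"/>
  <node id="2" lat="39.7410" lon="-75.5490"/>
  <node id="3" lat="39.7420" lon="-75.5480"/>
  <node id="4" lat="39.7500" lon="-75.5600"/>
  <node id="5" lat="39.7510" lon="-75.5610"/>
  <way id="100">
    <nd ref="1"/><nd ref="2"/>
    <tag k="highway" v="residential"/>
    <tag k="name" v="Delaware Avenue"/>
  </way>
  <way id="101">
    <nd ref="2"/><nd ref="3"/>
    <tag k="highway" v="secondary"/>
    <tag k="name" v="Delaware Avenue"/>
  </way>
  <way id="102">
    <nd ref="4"/><nd ref="5"/>
    <tag k="natural" v="water"/>
    <tag k="name" v="Silver Lake"/>
  </way>
</osm>
"""

# Illustrative subset only; a real run would list all 50 states.
STATES = ["Delaware", "California", "Washington"]


def named_streets(osm_file):
    """Map street name -> list of way segments (each a list of node ids)."""
    streets = defaultdict(list)
    for _, elem in ET.iterparse(osm_file, events=("end",)):
        if elem.tag != "way":
            continue
        tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
        # Roads carry a "highway" tag; lakes and other boundaries do not.
        if "highway" in tags and "name" in tags:
            refs = [nd.get("ref") for nd in elem.findall("nd")]
            streets[tags["name"]].append(refs)
        elem.clear()  # drop the processed way's children to save memory
    return dict(streets)


def state_of(street_name):
    """Return the state a street name starts with, or None if no match."""
    for state in STATES:
        if re.match(rf"{re.escape(state)}\b", street_name):
            return state
    return None


streets = named_streets(io.BytesIO(SAMPLE_OSM.encode("utf-8")))
state_streets = {name: state_of(name) for name in streets if state_of(name)}
```

Here the two "Delaware Avenue" segments end up under a single key, and the lake is dropped because it has no highway tag. Matching on the start of the name is also only a heuristic: it cannot tell whether a "Washington Street" honors the state or the president.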

Color Optimization

I want to plot every "State street" on a map to be able to see where they cluster. This raises an interesting question of how to mark them. There are 50 states, so making every one a different color is challenging. There is an interesting problem around how to optimally assign a color to every state. We could just choose from a fixed palette, but these typically only have around 20 categorical colors at most, so there would be duplicates. We could also use algorithms which optimize 50 colors to be as visually distinct as possible. This is what I did for a first pass. However, I think there is a really cool optimization problem here where you try to make colors visually distinct while taking into account states that frequently group together. The most common states would also want particularly distinct colors. There's a lot of interesting math and algorithmic challenge here which I hope to write up for the article, but for now it is left as future work.
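To make the "visually distinct" first pass concrete, here is one possible sketch. It is not necessarily the algorithm used for the screenshot in these notes. It assumes a greedy farthest-point selection: starting from black, it repeatedly picks the candidate color whose nearest already-chosen color is farthest away, measuring distance as Euclidean distance in CIELAB (the CIE76 color difference). The function names and the coarse RGB candidate grid are illustrative choices.

```python
from itertools import product


def srgb_to_lab(rgb):
    """Convert an sRGB triple in [0, 1] to CIELAB (D65 white point)."""
    def to_linear(c):
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (to_linear(c) for c in rgb)
    # Linear RGB -> XYZ, normalized by the D65 reference white.
    x = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047
    y = (0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / 1.0
    z = (0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x), f(y), f(z)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def dist(p, q):
    """Euclidean distance between two Lab triples."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def distinct_colors(n, levels=6):
    """Greedily pick n RGB colors from a levels^3 grid, maximizing the
    minimum Lab distance to the colors already chosen."""
    steps = [i / (levels - 1) for i in range(levels)]
    candidates = list(product(steps, repeat=3))
    if n > len(candidates):
        raise ValueError("not enough candidate colors; increase levels")
    labs = [srgb_to_lab(c) for c in candidates]

    chosen = [0]  # index 0 is (0, 0, 0), i.e. black
    # nearest[i] = distance from candidate i to its closest chosen color
    nearest = [dist(lab, labs[0]) for lab in labs]
    while len(chosen) < n:
        best = max(range(len(candidates)), key=nearest.__getitem__)
        chosen.append(best)
        nearest = [min(d, dist(lab, labs[best])) for d, lab in zip(nearest, labs)]
    return [candidates[i] for i in chosen]


def to_hex(rgb):
    """Format an RGB triple in [0, 1] as a #rrggbb string."""
    return "#" + "".join(f"{round(c * 255):02x}" for c in rgb)


palette = [to_hex(c) for c in distinct_colors(50)]
```

The greedy pass treats every state equally. The more interesting version described above would presumably weight the distances, for example by how often two states appear near each other, which this sketch does not attempt.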

Plotting

Here is a screenshot of this for Delaware:

delaware

It can also be interactive, but these notes are unfortunately rushed and I need to do a bit of infrastructure work to be able to nicely embed the interactive map on this site. Also, this screenshot uses the first-pass colors, not the optimized ones.

Conclusions

Here we began to explore street names. These notes are rough and rambly (written quickly at the end of the day because of a few other things going on). However, they lay out much of the background for what I think will be a really cool visualization and exploration. I'm curious to see what it ends up being.

Tomorrow will be a short post about a different topic due to other things going on that day. However, I hope to have more state data to share on Sunday or Monday.