Which State Has The Biggest Cartographic Ego?: Trends in US Streets named after US States

Version 0.3.0

Abstract: Street names can reveal what is important to a region. We explore patterns and visualizations of US street names. After looking at broad general trends in names (such as the most common street names), we narrow in on streets that include state names. We estimate there are 47,610 streets with a state name, with Washington in an understandable lead. There are ~94 streets with "North Dakota" in their name, but none of these streets are actually in North Dakota. Meanwhile, about 39% of the ~700 roads with "Wisconsin" in the name are in Wisconsin. There are several conflicting ways to compare usage of state names making exact ordering only a toy. We discuss these trends and how this analysis was done.

Introduction

Growing up in Texas, everyone knew Texas was the best state. We had Texas Hold’em, Texas Toast, Texas Longhorns, and more. But then fate took me west to California, a golden land which knows it is great, but perhaps shows less obvious signs of it. Sometimes though there are literal signs. Stretching across San Francisco there’s “California Street”, which can be stunningly beautiful, with rolling hills and the skyline framing the Bay Bridge.

A view down California Street. Image credit KennyOMG on Wikipedia CC-BY-SA 4.0

But then head to the east part of the city, and, wait a second? Is that Texas St? And there's Vermont? And Mississippi?

San Francisco's Potrero Hill Neighborhood. Map image via Mapbox.

Apparently there are 30 streets named after states in San Francisco, California (28 unique states). Unfortunately 22 states didn't make the SF invite list.

So California has a bunch of "State Sts". We define a State St as a roadway with a name that includes any state, such as "Nevada Lane", "Little Florida Avenue", or "Pennsylvania Run Road" (not to be confused with the 1,335 US streets literally named "State Street"). Is this unusual? Which state has a Texas-sized ego, naming many streets after itself? And which is most humble, preferring instead to honor others? We download over 10GB of OpenStreetMap data and parse through it to explore this.

We study the following research questions:

  • Which state name is most popular? We find Washington, with its presidential heritage, dominates the count (9,535 streets). After Washington, VA and OH lead. (Section 2)
  • Which state is "most street egotistical"? These states frequently name streets after themselves. There are different conflicting ways to measure this, but states WI, KS, and IA lead. (Section 3)
  • Which state is "most street humble", not using their name or naming streets after other states? ND, SC, and NH appear to use their own names relatively rarely. (Section 3)

There's a decent amount of nuance and complexity here. For example, what counts as a street (probably trails don't count, but do service roads)? What happens if a big park or neighborhood is built splitting streets into multiple unconnected parts? We try to make reasonable choices, and note our methodology in Section 4 as well as make our source code available. However, the analysis should be considered an estimate, not exact numbers.

First, we gain an understanding of our streets data. After some processing, we have about 6 million streets. We start by looking at streets of any name (they don't have to be named for a state), before narrowing in on the counts of the "State Sts".

Overall Street Names

Figure 1. The 10 most common street names in the US.

Counting Every Street. In Figure 1 we plot the 10 most common street names. We find Main St is most common. Numbered streets make up four of the top ten. Nature-themed streets sprout up in another four spots. Church Street is the only place-themed street in the top ten.

If you Google "what is the most common street name in the US?" you will get 2nd St in the summary result. It echoes a common explanation that Main St and 1st St compete with each other, so 2nd St wins out. This does not replicate in our data. The top source here seems to be from a 1993 census report. Chalabi (2014) reports investigating this and got in touch with the researcher who made the 1993 report, but the researcher couldn't remember their methodology.

Our analysis has several advantages, being more recent as well as repeatable.

Figure 2. The most common words in street names in the US.

Most Common Word. Next we split every name into its individual words and count them. The top values are shown in Figure 2.

Interestingly, "road" wins out over "street". "North" is the most common direction mentioned.
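The word count can be sketched in a few lines. This is a minimal illustration, assuming names are lowercased and split on whitespace (the exact tokenization in our pipeline may differ); `word_counts` and the sample names are hypothetical, not real data.

```python
from collections import Counter


def word_counts(street_names):
    """Count how often each lowercased word appears across street names."""
    counts = Counter()
    for name in street_names:
        counts.update(name.lower().split())
    return counts


# Hypothetical sample names, for illustration only.
sample = ["Main Street", "Oak Road", "North Main Street", "Park Road", "Mill Road"]
top_words = word_counts(sample).most_common(2)
print(top_words)
```

On the real data, the same kind of tally is what puts "road" ahead of "street".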

Unusual Words Per State. Next we look at which words are unusual in the street names. We use an algorithm (TF-IDF) that search engines use to find keywords. This algorithm identifies words that appear in the streets of a given state while also being unusual in other states. We plot the top two street keywords for every state in Figure 3. Additional methodology is in Section 6.

Figure 3. Top two distinctive keywords in street names for each state.

We can observe some patterns. For example, Spanish words like "calle" and "de" appear in CA, AZ, and NM. "Lake" is a top choice in states like WI and MI, while New Hampshire only has "pond". Some keywords, like "cemetery" in KY, are a bit more mysterious.

Streets with State Names

Next we look at streets where a state name occurs anywhere in the name. In total there are 47,610 roads with state names.

Figure 4. The most and least common state names in street names. The bar color is split by occurrences of the roads in the state vs out of the state.

Which States Are Most Common? In Figure 4 we plot the most and least common state names that appear in streets.

We begin to see some complexity in this question. 9,535 roads have "Washington" in the name, dominating the count. However, it has an obvious confounder, as it shares a name with George Washington. Somewhat interestingly, Washington's home state of Virginia comes in as the second most common state. While not a famous surname, it is a common given name. Virginia supposedly was originally named after Queen Elizabeth I, the "Virgin Queen", and is more than twice as popular in street names as Georgia (named for King George II). For less clear reasons, Ohio comes in third (its name comes from a Native American word for 'Great River'). We might speculate that this is partly because it is one of the shortest state names, making it easy to reuse, and partly because it was a major hub during the 1800s westward expansion in the U.S.

On the bottom end, the multiword states such as North and South Dakota appear fewer than 100 times. Notably, though, we require a full match of the entire state name. There are 1,451 streets that include "Dakota" with no North/South mentioned. The same nuance applies to streets with names like York, Carolina, etc. Is this the right way to handle multiword states? 🤷‍♂️. It is at least straightforward to define.
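The full-match rule can be illustrated with a small sketch. It assumes case-insensitive, whole-word matching and uses only a handful of states; `STATES` and `states_in_name` are hypothetical names for illustration and not necessarily how our code is organized.

```python
import re

# A small subset of state names, for illustration only.
STATES = ["North Dakota", "South Dakota", "New York", "Ohio", "Washington"]
CANONICAL = {s.lower(): s for s in STATES}

# Whole-word match on the complete state name, longest names first.
STATE_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(s) for s in sorted(STATES, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)


def states_in_name(street_name):
    """Return the sorted list of full state names found in a street name."""
    found = {CANONICAL[m.group(1).lower()] for m in STATE_PATTERN.finditer(street_name)}
    return sorted(found)


print(states_in_name("North Dakota Street"))
print(states_in_name("Dakota Avenue"))
```

Under this rule "Dakota Avenue" and "York Road" match nothing, which is why the multiword states end up with such low counts.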

Which States are "Egotistical" or "Humble"?

Figure 5. Top and bottom states ranked by percentage in-state fraction. States where most occurrences of the name are within their borders rank highly.
Figure 6. Top and bottom states ranked by fraction of all of their streets that include their name.

Next we explore how often states name streets after themselves. We investigate three ways of quantifying this "egotistical" vs "humble" factor.

From one perspective, we could look at the fraction of the roads with a state's name that are in the state vs out of the state. A state that frequently names roads after itself while no other state does so would rank highly here. This is shown in the split bars of Figure 4, and is shown directly as a percentage in Figure 5.

Here we observe WI and TX on top. WI was a surprising state for the top spot.

Another way to look at this is to take all the streets in the entire state (of any name) and ask what fraction of them include the state name. This is shown in Figure 6.

Figure 7. Top and bottom states ranked by fraction of state-named streets that are self-named (named after that state) versus other-named (named after other states).

Here we observe Kansas on top. Roughly 1 in 220 streets in Kansas has the state in the name. In general this metric tends to inflate the score of states with few streets: every state reasonably has a few key streets named after itself, and having few streets overall makes those few a higher percentage. WI again appears in the top 7, but TX, with its many streets, does not rank highly on this metric.

Finally we examine self-naming vs other-naming. Perhaps some of the states with a lot of in-state references just have a lot of streets named after states? We compare, for each state, the number of streets named after itself to the number named after other states. This is shown in Figure 7.

Here we see Hawaii leads with over 40% of its state-named streets being self-named, followed by Virginia and Alaska. ND and DC have about 100 streets named after states despite not having a street named for themselves.
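To make the three metrics concrete, here is a minimal sketch of how they could be computed from a list of state-named streets. The input layout, the function name `ego_metrics`, and the toy counts are all assumptions for illustration; they are not our real data.

```python
from collections import Counter


def ego_metrics(state_streets, total_streets):
    """Compute the three ratios for each state.

    state_streets: list of (named_after, located_in) pairs, one per state-named street.
    total_streets: dict mapping state -> number of streets of any name in that state.
    """
    by_name = Counter(named for named, _ in state_streets)
    by_location = Counter(located for _, located in state_streets)
    self_named = Counter(named for named, located in state_streets if named == located)

    metrics = {}
    for state, total in total_streets.items():
        own = self_named[state]
        metrics[state] = {
            # Share of all streets carrying this state's name that lie inside the state.
            "in_state": own / by_name[state] if by_name[state] else 0.0,
            # Share of all streets in the state that carry the state's own name.
            "state_fraction": own / total if total else 0.0,
            # Share of the state's state-named streets that are named after itself.
            "self_named": own / by_location[state] if by_location[state] else 0.0,
        }
    return metrics


# Toy counts, for illustration only.
toy_streets = (
    [("Ohio", "Ohio")] * 2 + [("Ohio", "Texas")] * 2
    + [("Texas", "Texas")] * 3 + [("Texas", "Ohio")]
)
toy_metrics = ego_metrics(toy_streets, {"Ohio": 1000, "Texas": 2000})
print(toy_metrics["Ohio"])
print(toy_metrics["Texas"])
```

In the toy data Texas leads on the in-state metric while Ohio leads on the other two, which is the same kind of disagreement between measures we see in the real rankings.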

So where does that leave us? Our different ranking measures are not exactly consistent. To reconcile this we take an average ranking across all three of these to get an overall "ego/humble" ranking. By this measure we observe WI, KS, and IA most using their state name, and ND, SC, and NH at the bottom. The full combined ranking is shown in Table 1. New York interestingly is 5th most "street name humble". This is likely for the better, as potentially having your street address be "123 New York Street, New York City, New York" sounds awful.
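The averaging step is simple arithmetic. As a small worked example, the sketch below averages the three per-metric ranks for a few rows of Table 1; the helper name `average_rank` is just for illustration.

```python
def average_rank(ranks):
    """Mean of a state's ranks across the metrics."""
    return sum(ranks) / len(ranks)


# (in-state rank, state-fraction rank, self-named rank), copied from Table 1.
table_ranks = {
    "Iowa": (12, 2, 14),
    "North Dakota": (50, 50, 50),
    "Wisconsin": (1, 6, 5),
    "Kansas": (4, 1, 13),
}

# Lower average rank means more "egotistical".
combined = sorted(table_ranks, key=lambda state: average_rank(table_ranks[state]))
for state in combined:
    print(f"{state}: {average_rank(table_ranks[state]):.1f}")
```

This prints Wisconsin (4.0), Kansas (6.0), Iowa (9.3), and North Dakota (50.0), matching the Avg Rank column.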

Map showing combined ego/humble rankings.
| State | In-State % | State Fraction % | Self-Named Fraction | Avg Rank |
|---|---|---|---|---|
| Wisconsin | 1 (39.1%) | 6 (0.206%) | 5 (23.6%) | 4.0 |
| Kansas | 4 (30.7%) | 1 (0.448%) | 13 (17.9%) | 6.0 |
| Iowa | 12 (20.6%) | 2 (0.287%) | 14 (17.7%) | 9.3 |
| Oklahoma | 3 (34.3%) | 5 (0.207%) | 21 (15.5%) | 9.7 |
| Michigan | 13 (20.3%) | 10 (0.167%) | 9 (19.9%) | 10.7 |
| Minnesota | 5 (28.5%) | 17 (0.141%) | 16 (16.9%) | 12.7 |
| Nebraska | 17 (16.0%) | 7 (0.193%) | 15 (17.5%) | 13.0 |
| Texas | 2 (38.5%) | 23 (0.108%) | 18 (16.2%) | 14.3 |
| Idaho | 25 (13.2%) | 9 (0.179%) | 10 (19.5%) | 14.7 |
| Montana | 33 (10.1%) | 4 (0.261%) | 12 (18.7%) | 16.3 |
| Virginia | 37 (8.4%) | 12 (0.158%) | 2 (27.9%) | 17.0 |
| Illinois | 6 (26.9%) | 15 (0.147%) | 31 (11.3%) | 17.3 |
| Delaware | 45 (3.7%) | 3 (0.285%) | 6 (23.3%) | 18.0 |
| Kentucky | 26 (12.4%) | 21 (0.115%) | 8 (20.6%) | 18.3 |
| Missouri | 11 (23.3%) | 19 (0.120%) | 26 (12.8%) | 18.7 |
| Alaska | 36 (8.7%) | 20 (0.118%) | 3 (26.6%) | 19.7 |
| Oregon | 32 (10.5%) | 16 (0.146%) | 11 (19.2%) | 19.7 |
| Indiana | 16 (19.4%) | 11 (0.159%) | 32 (11.0%) | 19.7 |
| Washington | 48 (2.5%) | 8 (0.191%) | 4 (26.0%) | 20.0 |
| Ohio | 20 (15.1%) | 18 (0.133%) | 23 (14.9%) | 20.3 |
| Alabama | 21 (14.9%) | 26 (0.091%) | 17 (16.2%) | 21.3 |
| Colorado | 28 (11.8%) | 14 (0.151%) | 24 (13.8%) | 22.0 |
| Hawaii | 41 (6.0%) | 25 (0.095%) | 1 (42.3%) | 22.3 |
| Louisiana | 14 (19.6%) | 24 (0.104%) | 30 (11.3%) | 22.7 |
| Maryland | 30 (11.2%) | 22 (0.115%) | 19 (16.2%) | 23.7 |
| Florida | 7 (26.7%) | 27 (0.089%) | 37 (10.7%) | 23.7 |
| California | 8 (25.9%) | 34 (0.070%) | 33 (10.9%) | 25.0 |
| Arkansas | 15 (19.6%) | 30 (0.083%) | 35 (10.8%) | 26.7 |
| Georgia | 27 (12.0%) | 32 (0.074%) | 22 (15.5%) | 27.0 |
| Arizona | 24 (13.3%) | 29 (0.084%) | 28 (12.0%) | 27.0 |
| Pennsylvania | 19 (15.4%) | 28 (0.088%) | 36 (10.7%) | 27.7 |
| South Dakota | 10 (25.3%) | 31 (0.075%) | 42 (5.7%) | 27.7 |
| Tennessee | 22 (13.9%) | 36 (0.063%) | 27 (12.2%) | 28.3 |
| Maine | 31 (10.6%) | 38 (0.061%) | 20 (15.5%) | 29.7 |
| Wyoming | 44 (4.9%) | 13 (0.157%) | 34 (10.9%) | 30.3 |
| New Jersey | 9 (25.5%) | 39 (0.053%) | 43 (4.5%) | 30.3 |
| Vermont | 49 (1.9%) | 37 (0.061%) | 7 (21.4%) | 31.0 |
| Nevada | 43 (5.2%) | 33 (0.074%) | 25 (13.3%) | 33.7 |
| Massachusetts | 23 (13.8%) | 43 (0.036%) | 41 (6.3%) | 35.7 |
| North Carolina | 18 (15.6%) | 48 (0.007%) | 47 (1.9%) | 37.7 |
| Mississippi | 34 (9.8%) | 40 (0.052%) | 40 (9.2%) | 38.0 |
| Rhode Island | 42 (5.5%) | 35 (0.068%) | 38 (9.3%) | 38.3 |
| West Virginia | 29 (11.8%) | 42 (0.040%) | 46 (3.7%) | 39.0 |
| Utah | 46 (3.1%) | 44 (0.030%) | 29 (11.8%) | 39.7 |
| Connecticut | 39 (6.9%) | 41 (0.044%) | 39 (9.3%) | 39.7 |
| New York | 35 (8.7%) | 45 (0.028%) | 45 (3.9%) | 41.7 |
| New Mexico | 38 (7.8%) | 47 (0.012%) | 48 (1.4%) | 44.3 |
| New Hampshire | 47 (2.8%) | 46 (0.015%) | 44 (4.2%) | 45.7 |
| South Carolina | 40 (6.9%) | 49 (0.005%) | 49 (1.3%) | 46.0 |
| North Dakota | 50 (0.0%) | 50 (0.000%) | 50 (0.0%) | 50.0 |
Table 1. Combined ranking of states across all three metrics. Rank 1 is most 'egotistical' (highest value), with the actual metric value shown in parentheses. States are sorted by average rank (lower = more egotistical).

Methodology

We analyze data from OpenStreetMap (OSM), which is like Wikipedia but for maps. OSM powers many digital maps, is supported by over 100,000 individual contributors, and is sponsored by major companies such as Microsoft, Meta, and TomTom.

Processing Data

The OSM data comes in Protocolbuffer Binary Format (PBF), a compressed binary format containing nodes (points), ways (connected sequences of nodes), and their associated metadata tags. We process this data in multiple passes. First, we identify all nodes that are part of named highways. Second, we extract the coordinates for these nodes and build street segments from ways tagged with both a "name" and a "highway" attribute.

The challenge is that a single logical street (like "Main Street") can consist of many separate segments in the OSM data, split for example at intersections or where an attribute such as the speed limit changes. To identify unique streets, we group segments with the same name that share nodes (indicating physical connection), and then further merge nearby disconnected components if they're within 1 mile of each other. This spatial merging helps account for mapping artifacts while avoiding incorrectly merging genuinely distinct streets that happen to share a name.
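The grouping described above can be sketched with a union-find structure. This is a simplified illustration and not our actual pipeline: it assumes each segment is a name plus a list of node IDs, compares every pair of same-named segments (quadratic, so fine only for toy inputs), and measures gaps with the haversine formula. The names `count_unique_streets` and `haversine_miles` and the sample coordinates are hypothetical.

```python
import math
from itertools import combinations

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def count_unique_streets(segments, coords, max_gap_miles=1.0):
    """segments: list of (name, [node_ids]); coords: node_id -> (lat, lon)."""
    parent = list(range(len(segments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(segments)), 2):
        (name_i, nodes_i), (name_j, nodes_j) = segments[i], segments[j]
        if name_i != name_j:
            continue
        shares_node = not set(nodes_i).isdisjoint(nodes_j)
        is_close = any(
            haversine_miles(coords[a], coords[b]) <= max_gap_miles
            for a in nodes_i for b in nodes_j
        )
        if shares_node or is_close:
            parent[find(i)] = find(j)  # merge the two components

    return len({find(i) for i in range(len(segments))})


# Toy coordinates and segments, for illustration only.
toy_coords = {
    1: (40.0, -100.00), 2: (40.0, -99.99), 3: (40.0, -99.98),
    4: (40.0, -99.975), 6: (40.0, -99.97),
    5: (45.0, -100.00), 7: (45.0, -99.99),
}
toy_segments = [
    ("Main Street", [1, 2]),
    ("Main Street", [2, 3]),   # shares node 2 with the first segment
    ("Main Street", [4, 6]),   # disconnected, but about a quarter mile away
    ("Main Street", [5, 7]),   # hundreds of miles away: a different street
    ("Oak Road", [1, 3]),      # different name, never merged with Main Street
]
print(count_unique_streets(toy_segments, toy_coords))
```

Doing both checks in a single pass gives the same components as grouping by shared nodes first and merging nearby components second, since union-find takes the transitive closure either way.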

Filtering

We filter to only the top 7 main roadway types, excluding segments labeled as things like trails, pedestrian walkways, and service roads.

When we first ran this analysis, we noticed a few states (particularly NV, VT, NM) had a large share (>20%) of their state streets with numbers in them, for example "Vermont Route 78", whereas most states did not include the state name in these kinds of highways. We make the debatable choice of filtering these out to focus on unique non-numbered streets.
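One simple way to express this filter is to drop any state street whose name contains a digit. The sketch below assumes exactly that rule; our actual filter may be more specific, and `drop_numbered` and the sample names are hypothetical.

```python
import re

HAS_DIGIT = re.compile(r"\d")


def drop_numbered(state_street_names):
    """Keep only state-street names that contain no digits."""
    return [name for name in state_street_names if not HAS_DIGIT.search(name)]


# Sample names, for illustration only.
names = ["Vermont Route 78", "Vermont Avenue", "Nevada State Route 28", "Little Florida Avenue"]
print(drop_numbered(names))
```

The filter is applied only to streets already identified as state streets, so ordinary numbered streets such as "2nd St" still count in the overall street statistics.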

Keyword Analysis

To identify distinctive words for each state (Figure 3), we adapt the TF–IDF method (Spärck Jones, 1972). Each street name is treated as a document, and we use binary TF (0 or 1), so TF–IDF simplifies to just the IDF score. For each word, IDF = ln(total streets / number of streets containing the word). We sum these scores for all streets in each state and use them to rank words. To avoid common terms, we drop the 25 most frequent "stop words" (like road, street, avenue) and require at least 10 occurrences of a word in a state to include it.
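The scoring can be sketched as follows. With binary TF, summing the score over a state's streets is just the number of streets in the state containing the word times its IDF. This is a minimal sketch under assumptions: words are lowercased and split on whitespace, stop words are taken as the most frequent words by street count, and the function name `distinctive_words` and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def distinctive_words(streets, n_stop=25, min_count=10, top_k=2):
    """streets: list of (state, street_name). Returns state -> top keywords."""
    doc_freq = Counter()               # number of streets containing each word
    state_freq = defaultdict(Counter)  # per state: streets containing each word
    for state, name in streets:
        words = set(name.lower().split())  # binary TF: a word counts once per street
        doc_freq.update(words)
        state_freq[state].update(words)

    stop_words = {word for word, _ in doc_freq.most_common(n_stop)}
    total = len(streets)

    result = {}
    for state, counts in state_freq.items():
        scores = {
            word: count * math.log(total / doc_freq[word])  # summed IDF over the state's streets
            for word, count in counts.items()
            if count >= min_count and word not in stop_words
        }
        result[state] = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return result


# Toy data with small thresholds, for illustration only.
toy = (
    [("WI", "Lake Road")] * 3 + [("WI", "Oak Road")]
    + [("NM", "Calle Sol")] * 2 + [("NM", "Calle Luna"), ("NM", "Oak Road")]
)
keywords = distinctive_words(toy, n_stop=1, min_count=2)
print(keywords)
```

In the toy data "road" is dropped as the single stop word, "oak" and "luna" fall below the minimum count, and the remaining words surface as each state's keywords.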

Limitations

Analysis Is at the State Level. This can result in some streets being double counted if they cross state boundaries.

Merging Process Is Noisy. We merge streets with the same name within 1 mile of each other. Additionally, streets like "East Ohio St" and "West Ohio St" are treated as separate streets. Other decisions can shift counts.

OpenStreetMap Completeness. While OSM is likely fairly comprehensive in the US, it is not perfect. This can create gaps.

Conclusions

Street names are fascinating to play around with, and offer a glimpse into a geographically diverse country. I found it interesting that the "2nd Street Is Most Common" adage didn't replicate. By our toy definition, several states like WI, KS, and IA name streets after themselves more than others do ("egotistical"), but obviously this is just a toy exploration with many factors at play. Some state pride and naming is admirable. Given how common state names are, it is surprising North Dakota doesn't have a single street named after itself, but 31 other states still appreciate it. There remain many open things to explore with the data.

Let me know if you have any feedback or thoughts here! Others have a much better street-level view to the quirks of their local state's naming, and it would be interesting to hear perspectives. Also, I thought it would be fun to include actual pictures of the street signs with a distinctly not-that-state background. Let me know if you get any neat ones.