Monday, September 8, 2008

Oh, my beautiful URLs! Ruined!

We use a number of SEO (Search Engine Optimization) strategies on the network of Nelso sites, and one of the hardest to get right is "clean" or "pretty" URLs. There is some debate as to whether these kinds of URLs really help in terms of ranking on Google or Yahoo!, but in the interest of completeness we do try to keep the URLs on Nelso as "clean" as possible.

What is a clean URL? Well, it's an internet address that looks like http://www.nelso.com/cz/prague/cafe/ instead of http://www.nelso.com/search/?page=4&where=Praha&type_id=10. The idea is that it is easier for both humans and search engine spiders to see that the first URL is about cafes in Prague. The second URL ("places of type 10 in Prague") is more or less meaningless to both the GoogleBot and human beings.

So, what's the problem? Well, the above URL system worked fine when the site was entirely based on Prague, and worked well even when we expanded to a select number of cities in Denmark and Germany. The problem started when we decided to expand into smaller cities in both Europe and the United States. Take a look at this list of Czech cities (in Czech: "města"). Do you see what I see? Yes? The fact that the same city name is used as many as 10 times for different cities in the same country?

For example, take the case of the city of "Albrechtice". Our usual URL scheme of http://www.nelso.com/cz/albrechtice/ is not going to work in this case because "Albrechtice" is the name of nine different cities in the Czech Republic! There would be no way with the former URL scheme (e.g. http://www.nelso.cz/cz/albrechtice/) to tell exactly which "Albrechtice" the user wants.

To solve this, I've been forced to add a numeric identifier to the URL, so that the Kladno near Prague will have a URL like http://www.nelso.cz/cz/kladno-g3073699/ and the Kladno on the other side of the country will have a URL like http://www.nelso.cz/cz/kladno-g3073700/.
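
For the curious, here is a minimal Python sketch of how a URL path of this shape could be built and parsed back. It is an illustration only, not the actual Nelso code: the function names (slugify, city_path, parse_city_path) are made up for this example, and it assumes that the number after "-g" is enough to look up the city on the backend.

```python
import re
import unicodedata

def slugify(name: str) -> str:
    # Drop diacritics (e.g. "Město" -> "Mesto"), lowercase, and turn
    # every run of non-alphanumeric characters into a single hyphen.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")

def city_path(country: str, name: str, place_id: int) -> str:
    # The city name stays in the path for readability; the numeric
    # identifier is what makes the path unique.
    return f"/{country}/{slugify(name)}-g{place_id}/"

# Hypothetical pattern for the path shape shown in the post.
_CITY_PATH = re.compile(
    r"^/(?P<country>[a-z]{2})/(?P<slug>[a-z0-9-]+)-g(?P<place_id>\d+)/$"
)

def parse_city_path(path: str):
    # Returns (country, slug, place_id), or None if the path does not match.
    m = _CITY_PATH.match(path)
    if m is None:
        return None
    return m["country"], m["slug"], int(m["place_id"])
```

With this sketch, city_path("cz", "Kladno", 3073699) gives "/cz/kladno-g3073699/", and parsing that path returns the identifier again, so the name part of the URL is purely decorative as far as the lookup is concerned.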

Not nearly as "pretty" as the former way of doing things at Nelso, but this is the only way this is going to work as we move out of major cities and into smaller cities. The one exception will be the U.S. - in the U.S., there are as far as I know no duplicate cities within a state, so we can still use nice URLs like http://www.nelso.cz/us/ny/new-york-city/ when referring to U.S. towns. Even in the non-U.S. URLs, we're still stuffing the city name in the URL, so hopefully this won't hurt our search engine rankings too much.
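
Put together, the rule is roughly "state plus city name for the U.S., city name plus identifier everywhere else". A small Python sketch of that rule follows. The function name place_path and its parameters are invented for this example, the slug helper is deliberately simplistic (ASCII names only), and it takes the "no duplicate cities within a state" assumption at face value.

```python
import re

def simple_slug(name: str) -> str:
    # Simplistic slug helper for ASCII names: lowercase, hyphen-separated.
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

def place_path(country: str, city: str, place_id: int, state: str = "") -> str:
    # U.S. paths rely on the state to disambiguate (an assumption taken
    # from the post), so the numeric identifier is left out.
    if country == "us":
        if not state:
            raise ValueError("U.S. paths need a state code")
        return f"/us/{state.lower()}/{simple_slug(city)}/"
    # Everywhere else the identifier is appended to keep paths unique.
    return f"/{country}/{simple_slug(city)}-g{place_id}/"
```

If the U.S. assumption ever turns out to be wrong, only the first branch would need to change.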

1 comment:

pgl said...

Believe me when I say that I feel your pain - URLs are so often overlooked entirely, the opposite of the attention they should be getting.

So, here are my suggestions to ease the pain: first off, you don't need to keep the same ID in the URL as you're using on the backend - how about a plain number (maybe in hex or something?) that could map back to the proper ID? It would only have to be there for any cities (or towns (or villages (hamlets?))) added after the first, which would retain its Google Juice; it'd just be the new entries that suffered a little (their fault in the first place, they deserve it).

e.g.,

http://www.nelso.com/cz/prague/whorehouses/
http://www.nelso.com/cz/prague:2/whorehouses/
(or maybe http://www.nelso.com/cz/2:prague/whorehouses/, http://www.nelso.com/cz/prague-2/whorehouses, etc.)
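
A rough Python sketch of that idea, using the "name-2" variant: the first place with a given name keeps the plain slug, and later ones get a counter that maps back to the real backend ID. The function name assign_slugs and the input shape are made up for illustration, and the sketch ignores the corner case of a place whose real name already ends in "-2".

```python
def assign_slugs(cities):
    """Map public URL slugs to backend IDs.

    `cities` is a list of (backend_id, slug) pairs in the order the
    places were added. The first place with a given slug keeps it
    unchanged; later ones get "-2", "-3", and so on.
    """
    seen = {}    # slug -> how many times it has appeared so far
    table = {}   # public slug -> backend ID
    for backend_id, slug in cities:
        count = seen.get(slug, 0) + 1
        seen[slug] = count
        public_slug = slug if count == 1 else f"{slug}-{count}"
        table[public_slug] = backend_id
    return table
```

Feeding it the two Kladno IDs from the post yields "kladno" for the first and "kladno-2" for the second, with the long backend IDs kept out of the URL entirely.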

PS: Is Blogger seriously one of the biggest weblog services, owned and operated by the biggest and most successful web business in the world? They are? Weird, you would've thought they'd heard of previewing posts and comments and stuff like that... guess they're just not heavy users themselves (drug dealers (and bankers) are often like that).