SEO Agency

Sunday, March 09, 2008

In Is URL Length a Ranking Factor

In the “Is URL Length a Ranking Factor - what say you?” thread, potentialgeek suggested we start a separate thread on how to choose the filenames that are part of a URL. So let’s do it. The focus of our discussion should be the filenames and file paths in a URL, or in other words everything after the domain name: www.example.com/This/Is/MyFilePath/andMyFileNameSpace.html
This post is NOT supposed to be talking ‘at’ you; rather, its aim is to start a discussion (who says I got it right, and what is the “right way” anyway :) ).

To clear up any potential confusion when talking about URLs, URIs, URNs, etc., especially for our newer webmasters, take a look at these definitions and clarifications from the W3C: “URIs, URLs, and URNs: Clarifications and Recommendations 1.0”, “Axioms of Web Architecture”, and “URI Model Consequences”.
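To make the terminology concrete, here is a minimal sketch that splits the example URL from the opening paragraph into the parts this discussion cares about. The function name `split_url` and the keys of the returned dictionary are my own labels for illustration, not standard terms.

```python
from urllib.parse import urlsplit
import posixpath


def split_url(url):
    """Split a URL into host, directory path, file name and query string."""
    parts = urlsplit(url)
    # Everything after the domain name is parts.path; divide it into the
    # directory portion and the final file name.
    directory, filename = posixpath.split(parts.path)
    return {
        "host": parts.netloc,
        "directory": directory,
        "filename": filename,
        "query": parts.query,
    }


info = split_url("http://www.example.com/This/Is/MyFilePath/andMyFileNameSpace.html")
print(info["directory"])  # /This/Is/MyFilePath
print(info["filename"])   # andMyFileNameSpace.html
```

The directory and file name together are the “file path” discussed in the rest of this post.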

Why do we care about filepaths and filenames - their structure and what keywords are in them?
First off, just getting filepaths and names to work well for you, your visitors, and bots is not the ‘be-all and end-all’; it is just one part of the puzzle.

In a nutshell, both human visitors and bots will ‘see’ your filepaths and file names, and make some kind of decision about your site and/or page based on them. For humans it comes down to usability issues, and for bots it provides algorithmic guidance. The reasons and methods for getting this right for both humans and bots are intertwined.

Stepping back for a moment: for the most part, and for different reasons, we all want a lot of first-time and repeat visitors to our site(s). A good chunk of first-time visitors come to us from search engines (SEs). Usually those visitors will only go through a couple of pages of SERPs (search engine results pages), which means we want to rank well - the higher the better :). But it’s not enough just to rank well; we also want searchers to click on our listing among all the choices in the SERPs. The file names we choose can entice searchers to do so. Did you notice that search keywords are bolded in SERPs? (It only means that SEs are being helpful to searchers by emphasizing the search terms entered, but we can take advantage of that.) To illustrate, say you searched for red widgets, and the SERPs come back with results containing URLs such as:

http://example.com/page-34.html?ID=1234567&c=rw and http://example.com/red-widgets.html

Which one is more likely to draw your eye, and hopefully get you to click on it? Just by looking at the file name of the second URL you could get an idea of what the page is about. It gives the searcher a reinforcing signal that he or she might find what they are looking for on that page. Again, this is only part of the equation; other elements play a role as well (title, snippet, etc.), but we are just talking about files now. Perhaps by similar logic, SEs’ algos come to the same conclusion: people usually name things for what they are, so SEs might (and I think they do) take this into consideration. There were reports that an established site was able to rank for a keyword found only in its file path (no instances of it in the page content). See “Keywords in url - still useless right?” in the Supporters Forum. Don’t conclude anything just by looking at the title of that thread.

With human visitors in mind, you want to choose file names that are easily comprehensible and memorable. Short of using a bookmark, most repeat visitors will come back either by typing the domain name into the browser’s address bar or by using an SE with your domain name as a search term, sometimes coupled with other search keywords. When they land on any page of your site, you should provide them with an intuitive, easy and well-structured way to navigate it. You don’t want to frustrate your visitors, but rather make it very easy for them to find what they are looking for, making them feel like they are “masters of the internet and know what they are doing”.

Intuitive navigation is a beautiful thing when done right, but it’s hard work to get there. And this touches on another important topic: site architecture. In part, file paths are expressions of your site architecture. Different approaches work for different sites, but the most common ones are flat (where everything is under the root) and vertical. A Theme Pyramid is a good example of vertical (you could also have an inverted pyramid, etc.). So with file paths, or more to the point with subfolders/directories, you can structure and organize your site in a meaningful way, and name them accordingly. However, this doesn’t mean that a page three directory levels down will be buried or unseen, especially by SEs. As part of the site architecture, you could (and should) also have a link structure as a parallel and complementary method to the organizational (directory/subfolder) structure. This means that a page residing in the third directory from the root can be only one click away from the root if you so choose (but this is a huge topic all by itself).
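To illustrate the difference between directory depth and click depth, here is a small sketch. The link map, the page paths and the function names are all made up for this example; a real site would build the link map from its own pages.

```python
from collections import deque


def directory_depth(path):
    """Count the directories between the root and the file in a path."""
    directory = path.rsplit("/", 1)[0]
    return len([segment for segment in directory.split("/") if segment])


def click_depths(links, start="/"):
    """Breadth-first search: fewest clicks needed to reach each page from start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site: the home page links straight to a deeply nested page.
links = {
    "/": ["/widgets/", "/widgets/red/large/red-widget-xl"],
    "/widgets/": ["/widgets/red/"],
    "/widgets/red/": ["/widgets/red/large/"],
}
deep_page = "/widgets/red/large/red-widget-xl"

print(directory_depth(deep_page))      # 3 directories down
print(click_depths(links)[deep_page])  # 1 click from the root
```

The page sits three directories below the root, yet a single link from the home page puts it one click away, which is the point made above about link structure running parallel to directory structure.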

Another choice we have when naming files and directories is capitalization, for example /widgets/green-widgets.html versus /Widgets/Green-Widgets.html. Generally you should be consistent with your choice across the site. The effect of this on SEs’ algorithms is unknown to me, although it wouldn’t surprise me if it is taken into consideration (at times). Another interesting, and sometimes contested, topic is what kind of word separators should be used in the file path.

The choice of separators, as with most things, should be approached from both the human and the bot perspective. “-” vs. “_” gets the most attention (so let’s not turn this discussion into another “dashes vs. underscores” thread); however, other common separators are “&” and the space (you see it as %20 in a URL). I am a firm believer in not using a space as a separator in a URL, mainly from a usability standpoint; however, I am not sure of its effect on ranking since I haven’t experimented with it. Although most of my examples depict static pages, the same applies to dynamic pages. The most common field separator there is “&”. As a side note, it’s generally agreed that an ID field is useless for the topic of our discussion, and might actually hinder your rankings depending on where and how it’s implemented in the file path. I am inclined to believe that the number of separators in the file name does not trigger a filter (or “penalty”) just in itself. That is, I think that a file name such as
/my-house-on-the-beach-during-renovation.html

would not trigger any adverse action from SEs’ algos just due to its six separators (remember, we are talking about files and file paths, not domain names).
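As a rough sketch of how a readable file name like the one above could be generated from a page title, assuming you settle on one separator and on lowercase throughout (both are my assumptions for the example, not recommendations from any search engine):

```python
import re


def make_filename(title, separator="-"):
    """Turn a page title into a file name with a single, consistent separator."""
    # Keep only runs of ASCII letters and digits; spaces and punctuation
    # (including characters that would show up as %20 etc.) are dropped.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return separator.join(words)


def count_separators(filename, separator="-"):
    """Count how many separators a file name contains."""
    return filename.count(separator)


name = make_filename("My House on the Beach (During Renovation)")
print(name)                    # my-house-on-the-beach-during-renovation
print(count_separators(name))  # 6
```

Note that this simple version throws away non-ASCII letters, so titles in other languages would need different handling.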

And that leads to the question: what would trigger “some dial on an algo filter” to go down? (Excessive) keyword stuffing in the body content, title, etc., has received a lot of attention as a thing that might (and does) adversely affect ranking. The same could be said for ‘spamming’ file paths:
/California/Homes/For-Sale/
/California-Sale/Home-For-Sale/CA-Home-Sale/Buy-and-Save-on-CA-Home-Sales

That doesn’t mean that you should use only one occurrence of a keyword in the file path (you do want to provide logical emphasis); however, going overboard will generally do you more harm than good. Where that line is, for your particular site, is the ‘money’ question.
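If you want to eyeball your own paths for repetition, a quick sketch like the following counts how often each word appears in a path. The `max_repeats=2` threshold is an arbitrary number I picked for the example; nobody outside the SEs knows where the real line is, and this is in no way how any search engine scores a path.

```python
import re
from collections import Counter


def keyword_counts(path):
    """Count how often each word appears in a file path (case-insensitive)."""
    words = re.findall(r"[a-z0-9]+", path.lower())
    return Counter(words)


def repeated_keywords(path, max_repeats=2):
    """Return words that occur more than max_repeats times (arbitrary threshold)."""
    return {word: n for word, n in keyword_counts(path).items() if n > max_repeats}


modest = "/California/Homes/For-Sale/"
spammy = "/California-Sale/Home-For-Sale/CA-Home-Sale/Buy-and-Save-on-CA-Home-Sales"

print(repeated_keywords(modest))  # {}
print(repeated_keywords(spammy))  # {'sale': 3, 'home': 3}
```

The count is purely literal: “sale” and “sales” are treated as different words, so a human look at the result is still needed.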

There are many ways to organize your site, but just as an example of a logical and structured way to do so, take a look at the service manual for your car (or appliance, etc.). It will usually start off with a general section, and then be broken down into main areas, which in turn are broken down further (just like the pyramid structure mentioned earlier). If you follow that logic and naming convention (main area => folder name, etc.), you are setting yourself up for a good start.

Although I used “.html” in some of the examples above, I did so just to make it easier to illustrate the point. If you can (and you can), try to make your file names ‘extensionless’. At some point you might want to change the technology you use to serve your pages, and having a file extension might complicate matters greatly and potentially hurt your rankings. A link to your /MySuperPage.html will not work if you change the file to /MySuperPage.asp. Now, there are ways to address that; however, with a little forethought and a simple server-side rewrite you can have your pages served without an extension, as in /MySuperPage.
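In practice this is usually done with your web server’s rewrite rules, but the underlying mapping can be sketched in a few lines. The extension list, the function names and the set of “files on disk” below are all assumptions made up for illustration.

```python
import posixpath

# Hypothetical list of extensions your site has used or might use.
KNOWN_EXTENSIONS = (".html", ".asp", ".php")


def public_url(file_path):
    """Strip a known extension to get the URL you publish and link to."""
    root, ext = posixpath.splitext(file_path)
    return root if ext.lower() in KNOWN_EXTENSIONS else file_path


def resolve(request_path, files_on_disk):
    """Map an extensionless request to whichever file currently backs it."""
    if request_path in files_on_disk:
        return request_path
    for ext in KNOWN_EXTENSIONS:
        candidate = request_path + ext
        if candidate in files_on_disk:
            return candidate
    return None  # nothing matches: the caller would answer with a 404


print(public_url("/MySuperPage.html"))                # /MySuperPage
print(resolve("/MySuperPage", {"/MySuperPage.html"}))  # /MySuperPage.html
print(resolve("/MySuperPage", {"/MySuperPage.asp"}))   # /MySuperPage.asp
```

The link /MySuperPage keeps working before and after the switch from .html to .asp, because visitors and SEs never see the extension.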

Take a look at “Cool URIs don’t change”.
What do you take into consideration when naming files in a URL?