Sites hosted on Microsoft servers can create several problems for search engine optimization. Quite frankly, Microsoft servers allow sloppy practices to be used when building a web site and this leads to problems with search engines.
The number one problem is that folder and file names on Microsoft servers are case insensitive. Microsoft servers also allow spaces in file and directory names, which technically creates invalid URLs. As a result, the first four URLs below are all valid and represent the same page, and the fifth, which contains a space, is accepted as well.
/products/green-widgets.html
/Products/Green-Widgets.html
/PRODUCTS/GREEN-WIDGETS.html
/products/Green-Widgets.html
/products/green widgets.html
When any combination of upper and lower case characters is valid, that presents a serious problem for most spiders, which see each variation as a separate URL. Google has always treated two or more different URLs separately, even if each leads to the same page. The current exception is the home page, but the home page URL should be standardized nonetheless. The default URL for the home page used in all links in a site should always be http://www.mydomain.com/ or simply / and not index.php, index.html, default.asp, or any other use of the actual file name. This consolidates all internal and external inheritance factors on a single version of a URL.
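The home page rule can be checked mechanically. The following is a minimal sketch in Python, assuming root-relative links and the three default document names mentioned above; the function name standardize_home_link is hypothetical and not part of any library.

```python
# Default document names taken from the text; extend the set for your own server.
DEFAULT_DOCUMENTS = {"index.php", "index.html", "default.asp"}


def standardize_home_link(path):
    """Return "/" when a root-relative link points at the home page file."""
    name = path.lstrip("/")
    # Compare case-insensitively, since the server treats Index.HTML the same way.
    if name.lower() in DEFAULT_DOCUMENTS:
        return "/"
    return path


home = standardize_home_link("/Default.asp")
other = standardize_home_link("/products/green-widgets.html")
```

Links to any other page pass through unchanged, so the function can be applied to every internal link in a template.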
If you allow a page to be indexed under Green-Widgets.html and green-widgets.html, you risk having one or both filtered out as duplicate content. Linux and Unix servers avoid this problem because both folder names and file names are case sensitive. Google runs on Linux servers where Green-Widgets.html and green-widgets.html are considered to be two separate file names for two separate pages. On a Linux server, you will get an error 404 (Page Not Found) if the case of the letters in the file name in a hyperlink is not correct.
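One way to spot this problem is to group the URLs found in a crawl report or server log by their lowercased path. Any group with more than one member is a set of case variants of the same page. The sketch below assumes a plain list of URL paths; the function name find_case_variants is hypothetical.

```python
from collections import defaultdict


def find_case_variants(paths):
    """Group URL paths that differ only by letter case."""
    groups = defaultdict(set)
    for path in paths:
        groups[path.lower()].add(path)
    # Keep only the groups where more than one spelling was seen.
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


paths = [
    "/products/green-widgets.html",
    "/Products/Green-Widgets.html",
    "/PRODUCTS/GREEN-WIDGETS.html",
    "/about.html",
]
variants = find_case_variants(paths)
```

Here variants holds a single entry, keyed by /products/green-widgets.html, that lists the three spellings of that page.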
Most search engine spiders will substitute a hexadecimal code for a space when they encounter one in a file name, but we have seen situations where spiders will not index the contents of a directory that uses a space in the directory name. In the case of a web page file name, green widgets.html becomes green%20widgets.html with the hexadecimal substitution. At best, this looks confusing to a user and should be avoided.
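The substitution itself is easy to reproduce with the quote function in Python's standard urllib.parse module. This is only an illustration of the encoding, assuming the example path from above; the helper name has_space is hypothetical.

```python
from urllib.parse import quote, unquote

# quote() replaces the space with its hexadecimal code and leaves "/" alone.
encoded = quote("/products/green widgets.html")


def has_space(path):
    """Flag a path that contains a space, whether raw or already encoded."""
    return " " in unquote(path)


flagged = [p for p in ["/products/green-widgets.html", encoded] if has_space(p)]
```

A check like has_space can be run over a list of file names before a site is published, so that names with spaces are renamed rather than encoded.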
ALWAYS make sure that one and only one version of a URL is used to represent a page in all of the links leading to the page, especially when you are on a Microsoft server. It is wise to stick with the actual name of the file used on the server. In addition to inconsistent URL naming within a site, Google may find incorrectly named URLs as backlinks because visitors to your site may see the incorrect variations of the URLs.
There is a workaround when multiple versions of a URL are discovered in Google’s index. The workaround is to use the canonical link tag to tell Google which version of a URL is the correct one. Google provides detailed instructions regarding how to do this.
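The canonical link tag is a link element with rel="canonical" placed in the head of the page. As a rough sketch, a page template could build it as follows; the function name canonical_tag is hypothetical, and the URL is the example domain used in this article.

```python
from html import escape


def canonical_tag(url):
    """Build a <link rel="canonical"> element for the page's <head>."""
    # escape() protects the attribute value if the URL contains & or quotes.
    return '<link rel="canonical" href="{}">'.format(escape(url, quote=True))


tag = canonical_tag("http://www.mydomain.com/products/green-widgets.html")
```

Every case variation of the page should emit the same tag, pointing at the one standard URL.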
Standardizing the URL for the major search engines will resolve this issue over time, but the root cause of the problem should be fixed whenever possible. That means locating every URL in a web site that uses a case variation of the URL that is different from the standard that you set in the canonical tag.
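Locating those links can be partly automated. The sketch below uses Python's standard html.parser module to collect the links in a page and report any that match a standard URL only when case is ignored. It assumes root-relative links and a known list of standard paths; the names LinkCollector and find_nonstandard_links are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_nonstandard_links(html, standard_paths):
    """Return (found, standard) pairs for links that differ only by case."""
    by_lower = {path.lower(): path for path in standard_paths}
    collector = LinkCollector()
    collector.feed(html)
    mismatches = []
    for href in collector.hrefs:
        standard = by_lower.get(href.lower())
        if standard is not None and standard != href:
            mismatches.append((href, standard))
    return mismatches


page = (
    '<a href="/Products/Green-Widgets.html">Widgets</a> '
    '<a href="/products/green-widgets.html">Widgets</a>'
)
result = find_nonstandard_links(page, ["/products/green-widgets.html"])
```

Only the first link is reported, paired with the standard spelling it should be changed to.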
On a Microsoft server, the issue goes beyond inconsistent letter case in URLs. Microsoft servers allow URLs with and without the www subdomain to represent the same page. This is easily resolved on a Linux or Unix server with a 301 redirect set up in the .htaccess file, but not so on a Microsoft server because it does not use this file.
On your Microsoft server, both of the following URLs can represent the same page and can cause problems with search engines.
http://www.mydomain.com/bookstore.html
http://mydomain.com/bookstore.html
This can cause sites to become double-indexed under both versions of the domain. The main workaround is once again the use of a canonical tag. A secondary workaround is the establishment of a standard URL in Google Webmaster Tools. You must set up a free Webmaster Tools account in order to use this tool. The Webmaster Tools fix only applies to pages in Google’s index, so it is usually wise to apply both fixes so that you establish a standard URL with all major search engines.
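Whichever version you choose, the standard URL written into the canonical tag has to be computed consistently. The following minimal sketch rewrites either form of the host to one preferred form. It assumes the www version is the standard; the function name standardize_host and its preferred_host parameter are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def standardize_host(url, preferred_host="www.mydomain.com"):
    """Rewrite the www and non-www forms of a domain to the preferred host."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Derive the bare domain so both spellings of the host can be recognized.
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    if host in (bare, "www." + bare):
        parts = parts._replace(netloc=preferred_host)
    return urlunsplit(parts)


standard = standardize_host("http://mydomain.com/bookstore.html")
```

URLs on other domains are returned unchanged, so the function is safe to run over outbound links as well.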