Top Rank Solutions

Using a Sitemap with Google, Yahoo and MSN


by Craig Mazur - Copyright 2007 - All Rights Reserved

June 10, 2007            Updated: June 10, 2007

If your web site is not getting fully indexed by search engines, there may be a problem with its linking structure. A general guideline is to place every page within three clicks of the home page, but some sites are simply too large for that to be feasible. When a site is not being indexed properly, a sitemap can help search engine spiders find pages deep within the site.

You can easily check to see which pages have been indexed by the major search engines by using the site query operator.

      site:yourdomainname.com

Just substitute your actual domain name in the example above and enter it in the search box on a Google, Yahoo or MSN search page. The results show which pages have been indexed. It is not unusual to find a different page count in each search engine's index, because some search engines do a better job than others at indexing a site.

When a sitemap will help

A sitemap will not benefit a site that is already fully indexed. A sitemap's only function is to help spiders find all of the pages in a site. If you have a 120-page site and you see 120 pages in each search engine index, then a sitemap will probably not benefit the site. Likewise, if you have a 20 or 30 page site that has a good linking structure, a sitemap will be of no benefit. But if you have a 120-page site and you only see 80, 90 or 100 pages (or fewer), then you most likely have a problem with your linking structure that keeps spiders from finding all of your web pages within a few clicks of your home page. In this situation, a sitemap will likely be beneficial.
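The comparison above is simple arithmetic, and a short Python sketch makes it concrete. The function names are hypothetical, and the page counts are assumed to be numbers you read off a site: query by hand.

```python
def index_coverage(indexed_pages: int, total_pages: int) -> float:
    """Return the fraction of a site's pages that appear in a search index."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return indexed_pages / total_pages


def sitemap_likely_helps(indexed_pages: int, total_pages: int) -> bool:
    """True when the index is missing pages, the situation described above."""
    return indexed_pages < total_pages


# Example: a 120-page site with only 90 pages in the index.
coverage = index_coverage(90, 120)
needs_sitemap = sitemap_likely_helps(90, 120)
```

A coverage below 1.0 only tells you that pages are missing. It does not tell you why, so the linking structure still needs a look.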

What a sitemap does

The purpose of a sitemap is to help search engine spiders find pages deep within a web site. Period. A sitemap will not directly influence your current search engine rankings. An argument can be made, however, that because a sitemap helps search engines find more informational content pages within a site, it increases the chances that the site's rankings will improve when a newly found page turns out to be more relevant to a particular search phrase. That may be true, but do not expect rank positions to improve for pages already in a search engine's index.

The whole idea is to provide a page from which a search engine spider can easily find every page within a site. While it is not always possible to structure links so that each web page is within three clicks of the home page, a sitemap places all pages within one click of the sitemap.
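To see the effect on click depth, here is a minimal Python sketch that measures how many clicks each page is from the home page, before and after a sitemap page is added. The site structure is a made-up example (a chain of pages that each link only to the next), and the breadth-first search is simply one reasonable way to count clicks, not a description of how any search engine spider works.

```python
from collections import deque


def click_depths(links, start):
    """Breadth-first search: fewest clicks needed to reach each page from start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site: each page links only to the next one in the chain.
site = {
    "home": ["a"],
    "a": ["b"],
    "b": ["c"],
    "c": ["d"],
    "d": [],
}
before = click_depths(site, "home")

# Same site with a sitemap page that is linked from the home page
# and links to every page in the site.
with_sitemap = dict(site)
with_sitemap["home"] = site["home"] + ["sitemap"]
with_sitemap["sitemap"] = ["home", "a", "b", "c", "d"]
after = click_depths(with_sitemap, "home")
```

In this example the deepest page starts out four clicks from the home page. With the sitemap in place, no page is more than two clicks away: one click to the sitemap and one click from it.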

Two primary types of sitemaps

While several different types of sitemaps have been used in the past, there are only two in widespread use today: an HTML sitemap and an XML sitemap.

HTML sitemaps

HTML sitemaps have been in widespread use for many years and particularly benefit small sites. An HTML sitemap is nothing more than a web page in a site that contains links to all of the other pages within the site. A sitemap link is typically placed in the menu or the footer of each page within a site. Each link points to the sitemap, which in essence becomes the central hub of a web site. A spider indexing any page within the site can easily find the link to the HTML sitemap. From the HTML sitemap the spider can find every page in the site. What a concept!

The general rule with HTML sitemaps is that they should have no more than 100 hyperlinks on a page. This means that a large site usually requires multiple sitemaps. It is also a good idea to add a short text description to each link so that users can use your HTML sitemap to find pages within your site. This site uses multiple sitemaps that each focus on a different aspect of the site. You can see one of our sitemaps here:

      Internet Marketing and SEO Articles
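For a large site, the 100-link guideline means splitting the list of pages across several sitemap pages. A minimal Python sketch of that split follows. The function name and the example URLs are hypothetical, and it simply cuts the list in order, whereas a hand-built set of sitemaps would more likely group pages by topic.

```python
def split_into_sitemap_pages(urls, max_links=100):
    """Split a list of URLs into chunks of at most max_links each."""
    if max_links < 1:
        raise ValueError("max_links must be at least 1")
    return [urls[i:i + max_links] for i in range(0, len(urls), max_links)]


# Example: a hypothetical 250-page site needs three HTML sitemap pages.
all_urls = [f"http://www.mysite.com/page{n}.html" for n in range(250)]
sitemap_pages = split_into_sitemap_pages(all_urls)
```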

XML sitemaps

An XML sitemap is a text file that stores a list of URLs within a site. The data in this file is stored in a special format which is read by search engine spiders. The best news about an XML sitemap is that as of November of 2006, all three major search engines recognize a common standard for an XML sitemap, so you only need to set up a single XML file.

      Major Search Engines Unite to Support a Common Sitemap Format

This announcement greatly simplified the use of sitemaps.
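To give a feel for the common format, here is a minimal Python sketch that writes a bare-bones XML sitemap using only the standard library. The build_sitemap function and the example URLs are assumptions made for illustration. The urlset, url and loc element names and the namespace come from the shared sitemap protocol published at sitemaps.org.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the shared sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap, as a string, listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & when it serializes the text.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap([
    "http://www.mysite.com/",
    "http://www.mysite.com/articles.html",
])
```

Saving sitemap_xml to a file named sitemap.xml with UTF-8 encoding gives a file in the shared format. Each page gets one url entry, and the loc element holds the page's full address.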

It gets better. In April of 2007, the major search engines agreed on a standard for telling spiders where to find a sitemap.

      Search Engines Come Together on Sitemaps Auto-discovery

You do not need to provide links from web pages to an XML sitemap. In order to tell a spider where to find your sitemap, you merely have to add the following directive to the robots.txt file in the root directory. This is the format (substitute the domain):

      Sitemap: http://www.mysite.com/sitemap.xml

The standard naming convention for a sitemap is simply sitemap.xml and the sitemap should always be placed in the root directory. The addition of the directive in the robots.txt file is there to notify spiders that they should check out your sitemap.
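If you want to confirm that the directive is in place, a few lines of Python can pull the Sitemap lines out of the text of a robots.txt file. This is a simplified sketch, not a full robots.txt parser. The function name is hypothetical, and it assumes you have already downloaded the file's contents into a string.

```python
def sitemap_urls_from_robots(robots_txt: str) -> list:
    """Return the URLs named in Sitemap: lines of a robots.txt file."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments (this would also cut a URL containing "#").
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


example_robots = """\
User-agent: *
Disallow: /private/

Sitemap: http://www.mysite.com/sitemap.xml
"""
found = sitemap_urls_from_robots(example_robots)
```

An empty list means the directive is missing or misspelled.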

How to build a sitemap using free online tools

The easiest way to build a sitemap is to use one of the many free online tools. I've been using the freebie tool found at xml-sitemaps.com. To use the tool, simply enter your site's home page URL and let it run. It generates a file in the proper XML format that contains all of the URLs in your site.

Editing the XML sitemap file

Before copying the sitemap to your site's root directory, it may be a good idea to edit some of the entries. The XML sitemap protocol includes a priority tag, which allows you to place a level of importance on each page. The scale runs from 0.0 (lowest priority) to 1.0 (highest priority). The home page should be set at 1.0, while most other pages should be set from 0.5 to 0.8. Forms, contact pages and other web pages that do not contain content can be set at 0.0 or 0.1. How a search engine uses the priority tags is not entirely clear, but theoretically it should help spiders identify the pages they should focus on, while disregarding insignificant pages.
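For a sitemap with more than a handful of entries, the priority edits can be scripted instead of typed by hand. The Python sketch below is one possible approach. The set_priorities function, its arguments and the specific values (1.0 for the home page, 0.1 for low-value pages, 0.5 for everything else) are assumptions that follow the guidelines above, so adjust them to suit your site.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def set_priorities(sitemap_xml, home_url, low_value_urls):
    """Return the sitemap with a priority tag set on every url entry."""
    # Keep the sitemap namespace as the default (no prefix) in the output.
    # Note that this registration applies to the whole Python process.
    ET.register_namespace("", NS)
    root = ET.fromstring(sitemap_xml)
    for entry in root.findall(f"{{{NS}}}url"):
        loc = entry.find(f"{{{NS}}}loc").text.strip()
        if loc == home_url:
            value = "1.0"   # home page
        elif loc in low_value_urls:
            value = "0.1"   # forms, contact pages and similar
        else:
            value = "0.5"   # everything else
        priority = entry.find(f"{{{NS}}}priority")
        if priority is None:
            priority = ET.SubElement(entry, f"{{{NS}}}priority")
        priority.text = value
    return ET.tostring(root, encoding="unicode")


sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.mysite.com/</loc></url>"
    "<url><loc>http://www.mysite.com/contact.html</loc></url>"
    "<url><loc>http://www.mysite.com/articles.html</loc></url>"
    "</urlset>"
)
edited = set_priorities(
    sample,
    home_url="http://www.mysite.com/",
    low_value_urls={"http://www.mysite.com/contact.html"},
)
```

The script overwrites any priority value already in the file, so run it before making one-off adjustments by hand.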

The XML file can be edited with Notepad or any other plain text editor. Do not edit it with Word or any other word processor, which may embed invisible codes that make the file difficult for a spider to read. It must be saved as a plain text file.


Once again, an XML sitemap will not benefit a site that is already getting indexed properly and completely. But if you have a large or complex web site and the site needs some help, an XML sitemap may be the best solution.


Top Rank Solutions is located near Phoenix in Mesa, Arizona, and offers personalized website evaluation services, search engine optimization services and training for companies throughout the United States.