Making Your Web Site Invisible to Search Engines
by Craig Mazur - Copyright 2004-2007 - All Rights Reserved
June 20, 2004
Updated: July 14, 2007
One of the goals of every SEO (search engine optimization) campaign is to increase the visibility of both the Web pages within a site and the keyword phrases the site's prospective visitors are likely to use when searching for the site's products, services or information.
There are quite a few methods used to accomplish this.
Most search engine optimization techniques are based upon empirical studies and experimentation.
The fact is that all search engine ranking algorithms are proprietary in nature, which means the ranking rules are trade secrets and never divulged.
Furthermore, they are subject to sudden changes in ranking criteria, which can periodically inject a degree of volatility into search engine rankings.
Aside from the numerous improvements made to a Web site's structure and content through normal optimization, another effective strategy is to identify and eliminate as many potential roadblocks as possible. These are issues that limit a site's ability to rank well by making it difficult for search engine algorithms to index.
These issues are all "potential problems" that can be hard to pinpoint as a definite cause for low rankings because search engines rarely divulge the specific reasons for low ranking.
Part of our philosophy is to identify as many potential roadblocks to success as possible and remove them or reduce their potential negative side-effects.
These issues do not necessarily produce search engine penalties, but they do make it harder for search engines to perform their essential functions.
The following are some of the most common issues that can reduce the ability of search engine spiders to properly index a web site:
Factors that can reduce your search engine visibility
- HTML Frames
The use of HTML frames is one of the most common causes of poor rankings.
The reasons behind this are simple.
Each frame page is actually made up of several individual HTML pages defined by an additional page called a frameset.
If a page has a stationary header and a stationary menu on the left side with a scrollable content area, the page is typically made up of three visible HTML pages and the frameset page.
The problem with frames is that this design is confusing to search engine spiders, because they see each inner frame page as a separate web page.
Many individual frames have no outbound links, which creates "spider traps" because a search engine algorithm needs links to follow in order to continue to index a site.
If you confuse or trap a spider, it can just say "Adios" and move over to the next site on its list.
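As a rough illustration, the following Python sketch counts the frames a frameset defines and the standard hyperlinks a page offers, using only the standard library. The class and function names are made up for this example, and the parser is only an approximation of how a spider might read the page code. A frame page that reports zero links is a candidate "spider trap".

```python
from html.parser import HTMLParser


class FrameAudit(HTMLParser):
    """Counts <frame> tags and standard hyperlinks in one HTML document."""

    def __init__(self):
        super().__init__()
        self.frames = 0  # number of <frame> tags (inner pages of a frameset)
        self.links = 0   # number of <a href="..."> links a spider could follow

    def handle_starttag(self, tag, attrs):
        if tag == "frame":
            self.frames += 1
        elif tag == "a" and dict(attrs).get("href"):
            self.links += 1


def audit(html):
    """Parse one page of HTML and return the filled-in audit object."""
    parser = FrameAudit()
    parser.feed(html)
    return parser


# Hypothetical frameset page and one of its inner frame pages.
frameset_page = (
    '<frameset cols="20%,80%">'
    '<frame src="menu.html"><frame src="content.html">'
    "</frameset>"
)
content_page = "<html><body><p>Welcome to our site.</p></body></html>"

print(audit(frameset_page).frames)  # how many inner pages the frameset defines
print(audit(content_page).links)    # 0 links means nothing for a spider to follow
```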
- Content in iframes
iframes, or inline frames, are a newer version of HTML frames.
An iframe is typically used to display content from another web page inside the page that contains the iframe.
Many site owners use iframes to display news or information pages from other web sites.
This is fairly common in the real estate business.
The problem with this approach is that the search engine spiders can easily see that the content displayed is not really part of your site and does in fact originate on another site.
Although users can see the content with their browser, the content does not exist in the web page code that spiders index and dissect.
If you do not place additional content on a page using iframes, you have no content on that page.
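To see why, it can help to look at a page the way a program does. The Python sketch below (the names and the sample page are assumptions for illustration, not any search engine's actual method) collects the words that exist in the page code itself and lists the addresses that iframes point to. The framed content never shows up in the word list because it lives on the other site.

```python
from html.parser import HTMLParser


class PageText(HTMLParser):
    """Collects the words present in the page code and the iframe sources."""

    SKIP = {"script", "style", "iframe"}  # text inside these is not page content

    def __init__(self):
        super().__init__()
        self.words = []
        self.iframes = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            self.iframes.append(dict(attrs).get("src"))
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.words.extend(data.split())


def own_content(html):
    """Return (words found in the page code, list of iframe src values)."""
    parser = PageText()
    parser.feed(html)
    return parser.words, parser.iframes


# Hypothetical page whose only real content is a heading.
page = (
    "<html><body><h1>Listings</h1>"
    '<iframe src="http://example.com/news.html"></iframe>'
    "</body></html>"
)
words, iframes = own_content(page)
print(len(words), iframes)
```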
- Lack of Informational Content
Search engine algorithms could not care less about how your site looks.
They never actually see your web page, but merely work with the underlying web page code.
They primarily crave one thing--content.
Without content, search engines have no keywords or other information necessary to index and rank a Web site.
You therefore need to provide unique, informational content and lots of it if you want to achieve and keep high rankings.
- Splash Page
A splash page is typically a stand-alone, visually rich Web page set up as the default page for a site, which makes it the first page displayed when someone enters the site using the domain name.
Splash pages are commonly designed using Macromedia's Flash technology or may be made up entirely of images.
While it is very "artsy", a splash page can be very detrimental to your site's search engine rankings.
Search engine spiders seek content.
Splash pages very often have zero content.
Images do not contain content.
Without content, you cannot achieve high rankings.
Because a splash page is set up as the default page for a site, it is the home page.
It doesn't matter if you call another page the home page once a user enters the site.
The default page is the home page, and in this case, the home page has little or no content, and therefore is not worthy of a ranking.
The problem with splash pages is further complicated when the page contains no links to the rest of the site.
Some splash pages use automatic, timed redirections to take a user into the site after a specified time.
This is also a bad idea, because search engines do not like most types of redirections and do require hyperlinks in order to index a site.
If you want to achieve higher rankings, get rid of the splash page.
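If you are not sure whether your default page behaves like a splash page, a quick check of the page code can help. The Python sketch below is only an illustration with made-up names: it looks for one common form of timed redirection, the meta refresh tag, and counts the standard hyperlinks on the page. Redirections done by script or by the server are outside what this sketch can detect.

```python
from html.parser import HTMLParser


class SplashCheck(HTMLParser):
    """Looks for a meta refresh redirect and counts standard hyperlinks."""

    def __init__(self):
        super().__init__()
        self.refresh = None  # content of <meta http-equiv="refresh">, if any
        self.links = 0       # number of <a href="..."> links on the page

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("http-equiv") or "").lower() == "refresh":
            self.refresh = attributes.get("content")
        elif tag == "a" and attributes.get("href"):
            self.links += 1


def check_splash(html):
    """Parse one page and return the filled-in check object."""
    parser = SplashCheck()
    parser.feed(html)
    return parser


# Hypothetical splash page: one image, a timed redirect, and no links.
splash = (
    "<html><head>"
    '<meta http-equiv="Refresh" content="5; url=home.html">'
    '</head><body><img src="intro.jpg"></body></html>'
)
result = check_splash(splash)
print(result.refresh, result.links)
```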
- No Inbound Links to Site
Inbound links have become a critical issue with the major search engines, and Google in particular.
Google sometimes will not index a new Web site until it detects a number of links from other sites.
From a search engine's perspective, the more links they find from other sites to your site, the more important your site is perceived to be.
You need to set up as many quality links from other sites to your site as you can.
Although almost all links do add some weight to the "link popularity" factor for your site, links from large sites within your industry can give your site an extra boost.
For more information about linking tactics, read Web Site Optimization's 3-Legged Stool.
- JavaScript Menus
JavaScript menus are usually identified by their drop-down or expanding list of selections.
The problem is that almost all search engines ignore JavaScript, which means they can neither see nor follow links that are generated by JavaScript.
If your design calls for the use of JavaScript menus, you need to provide alternate text links to the major sections of your site.
The footer is a good place for these links.
A better idea is to investigate using CSS menus.
Most CSS menus use standard hyperlinks that are easily followed by search engine spiders.
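The difference is easy to demonstrate with a parser that, like the spiders described above, does not run JavaScript. In this Python sketch (the function name and the sample page are hypothetical), a menu link written out by a script never appears, and a "javascript:" pseudo-link is discarded, so only the standard hyperlink remains.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects hrefs that a non-JavaScript crawler could follow."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = (dict(attrs).get("href") or "").strip()
        # Skip empty links, bare "#" placeholders, and javascript: pseudo-links.
        if href and href != "#" and not href.lower().startswith("javascript:"):
            self.hrefs.append(href)


def crawlable_links(html):
    """Return the standard hyperlinks found in the raw page code."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.hrefs


# Hypothetical page: a script-generated menu link plus two links in the markup.
page = """<script>document.write('<a href="products.html">Products</a>');</script>
<p><a href="about.html">About</a> <a href="javascript:openMenu()">Menu</a></p>"""

print(crawlable_links(page))
```

Text inside the script block is never parsed as markup here, which is why the Products link is missing from the result.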
- Image Map Menus
Some search engines do not recognize image maps.
Image maps produce a "clickable region" in a defined area of an image.
All links need to be standard hyperlinks in order for search engine spiders to follow them.
If your design calls for the use of image maps, follow the same advice as for JavaScript menus and provide an alternate path for spiders to follow.
- Incorrectly Configured robots.txt File
The robots.txt file is also referred to as the robots exclusion file.
The robots.txt file is optionally placed in the root directory of a site and is used to instruct search engine algorithms (also called spiders or robots) which directories and files they should NOT index.
The most common use is to exclude spiders from image directories or sensitive areas of a site.
An incorrectly configured robots.txt file can block your entire site from spiders.
For more information about how to configure a robots.txt file, visit robotstxt.org or do a Web search for "robots.txt".
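One way to catch this mistake before a spider does is to test the file yourself. The sketch below uses Python's standard urllib.robotparser module. The function name, the "ExampleBot" user agent and the example.com address are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser


def blocks_whole_site(robots_txt, user_agent="ExampleBot",
                      site="http://www.example.com/"):
    """Return True if these robots.txt rules keep user_agent off the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, site)


# Excludes spiders from the image directory only.
images_only = "User-agent: *\nDisallow: /images/\n"

# A single slash excludes every spider from the entire site.
everything = "User-agent: *\nDisallow: /\n"

print(blocks_whole_site(images_only))
print(blocks_whole_site(everything))
```

The two files differ by only a few characters, which is why this kind of error is easy to make and easy to overlook.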
- Lack of Spider Friendly URLs
Many search engines, including Google, have difficulty with Web page URLs with more than two or three name-value parameters.
Parameters are commonly used with dynamic Web sites to identify the particular information that needs to be pulled from a database to display for a user.
A search in Google for "SEO" produces the following URL:
http://www.google.com/search?sourceid=navclient&ie=UTF-8&oe=UTF-8&q=seo
The codes and values to the right of the question mark are the name-value parameters.
As a general rule, you need to keep the number of name-value parameters down to three or fewer.
If that is not feasible, there are inexpensive technology solutions available that can be used to eliminate this problem.
By the way, Google's own search URL would not be considered to be search engine friendly because it contains four name-value parameters.
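Counting parameters by eye gets tedious on a large dynamic site. As a minimal sketch, the following Python code counts the name-value parameters in a URL with the standard urllib.parse module. The function names are made up for this example, and the limit of three simply mirrors the general rule above rather than any published search engine threshold.

```python
from urllib.parse import parse_qsl, urlsplit


def count_parameters(url):
    """Count the name-value pairs to the right of the question mark."""
    query = urlsplit(url).query
    return len(parse_qsl(query, keep_blank_values=True))


def is_spider_friendly(url, max_parameters=3):
    """Apply the rule of thumb: at most max_parameters name-value pairs."""
    return count_parameters(url) <= max_parameters


google_url = (
    "http://www.google.com/search"
    "?sourceid=navclient&ie=UTF-8&oe=UTF-8&q=seo"
)
print(count_parameters(google_url))    # sourceid, ie, oe and q
print(is_spider_friendly(google_url))
```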
- Errors in HTML Code
This issue cannot be stressed enough.
Modern browsers are very forgiving and correct most coding errors as they render a page.
While a page may render properly, the price paid is reduced performance because it takes time to re-render a page.
Search engine algorithms sometimes show no such mercy and make no attempt to correct poorly written HTML code.
If they cannot understand the code, they just abandon the indexing of your site.
Don't think you are safe just because you use a WYSIWYG design tool.
Most errors are ignored by the design tools, and some design tools ignore Web coding standards and generally create sloppy code.
The gold standard for code validation is the W3C validator.
Read about it in Proper Web Site Testing - Web Page Validation.
- Excessive Nesting in HTML Code
Even though HTML code may be syntactically correct and can be validated using the W3C validator, it may still create problems with some search engines.
The biggest issue is the deep nesting of tables, which means nesting tables within tables within tables, etc.
WYSIWYG web page design tools are notorious for allowing a designer to do this.
The problem lies in the complexity of the code created.
I've seen otherwise good looking web pages that search engines would not index because tables were nested five or six deep.
As a general rule, try to keep the nesting of tables down to a single nesting (a table within a table).
Never nest more than three tables (a table within a table within a table).
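Nesting depth is hard to judge from a WYSIWYG view, but it is simple to measure from the code. The Python sketch below (the names are hypothetical, and it assumes every table has its closing tag) reports how many tables deep a page goes. A result of 2 means a table within a table, and 3 is the limit suggested above.

```python
from html.parser import HTMLParser


class TableDepth(HTMLParser):
    """Tracks how deeply <table> elements are nested."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # tables currently open
        self.max_depth = 0  # deepest nesting seen so far

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.depth += 1
            self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        if tag == "table" and self.depth > 0:
            self.depth -= 1


def max_table_nesting(html):
    """Return the deepest level of table nesting found in the page code."""
    parser = TableDepth()
    parser.feed(html)
    return parser.max_depth


# Hypothetical layout: a table within a table within a table.
nested = (
    "<table><tr><td>"
    "<table><tr><td>"
    "<table><tr><td>Deep</td></tr></table>"
    "</td></tr></table>"
    "</td></tr></table>"
)
print(max_table_nesting(nested))
```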
Top Rank Solutions is located near Phoenix in Mesa, Arizona,
and offers services for customers throughout the United States.