Manipulating Search Engine User Agents

A search engine optimisation specialist can tell a search engine user agent (webbot, spider or crawler) what to index and what to leave alone. The tools for this are an external robots exclusion protocol file, on-page robots meta tags, and nofollow attributes on individual links. This b-1st blog post describes some of the SEO commands you can use to control what a search engine indexes from your ecommerce website...

It is useful to instruct a search engine not to crawl and index a page that is under construction. Adding the nofollow attribute to a link takes the form <a href="http://www.healthstore.uk.com" rel="nofollow">Health Store</a> and asks search engine spiders not to follow that link. It does not stop the target page from being indexed if other links point to it.
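To check which links on a page carry the nofollow attribute, the markup can be scanned with Python's standard html.parser module. The following is a minimal sketch, not a description of how any particular search engine works: the class name NofollowLinkFinder, the helper split_links and the "/about.html" link are all hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser


class NofollowLinkFinder(HTMLParser):
    """Sorts <a href> links into followed and nofollowed lists (illustrative)."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return  # anchors without an href are not links
        # rel may hold several space-separated tokens, e.g. "nofollow noopener"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def split_links(html):
    """Return (followed, nofollowed) href lists for the given HTML string."""
    finder = NofollowLinkFinder()
    finder.feed(html)
    return finder.followed, finder.nofollowed


# The second link is a made-up example of an ordinary, followed link.
sample = (
    '<a href="http://www.healthstore.uk.com" rel="nofollow">Health Store</a>'
    '<a href="/about.html">About</a>'
)
followed, nofollowed = split_links(sample)
print("followed:", followed)
print("nofollowed:", nofollowed)
```

Splitting the rel value into tokens rather than comparing the whole string means a link marked with several rel values is still recognised as nofollow.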

The "nofollow" value can also be used in a robots meta tag placed in the head of a webpage. The following instructs search engines not to index the page and not to follow any of its links for indexing or weighting...

<meta name="robots" content="noindex, nofollow" />

The following tells a spider not to index this page, but allows it to follow the page's links, which can then be indexed and weighted...

<meta name="robots" content="noindex, follow">

The following instructs the spider to index this page but not to follow any links from it. It is most commonly used on message boards...

<meta name="robots" content="index, nofollow">
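The three robots meta tags above can be read back out of a page to confirm what a spider is being told. The sketch below again uses Python's standard html.parser module. The names RobotsMetaParser and robots_policy are hypothetical, and the code assumes that a missing directive means the default of index and follow; real crawlers recognise further values that are not handled here.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag (illustrative)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() != "robots":
            return
        # content is a comma-separated list such as "noindex, nofollow"
        for token in (attr_map.get("content") or "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_policy(html):
    """Return whether the page may be indexed and whether its links may be followed.

    Assumption: anything not explicitly forbidden is allowed.
    """
    parser = RobotsMetaParser()
    parser.feed(html)
    return {
        "index": "noindex" not in parser.directives,
        "follow": "nofollow" not in parser.directives,
    }


for tag in (
    '<meta name="robots" content="noindex, nofollow" />',
    '<meta name="robots" content="noindex, follow">',
    '<meta name="robots" content="index, nofollow">',
):
    print(tag, robots_policy(tag))
```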

The "robots exclusion protocol" uses a separate robots.txt file, located in the site's root directory, to tell spiders which directories they should not crawl.

The following instruction disallows NO directories, so any search engine may crawl the whole site.

User-agent: *
Disallow:

Conversely, the following instruction disallows ALL directories for every search engine.

User-agent: *
Disallow: /
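One way to test robots.txt rules before publishing them is Python's standard urllib.robotparser module. This is a minimal sketch under stated assumptions: the helper can_crawl, the user agent name "ExampleBot", the www.example.com URLs and the /under-construction/ directory are all hypothetical. The first two rule sets are the ones shown above, and the third is an invented example of blocking a single directory.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, url, user_agent="ExampleBot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The two rule sets from the text above.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Hypothetical rule set that blocks one directory only.
BLOCK_ONE_DIR = "User-agent: *\nDisallow: /under-construction/\n"

page = "http://www.example.com/under-construction/page.html"
home = "http://www.example.com/index.html"

print(can_crawl(ALLOW_ALL, page))      # empty Disallow blocks nothing
print(can_crawl(BLOCK_ALL, home))      # "/" matches every path
print(can_crawl(BLOCK_ONE_DIR, page))  # inside the blocked directory
print(can_crawl(BLOCK_ONE_DIR, home))  # outside the blocked directory
```

Keeping the rules as strings makes it easy to try a change locally. The same parser can also load a live file through its set_url and read methods.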
