We’ve all been told the importance of creating a website that is optimised for search engine bots and the importance of our page ranking in search engine results.
You might be wondering what, exactly, Search Engine Indexing is and how it affects your position in the search engine results pages (SERPs). Let’s examine just how search engines find and index web pages.
How Crawler Based Search Engines Work
Perhaps the most widely recognised crawler-based search engine is Google. These types of search engines automatically create their own lists of websites. Google sends out its cadre of spiders, or crawlers, to find pages on the internet, and they bring back specific information about each page.
The spider finds a page, reads it, and then follows links to other content on the site. After this initial visit, it comes back on a regular basis and makes note of any changes that have occurred.
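The find, read, and follow-links loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any real search engine's spider is built: the three-page `SITE` dictionary, the `LinkCollector` class, and the `crawl` function are all made up for this example, and the pages are held in memory so nothing is fetched from the internet.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical three-page site, held in memory so the sketch needs no network.
SITE = {
    "http://example.com/": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "http://example.com/about": '<a href="/">Home</a>',
    "http://example.com/blog": (
        '<a href="/about">About</a> <a href="http://other.example/">Out</a>'
    ),
}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, pages):
    """Visit pages breadth-first, following only links that lead to known pages."""
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        collector = LinkCollector()
        collector.feed(pages[url])  # "read" the page
        for href in collector.links:
            target = urljoin(url, href)  # resolve relative links
            if target in pages and target not in seen:
                seen.add(target)
                queue.append(target)
    return order


print(crawl("http://example.com/", SITE))
```

Starting from the home page, the sketch visits the home page, then the about page, then the blog page. It visits each page once and ignores the link that leads off the site.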
The spider’s findings are then added to an index. The index, sometimes referred to as the catalogue, is like a huge record of every web page the spiders have found. As changes are made, the listings in the index are updated.
Indexing of your site may not occur right away. A new site, or one with recent changes that affect relevancy for certain keywords, will not appear in search engine results until the information from the spider’s crawl is placed in the index.
The final step is the job of the search engine’s software program. It goes through all the new pages and changes added to the index and then matches them to searches. This is also when pages are ranked according to relevancy.
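To make the index and matching steps a little more concrete, here is a rough sketch of a tiny index that maps each word to the pages containing it, plus a search that ranks pages by how many query words they contain. The `PAGES` data and the `tokenize`, `build_index`, and `search` functions are assumptions made for illustration. Real search engines use far more sophisticated ranking than this simple word count.

```python
import re
from collections import defaultdict

# Hypothetical crawled pages: URL -> body text.
PAGES = {
    "/home": "Fresh coffee beans roasted daily",
    "/blog": "How to brew coffee at home",
    "/contact": "Visit our roasting shop",
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Rank URLs by how many query words they contain (a crude relevancy score)."""
    scores = defaultdict(int)
    for word in set(tokenize(query)):
        for url in index.get(word, ()):
            scores[url] += 1
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))


index = build_index(PAGES)
print(search(index, "brew coffee"))
```

A search for "brew coffee" lists `/blog` ahead of `/home`, because the blog page contains both words and the home page contains only one. A page added to `PAGES` after `build_index` has run will not show up in searches until the index is rebuilt, which mirrors the delay described above.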
Search Engine Indexing Factors
So exactly what information are the spiders adding to the index? There is no easy answer. Each major search engine looks for different website components and some, like Google, change their algorithms frequently.
These are the most common Search Engine Indexing factors:
1. Deep crawl – some search engines do a very thorough job of combing the internet and look at all the content on a site, which is known as a deep crawl. This is done even if the site has not been submitted manually. The major search engines generally use deep crawls, making it important that all content on your site is valuable, informative, and relevant.
2. Frame links – If your website is set up with frames, only some search engines will be able to index the content. There are ways to get around this problem, but they are beyond the scope of this article, so search the internet for help on this topic.
3. Robots.txt/Meta robots tags – these are two ways to prevent spiders from crawling certain pages and adding them to the search engine’s index. There are reasons for doing this, particularly if a certain page adds nothing to the page ranking value of your site.
4. Body text – the body text on any web page is included in the index, although some search engines do not crawl past a certain length on particularly long pages and most will skip over obvious spam.
5. Stop words – a pre-defined list of words excluded from the index to save space and optimise the speed of searches.
6. Meta tags – not all search engines index the meta description tags on a site. Leaving them out saves only a very small amount of time, though, so it’s best to include relevant meta tags for each page.
7. Meta keyword tags – some search engines index meta keyword tags, although Google has stated that it does not use them for ranking web search results.
8. Alt text – the descriptive words used to tag a photo or other alternate types of content are indexed by most search engines.
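Several of the factors above (robots.txt, meta robots tags, body text, stop words, meta descriptions, and alt text) can be illustrated together in one short sketch. It uses Python's standard `urllib.robotparser` and `html.parser` modules. The robots.txt content, the six-word stop list, the `ExampleBot` user agent, and the `PageExtractor` and `index_entry` names are all hypothetical, and actual search engines decide what to store in their own ways.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt and stop-word list, for illustration only.
ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}


class PageExtractor(HTMLParser):
    """Collects body words, image alt text, and meta tag values from one page."""

    def __init__(self):
        super().__init__()
        self.words, self.alt_text, self.meta = [], [], {}
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "body":
            self._in_body = True
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "img" and attrs.get("alt"):
            self.alt_text.append(attrs["alt"])

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False

    def handle_data(self, data):
        if self._in_body:
            # Keep body words, dropping anything on the stop-word list.
            self.words += [w for w in data.lower().split() if w not in STOP_WORDS]


def index_entry(url, html, robots):
    """Return what a spider might store for a page, or None if it is excluded."""
    if not robots.can_fetch("ExampleBot", url):
        return None  # blocked by robots.txt, so the page is never crawled
    page = PageExtractor()
    page.feed(html)
    if "noindex" in page.meta.get("robots", "").lower():
        return None  # crawled, but the meta robots tag asks not to be indexed
    return {
        "words": page.words,
        "alt_text": page.alt_text,
        "description": page.meta.get("description", ""),
    }


robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())

HTML = (
    '<html><head><meta name="description" content="Guide to tea"></head>'
    '<body><p>The history of tea</p><img src="pot.jpg" alt="A teapot"></body></html>'
)
print(index_entry("http://example.com/tea", HTML, robots))
print(index_entry("http://example.com/private/notes", HTML, robots))
```

The first call returns the page's body words with the stop words removed, its alt text, and its meta description. The second returns `None` because the URL falls under the disallowed `/private/` path.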
Now that you know how Search Engine Indexing works and what type of information search engines index, it should be easier to see which changes will result in the most positive SEO for your website.
Have a most outstanding day.
Sean Rasmussen
Aussie Internet Marketing
www.SeanSEO.com © 2008 - 2010


