What do search engines want? (Written in Mar 07 and moved here in July 08; I will update it at some point!)
This can depend on who you ask – but here are some general notes.
Yahoo:
- Original and unique content of genuine value
- Pages designed primarily for humans, with search engine considerations secondary
- Hyperlinks intended to help people find interesting, related content, when applicable
- Metadata (including title and description) that accurately describes the contents of a web page
- Good web design in general
Google:
- A site with a clear hierarchy and text links. Every page should be reachable from at least one static text link.
- Make sure that our TITLE and ALT tags are descriptive and accurate. Check for broken links and correct HTML.
- A site map for our users with links that point to the important parts of our site.
- A useful, information-rich site, with pages that clearly and accurately describe our content.
- Use text instead of images to display important names, content, or links. The Google crawler doesn’t recognize text contained in images.
MSN:
- Make sure that each page is accessible by at least one static text link.
- Keep the text that you want indexed outside of images. For example, if you want your company name or address to be indexed, make sure it is displayed on your page outside of a company logo.
- Add a site map. This enables MSNBot to find all of your pages easily. Links embedded in menus, list boxes, and similar elements are not accessible to web crawlers unless they appear in your site map.
- Keep your site hierarchy fairly flat. That is, each page should only be one to three clicks away from the home page.
- Keep your URLs simple and static. Complicated or frequently changed URLs are difficult to use as link destinations. For example, the URL www.example.com/mypage is easier for MSNBot to crawl and for people to type than a long URL with multiple extensions. Also, a URL that doesn’t change is easier for people to remember, which makes it a more likely link destination from other sites.
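The "one to three clicks" guideline can be checked with a breadth-first search over a site's internal links. This is only a rough sketch: it assumes the links have already been collected into a dictionary, and the page paths and the `click_depths` helper are made up for illustration.

```python
from collections import deque


def click_depths(links, home):
    """Return the minimum number of clicks from `home` to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest route
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical link graph: each page maps to the pages it links to.
site = {
    "/": ["/products", "/about"],
    "/products": ["/products/widgets"],
    "/products/widgets": ["/products/widgets/blue"],
    "/products/widgets/blue": ["/products/widgets/blue/large"],
}

depths = click_depths(site, "/")
# Pages more than three clicks from the home page.
too_deep = sorted(page for page, depth in depths.items() if depth > 3)
print(too_deep)
```

Pages that never show up in `depths` at all are not reachable by following links from the home page, which is worth checking too.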
Ask:
- Sites should load quickly and be polished, easy to read and easy to navigate.
- Sites should be well maintained and updated regularly.
- Sites should offer thorough and accurate information that is highly relevant to a user’s search term(s).
- Sites should offer additional links or information related to a user’s search term(s).
- Sites should demonstrate credibility by providing author and source citations and contact information.
You can get all of that directly from their help sections, or by deduction!
E-commerce sites
There are a few additional guidelines for transactional sites or sites with secure areas:
- Sites should provide secure transactions (preferably by SSL/SET)
- Sites should disclose policies for customer privacy, returns, exchanges and other customer concerns
- Sites should offer many types of the product being sought, relevant brands and/or an appropriate range of products
- Sites should provide adequate product information
- Sites should offer customer service by phone, preferably 24 hours a day
How do we give search engines what they want:
Use the words users would type to find our pages, and make sure that our site actually includes those words within it. Understand your customer and use their language. Isn’t this marketing 101?
Dynamic pages (i.e., the URL contains a “?” character) – not every search engine spider crawls dynamic pages as well as static pages. It helps to keep the parameters short and the number of them few. Seems obvious – but content management systems and cookies are usually the first culprits in ruining this!
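A quick way to spot URLs with too many parameters is to count them. This is a small sketch using Python's standard `urllib.parse`. The threshold of two parameters is my own assumption, not a limit published by any search engine, and the example URLs are made up.

```python
from urllib.parse import parse_qsl, urlsplit

MAX_PARAMS = 2  # assumed threshold for illustration only


def query_params(url):
    """Return the (name, value) pairs found after the "?" in a URL."""
    return parse_qsl(urlsplit(url).query, keep_blank_values=True)


def is_crawl_friendly(url, max_params=MAX_PARAMS):
    """True if the URL is static or carries only a few parameters."""
    return len(query_params(url)) <= max_params


for url in (
    "http://www.example.com/mypage",
    "http://www.example.com/page?id=7",
    "http://www.example.com/page?id=7&cat=3&sort=asc&sessionid=abc123",
):
    print(url, is_crawl_friendly(url))
```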
Use a text browser such as Lynx to examine the site, because most search engine spiders see your site much as Lynx would. If fancy features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash keep you from seeing all of our site in a text browser, then search engine spiders may have trouble crawling your site.
Allow search bots to crawl the sites without session IDs or arguments that track their path through the site. These techniques are useful for tracking individual user behavior, but the access pattern of bots is entirely different. Using these techniques may result in incomplete indexing of our sites, as bots may not be able to eliminate URLs that look different but actually point to the same page. Poor accessibility is the single quickest way to fail in search engines!
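To show what "URLs that look different but actually point to the same page" means in practice, here is a rough sketch that strips tracking parameters so those URLs collapse to one address. The parameter names in `TRACKING_PARAMS` are assumptions for illustration; check what your own platform actually appends.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session/tracking parameter names; adjust for your own site.
TRACKING_PARAMS = {"sessionid", "sid", "phpsessid"}


def canonical_url(url):
    """Return the URL with session/tracking parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in TRACKING_PARAMS
    ]
    # Rebuild the URL without the fragment and without the dropped parameters.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


a = canonical_url("http://www.example.com/page?id=7&sid=abc123")
b = canonical_url("http://www.example.com/page?id=7&sid=xyz789")
print(a == b)
```

A bot cannot know that `sid` is meaningless, so without this kind of clean-up on the site's side it sees two separate pages.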
Make sure our web server supports the If-Modified-Since HTTP header. This feature allows the web server to tell Google whether the content has changed since Google last crawled the site. Supporting this feature saves bandwidth and overhead.
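Most web servers handle If-Modified-Since for static files on their own, but dynamic pages often need the logic written by hand. This is a minimal sketch of the decision only, assuming the application knows when the page last changed. The function name `conditional_status` is hypothetical and not part of any framework.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(last_modified, if_modified_since):
    """Return 304 if the page is unchanged since the header date, else 200."""
    if not if_modified_since:
        return 200  # no conditional header: send the full page
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= since:
        return 304  # Not Modified: no body needs to be sent
    return 200


page_changed = datetime(2007, 3, 1, 12, 0, 0, tzinfo=timezone.utc)
header = format_datetime(page_changed, usegmt=True)  # what a crawler sends back
print(conditional_status(page_changed, header))
```

When the answer is 304 the server sends headers only, which is where the bandwidth saving comes from.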
The robots.txt file on the web server tells crawlers which directories can or cannot be crawled. Make sure it’s current so that we don’t accidentally block the Googlebot crawler. We can test our robots.txt file to make sure we’re using it correctly with the robots.txt analysis tool available in Google Sitemaps or Webmaster Central (many other tools are available).
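You can also check a robots.txt file locally with Python's standard `urllib.robotparser`. This sketch uses a made-up robots.txt and made-up URLs. Real crawlers may interpret edge cases differently from this parser, so treat it as a sanity check rather than the final word.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, agent="Googlebot"):
    """True if the given user agent may fetch the URL under these rules."""
    return parser.can_fetch(agent, url)


print(allowed("http://www.example.com/products/"))
print(allowed("http://www.example.com/admin/login"))
```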
Make sure that the content management system can export our content in a way that lets search engine spiders crawl our sites.
Specifically:
- Avoid hidden text or hidden links.
- Don’t employ cloaking or sneaky redirects.
- Don’t send automated queries to Google.
- Don’t load pages with irrelevant words.
- Don’t create multiple pages, subdomains, or domains with substantially duplicate content.
- Don’t create pages that install viruses, trojans, or other badware.
- Avoid “doorway” pages created just for search engines, or other “cookie cutter” approaches such as affiliate programmes with little or no original content.
- If your site participates in an affiliate program, make sure that your site adds value. Provide unique and relevant content that gives users a reason to visit your site first.
How to make a site search engine friendly:
- Give visitors the information they’re looking for
- Provide high-quality content on pages, especially the homepage. This is the single most important thing to do. If your pages contain useful information, their content will attract many visitors and entice webmasters to link to the sites naturally. In creating a helpful, information-rich site, write pages that clearly and accurately describe vacation rentals. Utilise keyword research findings by using keywords on the page.
Links help crawlers find our site and can give it greater visibility in search results. When returning results for a search, Google combines PageRank (their view of a page’s importance) with sophisticated text-matching techniques to display pages that are both important and relevant to each search. Google counts the number of votes a page receives as part of its PageRank assessment, interpreting a link from page A to page B as a vote by page A for page B. Votes cast by pages that are themselves “important” weigh more heavily and help to make other pages “important.”
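To make the "votes" idea concrete, here is a toy version of the textbook PageRank calculation. It is a simplified sketch of the published idea and certainly not what Google actually runs. The damping factor of 0.85, the iteration count and the three-page link graph are all assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of each page's score."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote equally between the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical graph: A and C both link to B, and B links back to A.
ranks = pagerank({"A": ["B"], "C": ["B"], "B": ["A"]})
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph B collects two votes and ends up on top, A gets a single vote from the now-important B and comes second, and C, which nobody links to, comes last.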
Keep in mind that Google’s algorithms can distinguish natural links from unnatural links. Natural links to your site develop as part of the dynamic nature of the web when other sites find your content valuable and think it would be helpful for their visitors. Unnatural links to your site are placed there specifically to make your site look more popular to search engines. Some of these types of links (such as link schemes and doorway pages) are covered in Google’s webmaster guidelines.
Only natural links are useful for the indexing and ranking of our sites.
Make your site easily accessible
Build our sites with a logical link structure. Every page should be reachable from at least one static text link.
Use a text browser, such as Lynx, to examine your site. Most spiders see your site much as Lynx would. If features such as JavaScript, cookies, session IDs, frames, DHTML, or Macromedia Flash keep you from seeing your entire site in a text browser, then spiders may have trouble crawling it.
Consider creating static copies of dynamic pages. Although the Google index includes dynamic pages, they make up a small portion of it. If you suspect that your dynamically generated pages (such as URLs containing question marks) are causing problems for the crawler, you might create static copies of these pages. If you create static copies, don’t forget to add your dynamic pages to your robots.txt file to prevent Google from treating them as duplicates.
Things to Avoid
Don’t fill your page with lists of keywords, attempt to “cloak” pages, or put up “crawler only” pages. If your site contains pages, links, or text that you don’t intend visitors to see, Google considers those links and pages deceptive and may ignore your site.
Don’t use images to display important names, content, or links. Google’s crawler doesn’t recognize text contained in graphics. Use ALT tags.
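A quick way to find images that are missing ALT text is to scan the HTML with Python's standard `html.parser`. This is a rough sketch with a made-up snippet of HTML, and the `MissingAltFinder` class is my own name for illustration. Note that it also flags images with a deliberately empty ALT, which can be fine for purely decorative graphics.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> tag with no (or blank) alt text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            if not (attributes.get("alt") or "").strip():
                self.missing.append(attributes.get("src", "(no src)"))


# Hypothetical page fragment.
html = '<img src="logo.gif"><img src="map.gif" alt="Map of our office">'

finder = MissingAltFinder()
finder.feed(html)
print(finder.missing)
```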
Don’t create multiple copies of a page under different URLs.
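One simple way to look for copies of a page under different URLs is to hash the page text and group URLs that share a hash. This sketch only catches pages that are identical after whitespace and case are normalised, so near-duplicates need something smarter. The URLs and page text are made up, and it assumes the page bodies have already been fetched.

```python
import hashlib
from collections import defaultdict


def find_duplicates(pages):
    """Group URLs whose text is identical after normalising whitespace and case."""
    groups = defaultdict(list)
    for url, body in pages.items():
        normalised = " ".join(body.split()).lower()
        digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical pages: URL -> extracted text.
pages = {
    "http://www.example.com/widgets": "Blue widgets for sale",
    "http://www.example.com/widgets?ref=home": "Blue  widgets for sale\n",
    "http://www.example.com/about": "About our company",
}

duplicates = find_duplicates(pages)
print(duplicates)
```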
And a new thing – think about how you present your search results. Ideally these should not be treated as many, many separate pages. If they add value then that is OK. But don’t think that multi-criteria search results generating hundreds of thousands of very similar pages is a great thing.
Think about what adds value and is easy for your users and then SEO should be taken care of.
Who is responsible for SEO in your company?
I say everyone!