Website Helpers.com

Articles, tips, and resources for webmasters

a project by Michael Bluejay


*How to get good search engine rankings*
Part 3: Submission & Spidering

Submission -- Filling out a form on a search engine's site to ask them to add your site to their index. This is unnecessary if any site already in the index links to yours, or if your site is already indexed.

Indexed -- How we refer to sites that are in a search engine's database. If your site could come up in a search, even if it's on page 1000, you're indexed.

Spider -- An automated robot program that follows the links from the pages on the web and gathers the data. Google's is called Googlebot.

Crawling (aka spidering) -- The act of a spider following links and gathering the data from the pages it visits.

Sandbox -- The imaginary place where Google is said to hold new sites before they start ranking well in the SERPs (search engine results pages) for competitive phrases -- if they ever start ranking well at all.

Submission

Submission means filling out a form on a search engine's site to invite them to add your site to their index. What many people don't realize is that this is unnecessary. Engines find what's on the web by following links. As long as there's a link to your site from any site that's already in the search engines, the engines will find your site. If you don't have any incoming links you're not going to rank well anyway.
Once your site is listed in an engine you're in for good (unless you get kicked out for trying to fool them, as covered below under Black Hat SEO). There's never any reason to resubmit your site once it's already in. Resubmission is a waste of time.

The overwhelming majority of search traffic comes from the top five or so search engines. Some companies will offer to submit your site to "thousands" of search engines. This is a waste of money. If your site is linked to from anywhere, you'll get in all the search engines that matter, automatically, for free.

Search engines use automated robots to follow the links around the web and grab the content from the web pages they find. The robots are called spiders, and when they follow links they're crawling the web (also called spidering). Google's spider is called Googlebot, and you'll see it listed as the user agent in your server logs. Once a search engine has gathered a site's data and analyzed it, the site is said to be indexed. To see whether your site is in the Google index, search Google for site:yourdomain.com.
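If you want to see which of your pages Googlebot has actually requested, you can scan your server logs for its user agent. Here's a rough Python sketch. It assumes your server writes the common Apache-style "combined" log format, where the user agent is the last quoted field on each line. The function name and the sample log lines are made up for illustration, and keep in mind that anyone can fake a user agent string.

```python
import re
from collections import Counter

# Assumption: "combined" log format, with the user agent as the last
# double-quoted field. Adjust the pattern if your server logs differently.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines):
    """Count requests per path whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "googlebot" in match.group("agent").lower():
            hits[match.group("path")] += 1
    return hits

# Made-up sample lines (documentation-only IP addresses).
sample_log = [
    '198.51.100.4 - - [10/Oct/2004:13:55:36 -0700] "GET /seo/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.7 - - [10/Oct/2004:13:56:01 -0700] "GET /seo/ HTTP/1.1" 200 2326 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '198.51.100.4 - - [10/Oct/2004:14:02:11 -0700] "GET /seo/keywords.html HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]

for path, count in googlebot_hits(sample_log).items():
    print(path, count)
```

To run it on a real log, pass an open file instead of the sample list, e.g. googlebot_hits(open("access.log")), where access.log stands in for wherever your host keeps its logs.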

New sites don't always get listed right away. In some cases it can take several months for a new site to show up in the SERPs. Even when a site gets in the index, many believe that Google puts new sites "in the sandbox" and won't let them rank well for the first few months. Jennifer Laycock has a better explanation: New sites can rank fine if there's not much competition for that topic, but Google will assume that a new site in an established, competitive market isn't any better than the tons of sites already there, unless that site proves itself to be superior. (More in this discussion thread.) The sandbox issue has been discussed on WebmasterWorld ad nauseam. (Searching WebmasterWorld for all pages mentioning the sandbox results in nearly 1000 hits at present.) Here's a small sampling of threads from 2004: Sept. 8, Nov. 20, Nov. 23, Nov. 29, Dec. 2, Dec. 9.

The spider keeps on comin'

Once a site is in a search engine, the engine's spider will periodically revisit it and re-index it from scratch. The engines understand that the Internet is dynamic and changing, so they constantly re-evaluate the pages in their indices. So not only will every engine probably find your site on its own the first time, it will keep visiting it over and over again on its own, too.
Google appears to visit most pages in its database about once a month, though some pages wait longer and others get visited every day. Sites with a higher PageRank (i.e., sites that have a lot of inbound links from other sites) get spidered more frequently than sites with a low PR. And sites which update more frequently get spidered more often than sites which rarely make updates. You can try to invite more frequent spider visits by updating your pages more often, even if the changes themselves are minor, though the advantage of doing so is questionable. More frequent visits won't necessarily let you test your ranking ideas through trial and error any faster, because even if an engine spiders your new content right away, it may take weeks or months to work out how those changes should affect your rank. And of course, more frequent spider visits by themselves do nothing for your rankings.

Removing barriers to spidering

Search engines find pages by following HTML links. As long as the pages on your site are linked up properly the engines will find them. But if your pages aren't linked properly, your pages will never make it into the index. Here are some typical things that can cause an engine to fail to find your pages.
1. Links are done in JavaScript. Many engines don't follow links done in JavaScript, such as those found in drop-down menus. If you have JavaScript links, make certain you also have plain text links somewhere on the page. JavaScript links don't hurt as long as the plain links are there too.

2. Links are done in Flash. Many engines can't follow links in Flash. If you have Flash links, make certain you also have text links somewhere on the page as well.

3. Orphaned pages. If you forget to link to a certain page from at least one other page, that page is said to be orphaned. An engine can't find it because it can't follow a link to it. Make sure every page on your site is linked to from at least one other page.

4. Dynamic pages. Search engines can generally follow dynamic URLs (those with a question mark) as long as they have only one or two parameters. Three parameters -- hard to say. Four or more parameters is probably pushing it. But even if the engines can follow dynamic URLs, that doesn't necessarily mean that those pages will rank well. Two noted experts stated flatly in 2003 that pages with dynamic URLs rank worse than those without. It's unclear whether that's true today, but many webmasters aren't taking chances: They're using the mod_rewrite module of the Apache web server software to turn dynamic URLs into static ones. There are many threads on WebmasterWorld about how to do this.

If you prefer not to turn your dynamic URLs into static ones, you should at least put the most important parameters in your URLs first. There's some feeling that the engines may try the first two or three parameters and ignore the rest. For example, if your URL were:
http://domain.com?language=Eng&user=4873&style=15&article=238
Then instead try:
http://domain.com?article=238&language=Eng&user=4873&style=15

Incidentally, here's an article that gives other reasons for removing the query string from URLs.

5. Site is down. An engine can't index a site if it's down. Make certain you use a reliable webhost. (You can also use monitoring software or subscribe to an automated monitoring service to email, phone, or page you if your site goes down.) It's unlikely that you'll be removed from an engine just because your site was down once when they tried to visit, but if your site is down for several days that could spell bad news. An engine doesn't want to list your site in the SERPs if visitors can't actually get to it.
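Going back to item 4 for a moment: if your pages are generated by a script, putting the important parameters first is easy to automate. Here's a minimal Python sketch using only the standard library. The function name prioritize_params is made up for illustration, and which parameters count as "important" is your call.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def prioritize_params(url, important):
    """Move the named query parameters to the front, keeping the rest in order."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Important parameters first, in the order the caller listed them.
    front = [pair for name in important for pair in pairs if pair[0] == name]
    # Everything else keeps its original relative order.
    rest = [pair for pair in pairs if pair[0] not in important]
    return urlunsplit(parts._replace(query=urlencode(front + rest)))

print(prioritize_params(
    "http://domain.com?language=Eng&user=4873&style=15&article=238",
    ["article"],
))
```

Run on the first example URL from item 4, this moves article=238 to the front and leaves the other three parameters in their original order.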

Images and Frames. Search engine spiders can follow image links and links in framesets just fine -- despite what you sometimes read on the net.
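To check your own site for these barriers, one rough approach is to collect only the links a simple HTML-following spider would see (plain links, image links, and frame sources, but not JavaScript) and then list the pages that nothing links to. The Python sketch below does that for a tiny made-up site held in a dictionary. The class and function names are hypothetical, it assumes every link is a site-absolute path like /about.html, and it only checks "linked from somewhere", not whether a page can actually be reached from your home page.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects targets of plain HTML links: <a>/<area> href and <frame>/<iframe> src."""
    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("a", "area") and attrs.get("href"):
            self.links.add(attrs["href"])
        elif tag in ("frame", "iframe") and attrs.get("src"):
            self.links.add(attrs["src"])

def find_orphans(pages, home="/index.html"):
    """Return pages that no plain HTML link points to (the home page is exempt)."""
    linked = {home}
    for html in pages.values():
        collector = LinkCollector()
        collector.feed(html)
        linked |= collector.links
    return sorted(set(pages) - linked)

# Made-up four-page site: path -> HTML source.
pages = {
    "/index.html": '<a href="/about.html"><img src="logo.gif" alt="About us"></a>'
                   '<a href="#" onclick="go(\'/news.html\')">News</a>',
    "/about.html": '<frameset><frame src="/contact.html"></frameset>',
    "/contact.html": "<p>Write to us.</p>",
    "/news.html": "<p>Reachable only through the JavaScript link.</p>",
}

print(find_orphans(pages))
```

In this sample the image link and the frame both count as links, so /about.html and /contact.html are found, while /news.html shows up as orphaned because the only way to it is through JavaScript.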

Can the spider read your page? Remember that even if a search engine can find a page, it might not be able to figure out what that page is about. Spiders eat words, so they have to be able to see the words on your site in order to index them. Spiders can't read the text that's in graphics. Any text that you want the spiders to read and index should be written out as text. At the very least, put any text that appears in graphics into the images' ALT attributes. Spiders are getting better at reading the text that's in Flash but they're still not very good at it. Make sure any Flash page you have has a "Skip this intro..." link that takes visitors (and spiders) to the text-rich content of your site.
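For a rough idea of the words a spider can pull from a page, you can strip the page down to its text plus any ALT text on images. This Python sketch is only an approximation, since no engine publishes exactly how it reads a page. The class name, function name, and sample page are made up for illustration.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Gathers page text plus image ALT text, skipping script and style contents."""
    def __init__(self):
        super().__init__()
        self.words = []
        self._skip = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.words.extend(alt.split())

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.words.extend(data.split())

def spider_text(html):
    """Return the words found in the page's text and ALT attributes."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.words)

# Made-up page: one image with ALT text, one without.
page = (
    '<h1><img src="banner.gif" alt="Acme Bicycle Repair"></h1>'
    '<img src="slogan.gif">'
    '<p>Tune-ups and flat fixes in Austin.</p>'
)

print(spider_text(page))
```

The banner's words survive because they're in the ALT attribute. Whatever the slogan graphic says is lost, because it has no ALT text and the words exist only as pixels.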

Now go to Part 4: Choosing good keywords


