Search engines are supposed to search millions of documents for queries typed by users and return results in no time. This simple-looking task is more complicated than it appears: search engines cannot search the live web for each query. They can only search their own database, hosted on fast servers, in order to return results as quickly as users expect.
In order to build a searchable database, search engines need to crawl the web on a daily basis and turn unstructured data into structured data. A simple example: crawl web pages, then save each title tag in a column called TITLE-TAGS and each meta keywords tag in a column called META-KEYWORDS. That is probably how early search engines started. With the resource and technology limitations of the time, the easiest way to build a search engine was to crawl the web, save a few tags from each page without any content analysis, and make them available for search. Imagine how easy it would be to manipulate such a search engine 🙂 All you need to do is stuff your title tag and keywords tag with lots of keywords, wait to get indexed, and voila, you are #1 for terms you now dream of hitting #100 for.
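To make the example concrete, here is a minimal sketch of what that first-generation indexing step might have looked like: pull out only the title tag and meta keywords tag and store them in a record, ignoring the page body entirely. The sample page and column names (TITLE-TAGS, META-KEYWORDS, taken from the example above) are purely illustrative, not any real engine's schema.

```python
from html.parser import HTMLParser

class TagExtractor(HTMLParser):
    """Collects the title text and meta keywords; body content is ignored."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.keywords = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and attrs.get("name", "").lower() == "keywords":
            self.keywords = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

# A made-up crawled page: note that only the head tags matter to this "engine".
page = ('<html><head><title>Cheap Flights</title>'
        '<meta name="keywords" content="flights, cheap flights, travel">'
        '</head><body>The actual content is never analysed.</body></html>')

parser = TagExtractor()
parser.feed(page)
record = {"TITLE-TAGS": parser.title, "META-KEYWORDS": parser.keywords}
print(record)
```

Since the body is never looked at, stuffing those two tags is all it takes to game a ranking built on this record, which is exactly the weakness described above.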
It did not take spammers long to figure out how to rank well in the first generation of search engines, and the quality of search results started to drop. Search engines tried (not that hard) to weed out spam by either improving their content analysis or using trusted sources of data (Yahoo relied on the Yahoo Directory at some point for that), but a trusted source back then meant a human-edited one, which couldn't keep up with the growth of the web.
Later on, Google’s founders said: there must be something better than that to control search quality, and they came up with two great concepts:
1- PageRank, which means not all pages should be scored the same; a page's score must be based on how many links point back to it (internally or externally, but more value goes to external links)
2- The best way to check a document's relevancy is not to check what the owner of the document says about it (title tag and keywords tag), but what other people say about this document (anchor text in external links)
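The first concept can be sketched in a few lines. This is a toy power-iteration version of the PageRank idea, nothing like Google's production system: each page repeatedly passes a share of its score along its outgoing links, with a damping factor, so pages that many others link to accumulate the highest scores. The domain names are invented for illustration.

```python
# Toy link graph: page -> pages it links out to (hypothetical URLs).
links = {
    "a.com": ["b.com", "c.com"],
    "b.com": ["c.com"],
    "c.com": ["a.com"],
    "d.com": ["c.com"],   # d.com links out but nobody links to it
}

damping = 0.85
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}

# Power iteration: pass each page's score along its outlinks and repeat.
for _ in range(50):
    new_rank = {p: (1 - damping) / len(pages) for p in pages}
    for page, outlinks in links.items():
        share = damping * rank[page] / len(outlinks)
        for target in outlinks:
            new_rank[target] += share
    rank = new_rank

# c.com is linked to by three different pages, so it ends up on top,
# while d.com, with no inbound links, gets only the baseline score.
best = max(rank, key=rank.get)
print(best, round(rank[best], 3))
```

Notice that a page cannot raise its own score by what it says about itself; only inbound links move it up, which is the whole point of the concept.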
Great concepts!!! The whole crowd of webmasters/web site owners/web 2.0 users turned out to be a quality assurance force working for Google!! No need for human editors at all; just get lots of powerful machines to crawl the web on a regular basis and analyse pages and links.
That worked really well for Google and made it stand out from the search engine crowd, as the quality of its results increased significantly.
Many people may have figured out how it works, but it is not as manipulable as the title-tags/keyword-tags approach.
Lots of people in the SEO community keep asking why Google is still relying on anchor text to measure relevancy. The answer is simply that there is no alternative YET: Google needs to analyse billions of documents and score them using machines without any human intervention, and it is hard to do that without using link/anchor data.
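To see why anchor data is such a convenient machine-only relevance signal, here is a minimal sketch of the second concept: index each page under the words *other people* used when linking to it, then score query matches against those anchors. The link tuples and URLs are made up for the example; real systems are vastly more sophisticated.

```python
from collections import defaultdict

# Hypothetical inbound links: (source page, anchor text, target page).
inbound_links = [
    ("blog.example",  "best pizza recipes",  "pizza.example/recipes"),
    ("forum.example", "pizza recipes",       "pizza.example/recipes"),
    ("news.example",  "pizza dough guide",   "pizza.example/dough"),
]

# Index each target page under the words used in links pointing to it.
anchor_index = defaultdict(set)
for source, anchor, target in inbound_links:
    for word in anchor.lower().split():
        anchor_index[word].add(target)

def search(query):
    """Rank pages by how many query words appear in their inbound anchors."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for page in anchor_index.get(word, ()):
            scores[page] += 1
    return sorted(scores, key=scores.get, reverse=True)

print(search("pizza recipes"))
```

Nothing the page says about itself enters the index; every ranking decision is made from what others say, and it runs without a human in the loop. That is also exactly why link trading is the attack surface.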
Remember also that Google has a web spam team whose main focus is fighting link spam; doesn't that tell you anything? Simply put, Google has no near-term alternative to links/anchors, so the solution was to create a web spam team that keeps going after link traders and keeps reminding/scaring the SEO community not to manipulate links.
Google must also be given big credit for the several enhancements that have been made to the PageRank formula, along with many other quality control improvements.
Finally, in this post I did not try to encourage buying links; it is more about explaining why Google is still relying on links/anchors and will be for a long time.