Indexing is how search engines organize the information they collect so that it can be retrieved quickly and accurately later, during a user's search. It works much like the index of a book: by consulting those few pages, we can locate the content we want without reading the whole book. The purpose of storing an index is to optimize the speed and performance of finding relevant documents for a search query. With an index, a collection of 1 million documents can be queried in seconds. Without one, a search engine would have to scan every word of all million documents for each query, which would take considerable time and computing power.
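To make the book-index analogy concrete, here is a minimal sketch of an inverted index, one common structure for this kind of lookup. It is an illustration only, not a description of how any particular search engine works; the function names `build_index` and `search` and the sample documents are made up for this example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start from the documents matching the first word, then narrow down.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "search engines build an index",
    2: "a book index lists page numbers",
    3: "engines crawl pages before indexing",
}
index = build_index(docs)
print(search(index, "index"))
print(search(index, "search index"))
```

Each query word costs one dictionary lookup instead of a scan over every document, which is where the speed-up described above comes from.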
Google recently launched Index Status, a report that charts the number of indexed pages. The chart does not show duplicate URLs. A URL may not be indexed for several reasons, such as:
- It redirects to another page.
- It is canonical to another page. Canonicalization deals with web content that is reachable at more than one URL. Having multiple URLs for the same web page creates a problem for the search engine, namely which URL should be shown in search results.
- Google's algorithms detected that its content is similar to that of another URL and selected the other URL to represent the content.
- It is listed under blocked URLs.
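The canonicalization problem mentioned in the list can be sketched in a few lines. The normalization rules below (dropping a leading `www.`, trailing slashes, fragments, and `utm_` tracking parameters) are assumptions chosen for illustration, not Google's actual rules, and the names `canonical_key` and `group_duplicates` are hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_key(url):
    """Reduce a URL to a normalized form (illustrative rules only)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Treat "/page/" and "/page" as the same path; keep "/" for the root.
    path = parts.path.rstrip("/") or "/"
    # Drop assumed tracking parameters and sort the rest for a stable order.
    query = urlencode(sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if not key.startswith("utm_")
    ))
    # The empty last element discards any "#fragment".
    return urlunsplit((parts.scheme.lower(), host, path, query, ""))


def group_duplicates(urls):
    """Group URLs that share the same normalized form."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_key(url)].append(url)
    return dict(groups)


urls = [
    "http://www.Example.com/page/",
    "http://example.com/page?utm_source=news",
    "http://example.com/page#top",
    "http://example.com/other",
]
for key, members in group_duplicates(urls).items():
    print(key, len(members))
```

In this sketch the first three URLs collapse into one group, which mirrors the situation in which a search engine has to pick a single URL to show in its results.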
Revealing this index data is a good move by Google. Webmasters can now identify redirected and canonical pages and see how many of their pages Google is indexing.