This article aims to help you get ALL your pages (every one of them!) indexed by Google and by the other search engines as well. This can be a very frustrating and difficult task, requiring the following steps:
=> Find out how many pages are indexed – accurately!!!!
=> List all the individual pages that are NOT indexed
=> Discover why an individual page has not been indexed
=> Discover what you can do to get a page that is NOT indexed to be indexed
How to find out how many pages have been indexed and get a list of ones NOT indexed?
This task may seem very simple – but it is NOT. In my experience:
=> none of the various SEO tools and software designed to do this are accurate. Trust me! I have tried just about all of them and in my experience they simply do not work, and others have reported similar findings.
=> the site:www.mysite.com search (entered into the Google search box with ‘mysite’ replaced by your domain name) and its variations and options are not reliable, accurate or up to date.
=> the inurl:mysite.com search and its versions, options and variations are similarly inaccurate.
=> the information you can see on Google Webmaster Central about your site, including the sitemap data on the number of pages indexed, is neither accurate nor up to date.
None of these tools works! OK, well, what can we do about it?
Sorry, the only reliable way that I know is to check every individual page.
This can be done using one of the following methods:
=> Use the site:www.mysite.com test for each individual page e.g. site:www.mysite.com/mypage.html
=> Use the cache:www.mysite.com test for each individual page e.g. cache:www.mysite.com/mypage.html
=> Conduct a Google search for the page e.g. search for “mysite.com mypage”
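As a rough illustration of the first two checks, here is a minimal Python sketch that turns a page URL into the site: and cache: queries described above, plus a search link you can open by hand. The function names page_checks and search_url are my own, chosen for illustration. The sketch only builds the query text; it does not send anything to Google and cannot tell you whether the page is indexed.

```python
from urllib.parse import quote_plus


def page_checks(page_url):
    """Build the per-page 'site:' and 'cache:' queries for one page URL."""
    # Drop the scheme if there is one, so that both
    # "http://www.mysite.com/mypage.html" and "www.mysite.com/mypage.html" work.
    bare = page_url.split("://", 1)[-1]
    return [f"site:{bare}", f"cache:{bare}"]


def search_url(query):
    """Turn a query into a Google search link to open manually in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    for query in page_checks("http://www.mysite.com/mypage.html"):
        print(query, "->", search_url(query))
```

You would still need to look at each result yourself, which is why this is slow for a large site.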
OK, this works but it is a slow and tedious task for my sites which have over 2000 individual pages, each of which is vital for the business plan.
Fortunately there is one tool that does this – there are others I am sure.
Andy Blacks Index Checker -
This tool is also slow, as you have to confine the search to about 40 pages at a time and only do about three blocks of 40 pages each day, otherwise it gets blocked. BUT IT DOES WORK if you are determined and patient. It works because it does a ‘site:’ and a ‘cache:’ test for each page on the list.
You can copy the list of not-indexed pages from each test into a file and eventually produce a complete list and count them.
Voila! You now know how many of your pages have been indexed and importantly you have a list of pages that have NOT been indexed!
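The bookkeeping for this can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the batch size of 40 and the limit of three batches a day come from the figures above, while the function names (make_batches, days_needed, merge_not_indexed) are made up for this example and have nothing to do with the checker tool itself.

```python
import math

BATCH_SIZE = 40       # pages per run, as suggested above
BATCHES_PER_DAY = 3   # runs per day before the checker risks being blocked


def make_batches(urls, size=BATCH_SIZE):
    """Split the full page list into blocks small enough for one run."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def days_needed(n_pages, size=BATCH_SIZE, per_day=BATCHES_PER_DAY):
    """Estimate how many days it takes to check every page once."""
    batches = math.ceil(n_pages / size)
    return math.ceil(batches / per_day)


def merge_not_indexed(results):
    """Combine the not-indexed lists from each run, dropping repeats but keeping order."""
    seen = set()
    merged = []
    for batch in results:
        for url in batch:
            url = url.strip()
            if url and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


if __name__ == "__main__":
    print(days_needed(2000))
```

For a site of 2000 pages this works out to 50 batches, or about 17 days at three batches a day. The length of the merged list is your count of pages NOT indexed.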
How to get a page NOT indexed to be indexed
This can be a very frustrating process!
OK, you have followed the steps above and found pages that have NOT been indexed, and you now want to get them indexed.
How do you do it?
Firstly you need to find out WHY your page has NOT been indexed. Once again this can be extremely hard and frustrating, as Google does not tell you why, and the subject is very confusing with nothing much to go on. Here are some suggestions:
=> It could be a timing problem. On a relatively large site with over 1000 pages, it may take one or more weeks, and perhaps longer for a low-PR site, to get a new individual page indexed.
=> If you tend to be paranoid you can worry that your site has been penalised in some way, or perhaps that Google sets limits on the number of pages on a website that it indexes, perhaps related to PR.
=> Maybe Google will never index more than 80-90% of the pages on a website?
Once you have got beyond this paranoia phase and you still want to try to get your pages indexed, there are several things you can try, based on the assumption that there is something wrong with the page that prevents it being indexed. The other assumption is that you must be on the right track if the majority of your other pages have been indexed.
From experience, the reasons why an individual page has not been indexed, and the possible fixes for the problems, are the following:
=> The title and description tag information are too similar to another page on your site. THE FIX is to change the title and description so that they are unique and different.
=> The page contains duplicate text. THE FIX is to use the pro version of Copyscape (or another similar tool) to test for duplicate text (use both the URL test and the TEXT test). Change the text and re-test until the Copyscape tests are clear. There are a number of tools to help you with this, including my own tool Gorewrite.com.
=> There is not enough text on the page. In my experience you need at least 200 words of unique text. THE FIX is to add more unique text.
=> There are not enough backlinks to that individual page – Sorry, this is what Google wants – deep links to that specific page, both internal ones on the website itself, BUT importantly EXTERNAL links. THE FIX – Now this is a real problem with over 2000 pages to consider!!! – how do you get these links? In my experience the quickest way to get some links before you get ‘natural links’ is to add links from one of your other related websites, particularly one that is hosted on a different server or by a different hosting company. Google will eventually (it may take several weeks) pick these up and list them as backlinks on Webmaster Central. They may have low value, but if you get several for each page it does seem to work.
=> There is a host of other issues with your pages such as errors, tags etc. THE FIX is to apply the SEO advice available on various websites and in articles to optimise your page design and its contents. If the majority of similar pages on your website have been indexed then the problem is unlikely to be related to general SEO issues, but it is worth checking the obvious ones such as the Title and Description TAGS, HTML errors etc. for the pages that have been rejected.
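Two of the checks in this list, repeated title and description tags and pages with too little text, are easy to automate on your own files. The following is a minimal Python sketch using only the standard library, under some loud assumptions: the names PageSummary and audit are invented for this example, the 200-word threshold is simply my rule of thumb from above, and the duplicate check only catches titles and descriptions that are identical (ignoring case), not ones that are merely similar. It does not replace a Copyscape test for duplicate body text.

```python
from collections import defaultdict
from html.parser import HTMLParser

MIN_WORDS = 200  # rule-of-thumb threshold from the text above, not a Google figure


class PageSummary(HTMLParser):
    """Collect the title, meta description and a rough word count for one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.words = 0
        self._in_title = False
        self._skip = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip += 1
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip:
            # Count visible text only; script and style contents are ignored.
            self.words += len(data.split())


def audit(pages):
    """pages maps URL -> HTML source. Returns (duplicate_groups, thin_pages)."""
    by_meta = defaultdict(list)
    thin = []
    for url, html in pages.items():
        summary = PageSummary()
        summary.feed(html)
        summary.close()
        key = (summary.title.strip().lower(), summary.description.lower())
        by_meta[key].append(url)
        if summary.words < MIN_WORDS:
            thin.append(url)
    duplicates = [urls for urls in by_meta.values() if len(urls) > 1]
    return duplicates, thin
```

Feed it a dictionary of your not-indexed pages and their HTML source; each group in the first result shares the same title and description, and the second result lists pages under the word threshold.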
Sorry, there is no quick fix for getting ALL your pages indexed until some very smart person invents a tool that will tell you why a particular page has not been indexed (it would be nice if Google Webmaster Central did this, but its error checking tools are very limited). Note: It is still worth checking for errors in Google Webmaster Central, but the reports are very incomplete.





