How to remove a web page from Google index and other search engines

Posted on January 1, 2009 at 5:20 am

So you have created a website or a web page and you don’t want anyone else to be able to find it, right? That’s a bit of a problem once Google, Yahoo, MSN, or some other search engine indexes it!

Once a web page or website is indexed, it can be found by anyone on the planet with an Internet connection. If you want to hide a page or website from search engines, you can do it in several ways.

I’ll try to walk you through the easier method first because it requires less technical knowledge. Basically, you can add a line of code to your HTML page or you can set up your web server to protect a file or directory.

Luckily, just about all search engines follow a standard called the Robots Exclusion Protocol when crawling websites. As a website owner, you can use a robots.txt file to tell a search engine what to index and what not to index.

So how does this work? It’s actually super simple! First, you create a text file called robots.txt using Notepad or any text editor. Now let’s say you want to block your entire website from being indexed by the search engines, so you would add these lines to your text file:

User-agent: *
Disallow: /

The User-agent line names the robot that is crawling your website, e.g. the crawlers from Google, Yahoo, etc. The * means all robots. Note that a robot, such as a spam robot, can ignore your file altogether if it chooses to.
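If you want to check what those two lines do before uploading them, Python's standard library includes a robots.txt parser. The following is only a minimal sketch: the helper name is_allowed and the crawler names passed to it are illustrative assumptions, not something the search engines provide.

```python
from urllib.robotparser import RobotFileParser

# The two rules from the example above: every robot, whole site.
RULES = """\
User-agent: *
Disallow: /
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    is_allowed is a hypothetical helper name used only for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The agent names below are examples; any name matches "User-agent: *".
    for agent in ("Googlebot", "Slurp", "msnbot"):
        allowed = is_allowed(RULES, agent, "http://www.example.com/index.html")
        print(agent, "allowed" if allowed else "blocked")
```

This only tells you how a well-behaved robot should read the file. It cannot tell you whether a particular robot actually obeys it.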

Only use a robots.txt file to keep content from being indexed by major search engines, not to hide information. If someone comes to your website, a robots.txt file will not prevent them from accessing a webpage and viewing it. So make sure you understand what the file does: it asks search engines not to crawl your site, which generally keeps it from showing up in Google search results pages (and in Yahoo and MSN as well).

You can also block directories or individual pages on your site using a robots.txt file instead of blocking the entire website. To block a directory, you could add the following lines:

Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~secret/

Note that you only need to add the user-agent line once, unless you want each robot to get a different set of instructions. If you want to block a page, you could use this:

Disallow: /private_file.html
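To see how the directory and page rules behave together, here is a small sketch that puts them under a single User-agent line and tests a few URLs against them. The function name blocked_urls and the sample URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The directory and page rules from above, under one User-agent line.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~secret/
Disallow: /private_file.html
"""


def blocked_urls(rules, user_agent, urls):
    """Return the URLs that the rules tell user_agent not to crawl.

    blocked_urls is a hypothetical helper, not part of any search engine tool.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    sample = [
        "http://www.example.com/index.html",
        "http://www.example.com/tmp/notes.txt",
        "http://www.example.com/private_file.html",
    ]
    for url in blocked_urls(RULES, "Googlebot", sample):
        print("blocked:", url)
```

Anything not matched by a Disallow line, such as index.html here, stays open to crawling.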

Also, check out the Help section at Google to learn more on how to create a robots.txt file. Once you have finished writing up the file, you just need to upload it to the root of your website so that it can be accessed as follows:

http://www.example.com/robots.txt
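Robots look for the file only at the root of the host, never in a subdirectory. As a rough sketch, the location can be worked out from any page address on the site like this (robots_txt_url is a hypothetical helper name):

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt address for the site hosting page_url.

    Only the scheme and host are kept; the page path and query are dropped.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_txt_url("http://www.example.com/blog/some-post.html"))
```

If opening that address in a browser does not show your rules, the file is in the wrong place.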

The next time the robot visits your site, it will read the information and follow the instructions. If this seems too complicated, you can also keep your website or webpage out of search results using META tags.

The noindex meta tag is also honored by all of the major search engines. To prevent all robots from indexing a page on your site, add this line of code to the HEAD section of the page:

<meta name="robots" content="noindex">

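When Google or any other search engine sees that line on the page, it will automatically drop the page from the search results, even if other pages link to it. The robot has to be able to fetch the page in order to see the tag, so do not also block that page in robots.txt.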
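If you are not sure whether the tag ended up in your page correctly, you can check the HTML yourself. The sketch below uses Python's built-in HTML parser. The class name RobotsMetaFinder and the function has_noindex are illustrative names, and the sample page is invented.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when written without "=value".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex"></head></html>'
    print("noindex present:", has_noindex(page))
```

This checks only the HTML you give it. It says nothing about when a search engine will next visit the page and act on the tag.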

So those are the two ways you can hide a page from Google and other search engines. If you are not able to get this to work, post a comment and I will try to help you out.

Also, check out my previous post if you are looking for a way to remove your name from search engines like Google when it appears on other people’s websites. Enjoy!

» Filed Under Google Software/Tips

Comments

15 Responses to “How to remove a web page from Google index and other search engines”

  1. Chris said on :

    Keep in mind that if you place a line in robots.txt, it might have the opposite of the desired effect because you’ll be announcing that the file exists.

    For example if you put
    Disallow: /topsecretfile.html
    then (most) search engines will ignore it, but any human that loads your robots.txt will learn of its presence.


  2. Ankur Jain said on :

    This method will work only if your web page is very new and is not already indexed by Google. If it is already indexed, it won’t be deleted unless you go through the Google Webmaster Tools > link deletion route.

    my $0.02


  3. Humayun said on :

    I want to remove a patient’s information from the website, which will otherwise cause serious legal implications. Kindly help.


  4. faith said on :

    I am seeking help. I was defamed and cyberstalked on the Internet. 1) How can I totally remove a cached page from the search index? 2) Can I remove the posted cache without letting the stalker/owner know?
    Please help me!!!!


  5. Rohan said on :

    @Comment no 2 (Ankur Jain),
    It will work even if the page is already indexed. However, using Google’s Webmaster Tools will do it faster.

    Refer to the last line of Google’s help page: http://www.google.com/support/webmasters/bin/answer.py?answer=93710


  6. black hattitude said on :

    Nice article, I will complete my robots.txt now ;)


  7. Mrs G said on :

    Please help us. Someone made a blog on Google Blogger about me and our company. When I clicked on “report abuse”, Blogger said it is freedom of speech, but it is libel, pure and simple! Please can you help us out? I saw the link about us on another blog and the admin of that blog removed it promptly, but I saved it to my favorites so I have the link. It is not on the Google search engine yet but I fear it might show up there. How can I get the whole page removed? Please help me!


  8. akishore said on :

    @Mrs G – Your best bet to get the person to remove content that is harmful to your reputation is to contact the Blogger legal team. Here is the link:

    http://www.google.com/support/blogger/bin/answer.py?hl=en&answer=76315


  9. Mrs G said on :

    I went to the link you gave me. It says they will not remove libelous or defaming stuff without a court order! How do I get a court order if I don’t know for sure who the author is? They have it set up like a profile or blog, I am not sure! Please, is there any way to have this deleted?


  10. Coery said on :

    A reporter out of state posted a report about me regarding an incident that took place in his district. Some of the information was inaccurate and incomplete; in addition, there was no follow-up done. Currently, this 8-year-old report is still listed #1 in my name search. How can I address this? (It is embarrassing.)


  11. Stalked said on :

    Does anyone know of an email contact for escalating a matter of harassment via Google/Blogger? “Reporting” abuse does nothing and there is no follow-up; you just click the box and there is no place to point out how the TOS has been violated. I cannot find a contact email anywhere to escalate a matter.

    Any help is appreciated


  12. Kasey said on :

    First of all, this post is about how to remove a page from Google’s index if you are the site owner. If someone else posted info about you, there is no way you can get it removed yourself.

    Secondly, you can also use Google Webmaster Tools to remove pages on your site from their index.


  13. Gary said on :

    I have a related question: Soon after publishing a new web page, we realized its content is more appropriate for one of our other domains, so we want to move the content to a new domain and remove it entirely from the old domain. Once we make the change, then give the new page some inbound link juice, won’t Google naturally recognize the old page on the original domain is no longer there and remove it from the rankings — almost like it never existed — then rank the new content as if it was brand-new to the web? (… which is what we’re after.) Thanks!


  14. Parveen Hatefi said on :

    I need to get my name off of the Google search engine. It is ruining my chances of getting a job. When I checked the other day it was moved to the bottom of the second page. Today it’s back on the first page in the number one spot. Original source was the Naperville Sun. How can I get this removed? Please help me!!


  15. Karen smith said on :

    This is all too complicated. I don’t know what a URL is or what Notepad is. Are you speaking of the notepad from my Yahoo email? What are robots? I don’t even know where to begin. I will pay you to delete my name off of Google. Google can ruin someone who is completely innocent. Is there a way to show me step by step exactly what to do?

    I am not computer illiterate. I’m just not that good of a tech savvy person.

    Help!!!

