Use the X-Robots-Tag HTTP header to prevent Google or Yahoo from indexing documents with file extensions like .txt, .doc, .pdf, etc. Google and Yahoo can index any document uploaded to a website, whether it is a .txt, .doc, or .pdf file.
However, you can use X-Robots-Tag to keep search engines like Google and Yahoo from indexing those files. Here is an example of using X-Robots-Tag in your .htaccess file to tell Google or Yahoo not to index the robots.txt file.
<FilesMatch "^robots\.txt$">
    Header set X-Robots-Tag "noindex, follow"
</FilesMatch>
Here is another example of using X-Robots-Tag in your .htaccess file. This one lets Google or Bing index files with the .doc extension, but tells them not to show a cached copy or a snippet in the results.
<FilesMatch "\.doc$">
    Header set X-Robots-Tag "index, noarchive, nosnippet"
</FilesMatch>
This is how to tell Google not to include PDF files in its search index:
<FilesMatch "\.pdf$">
    Header set X-Robots-Tag "noindex"
</FilesMatch>
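If the rule should apply to Google only, the header value can name a crawler. The following is a minimal sketch, assuming the crawler honors a user-agent prefix in X-Robots-Tag (Google documents this form; other search engines may ignore it):

```apache
# Assumption: only the crawler named in the value acts on the directive.
<FilesMatch "\.pdf$">
    Header set X-Robots-Tag "googlebot: noindex"
</FilesMatch>
```

Crawlers that do not recognize the prefix may skip the directive entirely, so the plain "noindex" value is the safer choice when the file should stay out of every index.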
To exclude Excel files from Google's search index, add this block so the header is sent with every .xls file:
<FilesMatch "\.xls$">
    Header set X-Robots-Tag "noindex"
</FilesMatch>
Similarly, you can keep Word documents (.doc or .docx files), PowerPoint presentations, and any other file hosted on your server from being indexed by matching on the file extension.
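Rather than one block per file type, several extensions can share a single pattern. This is a minimal sketch; the extension list is an assumption and should be adjusted to the files you actually host:

```apache
# Assumed list: Word, Excel, PowerPoint (old and new formats) and PDF.
# "docx?" matches both .doc and .docx, and likewise for the others.
<FilesMatch "\.(docx?|xlsx?|pptx?|pdf)$">
    Header set X-Robots-Tag "noindex"
</FilesMatch>
```

FilesMatch is case-sensitive, so a file named Report.PDF would not match this pattern. Prefixing the regular expression with (?i) should make the match case-insensitive on servers built with PCRE.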
If you are on an Apache server, you can add this code to the .htaccess file in your site's root directory.
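The Header directive comes from Apache's mod_headers module. If that module is not loaded, a Header line in .htaccess makes the server return a 500 error. Here is a sketch of guarding the rules, assuming you would rather have them skipped than break the site:

```apache
# The block inside is applied only when mod_headers is loaded.
<IfModule mod_headers.c>
    <FilesMatch "\.pdf$">
        Header set X-Robots-Tag "noindex"
    </FilesMatch>
</IfModule>
```

The trade-off is that a missing module now fails silently, so it is worth checking the response headers of one matching file after you deploy. The server configuration must also allow .htaccess overrides for these directives (AllowOverride FileInfo).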
Another option that used to be suggested is the “noindex” directive in robots.txt. However, only Google ever obeyed this directive, it was never officially supported, and Google stopped honoring it on September 1, 2019. Thus X-Robots-Tag seems to be the best option to prevent Google or Yahoo from indexing robots.txt or any other document.