It sometimes pays to read a website’s robots.txt file, as it may contain rather funny comments about particular bots. Case in point, Twitter’s robots.txt:
http://www.twitter.com/robots.txt
#Google Search Engine Robot
User-agent: Googlebot
# Crawl-delay: 10 -- Googlebot ignores crawl-delay ftl
Allow: /*?*_escaped_fragment_
Disallow: /search
Disallow: /*?
Disallow: /*/with_friends
...
# Every bot that might possibly read and respect this file.
FTL may mean one of two things: “faster than light” or “for the lose.” I am going to guess that since this likely has to do with bandwidth usage, it is the latter.
Because here at Kitsch-Posh we are all about TEH OPEN INTERWEBZ RDY 4 UR CRAWLING, here is our robots.txt file:
User-agent: *
Disallow:
Ron Paul would be proud.
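For anyone who wants proof that an empty Disallow really does mean "come on in," here is a minimal sketch using Python's standard-library urllib.robotparser. The helper name is_allowed, the bot name SomeBot, and the example.com URLs are made up for illustration; only the two-line robots.txt comes from above. The lock-everything-out variant is included purely for contrast.

```python
from urllib.robotparser import RobotFileParser

# The wide-open robots.txt from above: every bot, nothing disallowed.
OPEN_ROBOTS_TXT = """\
User-agent: *
Disallow:
"""

# For contrast only (not our file): every bot, everything disallowed.
CLOSED_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url.

    Hypothetical helper: parses the text in memory instead of
    fetching a live robots.txt over the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://example.com/search"  # placeholder URL, an assumption
    print(is_allowed(OPEN_ROBOTS_TXT, "SomeBot", url))
    print(is_allowed(CLOSED_ROBOTS_TXT, "SomeBot", url))
```

The single slash after Disallow is the whole difference between the two files, which is why an empty value is the conventional way to say that everything is fair game.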







