Sigram's Nutch robot
If you're reading this, chances are you were looking through your server logs and saw our Nutch-based robot visiting your site. Our software obeys robots.txt files and robots META tags in HTML. These are the standard mechanisms webmasters use to tell web robots which portions of a site they are welcome to access.
Sysadmins/robots.txt
Our company doesn't operate a search engine service - we are a software development house. From time to time we run the crawler to test its functionality.
OUR POLICY IS THAT WE DO NOT HARVEST EMAILS OR PERSONAL INFORMATION. WE DO NOT REDISTRIBUTE THE CRAWLED
CONTENT IN ANY WAY - IT IS USED EXCLUSIVELY FOR OUR INTERNAL TESTING PURPOSES.
Our software obeys the robots.txt exclusion standard, described at http://www.robotstxt.org/wc/exclusion.html#robotstxt. Our Nutch installations respond to the agent name "Sigram". To ban all our Nutch-based crawlers from your site, place the following in your robots.txt file:
User-agent: Sigram
Disallow: /
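If you would like to check the rule before deploying it, the following is a minimal sketch that evaluates the two lines above with Python's standard urllib.robotparser module. It is only an illustration: the helper name is_allowed and the example.com URL are our assumptions, and Nutch uses its own robots.txt parser, so this shows the expected effect of the rule rather than the crawler's actual code.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule shown above, exactly as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: Sigram
Disallow: /
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    Hypothetical helper for illustration; it wraps the standard-library parser.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.com is a placeholder; substitute a URL from your own site.
    print(is_allowed(ROBOTS_TXT, "Sigram", "http://example.com/page.html"))    # False
    print(is_allowed(ROBOTS_TXT, "OtherBot", "http://example.com/page.html"))  # True
```

The rule names only the "Sigram" agent, so other robots are unaffected unless your robots.txt file contains further records for them.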
Webmasters/Robots META
If you do not have permission to edit the /robots.txt file on your server, you can still tell robots not to index your pages or follow your links. The standard mechanism for this is the robots META tag, as described at http://www.robotstxt.org/wc/meta-user.html.
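As an illustration, the sketch below embeds a typical robots META tag in a sample page and reads its directives back with Python's standard html.parser module. The class name RobotsMetaParser, the helper robots_directives, and the sample page are our assumptions for this example; it is not the code Nutch uses, only a rough picture of what a well-behaved robot looks for.

```python
from html.parser import HTMLParser

# Sample page (hypothetical). The META tag in the head asks robots
# not to index the page and not to follow its links.
SAMPLE_PAGE = """\
<html>
<head>
<title>Private page</title>
<meta name="robots" content="noindex, nofollow">
</head>
<body><a href="/other.html">Other page</a></body>
</html>
"""


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values may be None when written without a value.
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots META directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    found = robots_directives(SAMPLE_PAGE)
    print("index allowed: ", "noindex" not in found)    # False
    print("follow allowed:", "nofollow" not in found)   # False
```

To apply this to your own pages, only the META line itself is needed; place it inside the head section of each page you want robots to skip.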
Contact us
If you have problems with or questions about our Nutch crawler, please send an email to bot at sigram dot org. We appreciate your feedback!