# Polite scraping

"The first rule of web crawling is you do not harm the website. The second rule of web crawling is you do **NOT** harm the website. We’re supporters of the democratization of web data, but not at the expense of the website’s owners."

A polite crawler respects robots.txt\
A polite crawler never degrades a website’s performance\
A polite crawler identifies its creator with contact information\
A polite crawler is not a pain in the buttocks of system administrators

We are fully committed to the [polite scraping guidelines](https://blog.scrapinghub.com/2016/08/25/how-to-crawl-the-web-politely-with-scrapy/) defined by Scrapinghub (although our spiders don't run on Scrapinghub).
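As a rough illustration of the rules above, here is a minimal sketch using only the Python standard library. It is not our actual spider code: the user agent string, the default delay, and the helper names `build_robots_parser`, `may_fetch`, and `delay_for` are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical values for illustration only; not the real hoax.ly configuration.
# The user agent names the bot and gives a way to contact its operator.
USER_AGENT = "example-polite-bot/0.1 (+mailto:bot@example.org)"
# Fallback pause between requests when robots.txt sets no Crawl-delay.
DEFAULT_DELAY_SECONDS = 2.0


def build_robots_parser(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file that was already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, url: str) -> bool:
    """Return True only if robots.txt allows our user agent to fetch the URL."""
    return parser.can_fetch(USER_AGENT, url)


def delay_for(parser: RobotFileParser) -> float:
    """Seconds to wait between requests: the site's Crawl-delay, else our default."""
    delay = parser.crawl_delay(USER_AGENT)
    return float(delay) if delay is not None else DEFAULT_DELAY_SECONDS


if __name__ == "__main__":
    robots = "User-agent: *\nCrawl-delay: 5\nDisallow: /private/\n"
    parser = build_robots_parser(robots)
    for url in ("https://example.org/articles/1", "https://example.org/private/page"):
        print(url, "allowed" if may_fetch(parser, url) else "skipped")
    print("wait", delay_for(parser), "seconds between requests")
```

A real crawler would call `time.sleep(delay_for(parser))` between requests to the same host and send `USER_AGENT` in the `User-Agent` header of every request, so that administrators can identify the bot and reach its operator.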

Tell us if you have an API we can use instead of scraping your site! If no API is available, you can still help us by publishing structured data, such as schema.org [ClaimReview](https://schema.org/ClaimReview) markup.
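For site owners wondering what ClaimReview markup looks like, the following sketch builds a small JSON-LD document in Python. The function name `claim_review_jsonld`, the example values, and the 1 to 5 rating scale are assumptions for illustration; the property names come from the schema.org ClaimReview vocabulary, and your own fact-checks may need more or different properties.

```python
import json


def claim_review_jsonld(page_url, claim_text, publisher_name,
                        date_published, rating_value, rating_label):
    """Build a minimal schema.org ClaimReview document as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "ClaimReview",
        "url": page_url,                 # page that hosts the fact-check
        "claimReviewed": claim_text,     # short summary of the claim being checked
        "datePublished": date_published,
        "author": {"@type": "Organization", "name": publisher_name},
        "reviewRating": {
            "@type": "Rating",
            "ratingValue": rating_value,
            "bestRating": 5,             # assumed scale: 5 = true, 1 = false
            "worstRating": 1,
            "alternateName": rating_label,  # human-readable verdict
        },
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    # Hypothetical example values.
    print(claim_review_jsonld(
        "https://example.org/fact-checks/1",
        "Example claim that was reviewed",
        "Example Fact Checkers",
        "2019-01-01",
        1,
        "False",
    ))
```

The resulting JSON is typically embedded in the fact-check page inside a `<script type="application/ld+json">` element, where crawlers can read it without parsing the page layout.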

**What data do we scrape?**

We index only metadata and link to the original source whenever possible. Our tools aim to bring your content more visitors and readers. You can see this in action in the chatbot or the Browser Extension, where reviews are listed only as linked headings together with your organization's name.

***Is your site being scraped by us and you have a complaint? Email us at*** [***bot@hoax.ly***](mailto:bot@hoax.ly)