63 lines
2.4 KiB
Text
63 lines
2.4 KiB
Text
# Based on https://seirdy.one/robots.txt
|
||
|
||
# The next three are borrowed from https://www.videolan.org/robots.txt
|
||
|
||
# "This robot collects content from the Internet for the sole purpose of
|
||
# helping educational institutions prevent plagiarism. [...] we compare
|
||
# student papers against the content we find on the Internet to see if we
|
||
# can find similarities." (http://www.turnitin.com/robot/crawlerinfo.html)
|
||
# --> fuck off.
|
||
User-Agent: TurnitinBot
|
||
Disallow: /
|
||
|
||
# "NameProtect engages in crawling activity in search of a wide range of
|
||
# brand and other intellectual property violations that may be of interest
|
||
# to our clients." (http://www.nameprotect.com/botinfo.html)
|
||
# --> fuck off.
|
||
User-Agent: NPBot
|
||
Disallow: /
|
||
|
||
# "iThenticate® is a new service we have developed to combat the piracy
|
||
# of intellectual property and ensure the originality of written work for
|
||
# publishers, non-profit agencies, corporations, and newspapers."
|
||
# (http://www.slysearch.com/)
|
||
# --> fuck off.
|
||
User-Agent: SlySearch
|
||
Disallow: /
|
||
|
||
# BLEXBot assists internet marketers to get information on the link structure
|
||
# of sites and their interlinking on the web, to avoid any technical and
|
||
# possible legal issues and improve overall online experience.
|
||
# (http://webmeup-crawler.com/)
|
||
# --> fuck off.
|
||
User-Agent: BLEXBot
|
||
Disallow: /
|
||
|
||
# Providing Intellectual Property professionals with superior brand protection
|
||
# services by artfully merging the latest technology with expert analysis.
|
||
# (https://www.checkmarknetwork.com/spider.html/)
|
||
# "The Internet is just way to big to effectively police alone." (ACTUAL quote)
|
||
# --> fuck off.
|
||
User-agent: CheckMarkNetwork/1.0 (+https://www.checkmarknetwork.com/spider.html)
|
||
Disallow: /
|
||
|
||
# Stop trademark violations and affiliate non-compliance in paid search.
|
||
# Automatically monitor your partner and affiliates’ online marketing to
|
||
# protect yourself from harmful brand violations and regulatory risks.
|
||
# We regularly crawl websites on behalf of our clients to ensure content
|
||
# compliance with brand and regulatory guidelines.
|
||
# (https://www.brandverity.com/why-is-brandverity-visiting-me)
|
||
# --> fuck off.
|
||
User-agent: BrandVerity/1.0
|
||
Disallow: /
|
||
|
||
# Eat shit, OpenAI.
|
||
User-agent: ChatGPT-User
|
||
Disallow: /
|
||
User-agent: GPTBot
|
||
Disallow: /
|
||
User-agent: CCBot
|
||
Disallow: /
|
||
|
||
Sitemap: https://daudix-ufo.codeberg.page/blog/sitemap.xml
|