Cloudtrellis

Scanner Bot Information

Cloudtrellis/1.0

Bot parameters

Bot name: Cloudtrellis
Bot version: 1.0
Bot type: Good, identifies self
Bot category: Website monitoring
Obeys robots.txt: Yes, by default (website owners can opt to ignore for their own site)
User agent token: Cloudtrellis
User agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Cloudtrellis/1.0) Chrome/126.0.6478.126 Safari/537.36
IP Addresses: See below

What is the Cloudtrellis bot?

The Cloudtrellis bot is a website scanning and error detection tool operated by Cloudtrellis, a DBA of Foxhound Systems, LLC. The bot is designed to find the issues that a user navigating with a web browser would come across, so it uses a user agent string that simulates a desktop browser. The user agent string is subject to change over time; the current string will always be listed on this page.
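If you want to spot requests that claim to come from the bot in your own logs, one option is to look for the user agent token rather than matching the full string, since the full string may change. The sketch below is only an illustration: the function name claimed_bot_version is hypothetical, and the assumption that the token keeps the "Cloudtrellis/<version>" shape is drawn from the string listed on this page, not from any stated guarantee.

```python
import re
from typing import Optional

# Assumption: the token keeps the "Cloudtrellis/<version>" form shown on this page.
_TOKEN_RE = re.compile(r"Cloudtrellis/(\d+(?:\.\d+)*)")


def claimed_bot_version(user_agent: str) -> Optional[str]:
    """Return the version a user agent string claims, or None if the token is absent."""
    match = _TOKEN_RE.search(user_agent)
    return match.group(1) if match else None


if __name__ == "__main__":
    ua = (
        "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
        "Cloudtrellis/1.0) Chrome/126.0.6478.126 Safari/537.36"
    )
    print(claimed_bot_version(ua))
```

A user agent string is supplied by the client and can be forged, so a match only shows what the request claims. To authenticate the bot, check the source IP address as described under "Detecting and authenticating Cloudtrellis bot".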

The bot searches for broken links and other issues in HTML. The bot may crawl across multiple pages and request some or all linked files. The bot may request the sitemap.xml or robots.txt files at the root of a website. If the sitemap.xml defines other sitemap files, the bot may also request and examine those files.
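To illustrate what "defines other sitemap files" means, the sketch below reads a sitemap document and returns the child sitemaps it lists, if it is a sitemap index. It is not the bot's own code: the function name child_sitemaps is hypothetical, and the sketch assumes the file follows the standard sitemaps.org 0.9 schema.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace defined by the sitemaps.org 0.9 protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def child_sitemaps(xml_text: str) -> List[str]:
    """Return the sitemap URLs listed in a sitemap index, or [] for a plain sitemap."""
    root = ET.fromstring(xml_text)
    # A sitemap index has a <sitemapindex> root; a regular sitemap has <urlset>.
    if not root.tag.endswith("}sitemapindex"):
        return []
    locations = root.findall("sm:sitemap/sm:loc", SITEMAP_NS)
    return [loc.text.strip() for loc in locations if loc.text]


if __name__ == "__main__":
    index = (
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<sitemap><loc>https://example.com/sitemap-pages.xml</loc></sitemap>"
        "</sitemapindex>"
    )
    print(child_sitemaps(index))
```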

The bot sends HTTP requests using both the HEAD and GET methods as it works. The bot prefers HEAD requests, to which a server should respond with only the headers it would return if the same URL were requested with the GET method. Unfortunately, many websites do not handle HEAD requests correctly, so the bot may fall back to a GET request to determine whether a page is actually broken.
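The HEAD-then-GET pattern can be sketched in a few lines. This is a minimal illustration of the general technique using Python's standard library, not the bot's actual implementation; the function names http_status and check_link, and the rule of retrying on any status of 400 or above, are assumptions made for the example.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, method: str, timeout: float = 10.0) -> int:
    """Send one request and return its status code, including 4xx and 5xx codes."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for error statuses; the code is still the answer we want.
        return error.code


def check_link(url: str, fetch: Callable[[str, str], int] = http_status) -> int:
    """Try HEAD first; if it reports an error, confirm with GET before trusting it."""
    status = fetch(url, "HEAD")
    if status >= 400:
        # Assumption: any HEAD error may just mean the server mishandles HEAD.
        status = fetch(url, "GET")
    return status
```

Passing the request function in as the fetch parameter keeps the fallback logic separate from the network call, so the logic can be exercised with a stand-in function. Connection failures (urllib.error.URLError) are deliberately left to propagate to the caller.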

The Cloudtrellis bot is not a web scraper. The bot does not store any data about the content of a page for the purposes of indexing or populating search engines. Any data stored by the bot is solely for the purpose of reporting errors or other issues detected on the pages it scans.

Why did the bot visit my site?

The bot will only visit pages that are discovered during a scan. Users of Cloudtrellis can configure scans for websites that they have verified ownership of. However, these websites may link to external sites, such as yours. The bot will follow these links to verify that they are not broken.

If neither you nor your organization uses Cloudtrellis and the bot requested one or more pages on your website, it is because those pages were linked from elsewhere on the internet. The only purpose of the request was to confirm that the page returns an HTTP 200 status code and that its content type is as expected.

Detecting and authenticating Cloudtrellis bot

Published IP addresses

The IP addresses that the bot uses are updated dynamically and published as A records for scanner.services.cloudtrellis.com. You can view the current list of addresses by running the following command using the dig utility:

$ dig scanner.services.cloudtrellis.com A

which will respond with

...
 
;; QUESTION SECTION:
;scanner.services.cloudtrellis.com. IN  A
 
;; ANSWER SECTION:
scanner.services.cloudtrellis.com. 60 IN A  xxx.xxx.xxx.xxx

In this example, xxx.xxx.xxx.xxx is the sole IP address used by the Cloudtrellis bot.
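The same lookup can be done from a script, for example to keep an allowlist current. The sketch below is one possible approach using Python's standard library; the function names published_addresses and is_published are hypothetical. Note that socket.getaddrinfo goes through the system resolver, so results may be affected by local caching or a hosts file.

```python
import socket

# Hostname taken from this page. The sample dig output shows a TTL of 60 seconds,
# so it is probably wise to re-resolve regularly rather than cache for long.
SCANNER_HOST = "scanner.services.cloudtrellis.com"


def published_addresses(hostname: str = SCANNER_HOST) -> set:
    """Return the set of IPv4 addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {sockaddr[0] for _family, _type, _proto, _canon, sockaddr in infos}


def is_published(ip: str, hostname: str = SCANNER_HOST) -> bool:
    """Check whether ip is among the addresses currently published for hostname."""
    return ip in published_addresses(hostname)


if __name__ == "__main__":
    print(sorted(published_addresses()))
```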

Reverse DNS lookup

You can use reverse DNS lookups to verify that a given IP address is used by Cloudtrellis. Here's an example using the host utility for IP address aaa.bbb.ccc.ddd:

$ host aaa.bbb.ccc.ddd
ddd.ccc.bbb.aaa.in-addr.arpa domain name pointer scanner.services.cloudtrellis.com.

You can also send a regular DNS query for the PTR record directly. Reverse the order of the octets in the IP address to be checked and append .in-addr.arpa to form the query name. The query below can be used for the IP address aaa.bbb.ccc.ddd:

$ dig ddd.ccc.bbb.aaa.in-addr.arpa PTR

which will respond with

...
 
;; QUESTION SECTION:
;ddd.ccc.bbb.aaa.in-addr.arpa.	IN	PTR
 
;; ANSWER SECTION:
ddd.ccc.bbb.aaa.in-addr.arpa. 300 IN	PTR	scanner.services.cloudtrellis.com.

This response indicates that IP address aaa.bbb.ccc.ddd is used by Cloudtrellis.
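A reverse lookup can also be scripted. The following is a minimal sketch using Python's socket.gethostbyaddr, under the assumption that the PTR record points at the hostname shown above; the function names reverse_name and is_cloudtrellis_ip are hypothetical.

```python
import socket
from typing import Callable, Optional

# Hostname that the PTR record is expected to point to, per the examples above.
EXPECTED_HOST = "scanner.services.cloudtrellis.com"


def reverse_name(ip: str) -> Optional[str]:
    """Return the primary hostname from a reverse lookup, or None if it fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        # Covers socket.herror and socket.gaierror (no PTR record, bad input, etc.).
        return None


def is_cloudtrellis_ip(
    ip: str, lookup: Callable[[str], Optional[str]] = reverse_name
) -> bool:
    """Check whether the reverse lookup for ip resolves to the expected hostname."""
    name = lookup(ip)
    if name is None:
        return False
    # DNS names are case-insensitive and may carry a trailing dot.
    return name.rstrip(".").lower() == EXPECTED_HOST
```

Because a PTR record is set by whoever controls the address block, a stricter check would also confirm that the address appears in the published A record for scanner.services.cloudtrellis.com.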

Stop Cloudtrellis bot from visiting your site

The bot respects the robots.txt file when it is placed at the web root of a website. You can use this file to instruct bots, including the Cloudtrellis bot, not to scan some or all of the pages on your website. To instruct the Cloudtrellis bot not to visit your website at all, add the following two lines to your robots.txt file:

User-agent: Cloudtrellis
Disallow: /
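Before deploying a robots.txt change, you may want to check how a standards-following parser reads it. The sketch below uses Python's urllib.robotparser for that purpose; the helper name is_allowed is hypothetical, and the result reflects how Python's parser interprets the rules, which is assumed (not confirmed) to match the bot's behavior.

```python
from urllib.robotparser import RobotFileParser

# The two lines from this page that block the bot from the whole site.
BLOCK_ALL = """\
User-agent: Cloudtrellis
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Report whether robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Pass the user agent token ("Cloudtrellis"), not the full browser-like
    # string: this parser only compares the part before the first "/".
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(BLOCK_ALL, "Cloudtrellis", "https://example.com/page"))
    print(is_allowed(BLOCK_ALL, "SomeOtherBot", "https://example.com/page"))
```

To block only part of a site, replace the Disallow value with a path prefix such as /private/ (an example path, not one taken from this page).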

Get in touch with us

If you would like to contact us about the bot, please email us at: scanner@cloudtrellis.com