
Licenses Can Be Exhausted by Bots Crawling BBJSP Pages

Problem Description

Bots called “spiders” or “crawlers” are web-based programs that visit a site systematically, loading the site’s pages and following any links found inside the visited content. Regardless of a bot’s purpose, when it visits a BBJSP site it can end up exhausting the host site’s licenses: because the bot loads the site’s pages but does not store the site’s cookies, each request starts a new session and consumes another license. Fortunately, it’s possible to block bots and keep them from unnecessarily using the host site’s licenses.

Robots.txt

The first line of defense is to create a robots.txt file in the root folder of your website, so that it’s accessible as “www.mysitename.com/robots.txt”. To indicate that you don’t want any bots crawling your website, you would put the following content in your robots.txt file:


User-agent: *

Disallow: /


To read more about robots.txt, visit https://en.wikipedia.org/wiki/Robots_exclusion_standard. Robots.txt is a good way to keep compliant crawlers off your site, but malicious or poorly written bots will simply ignore it.
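
Robots.txt can also allow specific, well-behaved crawlers while blocking everything else. For example, the following rules (using Googlebot purely as an illustration) permit Google’s crawler and disallow all other bots; an empty Disallow line means nothing is off-limits for that agent:

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /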

Application Firewalls

To stop malicious bots from creating unwanted traffic on your website, you will need an application firewall. A variety of solutions exist, from programs you install on the host to intercept your web traffic, to third-party services that control what traffic reaches your website.
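As a minimal sketch of the host-side approach, here is a Java servlet filter that rejects requests whose User-Agent header contains a known bot keyword. Note the assumptions: BBJSP sites do not necessarily run in a standard Java servlet container, and the class name and keyword list below are hypothetical placeholders rather than a maintained signature database.

import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical example: block requests whose User-Agent matches common bot keywords.
public class BotBlockingFilter implements Filter {
    // Placeholder keyword list; a real firewall would use maintained signatures.
    private static final String[] BLOCKED_KEYWORDS = { "bot", "crawler", "spider" };

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        String userAgent = ((HttpServletRequest) req).getHeader("User-Agent");
        if (userAgent != null) {
            String ua = userAgent.toLowerCase();
            for (String keyword : BLOCKED_KEYWORDS) {
                if (ua.contains(keyword)) {
                    // Reject before the request reaches a BBJSP page and consumes a license.
                    ((HttpServletResponse) res).sendError(HttpServletResponse.SC_FORBIDDEN);
                    return;
                }
            }
        }
        // Not a recognized bot; let the request continue.
        chain.doFilter(req, res);
    }

    @Override
    public void init(FilterConfig config) { }

    @Override
    public void destroy() { }
}

Keep in mind that malicious bots routinely forge their User-Agent string, so a production application firewall also relies on request rate and behavioral signals rather than header matching alone.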

One such service is Cloudflare. Cloudflare provides a service called “bot management” that lets you determine which bots, if any, should be able to crawl your website: https://www.cloudflare.com/products/bot-management/
