Shieldon: Protecting Your Website From Bad Crawlers

Take a look at Shieldon, an open source project designed to protect your website from crawlers, web scrapers, and vulnerability scanners.

Shieldon is a PHP library that provides anti-scraping and online session control for your web applications. It acts as a shield in front of your web application, guarding it against badly behaved bots, crawlers, vulnerability scanners, and so on.

The documentation is available at https://shield-on-php.github.io.

Installing Shieldon

Use PHP Composer:

composer require terrylinooo/shieldon
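
With a Composer install, the library's classes are normally loaded through Composer's generated autoloader. The following is a minimal sketch, assuming Composer's default layout with a vendor/ directory next to the script; the variable names are illustrative.

```php
<?php
// Assumption: Composer's default layout, with vendor/ next to this file.
$autoload = __DIR__ . '/vendor/autoload.php';

// Load the autoloader only if the package has actually been installed.
if (is_file($autoload)) {
    require $autoload;
}

// True once the library is installed and can be autoloaded.
$shieldonAvailable = class_exists(\Shieldon\Shieldon::class);
```

If `$shieldonAvailable` is false, the package is probably not installed yet or the autoloader path differs in your project.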


Or, download it and include the Shieldon autoloader.

require 'Shieldon/src/autoload.php';


How to Use Shieldon

Here is a full example that shows how Shieldon works.

$shieldon = new \Shieldon\Shieldon();

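// Use SQLite as the data driver.
// APPPATH is assumed to be a constant holding your application path
// (CodeIgniter, for example, defines one); any writable directory works.
$dbLocation = APPPATH . 'cache/shieldon.sqlite3';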
$pdoInstance = new \PDO('sqlite:' . $dbLocation);
$shieldon->setDriver(new \Shieldon\Driver\SqliteDriver($pdoInstance));

// Set components.
// This component lets popular search engines through.
// Other bots will go into the checking process.
$shieldon->setComponent(new \Shieldon\Component\TrustedBot());

// You can skip this setting if your web application uses only one Shieldon instance.
// Channels are for running multiple instances.
$shieldon->setChannel('web_project');

// Only allow 10 sessions to view the current page.
// The default expire time is 300 seconds.
$shieldon->limitSession(10);

// Set a Captcha service. For example: Google reCAPTCHA.
$shieldon->setCaptcha(new \Shieldon\Captcha\Recaptcha([
    'key' => '6LfkOaUUAAAAAH-AlTz3hRQ25SK8kZKb2hDRSwz9',
    'secret' => '6LfkOaUUAAAAAJddZ6k-1j4hZC1rOqYZ9gLm0WQh',
]));

// Start protecting your website!

$result = $shieldon->run();

if ($result !== $shieldon::RESPONSE_ALLOW) {
    if ($shieldon->captchaResponse()) {

        // Unban current session.
        $shieldon->unban();
    }
    // Output the result page with HTTP status code 200.
    $shieldon->output(200);
}
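
To make the `limitSession(10)` setting and its 300-second expiry more concrete, here is a rough sketch of how an online session limit can work in principle. It is not Shieldon's implementation: the class `SessionLimiterSketch`, its method `admit()`, and the in-memory storage are hypothetical and exist only for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical illustration of an online session limit.
 * NOT Shieldon's code; all names here are made up.
 */
final class SessionLimiterSketch
{
    /** @var array<string, int> session id => last-seen timestamp */
    private array $sessions = [];
    private int $maxSessions;
    private int $expireSeconds;

    public function __construct(int $maxSessions, int $expireSeconds = 300)
    {
        $this->maxSessions = $maxSessions;
        $this->expireSeconds = $expireSeconds;
    }

    /** Returns true if the session may view the page at time $now. */
    public function admit(string $sessionId, int $now): bool
    {
        // Drop sessions that have been idle for the expiry time or longer.
        foreach ($this->sessions as $id => $lastSeen) {
            if ($now - $lastSeen >= $this->expireSeconds) {
                unset($this->sessions[$id]);
            }
        }

        // Known sessions refresh their timestamp;
        // new sessions are admitted only while a slot is free.
        if (isset($this->sessions[$sessionId])
            || count($this->sessions) < $this->maxSessions) {
            $this->sessions[$sessionId] = $now;
            return true;
        }

        return false;
    }
}

// Two slots, 300-second expiry.
$limiter = new SessionLimiterSketch(2, 300);
$first  = $limiter->admit('a', 0);    // slot 1
$second = $limiter->admit('b', 10);   // slot 2
$third  = $limiter->admit('c', 20);   // no free slot
$later  = $limiter->admit('c', 300);  // 'a' has expired by now
```

A real deployment would keep this state in shared storage (the example above uses a SQLite driver) rather than in a PHP array, since each request runs in its own process.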


If users reach the pageview limit in a short period of time, they will be banned temporarily.

Users can be unbanned by solving a Captcha. They can then continue browsing your website, but crawlers can't.
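
The ban-and-unban cycle described above can be sketched with a simple fixed-window counter. This is only an illustration of the general idea, not how Shieldon is implemented: `PageviewLimiterSketch`, `hit()`, the limit of 3 pageviews, and the 60-second window are all assumptions chosen for the example, and automatic ban expiry is left out for brevity.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical fixed-window pageview limiter with a Captcha-style unban.
 * NOT Shieldon's code; names and thresholds are illustrative only.
 */
final class PageviewLimiterSketch
{
    /** @var array<string, array{start: int, count: int}> */
    private array $windows = [];
    /** @var array<string, bool> */
    private array $banned = [];
    private int $limit;
    private int $windowSeconds;

    public function __construct(int $limit, int $windowSeconds)
    {
        $this->limit = $limit;
        $this->windowSeconds = $windowSeconds;
    }

    /** Records one pageview; returns true if the request is allowed. */
    public function hit(string $ip, int $now): bool
    {
        if (isset($this->banned[$ip])) {
            return false;
        }

        $window = $this->windows[$ip] ?? ['start' => $now, 'count' => 0];

        // Start a fresh window once the old one has elapsed.
        if ($now - $window['start'] >= $this->windowSeconds) {
            $window = ['start' => $now, 'count' => 0];
        }

        $window['count']++;
        $this->windows[$ip] = $window;

        // Exceeding the limit inside one window triggers a ban.
        if ($window['count'] > $this->limit) {
            $this->banned[$ip] = true;
            return false;
        }

        return true;
    }

    /** Lifts the ban (e.g. after a solved Captcha) and resets the counter. */
    public function unban(string $ip): void
    {
        unset($this->banned[$ip], $this->windows[$ip]);
    }
}

$limiter = new PageviewLimiterSketch(3, 60);
$ip = '203.0.113.7'; // documentation-range address, used as a placeholder

$results = [];
foreach ([0, 1, 2, 3] as $second) {
    $results[] = $limiter->hit($ip, $second);
}

$whileBanned = $limiter->hit($ip, 4); // still banned
$limiter->unban($ip);                 // pretend the Captcha was solved
$afterUnban = $limiter->hit($ip, 5);  // allowed again
```

A human visitor passes the Captcha and triggers the unban step; a crawler that cannot solve it stays blocked.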
