RTILA Web Business Automation

Crawler Commands

Table of Contents
  • Types of Crawlers & Use Cases
  • Crawler Configuration
  • Crawler in action

Types of Crawlers & Use Cases

RTILA Studio has three types of Crawler commands. They differ slightly but work the same way overall. The difference lies in the location of the links/pages to be crawled: pages “Internal” to the website we are on, pages “External” to it, or a mix of both.
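To make the Internal/External distinction concrete, here is a minimal Python sketch. It is not RTILA's implementation: the function names and the `mode` values are hypothetical, and it assumes a link counts as internal only when its hostname exactly matches the current page's hostname (so a subdomain would count as external).

```python
from urllib.parse import urljoin, urlparse


def classify_link(page_url: str, href: str) -> str:
    """Return 'internal' or 'external' for a link found on page_url.

    Assumption: 'internal' means the exact same hostname as the page.
    """
    absolute = urljoin(page_url, href)  # resolve relative links like "/about"
    page_host = urlparse(page_url).hostname
    link_host = urlparse(absolute).hostname
    return "internal" if link_host == page_host else "external"


def select_links(page_url: str, hrefs: list, mode: str = "internal") -> list:
    """Keep the links matching mode: 'internal', 'external', or 'both'.

    The mode names are hypothetical labels for the three crawler types.
    """
    absolute = [urljoin(page_url, h) for h in hrefs]
    if mode == "both":
        return absolute
    return [u for u in absolute if classify_link(page_url, u) == mode]


if __name__ == "__main__":
    page = "https://example.com/catalog"
    links = ["/product/1", "https://other.org/partner"]
    print(select_links(page, links, mode="internal"))
    print(select_links(page, links, mode="external"))
```

Resolving every link to an absolute URL first keeps the comparison simple, since relative links such as `/product/1` carry no hostname of their own.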

The Crawler command is a powerful scraping enabler that automatically recognizes and crawls the web links of a given page, either completely (all links) or selectively (conditional logic).

In addition, the Crawler has a multi-threading capability that lets you open and crawl multiple tabs at the same time, which significantly increases the speed of your automation. Assuming no firewall limits exist, the Crawler can crawl 10 pages per second or even more.
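The effect of opening several tabs at once can be illustrated with a small Python simulation. This is only a sketch of the general idea of bounded concurrency, not how RTILA drives browser tabs: `fake_open_page` is a hypothetical stand-in that sleeps instead of loading a page, and `tabs` is an assumed name for the number of simultaneous tabs.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_open_page(url: str) -> str:
    """Hypothetical stand-in for loading a page in a browser tab."""
    time.sleep(0.05)  # pretend the page takes 50 ms to load
    return f"loaded {url}"


def crawl_concurrently(urls: list, tabs: int = 3) -> list:
    """Process urls with at most `tabs` pages in flight at the same time.

    pool.map returns results in the same order as the input urls.
    """
    with ThreadPoolExecutor(max_workers=tabs) as pool:
        return list(pool.map(fake_open_page, urls))


if __name__ == "__main__":
    pages = [f"https://example.com/page/{i}" for i in range(6)]
    start = time.perf_counter()
    results = crawl_concurrently(pages, tabs=3)
    elapsed = time.perf_counter() - start
    print(len(results), "pages in", round(elapsed, 2), "seconds")
```

With more workers, more simulated pages load in parallel, so the total time drops until the connection or the target website becomes the bottleneck.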

Crawler Configuration

Many configuration options are available to define and fine-tune your crawler automation flow.

1: Rename your crawler block.
2: Depth of crawling. If set to 1, it only crawls the links available on that page. If set to 2, it crawls all the links on that page and also the links inside those secondary pages.
3: Number of tabs opened at the same time. Up to 10 if both your internet connection and the website are fast; otherwise a safer cruising speed is 3 to 5.
4: Maximum number of pages to crawl. Leave it at zero to crawl everything.
5: Types of file extensions you want to include.
6: Check this option for “human-like” random crawling instead of sequential top-to-bottom order.
7: Add conditions so the Crawler ignores, or exclusively focuses on, URLs in which a specific keyword appears (or does not appear).
8: In this example, it only crawls links that contain “/product/”.
9: Optional delay before each link is opened.
10: Timeout for when one of the tabs/links does not load properly.
11: The page loading status to wait for.
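How depth, the page limit, and the URL conditions interact can be sketched in Python. This is an illustration under stated assumptions, not RTILA's code: `CrawlerConfig`, its field names, and `plan_crawl` are hypothetical, the site is a hard-coded dictionary instead of live pages, and it assumes that links rejected by a condition are not followed any further.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CrawlerConfig:
    """Hypothetical mirror of a few of the settings listed above."""
    depth: int = 1                                         # setting 2
    max_pages: int = 0                                     # setting 4, 0 = no limit
    must_contain: list = field(default_factory=list)       # settings 7 and 8
    must_not_contain: list = field(default_factory=list)   # setting 7


def url_allowed(url: str, cfg: CrawlerConfig) -> bool:
    """Apply the keyword conditions to a URL."""
    if any(keyword in url for keyword in cfg.must_not_contain):
        return False
    return all(keyword in url for keyword in cfg.must_contain)


def plan_crawl(start: str, links_of: dict, cfg: CrawlerConfig) -> list:
    """Return the URLs that would be crawled, in breadth-first order.

    links_of maps each page URL to the links found on it (stand-in for
    real page loads). The start page itself is not counted as crawled.
    """
    visited = {start}
    crawled = []
    queue = deque([(start, 0)])
    while queue:
        url, level = queue.popleft()
        if level >= cfg.depth:
            continue  # do not follow links beyond the configured depth
        for link in links_of.get(url, []):
            if link in visited or not url_allowed(link, cfg):
                continue
            if cfg.max_pages and len(crawled) >= cfg.max_pages:
                return crawled
            visited.add(link)
            crawled.append(link)
            queue.append((link, level + 1))
    return crawled


if __name__ == "__main__":
    site = {
        "https://shop.example/": [
            "https://shop.example/product/1",
            "https://shop.example/about",
            "https://shop.example/product/2",
        ],
        "https://shop.example/product/1": ["https://shop.example/product/3"],
    }
    cfg = CrawlerConfig(depth=2, must_contain=["/product/"])
    print(plan_crawl("https://shop.example/", site, cfg))
```

With `depth=1` the same call would stop at the two product links on the start page; raising it to 2 also picks up the link found inside the first product page.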

Once you have set up your Crawler for mass data acquisition, add an “Extract” command inside this block so that the data properties you have set in the inspection panel are captured for each crawled page.

Crawler in action

Below is a screenshot of our crawler going through the topics pages of GitHub to open all the links that contain “/topics/” in the URL. The multithreading is set to 7 tabs (including the starting page), as this was the most reliable speed for our internet connection. The same can be achieved for any type of listing or directory website that contains structured internal or external links.

Our crawler is able to crawl over 1 million internal links of a specific directory, depending on the security threshold and load capacity of the target website. We advise spreading the crawl over a longer period of time and using a lower number of concurrent tabs for better and more ethical results. RTILA is very stable, so treat it as a marathon, not a sprint, and pace your automation.

Updated on March 19, 2023

RTILA CORPORATION

Suite 20C Trolley Square.
Wilmington, 19806 DE, USA.

Copyright © RTILA CORPORATION
