Login
This property defines how the crawler authenticates itself to protected websites. Currently only the playbook is supported, which is our most versatile and flexible authentication option.
Playbook
In the playbook you define, step by step, how the authentication must be completed.
Playbook types
GOTO
Go to a specific URL during authentication, or start the authentication at a specific URL.
TYPE
Type a value into an input field. The value is either fixed or generated.
Selector: CSS query selector that selects the input field
Value: The value to type into the field
Action: Action that generates the value to type (for example generateTOTP, which produces a TOTP code)
CLICK
Click an element on the page to continue, for example the submit button of a form.
Selector: HTML query selector to select the element that must be clicked
WAIT
Wait for a specific event to complete before continuing to the next step.
waitForSelector: Wait for a specific element to be visible on the page
waitForNavigation: Wait until the crawler has navigated to another page
waitForTimeout: Wait for a preset timeout
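The step types above can be summarized as a small pre-flight check. The following is a minimal Python sketch, not the crawler's own validation: the function name `validate_step` is hypothetical, and the required fields are inferred from the descriptions and examples on this page, so the real rules may differ.

```python
# Hypothetical sketch: field names follow the examples on this page.
# The crawler's actual validation rules are not documented here.
WAIT_ACTIONS = {"waitForSelector", "waitForNavigation", "waitForTimeout"}


def validate_step(step: dict) -> list[str]:
    """Return a list of problems found in a single playbook step."""
    problems = []
    step_type = step.get("type")

    if step_type == "goto":
        if not step.get("url"):
            problems.append("goto step needs a 'url'")
    elif step_type == "type":
        if not step.get("selector"):
            problems.append("type step needs a 'selector'")
        # Either a fixed value or an action that generates one (e.g. a TOTP code).
        if "value" not in step and "action" not in step:
            problems.append("type step needs a 'value' or an 'action'")
    elif step_type == "click":
        if not step.get("selector"):
            problems.append("click step needs a 'selector'")
    elif step_type == "wait":
        action = step.get("action")
        if action not in WAIT_ACTIONS:
            problems.append(f"unknown wait action: {action!r}")
        elif action == "waitForSelector" and not step.get("selector"):
            problems.append("waitForSelector needs a 'selector'")
    else:
        problems.append(f"unknown step type: {step_type!r}")

    return problems
```

Running each step through a check like this before saving the configuration can catch a missing selector or a misspelled wait action early.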
Examples
Simple login form without field IDs
{
  "login": {
    "playbook": [
      // Navigate to the login page
      {
        "type": "goto",
        "url": "https://example.com/login"
      },
      // Type in the required fields
      {
        "type": "type",
        "selector": "input[type='email']",
        "value": "[email protected]"
      },
      {
        "type": "type",
        "selector": "input[type='password']",
        "value": "Your_password"
      },
      // Submit form
      {
        "type": "click",
        "selector": "input[type='submit']"
      },
      // Wait until form submit is completed
      {
        "type": "wait",
        "action": "waitForNavigation"
      }
    ]
  }
}
With a TOTP token requested on the next page
{
  "login": {
    "playbook": [
      // Navigate to the login page
      {
        "type": "goto",
        "url": "https://example.com/login"
      },
      // Type in the required fields
      {
        "type": "type",
        "selector": "#username",
        "value": "[email protected]"
      },
      {
        "type": "type",
        "selector": "#password",
        "value": "Your_password"
      },
      {
        "type": "click",
        "selector": "#submitForm"
      },
      // Wait until form is submitted and TOTP field becomes visible
      {
        "type": "wait",
        "action": "waitForSelector",
        "selector": "#totpToken"
      },
      {
        "type": "type",
        "selector": "#totpToken",
        "action": "generateTOTP",
        "totp": {
          "secret": "TOTP_SECRET",
          "algorithm": "SHA-1"
        }
      },
      {
        "type": "click",
        "selector": "button[type=\"submit\"]"
      },
      // Wait until form submit is completed
      {
        "type": "wait",
        "action": "waitForNavigation"
      }
    ]
  }
}
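For readers unfamiliar with TOTP, the value typed by the generateTOTP action is a time-based one-time password derived from the shared secret. How the crawler computes it internally is not documented here. The Python sketch below shows the standard RFC 6238 calculation, assuming a base32-encoded secret, a 30-second period, and 6 digits (the defaults of most authenticator apps). The function name `generate_totp` and its parameters are illustrative, and the "SHA-1" setting in the example above corresponds to the hash name "sha1" in this sketch.

```python
import base64
import hmac
import struct
import time


def generate_totp(secret_b32, timestamp=None, digits=6, period=30, algorithm="sha1"):
    """Compute an RFC 6238 time-based one-time password (illustrative sketch)."""
    if timestamp is None:
        timestamp = time.time()
    # Secrets are usually shared as base32; tolerate spaces, lowercase, missing padding.
    cleaned = secret_b32.replace(" ", "").upper()
    cleaned += "=" * (-len(cleaned) % 8)
    key = base64.b32decode(cleaned)
    # The moving factor is the number of whole periods since the Unix epoch.
    counter = struct.pack(">Q", int(timestamp) // period)
    digest = hmac.new(key, counter, algorithm).digest()
    # Dynamic truncation (RFC 4226): the low 4 bits of the last byte pick the offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# RFC 6238 test vector: ASCII secret "12345678901234567890" at time 59, 8 digits.
rfc_secret = base64.b32encode(b"12345678901234567890").decode()
print(generate_totp(rfc_secret, timestamp=59, digits=8))  # prints 94287082
```

Comparing the output of such a function with the code shown by an authenticator app is a quick way to confirm that the secret placed in the playbook is the right one.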
Best practices
Always specify a goto URL as the first step.
Use a wait step as the last step to be sure the login succeeded and the credentials are provided.
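Both practices are easy to check automatically. The sketch below is one hypothetical way to do so in Python: `check_best_practices` is not part of the product, and it assumes the playbook is available as a list of step objects shaped like the examples on this page.

```python
import json


def check_best_practices(playbook: list[dict]) -> list[str]:
    """Return warnings for a playbook that ignores the two practices above."""
    if not playbook:
        return ["playbook is empty"]
    warnings = []
    if playbook[0].get("type") != "goto":
        warnings.append("first step should be a goto step")
    if playbook[-1].get("type") != "wait":
        warnings.append("last step should be a wait step")
    return warnings


# Illustrative configuration that violates both practices.
config = json.loads("""
{
  "login": {
    "playbook": [
      {"type": "type", "selector": "#username", "value": "Your_username"},
      {"type": "click", "selector": "#submitForm"}
    ]
  }
}
""")
for warning in check_best_practices(config["login"]["playbook"]):
    print(warning)
```

Note that `json.loads` accepts only strict JSON, so the explanatory `//` comments used in the examples on this page would have to be removed before parsing a configuration this way.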