Extract structured data from any website through a single API request. The API handles dynamic content rendering, JavaScript execution, proxy rotation, CAPTCHA solving, SSL handling, and browser-level interactions for consistent data extraction.
Send the target website URL and configuration parameters as query parameters. Pass the scraping instructions array in the request body. The response contains an extractedData object with content matching your specified extraction rules.
Pass your API key as the apiKey parameter in every request.
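The request shape described above can be sketched in Python as follows. This is a minimal illustration, not the official client: the endpoint URL is a placeholder because this document does not state it, and the use of POST is an assumption based on the request carrying a JSON body. The function only builds the request object and sends nothing.

```python
import json
import urllib.parse
import urllib.request

# Placeholder endpoint: the real URL is not given in this document.
ENDPOINT = "https://api.example.com/v1/scrape"


def build_scrape_request(api_key, target_url, instructions, **options):
    """Build (but do not send) a scraping request.

    Query string: apiKey, url, and any optional settings.
    Body: a JSON object holding the instructions array.
    """
    query = {"apiKey": api_key, "url": target_url}
    for name, value in options.items():
        # Assumption: booleans are passed as lowercase "true"/"false".
        query[name] = str(value).lower() if isinstance(value, bool) else value
    full_url = ENDPOINT + "?" + urllib.parse.urlencode(query)
    body = json.dumps({"instructions": instructions}).encode("utf-8")
    return urllib.request.Request(
        full_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: not confirmed by this document
    )


request = build_scrape_request(
    "YOUR_API_KEY",
    "https://example.com/search",
    [{"wait": 1000}, {"extract": {"html": "//*[@id='DivResultado']/div"}}],
    jsEnabled=True,
)
print(request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it once the placeholder endpoint is replaced with the real one.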
This is version v1.0 of the API.
The following query parameters configure the scraping behavior. Pass the target URL and optional settings as query string parameters in the request URL.
url (required, string): The URL of the web page to scrape.
text (optional, boolean, default: true): If true, returns extracted text content instead of full HTML.
jsEnabled (optional, boolean, default: false): If true, enables JavaScript rendering for dynamic pages.
proxy (optional, string): Proxy country for the request. Options: 'US', 'UK', 'CA', 'AU', 'DE', 'FR', 'IN'.
sslIgnore (optional, boolean, default: false): If true, ignores SSL certificate errors when scraping HTTPS sites.
windowSize (optional, string, default: 1920x1080): Browser viewport size for JS-rendered pages. Format: WxH.
adBlock (optional, boolean, default: false): If true, blocks ads and trackers during page load.
captcha (optional, boolean): If true, enables automatic CAPTCHA solving.
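The constraints listed for these parameters can be checked on the client before a request is made. The sketch below is one possible pre-flight check; the function name and the error messages are illustrative, and the API may apply different or additional validation on its side.

```python
import re

# Allowed values taken from the parameter descriptions above.
PROXY_COUNTRIES = {"US", "UK", "CA", "AU", "DE", "FR", "IN"}
WINDOW_SIZE_PATTERN = re.compile(r"^\d+x\d+$")  # WxH, e.g. 1920x1080
BOOLEAN_FLAGS = ("text", "jsEnabled", "sslIgnore", "adBlock", "captcha")


def validate_query(params):
    """Return a list of problems found in a dict of query parameters."""
    problems = []
    if not params.get("url"):
        problems.append("url is required")
    proxy = params.get("proxy")
    if proxy is not None and proxy not in PROXY_COUNTRIES:
        problems.append(f"unsupported proxy country: {proxy}")
    size = params.get("windowSize")
    if size is not None and not WINDOW_SIZE_PATTERN.match(size):
        problems.append(f"windowSize must look like WxH, got: {size}")
    for flag in BOOLEAN_FLAGS:
        if flag in params and not isinstance(params[flag], bool):
            problems.append(f"{flag} must be a boolean")
    return problems


good = validate_query(
    {"url": "https://example.com", "proxy": "DE", "windowSize": "1280x720"}
)
bad = validate_query({"proxy": "BR", "windowSize": "wide", "jsEnabled": "yes"})
print(good)
print(bad)
```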
The request body is a JSON object that defines browser session settings and the scraping workflow. The instructions array contains the core scraping logic.
{
  "instructions": [
    {
      "fill": {
        "place": "//*[@id='nombre']",
        "value": "google"
      }
    },
    {
      "click": "//*[@id='page']/div/div/section[1]/div/div/div/div[2]/div/div/form/div[2]/button"
    },
    {
      "clickButtonByValue": {
        "place": ".col-6:nth-child(2) > .col-12",
        "value": "google.sv "
      }
    },
    {
      "wait": 1000
    },
    {
      "extract": {
        "html": "//*[@id='DivResultado']/div"
      }
    }
  ]
}

blockUrl: List of script or URL patterns to block during page load.
cookies: Cookies to be injected into the browser session.
instructions: List of step-by-step scraping instructions. Each object in the array represents an action to execute. Can be empty to return full page content.
When jsEnabled is set to true, each object in the instructions array can contain one or more of the following action fields. The actions are executed in sequence within a headless browser.
fill: Fills an input field with a value.
click: Clicks an element using XPath or CSS selector.
clickIfExist: Clicks element only if it appears within a short timeout.
enter: Sends Enter key to the targeted element.
newTab: Switches to a newly opened browser tab.
moveToRelativeTab: Moves between tabs relative to the current tab index.
wait: Pauses execution for the specified number of milliseconds.
waitFor: Waits until an element matching the selector becomes visible.
select: Selects an option from a dropdown element.
jsExe: Executes custom JavaScript on the page.
conditionalCheck: Executes conditional logic steps based on element state.
clickButtonByValue: Clicks a button matching specific text or value.
generalImageCaptcha: Solves an image CAPTCHA using automated logic.
fillImageCaptcha: Captures and fills CAPTCHA values automatically.
switchToIframe: Switches context to an iframe by name or ID.
switchToParentFrame: Returns from iframe context to the parent frame.
resolveAudioCaptcha: Solves audio CAPTCHA challenges.
screenshot: Captures a screenshot of the current page.
saveimage: Saves an image from the page using selector or ID.
blockElement: Blocks specific elements from loading on the page.
extract: Defines what data to extract from the page using CSS selectors or XPath.
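Because the actions run in sequence, an instructions array can be assembled step by step. The sketch below uses small helper functions (hypothetical names, not part of the API) and covers only the action shapes that appear in the sample request body in this document: fill, click, clickButtonByValue, wait, and extract.

```python
import json


def fill(place, value):
    """Fill the input located by an XPath or CSS selector."""
    return {"fill": {"place": place, "value": value}}


def click(selector):
    """Click the element located by an XPath or CSS selector."""
    return {"click": selector}


def click_button_by_value(place, value):
    """Click a button inside `place` whose text or value matches."""
    return {"clickButtonByValue": {"place": place, "value": value}}


def wait(milliseconds):
    """Pause for the given number of milliseconds."""
    return {"wait": milliseconds}


def extract(**fields):
    """Map output field names to selectors, e.g. html=<xpath>."""
    return {"extract": fields}


# Steps are listed in the order they should execute.
instructions = [
    fill("//*[@id='nombre']", "google"),
    click("//*[@id='page']/div/div/section[1]/div/div/div/div[2]/div/div/form/div[2]/button"),
    click_button_by_value(".col-6:nth-child(2) > .col-12", "google.sv "),
    wait(1000),
    extract(html="//*[@id='DivResultado']/div"),
]
body = {"instructions": instructions}
print(json.dumps(body, indent=2))
```

Keeping one helper per action makes the execution order visible in the list itself, which matters because each step operates on the page state left by the previous one.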
When a fill action is provided inside an instruction, the following sub-fields define the input target and value.
place: XPath or CSS selector of the input field.
value: Value to enter into the input field.
When a select action is provided inside an instruction, the following sub-fields define the dropdown target and value.
place: XPath or CSS selector of the parent or container element.
value: Text or value to match for selecting the option.
When a generalImageCaptcha action is provided inside an instruction, each object in its array can contain the following fields.
imagePath: Path or selector to the CAPTCHA image element.
textField: Selector for the field where CAPTCHA text is entered.
imageUpdatePath: Selector to refresh or update the CAPTCHA image.
captchaFailedPath: Selector indicating a CAPTCHA failure so the solver can retry.
model: Model used for CAPTCHA solving. Available models: mini-ocr-v1, Model20, Model05, Model141, Model150, Model102.
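A generalImageCaptcha step could be built as shown below. This is a sketch under assumptions: the document does not say which of these fields are mandatory, so the helper (a hypothetical name) treats imagePath, textField, and model as required and the other two as optional. The selectors in the usage line are made up for illustration.

```python
# Model names copied from the field description above.
CAPTCHA_MODELS = {
    "mini-ocr-v1", "Model20", "Model05", "Model141", "Model150", "Model102",
}


def image_captcha_step(image_path, text_field, model,
                       image_update_path=None, captcha_failed_path=None):
    """Build one generalImageCaptcha instruction holding a single entry."""
    if model not in CAPTCHA_MODELS:
        raise ValueError(f"unknown CAPTCHA model: {model}")
    entry = {"imagePath": image_path, "textField": text_field, "model": model}
    # Optional selectors are only included when provided (assumption).
    if image_update_path is not None:
        entry["imageUpdatePath"] = image_update_path
    if captcha_failed_path is not None:
        entry["captchaFailedPath"] = captcha_failed_path
    # The action takes an array of objects, so the entry is wrapped in a list.
    return {"generalImageCaptcha": [entry]}


# Hypothetical selectors, for illustration only.
step = image_captcha_step(
    "//img[@id='captcha']", "//input[@name='captcha']", "mini-ocr-v1",
    image_update_path="//a[@id='refresh']",
)
print(step)
```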
When jsEnabled is false or omitted (static scraping), each object in the instructions array can use the following action fields. These do not require a browser environment.
postForm: Submits a form using an HTTP POST request.
getForm: Submits a form using an HTTP GET request.
getPage: Retrieves page content from a URL or the current page.
extract: Extracts specific data from the page using CSS selectors or XPath.
When a getForm or postForm action is provided inside an instruction, the following sub-fields define the form target.
selector: Form selector (XPath or CSS) identifying the form element.
data: Key-value pairs of form field names and values to submit.
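For static scraping, a form submission followed by an extraction might be expressed as below. The helper names, selectors, and form field names are hypothetical. The extract shape with a text key is an assumption modeled on the html key used in the sample request body.

```python
import json


def post_form(selector, data):
    """Build a postForm instruction for the form matched by `selector`."""
    return {"postForm": {"selector": selector, "data": dict(data)}}


def get_form(selector, data):
    """Build a getForm instruction for the form matched by `selector`."""
    return {"getForm": {"selector": selector, "data": dict(data)}}


# Hypothetical selectors and form fields, for illustration only.
static_instructions = [
    post_form("//form[@id='search']", {"q": "example", "lang": "en"}),
    {"extract": {"text": "//div[@id='results']"}},
]
print(json.dumps({"instructions": static_instructions}, indent=2))
```

These instructions would be sent with jsEnabled set to false or omitted, since they do not require a browser environment.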
A successful request returns a 200 OK response with a JSON object containing the extracted data. The extractedData object is dynamic and reflects the field names defined in your extract instruction.
extractedData: Container for the values extracted by your scraping instructions. The returned keys depend on the names used inside the extract action.
The extractedData object is instruction-driven. Common response shapes include the following fields:
html: HTML content extracted from a selector or XPath when the extract instruction requests html output.
text: Plain text content extracted from a selector or XPath when the extract instruction requests text output.
<custom field>: Any custom key defined inside extract. The response type depends on the selector result and the structure of the extraction rule.
<custom array field>: Returned when an extract rule maps a field to multiple selectors, such as repeated rows, links, or name server values.
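Since the keys of extractedData depend on the extract instruction, client code usually has to handle both single values and arrays. The sketch below parses a made-up response body (the field names and values are illustrative, not real API output) and normalizes every field to a list.

```python
import json

# Hypothetical response body, for illustration only.
sample_response = """
{
  "extractedData": {
    "html": "<div>result</div>",
    "links": ["https://a.example", "https://b.example"]
  }
}
"""


def read_extracted(raw):
    """Return extractedData with every value wrapped as a list."""
    payload = json.loads(raw)
    data = payload.get("extractedData", {})
    return {
        key: value if isinstance(value, list) else [value]
        for key, value in data.items()
    }


fields = read_extracted(sample_response)
for name, values in fields.items():
    print(name, len(values))
```

Normalizing to lists is a design choice that lets downstream code iterate over every field in the same way, whether the selector matched one element or many.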
Error responses can contain the following fields when scraping fails or the request is invalid:
error: Human-readable error message describing the failure.
code: Numeric error code returned by the API when available.
details: Additional context about invalid parameters, failed instructions, or provider-side processing issues.
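One way to surface these fields in client code is to turn a failed response into an exception. The class and function names below are hypothetical, and the sample error body is invented for illustration. All three fields are treated as optional, since the text above says they can be present rather than that they always are.

```python
import json


class ScrapeError(Exception):
    """Carries the error, code, and details fields of a failed response."""

    def __init__(self, message, code=None, details=None):
        super().__init__(message)
        self.code = code
        self.details = details


def raise_for_error(status, raw_body):
    """Raise ScrapeError unless the HTTP status is 200."""
    if status == 200:
        return
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        payload = {}  # body was not JSON; fall back to the status code
    if not isinstance(payload, dict):
        payload = {}
    raise ScrapeError(
        payload.get("error", f"HTTP {status}"),
        code=payload.get("code"),
        details=payload.get("details"),
    )


# Hypothetical error body, for illustration only.
try:
    raise_for_error(400, '{"error": "please pass correct parameters", "code": 400}')
except ScrapeError as exc:
    print(exc, exc.code, exc.details)
```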
The API uses standard HTTP status codes to indicate the success or failure of requests. For common status codes like 429 (Too Many Requests), refer to the general API documentation.
Please pass correct parameters.
The requested resource could not be found. Please verify the URL and try again.
Wrong HTTP method was used on the endpoint.
Timed out while connecting to the remote URL.
Too many requests. Please wait before making additional requests.
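A client can respond to a 429 status by waiting and retrying. The sketch below uses exponential backoff, which is a common convention and not something this document prescribes: the delays, the retry count, and the `send` callable (any function returning a status and body) are all assumptions.

```python
import time


def backoff_delays(retries, base=1.0, cap=30.0):
    """Delays in seconds: base, 2*base, 4*base, ... capped at `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_retry(send, retries=4, sleep=time.sleep):
    """Call send() and retry while it returns HTTP 429.

    `send` must return a (status, body) tuple. `sleep` is injectable so
    the waiting behavior can be replaced in tests.
    """
    status, body = send()
    for delay in backoff_delays(retries):
        if status != 429:
            break
        sleep(delay)
        status, body = send()
    return status, body


# Simulated server: two 429 responses, then success.
responses = iter([(429, ""), (429, ""), (200, "ok")])
waited = []
result = call_with_retry(lambda: next(responses), sleep=waited.append)
print(result, waited)
```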