
Web Scraper API Reference

Extract structured data from any website through a single API request. The API handles dynamic content rendering, JavaScript execution, proxy rotation, CAPTCHA solving, SSL handling, and browser-level interactions for consistent data extraction.

POST https://api.apifreaks.com/v1.0/scraping


Send the target website URL and configuration parameters as query parameters. Pass the scraping instructions array in the request body. The response contains an extractedData object with content matching your specified extraction rules.

Getting Started

Authentication

Pass your API key as the apiKey parameter in every request. The example request on this page sends the key in the X-apiKey header instead.

API Version

This is version v1.0 of the API.

Request

curl -X 'POST' \
  'https://api.apifreaks.com/v1.0/scraping?url=https%3A%2F%2Fsvnet.sv%2F&text=true&jsEnabled=true' \
  -H 'Content-Type: application/json' \
  -H 'X-apiKey: API-KEY' \
  -d '{
  "instructions": [
    {
      "fill": {
        "place": "//*[@id='nombre']",
        "value": "google"
      }
    },
    {
      "click": "//*[@id='page']/div/div/section[1]/div/div/div/div[2]/div/div/form/div[2]/button"
    },
    {
      "clickButtonByValue": {
        "place": ".col-6:nth-child(2) > .col-12",
        "value": "google.sv "
      }
    },
    {
      "wait": 1000
    },
    {
      "extract": {
        "html": "//*[@id='DivResultado']/div"
      }
    }
  ]
}'

Response

{
  "extractedData": {
    "html": "El nombre de dominio: google.sv se encuentra RegistradoNombre de dominio:  google.svEstado:  RegistradoContacto Administrativo:  Cesar Ulises  Trujillo MartínezCorreo Electrónico: admin@admindotsv.comTeléfono: 503 2284-8531Fecha Registro: 01-01-2013Fecha de vencimiento: 01-01-2027Fecha de Baja: 01-02-2027"
  }
}
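The curl call above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names build_scrape_request and send are hypothetical, the 60-second timeout is an assumption, and the X-apiKey header is copied from the curl example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://api.apifreaks.com/v1.0/scraping"


def build_scrape_request(api_key, target_url, instructions, **options):
    """Assemble (without sending) a POST request for the scraping endpoint."""
    query = {"url": target_url}
    for name, value in options.items():
        # Booleans become lowercase "true"/"false", as in the curl example.
        query[name] = str(value).lower() if isinstance(value, bool) else value
    body = json.dumps({"instructions": instructions}).encode("utf-8")
    return Request(
        f"{API_URL}?{urlencode(query)}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-apiKey": api_key},
    )


def send(request, timeout=60):
    """Send the request and return the parsed JSON body (needs network access)."""
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_scrape_request(
    "API-KEY",  # placeholder, as in the curl example
    "https://svnet.sv/",
    [{"wait": 1000}, {"extract": {"html": "//*[@id='DivResultado']/div"}}],
    text=True,
    jsEnabled=True,
)
print(request.full_url)
```

Building the request separately from sending it makes the URL and body easy to inspect before any quota is spent; call send(request) once a real key is in place.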

API Request

The following sections describe every input accepted by the Web Scraper API. Query parameters configure browser behavior; the request body carries the scraping workflow.

Query Parameters

The following query parameters configure the scraping behavior. Pass the target URL and optional settings as query string parameters in the request URL.

format (optional, String, default: json)

Response format: 'json'.

url (required, String)

The URL of the web page to scrape.

text (optional, Boolean, default: true)

If true, returns extracted text content instead of full HTML.

jsEnabled (optional, Boolean, default: false)

If true, enables JavaScript rendering for dynamic pages.

proxy (optional, Boolean, default: false)

Enables proxy usage for requests.

sslIgnore (optional, Boolean, default: false)

If true, ignores SSL certificate errors when scraping HTTPS sites. Only works if jsEnabled is true.

windowSize (optional, String, default: browser default)

Browser viewport size in width/height format for rendered scraping sessions. Only works if jsEnabled is true.

adBlock (optional, Boolean, default: false)

If true, blocks ads and trackers during page load. Only works if jsEnabled is true.

captcha (optional, Boolean, default: false)

If true, enables automatic CAPTCHA solving. Only works if jsEnabled is true.
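Several of the parameters above only take effect when jsEnabled is true. The sketch below builds the query string and refuses those options when jsEnabled is off. The helper name build_query is hypothetical, and raising an error is a client-side choice made here; the reference does not say whether the API itself rejects or silently ignores such combinations.

```python
from urllib.parse import urlencode

# Options the reference marks as "Only works if jsEnabled is true".
JS_ONLY_OPTIONS = {"sslIgnore", "windowSize", "adBlock", "captcha"}


def build_query(url, **options):
    """Return the query string, rejecting JS-only options when jsEnabled is off."""
    if not url:
        raise ValueError("url is required")
    js_enabled = bool(options.get("jsEnabled", False))
    ignored = sorted(
        name for name in JS_ONLY_OPTIONS if options.get(name) and not js_enabled
    )
    if ignored:
        raise ValueError(f"{', '.join(ignored)} require jsEnabled=true")
    params = {"url": url}
    for name, value in options.items():
        # Booleans are written as lowercase "true"/"false".
        params[name] = str(value).lower() if isinstance(value, bool) else str(value)
    return urlencode(params)


print(build_query("https://example.com", jsEnabled=True, adBlock=True))
```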

Request Body

The request body is a JSON object that defines browser session settings and the scraping workflow. The instructions array contains the core scraping logic.

{
    "instructions": [
        {
            "fill": {
                "place": "//*[@id='nombre']",
                "value": "google"
            }
        },
        {
            "click": "//*[@id='page']/div/div/section[1]/div/div/div/div[2]/div/div/form/div[2]/button"
        },
        {
            "clickButtonByValue": {
                "place": ".col-6:nth-child(2) > .col-12",
                "value": "google.sv "
            }
        },
        {
            "wait": 1000
        },
        {
            "extract": {
                "html": "//*[@id='DivResultado']/div"
            }
        }
    ]
}

blockUrl (optional, Array)

List of script or URL patterns to block during page load.

cookies (optional, Array)

List of cookies to be set in the browser session.

instructions (required, Array)

List of step-by-step scraping instructions. Each object in the array represents an action to execute. Pass an empty array to return the full page content.

Fields for each cookie object

Each object in the cookies array must contain the following fields.

name (required, String)

Cookie name.

value (required, String)

Cookie value.
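A request body that combines the three top-level fields could be assembled as follows. This is a sketch under assumptions: build_body is a hypothetical helper, the cookie and the "analytics.js" pattern are invented sample values, and the reference does not specify the pattern syntax that blockUrl accepts.

```python
import json


def build_body(instructions, cookies=None, block_urls=None):
    """Build the JSON request body from instructions, cookies and blocked URLs."""
    body = {"instructions": list(instructions)}
    if cookies:
        # Each cookie object carries the two required fields: name and value.
        body["cookies"] = [
            {"name": name, "value": value} for name, value in cookies.items()
        ]
    if block_urls:
        body["blockUrl"] = list(block_urls)
    return body


body = build_body(
    instructions=[],                 # empty array: return the full page content
    cookies={"session": "abc123"},   # hypothetical cookie
    block_urls=["analytics.js"],     # hypothetical pattern
)
print(json.dumps(body, indent=2))
```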

Instructions (JS-Enabled Scraping)

When jsEnabled is set to true, each object in the instructions array can contain one or more of the following action fields. The actions are executed in sequence within a headless browser.

fill (optional, Object)

Fills an input field with a value.

click (optional, String)

Clicks an element using XPath or CSS selector.

clickIfExist (optional, String)

Clicks an element only if it appears within a short timeout.

enter (optional, String)

Sends the Enter key to the targeted element.

newTab (optional, Boolean)

Switches to a newly opened browser tab.

moveToRelativeTab (optional, Integer)

Moves between tabs relative to the current tab index.

wait (optional, Integer)

Pauses execution for the specified number of milliseconds.

waitFor (optional, String)

Waits until an element matching the selector becomes visible.

select (optional, Object)

Selects an option from a dropdown element.

jsExe (optional, String)

Executes custom JavaScript on the page.

conditionalCheck (optional, Array)

Executes conditional logic steps based on element state.

clickButtonByValue (optional, Object)

Clicks a button matching specific text or value.

generalImageCaptcha (optional, Array)

Instructions for solving image CAPTCHAs.

fillImageCaptcha (optional, Array)

Captures and fills CAPTCHA values automatically.

switchToIframe (optional, String)

Switches context into an iframe by name or ID.

switchToParentFrame (optional, Boolean)

Returns from an iframe to the parent context.

resolveAudioCaptcha (optional, Object)

Solves audio CAPTCHA challenges.

screenshot (optional, String)

Captures a screenshot of the page.

saveimage (optional, String)

Saves an image by selector or ID.

blockElement (optional, Array)

List of CSS selectors or XPaths for elements to block or hide on the page. Example: [".description", "//input[@id='username']"]

extract (optional, Object)

Defines what data to extract and how to extract it.
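Because every action has a fixed value type, a workflow can be checked locally before it is sent. The sketch below maps each JS-enabled action listed above to the matching Python type and reports mismatches. The helper validate_instructions is hypothetical and checks only the top-level type of each action, not the sub-fields; static-scraping actions such as postForm are outside its table and would be reported as unknown.

```python
# Value type of each JS-enabled action, taken from the list above.
ACTION_TYPES = {
    "fill": dict, "click": str, "clickIfExist": str, "enter": str,
    "newTab": bool, "moveToRelativeTab": int, "wait": int, "waitFor": str,
    "select": dict, "jsExe": str, "conditionalCheck": list,
    "clickButtonByValue": dict, "generalImageCaptcha": list,
    "fillImageCaptcha": list, "switchToIframe": str,
    "switchToParentFrame": bool, "resolveAudioCaptcha": dict,
    "screenshot": str, "saveimage": str, "blockElement": list, "extract": dict,
}


def validate_instructions(instructions):
    """Return a list of problems found; an empty list means the types line up."""
    problems = []
    for index, step in enumerate(instructions):
        for action, value in step.items():
            expected = ACTION_TYPES.get(action)
            if expected is None:
                problems.append(f"step {index}: unknown action {action!r}")
            # bool is a subclass of int in Python, so exclude it for Integer fields.
            elif not isinstance(value, expected) or (
                expected is int and isinstance(value, bool)
            ):
                problems.append(f"step {index}: {action} expects {expected.__name__}")
    return problems


workflow = [
    {"fill": {"place": "//*[@id='nombre']", "value": "google"}},
    {"wait": 1000},
    {"extract": {"html": "//*[@id='DivResultado']/div"}},
]
print(validate_instructions(workflow))
```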

Fields for fill Action

When a fill action is provided inside an instruction, the following sub-fields define the input target and value.

place (required, String)

XPath or CSS selector of the input field.

value (required, String)

Value to enter into the input field.

Fields for select Action

When a select action is provided inside an instruction, the following sub-fields define the dropdown target and value.

place (required, String)

XPath or CSS selector of the parent or container element.

value (required, String)

Text or value to match for selecting the option.

Fields for clickButtonByValue Action

When a clickButtonByValue action is provided inside an instruction, the following sub-fields define the button target and matching value.

place (required, String)

XPath or CSS selector of the button to click.

value (required, String)

Text or value the button must match.
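The fill, select, and clickButtonByValue actions share the same place/value shape, so one small helper can produce all three. This is an illustrative sketch: place_value_step is a hypothetical name, and the "#country" selector and its value are invented sample data.

```python
PLACE_VALUE_ACTIONS = ("fill", "select", "clickButtonByValue")


def place_value_step(action, place, value):
    """Build one instruction object for an action that takes place and value."""
    if action not in PLACE_VALUE_ACTIONS:
        raise ValueError(f"{action!r} does not take place/value sub-fields")
    if not place or value is None:
        raise ValueError("place and value are both required")
    return {action: {"place": place, "value": value}}


steps = [
    place_value_step("fill", "//*[@id='nombre']", "google"),
    place_value_step("select", "#country", "El Salvador"),  # hypothetical dropdown
    place_value_step(
        "clickButtonByValue", ".col-6:nth-child(2) > .col-12", "google.sv "
    ),
]
for step in steps:
    print(step)
```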

Fields for generalImageCaptcha Action

When a generalImageCaptcha action is provided inside an instruction, each object in its array can contain the following fields.

imagePath (optional, String)

Path or selector to the CAPTCHA image element.

textField (optional, String)

Selector for the field where CAPTCHA text is entered.

imageUpdatePath (optional, String)

Selector to refresh or update the CAPTCHA image.

captchaFailedPath (optional, String)

Selector indicating a CAPTCHA failure so the solver can retry.

model (optional, String)

Model used for CAPTCHA solving. Available models: Model_1, Model_2, Model_3, Model_4, Model_5, Model_6, basicTnImageProcessing, basicPhImageProcessing.
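The error list on this page includes a 400 for CAPTCHA instructions sent without the captcha parameter enabled, so a local pre-check can save a request. The sketch below assumes that generalImageCaptcha, fillImageCaptcha, and resolveAudioCaptcha are the actions the API treats as CAPTCHA instructions; that grouping, the helper name captcha_steps_allowed, and the selectors in the sample are assumptions, not statements from the reference.

```python
# Assumed set of actions that count as CAPTCHA instructions.
CAPTCHA_ACTIONS = {"generalImageCaptcha", "fillImageCaptcha", "resolveAudioCaptcha"}


def captcha_steps_allowed(query_options, instructions):
    """True if no CAPTCHA action is used, or captcha and jsEnabled are both on."""
    uses_captcha = any(CAPTCHA_ACTIONS.intersection(step) for step in instructions)
    if not uses_captcha:
        return True
    return query_options.get("captcha") is True and query_options.get("jsEnabled") is True


captcha_step = {
    "generalImageCaptcha": [
        {
            "imagePath": "//img[@id='captcha']",         # hypothetical selector
            "textField": "//input[@id='captchaText']",   # hypothetical selector
            "model": "Model_1",
        }
    ]
}

print(captcha_steps_allowed({"jsEnabled": True, "captcha": True}, [captcha_step]))
print(captcha_steps_allowed({"jsEnabled": True}, [captcha_step]))
```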

Fields for extract Action

When an extract action is provided inside an instruction, the following sub-fields define what content to extract and how to locate it.

html (optional, String)

CSS selector or XPath to extract HTML content. Example: "/html/body"

text (optional, String)

CSS selector or XPath to extract text content. Example: "/html/body/div/div[2]/text"

user_data (optional, String)

CSS selector or XPath to extract user data. Example: "/html/body/div/div[2]"

Instructions (Static Scraping)

When jsEnabled is false or omitted (static scraping), each object in the instructions array can use the following action fields. These do not require a browser environment.

postForm (optional, Object)

Submits a form using the POST method. Provide the form’s XPath/CSS selector and input values.

getForm (optional, Object)

Submits a form using the GET method. Provide the form’s XPath/CSS selector and input values.

getPage (optional, String)

Retrieves page content.

extract (optional, Object)

Defines what data to extract and how to extract it.

Fields for getForm and postForm Actions

When a getForm or postForm action is provided inside an instruction, the following sub-fields define the form target and the data to submit.

selector (required, String)

Form selector (XPath or CSS) identifying the form element.

data (required, Object)

Object containing the form input values. Each key is the field name and the value is the data to submit (e.g., { "username": "myuser" }).
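A static-scraping workflow that submits a form and then extracts text might be put together like this. It is a sketch only: form_step is a hypothetical helper, and the form selector and field names are invented for illustration.

```python
import json

FORM_ACTIONS = {"GET": "getForm", "POST": "postForm"}


def form_step(method, selector, data):
    """Build a getForm or postForm instruction with its two required sub-fields."""
    action = FORM_ACTIONS.get(method.upper())
    if action is None:
        raise ValueError("method must be GET or POST")
    if not selector or not isinstance(data, dict):
        raise ValueError("selector and a data object are both required")
    return {action: {"selector": selector, "data": dict(data)}}


instructions = [
    form_step("post", "//form[@id='login']", {"username": "myuser"}),  # hypothetical form
    {"extract": {"text": "/html/body"}},
]
print(json.dumps({"instructions": instructions}, indent=2))
```

Send a body like this with jsEnabled omitted or set to false, since these actions belong to the static mode.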

Fields for extract Action

When an extract action is provided inside an instruction, the following sub-fields define what content to extract and how to locate it.

html (optional, String)

CSS selector or XPath to extract HTML content. Example: "/html/body"

text (optional, String)

CSS selector or XPath to extract text content. Example: "/html/body/div/div[2]/text"

API Response Schema

A successful request returns a 200 OK response with a JSON object containing the extracted data. The extractedData object is dynamic and reflects the field names defined in your extract instruction.

title (String), content (String), <key> (Array | String), links (Array), screenshot (String)
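Since the keys of extractedData follow the extract instruction, a client can compare what it asked for with what came back. The sketch below assumes the response shape shown in the example on this page; extracted_fields is a hypothetical helper, and the sample response text is a shortened version of that example.

```python
import json


def extracted_fields(response_text, instructions):
    """Return (extractedData, requested keys that are missing from it)."""
    payload = json.loads(response_text)
    data = payload.get("extractedData", {})
    # Collect every key named under an "extract" action in the workflow.
    requested = [key for step in instructions for key in step.get("extract", {})]
    missing = [key for key in requested if key not in data]
    return data, missing


sample_response = '{"extractedData": {"html": "El nombre de dominio: google.sv"}}'
workflow = [
    {"wait": 1000},
    {"extract": {"html": "//*[@id='DivResultado']/div", "text": "/html/body"}},
]
data, missing = extracted_fields(sample_response, workflow)
print(data["html"])
print(missing)
```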

Error Status & Codes

See the HTTP Status Codes documentation for a complete reference of common API errors and HTTP status codes.

400: please pass correct parameters
400: The CSS or XPath selector provided is incorrect or unparseable.
400: Timed out while finding or interacting with elements on the page.
400: Missing required fields in postForm or getForm instruction.
400: The target URL returned a non-200 HTTP response.
400: The instruction contains an invalid or unsupported method.
400: One or more required parameters are missing or invalid.
400: An I/O error occurred while processing the request.
400: CAPTCHA was attempted but failed after retries.
400: CAPTCHA instructions were provided but captcha is not enabled in parameters.
400: URL cannot be null or empty. Please enter a valid URL.
400: An unexpected error occurred. Proxy connection failed.
404: Scraping stopped due to an Unknown Exception [Protocol error (Page.navigate): Cannot navigate to invalid URL].
405: Wrong HTTP method was used on the endpoint.
408: The request was not completed within the expected time frame.
429: Too many requests. Please wait before making additional requests.
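Of the statuses listed, 408 and 429 describe conditions that may clear on a later attempt, while the others point at the request itself. The sketch below retries only those two with exponential backoff. The retry policy, the delays, and the helper name call_with_retries are assumptions; the reference does not document a recommended retry strategy. The send argument is any callable that returns a (status, body) pair.

```python
import time

# Statuses treated as transient in this sketch (an assumption, not API policy).
RETRYABLE_STATUSES = {408, 429}


def call_with_retries(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        last_attempt = attempt == max_attempts - 1
        if status not in RETRYABLE_STATUSES or last_attempt:
            return status, body
        # Wait 1x, 2x, 4x ... the base delay before the next attempt.
        sleep(base_delay * 2 ** attempt)


# Usage with a fake sender, so the example runs without network access.
fake_responses = iter([(429, ""), (408, ""), (200, '{"extractedData": {}}')])
delays = []
result = call_with_retries(lambda: next(fake_responses), sleep=delays.append)
print(result[0], delays)
```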
