Crawl API

Automate content extraction from any domain. Simply define the root URL and retrieve the full website content as Markdown, Text, HTML, or JSON files.

No credit card required
  • Map entire site structures in one request
  • Capture both static and dynamic web content
  • Flexible for SEO, AI, and compliance needs
  • Integrates with popular dev frameworks and no-code tools
TRUSTED BY 20,000+ CUSTOMERS WORLDWIDE

const options = {
  method: 'POST',
  // Append your API token after "Bearer "
  headers: {Authorization: 'Bearer ', 'Content-Type': 'application/json'},
  body: '[{"url":"https://il.linkedin.com/company/bright-data"}]'
};

fetch('https://api.brightdata.com/datasets/v3/trigger', options)
  .then(response => response.json())
  .then(response => console.log(response))
  .catch(err => console.error(err));

import requests

url = "https://api.brightdata.com/datasets/v3/trigger"

payload = [{"url": "https://il.linkedin.com/company/bright-data"}]
headers = {
    "Authorization": "Bearer ",  # append your API token after "Bearer "
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.text)

Easy to start, easier to scale

  1. Choose target domain
    Define target URL and connect to the API with a single line of code
  2. Send request
    Edit crawl parameters and insert your custom logic using Python or JavaScript
  3. Get your data
    Retrieve website data as Markdown, Text, HTML, or JSON files
Read documentation

Developer-first experience

Quick Start

Connect to the Crawl API with a single line of code, or get results directly through the Control Panel with no code.

Custom Collection

Use request parameters to customize collection and delivery, including pagination, scheduling and log collection.
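As an illustration, collection options are typically attached to the trigger request as query parameters. This is a minimal sketch; the parameter names below (`include_errors`, `limit_per_input`) are assumptions for the example, so check the Crawl API documentation for the exact names your account supports.

```python
# Sketch: build query parameters and a payload for a trigger request.
# NOTE: the parameter names here are illustrative assumptions, not a
# definitive list; consult the Crawl API docs for the supported set.

def build_trigger_request(urls, include_errors=True, limit_per_input=None):
    """Return (params, payload) for a POST to the trigger endpoint."""
    params = {"include_errors": str(include_errors).lower()}
    if limit_per_input is not None:
        params["limit_per_input"] = limit_per_input  # cap pages per root URL
    payload = [{"url": u} for u in urls]
    return params, payload

params, payload = build_trigger_request(
    ["https://il.linkedin.com/company/bright-data"], limit_per_input=100
)
print(params)   # {'include_errors': 'true', 'limit_per_input': 100}
print(payload)  # [{'url': 'https://il.linkedin.com/company/bright-data'}]
```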

Data Parsing

The API efficiently converts raw HTML into structured data files, delivered as Markdown, Text, HTML, or JSON, directly to your database.

Crawl API pricing

PAY AS YOU GO
$1.5 /1K RECORDS
No commitment
Free trial

Pay as you go with no monthly commitment
25% OFF
GROWTH
$1.3
$0.98 /1K RECORDS
$499 Billed monthly
Free trial
Use this coupon code: APIS25

Tailor-made for teams looking to scale their operations.
25% OFF
BUSINESS
$1.1
$0.83 /1K RECORDS
$999 Billed monthly
Free trial
Use this coupon code: APIS25

Designed for large teams with extensive operational needs
25% OFF
PREMIUM
$1
$0.75 /1K RECORDS
$1999 Billed monthly
Free trial
Use this coupon code: APIS25

Advanced support and resources for mission-critical operations
ENTERPRISE
Elite data services for top-tier enterprise requirements.
CONTACT US
  • Account manager
  • Custom packages
  • Premium SLA
  • Priority support
  • Custom onboarding
  • SSO
  • Customizations
  • Audit logs

Leading the way in ethical web data collection

Bright Data sets the gold standard in compliance, effectively self-regulating the industry. With transparent operations validated by top security firms, clear peer consent, and pioneering compliance units, we ensure legitimate and safe data collection. Upholding international privacy laws and utilizing tools like BrightBot, we minimize your legal exposure, making partnership with us a strategic move to curtail legal risks and associated costs.

Start free trial

Every 15 minutes, our customers scrape enough data to train ChatGPT from scratch.

API for Seamless Crawl Data Access

Comprehensive, Scalable, and Compliant Crawl Data Extraction

FLEXIBLE

Tailored to your workflow

Get structured data in JSON, NDJSON, or CSV files through Webhook or API delivery.
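As a sketch of consuming one of these formats: NDJSON encodes one JSON object per line, so a delivery can be parsed line by line. The `url` and `markdown` field names in the sample are illustrative assumptions, not a documented schema.

```python
import json

def parse_ndjson(text):
    """Parse an NDJSON string (one JSON object per line) into a list of dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

# Hypothetical two-record delivery; field names are illustrative.
sample = (
    '{"url": "https://example.com", "markdown": "# Home"}\n'
    '{"url": "https://example.com/about", "markdown": "# About"}\n'
)
records = parse_ndjson(sample)
print(len(records))       # 2
print(records[0]["url"])  # https://example.com
```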

SCALABLE

Built-in infrastructure and unblocking

Get maximum control and flexibility without maintaining proxy and unblocking infrastructure. Easily scrape data from any geo-location while avoiding CAPTCHAs and blocks.

STABLE

Battle-proven infrastructure

Bright Data’s platform powers 20,000+ companies worldwide, offering peace of mind with 99.99% uptime and access to 150M+ real user IPs covering 195 countries.

COMPLIANT

Industry leading compliance

Our privacy practices comply with data protection laws, including the EU data protection regulatory framework, GDPR, and CCPA – respecting requests to exercise privacy rights and more.

Want to learn more?

Talk to an expert to discuss your scraping needs.

Crawl API FAQs

What is Bright Data’s Crawl API?

Bright Data’s Crawl API is a tool that lets you extract, map, and transform content from any website into structured data in formats like HTML, Markdown, and JSON, making it easy to use for AI training, SEO, compliance audits, and more.

What content can I crawl?

You can crawl any public website, extracting both static and dynamic content such as articles, product listings, reviews, and complete site structures from any domain worldwide.

What output formats are supported?

Crawl API delivers results in multiple formats, including Markdown, HTML, plain text, and structured schemas like ld_json. Choose the format that best fits your workflow.

How do I start a crawl?

Simply send an HTTP POST request to the API with your target URLs and preferred output format. You’ll receive a snapshot_id, which you can use to fetch the collected data once it's ready.
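This flow can be sketched in Python. A sketch under stated assumptions: the snapshot endpoint path below is inferred from the trigger URL rather than confirmed, and `YOUR_API_TOKEN` is a placeholder; verify both against the API reference.

```python
import time

import requests  # third-party HTTP client (pip install requests)

API_TOKEN = "YOUR_API_TOKEN"  # placeholder: your Bright Data API token
TRIGGER_URL = "https://api.brightdata.com/datasets/v3/trigger"
# Assumed snapshot path, inferred from the trigger URL; confirm in the docs.
SNAPSHOT_URL = "https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}"

def extract_snapshot_id(trigger_response):
    """Pull the snapshot_id field out of a trigger response body."""
    return trigger_response["snapshot_id"]

def crawl(urls, poll_seconds=10):
    """Trigger a crawl, then poll until the snapshot is ready."""
    headers = {"Authorization": f"Bearer {API_TOKEN}"}
    resp = requests.post(TRIGGER_URL, headers=headers,
                         json=[{"url": u} for u in urls])
    resp.raise_for_status()
    snapshot_id = extract_snapshot_id(resp.json())
    while True:
        data = requests.get(SNAPSHOT_URL.format(snapshot_id=snapshot_id),
                            headers=headers)
        if data.status_code == 200:  # snapshot is ready
            return data.json()
        time.sleep(poll_seconds)     # still building; wait and retry

# Usage: records = crawl(["https://il.linkedin.com/company/bright-data"])
```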

Can I use the Crawl API without coding?

Yes! Use the no-code option in the Bright Data Control Panel. Just enter your URLs, select an output format, and start crawling with no coding required.

How are results delivered?

Results can be delivered via webhook, downloaded through the API or Control Panel, or sent to your preferred external storage (such as AWS S3, Google Cloud Storage, etc.).

Can I schedule recurring crawls?

Yes, the Crawl API supports scheduling, so you can automate crawls daily, weekly, or on a custom timetable to keep your datasets up to date.

Does the Crawl API integrate with developer tools?

Absolutely! The API integrates seamlessly with Python, Node.js, BeautifulSoup, Cheerio, and many other popular libraries for developer flexibility.

What are common use cases?

Customers use the Crawl API for LLM training dataset creation, SEO site audits, competitive research, compliance/accessibility checks, and website content migration and archiving.

How do I troubleshoot failed crawls?

You can include detailed error logs via the include_errors parameter for every crawl. Troubleshoot issues efficiently, or reach out to Bright Data support for further assistance.
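Once error logs are included, failed pages can be separated from successful ones in the result set. A minimal sketch, assuming failed records carry an `error` field (an illustrative assumption, not a documented schema):

```python
def split_errors(records):
    """Split crawl records into (successes, failures) by the presence of an
    'error' field. The field name is an illustrative assumption."""
    ok = [r for r in records if not r.get("error")]
    failed = [r for r in records if r.get("error")]
    return ok, failed

# Hypothetical result set with one failed page.
sample = [
    {"url": "https://example.com", "markdown": "# Home"},
    {"url": "https://example.com/404", "error": "page not found"},
]
ok, failed = split_errors(sample)
print(len(ok), len(failed))  # 1 1
```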