
Structured Web Extraction API

Extract clean content from any webpage: markdown, metadata, links, images, and schema-defined fields. Powered by Readability + Playwright.

$0.005 / extraction

About this tool

Unlike a plain scraper, /extract uses Mozilla's Readability algorithm to strip navbars, ads, footers, and sidebars, leaving only the meaningful content. The output includes clean markdown, title, description, author, publication date, links, and images. You can also pass a schema object to extract specific fields via CSS selectors or regex.


Quick Start

curl -X POST https://api.iteratools.com/extract \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/article",
    "wait_for": 2000
  }'

With Schema Extraction

Pass a schema object to extract specific fields using CSS selectors. If a selector is not found, extraction falls back to regex.

curl -X POST https://api.iteratools.com/extract \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://shop.example.com/product",
    "schema": {
      "price": ".price, .product-price",
      "name": "h1",
      "rating": "[data-rating]"
    }
  }'
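For callers who prefer Python to curl, the same request can be sketched with the standard library alone. This is a minimal sketch, not an official client: the function name `build_extract_request` is hypothetical, and only the URL, headers, and body fields shown in the curl example above are taken from this page. The request is built but not sent, so running the snippet costs nothing.

```python
import json
import urllib.request

# Base URL plus endpoint, as listed on this page.
API_URL = "https://api.iteratools.com/extract"


def build_extract_request(api_key, url, schema=None, wait_for=None):
    """Build (but do not send) a POST /extract request.

    Hypothetical helper: optional fields are omitted from the body
    when the caller does not supply them.
    """
    payload = {"url": url}
    if schema is not None:
        payload["schema"] = schema
    if wait_for is not None:
        payload["wait_for"] = wait_for
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_extract_request(
    "YOUR_KEY",
    "https://shop.example.com/product",
    schema={
        "price": ".price, .product-price",
        "name": "h1",
        "rating": "[data-rating]",
    },
)

# Sending is left commented out: it needs a real key and is a paid call.
# with urllib.request.urlopen(request, timeout=30) as response:
#     result = json.load(response)
```

Keeping construction separate from sending makes it easy to inspect or log the exact body before spending an extraction on it.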

Response

{
  "ok": true,
  "data": {
    "title": "Article Title",
    "description": "Meta description...",
    "author": "Jane Doe",
    "published_date": "2024-01-15",
    "markdown": "# Article Title\n\nClean article content...",
    "links": [{ "text": "Read more", "href": "https://..." }],
    "images": [{ "src": "https://example.com/img.jpg", "alt": "..." }],
    "word_count": 842,
    "url": "https://example.com/article",
    "schema_data": { "price": "$29.99", "name": "Product Name" }
  }
}
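Once the JSON body has been parsed, pulling out the fields you need is straightforward. The sketch below works on a trimmed copy of the sample response above. The helper name `summarize_extraction` is hypothetical, and because this page does not document what a failed response looks like, the `ok` check simply raises a generic error; that handling is an assumption.

```python
# Trimmed copy of the sample response shown above, as a parsed dict.
sample_body = {
    "ok": True,
    "data": {
        "title": "Article Title",
        "author": "Jane Doe",
        "published_date": "2024-01-15",
        "markdown": "# Article Title\n\nClean article content...",
        "links": [{"text": "Read more", "href": "https://..."}],
        "word_count": 842,
        "schema_data": {"price": "$29.99", "name": "Product Name"},
    },
}


def summarize_extraction(body):
    """Return a small summary of a parsed /extract response.

    Assumption: a falsy "ok" means the extraction failed. The error
    payload is not documented here, so no error fields are read.
    """
    if not body.get("ok"):
        raise RuntimeError("extraction failed")
    data = body["data"]
    return {
        "title": data.get("title"),
        "word_count": data.get("word_count"),
        "link_count": len(data.get("links", [])),
        # schema_data is only meaningful when a schema was sent.
        "schema_data": data.get("schema_data", {}),
    }


summary = summarize_extraction(sample_body)
```

Using `.get()` with defaults keeps the helper tolerant of fields that may be absent, for example `schema_data` on a request that sent no schema.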

Request Parameters

url (required): the URL to extract content from
schema (optional): object mapping field names to CSS selectors or regex patterns
wait_for (optional): milliseconds to wait for JS rendering (default: 2000, max: 5000)
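The parameter rules above can be checked on the client before a request is made. The following is a minimal sketch with a hypothetical helper, `make_payload`. How the server treats an out-of-range `wait_for` is not stated on this page, so rejecting it locally (and treating 0 as the lower bound) is an assumption rather than documented behavior.

```python
DEFAULT_WAIT_MS = 2000  # documented default
MAX_WAIT_MS = 5000      # documented maximum


def make_payload(url, schema=None, wait_for=DEFAULT_WAIT_MS):
    """Validate parameters and return the JSON-serializable request body."""
    if not isinstance(url, str) or not url:
        raise ValueError("url is required")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(wait_for, bool) or not isinstance(wait_for, int):
        raise ValueError("wait_for must be an integer number of milliseconds")
    if not 0 <= wait_for <= MAX_WAIT_MS:
        raise ValueError(f"wait_for must be between 0 and {MAX_WAIT_MS}")
    payload = {"url": url, "wait_for": wait_for}
    if schema is not None:
        if not isinstance(schema, dict):
            raise ValueError("schema must map field names to selectors")
        payload["schema"] = schema
    return payload


payload = make_payload("https://example.com/article", schema={"name": "h1"})
```

Failing fast on the client avoids paying for a request that may be rejected or silently adjusted by the server.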

Details

Endpoint: POST /extract
Price: $0.005 / extraction
Auth: Bearer token or x402 micropayment
Engine: Mozilla Readability + Playwright (JS rendering)
Base URL: https://api.iteratools.com
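For budgeting a batch job, the per-extraction price can be turned into a quick estimate. This sketch assumes every call is billed at the listed flat rate; whether failed extractions are charged, or whether volume discounts exist, is not stated on this page. The name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Listed price in USD per extraction. Decimal avoids float rounding.
PRICE_PER_EXTRACTION = Decimal("0.005")


def estimate_cost(extraction_count):
    """Estimate the USD cost of a number of extractions at the flat rate."""
    if extraction_count < 0:
        raise ValueError("extraction_count must be non-negative")
    return PRICE_PER_EXTRACTION * extraction_count


batch_cost = estimate_cost(10_000)  # 10,000 pages at $0.005 each is $50
```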