API Reference

Complete reference for the PageSight API endpoints, parameters, and responses.

GET /api/scrape

Extracts webpage metadata and content for the categories you specify. Every request must include the params query parameter: it selects the categories to extract and forms part of the cache key, so it is required even when previously scraped data is served from the cache.

Endpoint
https://pagesight.com/api/scrape

Query Parameters

Parameter | Type | Required | Description
url | string | Yes | The URL of the webpage to analyze (must be HTTP/HTTPS)
params | string | Yes | Required. Comma-separated list of categories to extract. Used to determine cache key. If not provided, request will fail. Valid categories: metadata, openGraph, twitterCard, favicon, images, and more...
format | string | No | Response format: json (default) or toon
cacheTime | number | No | Cache duration in minutes (Pro users only, minimum 5 minutes)
revalidate | boolean | No | Force cache refresh by setting revalidate=true (Pro users only)

⚠️ Important: params Parameter

The params parameter is required for all requests. It serves two purposes:

  • Cache Key Generation: The combination of url + params + format creates a unique cache key
  • Data Extraction: Specifies which categories of data to extract from the webpage

Example: ?url=https://example.com&params=metadata,openGraph
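
A minimal sketch of building such a query string with Python's standard library. The helper name build_scrape_url is hypothetical, not part of PageSight. The target URL is percent-encoded, which is a general precaution when passing one URL inside another; this reference does not say whether the API requires it.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://pagesight.com/api/scrape"  # endpoint from this reference


def build_scrape_url(page_url, categories, fmt=None):
    """Return a GET /api/scrape URL (hypothetical helper, not a PageSight API)."""
    if not categories:
        raise ValueError("params is required: pass at least one category")
    query = {"url": page_url, "params": ",".join(categories)}
    if fmt is not None:
        query["format"] = fmt  # "json" (default) or "toon"
    # safe="," leaves the category separators unescaped; everything else that
    # is reserved in a query string is percent-encoded.
    return API_ENDPOINT + "?" + urlencode(query, safe=",")


print(build_scrape_url("https://example.com", ["metadata", "openGraph"]))
```

Because url, params, and format together form the cache key, sending the same values consistently is a reasonable habit. Whether the server treats a different category order as a different key is not stated here.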

Valid Categories

The following categories can be used in the params parameter (comma-separated):

metadata, openGraph, twitterCard, favicon, images, robots, sitemap, content, structuredData, technical, mobileView, desktopView, performance, accessibility, security, social, analytics, links, forms, media, techStack, infrastructure
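
A client-side check against this list can catch typos before a request is sent. The sketch below assumes the names are case-sensitive and spelled exactly as listed above. How the API itself responds to an unknown category is not documented here, and check_categories is a hypothetical helper.

```python
# Category names as listed in this reference.
VALID_CATEGORIES = frozenset({
    "metadata", "openGraph", "twitterCard", "favicon", "images", "robots",
    "sitemap", "content", "structuredData", "technical", "mobileView",
    "desktopView", "performance", "accessibility", "security", "social",
    "analytics", "links", "forms", "media", "techStack", "infrastructure",
})


def check_categories(categories):
    """Return the comma-separated params value, or raise on unknown names."""
    unknown = [name for name in categories if name not in VALID_CATEGORIES]
    if unknown:
        raise ValueError("unknown categories: " + ", ".join(unknown))
    return ",".join(categories)


print(check_categories(["metadata", "openGraph", "twitterCard"]))
```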

Basic Request

A simple request to extract basic metadata. Note: params is required.

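The original example did not survive extraction. As a stand-in, here is a minimal sketch of a basic request using only the Python standard library. The function names are hypothetical, and credentials are left out because this reference does not describe how requests are authenticated.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ENDPOINT = "https://pagesight.com/api/scrape"  # endpoint from this reference


def basic_request_url(page_url):
    """Build the URL for a metadata-only request."""
    return API_ENDPOINT + "?" + urlencode({"url": page_url, "params": "metadata"})


def fetch_basic_metadata(page_url, timeout=30):
    """Send the request and decode the JSON body.

    Assumption: no authentication is added here; a real call may need it.
    """
    with urlopen(basic_request_url(page_url), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL; call fetch_basic_metadata() to hit the network.
    print(basic_request_url("https://example.com"))
```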

Request with Specific Categories

Extract specific data categories from a webpage. Multiple categories can be specified.


Custom Cache Duration (Pro Only)

Set a custom cache duration for your request. Minimum 5 minutes.

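A sketch of adding cacheTime to a request, with the documented 5-minute minimum checked on the client before sending. The helper name is hypothetical, and how the API responds to a smaller value or to a non-Pro account is not described in this reference.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://pagesight.com/api/scrape"  # endpoint from this reference
MIN_CACHE_MINUTES = 5  # minimum stated in this reference


def cached_request_url(page_url, categories, cache_minutes):
    """Build a request URL with a custom cache duration in minutes."""
    if cache_minutes < MIN_CACHE_MINUTES:
        raise ValueError(f"cacheTime must be at least {MIN_CACHE_MINUTES} minutes")
    query = {
        "url": page_url,
        "params": ",".join(categories),
        "cacheTime": cache_minutes,  # urlencode converts the number to text
    }
    return API_ENDPOINT + "?" + urlencode(query, safe=",")


print(cached_request_url("https://example.com", ["metadata"], 60))
```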

Cache Revalidation (Pro Only)

Force a fresh scrape by bypassing the cache.

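A sketch of a revalidation request. The one detail worth showing is that the value is sent as the literal string true, as written in the parameter table; passing Python's True through urlencode would produce "True" instead, and this reference does not say whether the API accepts that. The helper name is hypothetical.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://pagesight.com/api/scrape"  # endpoint from this reference


def revalidate_url(page_url, categories):
    """Build a request URL that asks for a fresh scrape."""
    query = {
        "url": page_url,
        "params": ",".join(categories),
        # Lowercase string on purpose; str(True) would encode as "True".
        "revalidate": "true",
    }
    return API_ENDPOINT + "?" + urlencode(query, safe=",")


print(revalidate_url("https://example.com", ["metadata", "favicon"]))
```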

Response Format

Successful responses return JSON with the following structure.

Success Response (200)

Response Headers

Header | Description
X-Cache | Cache status: HIT or MISS
X-Cache-Expires-At | ISO timestamp when cache expires
X-RateLimit-Limit | Your rate limit (requests per minute)
X-RateLimit-Remaining | Remaining requests in current window
X-RateLimit-Reset | ISO timestamp when rate limit resets
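
A sketch of reading these headers on the client. The helper names are hypothetical, and the sample values below are illustrative, not captured from the API. The exact timestamp layout is not specified beyond "ISO", so the parser also accepts a trailing Z.

```python
from datetime import datetime


def parse_iso(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_headers(headers):
    """Pull cache and rate-limit details out of a header mapping."""
    expires = headers.get("X-Cache-Expires-At")
    reset = headers.get("X-RateLimit-Reset")
    return {
        "cache_hit": headers.get("X-Cache") == "HIT",
        "cache_expires_at": parse_iso(expires) if expires else None,
        "limit": int(headers.get("X-RateLimit-Limit", 0)),
        "remaining": int(headers.get("X-RateLimit-Remaining", 0)),
        "rate_limit_resets_at": parse_iso(reset) if reset else None,
    }


# Illustrative values only.
sample = {
    "X-Cache": "HIT",
    "X-Cache-Expires-At": "2025-01-01T12:05:00Z",
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "59",
    "X-RateLimit-Reset": "2025-01-01T12:01:00Z",
}
print(summarize_headers(sample))
```

A plain dict lookup is case-sensitive, whereas the headers object returned by urllib.request matches names case-insensitively, so the same function works with either as long as the dict uses the casing shown in the table.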

Error Responses

400 Bad Request - Missing params


401 Unauthorized


429 Too Many Requests

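The error bodies did not survive extraction, so the sketch below works from status codes alone. It maps each documented status to a client-side action and, for 429, waits until the time given in X-RateLimit-Reset. The function names, the action labels, and the 60-second fallback are assumptions for illustration, as is the presence of X-RateLimit-Reset on a 429 response.

```python
from datetime import datetime, timezone

FALLBACK_WAIT_SECONDS = 60.0  # assumption: used when no reset header is present


def seconds_until(reset_iso, now=None):
    """Seconds from now until an ISO 8601 timestamp, never negative."""
    reset = datetime.fromisoformat(reset_iso.replace("Z", "+00:00"))
    if reset.tzinfo is None:
        reset = reset.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    now = now or datetime.now(timezone.utc)
    return max(0.0, (reset - now).total_seconds())


def plan_for_status(status, headers, now=None):
    """Map a status code from this reference to (action, wait_seconds)."""
    if status == 200:
        return ("ok", 0.0)
    if status == 400:
        return ("fix_request", 0.0)  # e.g. the params query parameter is missing
    if status == 401:
        return ("check_credentials", 0.0)
    if status == 429:
        reset = headers.get("X-RateLimit-Reset")
        wait = seconds_until(reset, now) if reset else FALLBACK_WAIT_SECONDS
        return ("retry_later", wait)
    return ("unexpected", 0.0)


print(plan_for_status(400, {}))
```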