One-Liner Scrape, Screenshot, and PDF

Scrape content, take screenshots, or generate PDFs from any URL - without managing sessions, browsers, or cleanup yourself. You provide a URL, you get back the result. The TestMu AI Browser SDK handles everything behind the scenes.

This is useful when your agent needs data from a page but doesn't need to interact with it beyond extraction.

Scrape

Extract content from any webpage in your choice of format.

Simplest form - just pass a URL:

const result = await client.scrape('https://example.com');
console.log(result.content);

With options - control format, timing, and selectors:

const result = await client.scrape({
  url: 'https://example.com',
  format: 'markdown',   // 'html' | 'markdown' | 'text' | 'readability'
  delay: 3000,          // Wait 3 seconds for JS-heavy pages
  waitFor: '#content',  // Wait for this CSS selector before extracting
});

Response:

interface ScrapeResponse {
  title: string;                      // Page title
  content: string;                    // Extracted content in requested format
  url: string;                        // Final URL (after redirects)
  markdown?: string;                  // Markdown version
  html?: string;                      // Raw HTML
  metadata?: Record<string, string>;  // Meta tags
}
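To illustrate how the optional fields might be consumed, here is a minimal sketch. The interface is copied from the response shape above; `bestText` and `metaTag` are hypothetical helpers introduced for this example and are not part of the SDK.

```typescript
// Shape copied from the response documented above.
interface ScrapeResponse {
  title: string;
  content: string;
  url: string;
  markdown?: string;
  html?: string;
  metadata?: Record<string, string>;
}

// Hypothetical helper (not an SDK function): prefer the markdown field when it
// is present and non-empty, otherwise fall back to `content`.
function bestText(res: ScrapeResponse): string {
  return res.markdown !== undefined && res.markdown.length > 0
    ? res.markdown
    : res.content;
}

// Hypothetical helper: read one meta tag, returning undefined when the
// response carries no metadata or the tag is missing.
function metaTag(res: ScrapeResponse, name: string): string | undefined {
  return res.metadata?.[name];
}
```

Because `markdown`, `html`, and `metadata` are optional, code that reads them should handle the case where they are absent, as the fallbacks above do.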

Choosing a format:

| Format | What You Get | When To Use It |
| --- | --- | --- |
| html | Raw HTML of the page | When your agent needs to parse the DOM |
| text | Plain text, tags stripped | When you want minimal tokens for LLM input |
| readability | Cleaned article content (like Reader Mode) | When the page has an article you want to extract |
| markdown | HTML converted to markdown | When you want structure + low token count for LLMs |
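The format guidance above can be encoded as a small decision function. This is only a sketch: the `ExtractionNeed` shape, the `chooseFormat` name, and the priority order of the checks are assumptions made for illustration, not SDK behavior.

```typescript
// The four format values listed in the scrape options above.
type ScrapeFormat = 'html' | 'markdown' | 'text' | 'readability';

// Hypothetical description of what the caller needs from the page.
interface ExtractionNeed {
  needsDom: boolean;      // the agent will parse the DOM itself
  isArticle: boolean;     // the page is primarily a single article
  keepStructure: boolean; // headings, lists, and links should survive
}

// Hypothetical helper: map a need to a format. The order of the checks is an
// assumption; adjust it to your own priorities.
function chooseFormat(need: ExtractionNeed): ScrapeFormat {
  if (need.needsDom) return 'html';
  if (need.isArticle) return 'readability';
  if (need.keepStructure) return 'markdown';
  return 'text'; // smallest output when nothing else is required
}
```

The returned value can then be passed as the `format` option of a scrape call.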

Screenshot

Capture a visual snapshot of any webpage.

import * as fs from 'node:fs';

// Simple
const result = await client.screenshot('https://example.com');
fs.writeFileSync('screenshot.png', result.data);

// With options
const fullPageResult = await client.screenshot({
  url: 'https://example.com',
  fullPage: true,   // Capture entire scrollable page
  format: 'jpeg',   // 'png' | 'jpeg' | 'webp'
  quality: 80,      // JPEG/WebP quality (1-100)
  delay: 2000,      // Wait 2 seconds before capturing
});

Response: { data: Buffer, format: string, width: number, height: number }
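Since `quality` is documented as applying to JPEG/WebP with a 1-100 range, a caller may want to normalize options before sending them. The sketch below assumes that behavior on the caller's side only; `screenshotOptions` and `screenshotFileName` are hypothetical helpers, and the SDK may well validate these values itself.

```typescript
// Image formats listed in the screenshot options above.
type ImageFormat = 'png' | 'jpeg' | 'webp';

// Subset of the options shown above that this sketch deals with.
interface ScreenshotOptions {
  url: string;
  format: ImageFormat;
  quality?: number;
}

// Hypothetical helper: drop `quality` for PNG and clamp it to 1-100 otherwise.
function screenshotOptions(url: string, format: ImageFormat, quality?: number): ScreenshotOptions {
  const opts: ScreenshotOptions = { url, format };
  if (format !== 'png' && quality !== undefined) {
    opts.quality = Math.min(100, Math.max(1, Math.round(quality)));
  }
  return opts;
}

// Hypothetical helper: build a file name whose extension matches the format,
// using the conventional ".jpg" for JPEG.
function screenshotFileName(baseName: string, format: ImageFormat): string {
  const ext = format === 'jpeg' ? 'jpg' : format;
  return `${baseName}.${ext}`;
}
```

Deriving the file name from the format avoids writing JPEG bytes to a file named `.png`.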

PDF

Generate a PDF document from any webpage.

import * as fs from 'node:fs';

// Simple
const result = await client.pdf('https://example.com');
fs.writeFileSync('page.pdf', result.data);

// With options
const a4Result = await client.pdf({
  url: 'https://example.com',
  format: 'A4',           // 'A4' | 'Letter' | 'Legal'
  landscape: false,
  printBackground: true,
  margin: { top: '1cm', right: '1cm', bottom: '1cm', left: '1cm' },
});

Response: { data: Buffer, pageCount: number }
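When every margin is the same, the options object can be assembled with a small helper. This is a sketch under assumptions: the option names mirror the example above, while `uniformMargin` and `printOptions` are hypothetical names, not SDK functions.

```typescript
// Paper sizes listed in the PDF options above.
type PaperFormat = 'A4' | 'Letter' | 'Legal';

// Margin shape matching the `margin` option shown above.
interface PdfMargin {
  top: string;
  right: string;
  bottom: string;
  left: string;
}

interface PdfOptions {
  url: string;
  format: PaperFormat;
  landscape: boolean;
  printBackground: boolean;
  margin: PdfMargin;
}

// Hypothetical helper: the same CSS length on all four sides.
function uniformMargin(size: string): PdfMargin {
  return { top: size, right: size, bottom: size, left: size };
}

// Hypothetical helper: portrait page with backgrounds and 1cm margins,
// defaulting to A4 when no paper format is given.
function printOptions(url: string, format: PaperFormat = 'A4'): PdfOptions {
  return { url, format, landscape: false, printBackground: true, margin: uniformMargin('1cm') };
}
```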

Standalone vs Session Mode

Quick Actions (scrape, screenshot, and PDF) operate in two modes:

Standalone mode (default). The SDK creates a temporary headless browser with stealth enabled, navigates to the URL, performs the operation, and closes the browser. The whole cycle is automatic, so no session management is needed.

Session mode. If your agent already has a session running and you want the Quick Action to use that session's browser (with its cookies, tunnel, or extensions), register the page first:

const session = await client.sessions.create({ ... });
const browser = await client.puppeteer.connect(session);
const page = (await browser.pages())[0];

// Register the page for Quick Actions
client.quick.registerSessionPage(session.id, page);

// Now Quick Actions use the existing session
const result = await client.scrape({
  url: 'https://example.com',
  sessionId: session.id,
});

This is useful when the page requires authentication, a tunnel connection, or specific extensions to access.
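The difference between the two modes comes down to whether `sessionId` is present in the options. A minimal sketch, assuming that omitting `sessionId` leaves the call in standalone mode as described above; `scrapeOptionsFor` is a hypothetical helper, not part of the SDK.

```typescript
// Subset of the scrape options used in the examples above.
interface ScrapeOptions {
  url: string;
  sessionId?: string;
}

// Hypothetical helper: include `sessionId` only when a session is available,
// so the same call site works for standalone and session mode.
function scrapeOptionsFor(url: string, sessionId?: string): ScrapeOptions {
  return sessionId ? { url, sessionId } : { url };
}
```

An agent that only sometimes holds an authenticated session could pass the result to its scrape call either way; in session mode, the page still has to be registered first, as shown above.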