
title	Handling Dynamic Content
description	Learn how to handle JavaScript-heavy websites and dynamic content with html2rss using browser-based extraction strategies.

import { Code } from "@astrojs/starlight/components";

Some websites load their content dynamically using JavaScript. The default static fetch does not execute JavaScript, so it may miss this content or see it only partially.

Solution

Use a browser-based extraction strategy when JavaScript-heavy pages do not work with default static fetching.

The browserless strategy is the common choice for this workflow. The botasaurus strategy is an alternative browser-based option if you already run a Botasaurus scrape API.

Keep the strategy at the top level of the config and put request-specific options under the request key.
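
A minimal sketch of that layout. The URL and CSS selectors are placeholders rather than values from a real site, and the example assumes the usual html2rss channel and selectors keys:

```yaml
# The strategy sits at the top level, next to channel and selectors.
strategy: browserless
channel:
  url: https://example.com/news   # placeholder URL
selectors:
  items:
    selector: "article"           # placeholder: one match per feed item
  title:
    selector: "h2"                # placeholder: title inside each item
# Request-specific options (for example request.browserless.preload)
# nest under a top-level `request` key rather than under `strategy`.
```

Switching back to static fetching should then only require changing or removing the strategy line.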

When to Use Browser-Based Extraction

A browser-based extraction strategy is necessary when:

Content loads after page load - JavaScript fetches data from APIs
Single Page Applications (SPAs) - React, Vue, Angular apps
Infinite scroll - Content loads as you scroll
Dynamic forms - Content changes based on user interaction

Preload Actions

For dynamic sites, rendering once is often not enough. Use request.browserless.preload to wait, click, or scroll before the HTML snapshot is taken.

Wait Before Capturing Dynamic Content
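
A sketch of a fixed wait before the snapshot. The request.browserless.preload path comes from the text above, but the key name wait_after_ms and the value are assumptions for illustration; confirm the exact spelling in the Strategy Reference before relying on it:

```yaml
strategy: browserless
request:
  browserless:
    preload:
      # Assumed key: pause (in milliseconds) so late API responses can render.
      wait_after_ms: 5000
```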

Click "Load More" Buttons
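
A sketch of clicking a "Load More" button a few times. The keys click_selectors, max_clicks, and wait_after_ms are assumed names, and .load-more is a hypothetical selector; only the request.browserless.preload nesting is taken from this page:

```yaml
strategy: browserless
request:
  browserless:
    preload:
      # Assumed shape: a list of buttons to click before the snapshot.
      click_selectors:
        - selector: ".load-more"   # hypothetical button selector
          max_clicks: 3            # assumed: stop after three clicks
          wait_after_ms: 250       # assumed: pause after each click
```

Capping the number of clicks keeps the render time bounded on sites that never run out of items.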

Scroll Infinite Lists
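
A sketch of scrolling an infinite list. As with the other preload steps, scroll_down, iterations, and wait_after_ms are assumed key names and the numbers are arbitrary, so treat this as a shape to verify rather than a reference:

```yaml
strategy: browserless
request:
  browserless:
    preload:
      # Assumed shape: scroll to the bottom a fixed number of times.
      scroll_down:
        iterations: 5        # assumed: number of scroll passes
        wait_after_ms: 200   # assumed: pause so new items can load
```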

These preload steps can be combined in a single config when a site needs several interactions before all items appear.

Performance Considerations

Browser-based extraction is slower than default static HTTP fetching because it:

Launches a headless Chrome browser
Renders the full page with JavaScript
Takes more memory and CPU resources

Use static HTTP fetching for static content and switch to browser-based extraction when needed. See the Strategy Reference for concrete transports, defaults, and environment requirements.
