Utilities Module Implementation Summary

Overview

This implementation addresses the GitHub issue about extending trio-chrome-devtools-protocol with higher-level utility functions and classes for common browser automation tasks, inspired by Puppeteer/Pyppeteer.

Decision: Integrated Approach

Rather than creating a separate trio-puppeteer package, the utilities are integrated directly into the main trio_cdp package as a util module. This approach was chosen because:

  1. Lightweight: The utilities are thin wrappers around CDP commands
  2. No External Dependencies: Everything uses native CDP, no JavaScript injection
  3. Tight Integration: Direct access to session and connection objects
  4. Simplicity: Users don't need to install/manage a separate package

Implementation

New Module: trio_cdp/util.py

Contains three main classes and utility functions:

1. Keyboard Class

Provides keyboard input simulation:

  • down(key, text=None) - Press key down
  • up(key) - Release key
  • press(key, delay=0) - Complete key press (down + up)
  • type(text, delay=0) - Type a string character by character

Example:

keyboard = Keyboard(session)
await keyboard.type("Hello, World!")
await keyboard.press("Enter")
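As a rough illustration of what typing a string character by character can look like at the protocol level, the sketch below builds the parameter dictionaries for CDP's Input.dispatchKeyEvent command. The helper name key_events_for_text is hypothetical and is not part of trio_cdp/util.py; the real Keyboard class may use different fields (for example key codes or modifiers).

```python
from typing import Dict, List


def key_events_for_text(text: str) -> List[Dict[str, str]]:
    """Build Input.dispatchKeyEvent parameter dicts for typing `text`.

    Hypothetical helper for illustration; not the module's actual code.
    """
    events: List[Dict[str, str]] = []
    for char in text:
        # A keyDown event that carries `text` inserts the character
        # into the currently focused element.
        events.append({"type": "keyDown", "key": char, "text": char})
        # The matching keyUp event releases the key.
        events.append({"type": "keyUp", "key": char})
    return events


events = key_events_for_text("Hi")
```

Each dictionary corresponds to one dispatched event, so a per-character delay would naturally sit between consecutive down/up pairs.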

2. Mouse Class

Provides mouse action simulation:

  • move(x, y, steps=1) - Move mouse with optional smooth interpolation
  • click(x, y, button='left', click_count=1, delay=0) - Click at position
  • down(button='left', click_count=1) - Mouse button down
  • up(button='left', click_count=1) - Mouse button up

Example:

mouse = Mouse(session)
await mouse.move(100, 200, steps=10)  # Smooth movement
await mouse.click(100, 200)
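The steps parameter suggests linear interpolation between the current pointer position and the destination, which is how Puppeteer implements smooth movement. A minimal sketch of that arithmetic follows; interpolate_move is a hypothetical name, and the actual Mouse.move implementation may differ.

```python
from typing import List, Tuple


def interpolate_move(
    start: Tuple[float, float], end: Tuple[float, float], steps: int = 1
) -> List[Tuple[float, float]]:
    """Return the intermediate points for a move from `start` to `end`.

    Hypothetical helper for illustration; the last point is always `end`.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1) = start, end
    return [
        (x0 + (x1 - x0) * i / steps, y0 + (y1 - y0) * i / steps)
        for i in range(1, steps + 1)
    ]


points = interpolate_move((0, 0), (100, 200), steps=4)
# points == [(25.0, 50.0), (50.0, 100.0), (75.0, 150.0), (100.0, 200.0)]
```

Each point would then be sent as one Input.dispatchMouseEvent call with type mouseMoved. With steps=1 the list contains only the destination, so the pointer jumps there directly.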

3. ElementHandle Class

Represents a handle to a DOM element with convenient interaction methods:

  • click(button='left', click_count=1, delay=0) - Click the element
  • type(text, delay=0) - Focus and type into element
  • get_attribute(name) - Get HTML attribute value
  • get_property(name) - Get JavaScript property value
  • get_text_content() - Extract text content

Example:

input_field = await query_selector(session, 'input[name="email"]')
if input_field:
    await input_field.type('user@example.com')
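Clicking an element through native CDP input events needs a page coordinate. One common way to get it, assumed here rather than confirmed for this module, is to take the content quad returned by DOM.getBoxModel (eight numbers: x1, y1, ..., x4, y4) and click its center. The helper name quad_center is hypothetical.

```python
from typing import Sequence, Tuple


def quad_center(quad: Sequence[float]) -> Tuple[float, float]:
    """Return the center point of a CDP quad (x1, y1, x2, y2, x3, y3, x4, y4).

    Hypothetical helper for illustration; not the module's actual code.
    """
    if len(quad) != 8:
        raise ValueError("a quad has 8 numbers: x1, y1, ..., x4, y4")
    xs = quad[0::2]  # x coordinates of the four corners
    ys = quad[1::2]  # y coordinates of the four corners
    return (sum(xs) / 4, sum(ys) / 4)


# A 100x40 box whose top-left corner is at (10, 20).
center = quad_center([10, 20, 110, 20, 110, 60, 10, 60])
# center == (60.0, 40.0)
```

The resulting point could be passed to a call such as mouse.click(x, y).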

Element Selection Functions

  • query_selector(session, selector, node_id=None) - Find first matching element
  • query_selector_all(session, selector, node_id=None) - Find all matching elements
  • wait_for_selector(session, selector, timeout=30, visible=False) - Wait for element

Example:

# Find and interact with elements
button = await query_selector(session, 'button.submit')
if button:
    await button.click()

# Wait for dynamic content
result = await wait_for_selector(session, '.result', timeout=10, visible=True)
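Waiting for an element is typically a polling loop with a deadline. The sketch below shows that pattern in isolation. It assumes that a timeout yields None rather than an exception, which the module may or may not do, and the name poll_until and its probe, sleep, and interval parameters are hypothetical. The sleep function is passed in so the sketch does not depend on a specific async library.

```python
import time
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def poll_until(
    probe: Callable[[], Awaitable[Optional[T]]],
    sleep: Callable[[float], Awaitable[None]],
    timeout: float = 30,
    interval: float = 0.1,
) -> Optional[T]:
    """Call `probe` repeatedly until it returns a non-None value.

    Returns None once `timeout` seconds have passed without a result.
    Hypothetical helper for illustration; not the module's actual code.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = await probe()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            return None
        # Yield to the event loop before probing again.
        await sleep(interval)
```

Under Trio, trio.sleep can be passed as sleep, and probe would wrap a call such as query_selector(session, selector). A Trio-specific version could instead use trio.move_on_after to enforce the deadline.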

Documentation

Added Files

  1. docs/utilities.rst - Comprehensive documentation for all utilities
  2. examples/form_interaction.py - Example showing form interaction
  3. examples/keyboard_mouse.py - Example demonstrating keyboard/mouse usage
  4. tests/test_util.py - Unit tests for utility functions
  5. validate_utilities.py - Validation script to verify module structure

Updated Files

  1. README.md - Added utilities section with examples
  2. docs/index.rst - Added utilities to documentation table of contents
  3. trio_cdp/__init__.py - Export util module

Key Design Principles

  1. Pure CDP: No JavaScript injection, all interactions use native CDP commands
  2. Async-First: Fully compatible with Trio's async/await patterns
  3. Lightweight: Minimal abstractions, close to underlying CDP
  4. Type-Safe: Complete type hints for IDE support
  5. Composable: Small, focused utilities that work well together
  6. Optional: Core CDP functionality remains available; utilities are opt-in

Benefits

For Users

  • Intuitive API: Familiar patterns for anyone coming from Puppeteer
  • Less Boilerplate: Common tasks simplified with high-level methods
  • Type Safety: Full IDE support with autocomplete and type checking
  • Pure Python: No JavaScript knowledge required

For the Project

  • Maintains Philosophy: Stays true to lightweight, CDP-focused approach
  • No Breaking Changes: Completely additive, existing code unaffected
  • Extensible: Users can easily add custom utilities following same patterns
  • Well-Documented: Comprehensive docs and examples

Technical Details

Generator Fix

Fixed generator/generate.py to handle typing.Optional type hints. The missing handling had prevented regeneration of the CDP bindings on newer Python versions.
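The exact change to the generator is not shown here. As a hedged illustration of the general technique, the sketch below recognizes Optional[X] with the standard typing.get_origin and typing.get_args helpers and returns the wrapped type. The function name unwrap_optional is hypothetical and does not come from generator/generate.py.

```python
from typing import Any, Optional, Union, get_args, get_origin


def unwrap_optional(hint: Any) -> Any:
    """Return X for Optional[X]; return any other hint unchanged.

    Hypothetical helper for illustration; not the generator's actual code.
    """
    if get_origin(hint) is Union:
        args = get_args(hint)
        non_none = [arg for arg in args if arg is not type(None)]
        # Optional[X] is Union[X, None]: exactly two members, one of them None.
        if len(args) == 2 and len(non_none) == 1:
            return non_none[0]
    return hint


inner = unwrap_optional(Optional[int])
# inner is int
```

Hints that are not Optional, including wider unions such as Union[int, str], pass through unchanged.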

CDP Bindings Regenerated

Regenerated all CDP binding code to be compatible with chrome-devtools-protocol==0.4.0, resolving import errors with the generated code.

Testing & Validation

  1. Unit Tests: Comprehensive test suite in tests/test_util.py
  2. Validation Script: validate_utilities.py verifies all classes and methods exist
  3. Code Quality: Passed CodeQL security scan with 0 alerts
  4. Examples: Two working examples demonstrate real-world usage

Usage Example

Here's a complete example showing the utilities in action:

import trio
from trio_cdp import open_cdp, page, target
from trio_cdp.util import wait_for_selector, Keyboard

async def automate_form(cdp_url):
    async with open_cdp(cdp_url) as conn:
        # Get a target (this assumes the first target is the page to automate)
        targets = await target.get_targets()
        target_id = targets[0].target_id

        async with conn.open_session(target_id) as session:
            # Navigate
            await page.enable()
            await page.navigate('https://example.com/form')

            # Wait for the name field, then fill it in
            name_field = await wait_for_selector(session, 'input[name="name"]', timeout=10)
            if name_field:
                await name_field.type('John Doe')

            # Use keyboard for submission
            keyboard = Keyboard(session)
            await keyboard.press('Enter')

# Run with trio.run(automate_form, cdp_url), where cdp_url is the browser's CDP WebSocket URL.

Future Enhancements

Potential additions that maintain the same design philosophy:

  1. Page utilities: Screenshot helpers, PDF generation utilities
  2. Network utilities: Request interception helpers, mock response utilities
  3. Cookie utilities: Easy cookie management
  4. Dialog utilities: Alert/prompt/confirm handlers
  5. File upload: File chooser utilities

Each would follow the same pattern: lightweight wrappers around CDP commands with convenient async interfaces.

Conclusion

This implementation extends trio-chrome-devtools-protocol with higher-level utilities while keeping the library lightweight, pure-CDP, and Trio-native. The utilities give common automation tasks a more intuitive interface without giving up the power and flexibility of the underlying protocol.