Capturing a Clickable URL
The capture_url function clicks an element and intercepts network requests to retrieve the target URL.
Features
- Intercepts Dynamic Requests: Extracts URLs from click-triggered events.
- Supports Multiple Resource Types: Works with documents, images, APIs, and more.
- Customizable Timeout: Defines waiting time for URL capture.
Parameters
- clickable (ElementHandle): The element to be clicked.
- resource_type (ResourceType): Type of resource to capture (default: "document").
- timeout (Optional[int]): Time to wait (in ms) for the new page to open.
Returns
- URL | None: The captured URL, or None if no match was found.
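The contract described above (click, watch for a request of the chosen resource type, give up after the timeout and return None) can be illustrated with a small, self-contained sketch. This is not the SDK's implementation: FakePage, its on_request/remove_listener/emit methods, and capture_url_sketch are hypothetical names invented for the illustration, and treating a timeout of None as "wait indefinitely" is an assumption.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class FakePage:
    """Hypothetical stand-in for a browser page that emits request events."""

    def __init__(self) -> None:
        self._listeners = []

    def on_request(self, listener: Callable[[str, str], None]) -> None:
        self._listeners.append(listener)

    def remove_listener(self, listener: Callable[[str, str], None]) -> None:
        self._listeners.remove(listener)

    def emit(self, resource_type: str, url: str) -> None:
        # Notify every registered listener about a simulated request.
        for listener in list(self._listeners):
            listener(resource_type, url)


async def capture_url_sketch(
    page: FakePage,
    click: Callable[[], Awaitable[None]],
    resource_type: str = "document",
    timeout: Optional[int] = None,  # milliseconds; None waits indefinitely (assumption)
) -> Optional[str]:
    captured = asyncio.get_running_loop().create_future()

    def on_request(kind: str, url: str) -> None:
        # Keep only the first request whose type matches.
        if kind == resource_type and not captured.done():
            captured.set_result(url)

    # Register the listener before clicking so no request is missed.
    page.on_request(on_request)
    try:
        await click()
        seconds = None if timeout is None else timeout / 1000
        return await asyncio.wait_for(captured, timeout=seconds)
    except asyncio.TimeoutError:
        # No matching request arrived in time.
        return None
    finally:
        page.remove_listener(on_request)
```

The listener is attached before the click so that a request fired immediately by the click handler is still observed, and it is removed in the finally block whether or not a URL was captured.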
Usage
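from typing import Any

async def scrape(
    sdk: SDK, current_url: str, context: dict[str, Any], *args: Any, **kwargs: Any
) -> None:
    # Locate the element whose click triggers the navigation or download.
    button = await sdk.page.query_selector("button.download")
    # Click it and wait up to 10 seconds (10000 ms) for the target URL.
    download_url = await sdk.capture_url(button, timeout=10000)
    # download_url is None if no matching request was captured.
    data = {"url": download_url}
    await sdk.save_data(data)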
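Because capture_url can return None, and query_selector may find no element, a more defensive variant of the usage above might look like the following sketch. The helper name capture_and_save is hypothetical; it relies only on the sdk.page.query_selector, sdk.capture_url, and sdk.save_data calls shown in this section, and skipping the save when nothing was captured is a design assumption rather than documented SDK behavior.

```python
from typing import Any, Optional


async def capture_and_save(
    sdk: Any, selector: str, timeout: int = 10000
) -> Optional[Any]:
    """Hypothetical helper: capture the URL behind `selector` and save it."""
    element = await sdk.page.query_selector(selector)
    if element is None:
        # Nothing to click, so there is nothing to capture.
        return None

    url = await sdk.capture_url(element, timeout=timeout)
    if url is None:
        # No matching request was seen before the timeout elapsed.
        return None

    await sdk.save_data({"url": url})
    return url
```

Returning None in both failure cases lets the caller decide whether to retry, log, or move on without saving an empty record.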