Overview: How Astra Crawls Your Application

Last updated: June 3, 2026

Introduction

Astra’s Dynamic Application Security Testing (DAST) engine uses real browsers, such as Chromium and Firefox, to crawl and explore web applications. Before scanning for vulnerabilities, the engine first crawls your application. The crawl renders pages as a real user would, so that reachable pages, API endpoints, and parameters are discovered and the subsequent security tests have the widest possible coverage.

Who Should Read This

Security Engineers: To understand how Astra maps the complete attack surface of modern applications.
Developers: To learn how the crawler executes JavaScript and interacts with the live DOM to discover dynamic elements and background API calls.
DevOps Teams: To utilize automated crawling for asset discovery and to power Delta Scans within CI/CD pipelines.

Key Functions: The Step-by-Step Crawling Flow

Scan Initiation: Configured parameters, including target URL, authentication methods, and crawl scope, are loaded.
Real Browser Launch: A headless Chromium browser is launched to simulate a real user environment, enabling full JavaScript execution.
Authentication: If required, a recorded login flow is replayed to ensure the crawl starts from an authenticated session.
Initial Navigation: The crawler navigates to the primary target URL and waits for the page to fully render, including asynchronously loaded data.
Page Exploration: All links are extracted, resources are recorded, and API calls (XHR, fetch, WebSocket) are captured as they trigger.
User Interaction Simulation: Astra mimics a real user by filling forms, clicking buttons, and scrolling to expose hidden content and deeper endpoints.
Recursive Crawling: Each discovered page is processed the same way until the defined scope is exhausted or no new content is found.
Sitemap Construction: Discovered assets are normalized and added to an evolving Sitemap, which serves as the inventory for testing.
Session Monitoring: The crawler continuously monitors session validity and will re-authenticate if the session expires.
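
As an illustration of the Page Exploration step, the sketch below extracts links from an already rendered page and keeps only those inside the crawl scope. It is a minimal example using the Python standard library, not Astra's implementation: the names LinkExtractor and extract_in_scope_links are hypothetical, and scope is assumed to mean "same host as the target".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_in_scope_links(page_url, html, scope_host):
    """Return absolute, de-duplicated links whose host equals scope_host."""
    parser = LinkExtractor()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        # Assumption: "in scope" means http(s) on the target host.
        if parsed.scheme in ("http", "https") and parsed.hostname == scope_host:
            if absolute not in links:
                links.append(absolute)
    return links
```

A real-browser crawler would feed this kind of function the DOM after JavaScript has run, rather than the raw server response, which is why rendering comes first in the flow above.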

How Crawling Works

Astra uses a combination of techniques to ensure deep and accurate exploration of your application.

Static and Dynamic Analysis

The crawler parses both server responses and in-browser behavior. This allows it to detect routes and APIs exposed via JavaScript frameworks, dynamic rendering, or AJAX calls, giving coverage beyond what static HTML alone would reveal.

Crawl Depth

To avoid infinite loops or excessive requests, the crawler limits how deeply it follows nested links. It explores pages only up to a configured crawl depth, which balances coverage against performance.
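
The depth limit can be pictured as a breadth-first traversal that stops expanding links once a page sits at the maximum depth. The sketch below is illustrative only: crawl, get_links, max_depth, and the in-memory SITE map are hypothetical stand-ins for rendering a page and extracting its links.

```python
from collections import deque


def crawl(start, get_links, max_depth):
    """Breadth-first crawl that does not follow links beyond max_depth.

    Returns a dict mapping each visited URL to the depth where it was found.
    """
    seen = {start: 0}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        depth = seen[url]
        if depth >= max_depth:
            continue  # page is recorded, but its links are not expanded
        for link in get_links(url):
            if link not in seen:  # the visited set also breaks link cycles
                seen[link] = depth + 1
                queue.append(link)
    return seen


# Hypothetical site: "/a" links back to "/", forming a cycle.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/a/1", "/"],
    "/a/1": ["/a/1/x"],
    "/b": [],
}
found = crawl("/", lambda u: SITE.get(u, []), max_depth=2)
```

With max_depth=2, the page "/a/1" is recorded at depth 2 but "/a/1/x" is never reached, and the cycle back to "/" is ignored because that URL was already seen.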

Sitemap and Specification Imports

If your site provides a sitemap.xml, Astra automatically parses it and includes the listed URLs in the crawl. You can also import an OpenAPI (Swagger) specification to populate API endpoints for scanning directly, so that non-public or hidden APIs are covered even when nothing in the UI links to them.
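
To show what these two imports contribute, the sketch below reads URLs from a sitemap.xml document and lists method and path pairs from an OpenAPI document. It is an assumption-laden example rather than Astra's parser: the function names are hypothetical, the specification is assumed to be JSON (YAML would need a third-party library), and sitemap index files are not handled.

```python
import json
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
# Operation keys allowed on an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def urls_from_sitemap(xml_text):
    """Return the <loc> values of every <url> entry in a sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def endpoints_from_openapi(spec_text):
    """Return sorted (METHOD, path) pairs from a JSON OpenAPI document."""
    spec = json.loads(spec_text)
    endpoints = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Skip non-operation keys such as "parameters" or "summary".
            if key.lower() in HTTP_METHODS:
                endpoints.append((key.upper(), path))
    return sorted(endpoints)
```

Both functions return plain lists, so their output can seed the same queue of URLs and endpoints that the browser crawl fills.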

Form Discovery and Fuzzing

When the crawler encounters forms or query parameters, it performs parameter fuzzing to uncover dynamic endpoints and hidden parameters. This helps reveal endpoints that may not be linked directly in your application but are still accessible.
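
One simple form of parameter fuzzing is to request variants of a known URL, each with one extra candidate parameter, and compare the responses with the original. The sketch below only generates the variants. The wordlist, the probe value, and the name fuzz_candidates are hypothetical, and this is not a description of how Astra selects or evaluates candidates.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

CANDIDATE_PARAMS = ["debug", "admin", "format"]  # hypothetical wordlist


def fuzz_candidates(url, names=CANDIDATE_PARAMS, probe_value="1"):
    """Yield variants of url, each with one extra query parameter added."""
    parts = urlparse(url)
    existing = parse_qsl(parts.query, keep_blank_values=True)
    present = {name for name, _ in existing}
    for name in names:
        if name in present:
            continue  # the parameter is already known, nothing to discover
        query = urlencode(existing + [(name, probe_value)])
        yield urlunparse(parts._replace(query=query))
```

For example, list(fuzz_candidates("https://app.example/search?q=shoes")) produces three URLs, one per candidate name. A parameter would count as "hidden but live" only if its variant changes the response in a meaningful way.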

How the Sitemap Is Built

Each discovered URL or API endpoint is normalized and categorized by:

HTTP method (GET, POST, PUT, etc.)
Parameter patterns
Authentication context (if applicable)

This structure ensures the scan phase targets unique and meaningful routes, not duplicate or redundant paths, giving you a clear and comprehensive view of your application's attack surface.
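
The three categories above can be read as a deduplication key. The sketch below builds such a key and collapses requests that differ only in parameter values. SitemapEntry, to_entry, build_sitemap, and the auth context labels are hypothetical names chosen for illustration, and reducing "parameter patterns" to the sorted set of query parameter names is a simplifying assumption.

```python
from dataclasses import dataclass
from urllib.parse import parse_qsl, urlparse


@dataclass(frozen=True)
class SitemapEntry:
    method: str        # HTTP method, upper-cased
    path: str          # URL path without query string
    params: tuple      # sorted query parameter names, values dropped
    auth_context: str  # e.g. "anonymous" or a role label (hypothetical)


def to_entry(method, url, auth_context="anonymous"):
    """Normalize one observed request into a hashable sitemap entry."""
    parts = urlparse(url)
    names = sorted({n for n, _ in parse_qsl(parts.query, keep_blank_values=True)})
    return SitemapEntry(method.upper(), parts.path or "/", tuple(names), auth_context)


def build_sitemap(requests):
    """Deduplicate (method, url, auth_context) triples into unique entries."""
    return {to_entry(*request) for request in requests}
```

Because the key ignores parameter values, GET /user?id=1 and GET /user?id=2 become one entry, while the same path under a different method or authentication context stays separate.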

Available Actions

Automated Crawling (Web): You can trigger a standalone crawl to update your endpoint inventory without performing a full security test.
Get Sitemap: Download a .csv file of all URLs and API endpoints scanned by the engine via the Continuous Scans page.
Import Specifications: Manually import OpenAPI (Swagger) files to populate endpoints that might not be reachable through the UI.
Adjust Scan Speed: Control the rate of requests per second to optimize for application performance.
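
To make the effect of a requests-per-second setting concrete, the sketch below spaces calls at a fixed interval. It is a generic throttle, not Astra's scheduler: the Throttle class and its parameters are hypothetical, and the clock and sleep functions are injectable only so the behavior can be checked without real waiting.

```python
import time


class Throttle:
    """Spaces calls so at most rate_per_second requests start per second."""

    def __init__(self, rate_per_second, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / rate_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed, then reserve a slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
```

A crawler would call wait() immediately before each request. Lowering rate_per_second lengthens the interval and reduces load on the target application at the cost of a longer crawl.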

Best Practices

Manage Dynamic Parameters: Astra automatically groups URLs that differ only by unique IDs (e.g., /user?id=1 and /user?id=2) into a single route pattern to prevent redundant testing.
Handle SPA Fragments: Astra inspects the network activity triggered by "#" fragment routes, but it does not list those fragments as separate sitemap entries, which keeps results concise.
Separate Crawl and Scan: For large applications, schedule an Automated Crawl (e.g., 2:00 AM) followed by a Delta Scan (e.g., 4:00 AM) to ensure tests target the most recent changes.
Allowlist IPs: To avoid crawling interruptions, allowlist Astra's official IP ranges in your firewall, Cloudflare, or WAF settings.
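
The fragment handling described under Handle SPA Fragments can be illustrated by stripping the "#" part before a URL is recorded. This is a minimal sketch with a hypothetical function name, shown only to explain why several fragment routes appear as one sitemap entry.

```python
from urllib.parse import urldefrag


def sitemap_urls(visited):
    """Drop "#" fragments and return unique URLs in first-seen order."""
    unique = []
    for url in visited:
        base, _fragment = urldefrag(url)  # split off everything after "#"
        if base not in unique:
            unique.append(base)
    return unique
```

Under this scheme, https://app.example/app#/inbox and https://app.example/app#/settings are both recorded as https://app.example/app, while any API calls they trigger would still be captured as their own entries.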