How Astra Crawls Your Application
Last updated: October 23, 2025
Before Astra begins scanning for vulnerabilities, our engine first performs a crawl of your application.
This step discovers reachable pages, API endpoints, and parameters so that the subsequent security tests cover as much of your application as possible.
What Crawling Does
Crawling is the process of automatically exploring your application much like a user or search engine would.
Astra’s crawler:
Follows internal links and redirects
Executes JavaScript to discover client-side navigation paths
Identifies and records forms, input fields, and parameters
Extracts API calls triggered from within web pages
Builds a sitemap representing all discovered routes and endpoints
The result is a complete, structured view of your application that guides our vulnerability testing phase.
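To make the link and form discovery steps above concrete, here is a minimal sketch using only the Python standard library. It is not Astra's implementation: the class name LinkAndFormCollector and the example.com-style URLs are assumptions for illustration, and it covers only static HTML, not JavaScript execution or API call extraction.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkAndFormCollector(HTMLParser):
    """Hypothetical helper: collects internal links and form fields from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()        # absolute URLs on the same host as base_url
        self.forms = []           # one dict per <form>: action, method, field names
        self._current_form = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            url = urljoin(self.base_url, attrs["href"])
            # Keep internal links only: same host as the page being crawled.
            if urlparse(url).netloc == urlparse(self.base_url).netloc:
                self.links.add(url)
        elif tag == "form":
            self._current_form = {
                "action": urljoin(self.base_url, attrs.get("action") or ""),
                "method": (attrs.get("method") or "GET").upper(),
                "fields": [],
            }
            self.forms.append(self._current_form)
        elif tag in ("input", "select", "textarea") and self._current_form is not None:
            # Record named fields; these become candidate parameters for testing.
            if attrs.get("name"):
                self._current_form["fields"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._current_form = None
```

Feeding a page to the collector with collector.feed(html_text) fills collector.links and collector.forms, which a crawler could then queue for further exploration.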
How Crawling Works
Astra uses a combination of techniques to ensure deep and accurate exploration:
1. Static and Dynamic Analysis
Our crawler parses both server responses and in-browser behavior. This allows it to detect routes and APIs exposed via JavaScript frameworks, dynamic rendering, or AJAX calls.
2. Crawl Depth
To avoid infinite loops or excessive requests, the crawler limits how deeply it follows nested links.
By default, it explores pages up to a defined crawl depth, balancing coverage and performance.
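A depth limit of this kind is commonly implemented as a breadth-first traversal that stops expanding links once a page sits at the maximum depth. The sketch below illustrates that general idea under stated assumptions: the function name crawl, the get_links callback, and the max_depth parameter are hypothetical, and the actual depth value Astra uses is not specified here.

```python
from collections import deque


def crawl(start, get_links, max_depth):
    """Breadth-first crawl that never follows links from pages at max_depth.

    get_links is a caller-supplied function returning the URLs linked from a page
    (hypothetical; a real crawler would fetch and parse the page here).
    """
    seen = {start}                 # prevents revisiting pages, so link cycles terminate
    queue = deque([(start, 0)])    # (url, depth) pairs
    visited_order = []
    while queue:
        url, depth = queue.popleft()
        visited_order.append(url)
        if depth == max_depth:
            continue               # page is recorded, but its links are not followed
        for linked in get_links(url):
            if linked not in seen:
                seen.add(linked)
                queue.append((linked, depth + 1))
    return visited_order
```

The seen set handles loops between pages, while max_depth bounds how far the crawl drifts from the start page; the two checks together keep the number of requests finite.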
3. Sitemap and Specification Imports
If your site provides a sitemap.xml, Astra automatically parses it to include listed URLs as part of the crawl.
You can also import an OpenAPI (Swagger) specification to directly populate API endpoints for scanning, ensuring accurate coverage even for non-public or hidden APIs.
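As an illustration of what these two imports provide, the sketch below reads URLs from a sitemap.xml document and lists method and path pairs from an OpenAPI specification that has already been loaded into a dictionary. The function names urls_from_sitemap and endpoints_from_openapi are assumptions for this example and do not describe Astra's internal parser; sitemap index files and OpenAPI server URLs are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Operation keys allowed on an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def urls_from_sitemap(xml_text):
    """Return the <loc> values of every <url> entry in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def endpoints_from_openapi(spec):
    """Return (METHOD, path) pairs from an OpenAPI spec loaded as a dict."""
    endpoints = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Skip non-operation keys such as "parameters" or "summary".
            if key.lower() in HTTP_METHODS:
                endpoints.append((key.upper(), path))
    return endpoints
```

Because the specification lists endpoints explicitly, routes that no page links to still end up in the scan scope.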
4. Form Discovery and Fuzzing
When the crawler encounters forms or query parameters, it performs parameter fuzzing to uncover dynamic endpoints and hidden parameters.
This helps reveal endpoints that may not be linked directly in your application but are still accessible.
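One simple form of parameter fuzzing is to append candidate parameter names to a known URL and observe whether the response changes. The sketch below only generates the candidate URLs and sends no requests. The function name candidate_urls, the wordlist, and the probe value are all hypothetical, and Astra's actual fuzzing strategy may differ.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def candidate_urls(url, wordlist, probe_value="1"):
    """Yield variants of url, each with one extra query parameter from wordlist.

    Parameters already present in the URL are skipped.
    """
    parts = urlparse(url)
    existing = parse_qsl(parts.query, keep_blank_values=True)
    known = {name for name, _ in existing}
    for name in wordlist:
        if name in known:
            continue
        query = urlencode(existing + [(name, probe_value)])
        yield urlunparse(parts._replace(query=query))
```

A crawler could request each candidate and compare status code or body length against the baseline response to decide whether a parameter is actually honored.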
How the Sitemap Is Built
Each discovered URL or API endpoint is normalized and categorized by:
HTTP method (GET, POST, PUT, etc.)
Parameter patterns
Authentication context (if applicable)
This structure ensures that the scan phase targets unique and meaningful routes, not duplicate or redundant paths.
The resulting sitemap provides a clear and comprehensive view of your application’s attack surface.
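The categorization described above can be pictured as computing a deduplication key for each request. This is a minimal sketch, assuming the key consists of the HTTP method, host, path, sorted parameter names, and an authentication label; the function name route_key and the "anonymous" default are hypothetical.

```python
from urllib.parse import parse_qsl, urlparse


def route_key(method, url, auth_context="anonymous"):
    """Build a key that is equal for requests differing only in parameter values or order."""
    parts = urlparse(url)
    # Parameter pattern: the set of names, ignoring values and ordering.
    param_names = tuple(sorted({name for name, _ in parse_qsl(parts.query, keep_blank_values=True)}))
    return (
        method.upper(),
        parts.netloc.lower(),   # host names are case-insensitive
        parts.path or "/",
        param_names,
        auth_context,
    )
```

Storing discovered requests in a dictionary keyed by route_key keeps one entry per unique route, while the same route seen under a different authentication context remains a separate entry.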
Special Cases & FAQs
What happens with URLs containing “#” fragments?
Single-page applications (SPAs) often use # fragments for client-side routing.
Astra crawls these fragments and inspects all API calls and network activity that occur within them, but these fragments are not shown as separate entries in the sitemap.
Instead, only the unique API calls or requests triggered from those fragments are listed.
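The effect can be illustrated with a small sketch. It assumes the crawler has recorded pairs of the page URL it visited and a request that page triggered; the function name summarize and the observation format are hypothetical. Fragment routes collapse to their underlying document URL, and only the distinct requests remain as entries.

```python
from urllib.parse import urldefrag


def summarize(observations):
    """Collapse fragment routes and list each observed request once.

    observations: iterable of (page_url, (method, request_url)) pairs.
    Returns (documents, requests), both in first-seen order.
    """
    documents, requests = [], []
    for page_url, request in observations:
        base, _fragment = urldefrag(page_url)   # drop everything after '#'
        if base not in documents:
            documents.append(base)
        if request not in requests:
            requests.append(request)
    return documents, requests
```

For example, visiting https://app.example/#/orders and https://app.example/#/profile yields a single document URL, https://app.example/, plus whichever API requests those views issued.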
How does Astra handle dynamic IDs or parameters?
When multiple URLs differ only by numeric or unique IDs (e.g., /user?id=1, /user?id=2), Astra groups them into a single route pattern. This avoids redundant testing and keeps the sitemap concise and meaningful.
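A rough sketch of this kind of grouping replaces ID-like values with a placeholder so that equivalent URLs map to the same pattern. The rule used here (purely numeric values or UUIDs count as IDs), the {id} placeholder, and the function name route_pattern are assumptions for illustration; Astra's own detection rules are not documented in this article.

```python
import re
from urllib.parse import parse_qsl, urlparse

# Assumed definition of "ID-like": all digits, or a canonical UUID.
ID_LIKE = re.compile(
    r"^(\d+|[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})$"
)


def route_pattern(url):
    """Replace ID-like path segments and query values with the placeholder {id}."""
    parts = urlparse(url)
    segments = ["{id}" if ID_LIKE.match(s) else s for s in parts.path.split("/")]
    params = [
        f"{name}={{id}}" if ID_LIKE.match(value) else f"{name}={value}"
        for name, value in parse_qsl(parts.query)
    ]
    pattern = "/".join(segments)
    return pattern + ("?" + "&".join(params) if params else "")
```

With this rule, /user?id=1 and /user?id=2 both become /user?id={id}, so the route is tested once rather than once per ID.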