Overview: Service Unavailability Escalation Matrix

Last updated: June 5, 2026

Summary

Astra maintains a clearly defined escalation path with documented roles, responsibilities, and contact information to ensure a timely and effective response if the platform or any of its services becomes unavailable. This process is designed to minimize downtime, ensure accountability, and provide transparent communication during incidents.

Who Should Read This

Organization Admins and Security Leads: To understand Astra's reliability protocols and how incidents are managed and communicated.
Compliance Officers: To verify that Astra adheres to structured incident management and escalation frameworks as required by security standards.

The Escalation Framework

The framework is divided into three distinct levels based on the severity and duration of the service disruption:

Level 1: Automated Detection & Initial Triage
- Trigger: Automated monitoring and alerting systems detect service degradation or unavailability. Current service status is publicly viewable at https://status.getastra.com/en/.
- Responsibility: The 24×7 on-call Engineering team acknowledges alerts, performs initial diagnosis, and applies immediate remediation or rollbacks.

Level 2: Engineering Escalation
- Trigger: The issue cannot be resolved within the defined response time or impacts multiple customers and core functionality.
- Responsibility: Senior Engineers or the Engineering Lead on-call conduct deep technical investigations and coordinate cross-service fixes.

Level 3: Incident Management & Leadership
- Trigger: Prolonged outages, risks to customer-facing SLAs, or impacts on security, data integrity, and compliance.
- Responsibility: An Incident Commander and the CTO/Engineering Manager oversee overall incident coordination, external dependency escalation, and high-level decision-making regarding customer communications.
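
For teams that want to mirror these triggers in their own tooling, the three levels can be read as a small decision rule. The sketch below is illustrative only and is not Astra's implementation: the `Incident` fields, the `escalation_level` function, and both time thresholds are hypothetical, since this article does not publish the defined response time or what counts as a prolonged outage.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import IntEnum


class Level(IntEnum):
    AUTOMATED_TRIAGE = 1        # Level 1: on-call Engineering team
    ENGINEERING_ESCALATION = 2  # Level 2: Senior Engineers / Engineering Lead
    INCIDENT_MANAGEMENT = 3     # Level 3: Incident Commander and leadership


@dataclass
class Incident:
    elapsed: timedelta                    # time since the alert fired
    customers_affected: int = 1
    core_functionality_hit: bool = False
    sla_at_risk: bool = False
    security_or_data_impact: bool = False


# Hypothetical thresholds, chosen only for illustration.
RESPONSE_TIME_LIMIT = timedelta(minutes=30)
PROLONGED_OUTAGE = timedelta(hours=2)


def escalation_level(incident: Incident) -> Level:
    """Map an incident to the highest escalation level whose trigger it meets."""
    # Level 3: prolonged outage, SLA risk, or security/data/compliance impact.
    if (
        incident.elapsed >= PROLONGED_OUTAGE
        or incident.sla_at_risk
        or incident.security_or_data_impact
    ):
        return Level.INCIDENT_MANAGEMENT
    # Level 2: response time exceeded, or multiple customers AND core functionality hit.
    if incident.elapsed >= RESPONSE_TIME_LIMIT or (
        incident.customers_affected > 1 and incident.core_functionality_hit
    ):
        return Level.ENGINEERING_ESCALATION
    # Level 1: everything else stays with initial triage.
    return Level.AUTOMATED_TRIAGE
```

Checking the Level 3 conditions first means that a security or data-integrity impact escalates straight to incident management, even if the alert fired only minutes ago.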

Vendor & Third-Party Escalation

For incidents involving external dependencies (such as cloud providers or monitoring tools), the Incident Commander or Engineering Lead will:

- Open priority support tickets with the relevant vendors.
- Engage vendor on-call or premium support channels to track resolution timelines.

Communication Protocols

Internal: Contact details and on-call schedules are maintained in internal runbooks accessible to authorized personnel.
Customer-facing: Status updates are posted on the public status page. For high-severity incidents, Astra also notifies affected customers directly, at a cadence set by its incident severity guidelines.

Continuous Improvement

Astra periodically tests and refines this escalation process through incident simulations and real-world reviews. Every major incident undergoes a post-incident review with a documented root cause analysis (RCA) and corrective actions to prevent future occurrences.