Reliability and SLA
Service Level Commitments
Service Level Agreement
Sistava commits to 99.95% monthly availability for the webapp and API, which allows roughly 21.6 minutes of downtime in a 30-day month. Background work targets 99.9%, the marketing site targets 99.99%, and customer file storage targets eleven nines (99.999999999%) of durability through redundant object storage. Internally we steer to a Service Level Objective that is stricter than the public SLA, so issues surface before they affect customers. Latency commitments on the API include p95 under 300 ms and p99 under 1 second.
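The downtime figure follows directly from the availability percentage. The sketch below is illustrative only: it assumes a 30-day month, and the function name `downtime_budget_minutes` is hypothetical rather than part of any Sistava tooling.

```python
def downtime_budget_minutes(availability_pct: float, days: int = 30) -> float:
    """Allowed downtime in minutes for an availability target over `days` days."""
    total_minutes = days * 24 * 60  # 43,200 minutes in a 30-day month
    return total_minutes * (1 - availability_pct / 100)


# Availability targets quoted in the text above.
targets = [
    ("webapp and API", 99.95),
    ("background work", 99.9),
    ("marketing site", 99.99),
]

for name, pct in targets:
    print(f"{name}: {downtime_budget_minutes(pct):.2f} min per 30-day month")
```

Under these assumptions the three targets allow about 21.6, 43.2, and 4.3 minutes of downtime per month. A 31-day month gives slightly larger budgets.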
Monitoring and Alerting
The platform is instrumented with continuous monitoring across infrastructure, application, and business metrics. Critical incidents page our on-call rotation in real time through a dedicated alert channel. Our target mean time to detect is under 5 minutes, and our target mean time to resolve a P1 incident is under 60 minutes. A public status page at status.sista.ai shows current platform health and a history of incidents. Major incidents trigger customer communication through both the status page and direct email.
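As an illustration of how the detection and resolution targets could be checked against incident records, here is a minimal sketch. The `Incident` record, its field names, and the sample timestamps are all hypothetical, and it assumes resolution time is measured from the start of the incident; some teams measure it from detection instead.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean

MTTD_TARGET_MIN = 5      # target from the text: detect in under 5 minutes
MTTR_P1_TARGET_MIN = 60  # target from the text: resolve a P1 in under 60 minutes


@dataclass
class Incident:  # hypothetical record shape, for illustration only
    severity: str
    started: datetime
    detected: datetime
    resolved: datetime


def mttd_minutes(incidents):
    """Mean time from incident start to detection, in minutes."""
    return mean([(i.detected - i.started).total_seconds() for i in incidents]) / 60


def mttr_minutes(incidents, severity="P1"):
    """Mean time from incident start to resolution for one severity, in minutes."""
    matching = [i for i in incidents if i.severity == severity]
    return mean([(i.resolved - i.started).total_seconds() for i in matching]) / 60


# Made-up sample data, not real incidents.
incidents = [
    Incident("P1", datetime(2024, 1, 10, 10, 0), datetime(2024, 1, 10, 10, 3),
             datetime(2024, 1, 10, 10, 45)),
    Incident("P1", datetime(2024, 2, 2, 14, 0), datetime(2024, 2, 2, 14, 6),
             datetime(2024, 2, 2, 14, 50)),
]

print("MTTD within target:", mttd_minutes(incidents) < MTTD_TARGET_MIN)
print("P1 MTTR within target:", mttr_minutes(incidents) < MTTR_P1_TARGET_MIN)
```

With this sample data the mean time to detect is 4.5 minutes and the mean time to resolve is 47.5 minutes, so both checks pass.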
Backups and Disaster Recovery
Customer data is backed up daily and retained for 14 days in a separate geographic failure domain. Continuous transaction-log archiving keeps our recovery point objective under 5 minutes for the primary database. Our recovery time objective for a full restore is under 4 hours, validated through a documented procedure. Deployments are zero-downtime by default. Planned maintenance, when required, is announced at least 7 days in advance through the status page.
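To make the recovery point and retention figures concrete, the following sketch shows one way to reason about them. It is an assumption-laden illustration, not a description of Sistava's actual tooling: the function names and sample timestamps are hypothetical, and it treats the data-loss exposure as the time since the last archived transaction log.

```python
from datetime import datetime, timedelta

RPO = timedelta(minutes=5)      # recovery point objective from the text
RETENTION = timedelta(days=14)  # backup retention period from the text


def within_rpo(last_archived: datetime, now: datetime) -> bool:
    """True if the time since the last archived log segment is under the RPO."""
    return now - last_archived < RPO


def expired_backups(backup_times, now: datetime):
    """Return the backups that are older than the retention window."""
    return [t for t in backup_times if now - t > RETENTION]


# Made-up sample timestamps for illustration.
now = datetime(2024, 3, 15, 12, 0)
backups = [
    datetime(2024, 3, 14, 12, 0),  # 1 day old: kept
    datetime(2024, 3, 1, 12, 0),   # exactly 14 days old: kept
    datetime(2024, 2, 28, 12, 0),  # 16 days old: expired
]

print("RPO met:", within_rpo(datetime(2024, 3, 15, 11, 57), now))
print("Expired backups:", expired_backups(backups, now))
```

If the primary database failed at `now`, the first check says at most 3 minutes of writes would be at risk, which is inside the 5-minute objective.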
Incident Response
Every incident is classified by severity, communicated through the status page, and followed by a post-mortem when the impact is material. We track incident frequency and recurrence as key reliability metrics and feed what we learn back into platform hardening. Service credits for SLA breaches are available to enterprise customers per contract. Reach out at security@sista.ai to discuss specific reliability requirements.
What this means for customers
- Public 99.95% uptime SLA for the webapp and API
- Eleven nines of durability for customer file storage
- Recovery point under 5 minutes, recovery time under 4 hours
- Real-time monitoring with on-call rotation for critical incidents
- Public status page and incident communication