Reliability and SLA

Service Level Commitments

Service Level Agreement

Sistava commits to 99.95% monthly availability for the webapp and API, which allows at most about 21.6 minutes of downtime in a 30-day month. Background processing targets 99.9% availability, the marketing site targets 99.99%, and customer file storage targets eleven nines (99.999999999%) of durability through redundant object storage. Internally we hold ourselves to a Service Level Objective stricter than the public SLA, so issues surface before they affect customers. Latency commitments on the API include p95 under 300 ms and p99 under 1 second.
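To make the arithmetic behind these percentages concrete, here is a minimal sketch that converts an availability target into a downtime budget. The function name and the 30-day month are assumptions chosen for illustration; the contractual measurement window may be defined differently.

```python
def downtime_budget_minutes(availability_pct: float, days: int = 30) -> float:
    """Minutes of downtime allowed over `days` at the given availability target."""
    total_minutes = days * 24 * 60  # 43,200 minutes in a 30-day month
    return total_minutes * (100 - availability_pct) / 100


# Targets taken from the paragraph above; the labels are illustrative only.
targets = [
    ("webapp and API", 99.95),
    ("background processing", 99.9),
    ("marketing site", 99.99),
]

for name, pct in targets:
    print(f"{name}: {downtime_budget_minutes(pct):.2f} min per 30-day month")
```

Under these assumptions the budgets come to about 21.6, 43.2, and 4.32 minutes respectively.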

Monitoring and Alerting

The platform is instrumented with continuous monitoring across infrastructure, application, and business metrics. Critical incidents page our on-call rotation in real time through a dedicated alert channel. Our target for mean time to detect is under 5 minutes, and our target for mean time to resolve a P1 incident is under 60 minutes. A public status page at status.sista.ai shows current platform health and a history of incidents. Major incidents trigger customer communication through both the status page and direct email.
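As an illustration of how detection and resolution targets like these can be tracked, the sketch below computes both means from a list of incident records. The `Incident` record, the sample timestamps, and the choice to measure resolution time from the moment of detection are all assumptions made for this example, not a description of Sistava's internal tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Targets quoted in the text above.
MTTD_TARGET = timedelta(minutes=5)
MTTR_TARGET = timedelta(minutes=60)


@dataclass
class Incident:  # hypothetical record shape
    started: datetime
    detected: datetime
    resolved: datetime


def mean_delta(deltas) -> timedelta:
    """Arithmetic mean of a non-empty collection of timedeltas."""
    deltas = list(deltas)
    if not deltas:
        raise ValueError("need at least one incident")
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents) -> timedelta:
    """Mean time from incident start to detection."""
    return mean_delta(i.detected - i.started for i in incidents)


def mttr(incidents) -> timedelta:
    """Mean time from detection to resolution (assumed definition)."""
    return mean_delta(i.resolved - i.detected for i in incidents)


# Made-up sample data for demonstration only.
t0 = datetime(2024, 1, 1, 12, 0)
sample = [
    Incident(t0, t0 + timedelta(minutes=3), t0 + timedelta(minutes=43)),
    Incident(t0, t0 + timedelta(minutes=5), t0 + timedelta(minutes=55)),
]

print("MTTD within target:", mttd(sample) < MTTD_TARGET)
print("MTTR within target:", mttr(sample) < MTTR_TARGET)
```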

Backups and Disaster Recovery

Customer data is backed up daily, and backups are retained for 14 days in a separate geographic failure domain. Continuous transaction-log archiving keeps our recovery point objective under 5 minutes for the primary database. Our recovery time objective for a full restore is under 4 hours, validated through a documented procedure. Deployments are zero-downtime by default. Planned maintenance, when required, is announced at least 7 days in advance through the status page.
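The recovery point and retention figures above can be expressed as two simple checks. This is a sketch under stated assumptions: the function names and inputs are hypothetical, and it presumes the timestamp of the most recently archived transaction-log segment and the list of backup timestamps are available from whatever backup tooling is in use.

```python
from datetime import datetime, timedelta

RPO = timedelta(minutes=5)      # recovery point objective from the text
RETENTION = timedelta(days=14)  # backup retention window from the text


def rpo_met(last_archived_at: datetime, now: datetime) -> bool:
    """True if the newest archived log segment is less than RPO old."""
    return now - last_archived_at < RPO


def expired_backups(backup_times, now: datetime):
    """Return the backup timestamps that are older than the retention window."""
    return [t for t in backup_times if now - t > RETENTION]


# Made-up timestamps for demonstration only.
now = datetime(2024, 1, 15, 12, 0)
print(rpo_met(now - timedelta(minutes=3), now))
print(expired_backups([now - timedelta(days=15), now - timedelta(days=1)], now))
```

A check like `rpo_met` could feed an alert when archiving falls behind, while `expired_backups` identifies candidates for pruning.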

Incident Response

Every incident is classified by severity, communicated through the status page, and followed by a post-mortem when the impact is material. We track incident frequency and recurrence as key reliability metrics and feed what we learn back into platform hardening. Service credits for SLA breaches are available to enterprise customers per contract. Contact security@sista.ai to discuss specific reliability requirements.

What this means for customers

Public 99.95% uptime SLA for the webapp and API
Eleven nines of durability for customer file storage
Recovery point under 5 minutes, recovery time under 4 hours
Real-time monitoring with on-call rotation for critical incidents
Public status page and incident communication