Control + Continuity. One Operations Console.
Every operational concern in Aforo resolves to two inseparable halves. Control (governance, RBAC, SSO, audit trails, tenant isolation, encryption) is how you protect the platform from misuse. Continuity (uptime, incidents, event log, system health, dead-letter recovery) is how you keep it running. Most billing platforms ship one or the other. Aforo ships both.
The governance half. Audit trails, role-based access, tenant isolation, approval workflows, encryption at rest + in transit. The controls your CISO requires before signing the procurement contract.
The observability half. Uptime monitoring, incident management, event log, system health, dead-letter recovery. The dashboards your SRE team needs before the first 3am page.
Generic SaaS billing platforms force you to bolt on Datadog (uptime), PagerDuty (incidents), Splunk (event log), Auth0 (SSO + RBAC), and a custom audit-log schema. Aforo ships all of it natively, under one tenant boundary.
Public Status Page. Built In, Not Statuspage.io.
Auto-published from the same Uptime Monitor that pages your on-call. 12 services tracked. 90-day uptime bars. Active incidents + scheduled maintenance in one place. Customers subscribe by email or webhook. No third-party status-page vendor, no manual updates, no separate audit trail.
Customers see the same data your SRE team operates on. The status page is auto-published from the Uptime Monitor table, not maintained in a separate Statuspage.io account that drifts out of sync.
Enterprise-grade controls built into every transaction.
Security is not an add-on. Every pricing change, every billing event, and every configuration modification flows through the same governance framework: auditable, access-controlled, and tenant-isolated by default.
Immutable Audit Trails
Every modification to pricing rules, billing configurations, catalog entries, and subscription states is recorded in an append-only audit log. Each record captures the actor, timestamp, previous value, and new value. Required for SOC 2 compliance, internal financial accountability, and regulatory examinations.
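As a rough illustration of the record shape described above, an append-only entry could be modeled as follows. This is a minimal sketch, not Aforo's actual schema: the names `AuditRecord`, `AuditLog`, and `target` are hypothetical, and the in-memory list only stands in for real storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, List, Tuple


@dataclass(frozen=True)  # frozen: a record cannot be mutated after creation
class AuditRecord:
    actor: str
    timestamp: datetime
    target: str          # hypothetical: which setting was changed
    previous_value: Any
    new_value: Any


class AuditLog:
    """Hypothetical append-only log: it exposes no update or delete methods."""

    def __init__(self) -> None:
        self._records: List[AuditRecord] = []

    def append(self, actor: str, target: str, previous_value: Any, new_value: Any) -> AuditRecord:
        record = AuditRecord(actor, datetime.now(timezone.utc), target, previous_value, new_value)
        self._records.append(record)
        return record

    def records(self) -> Tuple[AuditRecord, ...]:
        # Return an immutable snapshot so callers cannot edit history.
        return tuple(self._records)


log = AuditLog()
entry = log.append("alice@example.com", "plan.pro.price_usd", 49, 59)
```

Attempting to assign to `entry.new_value` raises `dataclasses.FrozenInstanceError`, which is the property the append-only design relies on.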
Strict Access Control (SSO & RBAC)
SAML 2.0 and OIDC integration with your existing identity provider. Role-based access control determines exactly which team members can modify pricing, issue refunds, adjust subscriptions, or access financial reports. No shared credentials. No ambiguous permissions. Full attribution on every action.
Multi-Tenant Data Isolation
Complete data separation at the schema, cache, and event-stream level. Large enterprises operating multiple business units, brands, or subsidiaries manage each independently under a single administrative umbrella, without risk of cross-tenant data exposure.
Operational Observability
Real-time uptime monitoring across 12+ services, ClickHouse-backed event log with 15+ event types, SSE live streaming, and full-text search. System Health dashboard surfaces Kafka lag, Redis health, ClickHouse status, and PostgreSQL connection pool — all on one operations console.
Incident Lifecycle Management
Declare incidents manually or auto-create from uptime alerts. Structured timeline updates flow to the public status page. Postmortem editor with root cause + impact + remediation + prevention sections. Scheduled maintenance windows with customer notifications. MTTR captured per incident.
The checklist your IT team is looking for.
Every item below is production-verified, not a roadmap promise. Hand this page directly to your security review committee.
Five Surfaces. One Operations Console.
Aforo ships the full operational stack natively: no Datadog for uptime, no PagerDuty for incidents, no Splunk for event log, no separate Statuspage.io vendor. All five surfaces share the same tenant boundary, the same audit log, and the same on-call workflow.
Uptime Monitor
HTTP / TCP / DNS / Cert check types per service. 90-day uptime bars + daily summaries + SLA report generator. Auto-publishes to the public status page. ClickHouse-backed uptime_checks table with hourly + daily summary materialized views.
Incident Manager
Declare manually OR auto-create from uptime alerts (3 consecutive failures over 5 min). Structured timeline updates flow to the public status page. Postmortem editor with root cause + impact + remediation + prevention sections. Scheduled maintenance windows with customer notifications.
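The auto-create rule above can be sketched as a small predicate. This is one plausible reading of "3 consecutive failures over 5 min" (the three most recent checks all failed and span at most five minutes); the function name and the exact window semantics are assumptions, not Aforo's documented behavior.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Check = Tuple[datetime, bool]  # (checked_at, succeeded)


def should_open_incident(
    checks: List[Check],
    threshold: int = 3,
    window: timedelta = timedelta(minutes=5),
) -> bool:
    """True when the most recent `threshold` checks all failed within `window`."""
    if len(checks) < threshold:
        return False
    recent = sorted(checks, key=lambda c: c[0])[-threshold:]
    all_failed = all(not ok for _, ok in recent)
    within_window = recent[-1][0] - recent[0][0] <= window
    return all_failed and within_window


t0 = datetime(2026, 1, 1, 3, 0)
# Three failed checks at 03:00, 03:02, and 03:04.
failing = [(t0 + timedelta(minutes=2 * i), False) for i in range(3)]
# Same series, but the third check succeeds.
recovered = failing[:2] + [(t0 + timedelta(minutes=4), True)]
```

Here `should_open_incident(failing)` is true, while `should_open_incident(recovered)` is false because the failure streak was broken.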
Event Log
Every API, billing, auth, system, usage, and support event captured to ClickHouse platform_events table (MergeTree, 365-day TTL). Full-text search + structured filters + saved presets. Live SSE streaming with per-tenant concurrent-stream cap. Auto-dedupe via event_id idempotency.
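To make the idempotency idea concrete, here is a minimal sketch of deduplication keyed on `event_id`. It only illustrates the principle in application code; how Aforo or ClickHouse performs the deduplication internally is not described here, and `dedupe_events` is a hypothetical helper.

```python
from typing import Dict, Iterable, List


def dedupe_events(events: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first occurrence of each event_id and drop later repeats."""
    seen = set()
    unique = []
    for event in events:
        event_id = event["event_id"]
        if event_id in seen:
            continue  # a retry or redelivery of an event already recorded
        seen.add(event_id)
        unique.append(event)
    return unique


incoming = [
    {"event_id": "evt_1", "type": "invoice.delivered"},
    {"event_id": "evt_2", "type": "payment.failed"},
    {"event_id": "evt_1", "type": "invoice.delivered"},  # redelivered
]
stored = dedupe_events(incoming)
```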
System Health
Service map (topology), per-service resource gauges (CPU / memory / disk / connections), Apdex + p50/p95/p99 latency. Infrastructure health cards for Kafka (lag + consumer-group health), Redis (memory + key count), PostgreSQL (pool + slow queries), and ClickHouse (query queue + storage).
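For readers unfamiliar with the latency metrics named above, the sketch below computes an Apdex score and nearest-rank percentiles from raw samples. The Apdex formula is the standard one (satisfied at or below threshold T, tolerating up to 4T, counted at half weight); the 200 ms threshold and the sample data are made up for illustration, and the dashboard's own aggregation method may differ.

```python
import math
from typing import List


def apdex(latencies_ms: List[float], threshold_ms: float) -> float:
    """Standard Apdex: (satisfied + tolerating / 2) / total samples."""
    satisfied = sum(1 for x in latencies_ms if x <= threshold_ms)
    tolerating = sum(1 for x in latencies_ms if threshold_ms < x <= 4 * threshold_ms)
    return (satisfied + tolerating / 2) / len(latencies_ms)


def percentile(latencies_ms: List[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


samples = [80, 90, 120, 150, 200, 300, 450, 700, 1200, 2500]
score = apdex(samples, threshold_ms=200)  # 5 satisfied, 3 tolerating, 2 frustrated
p50 = percentile(samples, 50)
p95 = percentile(samples, 95)
```

With these samples the score is 0.65, p50 is 200 ms, and p95 is 2500 ms, which shows how a healthy median can hide a slow tail.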
Notifications
Per-customer webhook delivery with HMAC SHA-256 signing + retry-with-backoff. In-portal notification inbox for customer-facing events (invoice.delivered, payment.failed, subscription.cancelled, etc). Event definitions catalog. Delivery log feedback loop (F1+F2 closure 2026-05-09) so operators see per-invoice delivery state.
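A receiver typically verifies an HMAC SHA-256 signature by recomputing it over the raw request body and comparing in constant time. The sketch below uses only Python's standard `hmac` and `hashlib` modules. The hex encoding, the shared-secret handling, and the doubling backoff schedule are assumptions for illustration; the actual header name and retry policy are not specified here.

```python
import hashlib
import hmac
from typing import List


def sign_payload(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC SHA-256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received: str) -> bool:
    expected = sign_payload(secret, body)
    # compare_digest avoids leaking match position through timing.
    return hmac.compare_digest(expected, received)


def backoff_delays(attempts: int, base_seconds: float = 1.0, cap_seconds: float = 60.0) -> List[float]:
    """Hypothetical retry schedule: delay doubles each attempt, up to a cap."""
    return [min(cap_seconds, base_seconds * 2 ** n) for n in range(attempts)]


secret = b"whsec_example"  # placeholder secret, not a real credential
body = b'{"type": "payment.failed", "invoice_id": "inv_123"}'
signature = sign_payload(secret, body)
```

Verifying against the raw bytes, rather than a re-serialized JSON object, matters because any change in whitespace or key order would alter the digest.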
Plus Knowledge Base, Documentation Hub, API Playground, Community Center, Changelog, and Public API Status (covered on the Developer Console page). 11 operational features total, all wired into the same admin console.
From Provisioning to Continuous Assurance
Four layers of governance that operate from day one, with no additional configuration required after initial setup.
Provision with Identity
Connect your existing identity provider via SAML 2.0 or OIDC. Assign an RBAC role (Administrator, Billing Manager, Pricing Editor, or Read-Only Auditor) to each team member. Zero shared credentials. Every action is attributed to a named individual.
Enforce Financial Controls
Every pricing change, every refund issued, every subscription modification, and every invoice adjustment is recorded in an immutable audit log. Define approval workflows for high-value operations. Restrict refund authority to designated roles. Maintain a complete chain of custody over revenue-impacting decisions.
Isolate Tenant Data
Each tenant operates within a fully separated data boundary: separate database schemas, separate cache namespaces, separate Kafka event streams. Business units within the same enterprise share administrative tooling without sharing transactional data, billing records, or customer information.
Monitor and Recover
Real-time uptime dashboards track service availability across all endpoints. Incident management provides structured escalation, postmortem documentation, and status page updates. Dead-letter recovery ensures that no billing event is ever permanently lost: failed events are captured, inspected, and replayed.
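The capture, inspect, and replay cycle for failed events can be sketched as follows. This is an in-memory illustration under stated assumptions: `DeadLetterQueue`, `bill`, and the `state` flag are hypothetical names, and a production system would persist failed events durably rather than hold them in a list.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]


class DeadLetterQueue:
    def __init__(self) -> None:
        self._failed: List[Tuple[Event, str]] = []

    def capture(self, event: Event, error: Exception) -> None:
        # Store the event together with the reason it failed.
        self._failed.append((event, repr(error)))

    def inspect(self) -> List[Tuple[Event, str]]:
        return list(self._failed)

    def replay(self, handler: Callable[[Event], None]) -> int:
        """Re-run the handler; events that fail again go back on the queue."""
        pending, self._failed = self._failed, []
        replayed = 0
        for event, _ in pending:
            try:
                handler(event)
                replayed += 1
            except Exception as exc:
                self.capture(event, exc)
        return replayed


state = {"database_up": False}  # simulated dependency outage


def bill(event: Event) -> None:
    if not state["database_up"]:
        raise ConnectionError("database unavailable")


dlq = DeadLetterQueue()
event = {"event_id": "evt_1", "type": "usage.recorded"}
try:
    bill(event)
except ConnectionError as exc:
    dlq.capture(event, exc)

captured = dlq.inspect()       # one failed event, with its error
state["database_up"] = True    # dependency recovers
replayed = dlq.replay(bill)    # the event is processed on replay
```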
One Console. Four Stakeholders. Zero Friction.
Security, Finance, Engineering, and SRE each get the controls they need to protect the platform AND keep it running, without stepping on each other. One audit trail. One tenant boundary. One incident console.
Security / CISO
Hand the audit committee a self-serve report.
1. Quarterly review hits. Audit committee needs full chain of custody on every pricing / refund / billing change for Q1.
2. Open the Audit Log. Filter by financial ops + date range + actor.
3. Export to CSV. Every record carries actor + timestamp + previous value + new value + IP.
4. Auditor walks the log. Review wraps in 2 hours, not 2 weeks of reconstruction.
Finance / Compliance
Approval workflows enforced at the platform layer.
1. Finance configures: refund > $5K requires VP sign-off. Discount > 15% routes to Finance queue.
2. Sales submits a $24K refund request. Auto-routes to Finance approval queue.
3. Finance reviews + approves with comment. Refund issues. Audit trail updated.
4. SOC 2 evidence: separation of duties enforced by the platform, not by policy memo.
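The two thresholds from step 1 can be expressed as a small routing function. This is a sketch only: `ApprovalRequest`, `required_approval`, and the returned labels are hypothetical, and how a VP sign-off maps onto the Finance approval queue is not specified in the walkthrough above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ApprovalRequest:
    kind: str      # "refund" or "discount"
    amount: float  # USD for refunds, percent for discounts


def required_approval(request: ApprovalRequest) -> Optional[str]:
    """Return the approval a request needs, or None if it can proceed directly."""
    if request.kind == "refund" and request.amount > 5_000:
        return "vp_sign_off"
    if request.kind == "discount" and request.amount > 15:
        return "finance_queue"
    return None


big_refund = required_approval(ApprovalRequest("refund", 24_000))
small_refund = required_approval(ApprovalRequest("refund", 5_000))
deep_discount = required_approval(ApprovalRequest("discount", 20))
```

Both comparisons are strict, so a refund of exactly $5,000 or a discount of exactly 15% passes without approval under this reading of the rule.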
CTO / VP Engineering
Six business units, one platform, zero cross-contamination.
1. Acme Holdings operates 6 sub-brands on one Aforo deployment.
2. Each tenant gets its own schema, cache namespace, Kafka topic, dead-letter queue.
3. Configuration error in Brand A pricing config → contained to Brand A tenant.
4. System Health dashboard shows per-tenant health independently. Cross-tenant probe returns 404.
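The "cross-tenant probe returns 404" behavior can be illustrated with a tenant-scoped lookup. This sketch simplifies the schema-level separation described above into a dictionary keyed by tenant; `INVOICES` and `get_invoice` are hypothetical names, and the sample data is made up.

```python
from typing import Dict, Tuple

# Hypothetical store keyed by (tenant_id, resource_id).
INVOICES: Dict[Tuple[str, str], Dict[str, int]] = {
    ("brand-a", "inv_1"): {"total_usd": 120},
    ("brand-b", "inv_2"): {"total_usd": 75},
}


def get_invoice(caller_tenant: str, resource_id: str) -> Tuple[int, Dict[str, object]]:
    """Look up a resource strictly within the caller's own tenant."""
    record = INVOICES.get((caller_tenant, resource_id))
    if record is None:
        # Same response whether the invoice is missing or owned by another
        # tenant, so a probe cannot confirm that it exists elsewhere.
        return 404, {"error": "not_found"}
    return 200, dict(record)


own_status, _ = get_invoice("brand-a", "inv_1")
probe_status, _ = get_invoice("brand-a", "inv_2")  # inv_2 belongs to brand-b
```

Returning 404 rather than 403 is the design choice worth noting: a 403 would reveal that the resource exists in some other tenant.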
SRE / Platform Engineering
Full incident lifecycle in one console.
1. Uptime Monitor detects 3 consecutive failures on /v1/translate. Status flips to DEGRADED.
2. PagerDuty integration fires; on-call SRE pages in.
3. SRE declares Incident; public status page auto-updates with structured timeline.
4. Event Log + System Health surface root cause (DB connection pool exhausted) in 4 clicks. Postmortem published, customers notified, MTTR captured.
Protect the platform. Keep it running. One console.
Immutable audit trail + RBAC + tenant isolation + encryption on the control side. Uptime monitor + incident manager + event log + system health on the continuity side. Every signal lands in one operations console, with one audit trail covering all of it.