Capability Use Case
Cloud-Hybrid Video Surveillance Architecture
Resilient surveillance architectures that bridge on-premise recording with cloud-managed operations and disaster recovery.
Executive Summary
Our cloud-hybrid surveillance architecture gives multi-site organizations centralized visibility and management without the single-point-of-failure risk of pure-cloud systems. Each site records locally for guaranteed continuity, while cloud services handle health monitoring, remote access, firmware management, and long-term archival. Clients consistently reduce total cost of ownership by 30-40% compared to traditional on-premise-only deployments by eliminating dedicated NVR hardware refresh cycles and leveraging elastic cloud storage tiering.
The Challenge
Organizations with distributed footprints—retail chains, school districts, logistics hubs—face a dilemma. Pure on-premise VMS deployments create management silos: each site runs its own recording infrastructure with local credentials, independent firmware versions, and no centralized health monitoring. When a recorder fails at a remote site, it can go unnoticed for days until someone physically checks. Scaling requires hardware procurement, racking, and on-site configuration visits that cost thousands per location.
Pure cloud surveillance solves the management problem but introduces a serious reliability risk. When the WAN link goes down, as it eventually will at remote sites, cameras stop recording. Cloud-only systems also require sustained upstream bandwidth that many sites cannot provide: a single 4K camera at 15 fps generates 12-16 Mbps, so a 64-camera site would need roughly 768-1,024 Mbps (about 0.8-1 Gbps) of dedicated upload capacity. Most branch offices operate on 50-100 Mbps connections shared with business operations.
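The bandwidth gap can be checked with simple arithmetic. The sketch below uses only the per-camera bitrates and link speeds quoted above; the function names and the `usable_fraction` parameter are illustrative assumptions, not part of any product API.

```python
def aggregate_upstream_mbps(cameras: int, per_camera_mbps: float) -> float:
    """Sustained upload needed if every camera streams to the cloud."""
    return cameras * per_camera_mbps


def cameras_supported(link_mbps: float, per_camera_mbps: float,
                      usable_fraction: float = 1.0) -> int:
    """Whole cameras that fit in the usable share of a WAN uplink.

    usable_fraction is a hypothetical knob for links shared with
    business traffic (1.0 means the link is fully dedicated to video).
    """
    return int((link_mbps * usable_fraction) // per_camera_mbps)


low = aggregate_upstream_mbps(64, 12)    # 64 * 12 = 768
high = aggregate_upstream_mbps(64, 16)   # 64 * 16 = 1024
print(f"64 cameras need {low:.0f}-{high:.0f} Mbps upstream")
print("Cameras on a dedicated 100 Mbps link:", cameras_supported(100, 12))
```

Even a fully dedicated 100 Mbps uplink carries only a handful of 4K streams, which is why recording stays local and only selected footage crosses the WAN.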
Compliance requirements add further constraints. Healthcare and financial services organizations must retain footage for defined periods (often 90-180 days) with tamper-evident chain of custody. Insurance underwriters increasingly require proof of continuous recording as a condition of coverage. Any architecture must guarantee uninterrupted local recording, verifiable retention compliance, and secure remote access without exposing systems directly to the public internet.
Our Approach
We deploy a site-autonomous architecture in which each site runs a lightweight Linux-based recording appliance (built on Ubuntu Server with ZFS storage pools) that handles all local RTSP ingestion and recording independently of cloud connectivity. Cameras are segmented onto a dedicated VLAN with no default gateway, so they are never directly reachable from the internet. The recording appliance maintains a persistent WireGuard tunnel to the cloud management plane, through which it reports health telemetry, receives configuration updates, and serves remote video requests.
The cloud layer, deployed on AWS or Azure via Terraform, provides centralized services: device health dashboards, alert aggregation, user identity management (integrated with Azure AD or Okta via SAML 2.0), and a WebRTC-based remote viewing gateway. When an operator requests live or recorded video, the cloud gateway establishes a WebRTC session that routes media through the WireGuard tunnel, delivering sub-second live view latency without requiring port-forwarding or VPN client software on the viewer's device.
Storage tiering is policy-driven. Hot footage (0-30 days) lives on local NVMe/SSD storage with immediate playback. Warm footage (30-90 days) replicates to S3 Intelligent-Tiering or the Azure Blob Cool tier during off-peak hours. Cold footage (90-365 days) transitions to an archive class such as S3 Glacier Deep Archive or Azure Archive, reducing per-TB storage costs by up to 90% compared to on-premise disk. A metadata index in PostgreSQL maps every recording segment to its storage location, enabling playback regardless of tier: the system retrieves and transcodes archived footage on demand, subject to the restore time of the archive class in use.
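As a rough illustration of the age-based policy described above, the following sketch maps a recording's age to a tier. The `TierPolicy` class, the `tier_for` function, and the choice to treat each boundary day as belonging to the colder tier are assumptions made for the example; the actual policy engine may resolve boundaries differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class TierPolicy:
    """Upper age bound (in days) for each tier; defaults mirror the text."""
    hot_days: int = 30
    warm_days: int = 90
    cold_days: int = 365


def tier_for(recorded_at: datetime, now: datetime,
             policy: TierPolicy = TierPolicy()) -> Optional[str]:
    """Return 'hot', 'warm', or 'cold', or None once retention has lapsed."""
    age_days = (now - recorded_at) / timedelta(days=1)
    if age_days < 0:
        raise ValueError("recording timestamp is in the future")
    if age_days < policy.hot_days:
        return "hot"
    if age_days < policy.warm_days:
        return "warm"
    if age_days < policy.cold_days:
        return "cold"
    return None  # past the retention window; eligible for deletion


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
for age in (5, 45, 200, 400):
    print(age, "days ->", tier_for(now - timedelta(days=age), now))
```

Keeping the thresholds in a single policy object makes per-client retention overrides (for example, a 180-day compliance requirement) a configuration change rather than a code change.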
Key Capabilities
Site-Autonomous Recording
Each location records independently on local storage with ZFS checksumming and RAID-Z2 redundancy, so capture continues uninterrupted through WAN outages, limited only by local storage capacity.
WebRTC Remote Viewing
Browser-based live and playback access via WebRTC delivers sub-500 ms latency without requiring VPN clients, browser plugins, or inbound firewall rules at remote sites.
Elastic Cloud Storage Tiering
Policy-driven replication moves aging footage through hot, warm, and cold storage tiers across AWS S3 or Azure Blob, cutting long-term retention costs by up to 90% while maintaining on-demand retrieval.
Centralized Fleet Management
A single cloud dashboard provides firmware versioning, health monitoring, storage utilization tracking, and configuration management across hundreds of sites with role-based access control tied to enterprise identity providers.
Technical Architecture
The local recording appliance runs a custom Go service that manages the RTSP session lifecycle for each camera. It discovers cameras via ONVIF Profile S device discovery (WS-Discovery), negotiates media profiles for the best available resolution and frame-rate combination, and maintains persistent RTSP sessions with automatic reconnection and session-timeout handling. Video is recorded as fragmented MP4 (fMP4) with 2-second GOP alignment, enabling random-access playback without a separate index file. Each fragment is stamped into a SHA-256 hash chain, in which every entry depends on the one before it, so altering or removing a fragment invalidates all later entries and provides tamper-evidence for evidentiary use.
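The hash-chain idea can be sketched in a few lines. This is a minimal illustration in Python rather than the appliance's Go code; the all-zero seed, the exact link construction (hashing the previous link together with the fragment digest), and the function names are assumptions for the example.

```python
import hashlib
from typing import List, Sequence

GENESIS = b"\x00" * 32  # hypothetical seed for the first link


def chain_fragments(fragments: Sequence[bytes], seed: bytes = GENESIS) -> List[bytes]:
    """Return one chain link per fragment; each link covers all earlier ones."""
    links: List[bytes] = []
    prev = seed
    for frag in fragments:
        frag_digest = hashlib.sha256(frag).digest()
        prev = hashlib.sha256(prev + frag_digest).digest()
        links.append(prev)
    return links


def verify_chain(fragments: Sequence[bytes], links: Sequence[bytes],
                 seed: bytes = GENESIS) -> int:
    """Return the index of the first fragment that fails, or -1 if intact."""
    prev = seed
    for i, (frag, link) in enumerate(zip(fragments, links)):
        expected = hashlib.sha256(prev + hashlib.sha256(frag).digest()).digest()
        if expected != link:
            return i
        prev = link
    if len(fragments) != len(links):
        return min(len(fragments), len(links))  # missing fragment or link
    return -1


frags = [b"fragment-0", b"fragment-1", b"fragment-2"]
links = chain_fragments(frags)
print(verify_chain(frags, links))                                  # -1 (intact)
print(verify_chain([frags[0], b"tampered", frags[2]], links))      # 1
```

Because each link folds in the previous one, a verifier only needs the stored links and the fragments themselves to locate the first point of divergence.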
The WireGuard tunnel configuration uses Curve25519 key pairs (256-bit) together with a PresharedKey, which adds a symmetric layer intended as a hedge against future quantum attacks on the key exchange. Tunnel health is monitored via a custom keepalive protocol that distinguishes between tunnel-down and peer-unreachable states, triggering automatic failover to a secondary cloud endpoint if the primary becomes unavailable. All management API traffic traverses the tunnel using mTLS with client certificates issued from an internal PKI (step-ca), so that even if the WireGuard layer were compromised, API authentication would remain intact.
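One plausible way to separate the two failure states is to combine the age of the last WireGuard handshake with the result of an application-level probe sent through the tunnel. The sketch below assumes that interpretation; the state names, the 180-second threshold, the endpoint hostnames, and both functions are hypothetical and do not describe the actual keepalive protocol.

```python
from enum import Enum
from typing import Dict, Optional, Sequence


class TunnelState(Enum):
    HEALTHY = "healthy"
    PEER_UNREACHABLE = "peer-unreachable"  # session exists, probe fails
    TUNNEL_DOWN = "tunnel-down"            # no recent handshake at all


def classify(handshake_age_s: Optional[float], probe_ok: bool,
             max_handshake_age_s: float = 180.0) -> TunnelState:
    """Classify one endpoint from handshake age and an in-tunnel probe.

    handshake_age_s is None when no handshake has ever completed.
    The 180 s default is an assumed, tunable threshold.
    """
    if probe_ok:
        return TunnelState.HEALTHY
    if handshake_age_s is None or handshake_age_s > max_handshake_age_s:
        return TunnelState.TUNNEL_DOWN
    return TunnelState.PEER_UNREACHABLE


def pick_endpoint(preferred: Sequence[str],
                  states: Dict[str, TunnelState]) -> Optional[str]:
    """Return the first healthy endpoint in preference order, if any."""
    for endpoint in preferred:
        if states.get(endpoint) is TunnelState.HEALTHY:
            return endpoint
    return None


states = {
    "primary.example.net": classify(handshake_age_s=400.0, probe_ok=False),
    "secondary.example.net": classify(handshake_age_s=20.0, probe_ok=True),
}
print(pick_endpoint(["primary.example.net", "secondary.example.net"], states))
```

Separating the two states matters operationally: a stale handshake points at the network path or key material, while a fresh handshake with failed probes points at the cloud service behind the tunnel.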
Cloud-side video retrieval from archived tiers uses a prefetch strategy: when an operator navigates to a time range stored in an archive tier, the system initiates the fastest retrieval that tier supports and streams each segment as soon as its restore completes. Restore times depend on the storage class. Expedited retrieval (typically 1-5 minutes) is available for S3 Glacier Flexible Retrieval but not for S3 Glacier Deep Archive, where standard retrieval can take up to 12 hours. Azure Archive rehydration at high priority can complete in under an hour for small blobs, while standard priority can take up to 15 hours. Transcoding from the archived H.264/H.265 format to WebRTC-compatible VP8/H.264 Baseline is handled by FFmpeg instances running on autoscaled ECS Fargate tasks, which scale from zero during off-hours to minimize costs.
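The prefetch step amounts to splitting the requested time range into segments that can play immediately and segments that must be restored first. The sketch below works on an in-memory list standing in for the PostgreSQL metadata index; the `Segment` fields, the tier labels, and the example URIs are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    start_s: int      # segment start, seconds since epoch
    duration_s: int   # segment length in seconds
    tier: str         # "hot", "warm", or "cold"
    location: str     # hypothetical storage URI


def plan_playback(index: List[Segment], range_start: int,
                  range_end: int) -> Tuple[List[Segment], List[Segment]]:
    """Split segments overlapping [range_start, range_end) by readiness.

    Returns (ready, to_restore), each sorted by start time so restores
    can be requested in playback order.
    """
    overlapping = sorted(
        (s for s in index
         if s.start_s < range_end and s.start_s + s.duration_s > range_start),
        key=lambda s: s.start_s,
    )
    ready = [s for s in overlapping if s.tier != "cold"]
    to_restore = [s for s in overlapping if s.tier == "cold"]
    return ready, to_restore


index = [
    Segment(0, 60, "cold", "archive://site-01/cam-03/0000.mp4"),
    Segment(60, 60, "cold", "archive://site-01/cam-03/0060.mp4"),
    Segment(120, 60, "warm", "cool://site-01/cam-03/0120.mp4"),
    Segment(180, 60, "hot", "file:///pool/cam-03/0180.mp4"),
]
ready, to_restore = plan_playback(index, 30, 150)
print([s.start_s for s in ready], [s.start_s for s in to_restore])
```

In a real deployment the `to_restore` list would be handed to the provider's restore mechanism (for example the S3 RestoreObject operation or an Azure blob tier change), with playback beginning from whichever segments are already available.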
Specifications & Standards
- Camera Protocol: ONVIF Profile S, RTSP/RTP, H.264/H.265
- Tunnel Encryption: WireGuard (Curve25519, ChaCha20-Poly1305)
- Recording Format: fMP4, 2 s GOP, SHA-256 hash chain
- Cloud Platforms: AWS (S3, ECS Fargate, RDS) / Azure (Blob, ACI, PostgreSQL)
- Remote Viewing: WebRTC (VP8/H.264 Baseline), < 500 ms latency
- Storage Retention: Hot 30d / Warm 90d / Cold 365d (policy-configurable)