Cloud-Hybrid Video Surveillance Architecture

Resilient surveillance architectures that bridge on-premise recording with cloud-managed operations and disaster recovery.

AWS, Azure, Terraform, RTSP, ONVIF, S3, WebRTC, Go, Docker, WireGuard

Executive Summary

Our cloud-hybrid surveillance architecture gives multi-site organizations centralized visibility and management without the single-point-of-failure risk of pure-cloud systems. Each site records locally for guaranteed continuity, while cloud services handle health monitoring, remote access, firmware management, and long-term archival. Clients consistently reduce total cost of ownership by 30-40% compared to traditional on-premise-only deployments by eliminating dedicated NVR hardware refresh cycles and leveraging elastic cloud storage tiering.

The Challenge

Organizations with distributed footprints—retail chains, school districts, logistics hubs—face a dilemma. Pure on-premise VMS deployments create management silos: each site runs its own recording infrastructure with local credentials, independent firmware versions, and no centralized health monitoring. When a recorder fails at a remote site, it can go unnoticed for days until someone physically checks. Scaling requires hardware procurement, racking, and on-site configuration visits that cost thousands per location.

Pure cloud surveillance solves the management problem but introduces existential reliability risk. When the WAN link goes down (and at remote sites, it will), cameras stop recording. Cloud-only systems also require sustained upstream bandwidth that many sites cannot provide: a single 4K camera at 15 fps generates 12-16 Mbps, so a 64-camera site would need roughly 770-1,020 Mbps of dedicated upload capacity. Most branch offices operate on 50-100 Mbps connections shared with business operations.
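The upload estimate is simple multiplication, and a short sketch makes it easy to rerun for other site sizes. The function names and the 100 Mbps uplink below are illustrative assumptions, not part of the deployed system.

```go
package main

import "fmt"

// aggregateMbps returns the total upstream bitrate, in Mbps, needed to
// stream every camera to the cloud at the given per-camera bitrate.
func aggregateMbps(cameras int, perCameraMbps float64) float64 {
	return float64(cameras) * perCameraMbps
}

// streamsThatFit returns how many whole camera streams fit in a link.
func streamsThatFit(linkMbps, perCameraMbps float64) int {
	return int(linkMbps / perCameraMbps)
}

func main() {
	const cameras = 64
	low := aggregateMbps(cameras, 12)  // lower per-camera figure from the text
	high := aggregateMbps(cameras, 16) // upper per-camera figure from the text
	fmt.Printf("%d cameras need %.0f-%.0f Mbps upstream\n", cameras, low, high)

	// Hypothetical 100 Mbps branch uplink, ignoring all other traffic.
	fmt.Println("12 Mbps streams that fit in 100 Mbps:", streamsThatFit(100, 12))
}
```

Even with the whole link reserved for video, a 100 Mbps connection carries only a handful of 4K streams, which is why recording stays local.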

Compliance requirements add further constraints. Healthcare and financial services organizations must retain footage for defined periods (often 90-180 days) with tamper-evident chain of custody. Insurance underwriters increasingly require proof of continuous recording as a condition of coverage. Any architecture must guarantee uninterrupted local recording, verifiable retention compliance, and secure remote access without exposing systems directly to the public internet.

Our Approach

We deploy a split architecture (local recording plane, cloud management plane) in which each site runs a lightweight Linux-based recording appliance (built on Ubuntu Server with ZFS storage pools) that handles all local RTSP ingestion and recording independently of cloud connectivity. Cameras are segmented onto a dedicated VLAN with no default gateway, ensuring they are never directly internet-accessible. The recording appliance maintains a persistent WireGuard tunnel to the cloud management plane, through which it reports health telemetry, receives configuration updates, and serves remote video requests.

The cloud layer, deployed on AWS or Azure via Terraform, provides centralized services: device health dashboards, alert aggregation, user identity management (integrated with Azure AD or Okta via SAML 2.0), and a WebRTC-based remote viewing gateway. When an operator requests live or recorded video, the cloud gateway establishes a WebRTC session that routes media through the WireGuard tunnel, delivering sub-second live view latency without requiring port-forwarding or VPN client software on the viewer's device.

Storage tiering is policy-driven. Hot footage (0-30 days) lives on local NVMe/SSD storage with immediate playback. Warm footage (30-90 days) replicates to S3 Intelligent-Tiering or Azure Blob Cool tier during off-peak hours. Cold footage (90-365 days) transitions to S3 Glacier Deep Archive or Azure Archive, reducing per-TB storage costs by up to 90% compared to on-premise disk. A metadata index in PostgreSQL maps every recording segment to its storage location, enabling seamless playback regardless of tier: the system transparently retrieves and transcodes archived footage on demand.
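The age-based policy can be sketched as a small lookup. This is a minimal illustration, not the production policy engine: the type and function names are hypothetical, the upper bound of each tier is assumed to be exclusive, and the behavior after the cold window ends (labeled PastRetention here) is an assumption, since the text does not say what happens to footage older than 365 days.

```go
package main

import (
	"fmt"
	"time"
)

// Tier names the storage class a recording segment should live in.
type Tier string

const (
	Hot           Tier = "hot"
	Warm          Tier = "warm"
	Cold          Tier = "cold"
	PastRetention Tier = "past-retention" // assumption: not defined in the text
)

// Policy holds the upper age bound, in days, of each tier.
type Policy struct {
	HotDays, WarmDays, ColdDays int
}

// TierForAge maps a segment's age in days to a tier.
// Each upper bound is treated as exclusive (an assumption).
func (p Policy) TierForAge(ageDays int) Tier {
	switch {
	case ageDays < p.HotDays:
		return Hot
	case ageDays < p.WarmDays:
		return Warm
	case ageDays < p.ColdDays:
		return Cold
	default:
		return PastRetention
	}
}

// AgeDays returns the whole number of days between recorded and now.
func AgeDays(recorded, now time.Time) int {
	return int(now.Sub(recorded).Hours() / 24)
}

func main() {
	p := Policy{HotDays: 30, WarmDays: 90, ColdDays: 365} // defaults from the text
	now := time.Date(2024, 6, 1, 0, 0, 0, 0, time.UTC)
	for _, days := range []int{5, 45, 200, 400} {
		recorded := now.AddDate(0, 0, -days)
		fmt.Printf("%3d days old -> %s\n", days, p.TierForAge(AgeDays(recorded, now)))
	}
}
```

Keeping the thresholds in a struct rather than in code mirrors the policy-configurable retention described above.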

Key Capabilities

Site-Autonomous Recording

Each location records independently on local storage with ZFS checksumming and RAID-Z2 redundancy, so capture continues through WAN outages without footage loss, bounded only by local storage capacity.

WebRTC Remote Viewing

Browser-based live and playback access via WebRTC delivers sub-500 ms latency without requiring VPN clients, browser plugins, or inbound firewall rules at remote sites.

Elastic Cloud Storage Tiering

Policy-driven replication moves aging footage through hot, warm, and cold storage tiers across AWS S3 or Azure Blob, cutting long-term retention costs by up to 90% while maintaining on-demand retrieval.

Centralized Fleet Management

A single cloud dashboard provides firmware versioning, health monitoring, storage utilization tracking, and configuration management across hundreds of sites with role-based access control tied to enterprise identity providers.

Technical Architecture

The local recording appliance runs a custom Go service that manages the RTSP session lifecycle for each camera. It discovers cameras via ONVIF Profile S device discovery (WS-Discovery), negotiates media profiles for the best available resolution/framerate combination, and maintains persistent RTSP sessions with automatic reconnection and session timeout handling. Video is recorded as fragmented MP4 (fMP4) with 2-second GOP alignment, enabling instant random-access playback without index file dependencies. Each fragment is integrity-stamped with a SHA-256 hash chain, providing tamper-evidence for evidentiary use.
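A hash chain of this kind can be sketched in a few lines of Go. The construction below (each link is the SHA-256 of the previous link followed by the fragment bytes, starting from an all-zero genesis value) is an assumption for illustration. The text does not specify how the service builds its chain, and the function names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Link is one entry in the chain: SHA-256(previous link || fragment bytes).
type Link [sha256.Size]byte

// nextLink extends the chain by one fragment.
func nextLink(prev Link, fragment []byte) Link {
	h := sha256.New()
	h.Write(prev[:])
	h.Write(fragment)
	var out Link
	copy(out[:], h.Sum(nil))
	return out
}

// buildChain computes one link per fragment, starting from a zero genesis link.
func buildChain(fragments [][]byte) []Link {
	chain := make([]Link, len(fragments))
	var prev Link
	for i, f := range fragments {
		prev = nextLink(prev, f)
		chain[i] = prev
	}
	return chain
}

// verifyChain returns the index of the first fragment that does not match
// the recorded chain, or -1 if every fragment verifies.
func verifyChain(fragments [][]byte, chain []Link) int {
	var prev Link
	for i, f := range fragments {
		if i >= len(chain) {
			return i
		}
		want := nextLink(prev, f)
		if want != chain[i] {
			return i
		}
		prev = want
	}
	if len(chain) > len(fragments) {
		return len(fragments) // trailing fragments are missing
	}
	return -1
}

func main() {
	fragments := [][]byte{[]byte("frag-0"), []byte("frag-1"), []byte("frag-2")}
	chain := buildChain(fragments)
	fmt.Println("untampered:", verifyChain(fragments, chain)) // -1

	fragments[1][0] ^= 0xFF // flip bits in the second fragment
	fmt.Println("tampered:", verifyChain(fragments, chain)) // 1
}
```

Because each link depends on its predecessor, altering, removing, or reordering a fragment invalidates every link from that point on, which is what makes the record tamper-evident.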

The WireGuard tunnel configuration uses 256-bit Curve25519 keys with PresharedKey enabled, which adds a symmetric-key layer intended to provide post-quantum resistance. Tunnel health is monitored via a custom keepalive protocol that distinguishes between tunnel-down and peer-unreachable states, triggering automatic failover to a secondary cloud endpoint if the primary becomes unavailable. All management API traffic traverses the tunnel using mTLS with client certificates issued from an internal PKI (step-ca), ensuring that even if the WireGuard layer were compromised, API authentication would remain intact.
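One way to separate the two failure states is to combine the age of the most recent WireGuard handshake with the result of an application-level probe sent through the tunnel. The sketch below assumes that approach. The keepalive protocol itself is not described in the text, so the types, the 180-second threshold, and the rule of failing over after three consecutive unhealthy checks are all illustrative assumptions.

```go
package main

import "fmt"

// TunnelState is the result of one health check.
type TunnelState int

const (
	Healthy TunnelState = iota
	PeerUnreachable // tunnel is up, but the management endpoint did not answer
	TunnelDown      // no recent handshake: the tunnel itself is stale
)

func (s TunnelState) String() string {
	switch s {
	case Healthy:
		return "healthy"
	case PeerUnreachable:
		return "peer-unreachable"
	case TunnelDown:
		return "tunnel-down"
	}
	return "unknown"
}

// Observation is what one check measured (hypothetical fields).
type Observation struct {
	HandshakeAgeSec int  // seconds since the latest handshake
	ProbeAnswered   bool // did a probe through the tunnel get a reply?
}

// classify maps an observation to a state. A stale handshake takes
// precedence, because a probe cannot succeed over a dead tunnel.
func classify(o Observation, maxHandshakeAgeSec int) TunnelState {
	switch {
	case o.HandshakeAgeSec > maxHandshakeAgeSec:
		return TunnelDown
	case !o.ProbeAnswered:
		return PeerUnreachable
	default:
		return Healthy
	}
}

// shouldFailover reports whether the last n checks were all unhealthy.
func shouldFailover(history []TunnelState, n int) bool {
	if n <= 0 || len(history) < n {
		return false
	}
	for _, s := range history[len(history)-n:] {
		if s == Healthy {
			return false
		}
	}
	return true
}

func main() {
	checks := []Observation{
		{HandshakeAgeSec: 30, ProbeAnswered: true},
		{HandshakeAgeSec: 45, ProbeAnswered: false},
		{HandshakeAgeSec: 400, ProbeAnswered: false},
		{HandshakeAgeSec: 460, ProbeAnswered: false},
	}
	var history []TunnelState
	for _, c := range checks {
		s := classify(c, 180) // 180 s is an illustrative threshold
		history = append(history, s)
		fmt.Printf("%-17v failover=%v\n", s, shouldFailover(history, 3))
	}
}
```

Requiring several consecutive unhealthy checks before switching endpoints avoids flapping on a single dropped probe.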

Cloud-side video retrieval from archived tiers uses a prefetch strategy: when an operator navigates to a time range stored in an archive tier, the system requests the fastest restore that tier supports and streams segments as they become available while the rest of the range restores. Restore times depend on the storage class. Expedited retrievals (typically 1-5 minutes) are available for S3 Glacier Flexible Retrieval but not for S3 Glacier Deep Archive, whose Standard retrievals complete within 12 hours; high-priority Azure Archive rehydration may complete in under an hour for small blobs. Transcoding from the archived H.264/H.265 format to WebRTC-compatible VP8/H.264 Baseline is handled by FFmpeg instances running on autoscaled ECS Fargate tasks, scaling from zero during off-hours to minimize costs.
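The first step of such a prefetch can be sketched as splitting a requested range into segments that can play immediately and segments that need a restore request. The Segment and Plan types below are hypothetical stand-ins for rows in the metadata index, and the sketch only plans the work; it makes no cloud API calls.

```go
package main

import "fmt"

// Segment is a hypothetical row from the metadata index.
type Segment struct {
	ID   string
	Tier string // "hot", "warm", or "cold"
}

// Plan lists what can be streamed now and what must be restored first.
type Plan struct {
	PlayNow []string
	Restore []string
}

// planPlayback partitions the requested segments by tier, preserving
// timeline order within each list. Only cold (archived) segments are
// assumed to need a restore before they can be read.
func planPlayback(segments []Segment) Plan {
	var p Plan
	for _, s := range segments {
		if s.Tier == "cold" {
			p.Restore = append(p.Restore, s.ID)
		} else {
			p.PlayNow = append(p.PlayNow, s.ID)
		}
	}
	return p
}

func main() {
	requested := []Segment{
		{ID: "seg-001", Tier: "warm"},
		{ID: "seg-002", Tier: "cold"},
		{ID: "seg-003", Tier: "cold"},
	}
	plan := planPlayback(requested)
	fmt.Println("stream immediately:", plan.PlayNow)
	fmt.Println("request restore for:", plan.Restore)
}
```

Issuing the restore requests as soon as the operator selects the range, rather than when playback reaches each segment, is what lets the restore wait overlap with viewing of the segments that are already available.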

Specifications & Standards

Camera Protocol: ONVIF Profile S, RTSP/RTP, H.264/H.265
Tunnel Encryption: WireGuard (Curve25519, ChaCha20-Poly1305)
Recording Format: fMP4, 2s GOP, SHA-256 hash chain
Cloud Platforms: AWS (S3, ECS Fargate, RDS) / Azure (Blob, ACI, PgSQL)
Remote Viewing: WebRTC (VP8/H.264 Baseline), < 500 ms latency
Storage Retention: Hot 30d / Warm 90d / Cold 365d (policy-configurable)

Integration Ecosystem

AWS S3 / Glacier, Azure Blob Storage, WireGuard VPN, Terraform / OpenTofu, ONVIF Profile S, Azure AD / Okta (SAML 2.0), Grafana / Prometheus, FFmpeg

Measurable Outcomes

35% reduction in total cost of ownership

A 120-site retail chain eliminated dedicated NVR appliance refresh cycles and reduced per-site infrastructure costs from $18,000 to $6,500 by shifting to lightweight recording nodes with cloud-tiered storage.

99.97% recording uptime across 120 sites

Site-autonomous recording maintained continuous capture through 14 WAN outages (longest: 38 hours) during the first year of operation, with zero footage gaps detected in quarterly compliance audits.

60% faster incident response for remote sites

WebRTC-based remote viewing eliminated the 15-20 minute VPN connection process previously required for remote video access, enabling operators to view live feeds from any site within seconds of an alert.