CASE STUDY

Full-Stack Sports Commentary App with Live Audio Streaming

HearSports

Industry: Sports Media / Live Streaming

Region: International

Timeline: Full-cycle engagement

Team: Trembit dedicated engineering team

Streaming: Janus (WebRTC)

Mobile: Swift (iOS)

Backend: Python/Django

Realtime: WebSocket

The Problem

HearSports set out to change how fans experience live sports — not by replacing the broadcast, but by letting viewers choose who narrates it. The idea was simple: layer real-time audio commentary tracks on top of live sports feeds, so fans could swap the default broadcast for a friend's play-by-play, a professional analyst in another language, or a community commentator with a following. Nothing on the market did this. Podcast apps had no concept of live sync; streaming platforms locked audio to their own feeds; social audio tools were not built around precise timing against an external broadcast signal. HearSports needed a mobile-first platform where audio commentary stayed locked to the live event — frame-accurate, low-latency, and switchable mid-game — while supporting thousands of concurrent listeners and dozens of simultaneous commentary streams per match.

Why Synchronizing User-Generated Audio with Live Sports Broadcasts Is Hard

Layering real-time commentary onto a live broadcast creates a deceptively difficult set of problems at the intersection of live streaming, social platforms, and broadcast-grade timing:

Frame-accurate audio-to-broadcast sync — commentary must align precisely with the live event timeline; even a two-second drift makes a goal call arrive before or after the viewer sees it, destroying the experience
Variable broadcast delay — different viewers receive the same broadcast at different times depending on their provider, device, and network; the sync layer must compensate per listener
Multi-stream concurrency — a single match may have dozens of active commentary tracks from pros, community creators, and friends, all streaming simultaneously to overlapping audiences
Low-latency mobile delivery — fans watch on phones over cellular networks with fluctuating bandwidth; audio must arrive with minimal latency while gracefully handling drops and reconnects
User-generated content at scale — anyone can start a commentary stream, so the platform must handle unpredictable ingest volumes, moderate content in near-real-time, and manage stream quality without manual intervention
Social discovery and sharing — community-shared clips, social playback, and multi-language browsing require a content layer as responsive as the streaming layer

What We Did

Streaming Architecture & Sync Engine

Designed the real-time audio streaming infrastructure around Janus WebRTC — optimized for audio-only delivery with minimal overhead, supporting dozens of concurrent commentary streams per live event
Built the broadcast sync engine — a timestamp-alignment system that maps commentary audio to the live event timeline, letting listeners hear narration locked to what is on screen regardless of their individual broadcast delay
Established WebSocket channels for real-time signaling, stream switching, and sync offset negotiation between the mobile client and backend, with adaptive audio bitrate handling across cellular and Wi-Fi
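
The timestamp-alignment idea above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the production sync engine: the names `CommentaryChunk`, `event_time`, `playout_time`, and `hold_duration` are hypothetical, and it assumes every audio chunk carries the commentator's capture time plus the commentator's own broadcast delay, with all clocks already synchronized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommentaryChunk:
    """Hypothetical audio chunk metadata (field names are assumptions)."""
    capture_time: float       # wall-clock seconds when the commentator spoke
    commentator_delay: float  # commentator's own broadcast delay, in seconds


def event_time(chunk: CommentaryChunk) -> float:
    """Moment on the live event timeline that the commentator was reacting to."""
    return chunk.capture_time - chunk.commentator_delay


def playout_time(chunk: CommentaryChunk, listener_delay: float) -> float:
    """Wall-clock time at which this listener sees that same moment on screen."""
    return event_time(chunk) + listener_delay


def hold_duration(chunk: CommentaryChunk, listener_delay: float, now: float) -> float:
    """Seconds to buffer the chunk before playing it; negative means it is late."""
    return playout_time(chunk, listener_delay) - now
```

A negative hold means the listener's feed is ahead of the commentary, so the chunk cannot be played in sync; how that case is handled (skipping, switching tracks, warning the user) is a product decision the sketch leaves open.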

Mobile Application & Listening Experience

Developed the native iOS application in Swift — built around a seamless listening interface where fans browse live commentary tracks, preview commentators, and switch streams mid-game without interruption
Built the stream-switching UX — listeners hop between commentary tracks (different languages, styles, friends vs. pros) with crossfade transitions and automatic sync re-alignment
Implemented per-listener broadcast delay calibration — each user sets their offset once and the platform compensates automatically so commentary stays aligned with their specific TV or streaming feed
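
One way to picture the crossfade between commentary tracks is an equal-power fade, sketched below. The source only says that transitions are crossfaded; the equal-power curve, the function names `crossfade_gains` and `crossfade`, and the representation of audio as plain lists of float samples are assumptions made for illustration.

```python
import math


def crossfade_gains(progress: float) -> tuple[float, float]:
    """Return (outgoing_gain, incoming_gain) for a fade position in [0, 1].

    Equal-power curve: the squared gains always sum to 1, which keeps
    perceived loudness roughly constant through the transition.
    """
    p = min(max(progress, 0.0), 1.0)  # clamp out-of-range positions
    return math.cos(p * math.pi / 2), math.sin(p * math.pi / 2)


def crossfade(outgoing: list[float], incoming: list[float]) -> list[float]:
    """Mix two equal-rate sample buffers, fading from the first to the second."""
    n = min(len(outgoing), len(incoming))
    mixed = []
    for i in range(n):
        progress = i / (n - 1) if n > 1 else 1.0
        g_out, g_in = crossfade_gains(progress)
        mixed.append(outgoing[i] * g_out + incoming[i] * g_in)
    return mixed
```

In a real client the incoming track would first be re-aligned to the listener's offset, so that both buffers refer to the same stretch of the event timeline before they are mixed.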

Creator Tools & Community Platform

Built the commentator ingest pipeline — anyone can start a live commentary stream from their phone, with automatic audio normalization, quality checks, and metadata tagging for discoverability
Developed the community clip system — listeners and commentators mark highlights, clip audio segments, and share them as social playback snippets tied to specific match moments
Implemented multi-language commentary support with language filtering and recommendation logic, plus moderation tooling for user-generated streams

Scalability & Performance Optimization

Deployed the Python/Django backend handling user management, content catalog, social graph, clip storage, and API orchestration across all platform features
Load-tested the streaming infrastructure under peak match-day scenarios — thousands of concurrent listeners across dozens of simultaneous commentary streams
Optimized Janus for audio-only forwarding and implemented reconnection and buffer management to handle mobile network transitions without losing sync or interrupting playback
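
The reconnection behaviour mentioned above could look something like the following sketch. The source does not describe the retry policy, so exponential backoff with full jitter is an assumption, as are the names `backoff_delays` and `reconnect` and the default values for `base`, `cap`, and `max_attempts`.

```python
import random
import time


def backoff_delays(retries, base=0.25, cap=8.0, rng=None):
    """Yield one randomized wait per retry, growing exponentially up to `cap`."""
    rng = rng or random.Random()
    for attempt in range(retries):
        ceiling = min(cap, base * 2 ** attempt)
        yield rng.uniform(0.0, ceiling)  # "full jitter": anywhere in [0, ceiling]


def reconnect(connect, max_attempts=6, sleep=time.sleep, rng=None):
    """Call `connect` until it succeeds or `max_attempts` calls have failed."""
    delays = backoff_delays(max_attempts - 1, rng=rng)
    for _ in range(max_attempts):
        try:
            return connect()
        except ConnectionError:
            delay = next(delays, None)
            if delay is None:  # no retries left
                break
            sleep(delay)
    raise ConnectionError(f"gave up after {max_attempts} attempts")
```

Randomizing the wait keeps many listeners who lose signal at the same moment (a stadium, a train tunnel) from all reconnecting in the same instant. While retries are in flight, the client would keep playing from its buffer so the listener does not hear the gap.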

Key Results

Frame-accurate: commentary locked to the live event timeline with per-listener delay calibration

Dozens of live tracks: multiple commentators streaming to thousands of concurrent listeners per match

Sub-second audio: latency maintained across cellular and Wi-Fi with adaptive bitrate

User-gen + pro: open commentary creation with moderation and multi-language support

Full-cycle: iOS app, streaming infrastructure, backend, and community platform

In Their Words

Trembit built us a streaming platform that solves a problem no one else was tackling — letting fans choose their own commentator during a live match. The sync engine is what makes the whole thing work.

HearSports product stakeholder

Their proactive team gets things done as if it were their own project.

Trembit client

What We Learned

Audio sync against an external broadcast is harder than syncing within your own platform

When you control both sides of the stream, you set the clock. When the video feed comes from a third-party broadcaster with variable delay per viewer, your sync engine has to work backward, measuring each listener's offset and adjusting commentary delivery accordingly. We built a calibration flow where users tap a button when they see a reference event (kickoff, whistle), and the platform calculates their specific delay. Making it reliable across providers, devices, and networks was one of the hardest pieces.
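
The tap-to-calibrate flow can be sketched as simple arithmetic: the listener's delay is the gap between when the reference event happened and when they tapped, minus a reaction-time allowance. Everything beyond that basic idea is an assumption here: the function name `estimate_delay`, the 0.25-second `reaction_time` default, and the use of several taps combined with a median to discard a mistimed one.

```python
from statistics import median


def estimate_delay(reference_times, tap_times, reaction_time=0.25):
    """Estimate a listener's broadcast delay in seconds.

    reference_times: when each reference event happened on the event timeline
    tap_times:       when the listener tapped after seeing it on their screen
    reaction_time:   assumed human reaction lag subtracted from every tap
    """
    if not tap_times or len(reference_times) != len(tap_times):
        raise ValueError("need one tap per reference event")
    samples = [
        tap - ref - reaction_time
        for ref, tap in zip(reference_times, tap_times)
    ]
    # The median ignores a single badly mistimed tap; a delay cannot be negative.
    return max(0.0, median(samples))
```

For example, taps at 8.25 s, 68.35 s, and 140.0 s against reference events at 0 s, 60 s, and 120 s give samples of roughly 8.0, 8.1, and 19.75 seconds, and the median keeps the estimate near 8.1 despite the late third tap.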

Audio-only streaming is not just "video streaming minus the video"

Stripping the video track does not automatically make things simpler. Audio is more sensitive to jitter and packet loss: listeners notice a 50 ms glitch they would never catch in a video frame. We tuned Janus specifically for audio forwarding, adjusting buffer sizes, jitter compensation, and packet loss concealment for speech-like content rather than music.
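
To make the buffering trade-off concrete, here is a deliberately simplified reorder-and-conceal buffer. It is not Janus code or Janus configuration, and the class name `JitterBuffer`, the `depth` parameter, and the repeat-last-frame concealment are all assumptions chosen for illustration. Packets are released in sequence order, a missing packet is waited for until `depth` later packets have queued up, and only then is it declared lost and concealed.

```python
class JitterBuffer:
    """Toy reorder buffer with repeat-last-frame loss concealment."""

    def __init__(self, depth=3, silence=b""):
        self.depth = depth          # later packets to wait for before giving up
        self.pending = {}           # seq -> frame, for packets not yet played
        self.next_seq = 0           # sequence number expected next
        self.last_frame = silence   # what to repeat when a packet is lost

    def push(self, seq, frame):
        """Store an arriving packet; packets older than the playhead are dropped."""
        if seq >= self.next_seq:
            self.pending[seq] = frame

    def pop(self):
        """Return the next frame in order, or None while still waiting."""
        if self.next_seq in self.pending:
            frame = self.pending.pop(self.next_seq)
        elif len(self.pending) >= self.depth:
            frame = self.last_frame  # declare the packet lost and conceal it
        else:
            return None
        self.next_seq += 1
        self.last_frame = frame
        return frame
```

A larger `depth` tolerates more reordering but adds latency, which is the tension behind tuning buffers for speech: short repeated frames are usually less noticeable in talk than in music, so a shallower buffer is easier to justify.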

Social features and streaming must share the same real-time backbone

Clip sharing, live listening activity, and follow counts all need to update in real time during a match. We built the social layer on the same WebSocket infrastructure that handles stream signaling — so when a listener clips a highlight, it propagates to followers instantly rather than through a polling feed that updates minutes later. During a live match, minutes might as well be hours.
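
The push-instead-of-poll idea can be illustrated with a small in-process publish/subscribe hub. This is a stand-in for the real WebSocket layer, whose implementation the source does not describe; the class name `Hub`, the topic strings, and the event dictionaries are hypothetical.

```python
import asyncio
from collections import defaultdict


class Hub:
    """Minimal topic-based fan-out: each subscriber gets its own queue."""

    def __init__(self):
        self._subscribers = defaultdict(set)

    def subscribe(self, topic):
        """Register a new listener and return the queue it should read from."""
        queue = asyncio.Queue()
        self._subscribers[topic].add(queue)
        return queue

    def unsubscribe(self, topic, queue):
        self._subscribers[topic].discard(queue)

    def publish(self, topic, event):
        """Push an event to every subscriber; return how many received it."""
        for queue in self._subscribers[topic]:
            queue.put_nowait(event)
        return len(self._subscribers[topic])
```

In this picture each open WebSocket connection would own one queue per topic it follows and forward whatever arrives, so a clip event and a stream-switch signal travel the same path with the same latency.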

Need a Live Audio Platform?

Book a 30-minute architecture session — we'll discuss your streaming requirements and the infrastructure decisions that matter most. No pitch deck. Just engineering clarity.