Deep Dive into MP4 File Structure
- The diagram shows metadata at the end of the file; when metadata is at the beginning (faststart), [mdat] and [moov] swap positions
MP4 File Structure

+--------+          +--------+          +--------+
| [ftyp] | -------> | [mdat] | -------> | [moov] |
+--------+          +--------+          +--------+
File Type Box       Media Data          Metadata Container

[moov]                                Metadata Container
  +-- [mvhd]                          Movie Header Box
  +-- [trak]                          Track Container
        +-- [tkhd]                    Track Header
        +-- [mdia]                    Media Box
              +-- [minf]              Media Info Box
                    +-- [vmhd/smhd]   Video/Audio Box
                    +-- [stbl]        Sample Table
                          +-- [stsd]  Sample Description
                          +-- [stts]  Time-to-Sample
                          +-- [stss]  Sync Sample
                          +-- [stsz]  Sample Size
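To check which layout a given file uses, it is enough to walk the top-level boxes: each box starts with a 4-byte big-endian size and a 4-byte type. The following is a minimal sketch, not a full parser; the function names `iter_top_level_boxes`, `moov_before_mdat`, and `make_box` are hypothetical, and the demo uses tiny synthetic boxes rather than a real video.

```python
import io
import struct
from typing import BinaryIO, Iterator, Optional, Tuple


def iter_top_level_boxes(f: BinaryIO) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, offset, size) for each top-level box of an MP4 stream."""
    offset = 0
    while True:
        f.seek(offset)
        header = f.read(8)
        if len(header) < 8:
            return  # end of file (or truncated header)
        size, box_type = struct.unpack(">I4s", header)
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            ext = f.read(8)
            if len(ext) < 8:
                return
            size = struct.unpack(">Q", ext)[0]
        elif size == 0:
            # A size of 0 means the box extends to the end of the file.
            size = f.seek(0, 2) - offset
        if size < 8:
            return  # malformed box; stop instead of looping forever
        yield box_type.decode("latin-1"), offset, size
        offset += size


def moov_before_mdat(f: BinaryIO) -> Optional[bool]:
    """True if [moov] precedes [mdat], False if it follows, None if either is missing."""
    first_offset = {}
    for box_type, offset, _size in iter_top_level_boxes(f):
        first_offset.setdefault(box_type, offset)
    if "moov" not in first_offset or "mdat" not in first_offset:
        return None
    return first_offset["moov"] < first_offset["mdat"]


def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a synthetic box for the demo (hypothetical helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic layouts: metadata at the end vs. metadata at the start.
at_end = make_box(b"ftyp", b"isom") + make_box(b"mdat", b"\x00" * 16) + make_box(b"moov")
at_start = make_box(b"ftyp", b"isom") + make_box(b"moov") + make_box(b"mdat", b"\x00" * 16)
print(moov_before_mdat(io.BytesIO(at_end)))    # False
print(moov_before_mdat(io.BytesIO(at_start)))  # True
```

For a real file, the same function can be called on a handle opened with `open(path, "rb")`.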
HTTP Range Request Mechanism Explained
The HTTP Range request is a crucial feature of the HTTP/1.1 protocol that enables partial content retrieval, which is particularly important for streaming large media files such as videos.
Range Header Syntax
Range: bytes=<start>-<end>
Video Playback with HTTP-Range Support
1. Initial GET Request (with range)
Chrome                                 Server
+-------------------------+            +-------------------------------------------+
| GET /video.mp4 HTTP/1.1 |            | HTTP/1.1 206 Partial Content              |
| Host: cdn.com           | <--------> | Accept-Ranges: bytes                      |
| Range: bytes=0-         |            | Content-Range: bytes 0-11799707/828908176 |
+-------------------------+            | Content-Length: 11799708                  |
                                       | (body: .........)                         |
                                       +-------------------------------------------+
2. Seek Operation During Playback
Chrome                                        Server
+--------------------------------+            +--------------------------------------------------+
| GET /a.mp4 HTTP/1.1            |            | HTTP/1.1 206 Partial Content                     |
| Host: cdn.com                  | <--------> | Accept-Ranges: bytes                             |
| Range: bytes=1867776-828908176 |            | Content-Range: bytes 1867776-828908176/828908177 |
+--------------------------------+            | Content-Length: 827040401                        |
                                              | (body: .........)                                |
                                              +--------------------------------------------------+
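The relationship between the Range request and the 206 response headers can be sketched as below. This is a simplified illustration, assuming a single byte range only; `resolve_range` and `partial_headers` are hypothetical names, and a real server also handles multi-range requests and may ignore a malformed Range header and answer 200 instead.

```python
import re
from typing import Dict, Optional, Tuple

# Only the single-range form "bytes=<start>-<end>" is handled in this sketch.
_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def resolve_range(header: str, total: int) -> Optional[Tuple[int, int]]:
    """Return inclusive (first, last) byte positions, or None if not satisfiable."""
    match = _RANGE_RE.match(header.strip())
    if not match or total <= 0:
        return None
    start_s, end_s = match.groups()
    if not start_s and not end_s:
        return None
    if not start_s:
        # Suffix form "bytes=-N": the last N bytes of the file.
        suffix = int(end_s)
        if suffix == 0:
            return None
        return max(total - suffix, 0), total - 1
    start = int(start_s)
    if start >= total:
        return None
    end = int(end_s) if end_s else total - 1
    if end < start:
        return None
    # An end past the last byte is clamped to the last byte.
    return start, min(end, total - 1)


def partial_headers(header: str, total: int) -> Tuple[int, Dict[str, str]]:
    """Build the status code and headers for a single-range response."""
    resolved = resolve_range(header, total)
    if resolved is None:
        return 416, {"Content-Range": f"bytes */{total}"}
    first, last = resolved
    return 206, {
        "Accept-Ranges": "bytes",
        "Content-Range": f"bytes {first}-{last}/{total}",
        # Both positions are inclusive, hence the +1.
        "Content-Length": str(last - first + 1),
    }


status, headers = partial_headers("bytes=1867776-", 828908177)
print(status, headers["Content-Range"], headers["Content-Length"])
```

Because both byte positions are inclusive, Content-Length always equals last - first + 1, and the last valid byte position is the total size minus one.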
Video Playback Without HTTP-Range Support
- Plain GET requests (without a Range header) still allow progressive playback, but without seek functionality
- Servers signal missing range support by sending Accept-Ranges: none or by omitting the header
- Browsers fall back to linear playback mode when Range isn’t supported
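As a rough illustration of the decision a client makes, the check below inspects a response's status and headers. It is a sketch under simplifying assumptions: `supports_byte_ranges` is a hypothetical helper, headers are passed as a plain dict, and real browsers apply additional heuristics.

```python
from typing import Mapping


def supports_byte_ranges(status: int, headers: Mapping[str, str]) -> bool:
    """Guess whether the server honours byte-range requests for this resource."""
    # Header names are case-insensitive, so normalise them first.
    normalized = {name.lower(): value.strip().lower() for name, value in headers.items()}
    # A 206 with Content-Range proves that a Range request was honoured.
    if status == 206 and "content-range" in normalized:
        return True
    # Otherwise rely on the advertised range units; "none" or a missing
    # header both mean that seeking via Range is not available.
    units = [unit.strip() for unit in normalized.get("accept-ranges", "").split(",")]
    return "bytes" in units


print(supports_byte_ranges(200, {"Accept-Ranges": "none"}))            # False
print(supports_byte_ranges(206, {"Content-Range": "bytes 0-9/100"}))   # True
```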
Metadata Positioning Impact
- moov box placement in MP4 files significantly affects playback:
  - Front-positioned metadata (faststart):
    - Immediate access to critical video information
    - Enables playback without full file download
    - Supports random seek operations
  - End-positioned metadata:
    - Requires full download or Range requests for metadata retrieval
    - Previews unavailable before metadata acquisition
    - Impacts UX for large files
Video Generation Testing
## Metadata at start
ffmpeg -f lavfi -i testsrc=duration=60:size=1920x1080:rate=30 -c:v libx264 -c:a aac -movflags +faststart meta_at_start.mp4
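## Metadata at end
ffmpeg -f lavfi -i testsrc=duration=60:size=1920x1080:rate=30 -c:v libx264 -c:a aac -movflags empty_moov+default_base_moof meta_at_end.mp4
## Note: empty_moov+default_base_moof writes a fragmented MP4 with an empty moov at the start; ffmpeg's MP4 muxer places moov after mdat by default, so omitting -movflags gives the classic end-positioned layout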
Test Results
- Table order matches image sequence
Video File | Description | Playback Behavior |
---|---|---|
meta_at_start.mp4 | Front metadata | Immediate playback, no seeking (streaming only) |
meta_at_end.mp4 | End metadata | Immediate playback, no seeking (shows 0:08 duration error) |
meta_at_start_with_range.mp4 | Front metadata + range | Full playback control with seeking |
meta_at_end_with_range.mp4 | End metadata + range | Requires complete download for playback control |
HLS and M3U8 Streaming Technology
Beyond HTTP Range-based streaming, HTTP Live Streaming (HLS) with M3U8 playlists offers an alternative way to deliver media:
M3U8 File Structure

            Master Playlist (master.m3u8)
            +----------------------------------------------------------+
            | #EXTM3U                                                  |
            | #EXT-X-VERSION:3                                         |
            | #EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080 |
            | stream_0.m3u8                                            |
            | #EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1280x720  |
            | stream_1.m3u8                                            |
            | #EXT-X-STREAM-INF:BANDWIDTH=1000000,RESOLUTION=640x360   |
            | stream_2.m3u8                                            |
            +----------------------------------------------------------+
                                          |
             +----------------------------+----------------------------+
             |                            |                            |
             v                            v                            v
+-------------------------+  +-------------------------+  +-------------------------+
| HD Playlist             |  | SD Playlist             |  | LD Playlist             |
| stream_0.m3u8           |  | stream_1.m3u8           |  | stream_2.m3u8           |
|                         |  |                         |  |                         |
| #EXTM3U                 |  | #EXTM3U                 |  | #EXTM3U                 |
| #EXT-X-VERSION:3        |  | #EXT-X-VERSION:3        |  | #EXT-X-VERSION:3        |
| #EXT-X-TARGETDURATION:6 |  | #EXT-X-TARGETDURATION:6 |  | #EXT-X-TARGETDURATION:6 |
| #EXT-X-MEDIA-SEQUENCE:0 |  | #EXT-X-MEDIA-SEQUENCE:0 |  | #EXT-X-MEDIA-SEQUENCE:0 |
| #EXTINF:6.0,            |  | #EXTINF:6.0,            |  | #EXTINF:6.0,            |
| data000.ts              |  | data000.ts              |  | data000.ts              |
| #EXTINF:6.0,            |  | #EXTINF:6.0,            |  | #EXTINF:6.0,            |
| data001.ts              |  | data001.ts              |  | data001.ts              |
| ...                     |  | ...                     |  | ...                     |
| #EXT-X-ENDLIST          |  | #EXT-X-ENDLIST          |  | #EXT-X-ENDLIST          |
+-------------------------+  +-------------------------+  +-------------------------+
             |                            |                            |
             v                            v                            v
      +-------------+              +-------------+              +-------------+
      | HD Segments |              | SD Segments |              | LD Segments |
      | data000.ts  |              | data000.ts  |              | data000.ts  |
      | data001.ts  |              | data001.ts  |              | data001.ts  |
      | ...         |              | ...         |              | ...         |
      +-------------+              +-------------+              +-------------+
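A master playlist is plain text, so the variant list in the diagram can be extracted with a few lines of code. The sketch below is illustrative only: `parse_attributes` and `parse_master_playlist` are hypothetical names, only the `#EXT-X-STREAM-INF` tag is handled, and production players should use a complete HLS parser.

```python
import re
from typing import Dict, List, Optional

# Attribute lists are comma-separated KEY=VALUE pairs; quoted values may contain commas.
_ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_attributes(attr_text: str) -> Dict[str, str]:
    """Turn 'BANDWIDTH=5000000,RESOLUTION=1920x1080' into a dict."""
    return {key: value.strip('"') for key, value in _ATTR_RE.findall(attr_text)}


def parse_master_playlist(text: str) -> List[dict]:
    """Return one entry per variant stream: uri, bandwidth (bits/s), resolution."""
    variants = []
    pending: Optional[Dict[str, str]] = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("#EXT-X-STREAM-INF:"):
            # The URI of this variant is on the next non-comment line.
            pending = parse_attributes(line.split(":", 1)[1])
        elif not line.startswith("#") and pending is not None:
            variants.append({
                "uri": line,
                "bandwidth": int(pending.get("BANDWIDTH", 0)),
                "resolution": pending.get("RESOLUTION"),
            })
            pending = None
    return variants


MASTER = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
stream_0.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1280x720
stream_1.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1000000,RESOLUTION=640x360
stream_2.m3u8
"""

for variant in parse_master_playlist(MASTER):
    print(variant["uri"], variant["bandwidth"], variant["resolution"])
```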
- Video Segmentation: Split video into small .ts files
  - Typical segment duration: 6-10 seconds
  - Balance between latency and bandwidth adaptation
- Index File Creation: Generate .m3u8 with segment metadata
  - Master Playlist: Multi-bitrate manifest
  - Media Playlist: Per-quality segment list
- Adaptive Bitrate: Multiple quality versions
  - HD (1080p): 5Mbps bandwidth
  - SD (720p): 3Mbps bandwidth
  - LD (360p): 1Mbps bandwidth
- Client Workflow: Dynamic quality switching
  - Network monitoring
  - Seamless quality transitions
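The client-side quality switch can be sketched as a simple throughput-based rule: pick the highest-bitrate variant that fits within the measured bandwidth. This is one common heuristic, not the algorithm of any particular player; `choose_variant` and the 0.8 safety factor are assumptions made for illustration, and the variant list reuses the three bitrates listed above.

```python
from typing import List, Tuple

# (playlist URI, declared bandwidth in bits per second)
VARIANTS: List[Tuple[str, int]] = [
    ("stream_0.m3u8", 5_000_000),  # HD (1080p)
    ("stream_1.m3u8", 3_000_000),  # SD (720p)
    ("stream_2.m3u8", 1_000_000),  # LD (360p)
]


def choose_variant(
    variants: List[Tuple[str, int]],
    measured_bps: float,
    safety_factor: float = 0.8,  # assumed headroom, not a value from the HLS spec
) -> Tuple[str, int]:
    """Pick the best variant whose bandwidth fits the measured throughput."""
    budget = measured_bps * safety_factor
    affordable = [v for v in variants if v[1] <= budget]
    if affordable:
        return max(affordable, key=lambda v: v[1])
    # Nothing fits: fall back to the lowest bitrate rather than stalling.
    return min(variants, key=lambda v: v[1])


for throughput in (10_000_000, 4_500_000, 500_000):
    print(throughput, "->", choose_variant(VARIANTS, throughput)[0])
```

A player would typically re-run this choice after each downloaded segment, which is why shorter segments react faster to changing network conditions.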
Comparison with HTTP Range
Aspect | HLS (M3U8) | HTTP Range |
---|---|---|
Format | Segmented .ts files | Single MP4 file |
Pros | • Adaptive bitrate • Broad compatibility • Native live support | • Low latency • Storage efficiency • Precise seeking |
Cons | • Higher latency • Storage overhead | • No ABR • Limited compatibility |
Use Cases | • Live streaming • Multi-device | • VOD • Low-latency needs |
Complexity | High (transcoding system) | Low (server support) |