Zoom has become a central platform for meetings, webinars, and collaboration. Developers building meeting bots, analytics tools, or AI assistants often need direct access to meeting streams.
Instead of relying only on finished recordings, they use Zoom raw data from the Meeting SDK. However, raw video streams are not directly usable. Developers must convert them into common formats such as PNG images or MP4 videos.
This guide explains how Zoom raw data works and how to convert it into PNG frames and MP4 videos using practical tools like FFmpeg.
Understanding Zoom Raw Data
Zoom provides access to real-time raw audio and video streams through its SDK. This allows developers to process meeting data programmatically for tasks like transcription, video analytics, or AI meeting assistants.
When you enable raw data access in the Zoom Meeting SDK, the system delivers:
- Video data in I420 / YUV420p format
- Audio data in PCM format
These formats are low-level representations of video and audio. They are efficient for processing but difficult to use directly in typical applications.
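For example, raw video frames are delivered as YUV420p, which stores each frame as three separate planes: a full-resolution Y (luma) plane followed by U and V (chroma) planes at half the width and half the height. A frame with even dimensions therefore occupies width × height × 1.5 bytes of numerical pixel data rather than a standard image file.
Because a frame written to disk in this form carries no header describing its resolution or pixel format, developers must supply details such as width and height during conversion.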
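To make the plane layout concrete, here is a minimal Python sketch that computes where each plane starts in one tightly packed I420 frame. The function name `i420_layout` is made up for illustration, and the sketch assumes rows carry no padding.

```python
def i420_layout(width: int, height: int) -> dict:
    """Return byte offsets and sizes of the Y, U and V planes of one packed I420 frame."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Chroma planes are subsampled by 2 in both directions; odd sizes round up.
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    y_size = width * height
    c_size = chroma_w * chroma_h
    return {
        "y_offset": 0, "y_size": y_size,
        "u_offset": y_size, "u_size": c_size,
        "v_offset": y_size + c_size, "v_size": c_size,
        "frame_size": y_size + 2 * c_size,
    }


layout = i420_layout(640, 360)
print(layout["frame_size"])  # 345600
```

The `frame_size` value is the number of bytes to read per frame when walking through a raw dump.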
Why Convert Zoom Raw Data?
Raw Zoom data is powerful but not user-friendly. Converting it into standard formats enables many practical applications.
Common use cases include:
- Creating video recordings from meeting streams
- Capturing frame images for thumbnails or analytics
- Performing video processing with computer vision
- Building AI meeting assistants or transcription tools
By converting raw frames into PNG images or MP4 videos, developers can store, analyze, or display meeting data using widely supported formats.
Tools Required for Conversion
The most common tool for converting Zoom raw data is FFmpeg, an open-source multimedia framework for processing audio and video streams.
FFmpeg can:
- Convert raw frames into images
- Encode video files
- Combine image sequences into video
- Process multimedia streams in real time
Because Zoom raw frames come as YUV data, FFmpeg is particularly useful for transforming them into RGB-based formats like PNG or encoding them into MP4 videos.
How to Convert Zoom Raw Data into PNG
The simplest way to convert a raw Zoom frame into an image is by using FFmpeg.
Raw frames captured from the Zoom SDK are typically saved as YUV420p files. Each file holds uncompressed pixel data with no header.
You can convert a single frame to PNG using a command like this:
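ffmpeg \
-f rawvideo \
-s WIDTHxHEIGHT \
-pix_fmt yuv420p \
-i input.yuv \
-frames:v 1 \
-f image2 \
-pix_fmt rgb24 \
output.png
Explanation of Parameters
- -f rawvideo Tells FFmpeg to read the input as headerless raw video, regardless of the file extension.
- -s WIDTHxHEIGHT Defines the resolution of the raw frame.
- -pix_fmt yuv420p Specifies that the input frame uses the YUV420p pixel format.
- -i input.yuv The raw frame input file.
- -frames:v 1 Writes only the first frame, in case the file contains more than one.
- -f image2 Selects the image file muxer for the output.
- -pix_fmt rgb24 Converts the frame to 24-bit RGB before it is written as a PNG.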
Because raw YUV files do not contain headers or metadata, the resolution must always be specified manually.
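If you call FFmpeg from application code, it can help to build the argument list in one place. The following Python sketch does that for the single-frame conversion. The helper names `build_png_command` and `convert_frame_to_png` are hypothetical. The sketch passes -f rawvideo and -frames:v 1 explicitly so the input is always read as headerless video and only one image is written.

```python
import subprocess


def build_png_command(width, height, src, dst):
    """Build the FFmpeg argument list that turns one raw I420 frame into a PNG."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return [
        "ffmpeg",
        "-f", "rawvideo",            # input is headerless raw video
        "-s", f"{width}x{height}",   # resolution must be supplied by the caller
        "-pix_fmt", "yuv420p",       # input pixel format
        "-i", src,
        "-frames:v", "1",            # write a single frame
        "-f", "image2",
        "-pix_fmt", "rgb24",         # output pixel format
        dst,
    ]


def convert_frame_to_png(width, height, src, dst):
    """Run FFmpeg; raises subprocess.CalledProcessError if FFmpeg fails."""
    subprocess.run(build_png_command(width, height, src, dst), check=True)


print(" ".join(build_png_command(640, 360, "input.yuv", "output.png")))
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.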
Result
The output file will be a PNG image representing the captured video frame from the Zoom meeting.
This technique is useful when:
- capturing screenshots from meetings
- building video thumbnails
- performing image analysis on meeting content
How to Convert Zoom Raw Data into MP4
To generate a complete video recording, you need to combine multiple raw frames into a video sequence.
Typically, Zoom bots or applications save raw frames sequentially, with zero-padded numbers so that the files sort in capture order:
frame-0001.yuv
frame-0002.yuv
frame-0003.yuv
These frames can be encoded into an MP4 video using FFmpeg.
Example command:
ffmpeg \
-framerate 30 \
-s WIDTHxHEIGHT \
-pix_fmt yuv420p \
-pattern_type glob \
-i "*.yuv" \
-c:v libx264 \
output.mp4
Explanation
- -framerate 30 Sets the input frame rate, so every 30 frames become one second of video.
- -pattern_type glob -i "*.yuv" Loads all YUV frames in the directory that match the pattern, in alphabetical order.
- -c:v libx264 Encodes the video using the H.264 codec.
H.264 encoding is commonly used because it is compatible with the MP4 container format and widely supported across devices and browsers.
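An alternative to reading many small files is to append the frames to one raw stream and encode that. The Python sketch below shows the idea under the assumption that every frame file has the same resolution. The helper names `concat_frames` and `build_mp4_command` are invented for this example.

```python
import shutil


def concat_frames(frame_paths, out_path):
    """Append each raw frame file, in the given order, to one raw stream file."""
    with open(out_path, "wb") as out:
        for path in frame_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


def build_mp4_command(width, height, fps, raw_path, mp4_path):
    """Build the FFmpeg argument list that encodes a raw I420 stream as H.264 in MP4."""
    return [
        "ffmpeg",
        "-f", "rawvideo",
        "-framerate", str(fps),
        "-s", f"{width}x{height}",
        "-pix_fmt", "yuv420p",
        "-i", str(raw_path),
        "-c:v", "libx264",
        str(mp4_path),
    ]


print(" ".join(build_mp4_command(640, 360, 30, "all.yuv", "output.mp4")))
```

The caller decides the order of `frame_paths`, so the result does not depend on how a glob pattern happens to sort the file names.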
Handling Common Conversion Challenges
Developers frequently encounter issues when working with Zoom raw video.
1. Incorrect Frame Dimensions
If width and height are incorrect, the image may appear distorted or corrupted.
Always confirm the frame resolution provided by the SDK.
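One inexpensive sanity check is to compare the size of a raw file with the size implied by the assumed resolution. Here is a small Python sketch of that check. `count_frames` is a hypothetical helper, and it assumes tightly packed I420 data.

```python
def count_frames(file_size, width, height):
    """Return how many whole I420 frames fit in file_size bytes, or raise on a mismatch."""
    frame_size = width * height + 2 * ((width + 1) // 2) * ((height + 1) // 2)
    frames, remainder = divmod(file_size, frame_size)
    if frames == 0 or remainder != 0:
        raise ValueError(
            f"{file_size} bytes is not a whole number of {width}x{height} "
            f"I420 frames ({frame_size} bytes each)"
        )
    return frames


print(count_frames(345600 * 3, 640, 360))  # 3
```

A passing check does not prove the dimensions are right, since different resolutions can share a frame size, but a failing check reliably flags a wrong width or height.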
2. Plane Alignment Issues
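Because YUV420p stores Y, U, and V data separately, incorrect handling may produce artifacts such as stripes or color distortion. Swapping the U and V planes, for instance, shifts the colors, and writing a plane with the wrong row length shears the image into diagonal stripes.
Ensure your conversion pipeline writes the planes in Y, U, V order, each at its correct size.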
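As an illustration, the Python sketch below assembles one packed I420 frame from three separate plane buffers and drops any per-row padding. The names `pack_plane`, `pack_i420`, `y_stride`, and `c_stride` are hypothetical, and whether your source buffers contain row padding at all is an assumption to verify against the SDK you use.

```python
def pack_plane(buf, width, height, stride):
    """Copy `height` rows of `width` bytes from a buffer whose rows are `stride` bytes apart."""
    if stride < width:
        raise ValueError("stride must be at least the row width")
    if len(buf) < stride * (height - 1) + width:
        raise ValueError("buffer is too small for the given dimensions")
    if stride == width:
        return bytes(buf[: width * height])
    return b"".join(bytes(buf[r * stride : r * stride + width]) for r in range(height))


def pack_i420(y, u, v, width, height, y_stride=None, c_stride=None):
    """Return one tightly packed I420 frame: Y plane, then U, then V."""
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    return (
        pack_plane(y, width, height, y_stride or width)
        + pack_plane(u, chroma_w, chroma_h, c_stride or chroma_w)
        + pack_plane(v, chroma_w, chroma_h, c_stride or chroma_w)
    )


# A 4x2 frame whose Y rows are padded to 6 bytes ("xx" is padding).
frame = pack_i420(b"AAAAxxBBBBxx", b"UU", b"VV", 4, 2, y_stride=6)
print(frame)  # b'AAAABBBBUUVV'
```

The packed result is what FFmpeg expects when it is told the input is yuv420p at the given width and height.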
3. Frame Ordering
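When creating MP4 videos from frames, filenames must sort in capture order. A glob pattern expands alphabetically, so frame-10.yuv is read before frame-2.yuv unless the numbers are zero-padded (frame-02.yuv, frame-10.yuv). Otherwise, the resulting video may appear out of sequence.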
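If the files already exist with unpadded numbers, you can sort them numerically in code instead. This Python sketch shows the difference between plain string sorting and a numeric-aware sort. `natural_key` and `padded_name` are names made up for the example.

```python
import re


def natural_key(name):
    """Split a name into text and number parts so that 2 sorts before 10."""
    return [int(part) if part.isdigit() else part for part in re.split(r"(\d+)", name)]


def padded_name(index, digits=4):
    """Return a zero-padded frame file name such as frame-0007.yuv."""
    return f"frame-{index:0{digits}d}.yuv"


names = ["frame-10.yuv", "frame-2.yuv", "frame-1.yuv"]
print(sorted(names))                   # ['frame-1.yuv', 'frame-10.yuv', 'frame-2.yuv']
print(sorted(names, key=natural_key))  # ['frame-1.yuv', 'frame-2.yuv', 'frame-10.yuv']
print(padded_name(7))                  # frame-0007.yuv
```

Writing files with `padded_name` from the start keeps both shell globs and plain sorts in capture order.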
Real-Time Conversion of Zoom Raw Data
The examples above assume you save raw frames to disk before processing them. However, many applications require real-time processing.
In such cases, developers often use streaming media frameworks like GStreamer or integrate FFmpeg directly into their application pipeline.
Real-time conversion is useful for:
- live video analytics
- real-time meeting transcription
- streaming AI processing
- live recording systems
Instead of writing frames to files, applications send raw data directly into the encoding pipeline.
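A minimal sketch of that approach in Python starts FFmpeg as a child process and writes each packed I420 frame to its standard input. The function names are hypothetical, and the sketch assumes a fixed resolution and frame rate for the whole session.

```python
import io
import subprocess


def build_pipe_command(width, height, fps, mp4_path):
    """FFmpeg arguments for encoding raw I420 frames read from standard input."""
    return [
        "ffmpeg",
        "-f", "rawvideo",
        "-pix_fmt", "yuv420p",
        "-s", f"{width}x{height}",
        "-framerate", str(fps),
        "-i", "pipe:0",
        "-c:v", "libx264",
        mp4_path,
    ]


def write_frames(frames, sink, width, height):
    """Write packed I420 frames to a binary sink, rejecting frames of the wrong size."""
    frame_size = width * height + 2 * ((width + 1) // 2) * ((height + 1) // 2)
    written = 0
    for frame in frames:
        if len(frame) != frame_size:
            raise ValueError(f"expected {frame_size} bytes, got {len(frame)}")
        sink.write(frame)
        written += 1
    return written


def encode_live(frames, width, height, fps, mp4_path):
    """Pipe frames into FFmpeg as they arrive; returns the number of frames sent."""
    proc = subprocess.Popen(build_pipe_command(width, height, fps, mp4_path),
                            stdin=subprocess.PIPE)
    try:
        return write_frames(frames, proc.stdin, width, height)
    finally:
        proc.stdin.close()  # signals end of input so FFmpeg can finish the file
        proc.wait()


# Demonstration with an in-memory sink instead of a real FFmpeg process.
sink = io.BytesIO()
print(write_frames([bytes(6), bytes(6)], sink, 2, 2))  # 2
```

`frames` can be any iterable, including a generator fed by the SDK's frame callback, so nothing needs to be buffered on disk.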
Best Practices for Working with Zoom Raw Data
To ensure reliable processing and video quality, follow these best practices:
Store Raw Frames Efficiently
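Raw video data can be large, since every frame is stored uncompressed. Encode frames as early as possible, or keep raw files only in temporary storage and delete them once they have been encoded.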
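To get a feel for the numbers, the following Python sketch estimates the uncompressed data rate of an I420 stream. The function name is hypothetical, and 1280x720 at 30 frames per second is only an example input.

```python
def raw_video_bytes_per_second(width, height, fps):
    """Bytes per second of tightly packed I420 video at the given size and frame rate."""
    frame_size = width * height + 2 * ((width + 1) // 2) * ((height + 1) // 2)
    return frame_size * fps


rate = raw_video_bytes_per_second(1280, 720, 30)
print(rate)                            # 41472000
print(round(rate * 3600 / 1e9, 1))     # about 149.3 (GB per hour)
```

At that rate, even a short meeting fills a disk quickly if raw frames are kept.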
Synchronize Audio and Video
Zoom raw streams deliver video and audio separately. Applications must synchronize them before producing final recordings.
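One simple way to approach this is to record when the first audio and video samples arrive and to derive durations from byte counts. The Python sketch below assumes 16-bit PCM and timestamps taken from a single clock in your own application. The function names and the 32 kHz mono example are illustrative assumptions, not values taken from the SDK.

```python
def pcm_duration_seconds(num_bytes, sample_rate, channels=1, bytes_per_sample=2):
    """Duration of a PCM buffer, derived from its size and format."""
    return num_bytes / (sample_rate * channels * bytes_per_sample)


def audio_delay_seconds(first_video_ts_ms, first_audio_ts_ms):
    """Positive result: audio started after video and should be delayed by this much."""
    return (first_audio_ts_ms - first_video_ts_ms) / 1000.0


print(pcm_duration_seconds(64000, 32000))  # 1.0
print(audio_delay_seconds(1000, 1250))     # 0.25
```

The resulting delay can then be applied when the two streams are combined into the final recording.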
Use Hardware Acceleration
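For large-scale processing, hardware encoders can significantly improve performance. FFmpeg exposes several, such as h264_nvenc (NVIDIA GPUs), h264_qsv (Intel Quick Sync), and h264_videotoolbox (macOS), depending on how it was built.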
Monitor Frame Rate
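Maintaining consistent frame timing ensures smooth video output. If frames arrive at an irregular rate but are encoded at a fixed rate, playback speeds up or slows down and drifts out of sync with the audio.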
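A lightweight way to monitor this is to log an arrival timestamp for each frame and inspect the intervals. The Python sketch below assumes millisecond timestamps recorded by your own application. `measured_fps` and `find_gaps` are hypothetical helpers, and the 1.5x tolerance is an arbitrary starting point.

```python
def measured_fps(timestamps_ms):
    """Average frame rate over a list of increasing millisecond timestamps."""
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two timestamps")
    span = timestamps_ms[-1] - timestamps_ms[0]
    if span <= 0:
        raise ValueError("timestamps must increase")
    return (len(timestamps_ms) - 1) * 1000 / span


def find_gaps(timestamps_ms, expected_fps, tolerance=1.5):
    """Indices of frames that arrived more than tolerance x the expected interval late."""
    expected_ms = 1000 / expected_fps
    return [
        i for i in range(1, len(timestamps_ms))
        if timestamps_ms[i] - timestamps_ms[i - 1] > tolerance * expected_ms
    ]


print(measured_fps([0, 33, 67, 100]))          # 30.0
print(find_gaps([0, 33, 67, 200, 233], 30))    # [3]
```

If the measured rate differs from the rate passed to the encoder, or gaps appear often, the output will not play back at real speed.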
Conclusion
Zoom raw data provides powerful access to real-time meeting streams, enabling developers to build advanced meeting applications, AI assistants, and analytics tools. However, raw video frames in formats like YUV420p require conversion before they can be used in standard media workflows.
By using tools like FFmpeg, developers can convert raw frames into PNG images for analysis or MP4 videos for playback and storage. Understanding how to handle raw formats, frame metadata, and encoding pipelines is essential for building reliable video processing systems.
As video collaboration continues to grow, mastering these techniques allows developers to unlock the full potential of Zoom meeting data.