H.264 encoding guide

This article briefly describes what H.264 is and how to get H.264 encoding support for Avidemux. It also summarizes and explains the x264 options available in Avidemux, so it can be considered a (simple) guide to the encoder.

Overview

H.264, also known as “MPEG-4 Part 10” or “MPEG-4 Advanced Video Coding” (AVC), is a digital video compression standard noted for achieving very high data compression. While H.264 generally requires more CPU power for playback than video encoded with the older MPEG-4 Part 2 standard (as used by Xvid or DivX), its compression efficiency is much better. That means: with H.264/AVC you can get significantly better quality at the same file size, or the same quality at a significantly smaller file size (compared to MPEG-4 ASP). H.264 compresses much more efficiently than MPEG-4 Part 2, and its advantage over MPEG-2 is even greater.

More detailed information about H.264 can be found in the corresponding Wikipedia article. A comparison of various H.264 encoders against MPEG-4 Part-2, MPEG-2 and other video formats can be found at http://mirror05.x264.nl/Dark/website/compare.html.

x264 introduction

While Avidemux uses the “built-in” libavcodec from FFmpeg for H.264 decoding, it needs an additional (external) library for H.264 encoding. For this purpose Avidemux uses x264, a free library for encoding H.264/AVC video streams. The code was written from scratch by Laurent Aimar, Loren Merritt, Eric Petit (OS X), Min Chen (VfW/asm), Justin Clay (VfW), Måns Rullgård, Radek Czyz, Christian Heine (asm), Alex Izvorski (asm), and Alex Wright. It is released under the terms of the GPL. To clarify: the encoder library is called x264, while the compression standard it implements is called H.264 (or MPEG-4 AVC). In other words, the x264 encoder software creates H.264/AVC video. It should be noted that x264, while being “free” software, can compete with commercial H.264 encoders in terms of quality and speed. Major companies in the video business, such as YouTube and Facebook, are known to use the x264 encoder.

Get x264 for Avidemux

If x264 is not available in your version of Avidemux, there is a guide on how to download and compile x264 by yourself. It is in the Compiling x264 section.

After you compile x264, you will have to re-compile Avidemux to build in the x264 feature. There is also a guide on how to do this in the Compiling Avidemux section.

Note that if you are using the pre-compiled Avidemux builds for Microsoft Windows, the required x264 library ships with the installer, so no additional software is required. Things like “Codec Packs”, “VFW Codecs” or “DirectShow Filters” will not work with Avidemux! The latest builds of the x264 library for Avidemux can be found in the libx264 Git builds thread (make sure you navigate to the very last post!). These builds are usually newer, and less tested, than the ones that ship with Avidemux.

x264 options available in Avidemux

Avidemux contains most of the options available in the x264 library. For options not yet available, see the “Unavailable” section in this article.

General

Rate Control

Macroblock-Tree Rate Control

Multi-Threading

Motion

Motion Estimation

Motion Vector

Prediction

Partition

Frame

Frame Encoding

B-Frames

I-Frames

Analysis

Analysis Configuration

Psycho-visually optimized RDO & Trellis

The human eye doesn't just want the image to look similar to the original, it wants the image to have similar complexity. Therefore, we would rather see a somewhat distorted but still detailed block than a non-distorted but completely blurred block. The result is a bias towards a detailed and/or grainy output image, a bit like Xvid, except that it's actual detail rather than ugly blocking (see http://x264dev.multimedia.cx/?p=164 and http://forum.doom9.org/showpost.php?p=1144270&postcount=1 for more info). The purpose of Psy RDO is to keep the complexity of an encoded block similar to the complexity of the original block. This way Psy RDO produces an image that looks much sharper and more detailed in many cases (compared to encoding without Psy RDO). It also helps greatly to preserve film grain. Please note that Psy RDO will inherently hurt metrics such as PSNR and SSIM. As soon as psycho-visual optimizations are involved, the classical metrics become useless! Also note that Psy RDO works with RDO modes only: if Partition Decision is set to 6 (or higher), Psy RDO will be on by default; otherwise it will be disabled.

In addition to Psy RDO there is now also Psy Trellis. This is still considered an “experimental” feature and is disabled by default, but it seems to help greatly in retaining textures in the video. Note that Psy Trellis is based on Trellis quantization. Consequently it will only be effective with Trellis quantization enabled as well (Trellis 1 is sufficient now, but 2 will be more effective).

Luma Quantization Deadzones

Quantization Matrix

Quantizer

Quantizer Control

Quantizer Curve Compression

Adaptive Quantization

Adaptive Quantization (AQ) allows each macroblock within the frame to choose a different quantizer, instead of assigning the same quantizer to all blocks within the frame. The purpose of AQ is to move more bits into “flat” macroblocks. This is done by adaptively lowering the quantizers of certain blocks (and raising the quantizers of other blocks). Without AQ, flat and dark areas of the image tend to show ugly blocking or banding. Thanks to the new AQ algorithm, blocking and banding can be greatly reduced. With AQ enabled, you can expect a significant gain in overall image quality. Especially in dark scenes and scenes with “flat” backgrounds (sky, grass, walls, etc.), much more detail can be preserved. AQ seems to perform less efficiently with “Animation” material than it does with “Film” material, but it still helps to prevent banding. Note that AQ can be used with the bitrate-based modes (Single-Pass and Two-Pass) as well as with the CRF mode. It cannot be used with the QP mode! That's because QP mode uses constant quantizers by definition, which is one of the reasons why QP mode should generally be avoided nowadays.
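The idea of per-block quantizer offsets can be illustrated with a small toy model. The sketch below is not x264's actual AQ formula: the log-variance rule, the function names and the `strength` parameter are assumptions made purely for illustration. It only shows the direction of the adjustment: flat blocks get a lower quantizer, busy blocks a higher one.

```python
import math

def block_variance(block):
    """Variance of the luma samples in one block (a flat list of ints)."""
    mean = sum(block) / len(block)
    return sum((s - mean) ** 2 for s in block) / len(block)

def aq_offsets(blocks, strength=1.0):
    """Return one quantizer offset per block.

    Toy rule (an assumption, not x264's formula): compare each block's
    log-variance with the frame average. Blocks that are flatter than
    average get a negative offset (finer quantizer, more bits), busier
    blocks get a positive offset.
    """
    log_var = [math.log2(block_variance(b) + 1.0) for b in blocks]
    average = sum(log_var) / len(log_var)
    return [strength * (lv - average) for lv in log_var]

flat = [100, 101, 100, 101] * 4   # nearly uniform block, e.g. sky
busy = [0, 255, 30, 200] * 4      # high-contrast texture
offsets = aq_offsets([flat, busy])

base_quantizer = 26               # hypothetical frame-level quantizer
print([round(base_quantizer + o, 1) for o in offsets])
```

With `strength=0` every offset is zero, which corresponds to the constant-quantizer behaviour that AQ is meant to replace.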

Advanced

Video Buffer Verifier

VBV (Video Buffering Verifier) defines a specific buffering model. In that model the decoder (player) reads the input data from a buffer. That buffer has a limited size, and it is filled at a limited data rate. VBV makes sure that the buffer will never run out of data, i.e. it makes sure that there is always enough data left in the buffer to decode the next frame. To achieve this, VBV imposes additional bitrate and buffering constraints on the encoder. It is highly recommended not to use VBV unless you can't get around it. VBV may hurt the video quality, but it will never improve it! Unfortunately hardware players (including mobile devices) may need VBV for proper playback. You will have to look up the particular VBV requirements for each device individually. In particular, Blu-ray has strict VBV requirements. Note that x264's VBV implementation now works just fine with both 1-Pass and 2-Pass modes. There's no need to use 2-Pass mode for VBV anymore. (See http://en.wikipedia.org/wiki/Video_buffering_verifier for details about VBV.)
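The buffering model described above can be sketched as a simple simulation. This is a simplified illustration, not x264's rate control: the function name, the `initial_fill` parameter and the handling of a stalled decoder are assumptions. The buffer is refilled at a fixed rate, and each frame is removed from it at its decoding time.

```python
def vbv_underflows(frame_bits, maxrate, bufsize, fps, initial_fill=0.9):
    """Return the indices of frames that are not fully buffered in time.

    frame_bits   -- coded size of each frame, in bits
    maxrate      -- rate at which the buffer is filled, in bits per second
    bufsize      -- buffer capacity, in bits
    fps          -- frames decoded per second
    initial_fill -- fraction of the buffer filled before playback starts
                    (illustrative assumption)
    """
    fullness = bufsize * initial_fill
    refill_per_frame = maxrate / fps
    underflows = []
    for index, bits in enumerate(frame_bits):
        if bits > fullness:
            # The frame is larger than what has arrived so far.
            underflows.append(index)
            fullness = 0.0  # simplification: the stalled decoder drains the buffer
        else:
            fullness -= bits
        # The buffer keeps filling until the next frame is due, up to its capacity.
        fullness = min(bufsize, fullness + refill_per_frame)
    return underflows

# Hypothetical stream: 1 Mbit/s fill rate, 500 kbit buffer, 25 fps.
steady = [40_000] * 50
bursty = [400_000, 40_000, 300_000]
print(vbv_underflows(steady, 1_000_000, 500_000, 25))
print(vbv_underflows(bursty, 1_000_000, 500_000, 25))
```

An encoder that honours VBV has to size its frames so that a check of this kind never reports an underflow, which is why the constraint can cost quality on complex scenes.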

Slicing

H.264 allows each frame to be segmented into several parts. These parts are called “slices”. The advantage of using multiple slices (per frame) is that the slices can be processed independently and in parallel. This allows easy multi-threading implementations in H.264 encoders and decoders. Unfortunately, using multiple slices hurts compression efficiency: the more slices are used, the worse it gets. Therefore you should not use slices if you don't have to. But if your H.264 decoder uses slice-based multi-threading (i.e. multiple slices are decoded in parallel), then multi-threading will only be used if the video was encoded with multiple slices. Fortunately most software decoders do not require slices, because they use frame-based multi-threading (i.e. multiple frames are decoded in parallel). Hardware decoders may require slices though. In particular, the Blu-ray specs say that at least 4 slices must be used.

Zones

Zones can be used to manually assign a lower or higher bitrate to a certain section of the video (e.g. to enforce a lower bitrate for the ending credits). There are two modes to control the bitrate of a zone: with a “Bitrate Factor” you change the bitrate relative to the encoder's decision, and with a “Quantizer” you override the encoder's decision with a constant quantizer value.
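For readers who also use the x264 command-line encoder: its `--zones` option takes zones of the form `<start frame>,<end frame>,<option>`, separated by `/`, where the option is either `b=<float>` (bitrate multiplier) or `q=<integer>` (forced quantizer). The helper below is a small sketch that builds such a string; the function name and the tuple layout are made up for illustration, and how Avidemux stores zones internally is not covered here.

```python
def format_zones(zones):
    """Build a zones string in the x264 command-line format.

    zones -- iterable of (start_frame, end_frame, mode, value) tuples,
             where mode is "bitrate_factor" or "quantizer"
             (hypothetical names chosen for this sketch).
    """
    parts = []
    for start, end, mode, value in zones:
        if start > end:
            raise ValueError(f"zone start {start} is after zone end {end}")
        if mode == "bitrate_factor":
            parts.append(f"{start},{end},b={value}")
        elif mode == "quantizer":
            parts.append(f"{start},{end},q={int(value)}")
        else:
            raise ValueError(f"unknown zone mode: {mode}")
    return "/".join(parts)

# Hypothetical example: fixed quantizer for the intro,
# half the bitrate for the ending credits.
zones = [
    (0, 999, "quantizer", 20),
    (140000, 145000, "bitrate_factor", 0.5),
]
print(format_zones(zones))  # 0,999,q=20/140000,145000,b=0.5
```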

Output

Output Settings

Pixel Aspect Ratio

This setting defines the “Pixel Aspect Ratio” (PAR) of the video. Do not change the default value of 1:1 (aka “Square Pixels”) unless you are encoding anamorphic video! If you are encoding anamorphic material and you want to keep it anamorphic, you will have to set the correct PAR value. Otherwise your video will be displayed with the wrong aspect ratio. If you have an anamorphic source and you want to convert it to “Square Pixels” (PAR = 1:1), you must invoke the Resize filter and resize the video accordingly. Note that “Pixel Aspect Ratio” is not the same as “Display Aspect Ratio” (DAR). The DAR can be calculated from the PAR using this formula: DAR = Width/Height * PAR. For example: 720/576 * 64/45 = 16/9. The advantage of working with PAR values is that the PAR of a video won't change when cropping the video, while the DAR most likely will. The following PAR options are available:
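As a quick check of the DAR formula given above, here is a minimal sketch using exact fractions. The function names are made up for this example; only the formula DAR = Width/Height * PAR comes from the text.

```python
from fractions import Fraction

def display_aspect_ratio(width, height, par):
    """DAR = Width/Height * PAR, computed exactly."""
    return Fraction(width, height) * par

def square_pixel_width(height, dar):
    """Width needed to show the same DAR with square pixels (PAR 1:1)."""
    return round(height * dar)

par = Fraction(64, 45)

# The example from the text: 720x576 with PAR 64:45.
print(display_aspect_ratio(720, 576, par))   # 16/9

# Cropping 8 pixels from each side keeps the PAR but changes the DAR.
print(display_aspect_ratio(704, 576, par))   # 704/405

# Converting to square pixels at the same height requires a resize.
print(square_pixel_width(576, Fraction(16, 9)))   # 1024
```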

Video Usability Information

These settings are only suggestions for the playback equipment. Use them at your own risk!

Unavailable x264 options in Avidemux

Obsolete x264 options

H.264/AVC Profiles and Levels

The H.264/AVC specifications define a number of different profiles. Each profile specifies which features of H.264 are allowed (or not allowed). If you want your H.264 video stream to be compliant with a certain profile, you may only enable features allowed in that profile. Profiles are needed to make sure your video file will play properly on a given decoder. For example, a “Main” profile compliant video will play fine on every “Main” profile capable decoder/player. When working with the x264 encoder, there are basically two profiles you have to take care of: the “Main” profile and the “High” profile. Note that x264 is missing the Error Resilience feature from the “Baseline” profile as well as the Interlacing Support from the “Extended” profile. If you want to play your video on software players, you don't need to care about profiles that much. The H.264 decoder from “libavcodec”, which is used in MPlayer, VLC Player, ffdshow and many more, supports all of x264's features, including the “High” and “Predictive Lossless” profile features. The same goes for proprietary decoders, such as CoreAVC. If you are targeting a hardware player, however, profiles are very important, as hardware players are very restrictive about which profiles they support.

In addition to the profiles, the H.264/AVC specifications also define a number of levels. While profiles define which compression features of H.264 may (or may not) be used, the levels put further restrictions on other properties of the video. These restrictions include the maximum resolution, the maximum bitrate, the maximum framerate (for a given resolution) and the maximum number of reference frames (indirectly limited through MaxDPB). In order to play your H.264 video on a specific hardware player, that player must not only support your video's profile, but also your video's level (or a higher one). Again, software players usually don't have such restrictions, as long as your CPU is powerful enough.

Note: The common notation for Profiles and Levels is “Profile@Level”, for example High@4.1. Furthermore, there is no way to directly encode your video to a specific level and/or profile. If you want your video to comply with a certain profile/level, you must choose the encoder settings accordingly. Presets may be helpful for finding the correct settings. It may still be necessary to resize your video and/or change the framerate.

List of all H.264/AVC Profiles

| Feature | Baseline | Extended | Main | High | High 10 | High 4:2:2 | High 4:4:4 Predictive |
| --- | --- | --- | --- | --- | --- | --- | --- |
| I and P Slices | YES | YES | YES | YES | YES | YES | YES |
| B Slices | NO | YES | YES | YES | YES | YES | YES |
| SI and SP Slices | NO | YES | NO | NO | NO | NO | NO |
| Multiple Reference Frames | YES | YES | YES | YES | YES | YES | YES |
| In-Loop Deblocking Filter | YES | YES | YES | YES | YES | YES | YES |
| CAVLC Entropy Coding | YES | YES | YES | YES | YES | YES | YES |
| CABAC Entropy Coding | NO | NO | YES | YES | YES | YES | YES |
| Flexible Macroblock Ordering (FMO) | YES | YES | NO | NO | NO | NO | NO |
| Arbitrary Slice Ordering (ASO) | YES | YES | NO | NO | NO | NO | NO |
| Redundant Slices (RS) | YES | YES | NO | NO | NO | NO | NO |
| Data Partitioning | NO | YES | NO | NO | NO | NO | NO |
| Interlaced Coding (PicAFF, MBAFF) | NO | YES | YES | YES | YES | YES | YES |
| 4:2:0 Chroma Format | YES | YES | YES | YES | YES | YES | YES |
| Monochrome Video Format (4:0:0) | NO | NO | NO | YES | YES | YES | YES |
| 4:2:2 Chroma Format | NO | NO | NO | NO | NO | YES | YES |
| 4:4:4 Chroma Format | NO | NO | NO | NO | NO | NO | YES |
| 8 Bit Sample Depth | YES | YES | YES | YES | YES | YES | YES |
| 9 and 10 Bit Sample Depth | NO | NO | NO | NO | YES | YES | YES |
| 11 to 14 Bit Sample Depth | NO | NO | NO | NO | NO | NO | YES |
| 8×8 vs. 4×4 Transform Adaptivity | NO | NO | NO | YES | YES | YES | YES |
| Quantization Scaling Matrices | NO | NO | NO | YES | YES | YES | YES |
| Separate Cb and Cr QP control | NO | NO | NO | YES | YES | YES | YES |
| Separate Color Plane Coding | NO | NO | NO | NO | NO | NO | YES |
| Predictive Lossless Coding | NO | NO | NO | NO | NO | NO | YES |

From Wikipedia, the free encyclopedia
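The profile table above can be read as a lookup: given the features a stream uses, which profiles permit all of them? The sketch below encodes a few rows of the table (copied as listed, in the same column order) and answers that question. The function and variable names are made up for this example.

```python
PROFILES = [
    "Baseline", "Extended", "Main", "High",
    "High 10", "High 4:2:2", "High 4:4:4 Predictive",
]

# One letter per profile, in PROFILES order: Y = allowed, N = not allowed.
# Only a subset of the table rows is reproduced here.
FEATURES = {
    "B Slices":                          "NYYYYYY",
    "CABAC Entropy Coding":              "NNYYYYY",
    "Interlaced Coding (PicAFF, MBAFF)": "NYYYYYY",
    "8x8 vs. 4x4 Transform Adaptivity":  "NNNYYYY",
    "Quantization Scaling Matrices":     "NNNYYYY",
    "9 and 10 Bit Sample Depth":         "NNNNYYY",
    "4:2:2 Chroma Format":               "NNNNNYY",
    "Predictive Lossless Coding":        "NNNNNNY",
}

def profiles_allowing(used_features):
    """Return the profiles that permit every feature in used_features."""
    return [
        profile
        for column, profile in enumerate(PROFILES)
        if all(FEATURES[name][column] == "Y" for name in used_features)
    ]

# B-frames plus CABAC rule out Baseline and Extended.
print(profiles_allowing(["B Slices", "CABAC Entropy Coding"]))

# Enabling the 8x8 transform requires at least the High profile.
print(profiles_allowing(["8x8 vs. 4x4 Transform Adaptivity"])[0])
```

This is why enabling a single option such as the 8×8 transform is enough to move a stream from “Main” to “High”.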

List of all H.264/AVC Levels

| Level number | Max macroblocks per second | Max frame size (macroblocks) | Max video bit rate (VCL) for Baseline, Extended and Main Profiles | Max video bit rate (VCL) for High Profile | Max video bit rate (VCL) for High 10 Profile | Max video bit rate (VCL) for High 4:2:2 and High 4:4:4 Predictive Profiles | Examples for high resolution @ frame rate (max stored frames) in Level |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1485 | 99 | 64 kbit/s | 80 kbit/s | 192 kbit/s | 256 kbit/s | 128×96@30.9 (8), 176×144@15.0 (4) |
| 1b | 1485 | 99 | 128 kbit/s | 160 kbit/s | 384 kbit/s | 512 kbit/s | 128×96@30.9 (8), 176×144@15.0 (4) |
| 1.1 | 3000 | 396 | 192 kbit/s | 240 kbit/s | 576 kbit/s | 768 kbit/s | 176×144@30.3 (9), 320×240@10.0 (3), 352×288@7.5 (2) |
| 1.2 | 6000 | 396 | 384 kbit/s | 480 kbit/s | 1152 kbit/s | 1536 kbit/s | 320×240@20.0 (7), 352×288@15.2 (6) |
| 1.3 | 11880 | 396 | 768 kbit/s | 960 kbit/s | 2304 kbit/s | 3072 kbit/s | 320×240@36.0 (7), 352×288@30.0 (6) |
| 2 | 11880 | 396 | 2 Mbit/s | 2.5 Mbit/s | 6 Mbit/s | 8 Mbit/s | 320×240@36.0 (7), 352×288@30.0 (6) |
| 2.1 | 19800 | 792 | 4 Mbit/s | 5 Mbit/s | 12 Mbit/s | 16 Mbit/s | 352×480@30.0 (7), 352×576@25.0 (6) |
| 2.2 | 20250 | 1620 | 4 Mbit/s | 5 Mbit/s | 12 Mbit/s | 16 Mbit/s | 352×480@30.7 (10), 352×576@25.6 (7), 720×480@15.0 (6), 720×576@12.5 (5) |
| 3 | 40500 | 1620 | 10 Mbit/s | 12.5 Mbit/s | 30 Mbit/s | 40 Mbit/s | 352×480@61.4 (12), 352×576@51.1 (10), 720×480@30.0 (6), 720×576@25.0 (5) |
| 3.1 | 108000 | 3600 | 14 Mbit/s | 17.5 Mbit/s | 42 Mbit/s | 56 Mbit/s | 720×480@80.0 (13), 720×576@66.7 (11), 1280×720@30.0 (5) |
| 3.2 | 216000 | 5120 | 20 Mbit/s | 25 Mbit/s | 60 Mbit/s | 80 Mbit/s | 1280×720@60.0 (5), 1280×1024@42.2 (4) |
| 4 | 245760 | 8192 | 20 Mbit/s | 25 Mbit/s | 60 Mbit/s | 80 Mbit/s | 1280×720@68.3 (9), 1920×1080@30.1 (4), 2048×1024@30.0 (4) |
| 4.1 | 245760 | 8192 | 50 Mbit/s | 62.5 Mbit/s | 150 Mbit/s | 200 Mbit/s | 1280×720@68.3 (9), 1920×1080@30.1 (4), 2048×1024@30.0 (4) |
| 4.2 | 522240 | 8704 | 50 Mbit/s | 62.5 Mbit/s | 150 Mbit/s | 200 Mbit/s | 1920×1080@64.0 (4), 2048×1080@60.0 (4) |
| 5 | 589824 | 22080 | 135 Mbit/s | 168.75 Mbit/s | 405 Mbit/s | 540 Mbit/s | 1920×1080@72.3 (13), 2048×1024@72.0 (13), 2048×1080@67.8 (12), 2560×1920@30.7 (5), 3680×1536@26.7 (5) |
| 5.1 | 983040 | 36864 | 240 Mbit/s | 300 Mbit/s | 720 Mbit/s | 960 Mbit/s | 1920×1080@120.5 (16), 4096×2048@30.0 (5), 4096×2304@26.7 (5) |

From Wikipedia, the free encyclopedia

For more detailed information, please refer to “Annex A” in the official ITU-T H.264 specifications!
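The example resolutions and frame rates in the level table follow from the first two limits: a frame is divided into 16×16 macroblocks, the macroblock count must not exceed the maximum frame size, and the count multiplied by the frame rate must not exceed the maximum macroblocks per second. The sketch below checks only these two limits for a few levels copied from the table. Bit rate and stored-frame limits are deliberately ignored, and the function names are made up for this example.

```python
# level -> (max macroblocks per second, max frame size in macroblocks),
# copied from the table above for a few levels.
LEVEL_LIMITS = {
    "3":   (40500, 1620),
    "3.1": (108000, 3600),
    "4":   (245760, 8192),
    "4.1": (245760, 8192),
    "5.1": (983040, 36864),
}

def macroblocks(width, height):
    """Number of 16x16 macroblocks needed to cover a frame."""
    return ((width + 15) // 16) * ((height + 15) // 16)

def max_fps(level, width, height):
    """Highest frame rate the level allows at this resolution (0 if too large)."""
    max_mb_per_second, max_frame_size = LEVEL_LIMITS[level]
    frame_size = macroblocks(width, height)
    if frame_size > max_frame_size:
        return 0.0
    return max_mb_per_second / frame_size

def fits_level(level, width, height, fps):
    """True if resolution and frame rate stay within the level's macroblock limits."""
    return fps <= max_fps(level, width, height)

print(macroblocks(1920, 1080))                  # 8160
print(round(max_fps("4", 1920, 1080), 1))       # 30.1
print(round(max_fps("3.1", 1280, 720), 1))      # 30.0
print(fits_level("3.1", 1920, 1080, 24))        # False
```

Note that 1080 is not a multiple of 16, so a 1920×1080 frame is coded as 68 macroblock rows. This is why the table lists 30.1 rather than a higher rate for Level 4.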

GPU support

Since GPGPU has become a hot topic, people have begun asking for GPU support in Avidemux. These people need to understand that Avidemux cannot offer GPU support for H.264 encoding until GPU support is implemented in the x264 library. There is a project scheduled to add CUDA support to x264 (see http://wiki.videolan.org/SoC_x264_2009#GPU_Motion_Estimation), but there are no results yet (as of May 2009). We know that there are commercial H.264 encoders with GPU support available already. But if you look at these encoders closely, you will notice that their speed-up claims are marketing blabber. These encoders may be fast, but their quality isn't anywhere near x264's quality! Also note that marketing people tend to compare their encoders to the completely unoptimized H.264 Reference Encoder. x264 is faster than the reference encoder by several orders of magnitude, which renders these speed comparisons meaningless. x264 can run extremely fast on a CPU and scales up to at least 16 cores. So don't believe everything that marketing people claim!

IDR-frames

An IDR frame is what has traditionally been known as an I frame. Just like an I frame in MPEG-1/2 and MPEG-4 ASP, an IDR frame starts with a clean slate: all subsequent frames can only reference the IDR frame and the frames that follow it. In H.264/AVC you can also have I frames inside a GOP that are not IDR frames. These are not seekable, since the long-term references introduced in H.264/AVC allow a P frame after the I frame to reference a P frame before the I frame. Non-IDR I frames should be rare, but since they cannot be ruled out, enforcing a minimum IDR interval can help improve compression in some high-motion scenes.

Max IDR-keyframe interval indicates the maximum distance between two IDR frames. Similarly, Min IDR-keyframe interval indicates the minimum distance between two IDR frames.
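How the two intervals interact can be sketched with a toy keyframe placer. It assumes that scene-cut detection supplies candidate positions and that a scene cut arriving sooner than the minimum interval is coded as a non-IDR I frame. Both points, and all names in the code, are assumptions made for illustration rather than a description of the encoder's internals.

```python
def place_keyframes(num_frames, scene_cuts, min_interval, max_interval):
    """Return {frame_index: "IDR" or "I"} for a toy placement rule.

    - Frame 0 is always an IDR frame.
    - An IDR frame is forced once max_interval frames have passed
      since the previous IDR frame.
    - A scene cut becomes an IDR frame if at least min_interval frames
      have passed since the previous IDR frame, otherwise a non-IDR
      I frame (assumption for this sketch).
    """
    cuts = set(scene_cuts)
    types = {0: "IDR"}
    last_idr = 0
    for frame in range(1, num_frames):
        distance = frame - last_idr
        if distance >= max_interval:
            types[frame] = "IDR"
            last_idr = frame
        elif frame in cuts:
            if distance >= min_interval:
                types[frame] = "IDR"
                last_idr = frame
            else:
                types[frame] = "I"
    return types

# Hypothetical clip: 600 frames, scene cuts at frames 10, 100 and 400.
result = place_keyframes(600, [10, 100, 400], min_interval=25, max_interval=250)
print(result)
# {0: 'IDR', 10: 'I', 100: 'IDR', 350: 'IDR', 400: 'IDR'}
```

Frame 350 has no scene cut, but it lies 250 frames after the IDR frame at 100, so the maximum interval forces an IDR frame there.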

List of References

See also