Introduction
Multimedia has always been considered a problem in GNU/Linux, especially in Fedora, which is built from freely redistributable FOSS and thus cannot include codecs for various patented multimedia formats like mp3. This series of blog posts is meant to address the issues of playing and creating various kinds of multimedia under Fedora 7.
It’s always good to start with the theory behind it. Let’s say you want to play a video you obtained somewhere. How can you do that? How can you check what is needed? How is it actually made? Look at this diagram:

You can see there that the video file is first processed by a splitter, which must know the container format in which the file is stored. The file is then split into three (but possibly more or fewer) streams, and each stream is processed on its own. So audio is decoded by an audio decoder (if available) and sent to the audio device, video is decoded by a video decoder (if available) and sent to the video device, and subtitles can either be added by the video decoder (most players do this) or processed by a special rendering library (e.g. libass), which sends them to the video device for display.
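To make the splitting step a bit more concrete, here is a toy sketch in Python. It is not a real demuxer: the packet tuples, the stream names and the `split_streams` helper are all made up for illustration, and a real splitter parses the container's binary layout and timestamps instead.

```python
from collections import defaultdict

def split_streams(packets):
    """Group interleaved (stream_name, payload) packets by stream.

    Hypothetical stand-in for a splitter: a real one reads the
    container's binary structure instead of ready-made tuples.
    """
    streams = defaultdict(list)
    for stream_name, payload in packets:
        streams[stream_name].append(payload)
    return dict(streams)

# Made-up interleaved packets, roughly as they might sit in a file.
packets = [
    ("video", "frame 1"),
    ("audio", "samples 1"),
    ("subtitles", "line 1"),
    ("video", "frame 2"),
    ("audio", "samples 2"),
]

streams = split_streams(packets)
for name, payloads in streams.items():
    # Each stream would now be handed to its own decoder, if available.
    print(name, payloads)
```

The point is only that the splitter needs to understand the container, not the data inside the packets; decoding is a separate step for each stream.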
Encoding works in much the same way. First you encode the video and audio streams with the appropriate coders and prepare the subtitle streams in the desired format, then you mix them together with a mixer, which produces the final output file stored in the desired container. Let’s have a closer look at these parts.
Containers
Containers are used to store audio and video data streams in files. If you cannot handle the container, you cannot play the video inside either. So the first thing you’ll look for when trying to play a video is whether your software can handle the container or not.
Containers can be exclusive to audio (like WAV) or can be more flexible and contain audio, video, subtitles, and some of them even more. Some of the most extensively used containers nowadays are AVI, Matroska, Ogg and OGM.
But you may ask: why are there so many of them? What are the differences? Which one should I use for my videos? Well, these questions are not easy to answer. There are containers that are meant to work only with a specific range of codecs (and thus can be optimised specially for the data inside), like MP4, which is meant mainly for MPEG-4 video, and there are containers so flexible that you can store almost anything in them, from audio, video and subtitles to fonts, like Matroska. There are formats suitable for playing locally stored files, like AVI, and there are formats designed for internet streaming, like FLV (Flash Video).
So your choice will most likely depend on what you want it for. But as you will most probably encode videos for storing on a HDD or CD/DVD, you will most probably choose between AVI, OGM and Matroska for video. For audio you usually do not need to care about these things, as most audio formats come with their own file format: if you play mp3, you have an mp3 container, and if you want FLAC, you have a FLAC container. If you want vorbis, however, you will most likely find it in an Ogg container.
Now, what are the pros and cons of AVI, OGM and Matroska? Which one should you choose? Well, if you look only at features, quality and speed, you’ll choose Matroska. OGM is, well, somewhat outside the main interest: it has some advantages compared to AVI and is open source, but it cannot beat Matroska. So if you want everyone to be able to play the file without installing any additional software (on Windows), you’ll choose AVI, since every Windows release capable of playing videos can read AVI files. The rest of the cases are handled by Matroska. Simply said, it’s better, it’s faster and it’s open.
And what’s so good about Matroska? You can do almost everything with it. You can attach as many video and audio streams as you want, the same goes for subtitles, you can attach fonts, and you can define the aspect ratio of a movie, so it can be anamorphic. Support for anamorphic video is vital if you’d like to rip DVDs. Most wide-screen DVDs are encoded so that if you displayed them pixel for pixel you would get a deformed picture. That’s because a wide-screen movie is stored in the same number of pixels as a classical 4:3 movie, keeping the vertical resolution the same.
But if you want to rip it, you need to decide: encode it as true wide-screen or leave it anamorphic? If you want it anamorphic, the container must support that, and Matroska does. If you decide to encode it as wide-screen you will lose quality, because you must resize it, either stretching it horizontally while keeping the vertical resolution the same, or shrinking it vertically while keeping the horizontal resolution the same. The first approach leads to double resizing (when you play the movie, it is usually resized again to fit the screen) and a bigger file, the second leads to loss of detail. So it’s better to use Matroska and leave it anamorphic.
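To put numbers on the two resizing options, here is a small Python sketch. It assumes a PAL wide-screen DVD frame stored as 720x576 pixels and meant to be shown at 16:9; those figures are my assumption for the example, and the calculation ignores details such as overscan.

```python
from fractions import Fraction

# Assumed example: PAL wide-screen DVD, stored as 720x576, shown at 16:9.
stored_width, stored_height = 720, 576
display_aspect = Fraction(16, 9)

# Option 1: stretch horizontally, keep the vertical resolution.
stretched_width = round(stored_height * display_aspect)

# Option 2: shrink vertically, keep the horizontal resolution.
shrunk_height = round(stored_width / display_aspect)

print(f"stretch: {stretched_width}x{stored_height}")  # stretch: 1024x576
print(f"shrink: {stored_width}x{shrunk_height}")      # shrink: 720x405
```

Under these assumptions the stretched frame has about 42% more pixels to encode than the stored one, while the shrunk frame throws away about 30% of them. Leaving the video anamorphic keeps the original 720x576 and only stores the 16:9 ratio in the container.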
Some of the other advantages of Matroska are that it supports DVD-like menus, lets you set chapters, is designed to be streamable over the internet, has high error recovery, etc.
If you are interested, you can read about the most common containers on Wikipedia on the Container format page, or look at the Comparison of container formats on the same site.
Audio Formats
Now, if you have support for the container, you need an audio codec to play the audio inside (or to encode the audio). The most common audio formats are mp3, vorbis, FLAC, WAV and wma. We can divide them further into lossy (mp3, vorbis, wma) and lossless (FLAC, WAV, wma). The lossy codecs use special algorithms to decide what you can’t hear and cut it out of the audio, so you lose some data, but most people should not notice it unless the audio is encoded at a very low bit-rate (i.e. how many bits [eight bits make one byte] are needed to store one second of audio).
If you want to play mp3 audio, you have quite a wide choice of codecs. Probably the best choice is the LAME mp3 encoder. MP3 is a good choice because nearly everyone can play it: it’s playable on modern CD players, car radios, mobile phones, PCs, etc. MP3 is also a bad choice, because its quality is worse than that of some more modern formats like vorbis. If you want mp3 at decent quality you should not go under 128 kbps (kilobits per second); a good choice is 192 kbps with VBR (variable bit-rate).
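Since the bit-rate is just bits per second, you can estimate the file size from it. The following Python sketch does that for the two bit-rates mentioned above; the four-minute song length and the `audio_size_megabytes` helper are my own assumptions for the example, and with VBR the figure is only an average.

```python
def audio_size_megabytes(bitrate_kbps, seconds):
    """Approximate size of a constant bit-rate stream in megabytes."""
    bits = bitrate_kbps * 1000 * seconds  # kilobits per second -> bits
    return bits / 8 / 1_000_000           # bits -> bytes -> megabytes

song_seconds = 4 * 60  # assumed length of an example song

for bitrate_kbps in (128, 192):
    size = audio_size_megabytes(bitrate_kbps, song_seconds)
    print(f"{bitrate_kbps} kbps: {size:.2f} MB")
```

For the assumed four-minute song this gives a bit under 4 MB at 128 kbps and a bit under 6 MB at 192 kbps, ignoring tags and other overhead.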
For wma the situation is worse, because it’s a completely closed and patented format. If you use Linux, these files will be among the hardest to play. The quality is good, but most of the files available on the internet are encoded at such low bit-rates that artifacts are more than noticeable. Some people say that it’s like hearing the music through a casserole. But it is better than mp3 at the same bit-rates. WMA can also be compressed in lossless mode, and then the quality is the same as that of the original.
The best choice for Linux users (and not only them) among lossy formats is vorbis. Its quality is similar to or better than that of most other lossy audio formats available, it’s open source and it’s included in most Linux distributions by default. Windows users, however, would need to install the codec manually, and most CD/flash players don’t support it.
For lossless compression the situation is easier. WAV has been in use for decades and is never a bad choice. FLAC, however, supports sample rates up to about 1 MHz and usually reduces the file size to about one half of the size of WAV. It is also open source. So under Linux FLAC is certainly the better choice; WAV is worth using only for compatibility reasons, because it’s bigger and can be of worse quality.
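To get a feeling for the sizes involved, here is a rough Python estimate. It assumes CD-quality PCM (44.1 kHz, 16 bits, stereo) and a four-minute song, neither of which comes from the text above, and it simply applies the "about one half" rule of thumb for FLAC; real results depend on the music.

```python
# Assumed CD-quality PCM parameters (an assumption for this example).
sample_rate_hz = 44_100
bits_per_sample = 16
channels = 2
song_seconds = 4 * 60  # assumed example length

pcm_bits_per_second = sample_rate_hz * bits_per_sample * channels
wav_megabytes = pcm_bits_per_second * song_seconds / 8 / 1_000_000
flac_megabytes = wav_megabytes / 2  # the "about one half" rule of thumb

print(f"PCM bit-rate: {pcm_bits_per_second / 1000} kbps")
print(f"WAV:  about {wav_megabytes:.1f} MB")
print(f"FLAC: about {flac_megabytes:.1f} MB")
```

So the uncompressed stream runs at roughly 1411 kbps, which is why even a halved FLAC file is still several times bigger than a lossy file of the same song.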
If you wish to read more about audio compression, you can find it on Wikipedia on the Audio compression page. A Comparison of audio codecs is available on Wikipedia as well.
Video Formats
And now we are at last at the part which is the most confusing: video formats. Why? Well, can you tell what the difference between DivX and XviD is? Do you know that XviD and DivX coded videos use the same format? Do you know that a video codec and a video format are not the same thing? Do you know what FourCC is? And there are many other questions.
So let’s start with some basics: video formats are specifications. They say how one device sends video to another, or, in the case of digital media, how the video is stored in a file. The two basic analog video formats are PAL (used in Europe) and NTSC (used in the U.S.). The difference is in fps (frames per second) and resolution. PAL video is 625-line/50 Hz, but only 576 lines are displayed on the TV. Fifty Hz means that fifty fields are displayed per second. The video is interlaced, so the fields are half-frames, which gives 25 fps. NTSC video is 525-line/60 Hz, but only 480 lines are displayed on the TV. NTSC video is interlaced as well and has 60000:1001 fields per second, which gives 30000:1001 frames per second.
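The field and frame rates above are easy to check with exact fractions. This is just a small Python sketch of the arithmetic, using the field rates given in the text.

```python
from fractions import Fraction

# Field rates as given in the text; two interlaced fields make one frame.
pal_fields = Fraction(50)
ntsc_fields = Fraction(60000, 1001)

pal_fps = pal_fields / 2
ntsc_fps = ntsc_fields / 2

print(f"PAL:  {pal_fps} fps")
print(f"NTSC: {ntsc_fps} fps = {float(ntsc_fps):.3f} fps")
```

That is where the familiar "29.97 fps" figure for NTSC comes from: it is 30000:1001 rounded to two decimals.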
Digital formats can be e.g. HDTV (1080p, i.e. 1080 lines, progressive) or compression specifications. Probably the most commonly used nowadays are MPEG-4 ASP, MPEG-4 AVC, RealVideo, Theora and WMV. You will probably know WMV, the default Windows video format, which is not very portable, and RealVideo, in which many television stations broadcast over the internet. MPEG-4 ASP is known as DivX or XviD, MPEG-4 AVC is known as H.264, and Theora is an open video format.
But formats are only specifications; to play or encode the video you need a video codec (COder and DECoder). The most widespread nowadays are DivX, XviD, WMV and RealVideo, but ffmpeg and x264 are gaining their share as well. DivX and XviD are mostly the same, as they implement the same standard (MPEG-4 ASP); the difference is in performance, i.e. how fast each can encode at the same quality setting. In the past there was also a difference in how much of the standard was implemented, but now both implement the standard more or less completely. The main problem is that the FourCC in a film’s description, which should identify the format, is used to identify the codec instead. That can confuse many video players, which then fail to play DivX-encoded video with an XviD decoder and vice versa, complaining that a decoder is missing.
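One way to picture the FourCC problem is a lookup from the tag to the format the video actually follows. The Python sketch below is only an illustration: the table lists a handful of well-known tags, the `format_for` helper is made up, and real players know many more tags and handle them in their own ways.

```python
# Illustrative table only: a few well-known FourCC tags and the format
# the tagged video follows. Real players know many more tags.
FOURCC_TO_FORMAT = {
    "DIVX": "MPEG-4 ASP",
    "DX50": "MPEG-4 ASP",
    "XVID": "MPEG-4 ASP",
    "FMP4": "MPEG-4 ASP",
    "H264": "MPEG-4 AVC",
    "X264": "MPEG-4 AVC",
    "AVC1": "MPEG-4 AVC",
}

def format_for(fourcc):
    """Return the video format for a FourCC tag, or None if unknown."""
    return FOURCC_TO_FORMAT.get(fourcc.upper())

# A DivX-tagged and an XviD-tagged file need the same kind of decoder.
print(format_for("DIVX"))
print(format_for("xvid"))
print(format_for("DIVX") == format_for("XVID"))
```

A player that matches on the tag alone treats DIVX and XVID as different things, while a decoder that maps tags to formats like this can play both.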
Here the best choice is ffmpeg, because it’s set to associate itself with all the videos it can actually decode, so it can be used instead of the DivX or XviD decoders and many others. But MPEG-4 ASP is quite an old standard and is gradually being replaced by the newer MPEG-4 AVC. Probably the best-known codec implementing this standard is x264. The standard itself is patented, but the x264 codec is open source and is one of the best implementations of the standard so far. The H.264 specification, as it is also known, is the best lossy compression available, but it’s also slower. So if you’d like to play HD video encoded in H.264, it will most probably have the same or at least similar quality compared to the original, but you’ll need a very fast machine. For DVD rips, however, it’s playable on most computers and has the best quality-to-size ratio you can get.
WMV and RealVideo are a little bit different. They are used mostly for internet streams, and the quality usually isn’t very high, but that’s due to the high compression. WMV has the great disadvantage that making it play on an OS other than Windows is usually hard. RealVideo is similar, except for the fact that you can use RealPlayer. But I myself have had rather bad experiences with RealPlayer on Fedora.
So if you want to play videos, you will play most of them with the ffmpeg codec; if you want to encode, the best choice is in most cases H.264. Theora is a possibility and it’s completely open, but it’s rather slow and has worse quality than H.264.
For more info you can go to Wikipedia, to the Video codec page or the Comparison of video codecs.
Subtitles
Well, what to say here… There are three basic types of subtitles: hard, prerendered and soft. Hard subtitles are encoded in the movie itself and thus cannot be turned off. They are usually not a good choice. Prerendered subtitles are separate video frames which are overlaid on the original video during playback. They usually lack anti-aliasing and are not used much. Soft subs are the most common ones. They’re usually a text file which contains timings, positions, styling, etc. and the subtitles themselves. The file can be separate or included in the file with the video. Some of the advantages are that they can be anti-aliased, they can be turned off, their style can be changed, they can be edited later…
Some of the most common soft subtitle formats are SUB, SRT, SSF, ASS, … In my opinion the best ones are ASS subtitles. They can be styled, they can move inside the picture, and you can do mostly the same with them as you could with hardsubs, while keeping a small size and retaining the ability to edit them later or turn them off during the movie.
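To show what "a text file with timings" means in practice, here is a minimal Python sketch that reads one cue in the SRT layout (an index line, a timing line, then the text). The cue itself is made up, and the `to_seconds` helper is only an illustration; a real parser has to cope with multiple cues, blank lines and malformed input.

```python
import re

# A made-up cue in SRT layout: index, timing line, then the text.
cue = """1
00:00:01,000 --> 00:00:04,500
Hello, Fedora!"""

TIMESTAMP = re.compile(r"(\d+):(\d+):(\d+),(\d+)")

def to_seconds(stamp):
    """Convert an SRT timestamp such as 00:00:04,500 to seconds."""
    hours, minutes, seconds, millis = map(int, TIMESTAMP.fullmatch(stamp).groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000

index, timing, text = cue.split("\n", 2)
start, end = (to_seconds(part) for part in timing.split(" --> "))

print(index, start, end, text)
```

Because the subtitle is plain text with a start and end time, a player can draw it itself, restyle it or skip it entirely, which is exactly what makes soft subs so flexible.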
More info about subtitles is on Wikipedia on the Subtitles page.
Conclusion
If you are playing a file, try to find one that uses an open container (in Linux that’s certainly an advantage), has good quality and uses formats that can be played on Linux. That means that if you get a Matroska file with H.264 and AAC/AC3 or vorbis audio, you are happy. OGM and AVI containers are worse, but possible. The rest of the MPEG family of formats (ranging from MPEG-1 through mp3 to DivX) is also usually well supported. If you rely on the official Fedora repositories, however, you must restrict yourself to Matroska/OGM, Theora and vorbis/FLAC. Maybe I’ve forgotten something, but this is certainly the most reliable.
If you’re going to encode, decide whether you want quality at all costs, whether you want only open technologies, or whether you want as much portability as possible. If you depend on quality, then the combination of Matroska, H.264 and AAC/AC3 or vorbis is the best you can have for video, and FLAC is the best you can get for audio. If you need open technologies, then use the same as for playing. If you want interoperability, you’ll need to choose something widely supported, which means AVI, MPEG-4 ASP and mp3. Another possibility is the WMV and WMA formats, but only if your target group is people with Windows.
Next time I will start with something more interesting, i.e. how to make playing movies and songs work in Fedora 7.