Tinkering with a H.261 encoder
maikmerten
On the rtcweb mailing list the struggle over which Mandatory To Implement (MTI) video codec should be chosen rages on. One camp favors H.264 ("we cannot have VP8"), the other VP8 ("we cannot have H.264"). Some propose that there should be a "safe" fallback codec anyone can implement, and H.261 is about as old and safe as it gets. H.261 was specified in the final years of the 1980s and is generally believed to have no unexpired patents left standing. Roughly speaking, this old gem of coding technology can transport CIF-resolution (352x288) video at full framerate (>= 25 fps) with (depending on your definition) acceptable quality starting in roughly the 250 to 500 kbit/s range (basically, I've witnessed quite a few Skype calls with similar perceived effective resolution, mostly limited by mediocre webcams, and I can live with that as long as the audio part is okay). From today's perspective, H.261 is very light on computation, memory, and code footprint.

H.261 is, of course, outgunned by any semi-decent more modern video codec, which can, for instance, deliver video with higher resolution at similar bitrates. None of those, however, has the luxury of patents that have expired with "as good as it can be" certainty.

People on the rtcweb list were quick to point out that having an encoder with modern encoding techniques may by itself be problematic regarding patents. Thankfully, for H.261, a public domain reference-style encoder and decoder from 1993 can be found, e.g., at http://wftp3.itu.int/av-arch/video-site/h261/PVRG_Software/P64v1.2.2.tar - so that's a nice reference on what encoding technologies were state-of-the-art in the early 1990s.

With some initial patching done by Ron Lee this old code builds quite nicely on modern platforms - and as it turns out, the encoder also produces intact video, even on x86_64 machines or on a Raspberry Pi. Quite remarkably portable C code (although not the cleanest style-wise). The original code is slow, though: it barely manages realtime encoding of 352x288 video on a 1.65 GHz netbook, and I can barely imagine having it encode video on machines from 1993! Some fun was had in making it faster (it's now about three times faster than before), and the program can now encode from and decode to YUV4MPEG2 files, which is quite a lot handier than the old mode of operation (still supported), where each frame consisted of three separate files (.Y, .U, .V).
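YUV4MPEG2 is a pleasantly simple container: a one-line plain-text stream header followed by FRAME-delimited raw planar YUV data. A minimal sketch of the bookkeeping involved (a hypothetical helper class, not code from p64; assumes 8-bit 4:2:0 sampling):

```java
// Minimal YUV4MPEG2 header bookkeeping (hypothetical helper, not
// taken from the p64 code). Assumes 8-bit 4:2:0 planar frames.
public class Y4MHeader {
    public final int width, height, fpsNum, fpsDen;

    public Y4MHeader(int width, int height, int fpsNum, int fpsDen) {
        this.width = width;
        this.height = height;
        this.fpsNum = fpsNum;
        this.fpsDen = fpsDen;
    }

    // Stream header, e.g. "YUV4MPEG2 W352 H288 F25:1 Ip A1:1 C420\n"
    public String streamHeader() {
        return String.format("YUV4MPEG2 W%d H%d F%d:%d Ip A1:1 C420\n",
                width, height, fpsNum, fpsDen);
    }

    // Each frame is prefixed by "FRAME\n" and consists of w*h luma
    // bytes plus two (w/2)*(h/2) chroma planes for 4:2:0.
    public int frameSizeBytes() {
        return width * height + 2 * (width / 2) * (height / 2);
    }
}
```

For CIF at 25 fps this gives a 152064-byte payload per frame - one reason a single .y4m file beats three raw files per frame for handling.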

For those interested, the patched up code is available at https://github.com/maikmerten/p64 - however, be aware that the original coding style (yay, global variables) is still alive and well.

So, is it "useful" to resurrect such an old codebase? Depends on the definition of "useful". For me, as long as it is fun and teaches me something, it's reasonably useful.

So is it fun to toy around with this ancient coding technology? Yes, especially as most modern codecs still follow the same overall design, but H.261 is the most basic instance of "modern" video coding and thus the easiest to grasp. Who knows, with some helpful comments here and there that old codebase could be used for teaching basic principles of video coding.
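For the curious, the core of that "same overall design" is block-based transform coding: take an 8x8 block of (motion-compensated) pixel differences, transform it with a DCT, and quantize the coefficients. A naive textbook sketch of those two steps (this is not code from p64, which uses a fast transform):

```java
// Sketch of the per-block transform step shared by H.261 and its
// descendants: an 8x8 forward DCT followed by uniform quantization.
// Naive O(n^4) textbook DCT-II, purely for illustration.
public class BlockCoder {
    static final int N = 8;

    // Plain DCT-II over an 8x8 block of pixel (difference) values.
    public static double[][] forwardDct(int[][] block) {
        double[][] out = new double[N][N];
        for (int u = 0; u < N; u++) {
            for (int v = 0; v < N; v++) {
                double sum = 0.0;
                for (int x = 0; x < N; x++)
                    for (int y = 0; y < N; y++)
                        sum += block[x][y]
                             * Math.cos((2 * x + 1) * u * Math.PI / (2.0 * N))
                             * Math.cos((2 * y + 1) * v * Math.PI / (2.0 * N));
                double cu = (u == 0) ? Math.sqrt(0.5) : 1.0;
                double cv = (v == 0) ? Math.sqrt(0.5) : 1.0;
                out[u][v] = 0.25 * cu * cv * sum;
            }
        }
        return out;
    }

    // Uniform quantizer: a larger step means fewer bits and more loss.
    public static int[][] quantize(double[][] coeffs, int step) {
        int[][] q = new int[N][N];
        for (int u = 0; u < N; u++)
            for (int v = 0; v < N; v++)
                q[u][v] = (int) Math.round(coeffs[u][v] / step);
        return q;
    }
}
```

A flat block collapses into a single DC coefficient with all AC terms at zero - which is exactly why smooth image regions compress so well under this scheme.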

Cortado 0.6.0
maikmerten
Cortado 0.6.0 is now released. Changes since 0.5.2:
  • support for Theora files with 4:4:4 and 4:2:2 sampling
  • Reinforced compatibility with Java 1.1
  • Improved support for videos with dropped frames
  • Much improved support for Kate subtitles (including menu to select subtitle stream)
  • More robust scanning for stream duration
  • Fixed Vorbis surround sound
  • Release binaries should export a subset of the HTML5 media API again
You can download source and binaries from the Cortado download page.

Cortado 0.5.2
maikmerten
Cortado 0.5.2 is out. Changes since 0.5.1:
  • make keepAspect ignorable again
  • minor optimizations in the decoder
  • buffer tweaks to prevent unwanted frame drops
  • contents of plugins.ini moved into code to work around resource-loading problems in ancient JVMs
  • fix problems with several audio streams
  • optimizations in YUV-to-RGB code
Downloads at http://downloads.xiph.org/releases/cortado/

Cortado nostalgia
maikmerten
Yes, this is Cortado running on Netscape 4.79:

Basically, this means Cortado can be made to run even on, uh, bad and slow Java virtual machines. No, the JVM included with Netscape isn't fast enough for smooth video playback even on this 3 GHz machine, but at least the sound isn't crackling.

H.264 isn't so bad, is it?
maikmerten
This is mostly a quick response to http://www.kenpardue.com/blog/2009/07/25/back-on-open-video/ which reads, for example: "What I don’t understand, and what irks me so badly, is why H.264 is demonized so badly by the FLOSS community." Well, I'll try to shed some light on this.
  • The problem with H.264 is much more severe than just the cost of the licensing fees. The exact pricing for 2011 and onwards may not yet be decided upon, but it's absolutely problematic that there's a single entity (MPEG-LA) with the power to enforce licensing terms and prices. Even if the licensing fees in 2011 happen to be "cheap", there's zero guarantee they'll stay "cheap". Reading the currently available documents on H.264 licensing, it appears clear that the overall strategy is to be cheap now and get more expensive over time - the milking part may be pushed to a later date, but it'll eventually happen.
  • On the internet, 100,000 free units is a joke. Every download counts, most downloads don't actually generate a new "customer", and even mildly interesting software will easily blow past the 100,000-download limit. To make matters worse: how do you count downloads if you don't restrict distribution of your software to your own sites?
  • H.264 licensing terms are unfair to small players. There's a cap on annual licensing fees ($5 million) that big companies will easily hit, meaning each product shipped after reaching the cap is essentially free, while small companies that never hit the cap pay for every single product delivered. This shifts the market balance in favor of big players, hindering competition and innovation. The cap was $3.5 million in 2005-2006 and $4.25 million in 2007-2008 - so over time it has become harder for small players to hit the maximum annual royalty, not easier.
  • The licensing terms are absolutely incompatible with free (as in speech) software. There are open-source implementations of MPEG formats, meaning you can download, alter, and redistribute the source code - but without a license from MPEG-LA you're not allowed to actually use that software, so you're denied a very essential FOSS right. Even if you had the right to use the open-source MPEG-compliant code, you could not transfer that right to somebody downloading your open-sourced code. You're always dependent on MPEG-LA licensing usage rights. Being at the mercy of a monopolistic third party isn't exactly what FOSS is about.
  • Given that MPEG licensing is incompatible with FOSS, the widespread use of such encumbered formats calls the viability of FOSS on the desktop into question. Surfing the internet and watching embedded videos isn't possible within a pure FOSS environment in an MPEG-dominated world; users would always depend on proprietary, non-open components like Flash carrying an MPEG license.
  • Currently everything hints at per-content licensing fees being considered at MPEG-LA. This means not only technology providers are at the mercy of MPEG-LA but also content producers and providers.
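The cap effect described above is easy to make concrete. With hypothetical round numbers (a per-unit rate and an annual cap - these are illustrative, not actual MPEG-LA terms), the effective per-unit cost collapses for a high-volume shipper while a small player pays the full rate:

```java
// Illustrative only: how an annual royalty cap favors high-volume
// shippers. The rate and cap used in the test are hypothetical round
// numbers, not actual MPEG-LA licensing terms.
public class RoyaltyCap {
    // Effective average fee per unit, given a per-unit rate and an
    // annual cap on total royalties owed.
    public static double effectivePerUnit(long units, double rate, double cap) {
        double owed = Math.min(units * rate, cap);
        return owed / units;
    }
}
```

At $0.20/unit with a $5 million cap, a company shipping 100 million units pays an effective $0.05 per unit, while one shipping a million units pays the full $0.20 - four times as much per product.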

For me that's enough to answer the question of why we need to push free-for-all-under-all-circumstances-without-paperwork media formats - and the list above most likely isn't even complete.


Small note on current Wikipedia article on Theora
maikmerten
The Wikipedia article on Theora currently states:

Playback performance

Currently, there is no mainstream hardware acceleration support for Theora. Consequently, playback performance, especially on lower-end systems (such as netbooks) lacks in comparison to competing formats, such as MPEG-4.

I'd like to point out that

  • Theora has lower computational complexity than H.264, thus the need for hardware acceleration may not be as dire
  • Current netbooks usually have no H.264 acceleration at all due to Intel's choice of hardware components, meaning the computational complexity of H.264 directly impacts those poor little machines
  • Many software players don't use hardware acceleration even where available

"Hardware" acceleration is a much hotter topic in the realm of digital media players or mobile phones than it is in the "usual" computing environment. From what I understand, work to optimize Theora for mobile applications has begun.

Thusnelda article in c't
maikmerten
The Magazin für Computertechnik (c't) printed a four-page article about Thusnelda. Preparing the article and actually seeing it in stores was/is an awesome experience, and I hope it helps make Theora a bit more widely known. I'd like to thank Volker Zota from c't magazine for making this possible.


edit: There are thumbnails of the article online. Personally I like what they did to the "Xiph.org scared fish(tm)" logo ;-)

on the ffmpeg Vorbis encoder
maikmerten
Archive.org now has all moving images content also available as Ogg Vorbis + Theora. This is great!

This was a massive reencoding effort and the details of what tools were used are available at http://internetarchive.wordpress.com/2008/11/25/fast-and-reliable-way-to-encode-theora-ogg-videos-using-ffmpeg-libtheora-and-liboggz/

That setup gets the job done. However, after having a closer look I discovered that the Vorbis encodings were done with, e.g.,

ffmpeg -y -i CapeCodMarsh.avi -vn -acodec vorbis -ac 2 -ab 128k -ar 44100 audio.ogg

Oops. This means there's a massive amount of content encoded with libavcodec's native Vorbis encoder (that's what "-acodec vorbis" selects; the libvorbis reference encoder would be "-acodec libvorbis"). The only problem: that encoder is rather primitive and produces significantly inferior audio quality compared to libvorbis. And yup, the Archive.org encodings sound rather unpleasant despite not starving on bitrate.

How I wish ffmpeg would loudly complain whenever an unfinished/experimental/not-state-of-the-art encoder is used.

More Cortado
maikmerten
(updated again to reflect recent changes)

I have frequently been asked where to get that nice little Ogg Theora/Vorbis streaming applet. Well, the version I'm working on used to have its source hosted in the Wikimedia SVN, but development recently moved to http://git.xiph.org/?p=cortado.git and there's a prototype project page at http://www.theora.org/cortado/

A signed binary for Cortado is available at http://www.theora.org/cortado.jar - this one should always be rather recent.

More jheora stuff
maikmerten
By now I've patched Wikimedia's Cortado/jheora fork to play back any valid Theora bitstream with 4:2:0 subsampling. This is mostly a matter of properly consuming the bits used for non-VP3 features (e.g. per-block qi indices) from the bitstream.
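Consuming or skipping those header bits presumes a bit reader walking the packet most-significant-bit first. A minimal sketch of the idea (hypothetical class, not the actual jheora API):

```java
// Minimal MSB-first bit reader sketch (hypothetical class, not the
// actual jheora API) showing how bits for non-VP3 features can be
// consumed or skipped from a packet.
public class BitReader {
    private final byte[] data;
    private int bitPos = 0; // absolute bit position from packet start

    public BitReader(byte[] data) {
        this.data = data;
    }

    // Read n bits (n <= 31), most significant bit first.
    public int readBits(int n) {
        int value = 0;
        for (int i = 0; i < n; i++) {
            int bit = (data[bitPos >> 3] >> (7 - (bitPos & 7))) & 1;
            value = (value << 1) | bit;
            bitPos++;
        }
        return value;
    }

    // Consume bits without using the value, e.g. for features the
    // decoder doesn't act on but must still account for.
    public void skipBits(int n) {
        bitPos += n;
    }
}
```

The crucial point from the post: even if a feature is unused, its bits must still be consumed, or every following field is read from the wrong offset and the decoder loses sync.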

Also, I hardened jheora and the jst component using jheora so they don't simply stop playback whenever an exception occurs within the decoder. Upon losing sync with the bitstream (e.g. when the bitstream is damaged), corrupted values may end up being used as array indices later on - which, of course, usually leads to an ArrayIndexOutOfBoundsException. The decoder now catches all exceptions and returns appropriate error codes (well, in fact just BADPACKET). The playback component can then decide to continue (the next packet may be fine anyway) or not. The new default behavior is to just continue decoding. I tested this "error recovery" by randomly consuming bits in the decoder during various stages of decoding, thus wrecking the bitstream on purpose. This results in colorful block-storms and all imaginable sorts of corruption - but playback won't stop, which I consider much better behavior than just giving up.
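The hardening boils down to a simple pattern: wrap the decode call, map any exception to an error code, and let the caller decide whether to press on. A minimal sketch (class and method names are illustrative, not the actual jheora interface):

```java
// Sketch of the exception-to-error-code hardening pattern: a desynced
// bitstream often surfaces as an out-of-bounds array access, and the
// wrapper turns that into a BADPACKET return instead of a crash.
// Names are illustrative, not the actual jheora interface.
public class RobustDecode {
    public static final int OK = 0;
    public static final int BADPACKET = -1;

    interface PacketDecoder {
        void decode(byte[] packet) throws Exception;
    }

    // Returns an error code instead of throwing; the caller may simply
    // continue with the next packet, which may be fine anyway.
    public static int decodePacket(PacketDecoder dec, byte[] packet) {
        try {
            dec.decode(packet);
            return OK;
        } catch (Exception e) {
            return BADPACKET;
        }
    }
}
```

The design choice here is that the decoder never decides playback policy: it only reports the error, and the playback component chooses between skipping the packet and giving up.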


Anyway: Thusnelda, the Java world is prepared for you, so don't hesitate to replace the old encoder! ;-)
