
Thread: 10.04: Video overlay/composite editing with Theora ogv

  1. #1

    10.04: Video overlay/composite editing with Theora ogv

    Hi all,

    For a while, I have been looking for a solution to a somewhat simple and specific problem: video overlay editing with Theora ogv support in Ubuntu Linux. Let me specify that a bit:

    Say that, for a video tutorial, we'd need to record a computer output in the physical world using a camera; and at the same time, follow the execution of programs on the same computer. In that case, we can:
    • Set up a camera to record the computer's environment and output
    • Set up some sort of a screen capturing program, to record the computer's desktop

    Thus we end up with two videos - one 'real' video, and one 'desktop capture' video. Sometimes, if the scene allows it, we may want to put the desktop capture video, resized to a smaller area, in a layer over the 'real' video, picture-in-picture style, so that it is easier to follow both developments.

    Therefore, we need facilities to overlay (composite) videos in prospective video editors.

    So I looked at Cinelerra, LiVES, OpenShot and Kdenlive as editors that may offer overlays/compositing. They can all be sudo apt-get install-ed, but none of them wanted to run properly with ogvs - especially the ogv video preview didn't work. So, I tried building melt/ffmpeg/vorbis from source, as described here:

    Building from source allowed both Openshot and Kdenlive to work with ogv; they could import, they could show preview (which could however get buggy after a while), and they could export Theora ogv using the latest source versions. However, that is not the end of the issues with video overlay - which is what will be touched upon in this post.

    First of all, I use the following - in terms of the videos mentioned above:
    • A camera that provides mjpeg AVI, 640x480, 30 fps for 'real' video
    • vncrec for 'desktop capture' video via vnc. recordMyDesktop does produce ogv files directly, but I avoid it when I want to avoid additional disk writes on the recorded machine. I usually scale my desktop to 800x600 before recording, so I obtain capture videos in this size; vncrec typically generates 10 fps.

    In both of these cases, I usually use ffmpeg2theora to convert to ogv (from either the camera AVI, or from the vncrec capture), as it offers self-contained Linux executables for download - and it produces decent looking videos with decent file sizes, that also play fine in <video> tag in firefox (using firefox's built-in theora). Because of the good results for each individual source, I figured at first that I may keep the .ogv files of both the 'real' and 'screen capture' video - and use a video editing program to overlay the ogv versions of videos, and fix timing/synchronisation problems, before finally exporting a composite ogv video.

    Well, this turns out to be not that easy - even when working with the latest software from source. First of all, there are some usability differences between Openshot and Kdenlive:
    • Openshot can produce an overlay/composite by having one ('desktop capture') video in a track(layer) above the other ('real') video; then one can edit the properties of the top clip, and under 'Layout', choose width, height and position
      • Note, this per clip setting can be rather irritating, if you just want your top video to just stand still:
      • first of all, the 'Layout' settings are remembered for two 'Key Frames' - start of clip and end of clip; and if the values are different between start and end, then the video clip gets animated accordingly
      • Then, when you duplicate a clip, all these settings in 'Layout' get reset to their defaults - so you have to reenter them again, to scale the clip as it was previously.
      • Cutting a scaled clip in the wrong place may reset the scaling of the last shown frame (in spite of the properties showing the right numbers)
      • Note, just scaling a clip and positioning it above another one, is enough for the bottom clip to be seen through the image area not taken up by the top clip
      • For image sequences import: File/"Import Image Sequence..."
      • Change of clip playback speed is property of a clip
      • Timeline can be zoomed in only to 1 second resolution

    • Kdenlive has a different approach to compositing:
      • Here scaling and position is not a property of a clip - but a video effect ('Scale0Tilt') added to a clip (though, note that you cannot animate as in Openshot)
      • When you scale a video, the unused image area is black and opaque - and so the video in the track below doesn't automatically show through - to do that, the 'Blue Screen' video effect needs to be added on top of effect stack for the top clip, followed by 'Scale0Tilt' - and finally 'Compositing' 'Transition' needs to be added to the top video clip.
      • For image sequences import: Project/"Add Slideshow Clip"
      • Change of clip playback speed is a video effect ('Speed') to be added to a clip (changes can be made only in integer percents)
      • Timeline can be zoomed in to 1 frame resolution

    Using the latest mlt / ffmpeg / theora from source, both of these video editors can show a preview of two overlaid ogv videos, and both can export to ogv. However, both are quite slow and take quite some time to render the preview, and the preview gets buggy after a while (i.e. starts showing the wrong frames).

    However, because of the slow preview, it gets rather tedious to actually edit videos in the ogv format; so in the case of this type of overlay, one can try another approach:
    • Use the original camera AVI as the bottom, 'real' video layer
    • Convert vncrec capture to a sequence of PNG images using ffmpeg; use the PNG image sequence as a top, 'desktop capture' video layer.

    This is a better approach, because none of the video editors have a problem 'stepping' through single frames of either the original AVI or an image sequence, and the preview is correct. Additionally, we work with the source material for both video tracks (the conversion of a desktop capture to a PNG sequence should be lossless), which should allow for better video quality of the final exported (composite) ogv.
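As a rough sketch of the PNG conversion step: assuming the vncrec capture has already been turned into a video file that ffmpeg can read, something like the following would write one numbered PNG per frame. The file name capture.avi, the output directory pngs and the function name capture_to_pngs are all hypothetical, so adjust them to your own setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: dump every frame of a video as a zero-padded PNG sequence.
capture_to_pngs() {
    # $1: input video readable by ffmpeg, $2: output directory
    mkdir -p "$2" || return 1
    # image2 writes one image file per frame; %05d gives 5-digit zero padding
    ffmpeg -i "$1" -f image2 "$2/capture_%05d.png"
}

# Run only when ffmpeg and the (assumed) input file are both present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f capture.avi ]; then
    capture_to_pngs capture.avi pngs
fi
```

The zero-padded pattern keeps the files in frame order when they are sorted by name, which is what you want before importing them as an image sequence.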

    Here, however, we should note that the 'desktop capture' vncrec video originally has a frame rate of 10 fps, and the 'real' camera video 30 fps. So we should instruct the editors that each image in the image sequence lasts for 3 frames (30 / 10), to have the default synchronisation as correct as possible; this can easily be set upon import in both Openshot and Kdenlive. That alone, however, is not enough: even if the beginning of the top clip is synced with a characteristic event of the bottom clip, vncrec essentially receives frames at irregular intervals, and frames can occasionally be dropped. That means that recorded events in the 'desktop capture' will sometimes run ahead of the corresponding events in the 'real' video, and sometimes lag behind them.
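The per-image duration is just the ratio of the two frame rates. Here is a small sketch (the function name is made up for illustration) that also warns when the project rate is not a whole multiple of the capture rate, in which case a fixed per-image duration can only approximate the timing:

```shell
#!/usr/bin/env bash
# Hypothetical helper: how many project frames each captured image should last.
frames_per_image() {
    # $1: project (camera) frame rate, $2: capture frame rate, both integers
    if [ $(( $1 % $2 )) -ne 0 ]; then
        echo "warning: $1 fps is not a whole multiple of $2 fps" >&2
    fi
    echo $(( $1 / $2 ))
}

frames_per_image 30 10   # prints 3
```

Note that this only fixes the nominal rate; it does nothing about dropped or irregularly timed frames, which still need manual cutting.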

    Which is why the next stage is to use the 'razor'/cut tools in either software, to cut the 'desktop capture' video clip into pieces - and synchronise them by moving them to match to the corresponding events in the 'real' video; holes can often be filled by cutting even smaller pieces of the 'desktop capture' clips and duplicating them. This is notably more difficult to do in Openshot, as it cannot zoom the timeline in beyond 1 second resolution (although it can step and set markers on a frame level); and during the duplication the clip properties are reset to default.

    With the cutting and synchronisation stage finished, we are left to export the final composite video again to ogv. Here, however, we can again expect problems. Experience from Openshot shows:
    • Exporting to ogv directly from the application results in a file that can be played back in VLC, but not in Firefox (it starts playing, but stops/freezes after the first few frames; v3.6.8 being the current version of Firefox).
    • Exporting to an image sequence first, and then using the latest ffmpeg2theora (0.27) to convert the image sequence to ogv, results in a buggy ogv with wrong colors (even when played with VLC).

    Since Kdenlive would pretty much use the same engine (i.e. mlt/ffmpeg/theora) to do these operations, we could more or less expect Kdenlive to perform exactly the same; however, that is not the case. If you try to scale a clip by applying the scale0tilt effect, results are pretty bad:

    The scale0tilt effect is actually listed as '<filter id="frei0r.scale0tilt">' in the .kdenlive file; the above image shows the PNG export of a track of imported PNGs, originally at 800x600, scaled to 60% and placed in a 640x480 video project. On the left is the output of scale0tilt as a video effect applied to a clip; on the right is the output of the clip without any video effects, with the scaling set as a parameter to a 'Composite' Transition (a transition, unlike an effect, is applied per track and can have keyframes; see Kdenlive/Transitions; Creating a video panorama video tutorial).

    Notice also that there is a difference in quality between the "extract frame" command and an actual export with either method; export of scaling via the 'Composite' Transition seems to give the best results (Openshot gives similar results).
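For the numbers used in the test above (an 800x600 capture scaled to 60% inside a 640x480 project), the size and position values to type into either editor can be worked out beforehand. This is only a sketch: the function name is made up, and pinning the clip to the bottom-right corner is my assumption for illustration, not something either editor requires.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print WxH+X+Y for a clip scaled by a percentage
# and placed flush in the bottom-right corner of the project frame.
overlay_geometry() {
    # $1 $2: source width/height, $3: scale in percent, $4 $5: project width/height
    local w h x y
    w=$(( $1 * $3 / 100 ))
    h=$(( $2 * $3 / 100 ))
    x=$(( $4 - w ))   # flush against the right edge
    y=$(( $5 - h ))   # flush against the bottom edge
    echo "${w}x${h}+${x}+${y}"
}

overlay_geometry 800 600 60 640 480   # prints 480x360+160+120
```

So the scaled capture is 480x360, and its top-left corner sits at (160, 120) in the 640x480 frame.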

    After some hassle, I found that one method that can finally produce an ffmpeg2theora ogv that plays nicely in Firefox is:
    • Export the edited composite video as a PNG image sequence
    • Use ffmpeg2theora v0.24 to encode the exported PNG image sequence to pngexport.ogv

    • Extract audio from AVI 'real' video file using 'ffmpeg' to Vorbis audioonly.ogg
    • Merge video pngexport.ogv and audio audioonly.ogg, using 'oggz-merge' or 'oggJoin', into a single file export-av.ogv
      • At this point, the export-av.ogv plays nice in VLC, but stops after some frames in Firefox

    • Finally, re-transcode export-av.ogv back to ogv, without any changes in quality, using 'oggTranscode', into export-av-final.ogv
      • This step seems to be reconstructing some metadata, as export-av-final.ogv can be some 2 MB larger than an export-av.ogv of some 200 MB; and finally allows that both Firefox and VLC can play back the export-av-final.ogv in its entirety

    Below is a command line session outlining some of these steps:
    # export composite video as PNG images... say  
    #  in /path/to/pngs/composite_%05d.png (5-character digit name, with 0-padding)
    # you may want to use the ffplay to play the images, 
    # however, playback will be very slow for 640x480 png's
    ffplay /path/to/pngs/composite_%05d.png
    # encode exported PNG images as an ogv video     - using ffmpeg2theora
    # note - encoding 5 mins 640x480 could produce:
    #  ~ 312 MB file @ 8000 kbps
    #  ~ 200 MB file @ 5000 kbps 
    #  use ffmpeg2theora-0.24 ; 0.27 may give a bad ogv output
    ./ffmpeg2theora-0.24-2b -F 30 -V 5000 -no-audio /path/to/pngs/composite_%05d.png -o /path/to/pngexport.ogv
    # extract audio as an ogg file             - using ffmpeg
    ./ffmpeg -y -i /path/to/realvideo.AVI -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 /path/to/audioonly.ogg
    # note - this step missed above; 
    # at this point, oggJoin would fail with "Warning: </....ogv> is not a valid ogg file"
    # so we transcode video once first here...
    ./oggTranscode -rv /path/to/pngexport.ogv /path/to/pngexport-t.ogv
    # merge (mux/multiplex) audio and video     - using oggJoin
    ./oggJoin /path/to/export-av.ogv /path/to/pngexport-t.ogv /path/to/audioonly.ogg 
    #~ alternatively, mux - using oggz-merge
    #~ ./oggz-merge.linux /path/to/pngexport.ogv /path/to/audioonly.ogg -o /path/to/export-av.ogv
    # at this point, we have a file that plays
    # in VLC, but not fully in Firefox 
    # Re-encode the muxed file again         - using oggTranscode
    # -rv is "force reencode video stream"
    # we don't specify any quality settings, so 
    #  they're kept the same
    ./oggTranscode -rv /path/to/export-av.ogv /path/to/export-av-final.ogv
    # now, export-av-final.ogv should play OK in Firefox as well..
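As a sanity check on the file sizes quoted in the session above, the video payload at a given target bitrate is roughly bitrate times duration. A minimal sketch (the function name is made up; sizes are in decimal megabytes, and only the video stream is counted):

```shell
#!/usr/bin/env bash
# Hypothetical helper: approximate video payload size for a target bitrate.
estimate_mb() {
    # $1: video bitrate in kbps, $2: duration in seconds
    # kbit -> kbyte (divide by 8) -> Mbyte (divide by 1000), integer arithmetic
    echo $(( $1 * $2 / 8 / 1000 ))
}

estimate_mb 5000 300   # prints 187
estimate_mb 8000 300   # prints 300
```

The sizes I actually observed (about 200 MB and 312 MB) are a little higher than these estimates. My guess is that this is container overhead plus the encoder overshooting the target bitrate, but I have not verified that.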
    Some notes:
    • Note that 'oggTranscode' and 'oggJoin' above are part of The Ogg Video Tools, built from source (as referred to above, in this post)
    • 'oggz-merge.linux' above, is part of Oggz Tools; current binaries for Linux can be downloaded from nightlies
      • Here is a page introducing the two toolsets, 'Oggz Tools' and 'Ogg Video Tools': TheoraCookbook (en): LosslessIntro
      • In Ubuntu, there is also: 'sudo apt-get install oggvideotools oggz-tools' (but not as current)

    • Note also the (relatively new) tool OggIndex (binary and source are available; the source is on GitHub). It is referred to on the Ogg Index - XiphWiki page, which is an informal draft for the ogv skeleton metadata.
      • Note that, in itself, this tool cannot (yet) "repair" an ogv for Firefox as 'oggTranscode' can.
      • It also usually adds a couple of MB worth of metadata to an ogv file; however, surprisingly, if this tool is run on an ogv that is output from 'oggTranscode', it will double its size! (i.e. from a 200MB file from 'oggTranscode' as input, 'OggIndex' will generate a 400MB output ogv file)

    • Note that a 5 min, 640x480, 200 MB ogv file, may fail to play to the end in Firefox also because of buffering or memory constraints (that is, not just because of codec problems); usually, you can try to press play, and then pause before the playhead has gone beyond the buffered piece - then wait a bit for buffering, then play again and so on, piecewise (it seems that if in a big ogv played in Firefox 3.6.8, you let the playhead go beyond the buffered piece, the player stops responding on further pause/play clicks)

    Well, given that, in the end, it is possible to do video overlay and get decent results in Ubuntu, hopefully this kind of process will get even easier for the end user in the future.

    For now, I hope that this writeup can help, as it can be overwhelming to keep track of all the details needed to get this working.



    Last edited by sdaau; July 30th, 2010 at 10:19 PM. Reason: What? Anonymous CANNOT watch images attached to the post? imageshack...

  2. #2

    Re: 10.04: Video overlay/composite editing with Theora ogv

    Since the topic here is partially about generating .ogv from PNGs, here's a script that produces non-optimized (and then optimized) PNGs on the fly, and then uses ffmpeg2theora to encode them to videos; it demonstrates that there may be a bug in ffmpeg2theora 0.27 when using it to make videos from a PNG image sequence:

    #!/usr/bin/env bash
    # for debug
    XTR="set -x"
    # start xtrace debug
    $XTR
    # source image file name
    SRCIMG=testgrad.png
    # generate source png image (gradient)
    # formats:
    convert -size 640x480 gradient:\#4b4-\#bfb $SRCIMG
    # get info - this is rgb48be
    identify $SRCIMG
    # get info on image from ffmpeg
    ffmpeg -i $SRCIMG 2>&1 | grep "Input\|Stream"
    # convert to RGB 8-bit per channel PNG ; also png8:tmp.png
    convert $SRCIMG -quantize rgb -depth 8 tmp.png 
    mv tmp.png $SRCIMG
    # get info again - rgb24
    identify $SRCIMG
    ffmpeg -i $SRCIMG 2>&1 | grep "Input\|Stream"
    # optimize source
    optipng -quiet $SRCIMG 
    # show info again - after optipng, it is "pal8"
    ffmpeg -i $SRCIMG 2>&1 | grep "Input\|Stream"
    # stop xtrace dbg
    set +x
    # generate rotated imgs
    mkdir testgpngs
    echo -n "testgpngs/: "
    for ix in $(seq 1 1 71); do 
      let angle=ix*5
      fname=$(printf "tout%03d.png" $ix)
      echo -e -n " $fname-$angle,"
      # note: the rotation uses $ix degrees; $angle is only printed in the label
      convert $SRCIMG -rotate $ix tmp.png
      convert -size 640x480 -depth 8 -extract 640x480+0+0 tmp.png testgpngs/$fname
    done
    # start xtrace dbg again
    $XTR
    rm tmp.png
    # show info again - last rotated img, it is "bgra"
    identify testgpngs/$fname
    ffmpeg -i testgpngs/$fname 2>&1 | grep "Input\|Stream"
    # produce ogv's with different versions of ffmpeg2theora
    ./ffmpeg2theora-0.27.linux32.bin -F 30 -V 5000 testgpngs/tout%03d.png -o tvid-nonopti-0.27.ogv
    ./ffmpeg2theora-0.24-2b -F 30 -V 5000 testgpngs/tout%03d.png -o tvid-nonopti-0.24.ogv
    #optimize all 
    optipng -quiet testgpngs/*.png
    # show info again - last rotated img, it is "rgb24"
    identify testgpngs/$fname
    ffmpeg -i testgpngs/$fname 2>&1 | grep "Input\|Stream"
    # produce ogv's with different versions of ffmpeg2theora
    ./ffmpeg2theora-0.27.linux32.bin -F 30 -V 5000 testgpngs/tout%03d.png -o tvid-opti-0.27.ogv
    ./ffmpeg2theora-0.24-2b -F 30 -V 5000 testgpngs/tout%03d.png -o tvid-opti-0.24.ogv
    # check videos - play in vlc:
    vlc tvid-nonopti-0.24.ogv tvid-nonopti-0.27.ogv tvid-opti-0.24.ogv tvid-opti-0.27.ogv
    # For both converted and non-converted pngs, 0.24 shows OK videos; 0.27 shows extra colors for both; 
    # for all videos, vlc says:
    # vlc (command line): "swScaler: pal8 is not supported as output pixel format"
    The script produces this output:
    + SRCIMG=testgrad.png
    + convert -size 640x480 gradient:#4b4-#bfb testgrad.png
    + identify testgrad.png
    testgrad.png PNG 640x480 640x480+0+0 16-bit DirectClass 3.66KiB 0.060u 0:00.049
    + ffmpeg -i testgrad.png
    + grep 'Input\|Stream'
    Input #0, image2, from 'testgrad.png':
        Stream #0.0: Video: png, rgb48be, 640x480, 25 tbr, 25 tbn, 25 tbc
    + convert testgrad.png -quantize rgb -depth 8 tmp.png
    + mv tmp.png testgrad.png
    + identify testgrad.png
    testgrad.png PNG 640x480 640x480+0+0 8-bit DirectClass 2.23KiB 0.020u 0:00.029
    + ffmpeg -i testgrad.png
    + grep 'Input\|Stream'
    Input #0, image2, from 'testgrad.png':
        Stream #0.0: Video: png, rgb24, 640x480, 25 tbr, 25 tbn, 25 tbc
    + optipng -quiet testgrad.png
    + ffmpeg -i testgrad.png
    + grep 'Input\|Stream'
    Input #0, image2, from 'testgrad.png':
        Stream #0.0: Video: png, pal8, 640x480, 25 tbr, 25 tbn, 25 tbc
    + set +x
    testgpngs/:  tout001.png-5, tout002.png-10, tout003.png-15, tout004.png-20, tout005.png-25, tout006.png-30, tout007.png-35, tout008.png-40, tout009.png-45, tout010.png-50, tout011.png-55, tout012.png-60, tout013.png-65, tout014.png-70, tout015.png-75, tout016.png-80, tout017.png-85, tout018.png-90, tout019.png-95, tout020.png-100, tout021.png-105, tout022.png-110, tout023.png-115, tout024.png-120, tout025.png-125, tout026.png-130, tout027.png-135, tout028.png-140, tout029.png-145, tout030.png-150, tout031.png-155, tout032.png-160, tout033.png-165, tout034.png-170, tout035.png-175, tout036.png-180, tout037.png-185, tout038.png-190, tout039.png-195, tout040.png-200, tout041.png-205, tout042.png-210, tout043.png-215, tout044.png-220, tout045.png-225, tout046.png-230, tout047.png-235, tout048.png-240, tout049.png-245, tout050.png-250, tout051.png-255, tout052.png-260, tout053.png-265, tout054.png-270, tout055.png-275, tout056.png-280, tout057.png-285, tout058.png-290, tout059.png-295, tout060.png-300, tout061.png-305, tout062.png-310, tout063.png-315, tout064.png-320, tout065.png-325, tout066.png-330, tout067.png-335, tout068.png-340, tout069.png-345, tout070.png-350, tout071.png-355,
    + rm tmp.png
    + identify testgpngs/tout071.png
    testgpngs/tout071.png PNG 640x480 662x762+0+0 8-bit DirectClass 32.2KiB 0.030u 0:00.029
    + grep 'Input\|Stream'
    + ffmpeg -i testgpngs/tout071.png
    Input #0, image2, from 'testgpngs/tout071.png':
        Stream #0.0: Video: png, bgra, 640x480, 25 tbr, 25 tbn, 25 tbc
    + ./ffmpeg2theora-0.27.linux32.bin -F 30 -V 5000 testgpngs/tout%03d.png -o tvid-nonopti-0.27.ogv
    Input #0, image2, from 'testgpngs/tout%03d.png':
      Duration: 00:00:02.36, start: 0.000000, bitrate: N/A
        Stream #0.0: Video: png, bgra, 640x480, 30 fps, 30 tbr, 30 tbn, 30 tbc
      0:00:02.36 audio: 0kbps video: 5021kbps, time elapsed: 00:00:12            
    + ./ffmpeg2theora-0.24-2b -F 30 -V 5000 testgpngs/tout%03d.png -o tvid-nonopti-0.24.ogv
    Input #0, image2, from 'testgpngs/tout%03d.png':
      Duration: 00:00:02.36, start: 0.000000, bitrate: N/A
        Stream #0.0: Video: png, bgra, 640x480, 30 fps, 30 tbr, 30 tbn, 30 tbc
      Resize: 640x480
          0:00:02.39 audio: 0kbps video: 803kbps, time elapsed: 00:00:09        
    + optipng -quiet testgpngs/tout001.png testgpngs/tout002.png testgpngs/tout003.png testgpngs/tout004.png testgpngs/tout005.png testgpngs/tout006.png testgpngs/tout007.png testgpngs/tout008.png testgpngs/tout009.png testgpngs/tout010.png testgpngs/tout011.png testgpngs/tout012.png testgpngs/tout013.png testgpngs/tout014.png testgpngs/tout015.png testgpngs/tout016.png testgpngs/tout017.png testgpngs/tout018.png testgpngs/tout019.png testgpngs/tout020.png testgpngs/tout021.png testgpngs/tout022.png testgpngs/tout023.png testgpngs/tout024.png testgpngs/tout025.png testgpngs/tout026.png testgpngs/tout027.png testgpngs/tout028.png testgpngs/tout029.png testgpngs/tout030.png testgpngs/tout031.png testgpngs/tout032.png testgpngs/tout033.png testgpngs/tout034.png testgpngs/tout035.png testgpngs/tout036.png testgpngs/tout037.png testgpngs/tout038.png testgpngs/tout039.png testgpngs/tout040.png testgpngs/tout041.png testgpngs/tout042.png testgpngs/tout043.png testgpngs/tout044.png testgpngs/tout045.png testgpngs/tout046.png testgpngs/tout047.png testgpngs/tout048.png testgpngs/tout049.png testgpngs/tout050.png testgpngs/tout051.png testgpngs/tout052.png testgpngs/tout053.png testgpngs/tout054.png testgpngs/tout055.png testgpngs/tout056.png testgpngs/tout057.png testgpngs/tout058.png testgpngs/tout059.png testgpngs/tout060.png testgpngs/tout061.png testgpngs/tout062.png testgpngs/tout063.png testgpngs/tout064.png testgpngs/tout065.png testgpngs/tout066.png testgpngs/tout067.png testgpngs/tout068.png testgpngs/tout069.png testgpngs/tout070.png testgpngs/tout071.png
    + identify testgpngs/tout071.png
    testgpngs/tout071.png PNG 640x480 662x762+0+0 8-bit DirectClass 22.5KiB 0.030u 0:00.019
    + grep 'Input\|Stream'
    + ffmpeg -i testgpngs/tout071.png
    Input #0, image2, from 'testgpngs/tout071.png':
        Stream #0.0: Video: png, rgb24, 640x480, 25 tbr, 25 tbn, 25 tbc
    + ./ffmpeg2theora-0.27.linux32.bin -F 30 -V 5000 testgpngs/tout%03d.png -o tvid-opti-0.27.ogv
    Input #0, image2, from 'testgpngs/tout%03d.png':
      Duration: 00:00:02.36, start: 0.000000, bitrate: N/A
        Stream #0.0: Video: png, rgb24, 640x480, 30 fps, 30 tbr, 30 tbn, 30 tbc
      0:00:02.36 audio: 0kbps video: 5021kbps, time elapsed: 00:00:11            
    + ./ffmpeg2theora-0.24-2b -F 30 -V 5000 testgpngs/tout%03d.png -o tvid-opti-0.24.ogv
    Input #0, image2, from 'testgpngs/tout%03d.png':
      Duration: 00:00:02.36, start: 0.000000, bitrate: N/A
        Stream #0.0: Video: png, rgb24, 640x480, 30 fps, 30 tbr, 30 tbn, 30 tbc
      Resize: 640x480
          0:00:02.39 audio: 0kbps video: 803kbps, time elapsed: 00:00:08
    EDIT: here is also a one-liner for piping ffmpeg2theora into oggTranscode:
    ./ffmpeg2theora-0.24-2b -F 30 -V 5000 images/img%05d.png -o /dev/stdout | ./oggTranscode -rv /dev/stdin pngexport-tx.ogv
    (not strictly necessary, as ffmpeg2theora-0.24 PNG videos will also play in Firefox; to me, though, it looks as if the oggTranscode videos run a bit smoother)
    Last edited by sdaau; November 14th, 2010 at 04:21 AM. Reason: added one-liner
