Correct, this is what the GStreamer devs were telling us again and again: Xvid is MPEG-4 Part 2, so why would it need special handling? So they rejected the patch that adds the format to the caps (like wmv does). Here is the bug: https://bugzilla.gno...g.cgi?id=739196
So every MPEG-4 Part 2 codec must/should be handled by streamtype = 4.
I think you misunderstand the standpoint of the GStreamer devs. IMHO they are right. Xvid (and DivX) are indeed just MPEG-4 Part 2 (aka MPEG-4 Visual). The separate frames without the container (the elementary stream) are really just MPEG-4 Visual (also incorrectly referred to as MPEG-4 ASP). This issue has nothing to do with the codec. It's a container issue. As soon as you wrap MPEG-4 (or even MPEG-2...) containing B frames into an AVI container, there is no way to store the DTS. AVI only has an (implicit) PTS. To overcome this limitation (other than by using a proper container that can handle B frames), DivX decided to "pack" B frames together with "normal" frames (I/P) in a single chunk. As soon as the AVI demuxer has split the container into elementary streams, it SHOULD also re-create the separate frames from the packed frames, including the DTS (calculated from the position in the stream and the frame rate), and you'd have a perfectly normal MPEG-4 Visual stream that every decoder can handle (including a hardware decoder).
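To make the idea concrete, here is a rough Python sketch of what such unpacking could look like. It is not GStreamer code and not how any particular demuxer does it. The function names, the `MAX_PLACEHOLDER_SIZE` threshold and the size-based detection of the dummy "not coded" frames that follow a packed chunk are all my own assumptions for illustration. The only things taken from the MPEG-4 Part 2 bitstream itself are the VOP start code (00 00 01 B6) and the two coding-type bits that follow it.

```python
from typing import Iterable, Iterator, List, Tuple

VOP_START_CODE = b"\x00\x00\x01\xb6"  # start of every MPEG-4 Part 2 frame (VOP)
VOP_TYPES = "IPBS"                    # 2-bit vop_coding_type right after the start code
MAX_PLACEHOLDER_SIZE = 19             # ASSUMPTION: size heuristic for dummy frames


def split_vops(chunk: bytes) -> List[bytes]:
    """Split one AVI chunk into its VOPs; a normal chunk comes back unchanged."""
    starts = []
    pos = chunk.find(VOP_START_CODE)
    while pos != -1:
        starts.append(pos)
        pos = chunk.find(VOP_START_CODE, pos + len(VOP_START_CODE))
    if len(starts) < 2:
        return [chunk]
    starts[0] = 0  # keep any leading headers attached to the first VOP
    ends = starts[1:] + [len(chunk)]
    return [chunk[s:e] for s, e in zip(starts, ends)]


def vop_type(frame: bytes) -> str:
    """Return 'I', 'P', 'B' or 'S', or '?' if no VOP header is found."""
    pos = frame.find(VOP_START_CODE)
    if pos == -1 or pos + 4 >= len(frame):
        return "?"
    return VOP_TYPES[frame[pos + 4] >> 6]


def unpack(chunks: Iterable[bytes], fps_n: int, fps_d: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (dts_in_ns, frame) pairs with packed chunks split into single frames.

    DTS is derived only from the output position and the frame rate.
    """
    pending = 0  # extra frames emitted that still need a dummy chunk dropped
    n = 0        # index of the next output frame
    for chunk in chunks:
        vops = split_vops(chunk)
        # ASSUMPTION: a tiny single-VOP chunk after a packed one is the dummy.
        if pending and len(vops) == 1 and len(chunk) <= MAX_PLACEHOLDER_SIZE:
            pending -= 1
            continue
        pending += len(vops) - 1
        for vop in vops:
            yield n * 1_000_000_000 * fps_d // fps_n, vop
            n += 1
```

Feeding it the chunks I, P+B, dummy, P at 25 fps would give four separate frames (I, P, B, P) spaced 40 ms apart. A real implementation would parse the VOP header properly instead of guessing from the chunk size.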
So IMHO the AVI demuxer should handle this special case, which only occurs in AVI containers, if only because once the stream is demuxed, there might not be enough metadata left to reconstruct the separate frames.