
Adventures in video decoders: the quest for VP9a

Posted: Mon May 25, 2020 7:14 pm
by itoril
Hello everyone!
I'm new. Let me start off by saying I appreciate all the work Tom and his cohorts have done to make Ren'Py and foster a very helpful community.

I've begun working with this game engine, and I'm beginning to test its limits. I happen to be a 3D animator, and I'd like to use this art style to make a game. I make 3D video files, and I'm happy that Ren'Py supports video. I'm especially intrigued by Movie Sprites/Displayables, because the alpha mask system would allow me to animate characters and have them inhabit all sorts of environments by placing them against different backdrops.

I wanted to see how far I could go with this method. After doing some stress tests, I found that I can have 720p video at 30 frames per second and it works well. When I started to stretch it to 1080p and 60FPS, I ran into very low frame rates. At first I thought it was my ancient antique of a workstation - it's about 12 years old. I tried the same settings on a laptop from 2016, and fair enough, it is a laptop, but it does have a GeForce GTX 850M graphics card in it. I also tried it on my render nodes, which are 30% faster in terms of CPU than my old workstation, but they only have integrated graphics. Here are my results...

3.1GHz quad-core render nodes with 4GB RAM - 4FPS
2.2GHz quad-core workstation with Radeon HD770 and 8GB RAM - 15-20FPS
2.4GHz dual-core laptop with Nvidia GTX 850M and 8GB RAM - 25FPS

The hardware acceleration is certainly working! Naturally, I'm not likely to get anywhere near 60FPS 1080p video on most computers. Even if a high end gaming rig could pull it off, I would really rather not limit the number of people who could enjoy my game so drastically. The thing I don't quite get is that the 10 year old render nodes and the 12 year old workstation will play a 60FPS, 1080p video in a player like Media Player Classic, MPV or VLC just fine.
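
For a sense of scale between the two test settings mentioned above (720p at 30FPS and 1080p at 60FPS), here is a rough back-of-the-envelope sketch in C. It only counts raw pixels per second and says nothing about how any particular decoder or engine behaves. The function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: raw pixels per second a decoder has to produce
 * for a video of the given size and frame rate. */
uint64_t pixels_per_second(uint32_t width, uint32_t height, uint32_t fps)
{
    assert(fps > 0);
    /* Widen first so the multiplication cannot overflow 32 bits. */
    return (uint64_t)width * height * fps;
}
```

By this count, 1280x720 at 30FPS is 27,648,000 pixels per second and 1920x1080 at 60FPS is 124,416,000, which is 4.5 times as many. Assuming the side-by-side alpha mask doubles the frame width (3840x1080 for a 1080p sprite), the 60FPS figure becomes 248,832,000 pixels per second.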

So I figured the bottleneck lies with the way Ren'Py uses the alpha mask. I understand that it would be quite computationally expensive to carve up a double wide video, apply the alpha mask on the right to the video on the left, then stitch it back together again, all in real time. I started playing around with different encoding techniques and stumbled upon the VP9 codec and the YUVA420p pixel format. I encoded a video with ffmpeg. At first, ffplay, MPC, VLC, and Ren'Py all seemed to ignore the alpha channel. Then I learned that the "default" decoder (whichever that is, I'm not sure) for ffmpeg doesn't handle the alpha in this format; the video would play fine, just without the alpha channel. I ran the same file in ffplay and MPV with arguments that instructed those programs to decode with the libvpx-vp9 decoder, and hey presto - the embedded alpha channel worked! And still at a rock solid 60FPS on a 12 year old toaster.
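
To make the side-by-side masking step described above concrete, here is a minimal sketch in C of what "apply the mask on the right to the video on the left" could look like if done per pixel on the CPU. This is an illustration under stated assumptions, not Ren'Py's actual code: it assumes packed 8-bit RGB input, colour in the left half, a greyscale mask in the right half, and it takes alpha from the mask's red channel. The function name and parameters are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from Ren'Py.
 * src:   packed RGB rows, each 2 * out_w pixels wide
 *        (left half = colour, right half = greyscale mask).
 * dst:   packed RGBA rows, each out_w pixels wide.
 * Alpha is read from the red channel of the mask pixel (an assumption). */
void apply_side_by_side_mask(const uint8_t *src, size_t out_w,
                             size_t height, uint8_t *dst)
{
    assert(src != NULL && dst != NULL);

    const size_t src_stride = 2 * out_w * 3; /* bytes per source row */
    const size_t dst_stride = out_w * 4;     /* bytes per output row */

    for (size_t y = 0; y < height; y++) {
        const uint8_t *color = src + y * src_stride; /* left half  */
        const uint8_t *mask = color + out_w * 3;     /* right half */
        uint8_t *out = dst + y * dst_stride;

        for (size_t x = 0; x < out_w; x++) {
            out[4 * x + 0] = color[3 * x + 0]; /* R */
            out[4 * x + 1] = color[3 * x + 1]; /* G */
            out[4 * x + 2] = color[3 * x + 2]; /* B */
            out[4 * x + 3] = mask[3 * x + 0];  /* A from mask */
        }
    }
}
```

Whatever the real implementation does, a loop like this touches every pixel of both halves on every frame, which is the extra work a codec with an embedded alpha channel would avoid.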

I understand that Ren'Py uses libav to play videos. The part where I get out of my depth is when I try to figure out how I might coax that into using the libvpx-vp9 decoder instead of whatever "default" it uses. It feels like I'm only a couple of steps away from success, but I could be delusional. I dug deeper and deeper until I ended up at ffmedia.c in the module folder of Ren'Py, trying to hack it into forcefully using libvpx-vp9 as a decoder, but I'm very out of my depth and I might be in completely the wrong area. I noticed a method called "AVCodecContext", although I don't know what "context" means in the... context of this... context. :lol: Here's my sad attempt at code whispering so you can have a laugh:

static AVCodecContext *find_context(AVFormatContext *ctx, int index) {

	if (index == -1) {
		return NULL;
	}

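	AVCodec *codec = NULL;
	AVCodecContext *codec_ctx = NULL;
	AVCodecContext *codec_ctx_orig = ctx->streams[index]->codec;

//  My caveman attempt at hard coding: ask for libvpx-vp9 by name,
//  but only for VP9 streams, so audio and other codecs are left alone.
	if (codec_ctx_orig->codec_id == AV_CODEC_ID_VP9) {
		codec = avcodec_find_decoder_by_name("libvpx-vp9");
	}

//  Fall back to the default decoder if libvpx-vp9 isn't compiled in.
	if (codec == NULL) {
		codec = avcodec_find_decoder(codec_ctx_orig->codec_id);
	}
// That's it. Really. That's the best I can do.
	if (codec == NULL) {
		return NULL;
	}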

	codec_ctx = avcodec_alloc_context3(codec);

	if (avcodec_copy_context(codec_ctx, codec_ctx_orig)) {
		goto fail;
	}

	if (avcodec_open2(codec_ctx, codec, NULL)) {
		goto fail;
	}

	return codec_ctx;

fail:
	avcodec_free_context(&codec_ctx);
	return NULL;
}

Another method I was considering was finding a Python wrapper for the likes of ffmpeg that I might find flexible enough to tell it what decoder to use, then crowbar it into Ren'Py by way of custom displayables. But that seems like reinventing the wheel, does it not?

So as you can see, I can't tell whether I'm just missing something that's right under my nose, or if I'm tilting at windmills. I think I need the transparency. 60FPS would be "nice". I'm just wondering if it's possible I can have my cake and eat it. Considering how quickly the videos decode in a video player, I know I can't expect precisely as much efficiency as if it's running in a game, but surely there's some way to squeeze it in there?

Thank you so much for listening to me waffle on for so long.

Re: Adventures in video decoders: the quest for VP9a

Posted: Mon May 25, 2020 7:55 pm
by rayminator
Well, I suggest that you don't play around with Ren'Py's files, since you can corrupt Ren'Py itself, and that you use the formats that are supported:
https://www.renpy.org/doc/html/movie.html

If you want better FPS, upgrade your computer: something like a 6 to 12 core CPU, 16 to 32 gigabytes of DDR3/4 RAM, and a video card such as a P2000 to P4000 for rendering, or even a GeForce 2080 Ti.

Re: Adventures in video decoders: the quest for VP9a

Posted: Mon May 25, 2020 11:43 pm
by uyjulian
Ren'Py compiles ffmpeg without hardware acceleration support. See https://github.com/renpy/renpy-deps/blo ... ld.sh#L407

Re: Adventures in video decoders: the quest for VP9a

Posted: Tue May 26, 2020 12:05 am
by Imperf3kt
uyjulian wrote: Mon May 25, 2020 11:43 pm Ren'Py compiles ffmpeg without hardware acceleration support. See https://github.com/renpy/renpy-deps/blo ... ld.sh#L407
I was under the impression that it did/required it, like mentioned here by pytom.
viewtopic.php?f=32&t=41914#p439376

Re: Adventures in video decoders: the quest for VP9a

Posted: Tue May 26, 2020 12:39 am
by uyjulian
Imperf3kt wrote: Tue May 26, 2020 12:05 am
uyjulian wrote: Mon May 25, 2020 11:43 pm Ren'Py compiles ffmpeg without hardware acceleration support. See https://github.com/renpy/renpy-deps/blo ... ld.sh#L407
I was under the impression that it did/required it, like mentioned here by pytom.
viewtopic.php?f=32&t=41914#p439376
I am referring to video decoding hardware acceleration, not graphics hardware acceleration.

Re: Adventures in video decoders: the quest for VP9a

Posted: Tue May 26, 2020 7:10 am
by itoril
Okay. This is an interesting direction the discussion has gone in. If Ren'Py (or libav, or SDL, or whatever point in the stack) doesn't use hardware acceleration to decode video, that explains a lot. Am I right in assuming that this points me towards using ffmpeg with a Python wrapper instead of trying to hack Ren'Py into doing something it fundamentally doesn't want to do? If that option is more realistic, it at least helps me narrow things down.

Clearly I don't understand the relationship between software accelerated and hardware accelerated portions of a program, and their interplay, because when I run some tests on my oddly built render nodes, things don't seem to add up to me:

When the node is displaying text and one background, it gets about 30FPS. Acceptable for a system that is very very not designed to run video games! It doesn't even have drivers installed for its display adapter (because I only use it for CPU rendering, most of the time). When it's displaying alpha masked video, it drops to 4FPS. Now, my workstation, which has the GPU (because I model and animate through hardware accelerated viewports), gets 60FPS when it's not showing any video (better, as we would expect, hence evidence that the graphics are indeed hardware accelerated). As I mentioned before, the workstation runs the video at 15-20FPS in Ren'Py. The question this presents is, if Ren'Py doesn't use hardware acceleration to decode video, why does a 2.2GHz core beat a 3.1GHz core by a factor of about 4-5 (15-20FPS against 4FPS)? Unless I'm underestimating the strain the rest of the program puts on the node. The RAM on the node is faster than the workstation's too - 1.3GHz DDR3 vs 800MHz DDR2.

I ran the video in Media Player Classic on my render node again, and after complaining about not being able to initialise hardware acceleration, it plays the video at 60FPS! Does this mean anything? Maybe it doesn't. Again, feeling out of my depth.

Re: Adventures in video decoders: the quest for VP9a

Posted: Tue May 26, 2020 11:07 am
by uyjulian
Probably the following is happening:
1. Ren'Py uses an old version of ffmpeg (3.0)
2. Ren'Py is optimized for SSE2 systems, so it may not use the full instruction set of your processor optimally