Spiky Caterpillar wrote:Over a decade ago, I needed a 3D accelerator to get acceptable performance playing Quake 2 on old hardware. Today, any machine that cannot get acceptable performance playing a visual novel in software mode is too slow for normal use.
You wouldn't think so. But look at the numbers for a modern game, running on slightly older - but still reasonable - hardware. (Hardware that is strictly better than a netbook connected to a TV. Hardware that's roughly comparable to my laptop.)
Our CPU runs at 2 gigahertz, slightly slower than the laptop I use. Our game is 720p - 1280x720. It opens with a pan from a starry sky to a moonlit beach, one that takes up the full screen. We want to run this at the full refresh rate of the display, so it looks smooth, say 60 Hz. (Think of this as Moonlight Walks HD.)
So we have 2,000,000,000 clock cycles per second. At 60 fps, that gives us 33,333,333 cycles per frame. In a 1280*720 image there are 921,600 pixels. That leaves 36 clock cycles per pixel - assuming all we're doing is moving that single image around the screen. I don't know what the instructions per clock (IPC) figure for this code is - there are a lot of math operations, which are fast, but there's also a lot of memory access, to big images that may not stay in the cache. My guess is that it's probably pretty close to an IPC of 1. On an Atom, it's likely much worse.
We can play with these numbers a bit. If we consider 24 frames per second acceptable, we get a luxurious 90 cycles per pixel. If we want 1080p@60Hz, we get 16 cycles per pixel. And so on. Halve that if you show two screens' worth of images - and realize that this assumes we spend all the time drawing screens, and none of it on figuring out what to display, dealing with user input, and everything else Ren'Py does.
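For anyone who wants to plug in their own numbers, here's a rough sketch of that arithmetic in Python. The function name cycles_per_pixel and its parameters are made up for this post, not anything in Ren'Py, and it assumes (as above) that the CPU does nothing but draw.

```python
def cycles_per_pixel(clock_hz, width, height, fps, screens=1.0):
    """Clock cycles available per drawn pixel.

    Assumes the whole CPU budget goes to drawing: no game logic,
    no input handling, nothing else. `screens` is how many
    full-screen images are drawn each frame.
    """
    cycles_per_frame = clock_hz / fps
    pixels_per_frame = width * height * screens
    return cycles_per_frame / pixels_per_frame


CLOCK_HZ = 2_000_000_000  # the 2 GHz CPU from the example above

# (label, width, height, fps, screens drawn per frame)
scenarios = [
    ("720p @ 60 fps", 1280, 720, 60, 1),
    ("720p @ 24 fps", 1280, 720, 24, 1),
    ("1080p @ 60 fps", 1920, 1080, 60, 1),
    ("720p @ 60 fps, 2 screens", 1280, 720, 60, 2),
]

for label, width, height, fps, screens in scenarios:
    budget = cycles_per_pixel(CLOCK_HZ, width, height, fps, screens)
    print(f"{label}: {budget:.1f} cycles per pixel")
```

The four scenarios come out to roughly 36, 90, 16, and 18 cycles per pixel, matching the figures above. Multiply by whatever IPC you believe in to get an instruction budget.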
Hardware acceleration really helps here - it gives us a lot of computation - nice, capable, massively parallel computation with optimized ways of dealing with RAM. It gives clean, capable, and relatively high-level APIs for dealing with this functionality. And since it's running on a second unit, the CPU gets it all for free.
Adding functionality to the old software renderer is very difficult. The MMX path uses all of the MMX registers available - storing any more data would require me to rewrite the thing to use SSE. The C path is less complex, but much slower. The current code is fairly complex - there isn't time to use conditionals, so it required a bit of math to get right, and it took a while before all the bugs were worked out.
Compared to this, programming in OpenGL is a dream. Functionality can be added quickly, and you don't have to worry about things like a pointer leaving the memory associated with a texture. You don't have to worry about the complexity of parallel processing - Ren'Py's software renderer never did, but it would have to in order to get any faster. Dealing with shapes other than rectangles is easy. The hardware is more capable, and has more breathing room - applying a color matrix to each pixel drawn shouldn't be problematic.
IMO, the move towards accelerated-only is detrimental to Ren'Py.
In theory, everyone should have working 3D accelerators by now. In practice, almost everyone has a graphics accelerator in their computer - but a LOT of us have completely screwed-up drivers for one reason or another (open-source purists; lazy folks; people with old computers; people with computers so new that they need to download an old version of the drivers to play our games; people who don't understand how to install drivers; people who are afraid of drivers).
I'm trying to keep the requirements for Ren'Py fairly standard - right now, they should be the same as for running WebGL in Chrome or Firefox. I realize there are some people who can't run something this basic - I've been kind of playing for time, hoping that as time goes by, this number will drop and drop. At some point, I have to say "Your computer is too old/broken for Ren'Py." That's always been the case.
At some point, I need to decide - with community input - when to move past a generation of computers. I tried to gather more objective information, but people didn't like the idea of Ren'Py reporting back on them. Understandable, but right now I'm having to guess as to what I can require, and wait for feedback from creators. (It's only recently that it became obvious to me that there was a problem with 6.13.7 compatibility.)
I see the progressively larger screen size of games as calling for more hardware acceleration.
The one thing that might help would be to integrate Offscreen Mesa support with Ren'Py. That would let me write OpenGL code and have it rendered in software. I'm not sure how acceptable the performance would be, though, or how much size it would add to the Ren'Py distribution.