I'm not sure why linear filtering was being applied when I was
rendering at a 1:1 pixel ratio, but it was. This fixes it by forcing
nearest-neighbour filtering. The artefacts were caused by the linear
filter blending in pixels from outside the specified texture
coordinates, creating lines around everything.
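
To illustrate the bleeding described above, here is a minimal CPU-side
sketch of the two filters on a single row of texels. It is not the
renderer's code: SampleNearest, SampleLinear and ClampIndex are
hypothetical names, and it assumes the sample point lands slightly off
a texel centre, which is only one way the lines could have appeared.
In real OpenGL code, forcing nearest-neighbour would mean setting
GL_TEXTURE_MIN_FILTER and GL_TEXTURE_MAG_FILTER to GL_NEAREST with
glTexParameteri.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Clamp a texel index to the valid range (similar in spirit to GL_CLAMP_TO_EDGE).
static long ClampIndex(long i, long count)
{
    return std::max(0L, std::min(i, count - 1));
}

// Nearest-neighbour: return the single texel that contains coordinate u (0..1).
float SampleNearest(const std::vector<float> &texels, float u)
{
    assert(!texels.empty());
    const long count = (long)texels.size();
    const long i = (long)std::floor(u * count);
    return texels[ClampIndex(i, count)];
}

// Linear: blend the two texels whose centres surround coordinate u.
// Texel centres sit at (i + 0.5) / count, hence the 0.5 offset.
float SampleLinear(const std::vector<float> &texels, float u)
{
    assert(!texels.empty());
    const long count = (long)texels.size();
    const float x = u * count - 0.5f;
    const float base = std::floor(x);
    const float t = x - base; // Blend weight of the right-hand texel
    const float a = texels[ClampIndex((long)base, count)];
    const float b = texels[ClampIndex((long)base + 1, count)];
    return a * (1.0f - t) + b * t;
}
```

With a row of {0, 0, 1, 1}, where the first two texels belong to one
sprite and the last two to its neighbour, sampling a quarter-texel
short of the boundary stays inside the first sprite with
SampleNearest, but SampleLinear already mixes in the neighbour.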
Fun fact: the framebuffer technique CSE2 uses is demanding on the Pi
(1278x720 runs at 60 FPS when the framebuffer is forced to 852x480,
even though all the internal rendering is still 1278x720). I guess
rendering those extra 920160 pixels really takes its toll.
Apparently 2 VBOs weren't enough. This bumped the framerate from 13 FPS
to 20 FPS in a stress-test (CSE2E at 1704x960 on a Raspberry Pi 3B
in X11 with the KMS OpenGL driver).
This should reduce stalling when the OpenGL driver is still
processing the buffer when we're about to upload to it.
Hopefully, this is what was making the OpenGL ES 2.0 renderer so much
slower than the SDLTexture renderer on the Raspberry Pi 3B (SDL uses
*8* buffers). Unfortunately, I don't have access to it right now, so
I can't test this.
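
The buffer rotation described above can be sketched roughly as
follows. This is an illustration, not the renderer's code: BufferRing
and BufferHandle are hypothetical names, and in real OpenGL code the
handles would be GLuint names from glGenBuffers, with each upload
going to whichever buffer Next() returns.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical handle type standing in for an OpenGL buffer name (GLuint).
using BufferHandle = unsigned int;

// Hands out buffers in round-robin order, so that the buffer being
// uploaded to is the one the driver used longest ago.
class BufferRing
{
public:
    explicit BufferRing(std::vector<BufferHandle> handles)
        : handles_(std::move(handles))
    {
        assert(!handles_.empty());
    }

    // Return the buffer to upload into next, then advance the ring.
    BufferHandle Next()
    {
        const BufferHandle handle = handles_[index_];
        index_ = (index_ + 1) % handles_.size();
        return handle;
    }

private:
    std::vector<BufferHandle> handles_;
    std::size_t index_ = 0;
};
```

The more buffers in the ring, the longer the driver has to finish
with one before it comes around again, at the cost of extra memory.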
Now the SDLSurface backend survives window resizes (also triggered by
alt-tabbing while in fullscreen), and the SDLTexture backend properly
regenerates its textures after a fullscreen alt-tab in DirectX mode.
When DirectX-SDL2 loses its device, it doesn't lose its textures,
just their contents, so we shouldn't remake the textures when we
regenerate the glyphs (that's coming next commit).
This way, I can use immediate mode, which is way faster than using
buffers for some reason. Since I'm not using profiles anymore, I
dropped the minimum requirement to OpenGL 3.1. If a driver doesn't
support Legacy GL, then it can use the slow buffer code.
But seriously, I need to figure out why using buffers is so slow.
If this were a common problem, Modern OpenGL wouldn't have made it the
only option.
For some reason CPU usage is still double that of the SDLTexture
backend (SDL2 uses OpenGL 2.1, with glEnable/glDisable-style
immediate mode).
If I downgrade to OpenGL 2.1, and use VBO-less glDrawArrays, I get
great performance. I just wish I knew what the AMD driver is doing
that's so much faster.
Turns out performance is absolutely abysmal on my laptop's copy of
Windows 10 (AMD A9 APU).
This is only one of the weird bottlenecks: glFramebufferTexture2D
is a CPU sinkhole, so don't call it often.
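
One way to avoid calling it often is to remember the last attached
texture and skip redundant calls. The sketch below assumes that
approach; FramebufferTargetCache and TextureHandle are hypothetical
names, and the callback stands in for the real call, which would be
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
GL_TEXTURE_2D, texture, 0).

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical handle type standing in for an OpenGL texture name (GLuint).
using TextureHandle = unsigned int;

// Remembers the currently attached texture and only invokes the
// (expensive) attach callback when the target actually changes.
class FramebufferTargetCache
{
public:
    explicit FramebufferTargetCache(std::function<void(TextureHandle)> attach)
        : attach_(std::move(attach))
    {
        assert(attach_); // The callback must be callable
    }

    void SetTarget(TextureHandle texture)
    {
        if (has_target_ && texture == current_)
            return; // Already attached, so skip the driver call

        attach_(texture);
        current_ = texture;
        has_target_ = true;
    }

private:
    std::function<void(TextureHandle)> attach_;
    TextureHandle current_ = 0;
    bool has_target_ = false;
};
```

Drawing many sprites to the same surface in a row then costs one
attach instead of one per draw.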