DXX-Rebirth Forum

Full Version: [DXXR 0.58.1] segfaults on the RPi
Today, I tried to make some .debs of the current Rebirth release for Raspbian and did some test-playing with the resulting binaries. Unfortunately, I experienced some random segfaults, like this one, which happened when I had just been killed and the death roll was supposed to start:
Program received signal SIGSEGV, Segmentation fault.
0x00047b6c in _piggy_page_in (bmp=...) at main/piggy.h:104
104         if ( GameBitmaps[(bmp).index].bm_flags & BM_FLAG_PAGED_OUT ) {
(gdb) bt
#0  0x00047b6c in _piggy_page_in (bmp=...) at main/piggy.h:104
#1  update_cockpits () at main/gamerend.c:411
#2  0x00047d94 in game_render_frame_mono (flip=<optimized out>) at main/gamerend.c:358
#3  0x0004260c in game_handler (wind=<optimized out>, event=<optimized out>, data=<optimized out>)
    at main/game.c:982
#4  0x00018898 in window_send_event (wind=<optimized out>, event=<optimized out>) at arch/sdl/window.c:211
#5  0x000162e4 in event_process () at arch/sdl/event.c:165
#6  0x0005d0f4 in main (argc=<optimized out>, argv=<optimized out>) at main/inferno.c:437
(gdb) p bmp
$1 = {index = 16508}
(gdb) ptype GameBitmap
type = struct _grs_bitmap {
    short int bm_x;
    short int bm_y;
    short int bm_w;
    short int bm_h;
    sbyte bm_type;
    sbyte bm_flags;
    short int bm_rowsize;
    unsigned char *bm_data;
    short unsigned int bm_handle;
    ubyte avg_color;
    fix avg_color_rgb[3];
    sbyte unused;
    struct _grs_bitmap *bm_parent;
    struct _ogl_texture *gltexture;
} [1800]
(gdb) up
#1  update_cockpits () at main/gamerend.c:411
411             PIGGY_PAGE_IN(cockpit_bitmap[PlayerCfg.CockpitMode[1]]);
(gdb) p PlayerCfg.CockpitMode
$10 = {0, 4}
(gdb) p cockpit_bitmap
$11 = {{index = 61}, {index = 62}, {index = 63}, {index = 0}}

Currently, I have no idea how this cockpit mode 4 came into being (the PlayerCfg structure itself seems OK), or whether this is specific to the RPi code paths or the different architecture just exposes some other, more general bug. (Also note that unlike the default release builds, which use -O2, I use -Os for my builds, which I found preferable, on the RPi at least.) All I currently know is that these crashes are not easy to reproduce. I will investigate this further during the next days...
Well, cockpit_mode 4 is the Letterbox view. So that value itself isn't unexpected in this case.
However it should not load a bitmap for this one. I'll check out how this came to be.
This appears to be using Descent 1 source, right?  Your title is ambiguous.  Could you identify exactly which commit you used to build this?

Mode 4 is CM_LETTERBOX, which sounds plausible if I remember the death animation correctly. GameBitmaps[] is only [1800], so accessing [16508] is definitely a bug. It looks like this is just fundamentally broken, since there is no defined cockpit_bitmap for letterbox mode. Try not to go into letterbox mode. ;)
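The gdb session above shows cockpit_bitmap with four entries (indices 0 to 3) and PlayerCfg.CockpitMode[1] == 4, so the lookup reads one element past the end of the array and hands a garbage index (16508) to the GameBitmaps[] access. A minimal sketch of a guarded lookup follows. All names here (cockpit_bitmap_for_mode, N_COCKPIT_BITMAPS, MAX_BITMAP_FILES, the bitmap_index layout) are stand-ins chosen for illustration from the values printed by gdb; this is not the fix that went into the game.

```c
#include <stddef.h>

/* Hypothetical stand-ins; sizes are taken from the gdb output in this thread. */
#define MAX_BITMAP_FILES  1800  /* GameBitmaps[] was printed as [1800] */
#define N_COCKPIT_BITMAPS 4     /* cockpit_bitmap[] was printed with 4 entries */
#define CM_LETTERBOX      4     /* the mode that triggered the crash */

typedef struct { unsigned short index; } bitmap_index;

bitmap_index cockpit_bitmap[N_COCKPIT_BITMAPS] = {{61}, {62}, {63}, {0}};

/* Stores the cockpit bitmap for `mode` in *out and returns 1, or returns 0
 * (leaving *out untouched) when the mode has no cockpit bitmap or the stored
 * index would fall outside the bitmap table. */
int cockpit_bitmap_for_mode(int mode, bitmap_index *out)
{
    if (out == NULL)
        return 0;
    if (mode < 0 || mode >= N_COCKPIT_BITMAPS)
        return 0;  /* e.g. CM_LETTERBOX: nothing to page in */
    if (cockpit_bitmap[mode].index >= MAX_BITMAP_FILES)
        return 0;  /* never index past the bitmap table */
    *out = cockpit_bitmap[mode];
    return 1;
}
```

A caller would page in the bitmap only when the function returns 1, and skip the cockpit draw for letterbox mode otherwise.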
Yes, this is the d1x source, official 0.58.1 tarball. Interestingly, there seems to be at least one other problem, since the first crash occurred at a time when the letterbox cockpit mode was definitely not involved (it happened when I killed a hulk, but I have no idea if that fact has anything to do with it). But I was unable to reproduce that crash with a debugger attached (and a build with -g) so far; I just ended up with this cockpit mode issue. I should in general enable core dumps when doing tests...
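For crashes that do not reproduce under a debugger, core dumps can also be enabled from inside the process, which has the same effect as running `ulimit -c` in the launching shell. The sketch below uses the POSIX getrlimit()/setrlimit() calls; the helper name enable_core_dumps is made up for this example and is not part of the d1x source.

```c
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit so that a crash
 * can leave a core file behind. Returns 0 on success, -1 on failure
 * (errno is set by getrlimit/setrlimit). */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;  /* an unprivileged process may go up to the hard limit */
    return setrlimit(RLIMIT_CORE, &rl);
}
```

If the hard limit itself is 0, this call succeeds but still produces no core file; the limit then has to be raised outside the process.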
It needs to be checked whether the same call applies when going into Fullscreen mode, causing some strange behaviours as well.

I found it most comfortable to usually play with Valgrind running in the background, as long as I am on my main PC with a CPU capable of doing that with enough FPS.
(08-10-2013, 06:57 PM) zico wrote: @derhass:
I found it most comfortable to usually play with Valgrind running in the background, as long as I am on my main PC with a CPU capable of doing that with enough FPS.

Hehe, yes. On the RPi, I end up with ~1fps when doing so. Could be an interesting test case for the homer code, though ;)
Yeah, that may be a problem. But I honestly started to love it, as it also tends to show the things that don't crash the game immediately but make you wonder 20 minutes later why the bots suddenly stop moving and start singing Cara Mia.

I really wonder why this bug never triggered before. I may very well have introduced it myself when I made the cockpits redraw for every frame (to render properly in OpenGL).
Speaking of running d1x-rebirth with valgrind: on the RPi, I get lots of uninitialized-value reports for code accessing ogl_texture_list[...].handle, like this one:
==7616== Conditional jump or move depends on uninitialised value(s)
==7616==    at 0xBD36C: ogl_get_free_texture (ogl.c:250)
==7616==    by 0xED2F: ogl_init_font (font.c:601)
==7616==    by 0xFC3F: gr_init_font (font.c:1070)
==7616==    by 0x44E7F: gamefont_loadfont (gamefont.c:91)
==7616==    by 0x45183: gamefont_choose_game_font (gamefont.c:137)
However, ogl_init_texture_list_internal() is called, so everything should be initialized here. There also seems to be no out-of-bounds access into this array, especially not from that ogl_get_free_texture() call. I really don't know what to make of these, as there seems to be nothing wrong here...

You'll find the complete valgrind log here: http://www-user.tu-chemnitz.de/~heinm/tmp/valgrind.txt There are also some issues with some of the VideoCore/GLES libs, but those are probably false positives. Also note that the valgrind port for the RPi is quite new, maybe there is a problem with the tool itself.
Actually, I had issues very similar to this before as well. They ended without making too much sense. But like here, I also got several problems coming from the nvidia kernel module. Checking back with my eeePC running with an Intel chip, the issues disappeared. By now nVidia seems to have fixed the most prominent issues, tho.

So yeah, it may be hard to tell, but if you cannot find the issue in the OpenGL(ES) code, it probably originates from one level below. I had Valgrind running on the 0.58.1 code (not the unification branch, tho) and I can't say I noticed these myself.
Maybe valgrind just misses the update of the texture handle with glGenTextures(), but what confuses me: even if it does, the value was first initialized to zero outside of any libs. Running with --track-origins=yes just gives me
==7663==  Uninitialised value was created by a heap allocation
==7663==    at 0x4835978: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
which is not really helpful after all...
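One way the "initialized to zero first" observation and the memcheck report can both hold: memcheck tracks definedness alongside the data and copies it on every store, so a later write from memory it considers undefined replaces the earlier, defined zero. The sketch below only illustrates that mechanism. fake_gen_texture is a made-up stand-in for a driver call such as glGenTextures(), and nothing here is a claim about what the VideoCore libraries actually do.

```c
#include <stdlib.h>
#include <string.h>

typedef struct { unsigned int handle; } texture_slot;

/* Hypothetical stand-in for a driver entry point: it produces the texture
 * name by copying it out of a heap scratch buffer. When init_scratch is 0
 * the buffer is never written, so memcheck treats the copied bytes as
 * undefined even though *name held a defined value before the call.
 * Returns 0 on success, -1 if the allocation fails. */
int fake_gen_texture(unsigned int *name, int init_scratch)
{
    unsigned int *scratch = malloc(sizeof *scratch);

    if (scratch == NULL)
        return -1;
    if (init_scratch)
        *scratch = 7;  /* arbitrary defined texture name */
    memcpy(name, scratch, sizeof *name);  /* definedness travels with the bytes */
    free(scratch);
    return 0;
}

/* Mirrors the kind of check that gets flagged: a branch on the handle. */
int slot_is_free(const texture_slot *slot)
{
    return slot->handle == 0;
}
```

Running slot_is_free() under memcheck after fake_gen_texture(&slot.handle, 0) would be expected to produce a "Conditional jump or move depends on uninitialised value(s)" report whose origin, with --track-origins=yes, is the malloc() inside the stand-in rather than the place where the slot was zeroed.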