abstract
When calling "free" on a string within a struct, I get a segfault.
I'm adding support for the Lua scripting language into an existing game engine (the CRX engine which powers Alien Arena.)
Just to explain, "Con" is short for "Console," since every script has its own console. You know the console you use in Quake-based games with the tilde key? That's the type of console I'm referring to. I have various reasons for not calling my variables MAX_SCRIPTS, firstFreeScript, and numScripts. Mostly these reasons are to do with future plans-- for now, when you look at the code, "con" and "script" are interchange-able. Might make it easier to understand my code.
lua_State is defined in the Lua scripting language's development headers. As is lua_close. (lua_interrupt is a custom function but not relevant. That works fine.)
I have defined a struct which contains the virtual machine state and some metadata relevant to the script. Here it is taken directly from the file without any changes at all. (You can tell I've changed nothing because of all the comments which aren't relevant to what I'm asking you... )
Code:
//===========================================================
//lua-related stuff which needs to go here
#define MAX_SCRIPT_SIZE 8196 //probably higher than needed
typedef struct {
char con_outBuf[CON_ALTERNATE_OUTPUT_SIZE];
lua_State *luaVM;
int id; //unique id for which script this is. Starts at 1 not 0.
char scriptName[1024]; //will probably eventually get rid of. Don't actually need it for anything.
char scriptText[MAX_SCRIPT_SIZE]; //will probably eventually get rid of. Don't actually need it for anything.
int publicMsgItemsReadSoFar; //compare to publicMsgOutputBufCount to see how many items to read
} lua_script_package_t;
/*these two arrays point to the same scripts. It's just that, when a script is deallocated,
in scripts_defragged, that gap is filled by moving the last script off the end into that gap and
decrimenting numCons. This makes for-loops accross all scripts efficient (happens every frame)
while allowing scripts to keep their unique IDs. */
lua_script_package_t *scripts[MAX_CONS]; //for referring to a script by its unique ID
lua_script_package_t *scripts_defragged[MAX_CONS]; //for simply going through all scripts fast and efficient, but in no special order
int firstFreeCon;
int numCons;
To explain further, each script has its own unique ID which is used to refer to it and control it. When I have six scripts, and I remove say the third one, I do not want the unique scipt IDs of any other scripts to change. This leaves the "scripts" array looking like this:
Code:
[ *lua_script_package_t, *lua_script_package_t, NULL, *lua_script_package_t, *lua_script_package_t, *lua_script_package_t]
However, for efficiency's sake (this is a 3D game, remember?) I do not want to loop through every single script and see which ones are free! So I created another array, scripts_defragged. These arrays both point to the same scripts. When you add a new script, you add it to both. Whenever a script is removed, it is removed from both. However, when you remove a script, if it's not the last script in scripts_defragged, then the last script is taken off the end of scripts_defragged and used to fill that hole. That way, you can just loop across that array instead if you want to go through all scripts in no particular order but at the best speed.
Here you see, I have written a console command (same console as before) for removing a script. Cmd_Argv and Cmd_Argc are provided within the game engine for when you're writing a console command, they do exactly what they sound like.
Code:
/*
===========
Cmd_lua_rm_f
Given a lua script ID, unload the script from memory
and 'defrag' scripts_defragged if there is a 'hole' left in it
===========
*/
//cleanly delete all pointers in the script except the actual luaVM
//we can't delete the luaVM using "free" because it's an incomplete type
//instead we call lua_close on a pointer to it and then assign that pointer the value NULL
void freeScript (lua_script_package_t *scriptToFree){
free ((*scriptToFree).scriptName);
free ((*scriptToFree).scriptText);
free ((*scriptToFree).con_outBuf);
}
void Cmd_lua_rm_f (void){
int script;
int curScript_C;
if (Cmd_Argc () != 2){
Com_Printf ("lua_unload <script ID> : Tell the script to clean up after itself and then remove the script from memory.\n");
return;
}
script = atoi(Cmd_Argv(1))-1; //the script ID starts at one not zero (zero is reserved for the displayed-on-the-screen console)
//so we subtract one to get the real index in the array
lua_interrupt (&(*scripts[script]).luaVM, "kill");
lua_close ((*scripts[script]).luaVM);
(*scripts[script]).luaVM = NULL;
freeScript (scripts[script]);
script++; //the script ID starts at one not zero (zero is reserved for the displayed-on-the-screen console)
//so we increment to get the true script ID of lua_script_package_t.id
for (curScript_C = 0; curScript_C < numCons; curScript_C++) {
if ((*scripts_defragged[curScript_C]).id == script){
scripts_defragged[curScript_C] = scripts_defragged[numCons];
scripts_defragged[numCons] = NULL;
numCons--;
return;
}
}
if ((*scripts_defragged[numCons]).id == script){
scripts_defragged[numCons] = NULL; //no need to defrag anything, we can just delete the last one.
numCons--;
}
}
And here is where the actual error happens, in void freeScript. Each of the "free" commands in that function cause a segfault (i.e. shuffle the order around and its always the first one.) This code compiles okay but never works.
It always happens the first time the function gets called! Which makes me think it's not some kind of heap corruption, that tends to only happen after a long time. (This function is called only about 10 seconds after the program starts.)
Here is a backtrace I made with GDB and the libc6 debugging symbols installed.
Code:
*** glibc detected *** /home/maxtothemax/alienarena/crded: free(): invalid pointer: 0x0869ecb8 ***
======= Backtrace: =========
/lib/i686/cmov/libc.so.6[0xb7e848f4]
/lib/i686/cmov/libc.so.6(cfree+0x96)[0xb7e86896]
/home/maxtothemax/alienarena/crded[0x804b1ec]
/home/maxtothemax/alienarena/crded[0x8049b11]
======= Memory map: ========
08048000-0806e000 r-xp 00000000 08:03 4751386 /home/maxtothemax/alienarena/crded
0806e000-08070000 rw-p 00026000 08:03 4751386 /home/maxtothemax/alienarena/crded
08070000-086d5000 rw-p 00000000 00:00 0 [heap]
b7700000-b7721000 rw-p 00000000 00:00 0
b7721000-b7800000 ---p 00000000 00:00 0
b7824000-b784e000 r-xp 00000000 08:02 597045 /lib/libgcc_s.so.1
b784e000-b784f000 rw-p 00029000 08:02 597045 /lib/libgcc_s.so.1
b7868000-b7ac6000 rw-p 00000000 00:00 0
b7ac6000-b7b3c000 r-xp 00000000 08:03 4923403 /home/maxtothemax/alienarena/arena/game.so
b7b3c000-b7b3f000 rw-p 00076000 08:03 4923403 /home/maxtothemax/alienarena/arena/game.so
b7b3f000-b7e16000 rw-p 00000000 00:00 0
b7e16000-b7f6e000 r-xp 00000000 08:02 605621 /lib/i686/cmov/libc-2.9.so
b7f6e000-b7f6f000 ---p 00158000 08:02 605621 /lib/i686/cmov/libc-2.9.so
b7f6f000-b7f71000 r--p 00158000 08:02 605621 /lib/i686/cmov/libc-2.9.so
b7f71000-b7f72000 rw-p 0015a000 08:02 605621 /lib/i686/cmov/libc-2.9.so
b7f72000-b7f75000 rw-p 00000000 00:00 0
b7f75000-b7f99000 r-xp 00000000 08:02 605625 /lib/i686/cmov/libm-2.9.so
b7f99000-b7f9a000 r--p 00023000 08:02 605625 /lib/i686/cmov/libm-2.9.so
b7f9a000-b7f9b000 rw-p 00024000 08:02 605625 /lib/i686/cmov/libm-2.9.so
b7f9b000-b7f9c000 rw-p 00000000 00:00 0
b7f9c000-b7f9e000 r-xp 00000000 08:02 605641 /lib/i686/cmov/libdl-2.9.so
b7f9e000-b7f9f000 r--p 00001000 08:02 605641 /lib/i686/cmov/libdl-2.9.so
b7f9f000-b7fa0000 rw-p 00002000 08:02 605641 /lib/i686/cmov/libdl-2.9.so
b7fa0000-b7fc5000 r-xp 00000000 08:02 36310 /usr/lib/liblua5.1.so.0.0.0
b7fc5000-b7fc6000 rw-p 00025000 08:02 36310 /usr/lib/liblua5.1.so.0.0.0
b7fdd000-b7fe1000 rw-p 00000000 00:00 0
b7fe1000-b7fe2000 r-xp 00000000 00:00 0 [vdso]
b7fe2000-b7ffe000 r-xp 00000000 08:02 596951 /lib/ld-2.9.so
b7ffe000-b7fff000 r--p 0001b000 08:02 596951 /lib/ld-2.9.so
b7fff000-b8000000 rw-p 0001c000 08:02 596951 /lib/ld-2.9.so
bffbf000-c0000000 rw-p 00000000 00:00 0 [stack]
Program received signal SIGABRT, Aborted.
0xb7fe1424 in __kernel_vsyscall ()
(gdb) bt
#0 0xb7fe1424 in __kernel_vsyscall ()
#1 0xb7e413d0 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2 0xb7e44a85 in *__GI_abort () at abort.c:88
#3 0xb7e7a2ed in __libc_message (do_abort=2, fmt=0xb7f57328 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:173
#4 0xb7e848f4 in malloc_printerr (action=2, str=0xb7f54204 "free(): invalid pointer", ptr=0x869ecb8) at malloc.c:5994
#5 0xb7e86896 in *__GI___libc_free (mem=0x869ecb8) at malloc.c:3625
#6 0x0804b1ec in freeScript (scriptToFree=0x869acb0) at qcommon/cmd.c:682
#7 0x0804b2a7 in Cmd_lua_rm_f () at qcommon/cmd.c:699
#8 0x0804c4a1 in Cmd_ExecuteString (text=0xbfffebe0 "lua_unload 1") at qcommon/cmd.c:1239
#9 0x0804a580 in Cbuf_Execute () at qcommon/cmd.c:242
#10 0x0804a1dc in lua_frameVM (luaVM_local=0x869ecb0) at qcommon/lua.c:187
#11 0x0804b1cb in Cmd_lua_frame_f () at qcommon/cmd.c:665
#12 0x0804c4a1 in Cmd_ExecuteString (text=0xbffff070 "lua_frame 1 ") at qcommon/cmd.c:1239
#13 0x0804a580 in Cbuf_Execute () at qcommon/cmd.c:242
#14 0x0805307e in Qcommon_Frame (msec=1) at qcommon/common.c:1764
#15 0x08063525 in main (argc=1, argv=0xbffff594) at unix/sys_unix.c:334
(gdb)
This is on Debian Sid if it makes any difference. I'm not going for standards compliance here-- whatever compiles on GCC will do fine.