Bug 5702 - QVM execution crashes when compiled with -fstack-protector-all
Status: RESOLVED FIXED
Alias: None
Product: ioquake3
Classification: Unclassified
Component: Misc
Version: GIT MASTER
Hardware: All All
Importance: P3 normal
Assignee: Zachary J. Slater
QA Contact: ioquake3 bugzilla mailing list
Reported: 2012-06-30 21:27 EDT by symlink
Modified: 2012-08-08 05:37:25 EDT

Attachments
do not use fast call hacks (688 bytes, patch)
2012-07-02 17:16 EDT, /dev/humancontroller

Description symlink 2012-06-30 21:27:32 EDT
When ioq3ded and ioquake3 are compiled with -fstack-protector in CFLAGS, the executables crash while loading QVMs:

    Loading vm file vm/qagame.qvm...
    File "vm/qagame.qvm" found in "/usr/share/games/quake3/baseq3/pak8.pk3"
    VM file qagame compiled to 1823488 bytes of code
    qagame loaded in 3270528 bytes on the hunk
    ********************
    ERROR: program tried to execute code outside VM
    ********************
    ----- Server Shutdown (Server crashed: program tried to execute code outside VM) -----
    recursive error after: program tried to execute code outside VM


This compile option is enabled by default on Gentoo Hardened and similar setups.

I guess there is no easy way to selectively disable SSP for the function(s) which cause trouble. As such it might be a good idea to add -fno-stack-protector and a comment to the default CFLAGS?
Comment 1 Thilo Schulz 2012-07-01 10:31:42 EDT
I'm sorry, but I cannot reproduce this.

I'm using this one: gcc version 4.4.3 (Gentoo 4.4.3-r2 p1.2)

Please note that the error message you are getting is triggered when something goes wrong inside VM execution. So something is being corrupted while the VMs are running.

If you want me to debug this, you can give me a shell on your computer where I can test this and look at what's happening. However, at this time, this looks like a compiler bug to me.
Comment 2 /dev/humancontroller 2012-07-02 17:16:04 EDT
Created attachment 3245
do not use fast call hacks

(instant patch for symlink)

on some platforms, fast call hacks are used in ioquake3 to yield some minor performance gains. in r2282, i marked Clang/LLVM as a platform where these hacks should not be used. this is required because of how Clang/LLVM-generated code works by default. the hacks still work with GCC (on some platforms) by default, but not when the stack protector feature is enabled.

btw, i suspect that the gains are imperceptible anyway. note: they are, for example, turned off in Tremulous (an ioquake3-based mod) at all times.
Comment 3 symlink 2012-07-07 12:29:59 EDT
Even if I apply /dev/hc's patch to an otherwise unmodified checkout of SVN r2299 and compile with default (Gentoo Hardened's) settings, I get the same crash. Appending -fno-stack-protector to CFLAGS for the very same unmodified checkout runs fine.


Thilo, it might not be just that single flag which is causing trouble. While I'm hesitant to provide you a shell, here's some more info:

  $ emerge --info | head -n1
  Portage 2.1.10.65 (hardened/linux/amd64, gcc-4.5.3, glibc-2.14.1-r3, 3.3.8-gentoo x86_64)
  $ gcc-config -c
  x86_64-pc-linux-gnu-4.5.3

Instead of adding -fno-stack-protector to CFLAGS, you can also select the *-hardenednossp GCC profile and compile/emerge (see http://www.gentoo.org/proj/en/hardened/hardenedfaq.xml#hardenedcflags). Both methods result in a working executable.
Doing the reverse and using GCC *-vanilla with -fstack-protector-all also results in the same crash (with and without /dev/hc's patch).
Comment 4 symlink 2012-07-07 12:34:07 EDT
I've just noticed that I've written "-fstack-protector" in my original post. It should be "-fstack-protector-all".
Please excuse this major slip-up; I've already corrected the bug title.
Comment 5 Thilo Schulz 2012-07-07 17:29:40 EDT
Yes. I can reproduce the bug now. And I know the reason why this is happening.
The DoSyscall() function is the main entry point for calls from inside the VM to the outside. To work around different calling conventions, I added some inline ASM that retrieves the VM's arguments from the right registers immediately at function start.

This assumes that the compiler doesn't add boilerplate that clobbers those registers before the inline ASM runs. And exactly this is happening with the stack protector. It uses RAX/EAX as a temp register to write the stack canary to the stack and then zeroes it:

   0x000000000051a4ca <+12>:    mov    %fs:0x28,%rax
   0x000000000051a4d3 <+21>:    mov    %rax,-0x18(%rbp)
   0x000000000051a4d7 <+25>:    xor    %eax,%eax

However, EAX is used to specify the syscall number, so that number is already overwritten by the time the inline ASM reads it. That's why it won't work.
This could be fixed if I stored these variables in memory instead of just registers. I could also get rid of a whole lot of platform specific code. So I probably am gonna do this some time in the future.
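
A minimal sketch of the "store these variables in memory" idea, not the code that later went into the tree. It assumes the generated VM code writes the syscall number and its arguments into a struct before calling out, so the C function never depends on what is left in EAX after the compiler's prologue. All names here (vmCallState_t, DoSyscallFromMemory, ExampleHandler, MAX_SYSCALL_ARGS) are hypothetical, and the layout with the syscall number in slot 0 of the parameter array is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is not the code from r2305. */
#define MAX_SYSCALL_ARGS 15

/* State that the generated VM code would write to memory before it
 * leaves the VM, instead of passing it in registers. */
typedef struct {
    intptr_t syscallNum;              /* which host service is requested */
    intptr_t args[MAX_SYSCALL_ARGS];  /* arguments copied from the VM */
} vmCallState_t;

/* Host-side handler; parms[0] is assumed to hold the syscall number. */
typedef intptr_t (*syscallHandler_t)(const intptr_t *parms);

/* Every input arrives through a pointer, so it does not matter which
 * registers the compiler's prologue (for example the stack protector
 * canary setup) uses before the function body starts. */
static intptr_t DoSyscallFromMemory(const vmCallState_t *state,
                                    syscallHandler_t handler)
{
    intptr_t parms[MAX_SYSCALL_ARGS + 1];
    size_t i;

    assert(state != NULL && handler != NULL);

    parms[0] = state->syscallNum;
    for (i = 0; i < MAX_SYSCALL_ARGS; i++) {
        parms[i + 1] = state->args[i];
    }
    return handler(parms);
}

/* Toy handler for illustration: syscall 1 adds its first two arguments,
 * anything else is reported as an error with -1. */
static intptr_t ExampleHandler(const intptr_t *parms)
{
    switch (parms[0]) {
    case 1:
        return parms[1] + parms[2];
    default:
        return -1;
    }
}
```

The cost is a few extra memory writes per call out of the VM. In exchange the entry point becomes ordinary C that behaves the same with or without -fstack-protector-all and no longer needs per-platform register handling.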
Comment 6 Thilo Schulz 2012-08-08 05:37:25 EDT
(In reply to comment #5)
> This could be fixed if I stored these variables in memory instead of just
> registers. I could also get rid of a whole lot of platform specific code. So I
> probably am gonna do this some time in the future.

This is done now in r2305. Have fun!