Bug 4977 - x86 qvm improvements
Status: RESOLVED FIXED
Alias: None
Product: ioquake3
Classification: Unclassified
Component: Platform
Version: GIT MASTER
Hardware: PC other
: P3 enhancement
Assignee: Zachary J. Slater
QA Contact: ioquake3 bugzilla mailing list
URL:
Depends on:
Blocks:
 
Reported: 2011-05-05 10:29 EDT by Eugene C.
Modified: 2011-05-22 12:58:44 EDT
1 user (show)

See Also:


Attachments
Patch (1.92 KB, patch)
2011-05-05 10:29 EDT, Eugene C.
Patch (4.24 KB, patch)
2011-05-05 10:44 EDT, Eugene C.
x86 qvm optimizations (4.43 KB, patch)
2011-05-05 10:49 EDT, Eugene C.
x86 qvm jump target checks (4.15 KB, patch)
2011-05-07 06:58 EDT, Eugene C.
x86 qvm jump target checks (3.96 KB, patch)
2011-05-07 07:11 EDT, Eugene C.
x86 qvm improvements (8.15 KB, patch)
2011-05-07 15:02 EDT, Eugene C.
x86 qvm improvements (7.46 KB, patch)
2011-05-10 03:35 EDT, Eugene C.
x86 qvm improvements (9.20 KB, patch)
2011-05-11 05:08 EDT, Eugene C.
x86 qvm improvements with jump checks (9.24 KB, patch)
2011-05-11 11:49 EDT, Thilo Schulz
x86 qvm improvements with jump checks (25.60 KB, patch)
2011-05-13 06:10 EDT, Eugene C.

Description Eugene C. 2011-05-05 10:29:47 EDT
Created attachment 2691 [details]
Patch

In x86 jit compiler there is variable callMask which is just not used properly

Suggested patch adds call-out-of-qvm protection which prevents code execution out of the compiled segment. Also it will help to avoid crashes that may be caused by such code:

int (*func)(void);
func = (void*)0x07400000; // in qvm it is actually an instruction count not address
func();

Similar call in actual code segment will probably cause crash but its still better than nothing IMO

Also, more or less modern x86 CPUs have branch predictors so execution of added cmp/jae instruction costs almost nothing
Comment 1 Eugene C. 2011-05-05 10:44:57 EDT
Created attachment 2692 [details]
Patch

Some additional jit-compiler optimizations
Comment 2 Eugene C. 2011-05-05 10:49:55 EDT
Created attachment 2693 [details]
x86 qvm optimizations

Fixed second patch
Comment 3 Thilo Schulz 2011-05-05 11:51:50 EDT
There's no real jump protection in place anyways right now for x86, ppc and sparc.

Have a look at:
https://bugzilla.icculus.org/show_bug.cgi?id=4249
and have a look at Ludwig's jump target checks in vm_x86_64.c

Do you feel like implementing this for x86 as well?
Comment 4 Eugene C. 2011-05-06 06:53:40 EDT
Looks like that but without data access checks, will try to rework patch and implement jump checks too (it may decrease qvm performance I think)
Comment 5 Thilo Schulz 2011-05-06 07:57:14 EDT
(In reply to comment #4)
> Looks like that but without data access checks, will try to rework patch and
> implement jump checks too (it may decrease qvm performance I think)

There are data access checks, see the RANGECHECK macro. Don't worry about QVM performance. The point of QVMs is that you cannot execute arbitrary code. Without these checks you can. Besides, even with these jump checks in place, it will still be much faster than using the interpreter.
Comment 6 Eugene C. 2011-05-07 06:58:54 EDT
Created attachment 2695 [details]
x86 qvm jump target checks

> There are data access checks, see the RANGECHECK macro.

I mean my work :)

First patch updated, added drop message and jump target checks with some optimization
Comment 7 Eugene C. 2011-05-07 07:11:02 EDT
Created attachment 2696 [details]
x86 qvm jump target checks

Fixed patch
Comment 8 Eugene C. 2011-05-07 15:02:13 EDT
Created attachment 2697 [details]
x86 qvm improvements

Merged 2 patches in one

Also added check for jump labels in const-optimizations so generated code now should be more correct than before - i.e. it will not perform optimizations if next opcode is a jump label
Comment 9 Thilo Schulz 2011-05-07 18:15:24 EDT
(In reply to comment #8)
> Created attachment 2697 [details]
> x86 qvm improvements
> 
> Merged 2 patches in one
> 
> Also added check for jump labels in const-optimizations so generated code now
> should be more correct than before - i.e. it will not perform optimizations if
> next opcode is a jump label

Nice, thank you. I'm sorry to ask you for another favor .. you can still write data to locations outside the VM by popping the stack lots of times till it's outside and then writing to the corrupted stack. Feel like implementing checks for stack operations as well? :)
Comment 10 Thilo Schulz 2011-05-07 20:00:39 EDT
> Feel like implementing checks
> for stack operations as well? :)

You dont really need to have checks, just do a data rangecheck, i.e. bit-AND the stack pointer with the vm->dataMask every time the stack is modified
Comment 11 Eugene C. 2011-05-10 03:35:50 EDT
Created attachment 2701 [details]
x86 qvm improvements

Added more correct and effective const jump optimization

As I can see stack pointer can be modified even often than actual data reads/writes so its better to put checks for load/store/arg instructions IMO

Also found that optimizer is very sensible to jused[] array - my fixed version already rejected few load/store const-optimizations (because of jump labels) that was possible with untouched code 

So I'm thinking - maybe its better to disable optimizations completely for vq3-qvms? VM_MAGIC_VER2 sure fixing that issue but it
- it doesn't help existing mods (where source code is often unavailable)
- it breaks (iirc) Q3 SDK license so mod developers must release sources (which may unacceptable due to many reasons)
- modern CPUs is fast enough to handle that :)
Comment 12 Thilo Schulz 2011-05-10 05:49:43 EDT
(In reply to comment #11)
> As I can see stack pointer can be modified even often than actual data
> reads/writes so its better to put checks for load/store/arg instructions IMO

Look at my code in vm_x86_64.c that I added to revision 1969, specifically the macros STACK_PUSH() and STACK_POP() and OPSTACKCHECK()
The opStack is modified every time you execute push, pop, etc.
And every time the opStack is pushed/popped we need to check whether the stack pointer (edi in your case) is still pointing to an address inside the VM.

Here's a VM that writes to memory locations where it shouldn't:

code
CNSTP4 805306368
CNSTP4 0
JUMPV

make it into a QVM with:
$ q3asm -o ui.qvm segfault.asm

Copy ui.qvm to baseq3/vm/
and segfault

> Also found that optimizer is very sensible to jused[] array - my fixed version
> already rejected few load/store const-optimizations (because of jump labels)
> that was possible with untouched code 

Does this new patch fix these issues, and can we expect old mods will still work with your code?

> So I'm thinking - maybe its better to disable optimizations completely for
> vq3-qvms? VM_MAGIC_VER2 sure fixing that issue but it

If you want to do that, then what's the point of your patches?

> - modern CPUs is fast enough to handle that :)

Yes, there's not much effect on framerate, because the heavy-duty stuff happens in the renderer, not the gamecode. Even with the additional checks every time the opStack is increased/decreased there's hardly any impact on the FPS for the vm_x86_64.c
Comment 13 Eugene C. 2011-05-11 05:08:44 EDT
Created attachment 2703 [details]
x86 qvm improvements

Fixed some bugs, added const-call optimization

I'm not sure about adding/changing opstack/edi stuff because of EmitCommand/etc. which may require more than just range checks

>Does this new patch fix these issues, and can we expect old mods will still work with your code?

For MAGIC_V2 it fixes all unsafe places and utilizes all optimizations, but not for VQ3 so I'm disabled const-optimizations for that case, old stable mods will work fine
Comment 14 Thilo Schulz 2011-05-11 11:49:41 EDT
Created attachment 2704 [details]
x86 qvm improvements with jump checks

(In reply to comment #13)
> I'm not sure about adding/changing opstack/edi stuff because of
> EmitCommand/etc. which may require more than just range checks

We do need this, however. If you won't do this I can take a look at this in the next couple of days. Just tell me whether you're going to do it or not because I don't want to invent the wheel twice.

> For MAGIC_V2 it fixes all unsafe places and utilizes all optimizations, but not
> for VQ3 so I'm disabled const-optimizations for that case, old stable mods will
> work fine

Right. I've tested your changes and they appear to be sound. I still had to make a little modification to the ErrJump routine, though, or it will still crash, because it resumes execution in the VM after Com_Error and Sys_Error were recursively entered.
Comment 15 Eugene C. 2011-05-11 14:42:49 EDT
(In reply to comment #14)

> I can take a look at this 

Ok, do that, also please fix comment line 

>a chance go get false-positive

"go" -> "to" :)
Comment 16 Eugene C. 2011-05-13 06:10:48 EDT
Created attachment 2706 [details]
x86 qvm improvements with jump checks

Reworked patch, moved const-optimizations in separate function, added more optimizations (const-conditions, some fp hack)

Also found that EmitMOV*EDI() optimizations is also unsafe in case of jump labels because register may contain value calculated in some other branch
Comment 17 Thilo Schulz 2011-05-15 09:24:48 EDT
Applied your patch r1985
I got rid of much of the ugly MSVC/GCC inline assembly code which did one and the same thing anyways.
I'm wondering about a few things though.

In the constant optimization function:

        case OP_CALL:
                if ( NextConstant4() < 0 )
                        break;

why not allow syscalls to be optimized in the same way intra-VM-Calls are?

Next thing I'm wondering why we need the jump tables built into the VM. Can't we just build the jused array in the first pass?
Comment 18 Eugene C. 2011-05-15 15:46:22 EDT
>why not allow syscalls to be optimized in the same way intra-VM-Calls are?

It requires more complex asm code so I'm just skipped it for first time

>Next thing I'm wondering why we need the jump tables built into the VM. Can't
we just build the jused array in the first pass?

"switch" statements may generate jump tables (stored in data segment), so jump address can be calculated during runtime execution only - i.e. not all jump labels can be referenced(->detected) during compilation passes
Comment 19 Thilo Schulz 2011-05-16 07:23:19 EDT
(In reply to comment #18)
> >why not allow syscalls to be optimized in the same way intra-VM-Calls are?
> 
> It requires more complex asm code so I'm just skipped it for first time

Well, I think I fixed that requirement with my latest SVN commit on vm_x86.c, specifically EmitCall(). Check it out and tell me whether I can remove the check

> >Next thing I'm wondering why we need the jump tables built into the VM. Can't
> we just build the jused array in the first pass?
> 
> "switch" statements may generate jump tables (stored in data segment), so jump
> address can be calculated during runtime execution only - i.e. not all jump
> labels can be referenced(->detected) during compilation passes

What do you need these jump labels for then, anyways?
You certainly cannot rely on them as a security feature, not even in VM_MAGIC_VER2. Is it solely for the optimization pass?
Comment 20 Eugene C. 2011-05-17 15:50:34 EDT
(In reply to comment #19)
> Well, I think I fixed that requirement with my latest SVN commit on vm_x86.c,
> specifically EmitCall(). Check it out and tell me whether I can remove the
> check

Looks like correct

> What do you need these jump labels for then, anyways?
> You certainly cannot rely on them as a security feature, not even in
> VM_MAGIC_VER2. Is it solely for the optimization pass?

Jump labels just sets barriers for optimizer, for example:

ADDRGP4 $1
JUMPV

ADDRGP4 $1
LABELV $2
JUMPV

In first case we can merge instructions and create const jump - it will be legal and safe. While in second case we can't because JUMPV is a jump label i.e. JUMPV can be called in any time from any code location (means - different stack/register values). 

So potentially, all optimizations without filled jtr segment is unsafe and may cause opstack corruption or unexpected register values
Comment 21 Thilo Schulz 2011-05-19 19:58:21 EDT
Thank you for your explanations. FYI: I have a patch ready with huge changes to vm_x86.c, which does opStack protection at no additional cost. I recommend holding off working on these until i commit them
Anyways, closing this now.
Comment 22 Thilo Schulz 2011-05-19 21:18:44 EDT
I remembere another question now.

Why is jusedSize = header->instructionCount + 2? Can't we jump to an invalid label if we allow this?
Comment 23 Eugene C. 2011-05-22 12:58:44 EDT
>Why is jusedSize = header->instructionCount + 2?

I'm not sure about that, maybe size was increased for safer lookahead (just like in case of code buffer) but corresponding range check is definitely wrong