Bug 3616 - recursive error after client command overflow
Status: RESOLVED FIXED
Alias: None
Product: ioquake3
Classification: Unclassified
Component: Misc
Version: GIT MASTER
Hardware: All All
: P3 minor
Assignee: Zachary J. Slater
QA Contact: ioquake3 bugzilla mailing list
URL:
Depends on:
Blocks:
 
Reported: 2008-05-01 21:38 EDT by /dev/humancontroller
Modified: 2009-10-11 14:35:37 EDT
1 user (show)

See Also:



Description /dev/humancontroller 2008-05-01 21:38:27 EDT
(Similarly to bug 3585 ...)
There are cases when the client's reliable command buffer is overfilled. The client handles the situation by disconnecting from the server. Which means sending another reliable command ("disconnect"), which results in another error call, etc.

Technically speaking:
CL_AddReliableCommand()
 Com_Error() // overflow, anything could be behind this
  CL_Disconnect() // send "disconnect"
   CL_AddReliableCommand() // already overflown
    Com_Error()
     Sys_Error()
      CL_Shutdown()
       CL_Disconnect()
        CL_AddReliableCommand() // ah, not again!
         Com_Error()
          Sys_Error()
           CL_Shutdown() // recursive shutdown, return
           Sys_Exit()
Comment 1 Thilo Schulz 2008-05-02 01:29:35 EDT
Have you checked against most recent version of SVN?
I think I fixed it in rev 1333 three days ago already
Comment 2 /dev/humancontroller 2008-05-04 10:55:55 EDT
(Forget that. These bugs are not related.)


Our reliable command buffer keeps filling up when we add commands for transmission, and it's emptied (to an extent) when we receive acknowledge. But what if want to add one, and our buffer is full (we still haven't received acknowledge to those commands)? We would have to overwrite/drop old commands. Doing so however does not guarantee that every command will get to the server, and thus we wouldn't be calling it a "reliable" command. So when adding a reliable command, checks are done (an actual overflow does not happen). If we're about to overwrite an old command, the connection is considered defective, and must be dropped.

Such cases arise from the client trying to send commands too fast for the server to answer to them in time. With higher ping times it's easier, but even with low ping, bursting reliable commands can overfill the buffer. (Or if the server does not respond at all, the client can slowly overflow.)

To reproduce, do this:
- /bind v "a;a;a;a;a;a;a;a;a;a;a;a;a;a;a;a;a;a;a;a;a;a;a;a;a"
- connect to a server with bad ping (first make a server with packet delay)
- quickly press v a few times

Let's say that our buffer is just full, and a CL_AddReliableCommand is done. We execute Com_Error( ERR_DROP, "Client command overflow" ), and there, call CL_Disconnect.
The assumption that CL_Disconnect makes is that it can gracefully disconnect from the server, and therefore does a CL_AddReliableCommand("disconnect"). But our buffer is ALREADY full, and we're trying to overfill it the second time.
If we wouldn't be checking for recursive errors in Com_Error, then it would be an infinite cycle of CL_AddReliableCommand, Com_Error, and CL_Disconnect. But we do check. Since com_errorEntered got set at the first Com_Error call, we call a Sys_Error.
Sys_Error's first thing is to do CL_Shutdown, which calls CL_Disconnect. Again, it would be another infinite cycle of Com_Error, Sys_Error, CL_Shutdown, and CL_Disconnect. But CL_Shutdown also checks for recursive errors, and does nothing but returns if called recursively.
So we're back at our second Sys_Error call, and we exit the program.

So how to diagnose the overflow?
Borrow the overflow check from CL_AddReliableCommand, and don't send "disconnect" if we're full? The connection will hang then.
Overwrite the last command with "disconnect"? Command scripts are sent in packs (like small programs), and may have rather undesired effects when half-sent.
So formulate a new packet with just one "disconnect" command? Or keep track of how many reliable commands we have summed up in the last frame, and remove ones that were added in the current frame that caused the overflow?
What if we're just at the buffer limit, and CL_Disconnecting because of something else? Send an additional "disconnect" from outside the buffer?
Comment 3 Thilo Schulz 2009-10-11 14:35:37 EDT
Fixed in svn rev. 1655 and 1656

I made it always leave a slot free for a possibly needed "disconnect" cmd in the end. IMHO, this is the only way of ensuring the disconnect command is properly received and executed on the server.