This is an incredibly trivial design mistake that has haunted Quake 3 listen servers forever. Some licensees have fixed it (Call of Duty has, and some others, even early ones).
Symptoms (for a listen server where the client framerate is high compared with the server's, for example com_maxfps 90 and sv_fps 20; this affects both the server's own client and LAN clients unless sv_lanForceRate is 0; it applies to g_synchronousClients 0, and it doesn't happen with bots since they are always synchronous):
1. Demos are very big, and they play back jerkily (everything, including the POV, moves at 20fps)
2. Clients see each other moving jerkily (they appear to move at 20fps)
3. Network traffic is very high (same cause as 1)
I don't really know a lot about the engine internals, but long before the source was released I had a gut feeling about what was happening.
On the localhost client and LAN clients (if sv_lanForceRate is 1), the game forces a snapshot to be sent every client frame (at com_maxfps). However, the server only runs at sv_fps. The result is that several identical snapshots are sent, which causes the demo size and traffic problems.
On top of that, the client gets confused and decides that, since it's getting 90 snapshots per second, it doesn't have to interpolate. So it uses all the snapshots, but since consecutive ones are identical (no movement happened between them), things appear to move at sv_fps frames per second.
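To make the numbers concrete, here is a tiny self-contained simulation of the com_maxfps 90 / sv_fps 20 case (my own illustration, not engine code): it just counts how many of the per-client-frame snapshots carry a world state that hasn't changed since the previous one.

/* Toy simulation of a listen server that sends a snapshot on every
 * client frame even though the world only advances on server frames. */
#include <stdio.h>

int main(void) {
    const int clientMsec = 1000 / 90;   /* ~11 ms per client frame */
    const int serverMsec = 1000 / 20;   /* 50 ms per server frame  */
    int lastServerFrame = -1, sent = 0, duplicates = 0;

    for (int t = 0; t < 1000; t += clientMsec) {   /* simulate one second */
        int serverFrame = t / serverMsec;          /* frame the world state is at */
        sent++;
        if (serverFrame == lastServerFrame)
            duplicates++;                          /* identical world state resent */
        lastServerFrame = serverFrame;
    }
    printf("sent %d snapshots, %d of them duplicates\n", sent, duplicates);
    return 0;
}

With these numbers that is 91 snapshots per second, of which 71 are exact duplicates, which lines up with the demo size and traffic symptoms above.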
The trivial attached patch fixes all of these issues. Perhaps somebody with more experience can look at the issue more closely. What worries me about the patch is harmonics: consider this sequence of client frames:
[1] 2 3 [4] 5 6 [7] 8 9
[Bracketed] frames are frames where the server runs. (In this case the client would be running at 3 times the speed of the server, for example 60fps against the server's 20fps.) What I don't know for sure is whether the snapshots are sent on frames 1, 4 and 7. Perhaps they are sent on frames 2, 5 and 8, or on 3, 6 and 9, which is still correct but suboptimal since it introduces lag. I haven't verified this; perhaps it's not a problem.
Note that when I say com_maxfps, I assume the machine is fast enough to actually run at that speed - you get the idea. It's just a shorter way of saying "client FPS".
I've been thinking about it, and I think the best solution would be to make sure snapshots are only sent on the frames where the server runs.
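A minimal sketch of that idea in C (my own illustration, not the attached patch; localClient_t, lastSentServerTime and ShouldSendSnapshot are made-up names, while client_t, SV_SendClientSnapshot and svs.time are the real engine names I have in mind):

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-client bookkeeping; in the engine something like this
 * would live in client_t and be checked before SV_SendClientSnapshot. */
typedef struct {
    int lastSentServerTime;   /* server time of the last snapshot we sent */
} localClient_t;

/* Only allow a send when the server has actually advanced since the last
 * snapshot, i.e. only on client frames where a server frame also ran. */
static bool ShouldSendSnapshot(localClient_t *cl, int serverTime) {
    if (serverTime <= cl->lastSentServerTime)
        return false;                 /* no new server frame: would be a duplicate */
    cl->lastSentServerTime = serverTime;
    return true;
}

int main(void) {
    localClient_t cl = { 0 };
    /* Server time as seen on successive client frames at ~90fps with sv_fps 20:
     * it only changes every 50 ms, so only some client frames produce a send. */
    int serverTimes[] = { 50, 50, 50, 50, 50, 100, 100, 100, 100, 150 };
    for (int i = 0; i < 10; i++)
        if (ShouldSendSnapshot(&cl, serverTimes[i]))
            printf("send snapshot of server time %d on client frame %d\n",
                   serverTimes[i], i + 1);
    return 0;
}

With a guard like that in place, the localhost client gets exactly sv_fps snapshots per second no matter how high com_maxfps is.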
If I'm not mistaken, on dedicated servers "snaps" means "wait at least xxx ms before sending another snapshot" (where xxx is 1000/snaps). So if sv_fps is 20 and snaps is 15 (or any value between 10 and 19), then 10 snapshots go out every second (one every other server frame).
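A quick check of that arithmetic (illustration only; rounding the 1000/snaps interval up to whole server frames is my assumption about how the throttle interacts with the 50 ms server frame):

#include <stdio.h>

int main(void) {
    int sv_fps = 20, snaps = 15;
    int serverFrameMsec = 1000 / sv_fps;          /* 50 ms */
    int minSnapshotMsec = 1000 / snaps;           /* 66 ms */

    /* Round the client's minimum interval up to a whole number of server frames. */
    int framesBetween = (minSnapshotMsec + serverFrameMsec - 1) / serverFrameMsec;
    int effectiveRate = sv_fps / framesBetween;

    printf("snapshot every %d server frame(s) -> %d snapshots/sec\n",
           framesBetween, effectiveRate);         /* every 2 frames -> 10/sec */
    return 0;
}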
So, for example, using the same notation as above:
[1] 2 3 [4] 5 6 [7] 8 [9] [10] 11 [12] 13 14 [15]
Note that the client framerate is changing here (a slowdown, whatever): assuming sv_fps is 20, in frames 1-4 the client runs at 60fps and in frames 9-10 at 20fps - you get the idea. For the localhost client, LAN clients, and remote clients with a high enough rate and snaps >= sv_fps, you send snapshots exactly on the bracketed frames.
For a client who has snaps 15 (with the server at sv_fps 20), you want to send snapshots exactly on frames 1, 7, 10 and 15, or on 4, 9 and 12 (both are equally good). Similar logic applies to rate-throttled clients.
Finally, there's the case where the client framerate falls below sv_fps. I don't know what happens there. If fewer server frames are run (exactly one per client frame), then you can continue doing the same thing. If more than one server frame is run per client frame, you have to make sure only the last snapshot is sent (it doesn't make sense to send two back to back with no time in between - you only care about the most recent one).
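Here is a toy, self-contained scheduler that covers all of these cases (my own sketch, not the attached patch; svs.time, sv_fps and snaps are the only real engine names, everything else is made up). Sends are scheduled off the server clock and throttled by max(server frame length, 1000/snaps), so they always land on server frames, and when several server frames run during one slow client frame only the latest state goes out, once.

#include <stdio.h>

int main(void) {
    const int sv_fps = 20;
    const int serverMsec = 1000 / sv_fps;   /* 50 ms per server frame */
    const int snaps = 15;
    int snapMsec = 1000 / snaps;            /* client-requested minimum interval */
    if (snapMsec < serverMsec)
        snapMsec = serverMsec;              /* can't send more often than the server runs */

    int serverTime = 0;                     /* stands in for svs.time */
    int nextSnapshotTime = 0;               /* earliest allowed send, in server time */

    /* Client frame lengths in ms: ~60fps at first, then a slowdown below sv_fps. */
    const int clientFrames[] = { 16, 17, 17, 16, 17, 17, 50, 50, 120, 16, 17 };
    int clientTime = 0;

    for (int i = 0; i < (int)(sizeof clientFrames / sizeof clientFrames[0]); i++) {
        clientTime += clientFrames[i];

        /* Run as many server frames as have become due (possibly several). */
        while (serverTime + serverMsec <= clientTime)
            serverTime += serverMsec;

        /* Send at most one snapshot, only if a new server frame exists
         * and the throttle interval has elapsed. */
        if (serverTime > 0 && serverTime >= nextSnapshotTime) {
            printf("client frame %2d: snapshot of server time %4d\n", i + 1, serverTime);
            nextSnapshotTime = serverTime + snapMsec;
        }
    }
    return 0;
}

Because nextSnapshotTime is always derived from the server clock, the send times can never drift out of phase with the server frames, which addresses the harmonics worry from earlier.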
(With "client framerate" I always mean "FPS of the machine thet runs the listen server")
I believe this is as close as you can get to a dedicated server while using a listen server. It would be better to be multithreaded and run the server at exactly sv_fps in its own asynchronous high-priority thread. Many people just start two processes instead, running a dedicated server alongside their client.
Observant readers will have noticed that a poor man's fix is to set sv_fps as high as the client framerate. That doesn't fix the demos and the network traffic being huge, though - and it's just evil.
Comment 3 Zachary J. Slater
2006-08-02 12:46:52 EDT
This all stinks of breaking network compatibility. If it does, we aren't changing it.
OK, I've done a lot of testing with a build modified to print when server frames happen and when snapshots are sent, and both of my concerns were unfounded: this works correctly all of the time.
To answer some of my own questions: harmonics will never happen, since I use svs.time to calculate the next send time, so it's always in perfect sync with the server's ticks.
If the client framerate is lower than the server framerate, multiple server ticks are run per client frame, but only one snapshot is sent at the end. This is correct.
The resulting binary is fully compatible with id's 1.32, both at the network level and at the demo level.
Setting a QA contact on all ioquake3 bugs, even resolved ones. Sorry if you get a flood of email from this; it should only happen once. Apologies for the inconvenience.
--ryan.
Created attachment 1005: Said patch