Bug 4962 - Protocol version 69 + legacy support
Status: RESOLVED FIXED
Alias: None
Product: ioquake3
Classification: Unclassified
Component: Misc
Version: GIT MASTER
Hardware: PC All
: P3 normal
Assignee: Zachary J. Slater
QA Contact: ioquake3 bugzilla mailing list
URL:
: 4961
Depends on:
Blocks:
 
Reported: 2011-04-27 16:15 EDT by Thilo Schulz
Modified: 2011-07-12 08:01:28 EDT
3 users (show)

See Also:


Attachments
Protocol 69 (39.51 KB, patch)
2011-04-27 16:15 EDT, Thilo Schulz
Fixes and changes without new protocol. (25.19 KB, patch)
2011-04-27 18:32 EDT, Zack Middleton
Add new protocol to fixes and changes. (17.28 KB, patch)
2011-04-27 18:38 EDT, Zack Middleton
Protocol 69 - New patch that fixes a few quirks (39.67 KB, patch)
2011-05-03 09:54 EDT, Thilo Schulz
Protocol 69 - Patch v3, better protocol version negotiation (42.14 KB, patch)
2011-05-04 10:16 EDT, Thilo Schulz
Protocol 69 - Patch v4, updated for rev 2012 (40.36 KB, patch)
2011-06-06 16:52 EDT, Thilo Schulz
Protocol 69 - Patch v5, updated for rev 2060 (40.93 KB, patch)
2011-06-25 12:56 EDT, Thilo Schulz
Protocol 69 - Patch v6 (rev 2070), implement discussed changes, add demo menu display support for both legacy and new protocol demos (38.55 KB, patch)
2011-07-07 21:44 EDT, Thilo Schulz

Description Thilo Schulz 2011-04-27 16:15:02 EDT
The Quake3 protocol has several weak points. In the past, when id was still having official releases, they would simply increase the protocol version, have a new point release and have their stuff fixed.
Indeed, we did not have that possibility until now because we were concerned with legacy support which is valid. On the other hand, there are the interests of standalone mod developers who will either need to abide by our restrictions of staying compatible and live with bugs, or either fix these bugs themselves. Why should we not fix this if we can?

I have prepared an initial patch which adds the new protocol with number 69, while legacy support for protocol 68 is still available. Both servers and clients are fully compatible. Servers should appear in the master server lists, too.

The support for the legacy protocol is completely disabled for standalone builds per default. For cases like openarena the cvar "oldprotocol" can be used to change the protocol version number that is recognized as legacy.

I have tested this kind of protocol separation for my EliteForce port and it works very well. I'd like to have this included as this fixes a long-standing vulnerability in the UDP connection code reported by Luigi Auriemma, and it will help me with the VoIP bugs that are still open on this bugtracker.
Comment 1 Thilo Schulz 2011-04-27 16:15:58 EDT
Created attachment 2667 [details]
Protocol 69
Comment 2 Tim Angus 2011-04-27 16:38:51 EDT
I'm lazy:

<Timbo> so is the main point only to fix http://aluigi.altervista.org/papers/q3noclient.txt ?
<Thilo> not only
<Timbo> but the main reason?
<Thilo> one of the majore reasons
<Timbo> because really that seems like a non-bug
<Thilo> the other reason is to get rid of the VoIP hacks
<Thilo> it's not a non-bug
<Timbo> it's only really an issue when you know the client address
<Timbo> which is almost never
<Thilo> skype, teamspeak...
<Thilo> forums
<Thilo> LAN connections
<Timbo> feh
<Timbo> tenuous
<Timbo> I mean I don't see it being used any where
<Timbo> the changes are waaaay too extensive for just addressing that
<Thilo> i told you
<Thilo> it's not only that
<Timbo> the #ifdef stuff is horrid
<Timbo> #ifdefs are a last resort
<Timbo> every time you add one you double your test cases
<Thilo> well, you can make your points on the bugtracker, i have to go now ;)(
<Timbo> and your failure modes
<Timbo> ok
<Thilo> night everyone :)
<Undeference> also being able to disconnect everyone on your network is kinda neat
<Timbo> haha
<Timbo> also, supporting two protocols just sounds like pain
<Lakitu7> "surely someone on my campus is playing quake 3 right now"
<Timbo> and that's ignoring the #ifdeffery
<Timbo> it really seems more like a patches page type thing to me that something we want in trunk
<Thilo> well i can get rid of the ifdeffery
<Timbo> it's massively disruptive to anyone using ioq3 as an engine
<Lakitu7> changes to "if" :]
<Timbo> the STANDALONE stuff was bad enough
<-- Henkan (~Henkan@h70n2fls31o1105.telia.com) has quit (Quit: Lämnar)
<Thilo> thats because your personal dislike of ifdefs
<Timbo> it's not a personal dislike
<Thilo> but i for one am not content to be stuck with a version number that id can change on a whim
<Undeference> it seems like a bad idea for a server to respond on a different address, since there's a good chance the client will never see the response
<Timbo> it's engineering sense
<Thilo> look at the hacks we're doing to get around the old protocol
<Thilo> seriously
<Thilo> reading past message end?
<Thilo> that's just as ugly
<Timbo> but it works, and isn't nearly as disruptive
<Thilo> disruptive?
<Thilo> you mean during an svn merge?
<Timbo> yeah, and future maintenance wise
<Thilo> the two voip bug reports would benefit from this new protocol
<Thilo> the changed code looks more than it really is
<Thilo> most of the patch size is the changed readme
<Timbo> alright, but this doesn't fix those things
<Thilo> we fix a bug that is exploitable and can disconnect any client on a whim
<Timbo> so if you did, that means another protocol version?
<Timbo> so we support three protocols then?
<Thilo> no
<Thilo> we only need to maintain the one ID legacy protocol
<Thilo> that was our stated goal
<Thilo> if we increase the version number we need not support the other protocol numbers
<Thilo> users using ioq3 can upgrade to newest version
<Thilo> those stuck with idq3 can still play together with ioq3 just minus our features
<Undeference> problem comes in with different distros maintaining different versions with incompatible protocols. not much of a problem as long as they can figure it out and fallback to the old protocol
<Timbo> so if an arbitrary ioq3 version connects, and it doesn't speak the new protocol, what happens?
<Thilo> Timbo: falls back to idq3
<Timbo> ok
<Timbo> I still hate the idea of multiple protocols :)
<Thilo> well alright, let's see how this plays out ;)
Comment 3 Tim Angus 2011-04-27 16:39:26 EDT
*** Bug 4961 has been marked as a duplicate of this bug. ***
Comment 4 Zack Middleton 2011-04-27 18:32:50 EDT
Created attachment 2670 [details]
Fixes and changes without new protocol.

Fixes and changes from Thilo's patch which do not split network protocol. Could be used/applied without new protocol.
Comment 5 Zack Middleton 2011-04-27 18:38:01 EDT
Created attachment 2671 [details]
Add new protocol to fixes and changes.

Adds onto the fixes and changes patch and contains the rest of Thilo's patch which adds a new protocol while keeping support for older protocols.
Comment 6 Thilo Schulz 2011-04-28 03:35:05 EDT
Thanks ZTurtleMan!

You really drive my point: the first patch without the new protocol should be applied anyways, as it already does a little part in the hardening against udp spoofing and denial of service attacks.

When coming to the new patch: the new protocol is actually not that much code. If you hate the #ifdeffery, that can be thrown out. The idea behind this was that standalone games need not run through the checks to support an old legacy protocol, but they don't really lose much if they still do them.

If not applying this patch, there will still be need for a "compat" field in the client/servers structures, as I intend to apply a few fixes for VoIP that would make voip incompatible.

Thus with this new protocol we kill two birds with one stone:
- Effectively secure ioq3 against UDP spoofing (if the attacker cannot listen in on your connection of course).
- Have a simple unique mechanism to discriminate ioquake3 clients/servers with different versions of VoIP or other features that might be added later

Different versions of ioquake3 can still cooperate by falling back to the original ioquake3 protocol.
All of what I wrote pretty much happens already anyways. In effect, there are two different protocols running side by side: VoIP enabled and disabled.

In terms of maintainability, I believe my solution is acceptable, and makes ioquake3 fit for the future while not forgetting id's original quake3.
Comment 7 Simon McVittie 2011-04-29 11:37:42 EDT
I like the idea of parallel support for an enhanced ioquake3 protocol, and protocol 68 (aka the idquake3 protocol).

I'm not sure that this patchset does quite the right thing for OpenArena: compatibility with historical OA versions would require that (with appropriate cvars set) protocol "71" in demos/on the network is interpreted as equivalent to the real protocol 68, so having the old/new protocol split be disabled for STANDALONE games isn't really desirable? (It doesn't directly matter to me, because Debian's ioquake3 engine build isn't STANDALONE anyway, but OA's official binaries are.)

If you're making an incompatible network protocol anyway, would it be feasible to have a separate indication of "content version" or "base game version" or something? Obviously OA's "protocol 71" has to stay (in protocol-68 mode) for compatibility, but it'd be great if when using Thilo's enhanced protocol, standalone games like OA could get the same easy indication of "no, this server is not compatible with you" without abusing the protocol field in this way. Even better if there's enough information that the error message can be something like "the server's base game is not compatible with yours (server requires OpenArena 0.8, your base game is OpenArena 0.9)".

If the engine supports exactly two network protocols (protocol 68, and an enhanced protocol) at any one time, will this cause problems for demos? For instance, if we have:

ioquake3 1.37: supports protocol 68 and Thilo's protocol 69
ioquake3 1.42: supports protocol 68 and an even newer protocol 70

would ioquake3 1.42 still have a sane way to play back protocol-69 demos?
Comment 8 Thilo Schulz 2011-04-29 11:55:41 EDT
(In reply to comment #7)
> I'm not sure that this patchset does quite the right thing for OpenArena:
> compatibility with historical OA versions would require that (with appropriate
> cvars set) protocol "71" in demos/on the network is interpreted as equivalent
> to the real protocol 68, so having the old/new protocol split be disabled for
> STANDALONE games isn't really desirable? (It doesn't directly matter to me,
> because Debian's ioquake3 engine build isn't STANDALONE anyway, but OA's
> official binaries are.)

If OA's current protocol is 71, You can set the cvar "protocol" to "72" and "oldprotocol" to 71. It will work fine with STANDALONE oa then. In either case, negotiation of which protocol to use happens transparently from the user.

> If the engine supports exactly two network protocols (protocol 68, and an
> enhanced protocol) at any one time, will this cause problems for demos? For
> instance, if we have:
> 
> ioquake3 1.37: supports protocol 68 and Thilo's protocol 69
> ioquake3 1.42: supports protocol 68 and an even newer protocol 70
> 
> would ioquake3 1.42 still have a sane way to play back protocol-69 demos?

No. I do not intend to support any other legacy protocol than idq3's original protocol. What will be supported are protocol 68 demos and the most current protocol used by ioq3.
Comment 9 Thilo Schulz 2011-05-03 09:54:49 EDT
Created attachment 2682 [details]
Protocol 69 - New patch that fixes a few quirks

- Makes ioq3 clients fallback to idq3 protocol if current protocol not supported
- Master server support works fine even with new protocol.
Comment 10 Thilo Schulz 2011-05-04 10:16:59 EDT
Created attachment 2687 [details]
Protocol 69 - Patch v3, better protocol version negotiation

I've added yet another version of the new patch. This removes the need for a hack to detect a server running the legacy protocol version.
Comment 11 Thilo Schulz 2011-06-06 16:52:20 EDT
Created attachment 2759 [details]
Protocol 69 - Patch v4, updated for rev 2012

I've again updated the patch to commit cleanly against latest revision, and also fixed a last tiny quirk.
Comment 12 Thilo Schulz 2011-06-25 12:56:26 EDT
Created attachment 2795 [details]
Protocol 69 - Patch v5, updated for rev 2060

Version 5. I've noticed I forgot one last hack that I got rid of. We must retain the cvar name "protocol" for the old protocol, the new protocol is now named "protocolver". Or else the old QVMs won't work correctly displaying the server list.
Also updated to apply against r2060
Comment 13 Tim Angus 2011-06-27 07:01:17 EDT
OK, I've had a pretty good look at it now. I have a few points and questions:

* There are changes in cvar.c and Com_sprintf that look unrelated to this patch; if that's the case they should probably just be committed anyway, separately.

* On the same theme, the fix for the q3noclient issue should probably be in a subsequent patch, to avoid clouding what is changing here.

* There is a line of commented code sending a disconnect command using NET_OutOfBandPrint. If it's not necessary it should be removed completely. If it needs to remain commented for some reason, this reason should be explained in comments.

* The cvar names protocol and protocolver are pretty horrible and misleading. I see why you've named them like this, but I think a better solution would be a well placed hack to translate requests for the value of "protocol" coming from legacy qvms to "oldprotocol".

* Actually, "oldprotocol" would be better as "legacyprotocol". This goes for all the preprocessor tokens etc. too.

* Now a slightly bigger suggestion. If we're going to have all this upheaval then we might as well do it right. To that end, I suggest we ditch identifying protocol by a simple number and switch to using strings. There are probably half a dozen things out there using protocol numbers in the range ~69-75, none of which are remotely compatible. If we use strings instead we avoid all this mess and also make it more difficult to (attempt) to connect to something where it has no business working. If strings are a problem (for whatever reason) we could alternatively use FOURCC codes like video codecs. This is assuming we have 32bits to play with.
Comment 14 Simon McVittie 2011-06-27 09:32:33 EDT
(In reply to comment #13)
> * The cvar names protocol and protocolver are pretty horrible and misleading.

(Where I say "idtech3" here, I refer to the entire family of Q3-based engines.)

I think part of the problem is that "protocol" is being used to mean two things: a version of the idtech3 network protocol, and a set of incompatible game content being used over that protocol.

In practice, all ioquake3-based games use idtech3 network protocol version 68, but some (e.g. OpenArena) re-label it with a different number in order to express that their game content is not compatible. (Or perhaps the network protocol version already encapsulates incompatible changes within the history of the Quake 3 game content? I don't know.)

So, perhaps we really want something like:

* protocol, with its current strange semantics, for interop with idquake3,
  older ioquake3 versions, and older ioquake3 forks like the one in
  OpenArena 0.8.5 releases: "use the idtech3 network protocol v68, but
  tell everyone that it is protocol version n"

* ioprotocol: underlying network protocol to use when in "new ioquake3" mode
  (probably best to make this visually distinct from protocol - start again
  at 1, or start from 100, or use a non-integer, or something)

* gamecontent: high-level game which you (are compatible with|require),
  example values "Quake3-1.0" and "OpenArena-0.8.1" - Q3's content has
  been backwards compatible with pak0.pk3 all the way back to the retail
  release, while OpenArena 0.8.5 is backwards compatible with 0.8.1 but
  no further back. Like com_basegame, this would only be relevant to
  complete games (like OA); any Q3 mods that require pak0.pk3 would still
  be "Quake3-1.0".

This would make gamecontent somewhat similar to com_basegame, but with the addition of an idea of "last compatible version".
Comment 15 Thilo Schulz 2011-06-27 15:23:26 EDT
(In reply to comment #13)
> OK, I've had a pretty good look at it now. I have a few points and questions:

Thanks

> * There are changes in cvar.c and Com_sprintf that look unrelated to this
> patch; if that's the case they should probably just be committed anyway,
> separately.

Yes. The cvar thing was part of a hack to write a debug_protocol cvar so that the serverbrowser would accept old versions. It's not necessary anymore though.
The snprintf change should be committed, i see no reason why it should return void when we have the information returned by snprintf()

> * On the same theme, the fix for the q3noclient issue should probably be in a
> subsequent patch, to avoid clouding what is changing here.

Can be done.

> * There is a line of commented code sending a disconnect command using
> NET_OutOfBandPrint. If it's not necessary it should be removed completely. If
> it needs to remain commented for some reason, this reason should be explained
> in comments.

Then I will remove it. The idea is that there's no possibility a spoofed disconnect message can disconnect arbitrary clients, because that occurance is not connection-based.

> * The cvar names protocol and protocolver are pretty horrible and misleading. I
> see why you've named them like this, but I think a better solution would be a
> well placed hack to translate requests for the value of "protocol" coming from
> legacy qvms to "oldprotocol".

Yes, it is necessary so the server browser won't break.
How do you detect legacy QVMs though? I proposed something some time ago which was rejected.
Besides, I don't like this kind of a hack. We'd have to add an extra if condition to Cvar_VariableStringBuffer(), or at least in cg_ui.c where that function is called.
I'd rather stay with the current solution of just adding a new cvar name. If you know a better name, go ahead.

> * Actually, "oldprotocol" would be better as "legacyprotocol". This goes for
> all the preprocessor tokens etc. too.

No problem. I don't care about naming.

> * Now a slightly bigger suggestion. If we're going to have all this upheaval
> then we might as well do it right. To that end, I suggest we ditch identifying
> protocol by a simple number and switch to using strings. There are probably
> half a dozen things out there using protocol numbers in the range ~69-75, none
> of which are remotely compatible. If we use strings instead we avoid all this
> mess and also make it more difficult to (attempt) to connect to something where
> it has no business working. If strings are a problem (for whatever reason) we
> could alternatively use FOURCC codes like video codecs. This is assuming we
> have 32bits to play with.

Do you mean for each project to then call its protocol like "ioq3_1" for our ioquake3, maybe "wq3_1" for western quake3 or "wop_1" for world of padman? Sure we can do it, no question about it. The protocol version is already transmitted in the infoString as a string to the server. But do we really need this? Most standalone games seem to be separated via master server anyways, and the cvars give full control over which protocol version number to regard as legacy, and which as new protocol
Comment 16 Thilo Schulz 2011-06-27 15:30:36 EDT
(In reply to comment #14)
> I think part of the problem is that "protocol" is being used to mean two
> things: a version of the idtech3 network protocol, and a set of incompatible
> game content being used over that protocol.

Yes. It's been even used in both these ways by id I think.

> * gamecontent: high-level game which you (are compatible with|require),
>   example values "Quake3-1.0" and "OpenArena-0.8.1" - Q3's content has
>   been backwards compatible with pak0.pk3 all the way back to the retail
>   release, while OpenArena 0.8.5 is backwards compatible with 0.8.1 but
>   no further back. Like com_basegame, this would only be relevant to
>   complete games (like OA); any Q3 mods that require pak0.pk3 would still
>   be "Quake3-1.0".
> 
> This would make gamecontent somewhat similar to com_basegame, but with the
> addition of an idea of "last compatible version".

Maybe. I don't see myself integrating this into ioq3 because i have no need for it, personally.
Comment 17 Thilo Schulz 2011-06-27 19:47:16 EDT
(In reply to comment #15)
> in the infoString as a string to the server. But do we really need this? Most
> standalone games seem to be separated via master server anyways, and the cvars
> give full control over which protocol version number to regard as legacy, and
> which as new protocol

Another issue this raises that I forgot about is that dpmaster does not support non-numeric protocol values (yet).
This may not be a problem for legacy quake3, as we always send protocol ver. 68 anyways. But this is definitely a problem for standalone game without legacy protocol.
Comment 18 Tim Angus 2011-06-30 08:54:02 EDT
(In reply to comment #15)
> > * The cvar names protocol and protocolver are pretty horrible and misleading. I
> > see why you've named them like this, but I think a better solution would be a
> > well placed hack to translate requests for the value of "protocol" coming from
> > legacy qvms to "oldprotocol".
> 
> Yes, it is necessary so the server browser won't break.
> How do you detect legacy QVMs though? I proposed something some time ago which
> was rejected.
> Besides, I don't like this kind of a hack. We'd have to add an extra if
> condition to Cvar_VariableStringBuffer(), or at least in cg_ui.c where that
> function is called.
> I'd rather stay with the current solution of just adding a new cvar name. If
> you know a better name, go ahead.

Hmm, I'll think about it a bit more. Having "protocol" and "protocolver" isn't good though. "protocolnew" doesn't really fit, nor does "protocol_not_legacy". Do we really need to expose that cvar anyway? It seems like the only reason we're allowing "protocol" to be changed by the user is to allow ioq3 to be used with OA? Is there a reason to allow "protocolver" to be changed as well?

> Do you mean for each project to then call its protocol like "ioq3_1" for our
> ioquake3, maybe "wq3_1" for western quake3 or "wop_1" for world of padman? Sure
> we can do it, no question about it. The protocol version is already transmitted
> in the infoString as a string to the server. But do we really need this? Most
> standalone games seem to be separated via master server anyways, and the cvars
> give full control over which protocol version number to regard as legacy, and
> which as new protocol

Yes that's what I mean. I realise other games are separated by master server but right now there is nothing stopping me connecting a Tremulous client, for example, to something else ioq3 based that advertises protocol 69. The server won't complain at the connection. It'll break later obviously, but not in a controlled or graceful way. If we have a string for protocol identification then this awkwardness goes away.

(In reply to comment #17)
> Another issue this raises that I forgot about is that dpmaster does not support
> non-numeric protocol values (yet).
> This may not be a problem for legacy quake3, as we always send protocol ver. 68
> anyways. But this is definitely a problem for standalone game without legacy
> protocol.

Problems like this was why I was suggesting possibly using a FOURCC code instead of a string. Providing we have 32 bits to play with for the protocol field, we can have much more unique protocol identification without any storage changes. In theory much of the existing code dealing with protocol number wouldn't need to be changed, except where user facing.
Comment 19 Tim Angus 2011-06-30 08:59:23 EDT
Oh, one other thing. In the README it's probably best not to draw attention to the specific security issue that is fixed, instead refering more generally to "security issues" or something like that.
Comment 20 Thilo Schulz 2011-06-30 10:20:27 EDT
(In reply to comment #18)
> Hmm, I'll think about it a bit more. Having "protocol" and "protocolver" isn't
> good though. "protocolnew" doesn't really fit, nor does "protocol_not_legacy".
> Do we really need to expose that cvar anyway? It seems like the only reason
> we're allowing "protocol" to be changed by the user is to allow ioq3 to be used
> with OA? Is there a reason to allow "protocolver" to be changed as well?

Yes. It's not only OA... World of Padman uses the same mechanism to distinguish between game content. I am sure other mods do as well.
The Debian package maintainers don't want to compile 5 different instances of the same engine that basically does the same thing when nothing really changed except for game content. I understand that, and would like to support them in that. Also not all standalone developers have the resources to make builds for the machines we support. ioquake3 has powerful standalone detection capabilities for exactly these cases.

> Yes that's what I mean. I realise other games are separated by master server
> but right now there is nothing stopping me connecting a Tremulous client, for
> example, to something else ioq3 based that advertises protocol 69. The server
> won't complain at the connection. It'll break later obviously, but not in a
> controlled or graceful way. If we have a string for protocol identification
> then this awkwardness goes away.

The getServersExt query does already differentiate between games. This requires each standalone game to run their own masters, or at least link to a master that understands this query. But that's not an unreasonable demand. Especially given upcoming ipv6 support, we can make getserversext query standard in the standalone case, not only for ipv6 connections. I understand that idsoft's master server is a dpmaster. If they update, we could drop the getservers query entirely in favor of getserversExt. The master server would go on supporting getservers for old q3 clients, but it wouldnt break anything anymore then.
(getServersExt only returns ipv4 results if the ipv6 keyword is not given, by the way.)

> Problems like this was why I was suggesting possibly using a FOURCC code
> instead of a string. Providing we have 32 bits to play with for the protocol
> field, we can have much more unique protocol identification without any
> storage changes.

I find this to be a better solution than any fourcc code. Who'd get to be the standardizing body? It wouldn't solve the problem for debian maintainers who want to use the same engines for several games, we'd still have to make the protocol var changeable.

> Oh, one other thing. In the README it's probably best not to draw attention to
> the specific security issue that is fixed, instead refering more generally to
> "security issues" or something like that.

Why not? The exploit by Luigi has been out for ages.
Comment 21 Thilo Schulz 2011-06-30 10:27:26 EDT
> (getServersExt only returns ipv4 results if the ipv6 keyword is not given, by
> the way.)

That's actually not correct. I meant to say: getserversExt can be told to return only ipv4 or only ipv6 or both by specifying the respective "ipv4" or "ipv6" parameters.
Comment 22 Simon McVittie 2011-06-30 11:06:17 EDT
(In reply to comment #20)
> (In reply to comment #18)
> > It seems like the only reason
> > we're allowing "protocol" to be changed by the user is to allow ioq3 to be used
> > with OA? Is there a reason to allow "protocolver" to be changed as well?
> 
> Yes. It's not only OA... World of Padman uses the same mechanism to distinguish
> between game content. I am sure other mods do as well.

This means protocol needs to be changeable at the command-line; it doesn't strictly require that  "new_protocol" can be changed too, if we can require that standalone games provide their "game content identifier" in a different way.

Perhaps we should just declare the protocol cvar to mean *only* what people use it for in practice now - an artificial compatibility-breaker for standalone games.

If this is the way to go, then we should make sure that when using Thilo's new protocol, the protocol cvar (at least) is looked at to determine compatibility (even though what we're actually going to talk is Thilo's new protocol, not protocol 68); and that it's made obvious to standalone game developers that they should *not* change newprotocol (or whatever it's called), even when they're changing protocol.

> The getServersExt query does already differentiate between games.

Not between incompatible variants of a game, though - I believe OpenArena 0.7.7 and 0.8.x are both considered to be the same game, distinguished only by the protocol cvar?

However smart the server browser is, in an ideal world we'd be able to get something like this, even when the server browser isn't used:

    \connect arena.example.com

    Error: server not compatible with this game
    Server is Tremulous-2, protocol 68
    This is Quake3-1, protocol 68
Comment 23 Simon McVittie 2011-06-30 11:08:44 EDT
The other major case that I'd like to continue to "fail nicely" is using the same base game as the server, but an incompatible version. So if you connect to an OA 0.8.x server with a hypothetical incompatible OA 0.9, you'd get:

    \connect arena.example.com

    Error: server not compatible with this game
    Server is OpenArena-1, protocol 71
    This is OpenArena-1, protocol 72
Comment 24 Zack Middleton 2011-06-30 13:41:14 EDT
Maybe connecting clients should included the game's heartbeat (sv_heartbeat) in userinfo? In SV_DirectConnect check if client has correct heartbeat for game, drop client with message if they do not. For legacy support allow clients with no heartbeat set to connect. sv_heartbeat can (and should?) be set differently for standalone games.

This should work for keeping games using the same protocol (but different heartbeats) separate. Doesn't seem to solve the differing game content issue though.
Comment 25 Thilo Schulz 2011-07-01 08:59:28 EDT
> If this is the way to go, then we should make sure that when using Thilo's new
> protocol, the protocol cvar (at least) is looked at to determine compatibility
> (even though what we're actually going to talk is Thilo's new protocol, not
> protocol 68); and that it's made obvious to standalone game developers that
> they should *not* change newprotocol (or whatever it's called), even when
> they're changing protocol.

I don't believe in placing any restrictions of this sort. By making com_oldprotocol and com_newprotocol (this naming is not final obviously) I see no reason why we should restrict this. The rest has to be figured out by those that write the startup scripts for the respective engines, e.g. as a debian maintainer, that would be you.

> > The getServersExt query does already differentiate between games.
> 
> Not between incompatible variants of a game, though - I believe OpenArena 0.7.7
> and 0.8.x are both considered to be the same game, distinguished only by the
> protocol cvar?

Well, I don't view this as a serious flaw. At most, users will be prompted to autodownload the new content, but they probably already get an incompatible protocol message so they will know they'll have to update. What's wrong with that?

> However smart the server browser is, in an ideal world we'd be able to get
> something like this, even when the server browser isn't used:
> 
>     \connect arena.example.com
> 
>     Error: server not compatible with this game
>     Server is Tremulous-2, protocol 68
>     This is Quake3-1, protocol 68

Can do, see my answer to Zack's comment below

(In reply to comment #23)
> The other major case that I'd like to continue to "fail nicely" is using the
> same base game as the server, but an incompatible version. So if you connect to
> an OA 0.8.x server with a hypothetical incompatible OA 0.9, you'd get:
> 
>     \connect arena.example.com
> 
>     Error: server not compatible with this game
>     Server is OpenArena-1, protocol 71
>     This is OpenArena-1, protocol 72

This happens already, not as verbose though as wrote it here, but it won't be changed.

(In reply to comment #24)
> Maybe connecting clients should included the game's heartbeat (sv_heartbeat) in
> userinfo? In SV_DirectConnect check if client has correct heartbeat for game,
> drop client with message if they do not. For legacy support allow clients with
> no heartbeat set to connect. sv_heartbeat can (and should?) be set differently
> for standalone games.
> 
> This should work for keeping games using the same protocol (but different
> heartbeats) separate. Doesn't seem to solve the differing game content issue
> though.

Yes, that is a good idea. It won't make any problems with legacy clients, because added parameters to messages in connectionless messages are simply ignored. And new clients can exchange enough information to make connection attempts fail gracefully.
Comment 26 Simon McVittie 2011-07-01 09:26:35 EDT
(In reply to comment #25)
> By making
> com_oldprotocol and com_newprotocol (this naming is not final obviously) I see
> no reason why we should restrict this. The rest has to be figured out by those
> that write the startup scripts for the respective engines, e.g. as a debian
> maintainer, that would be you.

I suppose what I'm really asking for is: whichever way you go on this, I'd like it to be obvious to standalone game developers what they ought to do when they rebase onto a version of ioq3 with your new (variant of the) network protocol.

Debian runs OpenArena by using ioquake3 with command-line options, but OpenArena upstream provides precompiled -DSTANDALONE binaries which have the hard-coded constants changed; whatever Debian does needs to be compatible with what OA upstream are doing, otherwise we won't be able to use the same servers.

> > Not between incompatible variants of a game, though - I believe OpenArena 0.7.7
> > and 0.8.x are both considered to be the same game, distinguished only by the
> > protocol cvar?
> 
> Well, I don't view this as a serious flaw. At most, users will be prompted to
> autodownload the new content, but they probably already get an incompatible
> protocol message so they will know they'll have to update. What's wrong with
> that?

0.7.7 to 0.8.x was an incompatible change (non-free files were deleted from pak0.pk3, IIRC), so from a compat point of view it's a different game: there's nothing you can add to 0.7.7 to turn it into 0.8.x. In the network-protocol-68 world, this was done by incrementing the protocol cvar; "also increment com_newprotocol" would be fine as a solution.

0.8.1 to 0.8.5 was a compatible change (it just added more PK3s to the end of the sequence, like the id patches do) so that particular mismatch can be resolved by auto-downloading (if enabled).

> (In reply to comment #24)
> > Maybe connecting clients should included the game's heartbeat (sv_heartbeat) in
> > userinfo? In SV_DirectConnect check if client has correct heartbeat for game,
> > drop client with message if they do not. For legacy support allow clients with
> > no heartbeat set to connect. sv_heartbeat can (and should?) be set differently
> > for standalone games.

This sounds good to me; I believe most (all?) standalone games do change sv_heartbeat. Bonus points if you document that they should/must do so :-)
Comment 27 Thilo Schulz 2011-07-01 09:43:49 EDT
(In reply to comment #26)
> (In reply to comment #25)
> I suppose what I'm really asking for is: whichever way you go on this, I'd like
> it to be obvious to standalone game developers what they ought to do when they
> rebase onto a version of ioq3 with your new (variant of the) network protocol.

No worries, would be documented anyways.

> 0.7.7 to 0.8.x was an incompatible change (non-free files were deleted from
> pak0.pk3, IIRC), so from a compat point of view it's a different game: there's
> nothing you can add to 0.7.7 to turn it into 0.8.x. In the network-protocol-68
> world, this was done by incrementing the protocol cvar; "also increment
> com_newprotocol" would be fine as a solution.

Right, it's the prerogative of the respective developers to change the protocol cvar to reflect this. Since with the proposed patch you can configure both version numbers which are considered "legacy" and which are considered "new" protocol, you can basically do whatever you want.
Comment 28 Thilo Schulz 2011-07-07 21:44:38 EDT
Created attachment 2812 [details]
Protocol 69 - Patch v6 (rev 2070), implement discussed changes, add demo menu display support for both legacy and new protocol demos

Alright. I have implemented many of your suggestions.
The client will now send the sv_heartbeat name in getChallenge requests (old clients don't and in this case no check is performed) to the server, which will not accept incompatible heartbeat messages.
This will help differentiate from standalone games.

I have also fixed demo menu display in ui_demo2.c, which now displays both: legacy and new protocol.
Comment 29 Tim Angus 2011-07-11 17:24:13 EDT
Looks good. I'm not convinced that the wording in the README regarding com_protocol is correct, but maybe I'm misunderstanding it. Grab me on IRC if you need to. The other thing I might be inclined to do is colorise the warnings in CL_ConnectionlessPacket yellow, as is the precedent, but that's no big deal really.
Comment 30 Simon McVittie 2011-07-12 03:38:29 EDT
This is looking good! Some minor comments:

+  com_protocol                      - when encountering a server/client that
+                                      only supports the version configured in
+                                      this cvar

"... that does not support the version ...", surely?

+  If the value for "com_protocol" and "com_legacyprotocol" is identical, then
+  the legacy protocol is always used. If protocol is set to 0, then support for
+  the legacy protocol is disabled.

"If com_legacyprotocol is set to 0" presumably?

+  Mods that use a standalone engine obviously do not require dual protocol
+  support

I think this is misleading: "do not require dual-protocol support unless they intend to maintain backwards compatibility with older versions" would be better?

I'd personally be inclined to make LEGACY_PROTOCOL always be defined, to avoid the #ifdefs (standalone mods that don't want it can just set com_legacyprotocol to 0); how much size does it actually add to the engine?

+ 			NET_OutOfBandPrint(NS_SERVER, from, "print\nGame mismatch: This is a %s server\n",

To avoid bad grammar when a mod starts with a vowel ("this is a OpenArena-1 server"), perhaps this could be "Game mismatch: this server runs %s"?

+#define	PROTOCOL_VERSION	69
+#define PROTOCOL_LEGACY_VERSION	68
 // 1.31 - 67

Should these maybe go in q_shared.h, if standalone mods are going to be changing them?

Maybe extend the comment:

 // 1.31 - 67
 // 1.32c - 68
 // PROTOCOL_LEGACY_VERSION will be 68 forever, except in standalone mods
 // ioquake3 1.37 - 69

Maybe it'd be good to have some convention like "protocol = legacy protocol + 100" (then +200 if you break compat again, etc.), so it's (relatively) obvious what standalone mods should do?
Comment 31 Thilo Schulz 2011-07-12 08:01:28 EDT
Applied r2075, with your suggested changes, Timbo.
Simon: Yep. I fixed the README now. Yeah, I could remove the ifdefs.. somehow it doesn't feel right for me, though, leaving in dual protocol support when it is really not required.