[D2X Rebirth all versions Win7 x64] Bug - demo playback stops suddenly (crashes?)
#11
I don't have enough C++ experience to understand 80% of the rebirth code, would someone mind explaining how the demo system works? I may be able to offer (theoretical) improvements.
Reply
#12
Kp, good you reproduced it at last! I think you overrate the problems with the demo system a bit; it more-or-less works IMO, and is very useful for recording anything. It is obviously necessary for validating speedruns, and very helpful for recording videos too: you can play in one setup (for example no music and full screen), record a demo, then make a video with music and a small resolution for YT.

In fact I myself could take it, if it were possible to compile on Windows. I have 15+ years of C/C++ experience including the low-level/assembly optimization stuff - and a lot of math too, but always working on Windows, as it immensely simplifies development and especially debugging. Also, even my experience is not enough to understand those variadic macro parts; LOL, even the latest Microsoft VS compiler cannot understand it! Of course I easily understand everything else, but still cannot compile it on my system... If I were in charge of the project, I would first make it CMake-based (to simplify the build process for everyone) and compiling for both Windows and Linux. There are other people besides me who want to work with the code but cannot, because of the Linux-only build and the build system complications...

About fixing the demo system, I still consider fixing bugs one-by-one (starting from the ones that are easiest to fix) the best idea. For this particular bug I would log all actions (I don't know what they are exactly; should be move, shoot, etc.) during demo recording and playback, and see where it starts to deviate and why.
Reply
#13
Also I should add that some of the demo playback bugs seem to be fixed in the latest build! For example, the Obsidian level 12 demo that stopped 3 sec after the start in 0.58.1 now works normally! And the bug with demo playback aborting seems to be partially fixed (I opened another thread for the remaining part).
Reply
#14
(12-30-2017, 06:46 AM)LightWolf Wrote: I don't have enough C++ experience to understand 80% of the rebirth code, would someone mind explaining how the demo system works? I may be able to offer (theoretical) improvements.
Clearly, I need to work on that remaining 20%. Wink

The demo system is functionally unchanged by me, aside from spot adjustments to convention changes elsewhere (although even there, git diff reports substantial changes). It probably shows more of the original C code than most other subsystems.
Code:
$ git diff --stat 0.58.1-d2x:main/newdemo.c github/next:similar/main/newdemo.cpp
main/newdemo.c => similar/main/newdemo.cpp | 2599 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------------
1 file changed, 1571 insertions(+), 1028 deletions(-)
In principle, it's based on having hooks scattered throughout the game that call newdemo_record_foo functions when various "important" events happen. On playback, the important data is pulled from the file and written directly into various variables. It's not so much a journal of what the game did as it is a movie encoded as "Put a type 5 robot here. Put a type 8 powerup there. Render scene. Clear view. Put a type 5 robot over there. Render scene." To know what is meant by a type 5 robot or a type 8 powerup requires loading compatible game data. That is not in the file, nor is much of anything else that was not necessary to render the scene.

I have several major complaints about the demo system:
  • It's extremely difficult to extend without generating demos that would terminally confuse legacy builds. See https://github.com/dxx-rebirth/dxx-rebir....cpp#L1997 - #L3250 for the switch statement that handles top-level "important" events on playback (and yes, that's a 1253 line switch statement; and yes, very few of those lines are just comments; and yes, that's just the top-level handler - functions elsewhere are called to do most of the real work). In particular, look at the very end of that region, where unhandled demo opcodes route to Int3 (trap to debugger), then fall out the bottom. So even though not all opcodes are defined, if I define a new one and somebody tries to play back a demo using that new opcode on a build that does not understand it, bad things happen. The demo code needs a clean way either to skip over opcodes it does not understand or to report to the user that the demo requires features missing in the user's build. I could add the latter, but that would itself be a new feature, limiting its utility.
  • Existing opcodes have no defined length in the file, but instead are assumed to be particular lengths as a function of the specific opcode. For some opcodes, such as render-object, that length is in turn a function of other fields logged in the object. See https://github.com/dxx-rebirth/dxx-rebir...o.cpp#L660 - #L805, where objects conditionally read fields from the file depending on the object's base type, render type, and movement type. This not only complicates extending the file format, it also makes it hard for other tools to parse the file.
  • The "important" events are only enough to reproduce what the player saw. It's not generally possible to use the demo to discover information that the player did not observe.
  • "Important" events are recorded in a weird combination of too much and too little information. For example, recorded objects do not record their original index in Objects[], so the demo code assigns them an arbitrary index on playback. Thus, even if I know that object #64 was important in the main level, I have no way to readily find that object in the demo, because it was almost certainly not consistently observed in a scene that would make it the 64th object rendered in that frame.
  • Events are recorded as snapshots of the state of specific variables, so it's inconvenient (though not always impossible) to deduce code flow based on the saved data.
There may be other problems. This is just what comes to mind quickly: too hard to extend, inconvenient to analyse, and far less useful than it ought to be for debugging or performance analysis.

The problem is not that I don't know how to fix it. The problem is that the fix I want to pursue is so invasive that I cannot justify the time required to write and debug it.
(12-30-2017, 11:34 AM)AlexanderBorisov Wrote: I think you overrate the problems with the demo system a bit; it more-or-less works IMO
It works, but it's fragile and tedious to debug. It certainly has valid use cases, which is why I let it alone. I just wish those use cases were served by a subsystem that was less trouble to support.
(12-30-2017, 11:34 AM)AlexanderBorisov Wrote: In fact I myself could take it, if it were possible to compile on Windows.
Compilation from Windows should be possible (though most people use Linux as a build host).
(12-30-2017, 11:34 AM)AlexanderBorisov Wrote: I have 15+ years of C/C++ experience including the low-level/assembly optimization stuff - and a lot of math too, but always working on Windows, as it immensely simplifies development and especially debugging. Also, even my experience is not enough to understand those variadic macro parts; LOL, even the latest Microsoft VS compiler cannot understand it! Of course I easily understand everything else, but still cannot compile it on my system...
You can mostly ignore the variadic macros. I try to keep those away from anything people need to understand, since they are complicated. Which Microsoft Visual Studio version did you try? Last I checked, their C99 preprocessor support was still lacking, which is why it mishandled the macros. I stored the project files in contrib/broken-vs2017 for a reason. Wink You should be able to build using gcc, even on Windows.

I would suggest excluding those macros when using Visual Studio, but the easiest way to exclude them requires a macro, which is probably not a viable solution in this case. Smile
(12-30-2017, 11:34 AM)AlexanderBorisov Wrote: If I were in charge of the project, I would first make it CMake-based (to simplify the build process for everyone) and compiling for both Windows and Linux. There are other people besides me who want to work with the code but cannot, because of the Linux-only build and the build system complications...
CMake is an ugly language. I grant that the current SConstruct has grown over time to a point where it's not maintainable by everyone, but I still pick SCons over CMake every time. I also question whether CMake can do some of the useful things I do in SCons.

The main problem with building on Windows is the lack of a package manager to install all the dependencies. I do not bundle SDL, PhysFS, etc. in the main tree, so it is the package manager's responsibility to install those in appropriate places first. I do test for those as part of the build, to try to give good error messages when the setup is not usable.
(12-30-2017, 11:34 AM)AlexanderBorisov Wrote: About fixing the demo system, I still consider fixing bugs one-by-one (starting from the ones that are easiest to fix) the best idea. For this particular bug I would log all actions (I don't know what they are exactly; should be move, shoot, etc.) during demo recording and playback, and see where it starts to deviate and why.
Sure, that's an obvious course of action. It's also a pain since the object numbers don't line up, and all visible objects get logged every frame, so the logs are very verbose.
Reply
#15
Thanks for the detailed answer! Yes, I've got your point on the demo system; it seems it would be better to create a new solution that is backward-compatible, easier to debug (including optional debug-mode logging at least), writes less data, stores complete information, etc. And yes, if a bug fix requires too many changes that can break everything else, it is a good idea to leave it as is. That switch statement is a really impressive thing!

About building - is there a way to use some good IDE while building the project with GCC on Windows? One which will allow me to debug comfortably, too? And I have never had the experience of building Windows binaries on Linux; I never thought it was possible...

I tried VS 2017 and had no problems with downloading and compiling the dependencies (PhysFS and SDL); as far as I remember both have a CMake build system, and SDL has a precompiled DLL, too. The broken vs2017 project had some wrong file paths, and some missing ones; however, I fixed that too. About the macros - I commented out the ones used for debug printf's, but there are still a lot of other variadic macros working with inner data structures, exactly the ones I cannot understand. I didn't dare to touch them.
I'd be very happy if there were a solution to quickly make all those macros compile under a conditional compilation switch for the Microsoft VS compiler.

About CMake vs SCons - I am not a big fan of CMake myself (I agree that it is ugly), but still we use it at work everywhere for cross-platform (iOS devices, Mac, Linux, Windows) builds; and all the libraries (mostly math) come with a CMake build system. I haven't seen a single one with SCons, which means something... Probably CMake is the lesser evil...
Of course I cannot argue which is better, since I don't know SCons. But I remember having some nasty errors with it when I tried to run the DXX-Rebirth SCons system on Windows, and I was unable to solve them or even google the solution.

About installing dependencies on Windows, we had no problems at all with CMake projects (and we had many more than 2 dependencies in our projects - maybe 20, and many were gigantic libraries); you just install and compile each dependency separately, and then set the path to it in the CMake GUI for the main project... Not too bad. And there is even a dependency auto-detect function in some CMake projects, although those often do not work on Windows.
Reply
#16
(12-31-2017, 04:09 AM)AlexanderBorisov Wrote: Thanks for the detailed answer! Yes, I've got your point on the demo system; it seems it would be better to create a new solution that is backward-compatible, easier to debug (including optional debug-mode logging at least), writes less data, stores complete information, etc. And yes, if a bug fix requires too many changes that can break everything else, it is a good idea to leave it as is. That switch statement is a really impressive thing!
My plan, which I've had in abstract for a couple of years, but never written any code for, is that the replacement ought to be extensible (able to recognize critical versus optional unsupported fields, and able to ignore non-critical fields that were introduced in later versions), that it have an optional verbose mode as you said, and ideally save enough to recreate the level state. Although I doubt I'll ever get it to this point, I've thought it would be neat to mimic a feature that I've read of in some other modern games, where the replay doubles as a running savegame, allowing you to pause partway through and switch from viewing the replay to playing the game as it was at that point in the replay.
(12-31-2017, 04:09 AM)AlexanderBorisov Wrote: About building - is there a way to use some good IDE while building the project with GCC on Windows? One which will allow me to debug comfortably, too? And I have never had the experience of building Windows binaries on Linux; I never thought it was possible...
If the IDE can be told to run external commands, you could configure it to run SCons with appropriate options. For debugging, any debugger should work, but the experience will be much nicer if the debugger understands the debuginfo generated by your compiler. On Windows, that's a bit messy since Microsoft's tools only understand Microsoft PDB symbols (and possibly CodeView, though that has likely been retired by now) and only Microsoft's compiler knows how to write Microsoft PDB symbols (and as far as I know, the format was never documented, so it's very unlikely anyone will ever patch gcc to emit such symbols). So if you build with gcc, you get DWARF debuginfo, which cannot be consumed by the Microsoft tools. I don't know if GDB on Windows is usable, but if it is, it should understand DWARF. If you build with Microsoft tools, then you can get PDB symbols and use a Microsoft debugger. (Though this assumes you could get a Microsoft compiler to compile Rebirth, which as discussed above and below, is not necessarily viable.)

Cross-compiling Windows <-> Linux depends on whether the build system can handle using cross-compilers, linking non-native libraries, etc. I consider support for cross-compilation to be a high priority in Rebirth. If it doesn't work, I want to know about it and fix it.
(12-31-2017, 04:09 AM)AlexanderBorisov Wrote: I tried VS 2017 and had no problems with downloading and compiling the dependencies (PhysFS and SDL); as far as I remember both have a CMake build system, and SDL has a precompiled DLL, too. The broken vs2017 project had some wrong file paths, and some missing ones; however, I fixed that too. About the macros - I commented out the ones used for debug printf's, but there are still a lot of other variadic macros working with inner data structures, exactly the ones I cannot understand. I didn't dare to touch them.
Sorry about the bad paths. Since I don't have routine access to a Windows development environment and the build is known to be broken even with those fixed, I haven't maintained those project files as well as I probably should. I won't take patches that break working functionality from the supported platforms, but I'll happily take anything that makes the Visual Studio build less broken without hurting other platforms - path fixes, MSVC-specific macro changes, etc. If you can't get Visual Studio to work without breaking the other targets, you can still post the patches, but I'll have to rework them to be conditional on Visual Studio before they are merged.

When last I tried it, even after hacking around the C99 variadic macro issue (for expedience, I locally applied patches that would have broken all the other platforms, so those patches were never merged), I couldn't get the build to finish. There were several files that crashed the compiler. That was when I gave up, since the crash didn't identify the problem in any detail, and I suspected I wouldn't be able to work around the crash without substantial changes to code I didn't want to modify. Perhaps a later service pack for the compiler has made it more resilient in the face of complicated templates.
(12-31-2017, 04:09 AM)AlexanderBorisov Wrote: I'd be very happy if there were a solution to quickly make all those macros compile under a conditional compilation switch for the Microsoft VS compiler.
Some of the variadic macros that troubled Visual Studio 2017 (and Visual Studio 2013, back when I tried that) are purely debug/optimisation macros. You can get a usable build with them completely unset. I don't know if that is true of all the macros that trouble it. I'd need to see a full list of what it doesn't accept and review the affected macros.
(12-31-2017, 04:09 AM)AlexanderBorisov Wrote: About CMake vs SCons - I am not a big fan of CMake myself (I agree that it is ugly), but still we use it at work everywhere for cross-platform (iOS devices, Mac, Linux, Windows) builds; and all the libraries (mostly math) come with a CMake build system. I haven't seen a single one with SCons, which means something... Probably CMake is the lesser evil...
CMake is probably easier to start with. SCons has a bit of a bad reputation because it is marketed as a "build system construction kit", so the quality of an SCons-based build depends heavily on the author(s) who write the SConstruct file. While it's possible to make a mistake in any build system, I suspect it's easier to make serious mistakes in SCons than it is in CMake. In particular, SCons makes it easy to fail to support certain conventions, like $CC/$CXX/$CPPFLAGS/$CFLAGS/$CXXFLAGS to set defaults. Distribution package maintainers rightly dislike packages that ignore such conventions. I like to think that I've written the SConstruct file to be acceptable to package maintainers. When someone tried to package Rebirth for OpenBSD recently, the changes they needed were relatively minor (and several of their patches were unnecessary / wrong), so I consider that a success.
(12-31-2017, 04:09 AM)AlexanderBorisov Wrote: Of course I cannot argue which is better, since I don't know SCons. But I remember having some nasty errors with it when I tried to run the DXX-Rebirth SCons system on Windows, and I was unable to solve them or even google the solution.
If you can't get it working after following the included instructions, please report the problem. I don't expect the instructions to work for everyone as-is, but I'd like to improve them so that they do.

On the more immediate topic, I modified the demo code to save only robot objects, which greatly reduced the noise (at the obvious price of making the demo confusing and incomplete as a demo). I traced the problem far enough to see that the relevant robot (object #12, robot type #22) gets a different ->rtype.polyobj.model_num depending on whether you cold-start or warm-start the level. The demos I recorded always use the cold-start model_num during playback, regardless of the model_num used during recording. If the original play was a warm start, then the player and the demo disagree about the model_num. The two model_num values in question (37 for cold start, 110 for warm start) have different n_models (1 and 8, respectively); if the player and demo disagree about n_models, then they disagree about how many anim_angles are written, which causes the file position to be incorrect, which causes the later anim_angles to be misinterpreted as an EOF. I have not yet had time to trace why the two paths disagree about the model_num, but I suspect it to be some sort of ordering issue in the loading of custom data. If so, levels that do not customise the robots would not experience this problem.
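The desynchronization can be illustrated with some toy arithmetic (the field sizes here are invented; the real record layout in newdemo.cpp differs):

```cpp
#include <cassert>
#include <cstddef>

// Illustrative arithmetic only. Assume each submodel logs one set of
// anim_angles: 3 fixed-point angles at 2 bytes each in this sketch.
constexpr size_t bytes_per_anim_angles = 3 * 2;

constexpr size_t object_record_size(size_t n_models)
{
    constexpr size_t fixed_part = 32;   // hypothetical non-animated fields
    return fixed_part + n_models * bytes_per_anim_angles;
}

// model_num 110 (warm start) has n_models == 8; model_num 37 (cold start)
// has n_models == 1. A demo recorded warm but played back cold makes the
// reader consume fewer bytes than the recorder wrote for each such object.
constexpr size_t recorded_size    = object_record_size(8);  // recorder (warm)
constexpr size_t parsed_size      = object_record_size(1);  // player (cold)
constexpr size_t drift_per_object = recorded_size - parsed_size;
// The leftover anim_angles bytes are then misread as opcodes, and the
// stream eventually looks like an EOF.
```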
Reply
#17
Ok, thanks, I will ask my teammates at work about a Windows IDE that can compile with GCC and debug with DWARF symbols. Some of them know that stuff better than me.
But first I want to try to make a working VS 2017 project; I will start by looking again at the errors and offending macros, and report what is wrong. At least I suspect that the original game won't have any problems compiling in VS, and 99% of the changes in the code, too; and those "crazy" offending macros can be adapted in some way for VS. I've seen a lot of code causing internal compiler errors in MS compilers, but usually there are workarounds; and there are far fewer errors like this in the latest MS compilers. This way seems the easiest and the most rewarding, as other people (like Sirius from DescentBB) wanted to fix this VS project and work with it, too.
Better yet if a usable VS project could be generated with SCons, like is usually done with CMake, but this could be the next step.
I think I should post this stuff in personal e-mail, shouldn't I? Because it is getting unrelated to the topic itself.

On the topic, good that you isolated the problem. I think it can be related to the boss also being a Pyro; I've never tried recording levels like this before (and never saw this bug, either!) but I plan to record Apocalyptic Factor, where there are Pyros too, so I am a bit concerned about this bug. From what you write, it still seems possible to track it down to the root cause and fix it (i.e. without serious changes to the demo system) - at least there is a chance.
Reply
#18
Visual Studio support probably shouldn't be in this thread, but it might be better as a Github issue or separate forum thread than as private e-mail. There may be lurkers who want to contribute (or at least benefit from the discussion), who would be shut out if we switched to a private thread.

I don't know if it's still supported, but old versions of Visual Studio definitely had the ability to run external commands. This was commonly used for collecting build outputs into packages for distribution, but could probably be abused to run SCons. That would be passable (although suboptimal) for using Visual Studio as an IDE while running gcc as the compiler. It wouldn't maintain the file lists in Visual Studio though, so navigating among files would be more trouble than necessary. For running Microsoft's cl.exe, you would be better off with working Visual Studio project files. SCons documentation suggests this is supported in some way, but I've never used it and don't know how to start with that.

From what I can tell, it's not specifically because the boss is a Pyro, but rather that the campaign changes robot definitions between levels. The bad demos seen so far can be repaired by deleting the spurious block of nulls, but you need a hex editor that can change file length, and you need to know exactly where in the demo to find the bogus blocks. The latter is pretty easy to compute if you can cause the game to print out the position of each record during playback. If you can't (and no released builds can, I had to patch that in for my testing), finding the right spot is rather tedious. As a rare perk of the limited capabilities of the demo file format, I think you can remove records in the middle without breaking anything, because there are no cross-record jumps. I haven't tested that yet though.

I think I can fix this, but it will change game results in the common case. The problem appears to be the following bad code flow:

  1. Start level 3
  2. Load campaign custom robots into Robot_info
  3. Load level 3 custom robots into Robot_info
  4. Level 3 begins
  5. Finish level 3
  6. Start level 4
  7. "Fix" robot model_num values by loading them from Robot_info
  8. Load campaign custom robots into Robot_info
  9. Level 4 begins
Steps 7 and 8 should have been in the other order. Since they are in this order, level 3 robot data is used to customize a level 4 robot. In the cold start case, level 3 data was never loaded, so the robot takes the custom definitions of the campaign, rather than the custom definitions of level 3.

The fix is conceptually simple: load data before it is used, not after. However, this will cause any campaigns that relied on this bug to manifest different robots. Such campaigns are already buggy, since a cold start behaves differently than a warm start.

Loading robot customizations before use fixes this for new demos. Existing demos will need to be discarded or manually repaired, since the fix is on the demo recording side, not the demo playback side.
Reply
#19
I will open a separate thread on VS support later, then. The paragraph about SCons basically describes what CMake does,
including the pre-build step that it adds to the VS project to rerun the CMake generator on changes!
No need to repair any demos, forget it. But I am a bit concerned about backward compatibility, and I still don't understand some part of the explanation. So, some questions to clarify:
1) Will this mission (Entropy) be playable with the fix? Will it affect the custom robots' (visible) behavior, weapons, or look in any way? Will demo recording/playback work, and save/load too? I mean limited only to the new code version.
2) When you say a mission relies on the bug, what exactly do you mean? Because it is one thing if they just rely on the game not crashing, and quite a different thing if they try to achieve some robot functionality this way... If the fix makes some other missions unplayable, that will not be a good thing...
3) It seems the bug is not directly in demo recording, but in loading the robot data between levels, right? It just AFFECTS the demos. I wonder why it doesn't screw up the save/load functionality, too...
Reply
#20

  1. It loaded fine for me when testing it. I did not try to complete it. Yes, it may affect custom robots. Prior to the fix, custom robots used some data inherited from the prior level and some data from the current level's HXM. Now, they consistently use only data from the current level's HXM. The warm-start experience with the change should be the same as what you got when cold-starting level 4 (with or without the change).
  2. If the mission was balanced with the idea that the warm-start bug would cause inherited properties from the prior level to apply, the experience will be different now. Prior to my change, the user would get different results for warm-start versus cold-start. I cannot fix the demo recording while retaining that inconsistency. I must choose between causing all users to get the seemingly intended semantics (previously seen only by cold start players) or causing all users to get the seemingly accidental semantics of inheriting prior level HXM modifications (previously seen only by warm start players). Although the warm-start case is more common, the cold-start case seemed to be correct, so I chose to prefer it.
  3. Save/load used an entirely separate code path, and recorded the objects differently. The way in which savegames recorded the object was not sensitive to the number of animations the model used, so save and load did not need to agree on that number.
Reply

