I've written this code at work and it's not pretty. As usual, it was done in a hurry with a Grumpy Boss Man shouting and making Basil Fawlty appear calm and collected. It also uses code from a third party, unsuitable for our hardware but nonetheless required for integration and testing.
This is code for an embedded system, written by Windows people in C++ with a cross-platform GUI toolkit. It includes a GUI, but it also uses the toolkit's message-passing infrastructure for inter-thread communication. Yes, it's multi-threaded.
Grumpy Boss Man wouldn't let us put this GUI toolkit on our system, even just to get this code up and running, so I had to re-implement select parts of said toolkit myself to get the useful bits of the supplier's code working, fortunately with the GUI thrown out.
Working all hours, with my fingers on fire, my brain melting and all sorts of things, I replaced the TCP/IP socket functionality and the thread classes (in a very cheap, Scottish, minimalist, parsimonious way).
Lo and behold, it ran!
Now this system contains Secret Sauce(TM) that I'm not allowed to see because IP and all that. So I have a target build with a secret binary module provided by the suppliers. I also wrote my own little stub module implementing its API so that I could do a host (x86-64) build. Grumpy Boss Man never quite understood why anyone would want to run the code on the host as well as the target (Aarch64).
It's quite simple: expediency. I can compile, link and execute the code in a couple of seconds on the host. I have rigged up a little automated test harness, in addition to my unit tests, which runs the application and sends messages to it, and waits for and checks the replies. I can run it through various test scenarios just by typing make. Remember, this is asynchronous multi-threaded code with TCP/IP sockets. Every time I compile I get free tests. The same tests can be run on the target too (I've done it).
The second reason is that compiling and running (testing) on a different architecture shakes out certain bugs. Ideally, it would be on an architecture with a different endianness and a different OS but the world is becoming more homogeneous these days. Unless there's a SPARC box about, if it's x86-64 or Aarch64, it's going to be Little Endian.
However, x86 is CISC and ARM is RISC and we all know that CISC and RISC processors treat memory differently. Now here comes the fun part.
My host (x86-64) builds and tests were fine. So were my target (Aarch64) builds, and they ran fine when I put them on the target and ran my tests there.
Our suppliers produced a new version of their Secret Sauce that needed some reconfiguration inside my code. My code (actually, their example code but a bit modified) had a couple of arrays holding certain configuration data and these became twice as large and held more constants.
All the compiles worked. My host regression tests passed. Putting the target binary on the hardware and running it resulted in a crash. It was a nice crash in that my pthread_create() failed with an error code and I printed a nice error message and the rest of the program kept going.
As I said earlier, I had been re-implementing parts of this C++ library at breakneck pace and I was thinking about memory corruption and perhaps I'd made some mistakes in one of the C++ constructors for the thread class.
I instrumented the code six ways to Sunday and came to the conclusion that there was stack corruption somewhere because all the right addresses for the thread main routine and arguments were getting set in the object instances but when pthread_create() was getting called it was returning a nasty error.
Then I remembered the mighty Valgrind. So I installed it.
After about half an hour I had the answer to the problem. I had forgotten to initialise the attributes for the thread (pthread_attr_init()), and I had also forgotten to initialise a mutex for a shared buffer (pthread_mutex_init()).
It just so happened that on x86-64, due to the layout of memory, and due to the random contents of that memory, the program was running correctly. On Aarch64 it was falling over in a smouldering pile.
The moral of the story is: (1) Don't write code on your own; get someone to review it. (2) Don't write code in a hurry, even when there's a Grumpy Boss Man. (3) Compile and test on at least two different architectures. (4) Use Valgrind. (5) I hate C++.
(Score: 2) by turgid on Friday May 10 2024, @06:31AM (2 children)
C++ is just so complicated. Every time I come across it in the world of work it seems to have been used badly. I'm not a C++ expert, but it seems to be used by hackers who think they know more than they evidently do.
I refuse to engage in a battle of wits with an unarmed opponent [wikipedia.org].
(Score: 1, Insightful) by Anonymous Coward on Friday May 10 2024, @08:20AM (1 child)
while there are "C++ experts" out there, nobody knows everything about it.
if you frequently have to deal with other people's messy C/C++ code, one thing I can suggest to streamline the "get familiar with code" stage is to put doxygen to work.
it will draw inheritance graphs, include dependency graphs, and generally give you a nice html view of the different class definitions etc.
obviously you have your own way of doing this, since you were able to put together a replacement for the secret code.
but I think it's generally true for any language that different people solve the same problem in different ways.
I have a collaborator who writes python code to access a python library that I wrote.
And yet I can barely parse his code, because he has such a very different style of doing things (while nominally OOP, his code is often a single long function that does everything...).
to get back to your original story: I generally agree with your moral; I believe your point (1) starts with the "rubber duck principle" ("explain your code to your rubber duck"), and its stronger form is the "four eyes principle" (get another pair of preferably human eyes to look at it).
(Score: 2) by turgid on Friday May 10 2024, @08:48AM
Yes, Doxygen is great and I did use it on this project. I installed it locally. Our development servers are hosted in the cloud and the admins failed to get it installed there, so I only have Doxygen when I build locally.
The moral of that story is if you want something done right, take responsibility for your own development environment. I'm surprised this lesson still has to be learned this far into the 21st century.