Big-Endian Testing with QEMU

(hanshq.net)

113 points | by jandeboevrie 2 days ago

18 comments

  • bluGill 1 day ago
    What I really want is memory order emulation. x86 has strong memory order guarantees; ARM has much weaker guarantees. Which means the multi-threaded queue I'm working on works all the time on my x86 development machine even if I forget to put in the correct memory-order semantics, but it might or might not work on ARM (which is what many of my users have). (I am in the habit of running all my stress tests 1000 times before I'm willing to send them out, but that doesn't mean the code is correct; it means it works on x86 and passed my review, which might miss something.)
  • susam 1 day ago
    I wrote a similar post [1] some 16 years ago. My solution back then was to install Debian for PowerPC on QEMU using qemu-system-ppc.

    But Hans's post uses user-mode emulation with qemu-mips, which avoids having to set up a whole big-endian system in QEMU. It is a very interesting approach I was unaware of. I'm pretty sure qemu-mips was available back in 2010, but I'm not sure if the gcc-mips-linux-gnu cross-compiler was readily available back then. I suspect my PPC-based solution might have been the only convenient way to solve this problem at the time.

    Thanks for sharing it here. It was nice to go down memory lane and also learn a new way to solve the same problem.

    [1] https://susam.net/big-endian-on-little-endian.html

  • AKSF_Ackermann 1 day ago
    > When programming, it is still important to write code that runs correctly on systems with either byte order

    What you should do instead is write all your code so it is little-endian only, as the only relevant big-endian architecture is s390x, and if someone wants to run your code on s390x, they can afford a support contract.

    • jcalvinowens 1 day ago
      Don't ignore endianness. But making little endian the default is the right thing to do, it is so much more ubiquitous in the modern world.

      The vast majority of modern network protocols use little endian byte ordering. Most Linux filesystems use little endian for their on-disk binary representations.

      There is absolutely no good reason for networking protocols to be defined to use big endian. It's an antiquated arbitrary idea: just do what makes sense.

      Use these functions to avoid ifdef noise: https://man7.org/linux/man-pages/man3/endian.3.html

      • drob518 1 day ago
        What do you mean by “networking protocols,” exactly? Most packet level Internet protocols (TCP, UDP, etc.) are big endian. Ethernet is big endian at the octet level and little endian on the wire at the bit level. Network order is big endian because it has to be something and it’s easier to draw pictures as a matrix of bytes that are transmitted from left to right and top to bottom. There is no right answer to endianness. It’s like which side of the road cars should drive on. You just need to pick one and stick with it. Mostly people bitch about endianness when their processor is the opposite of whatever someone else picked. But processors are all over the map. IBM mainframes are big endian. Motorola 68k is big. HP PA-RISC is big. IBM Power started big and then went bi. MIPS is bi. RISC-V is little. ARM is bi but dominantly little (AArch64). And of course x86 is little. So, take your pick. That said, little endianness is the right answer as is driving on the right side of the road.
        • hmry 1 day ago
          > RISC-V is little

          These days it's bi, actually :) Although I don't see any CPU designer actually implementing that feature, except maybe MIPS (who have stopped working on their own ISA, and now want all their locked-in customers to switch to RISC-V without worrying about endianness bugs)

          • drob518 1 day ago
            Well, sort of. Instruction fetch is always little-endian but data load/store can be flipped into big. But IIRC the standard profiles specify little, so it's pretty much always going to be little. But yea, technically speaking data load/store could be big. Maybe that's important for some embedded environments.
            • hmry 1 day ago
              > Well, sort of. Instruction fetch is always little-endian but data load/store can be flipped into big

              ARM works the same way. And SPARC is the opposite, instructions are always big-endian, but data can be switched to little-endian.

        • jcalvinowens 22 hours ago
          I read your reply as mostly agreeing with me: endianness is arbitrary, using big endian for a novel protocol just because some widely used protocols decided to decades ago is silly.

          > it’s easier to draw pictures as a matrix of bytes that are transmitted from left to right and top to bottom.

          There are many reasons for big endian... but that is not one of them :)

          > But processors are all over the map

          That's not true anymore, big endian is dead. Upstream Linux is refusing to support big endian riscv at all, and is making serious noises about ripping out the existing big endian aarch64 support because the companies that ship the hardware that needs it don't work upstream.

        • pwdisswordfishy 1 day ago
          > it’s easier to draw pictures as a matrix of bytes that are transmitted from left to right and top to bottom

          This argument is pretty silly: visualizations can always be changed. For some time I have been thinking that hexdumps on little-endian systems ought to be written right-to-left. In fact, when I once decided to include such a right-to-left dumper in my own software, it took me very little time to get used to, and I immediately started regretting that I don't have it available everywhere.

      • Veserv 1 day ago
        You should actually not use format-swapping operations.

        You should actually use format-swapping loads/stores (i.e. deserialization/serialization).

        This is because your computer cannot compute on values of non-native endianness. As such, the value is logically converted back and forth on every operation. Of course, a competent optimizer can elide these conversions, but such usage fundamentally lacks machine sympathy.

        The better model is viewing the endianness as a serialization format and converting at the boundaries of your compute engine. This ensures you only need to care about endianness when serializing and deserializing wire formats and that you have no accidental mixing of formats in your internals; everything has been parsed to native before any computation occurs.

        Essentially, non-native endianness should only exist in memory and preferably only memory filled in by the outside world before being parsed.

        • jcalvinowens 20 hours ago
          Somebody has to actually write the code at some point, it can't be serialization abstractions all the way down. That's what I'm talking about.
    • addaon 1 day ago
      There's still at least one relevant big-endian-only ARM chip out there, the TI Hercules. While in the past five or ten years we've gone from having very few options for lockstep microcontrollers (with the Hercules being a very compelling option) to being spoiled for choice, the Hercules is still a good fit for some applications, and is a pretty solid chip.
    • socalgal2 1 day ago
      I'm with you on this. I lived through the big-endian/little-endian hell in the 80s/90s. Little endian won. Anyone making a big-endian architecture at this point would be shooting themselves in the foot because of all the incompatibilities. Don't make things more complicated.

      In fact, if you made a big-endian arch and then ran a browser on it, I'd be surprised if a large number of websites didn't fail, because they use typed arrays and aren't endian aware.

      The solution is not to ask every programmer in the universe to write endian-aware code. The solution is to standardize on little endian.

      • classichasclass 1 day ago
        We already know that's the case. I had to add little endian typed array emulation to TenFourFox.
    • sllabres 1 day ago
      Not only the System/390. It's also IBM i, AIX, and, for many protocols, the network byte order. AFAIK the binary data in JPG [1] and Java class [2] files is big-endian. And if you write down a hexadecimal number as 0x12345678, you are writing big-endian.

      [1] For JPG, the embedded TIFF metadata, which can have either byte order.

      [2] https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.ht...

      • hmry 1 day ago
        The endianness of file formats and handwriting is irrelevant when it comes to deciding whether your code should support running on big-endian CPUs.

        The only question that matters: Do your customers / users want to run it on big-endian hardware? And for 99% of programmers, the answer is no, because their customers have never knowingly been in the same room as a big-endian CPU.

        • sllabres 1 day ago
          Saying that (hand)writing is irrelevant is a bit of a strawman; it implies I said that writing hexadecimal numbers big-endian on paper matters for coding.

          The second sentence, whether your customers know they have been in the same room with a big-endian system (the CPU alone doesn't matter), is irrelevant when the point is to write correct code. Many of them aren't interested in this or other details, and that is OK, as they are not responsible for the implementation.

          Changing the endianness in either direction has shown me bugs several times, bugs that could then be fixed, and it was worth it for that alone.

    • nyrikki 1 day ago
      The linked to blog post in the OP explains this better IMHO [0]:

         If the data stream encodes values with byte order B, then the algorithm to decode the value on computer with byte order C should be about B, not about the relationship between B and C.
      
      One cannot just ignore the big/little data interchange problem: MacOS [1], Java, TCP/IP, JPEG, etc.

      The point (for me) is not that your code runs on a s390; it is that you abstract your personal local implementation details from the data interchange formats. And unfortunately, almost all of the processors are little, while many of the popular and unavoidable externalizations are big...

      [0] https://commandcenter.blogspot.com/2012/04/byte-order-fallac... [1] https://github.com/apple/darwin-xnu/blob/main/EXTERNAL_HEADE...

      • adrian_b 1 day ago
        To cope with data interchange formats, you need a set of big-endian data types: for each kind of signed or unsigned integer with a size of 16 bits or more, you must have a big-endian variant, e.g. identified with a "_be" suffix.

        Most CPUs (including x86-64) have variants of the load and store instructions that reverse the byte order (e.g. MOVBE in x86-64). The remaining CPUs have byte reversal instructions for registers, so a reversed byte order load or store can be simulated by a sequence of 2 instructions.

        So the little-endian types and the big-endian data types must be handled identically by a compiler, except that the load and store instructions use different encodings.

        The structures used in a data-exchange format must be declared with the correct types and that should take care of everything.

        Any decent programming language must provide means for the user to define such data types, when they are not provided by the base language.

        The traditional UNIX conversion functions are the wrong way to handle endianness differences. An optimizing compiler must be able to recognize them as special cases in order to be able to optimize them away from the machine code.

        A program that is written using only data types with known endianness can be compiled for either little-endian targets or big-endian targets and it will work identically.

        All the problems that have ever existed in handling endianness have been caused by programming languages where the endianness of the base data types was left undefined, for fear that recompiling a program for a target of different endianness could result in a slower program.

        This fear is obsolete today.

        • cv5005 1 day ago
          Having different types seems wrong to me because endianness issues disappear after serialization, so it would make more sense to slap an annotation on the data field so that just the serializer knows how to load/store it.
      • whizzter 1 day ago
        MacOS "was" big-endian due to 68k and later PPC cpu's (the PPC Mac's could've been little but Apple picked big for convenience and porting).

        Their x86 changeover moved the CPU's to little-endian and Aarch64 continues solidifies that tradition.

        Same with Java, there's probably a strong influence from SPARC's and with PPC, 68k and SPARC being relevant back in the 90s it wasn't a bold choice.

        But all of this is more or less legacy at this point, I have little reason to believe that the types of code I write will ever end up on a s390 or any other big-endian platform unless something truly revolutionizes the computing landscape since x86, aarch64, risc-v and so on run little now.

    • j16sdiz 1 day ago
      When it comes to low-level network protocols (e.g. writing a TCP stack), the "network byte order" is always big-endian.
      • edflsafoiewq 1 day ago
        That's a serialization format.
      • 7jjjjjjj 1 day ago
        It goes without saying that all binary network protocols should document their byte order, and that if you're implementing a protocol documented as big endian you should use ntohl and friends to ensure correctness.

        However, if designing a new network protocol, choosing big endian is insanity. Use little endian, skip the macros, and just add

  #ifndef LITTLE_ENDIAN
    #error
  #endif
        
        or the like to a header somewhere.
        • AnthonyMouse 1 day ago
          What does it actually cost you to define a macro which is a no-op on little endian architectures and then use it at the point of serialization/deserialization?
          • kccqzy 1 day ago
            A lot, because to the compiler a no-op macro is the same as not having the macro in place, so it won't catch cases where you should have used the macro but didn't. Then you just give yourself a false sense of security unless you actually test on big endian.
            • AnthonyMouse 1 day ago
              The article demonstrates how you can run your existing test suite on big endian with a few simple commands. Or you can just wait until someone actually wants to use it there, they run your program or test suite on their actual big endian machine and then you get a one-line pull request for the place you forgot to use the macro.

              Adding other architectures to your build system also tends to reveal nasty bugs in general, e.g. you were unknowingly triggering UB on all architectures but on the one you commonly use it causes silent data corruption whereas one with a different memory layout results in a much more conspicuous segfault.

      • whizzter 1 day ago
        And honestly, at this point it's mostly a historical artifact. If we write that kind of stuff then sure, we need to care, but for producing modern stuff it's honestly a massive waste of time.

        FWIW, I do hobby stuff for Amigas (68k, big-endian), but that's just that: hobby stuff.

        • jasomill 1 day ago
          IBM z/Architecture, i (OS/400), and AIX aren't primarily used for "hobby stuff".
      • skrtskrt 1 day ago
        Prometheus index format is also a big-endian binary file - haven’t found any reference to why it was chosen.
    • cbmuser 1 day ago
      > What you should do instead is write all your code so it is little-endian only, as the only relevant big-endian architecture is s390x, and if someone wants to run your code on s390x, they can afford a support contract.

      Or you can just be a nice person and make your code endian-agnostic. ;-)

      • simonask 1 day ago
        Or they can be a nice person and pay me for the work I do that only benefits them?
    • bear8642 1 day ago
      > the only relevant big-endian architecture is s390x

      The adjacent POWER architecture is also still relevant - but as you say, they too can afford a support contract.

      • AKSF_Ackermann 1 day ago
        The adjacent POWER architecture seems to be used in ppc64le mode these days.
        • classichasclass 1 day ago
          For Linux, yes. AIX and IBM i still run big.
          • namibj 1 day ago
            The latter can definitely afford a support contract.
    • justin66 1 day ago
      > What you should do instead is write all your code so it is little-endian only

      Of course that’s not what people will do. They’ll write code and not have any idea which parts have a dependency on endianness. It won’t be given a thought during their design or testing and when they need to make it work on a different architecture, it will needlessly be a giant pain in the ass.

    • userbinator 1 day ago
      • imtringued 1 day ago
        There is a comment by sophisticles that fundamentally misunderstands the cost of an endian swap.

        It costs nothing other than having separate instructions for the different endian types.

        The reason for this is that on the transistor level it takes exactly zero transistors to implement a byte swap since all you are changing is in which order the wires are connected.

          Forcing software to deal with the pain of big-endian support in exchange for saving a nonexistent cost in hardware is such a bad trade that it's on the same level of stupidity as skipping the clear coat on a car, watching it rust, and expecting the owner to wax the car frequently to prevent the inevitable.

        • pezezin 12 hours ago
          It doesn't take zero transistors, at the very least you will need a multiplexer to choose between the two encodings. But such a mux is less than 10 transistors per bit, a rounding error for any modern CPU.
    • EPWN3D 1 day ago
      I mostly agree, but network byte ordering is still a thing.
    • mghackerlady 1 day ago
      or maybe AIX on POWER
    • GandalfHN 1 day ago
      [flagged]
      • AKSF_Ackermann 1 day ago
        Not sure why you consider that to be an issue; if you need to interact with a format that specifies values to be BE, just always byte-swap. And every appliance/embedded box I had to interact with ran either x86 or some flavour of 32-bit ARM (in LE mode, of course).
      • adrian_b 1 day ago
        Endianness problems should have been solved by compilers, not by programmers.

        Most existing CPUs have instructions to load and store memory data of various sizes into registers while reversing the byte order.

        So programs that work with big-endian data typically differ from those working with little-endian data just by replacing the load and store instructions.

        Therefore you should have types like int16, int32, int64, int16_be, int32_be, int64_be, for little-endian integers and big-endian integers and the compiler should generate the appropriate code.

        At least in the languages with user-defined data types and overloadable operators and functions, like C++, you can define these yourself, when the language does not provide them, instead of using ugly workarounds like htonl and the like, which can be very inefficient if the compiler is not clever enough to optimize them away.

        • cbarrick 22 hours ago
          Defining BE integer data types seems like a bad approach.

          I wouldn't want to maintain those types. The maintainer would either have to implement all of the arithmetic operations or assume that your users would try to hack their way to arithmetic. But really, you shouldn't ever do arithmetic with non-native endianness anyway.

          Instead, define all your interfaces to work with native endianness integers and just do byte swapping at the serialization boundaries.

        • wakawaka28 1 day ago
          If two machines have different endianness, then there is no byte order that is optimal for both. Therefore, this problem can't be solved with a one-size-fits-all optimal solution. People have tried to make code generators to take some of the pain out of encoding/decoding, but that isn't effortless either.
      • 7jjjjjjj 1 day ago
        Assuming an 8-bit byte used to be a "vendor specific hack." Assuming twos complement integers used to be a "vendor specific hack." When all the 36-bit machines died, and all the one's complement machines died, we got over it.

        That's where big endian is now. All the BE architectures are dying or dead. No big endian system will ever be popular again. It's time for big endian to be consigned to the dustbin of history.

        • zephen 1 day ago
          > It's time for big endian to be consigned to the dustbin of history.

          And, especially, what most people call big-endian, which is a bastardized mixed-endian mess where the most significant byte is byte zero while the least significant bit is bit zero.

          • jasomill 1 day ago
            While I have a strong personal preference for little endian, one thing I've always appreciated about IBM System/360 and its successors is that it at least has consistent notational conventions: most significant byte first, most significant bit zero[1][2].

            [1] https://bitsavers.trailing-edge.com/pdf/ibm/360/princOps/A22...

            [2] https://www.ibm.com/docs/en/SSQ2R2_15.0.0/com.ibm.tpf.toolki...

            • zephen 17 hours ago
              > IBM System/360 and its successors ... at least has consistent notational conventions

              Yes, if I hadn't known about that, I probably wouldn't have written "most."

              > While I have a strong personal preference for little endian

              Despite the purportedly even-handed treatment given in the seminal paper:

              https://www.rfc-editor.org/ien/ien137.txt

              That paper was obviously a product of motivated reasoning. And motivated reasoning in the hands of an intelligent and articulate person is always dangerous.

              (Today, in the public sphere, we are seeing successful motivated reasoning by people who are much less intelligent and articulate, but that is a completely separate issue.)

              The primary benefit (from observation of past arguments) that big-endian has is when you are dumping data and looking at a sequence of bytes, and don't want to mentally swap them around.

              But that itself begs the question. If you are so keen on big-end first, then why does your dump start at the small end of memory?

          • wpollock 1 day ago
            > And, especially what most people call big-endian, which is a bastardized mixed-endian mess of most significant byte is zero, while least significant bit is likewise zero.

            In the 1980s at AT&T Bell Labs, I had to program 3B20 computers to process the phone network's data. 3B20s used the weird byte order 1324 (maybe it was 2413), and I had to tweak the network protocols to start packets with a BOM (byte order mark), as the various switches that sent data didn't define endianness, then swap bytes accordingly.

            Lesson learned was Never Ignore Endian issues.

            • jasomill 1 day ago
              While I have no personal experience with the 3B2 series, its documentation[1] clearly illustrates the GP's complaint: starting from the most significant binary digit, bit numbers decrease while byte addresses increase.

              As for networking, Ethernet is particularly fun: least significant bit first, most significant byte first for multi-byte fields, with a 32-bit CRC calculated for a frame of length k by treating bit n of the frame as the coefficient of the (k - 1 - n)th order term of a (k - 1)th order polynomial, and sending the coefficients of the resulting 31st order polynomial highest-order coefficient first.

              [1] https://vtda.org/docs/computing/AT&T/3B2/3b2_Assembly_Lang_P...

              • zephen 16 hours ago
                I know this particular pain intimately.

                I was in charge of the firmware for a modem. I had written the V.42 error correction, and we contracted out the addition of the MNP correction protocol. They used the same CRC.

                The Indian (only important because of their cultural emphasis on book learning) subcontractor found my CRC function, decided it didn't quite look like the academic version they were expecting, and added code to swap it around and use it for MNP, thus making it wrong.

                When I pointed out it was wrong, they claimed they had tested it. By having one of our modems talk to another one of our modems. Sheesh.

            • zephen 16 hours ago
              > Lesson learned was Never Ignore Endian issues.

              This is an excellent lesson for data transport protocols and file formats.

              > I had to tweak the network protocols to start packets with a BOM (byte order mark) (as the various switches that sent data didn't define endianess), then swap bytes accordingly.

              (A similar thing happened to me with the Python switch from 2 to 3. Strings all became unicode-encoded, and it's too difficult to add the b sigil in front of every string in a large codebase, so I simply ensured that at the very few places that data was transported to or from files, all the strings were properly converted to what the internal process expected.)

              But, as many other commenters have rightly noted, big-endian CPUs are going the way of CPUs with 18 bit bytes that use ones-complement arithmetic, so unless you have a real need to run your program on a dinosaur, you can safely forget about CPU endianness issues.

          • imtringued 1 day ago
            Not just that. If you store a lower-precision type, e.g. a 2-byte integer, inside a larger-precision type, e.g. an 8-byte integer, you can read the 2-byte integer by just reading two bytes. Extending or shrinking data types leads to a very natural way of implementing arbitrary-precision arithmetic. To get the same capability with big endian, your pointer has to point at the end of the number. If you have byte arrays, like a string, you would have to swap the order in which you allocate data, starting from the end of the array and always decrementing your index from the array pointer. This would then also apply to structs: you point at the end of the struct and subtract the field offsets.

            Overall this seems like a pretty weird choice on a planet where the vast majority of text is written from left to right and only numbers are written right to left. Especially since endianness only affects byte order but not bit order, as you said.

            • zephen 16 hours ago
              The primary benefit touted for big-endian is "When I do a memory dump, the data looks right."

              But if you really believe the left side is bigger, why do you put the smaller memory address on the left side of your dump?

        • cmrdporcupine 1 day ago
          > No big endian system will ever be popular again

          Cries in 68k nostalgia

        • namibj 1 day ago
          JS numbers behave much more like C's definition of signed overflow being UB, as its numbers are effectively 53-ish-bit integers with a SEPARATE sign bit and non-associative behavior when overflow happens.
  • RustyRussell 1 day ago
    As with many comments here: use a build-time assertion that the system is little-endian, and ignore it. Untested code is broken code.

    I was at IBM when we gave up on big endian for Power. Too much new code assumed LE, and we switched, despite the insane engineering effort (though TBH, that effort had the side effect of retaining some absolutely first-class engineers a few more years).

  • zajio1am 1 day ago
    There is one reason, not mentioned in the article, why it is worth testing code on big-endian systems: some bugs are more visible there than on little-endian systems. For example, accessing an integer variable through a pointer of the wrong (smaller) type often passes silently on little-endian (the higher bytes are just ignored), while it reads/writes bad values on big-endian.
  • ncruces 1 day ago
    If you're using Go on GitHub (and doing stuff where this actually matters), adding this to your CI can be as simple as this workflow: https://github.com/ncruces/wasm2go/blob/v0.3.0/.github/workf...

    On Linux it's really as simple as installing QEMU binfmt and doing:

       GOARCH=s390x go test
  • rcxdude 1 day ago
    If you're worrying about the endianness of your processor, your code is somehow accessing memory from 'outside' as anything other than a char*, which is already thin ice as far as C and C++ are concerned. You should have a parse_<whatever>_le and/or parse_<whatever>_be function to convert from that byte stream into your native types, that only cares about the _endianness of the data_ (and they can be implemented without caring about your processor endianness as well). Then you don't need to worry about the processor you're running on at all. There's more significant and subtle processor quirks than endianness to worry about if you're trying to write portable code (namely, memory model and alignment requirements).
  • bluGill 1 day ago
    For most code it doesn't matter. It matters when you are writing files to be read by something else, or when sending data over a network. So make sure the places where those happen are thin shims that are easy to fix if something doesn't work (that is, don't write data from everywhere; put a layer in place for this).
    • rcxdude 1 day ago
      It doesn't really need a shim, if you're parsing incoming data that code should only depend on the endianness of the data, not the endianness of the processor that the code is running on, and in general if you're staying within defined behaviour in C and C++ you're going to need to do this anyway (e.g. https://commandcenter.blogspot.com/2012/04/byte-order-fallac...). If you're concerned about performance optimizers have been able to turn this into the right instructions for decades at this point.
    • JetSetIlly 1 day ago
      I agree. Dealing with different endianness has never been an issue so long as you're aware of where the boundaries are. A call to htons() or ntohs() (or the 32-bit equivalents) was the solution. I would hope all modern languages have similar helper functions/macros.
  • electroly 1 day ago
    > When programming, it is still important to write code that runs correctly on systems with either byte order

    I contend it's almost never important and almost nobody writing user software should bother with this. Certainly, people who didn't already know they needed big-endian should not start caring now because they read an article online. There are countless rare machines that your code doesn't run on--what's so special about big endian? The world is little endian now. Big endian chips aren't coming back. You are spending your own time on an effort that will never pay off. If big endian is really needed, IBM will pay you to write the s390x port and they will provide the machine.

    • Retr0id 1 day ago
      > There are countless rare machines that your code doesn't run on--what's so special about big endian?

      One difference is that when your endian-oblivious code runs on a BE system, it can be subtly wrong in a way that's hard to diagnose, which is a whole lot worse than not working at all.

      • electroly 1 day ago
        That sounds like a problem to deal with as part of your paid IBM s390x porting contract. I guess my point is: why deal with this before IBM is paying you? No other big endian platform matters, and s390x users are 100% large commercial customers. If IBM or one of their customers isn't paying you, there's nobody else who would need it. If IBM is paying you, you can test on a real z/VM that they provide. I see big endian as entirely their burden now; nobody else needs it. If they want it, they can pay for the work.
        • Retr0id 1 day ago
          I value correct code for purely selfish reasons. The most likely person to try to run my code on a BE system is me.
          • Retr0id 1 day ago
            Also, endian-correct code is usually semantically clearer. For example, if you're reading network-ordered bytes into an int, an unconditional endian swap (which will produce correct results on LE systems but not BE) is less clear than invoking a "network bytes to u32" helper.
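            A sketch of such a helper in C++ (name and signature are illustrative): the shift-based form is written in terms of the data's byte order, so it is correct on both little- and big-endian hosts with no #ifdef and no swap.

            ```cpp
            #include <cassert>
            #include <cstdint>

            // Reads a big-endian (network-order) u32 from a byte buffer.
            // Depends only on the byte order of the data, not the host.
            std::uint32_t be32_to_u32(const unsigned char* p) {
                return (std::uint32_t(p[0]) << 24) |
                       (std::uint32_t(p[1]) << 16) |
                       (std::uint32_t(p[2]) <<  8) |
                        std::uint32_t(p[3]);
            }

            int main() {
                const unsigned char wire[4] = {0x12, 0x34, 0x56, 0x78};
                assert(be32_to_u32(wire) == 0x12345678u);
                return 0;
            }
            ```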
            • kelnos 1 day ago
              I think that's a bit different than the argument being made. We should still always use htonl() and ntohl() etc. when dealing with protocols that use network byte order (a shame we're stuck dealing with that legacy). I think even if all big-endian machines magically disappeared tomorrow, we should still do that (instead of just unconditionally doing a byte-swap).

              But for everything else, it's fine to assume little-endian.

              You sound like some sort of purist, so sure, if you really want to be explicit and support both endiannesses in your software when needed, go for it. But as general advice to random programmers: don't bother.

            • namibj 1 day ago
              u32::from_be_bytes

              u32::from_le_bytes

              u32::from_ne_bytes (the "n" stands for native)

          • eesmith 1 day ago
            There are a lot of odd (by modern standards) machines out there.

            You're also the most likely person to try to run your code on an 18 bit machine.

            • fc417fc802 1 day ago
              It might sound outrageous but I guard against this sort of thing. When I write utility code in C++ I generally include various static asserts about basic platform assumptions.
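              For instance, a sketch of the kind of guards meant here (std::endian requires C++20; the specific assertions are illustrative):

              ```cpp
              #include <bit>      // std::endian (C++20)
              #include <climits>  // CHAR_BIT
              #include <limits>

              // Fail the build, not the user, if basic platform
              // assumptions ever stop holding.
              static_assert(CHAR_BIT == 8, "8-bit bytes assumed");
              static_assert(sizeof(int) == 4, "32-bit int assumed");
              static_assert(std::endian::native == std::endian::little ||
                            std::endian::native == std::endian::big,
                            "mixed-endian platforms unsupported");
              static_assert(std::numeric_limits<double>::is_iec559,
                            "IEEE 754 binary64 double assumed");

              int main() { return 0; }
              ```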
              • classichasclass 1 day ago
                So do I. I don't find that outrageous at all. Anyone trying to do the port to something unusual would appreciate the warning.

                Granted, I still work on a fair number of big endian systems even though my daily drivers (ppc64le, Apple silicon) are little.

                • bear8642 1 day ago
                  > daily drivers (ppc64le, Apple silicon)

                  How come you're running ppc64le as a daily driver?

                  • CursedSilicon 1 day ago
                    Cameron is known to have a TALOS II machine
              • peyton 1 day ago
                This is much-appreciated. I’m hardly a Richard Stallman, but finding little incompatibilities after-the-fact is pretty irritating.
                • eesmith 1 day ago
                  Take a look at https://www.kermitproject.org/ckupdates.html . These quotes come from the last few years:

                  > [fixes] specific to VMS (a.k.a. OpenVMS),

                  > For conformity with DECSYSTEM-20 Kermit ...

                  > running on a real Sun3, compiled with a non-ANSII compiler (Sun cc 1.22)

                  > this is fatal in HP-UX 10 with the bundled compiler

                  > OpenWatcom 1.9 compiler

                  > OS/2 builds

                  > making sure that all functions are declared in both ANSI format and K&R format (so C-Kermit can built on both new and old computers)

                  Oooooh! A clang complaint: 'Clang also complains about perfectly legal compound IF statements and/or complex IF conditions, and wants to have parens and/or brackets galore added for clarity. These statements were written by programmers who understood the rules of precedence of arithmetic and logical operators, and the code has been working correctly for decades.'

                  • jasomill 1 day ago
                    But wait, there's more!

                    As of the fourth Beta, DECnet support has been re-enabled. To make LAT or CTERM connections you must have a licensed copy of Pathworks32 installed.

                    SSH is now supported on 32bit ARM devices (Windows RT) for the first time

                    REXX support has been extended to x86 systems running Windows XP or newer. This was previously an OS/2-only feature.

                    No legacy telnet encryption (no longer useful, but may return in a future release anyway)

                    For context:

                    The first new Kermit release for Windows in TWENTY-TWO YEARS

                    Yes, it's called Kermit 95 once again! K95 for short. 2025 is its 40th anniversary.

              • eesmith 1 day ago
                There's platform and there's platform. I assume a POSIX platform, so I don't need to check for CHAR_BIT. My code won't work on some DSP with 64-bit chars, and I don't care enough to write that check.

                Many of the tests I did back in the 1990s seem pointless now. Do you have checks for non-IEEE 754 math?

                • shakna 1 day ago
                  Well, last year clang did not define __STDC_IEC_559__, so assuming IEEE-754 math with most C compilers is a bad idea.
                  • fc417fc802 1 day ago
                    Using C++ under Clang 17 and later (possibly earlier as well, I haven't checked) std::numeric_limits<T>::is_iec559 comes back as true for me for x86_64 on Debian as well as when compiling for Emscripten. Might it be due to your compiler flags? Or is this somehow related to a C/C++ divergence?
                    • shakna 1 day ago
                      The standard warns that the macros and assertions can report true for this one even when full support isn't actually there. The warning is there because that's what compilers currently do.

                      It's one of the caveats of the C family that developers are supposed to be aware of, but often aren't. It doesn't support IEEE 754 fully. There is a standard for doing so, but no one has actually implemented it.

                      • fc417fc802 1 day ago
                        I don't see any such caveat mentioned here? Is the linked page incomplete? https://en.cppreference.com/w/cpp/types/numeric_limits/is_ie...

                        Of course in my case what I'm actually concerned with is the behavior surrounding inf and NaN. Thankfully I've never been forced to write code that relied on subtle precision or rounding differences. If it ever comes up I'd hope to keep it to a platform independent fixed point library.

                        • shakna 15 hours ago
                          CPPReference is not the C++ standard. It's a wiki. It gets things wrong. It doesn't always give you the full information. Probably best not to rely on it for things that matter.

                          But, for example, LLVM does not fully support IEEE 754 [0].

                          And nor does GCC - who list it as unsupported, despite defining the macro and having partial support. [1]

                          The biggest caveat is in Annex F of the C standard:

                          > The C functions in the following table correspond to mathematical operations recommended by IEC 60559. However, correct rounding, which IEC 60559 specifies for its operations, is not required for the C functions in the table.

                          The C++ standard [2] barely covers support, but if a type supports any of the properties of IEC 60559, then it gets is_iec559 - even if that support is _incomplete_.

                          This paper [3] is a much deeper dive - but the current state for C++ is worse than for C. It's underspecified.

                          > When built with version 18.1.0 of the clang C++ compiler, without specifying any compiler options, the output is:

                          > distance: 0.0999999

                          > proj_vector_y: -0.0799999

                          > Worse, if -march=skylake is passed to the clang C++ compiler, the output is:

                          > distance: 0.1

                          > proj_vector_y: -0.08

                          [0] https://github.com/llvm/llvm-project/issues/17379

                          [1] https://www.gnu.org/software/gcc/projects/c-status.html

                          [2] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n49...

                          [3] https://isocpp.org/files/papers/P3375R2.html

                    • eesmith 1 day ago
                      If I am not mistaken, is_iec559 concerns numerical representation, while __STDC_IEC_559__ is broader, and includes the behavior of numerical operations like 1.0/-0.0 and various functions.

                      Huh. https://en.cppreference.com/w/c/23.html says the "Old feature-test macro" __STDC_IEC_559__ was deprecated in C23, in favor of __STDC_IEC_60559_BFP__ .

                  • eesmith 1 day ago
                    Do you have checks for non-IEEE 754 math?
                    • fc417fc802 1 day ago
                      I do, yes. I check that the compiler reports the desired properties and in cases where my code fails to compile because it does not I special case and manually test each property my code depends on. In my case that's primarily mantissa bit width for the sake of various utility functions that juggle raw FP bits.

                      Even for "regular" architectures this turns out to be important for FP data types. Long double is an f128 on Emscripten but an f80 on x86_64 Clang, where f128 is provided as __float128. The last time I updated my code (admittedly quite a while ago) Clang version 17 did not (yet?) implement std::numeric_limits support for f128.

                      Honestly there's no good reason not to test these sorts of assumptions when implementing low level utility functions because it's the sort of stuff you write once and then reuse everywhere forever.
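                      A sketch of what guarding those FP assumptions can look like (the exact properties checked are illustrative, and double_bits is a hypothetical helper):

                      ```cpp
                      #include <cassert>
                      #include <cstdint>
                      #include <cstring>
                      #include <limits>

                      // Verify the properties the bit-juggling below relies on.
                      static_assert(std::numeric_limits<double>::is_iec559,
                                    "IEEE 754 binary64 assumed");
                      static_assert(std::numeric_limits<double>::digits == 53,
                                    "53-bit significand assumed");
                      static_assert(sizeof(double) == sizeof(std::uint64_t),
                                    "double must fit in a u64");

                      // With those guarantees, raw-bit access is safe:
                      std::uint64_t double_bits(double d) {
                          std::uint64_t u;
                          std::memcpy(&u, &d, sizeof u);  // defined-behaviour type pun
                          return u;
                      }

                      int main() {
                          // binary64 1.0: sign 0, biased exponent 1023, zero mantissa.
                          assert(double_bits(1.0) == 0x3FF0000000000000ull);
                          return 0;
                      }
                      ```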

                    • shakna 1 day ago
                      Okay, as the last wasn't obvious enough: C does not do IEEE 754 math.

                      It is _all_ non-IEEE 754 math.

                      That it isn't compliant is a compiler guarantee, in the current state of things.

                      You may as well have an `assert(1)`.

                      • eesmith 1 day ago
                        And as I wrote, "There's platform and there's platform."

                        I don't support the full range of platforms that C supports. I assume 8-bit chars. I assume good hardware support for 754. I assume the compiler's documentation is correct when it says it maps "double" to "binary64" and uses native operations. I assume that if someone else compiles my code with non-754 flags, like fused multiply-add, then it's not a problem I need to worry about.

                        For that matter, my code doesn't deal with NaNs or inf (other than input rejection tests) so I don't even need fully conformant 754.

                        • fc417fc802 1 day ago
                          So you don't test for it because your code doesn't use it. Which is fine, but says nothing about code which does depend on the relevant assumptions.
                          • eesmith 1 day ago
                            I say nothing about code that has to support 64-bit chars, because my entire point was that my definition of "platform" is far more restrictive than C's, and apparently yours.

                            You wrote "I generally include various static asserts about basic platform assumptions."

                            I pointed out "There's platform and there's platform.", and mentioned that I assume POSIX.

                            So of course I don't test for CHAR_BIT as something other than 8.

                            If you want to support non-POSIX platforms, go for it! But adding tests for every single place where the C spec allows implementation-defined behavior, when all the compilers I use have the same implementation-defined behavior and have had for years or even decades, seems quixotic to me, so I'm not going to do it.

                            And I doubt you have tests for every single one of those implementation-defined platform assumptions, because there are so many of them, and maintaining those tests when you don't have access to a platform with, say, 18-bit integers to test those tests, seems like it will end up with flawed tests.

                            • fc417fc802 15 hours ago
                              > maintaining those tests when you don't have access to a platform with, say, 18-bit integers to test those tests, seems like it will end up with flawed tests.

                              No? I don't over generalize for features I don't use. I test to confirm the presence of the assumptions that I depend on. I want my code to fail to compile if my assumptions don't hold.

                              I don't recall if I verify CHAR_BIT or not but it wouldn't surprise me if I did.

                        • shakna 1 day ago
                          I can test for some of those. So I can support a broader range of platforms, than just "works for me".

                          I can't support IEEE 754, so it's simply irrelevant - so long as I know I cannot support it, and that behaviour differs.

      • CJefferson 1 day ago
        However, unless you are a super-programmer, it's very easy to introduce subtle bugs. Software I write has hit this occasionally: someone somewhere casts an int to bytes to do some bit-twiddling. Checking your whole codebase for this is incredibly hard.

        My modern choice is just to make clear to BE users I don't support them, and while I will accept patches I'll make no attempt to bugfix for them, because every time I try to get a BE VM running a modern linux it takes a whole afternoon.

      • edflsafoiewq 1 day ago
        Static-assert the machine is little endian.
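        In C++20 that's essentially a one-liner via std::endian (a sketch):

        ```cpp
        #include <bit>  // std::endian (C++20)

        // Refuse to compile on big-endian (or mixed-endian) targets, so
        // endian-oblivious code fails loudly at build time instead of
        // producing subtly wrong results at run time.
        static_assert(std::endian::native == std::endian::little,
                      "this code assumes a little-endian target");

        int main() { return 0; }
        ```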
        • Retr0id 1 day ago
          Someone's LLM will comment out that line the moment it causes a build failure
    • CJefferson 1 day ago
      You are correct. Honestly, I couldn't disagree more with the article. At this point I can't imagine why it's important.

      It's also increasingly hard to test, particularly when you have large, expensive test suites which run incredibly slowly on these simulated machines.

  • pragmaticviber 1 day ago
    It's all fun and games until you have to figure out if the endianness bug is in your code or in QEMU's s390x emulation.
    • rurban 1 day ago
      Haven't found any bug in QEMU's s390x, but lots in endian code.
  • drob518 1 day ago
    This whole endianness issue can be traced to western civilization adopting Arabic numbers. Western languages are written left to right, but Arabic is right to left. Thus, Arabic numbers appear as big-endian when viewed in western languages. Consequently, big-endian appears to be "normal" for us in the modern age. But in Arabic, numbers appear little-endian because everything is right to left. Roman numbers are big-endian, though. Maybe that's why we kept the Arabic ordering even when adopting the system? We could have flipped Arabic numbers around and written them as little-endian, but we didn't.
    • zajio1am 1 day ago
      'Arabic' numerals come originally from India, from the Brahmi numerals. And the Brahmi script was written left to right. So big-endian was 'normal' even originally; it was the Arabs who kept left-to-right numbers within a right-to-left script (and therefore use little-endian relative to the direction of Arabic script).
      • zephen 1 day ago
        > it was Arabs who kept left-to-right numbers within right-to-left script

        Do they do this? I thought they swapped this as well.

        • Narishma 1 day ago
          There are actually two ways to read numbers in Arabic.

          The most common is to start from the most significant digit and read left-to-right until the last two digits, which you then read right-to-left.

          A less common alternative is to read right-to-left starting from the least significant digit.

          • zephen 16 hours ago
            Interesting! How do you know which alternative is being used?
            • Narishma 13 hours ago
              To give an example, you could read 1234 as either 'one thousand and two hundred and four and thirty' or 'four and thirty and two hundred and one thousand'.

              Now that I think about it though, I've only seen the latter way used for the year in a date.

    • pezezin 1 day ago
      How do you read a number like 1234? "One thousand two hundred thirty-four", big endian.

      Most (all?) Western languages read out their numbers in big endian, as do East Asian languages like Chinese, Japanese and Korean. It is only natural that we write down our numbers in big endian; it can be argued that the mistake was making little-endian CPUs.

      • userbinator 1 day ago
        Little endian: byte N has value 256^N, bit n has value 2^n.

        Big endian: byte N has value 256^(L-N); bit n has value 2^n or 2^(l-n) depending on the architecture (some are effectively little bit-endian but big byte-endian), where L and l are one less than the byte and bit sizes of the whole integer, respectively.

        Design hardware or even write arbitrary precision routines, and you'll quickly realise that "big endian is backwards, little endian is logical".
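        The claim is easy to check: on a little-endian host, summing byte N weighted by 256^N reconstructs the value directly (a sketch; the final assert holds only on little-endian targets).

        ```cpp
        #include <cassert>
        #include <cstdint>
        #include <cstring>

        int main() {
            std::uint32_t x = 0x01020304u;
            unsigned char b[4];
            std::memcpy(b, &x, sizeof b);  // inspect the in-memory byte order

            // Sum b[n] * 256^n; on little-endian this rebuilds x, i.e.
            // byte n really does carry the 256^n place value.
            std::uint32_t sum = 0, weight = 1;
            for (int n = 0; n < 4; ++n) {
                sum += b[n] * weight;
                weight *= 256;
            }
            assert(sum == x);  // holds on x86, LE ARM, RISC-V, ...
            return 0;
        }
        ```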

        • zajio1am 20 hours ago
          As bits are generally not addressable / not ordered, it makes no sense to call CPU architecture big/little bit-endian. That makes sense only for serial lines/buses.
          • userbinator 17 hours ago
            Wrong for any CPU with bit manipulation instructions... which is nearly all of them.
      • DavidVoid 1 day ago
        How do you read a number like 17? "seventeen", small endian.

        Language is messy, some more than others [1].

        [1]: https://en.wikipedia.org/wiki/Danish_language#Numerals

        • pezezin 13 hours ago
          "Diecisiete", still big endian. (you could make an argument for numbers between 11 and 15 though)

          Yeah, I know there are exceptions, but on average most human languages are big endian.

  • eisbaw 1 day ago
    I did that many years back, but with MIPS and MIPSel: https://youtu.be/BGzJp1ybpHo?si=eY_Br8BalYzKPJMG&t=1130

    presented at Embedded Linux Conf

  • BobbyTables2 1 day ago
    MIPS is often big endian.

    Of course the endianness only matters to C programmers who take endless pleasure in casting raw data from external sources into structs.

    • bigstrat2003 1 day ago
      Hey, there's no need to kink shame C programmers like that.
  • siraben 1 day ago
    Without installing anything, this can also be reproduced with a shell script that uses a Nix shebang to specify the cross compilers.

    https://gist.github.com/siraben/cb0eb96b820a50e11218f0152f2e...

  • beached_whale 1 day ago
    I've used docker buildx to do this in the past. Easier to work with than QEMU directly (buildx uses it under the hood).
  • 1over137 1 day ago
    >But without access to a big-endian machine, how does one test it? QEMU provides a convenient solution. With its user mode emulation we can easily run a binary on an emulated big-endian system

    Nice article! But it's a pity it does not elaborate on how...
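    For what it's worth, the gist of the approach is roughly this (a sketch; package names assume a Debian-like distro, and the source file name is illustrative):

    ```shell
    # Install a big-endian MIPS cross-compiler and QEMU user-mode emulation.
    sudo apt install gcc-mips-linux-gnu qemu-user

    # Cross-compile statically so no MIPS shared libraries are needed.
    mips-linux-gnu-gcc -static -O2 -o endian_test endian_test.c

    # qemu-mips runs the big-endian binary directly -- no full VM required.
    qemu-mips ./endian_test
    ```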

  • throwaway2027 1 day ago
    Is there any benefit in edge cases to using big-endian these days?
    • zephen 1 day ago
      Well, blogging about how it's important can certainly give insight to others about the age of your credentials, just in case repeatedly shouting "Get off my lawn!" didn't suffice.
  • IshKebab 1 day ago
    > When programming, it is still important to write code that runs correctly on systems with either byte order

    Eh, is it? There aren't any big-endian systems left that matter for anyone who isn't doing super-niche stuff. Unless you are writing a really foundational library that you want to work everywhere (like libc, zlib, libpng, etc.), you can safely assume everything is little endian. I usually just put a static_assert that the system is little endian in my C++.