POSIX Is Not a Shell

(alganet.github.io)

36 points | by gaigalas 5 hours ago

9 comments

  • JdeBP 44 minutes ago
    Stéphane Chazelas would agree,

    * https://unix.stackexchange.com/a/496642/5132

    because the problem here is educating people with slipshod ideas about 'sh' being 'the POSIX shell' or (worse) 'the Bourne shell'. Both M. Chazelas and M. Gaigalas are making the point that 'sh' is a language that one aims to write in, not one of the many programs that sort of, sometimes, if invoked in the right way, implement that language; and a subordinate point that people are generally very poor at doing that, even as they insist that they are writing 'POSIX shell script'.

    Fun facts: Standardization led by existing practice is not a simple process.

    * https://unix.stackexchange.com/a/493743/5132

    POSIX/SUS standardization is not a static thing. The long discussed thorny issue of echo has now subtly changed from all of those explanations given about it over the years, because in 2024 the standard was changed.

    * https://pubs.opengroup.org/onlinepubs/9799919799/utilities/e...

    M. Chazelas's own quite famous 2013 StackExchange answer on the subject has not yet been updated with the change that now incorporates -e and -E into the rule. Amusingly, it was M. Chazelas that raised defect 1222 that caused this change.

    * https://unix.stackexchange.com/a/65819/5132

    * https://www.austingroupbugs.net/view.php?id=1222

    • wpollock 22 minutes ago
      Stéphane (not "M") Chazelas is a genius. I have written him several times over the past 30 years, encouraging him to write a book about shell, POSIX, and other topics that he is an expert in. He writes very well too. I fear we will all be the poorer if/when he retires.
  • qudat 18 minutes ago
    This is quite a timely post compared to: https://bower.sh/posix-shell-is-all-you-need

    This post was thought-provoking. I wonder: is the hidden argument here that the POSIX spec for a shell is not well specified, given how much variance there is between the implementations?

    Or is the fundamental issue simply a matter of history? Both?

    • gaigalas 0 minutes ago
      Most shell families precede the spec. Both bash and ksh had features at the time the spec was written that are not in it. It's weird.

      My main point is that following the spec doesn't guarantee shell scripts will be portable, which is a common misconception.

  • Muhammad523 3 hours ago
    This post is nice: the writer first explains a problem, using a simple example. In the next section, they reflect a bit on the problem, and then they casually mention two tools they built. In my opinion, this is amazing: you promote your project while also making clear the problem it solves. Use their tool to test how portable your code is.
  • kevin_thibedeau 21 minutes ago
    Solaris most definitely has a POSIX shell at /usr/xpg4/bin/sh which behaves differently than OG Bourne at /bin/sh.
  • croemer 18 minutes ago
    Inconsistent: claims /bin/sh on Alpine is dash in one paragraph, BusyBox ash in another.

    Smells a lot like AI writing.

    • gaigalas 13 minutes ago
      They are the same, kinda! ash and dash are from the Almquist family.

      https://en.wikipedia.org/wiki/Almquist_shell

      It's just poetic license.

      • croemer 4 minutes ago
        They are not the same. For example, `[[` apparently works on Alpine's ash but not on Debian's dash. BusyBox ash has some bash compatibility options enabled in Alpine.

        It's ironic that a post about shell differences glosses over this. Maybe not surprising that slop is sloppy.
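
        The difference can be probed directly rather than argued about. A rough sketch, assuming each shell accepts `-c`; the function names `has_double_bracket` and `is_equal` are made up for illustration:

        ```shell
        # Report whether a given shell accepts the non-POSIX [[ ... ]] test.
        # $1 is the shell to probe. stderr is discarded because a shell
        # without [[ typically complains that the command was not found.
        has_double_bracket() {
            "$1" -c '[[ a == a ]]' 2>/dev/null
        }

        # Portable alternative: the single-bracket test with = is in the spec.
        is_equal() {
            [ "$1" = "$2" ]
        }

        if has_double_bracket sh; then
            echo "sh accepts [["
        else
            echo "sh rejects [["
        fi
        ```

        Which branch is taken depends entirely on what `sh` is on the machine, which is rather the point.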

  • echoangle 3 hours ago
    Pretty bad argument. If it’s not defined by POSIX, it’s not POSIX compatible if you rely on a specific behavior.

    If you only use defined behavior and it works, it is compatible.

    It’s like saying C99 isn’t a compiler. True, but you can still write C99 code, right?

    • smitty1e 2 hours ago
      > C99 isn’t a compiler.

      Sure, but the point here is that if we say "Write in X" we generally understand it to mean "Treat X like a standard and don't get too colloquial with the stylings."

      Pedantry is worthwhile, but it can be a diminishing returns game.

      • eqvinox 1 hour ago
        Feels like you missed the point.

        On the example of 'echo \n' - it's not defined in POSIX, therefore a script written in "POSIX shell" must simply never hit that case.

        TFA kinda implies you can't target POSIX shell. That's silly, of course you can. The question is, what tools are there to check for compliance. Whether running on 14 shells is a good such tool - idk. Something specifically searching for POSIX violations might be better.
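
        One common way to "never hit that case" is to route output through printf instead of echo, since printf lets the script say explicitly whether backslash escapes should be interpreted. A minimal sketch; the helper names `say_raw` and `say` are hypothetical:

        ```shell
        # Print the arguments followed by a newline, with no escape processing.
        # With %s the operand is plain data, so backslashes come out verbatim.
        say_raw() {
            printf '%s\n' "$*"
        }

        # Print with backslash escapes interpreted, requested explicitly via %b.
        say() {
            printf '%b\n' "$*"
        }

        say_raw 'a\nb'   # one line: a, backslash, n, b
        say 'a\nb'       # two lines: a, then b
        ```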

        • Joker_vD 1 hour ago
          Well, with the C language it's pretty much the same. You are supposed to "just" never write (or rather, most of the time, to just not execute) anything that is UB. And lots and lots of people to this day continue to believe that they can do this (most of the time, they're wrong).
          • eqvinox 22 minutes ago
            UB (specified to be undefined) and 'plain' unspecified are not the same thing.
        • gaigalas 1 hour ago
          The spec is not that good.

          `local`, for example, is present in almost all shells, but they decided to leave it out solely because of ksh93, where the scoping is different. Using it became undefined behavior.

          When the spec was written, ksh was important. Since then, it has only been revised but not updated and I consider it to be obsolete.

          So, if you follow POSIX strictly, you lose local scope in functions, which is more likely to cause bugs and is hard to catch with a linter like the one you suggested. You're left with a broken feature set (in many other respects too) that is not actually practical. Even ShellCheck makes concessions.
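
          For reference, the usual spec-only workaround is a subshell function body, which costs a fork and cannot set variables in the caller. A minimal sketch with a made-up function name, `add_all`:

          ```shell
          # A ( ) body instead of { } runs the function in a subshell, so
          # assignments made inside it cannot leak into the caller.
          add_all() (
              total=0
              for x in "$@"; do
                  total=$((total + x))
              done
              echo "$total"
          )

          total=outer
          add_all 1 2 3   # prints 6
          echo "$total"   # still prints "outer"
          ```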

          • echoangle 55 minutes ago
            I don’t get this point either. If local is not in the POSIX spec, I guess you can’t use it if you want to be POSIX compatible. Just because many shells do it doesn’t mean it’s POSIX.
            • gaigalas 49 minutes ago
              You can target POSIX if you want to, but doing that doesn't guarantee shell scripts will work.

              The blog post stresses this, the difference between POSIX and portability.

              If you want portability, testing is better for now. One of the goals of these projects is to more precisely capture a retrospec (what actually works, not what was specified). It's the same thing they did with HTML5.
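
              As a rough illustration of testing instead of trusting the spec, a loop like this runs one snippet under whichever shells happen to be installed. The shell list and the function name `try_shells` are assumptions for the example, not the projects' actual tooling:

              ```shell
              # Run a snippet under every candidate shell that is installed and
              # print each shell's name next to what the snippet wrote.
              try_shells() {
                  snippet=$1
                  for sh_name in sh bash dash ksh mksh yash zsh; do
                      # command -v is the POSIX way to ask whether a command exists.
                      command -v "$sh_name" >/dev/null 2>&1 || continue
                      printf '%s: ' "$sh_name"
                      # Show newlines as | so the output of every shell fits on one line.
                      "$sh_name" -c "$snippet" 2>&1 | tr '\n' '|'
                      echo
                  done
              }

              # echo with a backslash is the classic case where results differ.
              try_shells 'echo "a\nb"'
              ```

              A shell whose echo interprets the escape shows `a|b|`; one that does not shows `a\nb|`.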

  • sdovan1 2 hours ago
    If your environment is POSIX, testing scripts with a tool written in POSIX shell, like shellspec[1], might also be a choice.

    [1] https://shellspec.info/

  • paulddraper 48 minutes ago
    > When someone says "write it in POSIX shell for portability," they mean well.

    > POSIX is a specification. Not a program. The thing that actually runs your script is bash, dash, ash, ksh, yash, or one of a dozen others.

    “When someone says ‘write it in ECMAScript,’ they mean well.

    “ECMAScript is a specification. Not a program. The thing that actually runs your script is Node.js, Bun, Deno, Rhino, or one of a dozen others.”

    See how silly that sounds?

    • gaigalas 16 minutes ago
      No one says "Let's write ECMAScript for portability". If someone did, I would probably be writing about that too.
  • jmclnx 2 hours ago
    Will not build without docker, so I am out of luck. This tells me this is not portable, even to some Linuxes.
    • Joker_vD 1 hour ago
      Strict POSIX conformance is arguably worse. I mean, have you seen what it advises for shebangs? First of all:

          The shell reads its input from a file (see sh), from the -c option or from the system() and popen() functions defined in the System Interfaces volume of POSIX.1-2017. If the first line of a file of shell commands starts with the characters "#!", the results are unspecified.
      
      Ah, so shebangs are not required to be supported, already a great start.

          Applications should note that the standard PATH to the shell cannot be assumed to be either /bin/sh or /usr/bin/sh, and should be determined by interrogation of the PATH returned by getconf PATH, ensuring that the returned pathname is an absolute pathname and not a shell built-in. [...]
      
          Furthermore, on systems that support executable scripts (the "#!" construct), it is recommended that applications using executable scripts install them using getconf PATH to determine the shell pathname and update the "#!" script appropriately as it is being installed (for example, with sed). For example:
      
              #
              # Installation time script to install correct POSIX shell pathname
              #
              # Get list of paths to check
              #
              Sifs=$IFS
              Sifs_set=${IFS+y}
              IFS=:
              set -- $(getconf PATH)
              if [ "$Sifs_set" = y ]
              then
                  IFS=$Sifs
              else
                  unset IFS
              fi
              #
              # Check each path for 'sh'
              #
              for i
              do
                  if [ -x "${i}"/sh ]
                  then
                      Pshell=${i}/sh
                  fi
              done
              #
              # This is the list of scripts to update. They should be of the
              # form '${name}.source' and will be transformed to '${name}'.
              # Each script should begin:
              #
              # #!INSTALLSHELLPATH
              #
              scripts="a b c"
              #
              # Transform each script
              #
              for i in ${scripts}
              do
                  sed -e "s|INSTALLSHELLPATH|${Pshell}|" < ${i}.source > ${i}
              done
      
      Marvelous. What a robust foundation of useful and hard-to-misuse utilities.
    • gaigalas 1 hour ago
      Author here.

      It definitely builds outside docker. It's a musl-cross-make toolchain, so you can procure the dependencies locally if you don't like the Docker recipes.

      Feel free to open an issue if you feel like that's a challenge. Likely, you can get it to work but checksum reproducibility will be hard without a controlled environment like docker.