Why is the first C++ (m)allocation always 72 KB?

(joelsiks.com)

82 points | by joelsiks 6 hours ago

6 comments

  • pjmlp 1 hour ago
    This is compiler specific and cannot be generalised as C++.
    • zabzonk 1 hour ago
      Well, yes, but still quite interesting, IMHO. It's not like GCC is one of the least used compilers.
      • pjmlp 44 minutes ago
        Yeah, but that isn't C++ in isolation, thus the title is incorrect.
    • compiler-guy 1 hour ago
      And C++ library specific as well. Perhaps even more so.
  • aliveintucson 57 minutes ago
    I think you should read up on what "always" means.
  • throwaway2037 4 hours ago
    I would like to see the source code for libmymalloc.so; however, I don't see anything in the blog post. Nor do I see anything in his GitHub profile: https://github.com/jsikstro

    Also, I cannot find his email address anywhere (to ask him to share it on GitHub).

    Am I missing something?

    • joelsiks 4 hours ago
      The exact implementation of mymalloc isn't relevant to the post. I have an old allocator published at https://github.com/joelsiks/jsmalloc that I did as part of my Master's thesis, which uses a debug-logging mechanism similar to the one described in the post.
    • nly 4 hours ago
      dlsym() with the RTLD_NEXT flag basically:

      https://catonmat.net/simple-ld-preload-tutorial-part-two

      There's actually a better way to hook GNU's malloc:

      https://www.man7.org/linux/man-pages/man3/malloc_hook.3.html

      This is better because you can disable the hook inside the callback, and therefore use malloc within your malloc hook (no recursion)

      But you can't use this mechanism before main()
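
      For reference, the dlsym()/RTLD_NEXT approach can be sketched roughly as below. This is a minimal sketch, not the post's libmymalloc.so: the file name, the hooked_calls counter, and the log format are made up for illustration. It assumes Linux with glibc and g++ (which defines _GNU_SOURCE, needed for RTLD_NEXT), and the lazy symbol lookup is not thread-safe.

```cpp
// hook.cpp (hypothetical file name). One possible way to build and use it:
//   g++ -O2 -fPIC -shared hook.cpp -o libhook.so   (older glibc may also need -ldl)
//   LD_PRELOAD=./libhook.so ./your_program
#include <dlfcn.h>   // dlsym, RTLD_NEXT
#include <unistd.h>  // write, STDERR_FILENO
#include <atomic>
#include <cstddef>
#include <cstdio>    // snprintf
#include <cstdlib>   // glibc's declaration of malloc

using malloc_fn = void* (*)(std::size_t);

static malloc_fn real_malloc = nullptr;  // the next malloc in lookup order (libc's)
static bool resolving = false;           // guards against re-entry during dlsym

// Illustrative counter so callers can observe that the hook ran.
std::atomic<unsigned long> hooked_calls{0};

// noexcept matches glibc's declaration of malloc in C++ mode (an assumption
// about the C library in use; other libcs may declare it differently).
extern "C" void* malloc(std::size_t size) noexcept {
    if (real_malloc == nullptr) {
        if (resolving) return nullptr;  // dlsym itself asked for memory: give up
        resolving = true;
        real_malloc = reinterpret_cast<malloc_fn>(dlsym(RTLD_NEXT, "malloc"));
        resolving = false;
        if (real_malloc == nullptr) return nullptr;
    }

    void* p = real_malloc(size);
    hooked_calls.fetch_add(1, std::memory_order_relaxed);

    // Format into a stack buffer and use write(2) directly, so that logging
    // does not go through stdio buffers that might call malloc again.
    char line[64];
    int n = std::snprintf(line, sizeof line, "malloc(%zu) = %p\n", size, p);
    if (n > 0) {
        std::size_t len = static_cast<std::size_t>(n) < sizeof line
                              ? static_cast<std::size_t>(n)
                              : sizeof line - 1;
        ssize_t ignored = write(STDERR_FILENO, line, len);
        (void)ignored;
    }
    return p;
}
```

      Because the definition is picked up ahead of libc's at symbol resolution time, it also sees allocations made before main(), which is what you need to observe the startup allocation the post discusses.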

      • Joker_vD 3 hours ago

            The use of these hook functions is not safe in multithreaded
            programs, and they are now deprecated.  From glibc 2.24 onwards,
            the __malloc_initialize_hook variable has been removed from the
            API, and from glibc 2.34 onwards, all the hook variables have been
            removed from the API.  Programmers should instead preempt calls to
            the relevant functions by defining and exporting malloc(), free(),
            realloc(), and calloc().
        • nly 2 hours ago
          Yeah. Shame though, because it gave you the option to control exactly when you hooked and didn't hook, which let you stop and start debugging allocations based on arbitrary triggers.

          The global variable approach was very useful and pretty low overhead.

          • jeffbee 5 minutes ago
            If you only wanted to observe the behavior the post is discussing, it seems like `ltrace -e malloc` is a lot easier.
  • Joker_vD 3 hours ago
    Huh. Why is this emergency pool not statically allocated? Is it possible to tune the size of this pool on libstdc++ startup somehow? Because otherwise it absolutely should've been statically allocated.
    • joelsiks 2 hours ago
      I did mention it briefly in the post, but you can opt-in for a fixed-size statically allocated buffer by configuring libstdc++ with --enable-libstdcxx-static-eh-pool. Also, you can opt-out of the pool entirely by configuring the number of objects in the pool to zero with the environment variable GLIBCXX_TUNABLES=glibcxx.eh_pool.obj_count=0.
      • ninkendo 1 hour ago
        I wonder why it’s opt-in. Maybe it’s part of the whole “you only pay for what you use” ethos, i.e. you shouldn’t have to pay the cost for a static emergency pool if you don’t even use dynamic memory allocation.
  • znpy 45 minutes ago
    > TLDR; The C++ standard library sets up exception handling infrastructure early on, allocating memory for an “emergency pool” to be able to allocate memory for exceptions in case malloc ever runs out of memory.

    Reminds me of Perl's $^M: https://perldoc.perl.org/variables/$%5EM

    In Perl you can "hand-manage" that. This line would allocate a 64K buffer for use in an emergency:

        $^M = 'a' x (1 << 16);
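
    The same "reserve memory up front, use it only when the normal allocator fails" idea from the quoted TL;DR can be sketched in C++ roughly as follows. This is a hypothetical illustration, not libstdc++'s actual pool: the EmergencyPool class, its bump-pointer strategy, and the injectable primary allocator (there only so failure can be simulated) are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical illustration of an emergency pool; names and policy are made up.
class EmergencyPool {
public:
    using AllocFn = void* (*)(std::size_t);

    // Reserve the arena eagerly, while memory is still available.
    // `primary` must return memory that std::free can release (or nullptr).
    explicit EmergencyPool(std::size_t reserve_bytes, AllocFn primary = std::malloc)
        : primary_(primary),
          arena_(static_cast<unsigned char*>(std::malloc(reserve_bytes))),
          capacity_(arena_ ? reserve_bytes : 0) {}

    ~EmergencyPool() { std::free(arena_); }
    EmergencyPool(const EmergencyPool&) = delete;
    EmergencyPool& operator=(const EmergencyPool&) = delete;

    // Try the primary allocator first; fall back to the reserve on failure.
    void* allocate(std::size_t size) {
        if (void* p = primary_(size)) return p;
        if (size > capacity_ - used_) return nullptr;
        constexpr std::size_t align = alignof(std::max_align_t);
        std::size_t rounded = (size + align - 1) / align * align;
        if (rounded > capacity_ - used_) return nullptr;
        void* p = arena_ + used_;
        used_ += rounded;
        ++live_;
        return p;
    }

    // Arena blocks are only reclaimed once all of them have been returned.
    void deallocate(void* p) {
        if (p == nullptr) return;
        if (!owns(p)) { std::free(p); return; }
        assert(live_ > 0);
        if (--live_ == 0) used_ = 0;
    }

    // True if p points into the reserved arena.
    bool owns(const void* p) const {
        auto addr = reinterpret_cast<std::uintptr_t>(p);
        auto base = reinterpret_cast<std::uintptr_t>(arena_);
        return arena_ != nullptr && addr >= base && addr < base + capacity_;
    }

private:
    AllocFn primary_;
    unsigned char* arena_;
    std::size_t capacity_;
    std::size_t used_ = 0;
    std::size_t live_ = 0;
};
```

    Passing an allocator that always returns nullptr as `primary` simulates an out-of-memory condition: small requests are then served from the arena, and requests larger than the remaining reserve still fail.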