Your hex editor should color-code bytes

(simonomi.dev)

433 points | by tobr 2 days ago

48 comments

  • dspillett 10 hours ago
    Everything should try to do some basic syntax highlighting IMO. Not too much, or it just becomes a sea of formatting that doesn't help at all. It is surprising how much difference just a little splash of colour can make if it isn't overdone. If possible, always include configuration options for the user though, so those with colour-blindness issues can tweak things to their needs, those who are just fussy can make the output fit with their finely adjusted system-wide colour schemes¹, and even better, where you can, allow bold/italic/other as well as colours so that those who barely see colour at all can play too.

    Of course none of this helps those using screen-readers and other tech, so make sure that all your fancy colouring & such is additive so if it is all “lost” no meaning is absolutely lost with it.

    --------

    [1] Some people can be very vocal about this, more so than if highlighting isn't possible at all. If you give any output formatting they'll expect you to match, or be able to be made to match, their preferred style.

    • finaard 9 hours ago
      I'd recommend for every developer to get one or more colourblind friends. I have some, and regularly send them screenshots of what I'm working on to get feedback on what they can see and what they can and can't read.

      They've been absolutely invaluable for making sure their kind of people can't use my apps properly.

      • gortok 8 hours ago
        8% of the male population has some form of colorblindness (for women it’s around 0.5%). I have deuteranomaly colorblindness. If you search for images on the internet related to that type of colorblindness you’ll find representations of how we see color and how we see the world.

        It is not a fun condition to have, and leads to lots of problems in my everyday life. This blog post accidentally accentuated that issue, since the colors are (to what I can understand) very similar looking to me as a colorblind person.

        1 in 12 men and 1 in 200 women go through the same sorts of experiences, and it’s worth it, if you aren’t color deficient, to try out some of the colorblindness sites and see the world as we do.

        https://www.colourblindawareness.org/colour-blindness/colour...

        • cyberge99 28 minutes ago
          Does anyone know of a study done on depression and color blindness?
        • dspillett 6 hours ago
          > 1 in 12 men and 1 in 200 women go through the same sorts of experiences,

          Almost everyone to an extent loses some colour definition in their vision as we age, even those lucky enough to have excellent colour vision to start with; some lose a lot more than others, and it is gradual, so mostly not noticed at first. This is one of the reasons many grandparents have the saturation oddly high on their TVs (the other main reason, of course, being they've just never changed it from the default that is picked to make the display “pop” under bright show-room lighting conditions).

        • basilgohar 4 hours ago
          Thank you both for sharing your lived experience as well as concrete examples for understanding. I, like I am sure many others, live a richer life knowing what others are going through and how I can make tiny adjustments, even if it's just awareness, to account for how others different from me in one way or another go through life.
        • watwut 6 hours ago
          Most color blind men are mildly color blind, plenty even go through their lives without noticing.

          Yours is on the much stronger side of the things.

          • pc86 5 hours ago
            A cousin of mine found out in his late 20's that he is red-green color blind.
            • pdpi 2 hours ago
              Had one of those happen in high school — science teacher talking about colour blindness and shows students the colour blindness tests, one student assumes he’s being trolled and that one of the test images was a solid colour.
        • gsich 8 hours ago
          >1 in 12 men and 1 in 200 women go through the same sorts of experiences, and it’s worth it, if you aren’t color

          Not the same, it's a gradient.

          • embedding-shape 6 hours ago
            Some people lack in vision, some lack in reading and some lack in stopping themselves to reply to comments misunderstanding others.
            • gsich 3 hours ago
              You must be speaking of yourself.
              • ddingus 3 hours ago
                Nope.

                The greater point here being we definitely benefit when we all understand one another better.

                Surely that point was not lost, yes?

      • Narew 19 minutes ago
        Ubisoft released some tools to simulate colorblindness. It's easy to extract the color transformation from their shader directly. I'm not colorblind, but I use that quite often to check roughly whether the color palette I chose is fine: https://github.com/ubisoft/Chroma
      • dspillett 6 hours ago
        > I'd recommend for every developer to get one or more colourblind friends.

        In the absence of any naturally occurring colour-blind friends, do you have any tips about surreptitiously damaging someone's eyes to create one? :-)

        Though there are simulation tools available which do a reasonable job, I'll probably stick to those where I have a concern. That feels less drastic.

      • tosti 6 hours ago
        > They've been absolutely invaluable for making sure their kind of people can't use my apps properly.

        Why would you do that?

        • pc86 5 hours ago
          It's obviously a typo (or an excellently delivered joke) but I did get a chuckle out of the idea of someone going out of their way to ask color blind friends for feedback just to do the opposite out of spite for some reason.
          • ddingus 3 hours ago
            I thought it was intended and excellent!
      • mghackerlady 7 hours ago
        Get more than one: people with red-green and blue-yellow color blindness are going to have completely different experiences.
      • watwut 6 hours ago
        Pro-tip: there are browser extensions able to simulate various kinds of color blindness.

        That is better than a random friend, because a) there are various kinds of colorblindness, and b) you won't ask the random friend to work for your company for free.

      • aa-jv 7 hours ago
        For sure, but if you don't have any friends, don't discount the value of using tools such as CoBlis:

        https://www.color-blindness.com/coblis-color-blindness-simul...

        .. to get an idea of the impact of your UI design on color-limited folks out there ..

        I used this a few times to great effect, it was very revealing to see that my carefully selected teals and ambers were incomprehensible to some folks I really wanted to use my apps .. didn't take much iteration to come to a happy palette though, just needed a bit of care.

    • ddingus 3 hours ago
      A screen reader could use various aural means of emphasis that I bet would be as effective as this "color by pattern" idea is.

      We have pitch, volume, enunciation speed, and for the voice itself the vocal formant frequency can change as can the harmonics. And that is a rich field we are good at differentiating in too.

      One other screen reader idea I had upon seeing this is to use a brief sound either immediately before or after, maybe even slightly overlapping the vocalization.

      30 [30 MS BEEP] CO [30 MS BEEP FOLLOWED BY A SHORT CHIRP THAT INDICATES A KNOWN ADDRESS]

      Writing that out looks messy. All I can say is the sounds in my head right now make a lot more sense and would complement the colors nicely.

    • PunchyHamster 8 hours ago
      As long as you just give people a list of settings for colors, they can pick as much color or monochrome as they want.

      So by all means "color everything"; people have different opinions on what they want colored, so give them the option.

    • cubefox 8 hours ago
      > Everything should try do some basic syntax highlighting IMO.

      Interesting idea. So even syntax–highlighting natural language. Grammar highlighting, as it were. Prepositions, verbs, question marks, etc. An LLM could do it. Would it actually improve readability though? Seems likely!

      • dspillett 6 hours ago
        > So even syntax–highlighting natural language. Grammar highlighting, as it were.

        Not as fine-grained as individual items of grammar, but we essentially already do this and have for almost as long as writing has been a thing. Headings in bold, things that you want to emphasise in body text in italic or bold, hyperlinks underlined and/or in a different colour, …

        Highlighting by grammar might be useful in language analysers/translators, for those of us trying to learn a second language, though being able to pick from a selection of rules would be needed for it to be truly useful: sometimes I might want words agreeing with each other (subject->verb, subject->adjectives) in a colour of their own, sometimes specific word types (is abierto the past participle of abrir here, or the adjective?). Or verb tenses. You could perhaps do both tense and agreement, highlighting the stem by tense and the suffix with the same colour as the subject, and object pronouns the same colour as any relevant adjectives, but this is likely to make the colouring system too complex to be useful at a glance. On anything more than a single simple sentence something more dynamic might be better here: not highlighting anything by default, but when something is hovered over, have it and relevant things spring into colour appropriately. You'll only need a small set of colours in that case, rather than one for each subject/object in a longer paragraph with trying to match the same subject/object to a set colour consistently throughout.

      • bombcar 7 hours ago
        I don't even see the paragraphs anymore, just sentence diagrams!

        Here's Claud attacking your post:

        http://schnecke.bombcar.com/random/sentence_diagram.png

      • davidmurdoch 6 hours ago
        I always enjoy finding people who think so drastically differently than myself. This sounds like an absolute nightmare to me and I would gouge my eyes out.
        • afiori 2 hours ago
          Imo there would be ways to do it so that it could have a similar effect to the capitalized nouns in German
        • oblio 4 hours ago
          Everything in nature is colored. Our eyes are worse than dinosaur eyes but they can still distinguish millions of colors. We might as well use 5-10-20 for highlights.
      • BigTTYGothGF 8 hours ago
        I think that's a good idea, altho you can probably get away with good old NLP to do it.
  • cuechan 9 hours ago
    For anyone who regularly has to look at/analyze binary files, I highly recommend ImHex [1].

    It's a hex editor built with imgui and has a lot of built-in tools. Imo the best feature is the data structure editor. You can write a data type definition similar to C and it overlays it on the hexdump and parses it in a structured way while you type.

    It also has a node based editor.

    1: https://github.com/WerWolv/ImHex

    • altairprime 8 hours ago
      Does it do color highlighting by value of hex bytes, though?
      • criddell 6 hours ago
        • seritools 6 hours ago
          as far as i can tell, no it does not. it only desaturates 00 in particular. the other colors you see in the screenshots come from matched formats/patterns. it does not do direct coloring based on byte value.
          • criddell 5 hours ago
            Under the Edit menu select "Highlighting Rules..." and you can define or load any set of rules you can imagine.
            • altairprime 1 hour ago
              Thanks!

              How many rules would it take to create 256 different colors for 00..FF?

    • p0w3n3d 5 hours ago
      WOW!!!! If I'd had this when I was working on the Omnet connector to the HKEX, I would have won my life. I mean financial life, but still...
    • dhash 4 hours ago
      ImHex is amazing! It’s actively maintained, sponsored by FUTO, and is very hackable (both without recompiling, as well as modifying the grammars for others to take advantage of)
    • yonatan8070 6 hours ago
      That looks super cool! Now I just need a reason to look at hex files
    • octagons 6 hours ago
      ImHex++. I also can’t help but shill for 010 Editor, a commercial alternative that one might describe as a little less opinionated.
      • r0yadar 5 hours ago
        Yeah 010 is my go to (much cleaner for me)
    • mr_sturd 6 hours ago
      Plus the built-in parsers for well known formats!
    • ddingus 3 hours ago
      THANK YOU.

      This is another fine tool I can add to my collection.

      And FUTO! Love it.

    • sandos 8 hours ago
      But does it have colors!?!??!
  • roelschroeven 9 hours ago
    When you're going to color-code bytes in a hex dump, I would expect each ASCII character in the right column to have the same color as the hex byte in the left column, making it easier to pair them. I wonder why that wasn't done here.
    • pragma_x 6 hours ago
      I came in here to comment the same. Our brains are wonderful pattern recognition engines and the reader would absolutely be able to more readily see the correlation between hex and character representations this way. It might even accelerate learning hex values in the process.
  • NooneAtAll3 9 hours ago
    Why did the author decide that the best way to demonstrate his idea would be by cutting contrast in half?

    Color-coding might be a great solution, but you don't really know beforehand which byte values are important. Manually selecting C0 to make it stand out is just ctrl+f with extra steps. (But I wouldn't mind something like "color 00 separate from ascii separate from the rest".)

    • seszett 9 hours ago
      > Manually selecting C0 to make it stand out

      That's not what they did, actually. C0 is the only byte in there that's above 3F or so, and it's far from it. Hence the very different colour, and the lack of contrast between the colours of the other bytes.

    • oblio 3 hours ago
      OP is automatically selecting similar colors across the spectrum. C0 is just very far from the rest so the color is very different to the other bytes.
  • bwiggs 7 hours ago
    DEFCON30, Mayhem CTF.

    We were given a file full of random bytes. The flag was in there somewhere. It was too random to be encrypted, there wasn't any structure. `file` didn't return anything, truly just a bag of bytes.

    I had decided to install `hexyl` as an alternative option to some of the other hex editors installed on my linux machine. All the bytes were colored grey.

    I scrolled the file and noticed a blip of yellow. A random golden `{` amongst all the noise. Weird.

    The next colored byte was a `C`, then `T`, `F`.

    ---

    At that time, I was mostly using HexFiend to look at raw files, which didn't have byte coloring. For DEFCON I had decided to drive my linux machine. I had ghex installed, but I had also decided to install and try `hexyl` via cli. So seeing bytes in color was purely down to the chance that I had installed it. I eventually posted an issue to ghex to add color support. https://gitlab.gnome.org/GNOME/ghex/-/issues/60

    I need to see if I can find the file and post it on that blog post. https://bwiggs.com/posts/2023-08-31-hacking-in-color/

    • abcd_f 7 hours ago
      > It was too random to be encrypted

      That's a rather odd remark.

      • justsomehnguy 22 minutes ago
        It's not?

        Compare data from a pseudorandom generator, truly random data, and some encrypted data. They are all different.

      • bwiggs 7 hours ago
        You would still expect some amount of file structure, e.g. byte headers or something at the beginning/end of the file, no?
        • Crestwave 5 hours ago
          That would be expected for encodings or container file formats. Straight-up encryption like AES produces results that are visually indistinguishable from random data.
        • throwawaysoxjje 5 hours ago
          I’d expect the greater length of the encrypted data (which should look random) vs the structured header/footer to rapidly push the Shannon entropy to the maximum
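
          A rough Python sketch of that entropy argument, in case it helps. The header bytes and sizes here are made up for illustration, and `os.urandom` only stands in for real ciphertext:

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical inputs, for illustration only.
header = b"MAGIC\x00\x00\x01" + bytes(24)   # small, highly structured header
payload = os.urandom(1 << 16)               # stand-in for encrypted data

print(f"header only:      {shannon_entropy(header):.2f} bits/byte")
print(f"header + payload: {shannon_entropy(header + payload):.2f} bits/byte")
```

          With a payload this much longer than the header, the combined figure should land very close to the 8 bits/byte maximum, so a whole-file entropy number alone would not reveal the header.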
        • wang_li 5 hours ago
          No. Why would you? Encrypted data should look no different than random. The app figures out if it is the app's data after it attempts to decode it.
    • masfuerte 5 hours ago
      I don't get it. If you were looking at random data, why did hexyl apply colour to only the brace, C, T and F?
    • Crestwave 5 hours ago
      Wouldn't strings(1) have worked for this?
      • ksherlock 5 hours ago
        By default, strings needs a run of 4+ (printable|ascii) characters. This sounds like it was 1 ascii character at a time in a sea of random data (with other alpha chars removed).
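
        A minimal Python approximation of that behaviour, to show why isolated characters slip through. It is not a reimplementation of strings(1): the function name is mine, it only treats 0x20-0x7E as printable, and the sample bytes are invented:

```python
import re


def ascii_runs(data: bytes, min_len: int = 4) -> list[bytes]:
    """Return runs of at least `min_len` printable ASCII bytes (0x20-0x7E)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return pattern.findall(data)


# Invented sample: single printable bytes separated by non-printable noise.
noise = b"\x01\x9f{\xff\x00C\x8aT\x03F"

print(ascii_runs(noise))             # no run reaches 4 characters
print(ascii_runs(noise, min_len=1))  # every isolated printable byte is a hit
```

        Dropping the minimum length to 1 does find the characters, but on truly random input it would also match every stray printable byte, which is where colouring helps more than filtering.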
  • Someone 8 hours ago
    The first example is “go ahead, try to find the single C0 in these bytes” and then argues one should highlight C0 bytes.

    If that’s true, how does the tool know I will be looking for C0 bytes and not for 03, D3, etc? The logical conclusion of that would be that the hex editor should uniquely color code every byte. And following the other examples even that’s not enough.

    The proposed solution is to create groups of byte values that each get their unique color. I think that helps, but we can do better: add a search feature. That tells your editor what you are looking for. Once you enter a search string, it can highlight all hits.

    Yes, “colorful output in a hexdump is useful for the same reason that syntax highlighting for code is useful”, but do you know what syntax highlighting needs? Knowledge of the expected content of a file. Without that, a hex editor at best can guess at how to color-code stuff.

    IMO, if you want to add syntax coloring to a hex editor, give it pluggable syntax coloring and heuristics for deciding which one to use when.

    While at it, also let those plugins control where to break lines, whether to show hex at all (why show it at all if a file has a few paragraphs of English text or an array of IEEE doubles?), etc.

    Those plug-ins will make errors and sometimes, users will want to see all byte values, so you’ll need a way for the user to override them.

    • Antibabelic 7 hours ago
      I don't think that's quite the point that example was intended to illustrate. The idea is not that you're looking for C0 bytes or any other kinds of bytes in particular, but rather that it's easier to fish out unique and interesting information in a sea of noise when you have color-coded bytes: like the fact that there's a conspicuous lonely C0 or some other value or series of values that stand out.
      • bombcar 7 hours ago
        Exactly - though the writing style buries the lede until the end; the recommendation is to have at least a different color for each of the first nibbles (e.g., 0x is one color, 1x another). As a minimum that makes some outliers pop out. He then has other recommendations like highlighting the ASCII range, etc.

        As a note, the some up there is load-bearing - color may lull you into complacency where the difference between 01 and 0F is major and important but not highlighted. More complicated regex-built color tools designed to highlight "anomalies" could be developed, but then you need to define what anomalies are (patterns, places where a pattern changes, etc).
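
        For the curious, colouring by first nibble is only a few lines. This is a sketch of the idea and not the article's code; the palette values are arbitrary ANSI 256-colour indices I picked:

```python
# Hypothetical palette: one ANSI 256-color index per high nibble (0x0_ .. 0xF_).
PALETTE = [244, 33, 39, 45, 51, 50, 48, 46, 118, 154, 190, 226, 220, 214, 208, 196]


def color_byte(value: int) -> str:
    """Return the two-digit hex form of `value`, colored by its high nibble."""
    color = PALETTE[value >> 4]
    return f"\x1b[38;5;{color}m{value:02x}\x1b[0m"


def color_line(chunk: bytes) -> str:
    """Render a row of bytes as space-separated colored hex pairs."""
    return " ".join(color_byte(b) for b in chunk)


print(color_line(bytes([0x01, 0x0F, 0x12, 0x3A, 0xC0, 0x03])))
```

        Note that 01 and 0F come out in the same colour here, which is exactly the complacency risk: the scheme only separates bytes whose first nibble differs.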

  • myfonj 7 hours ago
    When (rarely) using hex editors, one thing constantly comes to my mind: isn't base 16 arabic-roman numerals a bit awkward for "skimmable" overview? Color-coding indeed helps immensely there, but wouldn't simply letting bits and bops shine in eight bit clusters, resembling the "physical" shape of the eight-bit byte, be somewhat more readable?

    We even have characters in Unicode for representing the 0..255 variations, actually two distinct groups: Braille (arguably a bit of a misuse for binary) and octants (accompanied by older predecessors). So what would be

        |65|97|66|98|67|99|32|126|32|72|101|108|108|111|44|32|109|111|109|33|32|240|159|166|132|
    
    in base-10 or

        |41|61|42|62|43|63|20|7e|20|48|65|6c|6c|6f|2c|20|6d|6f|6d|21|20|f0|9f|a6|84|
    in base-16, could be

        |⢈|⢊|⡈|⡊|⣈|⣊|⠂|⡾|⠂|⠌|⢪|⠮|⠮|⣮|⠦|⠂|⢮|⣮|⢮|⢂|⠂|⠛|⣵|⡣|⠡|
    in Braille, or

        |𜵲|𜵶|𜴷|𜴻|𜶭|𜶱|𜴀|𜵯|𜴀|𜴋|𜶔|𜴭|𜴭|𜷟|𜴫|𜴀|𜶢|𜷟|𜶢|𜵴|𜴀|(⁕)|𜷢|𜵖|𜴙|
    using octants.

    Most significant bit is at the top left here, the least one is bottom right -- it felt somewhat intuitive to me this way, your intuition may differ, obviously.

    Or, naturally, "AaBbCc ~ Hello, mom! <Unicorn Emoji>" as a "UTF-8" text.

    Try: http://myfonj.github.io/tst/byte-dec-hex-braille-octant.html

    Test (with added "CSS" variant and "highlight" of empty dots): http://myfonj.github.io/tst/byte-visualisation-exploration.h...

    (⁕) HN apparently eats upper-half block. Amusing that only this particular ("old", as referred earlier) one got filtered out…

    Also a caveat: Android phones have a messed-up Braille block due to an outdated, broken embedded font, so all patterns with dots in the left half appear in the right instead. Long reported, not fixed, IIRC.
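
    The Braille mapping with that bit order (most significant bit top left, least significant bottom right, read row by row) can be sketched in Python roughly like this. The names are mine and this is not the code behind the linked pages; it relies only on the standard dot numbering of the Unicode Braille Patterns block:

```python
# Bit values of the dots inside a Unicode Braille cell (U+2800 + mask), listed
# in reading order: row 1 left, row 1 right, row 2 left, ..., row 4 right.
DOT_MASKS = [0x01, 0x08, 0x02, 0x10, 0x04, 0x20, 0x40, 0x80]


def byte_to_braille(value: int) -> str:
    """Map a byte to a Braille cell, MSB at top left, LSB at bottom right."""
    mask = 0
    for position, dot in enumerate(DOT_MASKS):
        if value & (0x80 >> position):   # test bits from MSB down to LSB
            mask |= dot
    return chr(0x2800 + mask)


print("|".join(byte_to_braille(b) for b in b"AaBbCc ~ Hello, mom!"))
```

    Because each of the eight bits drives exactly one dot, all 256 byte values get a distinct cell.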

    • nottorp 2 hours ago
      Might be useful, but do leave the hex representation in.

      By the way, the "octants" representation is unreadable to me and I have HN at like 140% zoom.

    • aa-jv 7 hours ago
      This is a great idea and I concur with your line of thought - there is room for expression in the realm of number representations .. have you considered building an ImHex plugin that would illustrate your point? The use of octants is particularly intriguing ..

      One thing I often ponder on, along similar lines, is whether I can write some clever plugin that would put FF ChartWell - a font which uses ligatures to render useful graphs out of boring numerical data - into use, within ImHex. Seen how ligatures can be used this way?

      https://typographica.org/typeface-reviews/chartwell/

      Your idea of discovering new means of representing numeral data put me in mind of ligature hacking, in any case ...

  • jcalvinowens 4 hours ago
    If you just want to see patterns, and don't actually need to see the values, you can go a step further and simply visualize the data as a bitmap, e.g.

        dd if=/dev/urandom bs=$[256*256] count=1 | display -size 256x256 -depth 8 GRAY:-
    
    You can do the same thing with audio, which makes different sorts of patterns obvious, e.g.

        dd if=/dev/urandom bs=$[256*256] count=1 | aplay -c 1
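
    If ImageMagick isn't around, a similar bitmap can be produced from Python by prepending a binary PGM header to the raw bytes. A small sketch; the function name and the `dump.pgm` path are placeholders:

```python
import os


def bytes_to_pgm(data: bytes, width: int = 256) -> bytes:
    """Wrap raw bytes in a binary PGM (P5) header, one gray pixel per byte."""
    height = len(data) // width
    pixels = data[: width * height]        # drop a trailing partial row
    header = b"P5\n%d %d\n255\n" % (width, height)
    return header + pixels


if __name__ == "__main__":
    # Placeholder output path; any image viewer that reads PGM can open it.
    with open("dump.pgm", "wb") as out:
        out.write(bytes_to_pgm(os.urandom(256 * 256)))
```

    Changing `width` changes where rows wrap, which is often what makes a repeating record size suddenly line up into visible columns.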
    • EvanAnderson 1 hour ago
      I made a little CLI tool back in the MS-DOS days to dump binaries into VGA mode 0x13. It allowed me to vary the width of "line" wrapping. It was a killer tool for seeing data in binaries.
    • felooboolooomba 3 hours ago
      > dd if=/dev/urandom bs=$[256*256] count=1 | aplay -c 1

      This plays sounds, sounds like "Liberate Mae?"

  • leetrout 14 minutes ago
    ipython saved me this week when it color coded

      b'\x100'
    
    
    It was not obvious to me from the print output that this is \x10 followed by a literal 0.
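
    A quick way to confirm the same thing without colours, using only built-in `bytes` methods:

```python
data = b'\x100'

# \x consumes exactly two hex digits, so this is 0x10 followed by ASCII '0' (0x30).
print(len(data))        # 2
print(list(data))       # [16, 48]
print(data.hex(" "))    # 10 30
```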
  • Findecanor 58 minutes ago
    If you're making a hex editor and going to have colour coding, I'd think you should expend some effort to make the colouring schemes configurable, and easy to configure and change. Maybe load and save them as separate files.

    Different colouring schemes for different types of data.

  • delta_p_delta_x 10 hours ago

      > Your hex editor should colour-code bytes so it is easier for users to distinguish patterns
      > Article is fully in lowercase, which makes it harder for readers to make out sentences and the flow of the article
      > mfw the irony
    • gblargg 9 hours ago
      The text smashed up to the left border doesn't help either.
  • bandrami 12 hours ago
    Emacs's hexl-mode does this, incidentally, though annoyingly by default it makes all faces the same color. I never understood why it defines the faces but then doesn't customize them.
    • TeMPOraL 7 hours ago
      What exactly does it do? I'm looking at hexl-mode sources in my Emacs, and I see it defining only two faces - hexl-address-region and hexl-ascii-region.
      • kleiba2 5 hours ago
        That's correct, as far as I can tell: the first one is used for all hex values in the "main" area of the buffer, and the second one for the character representation of each byte in the right-hand side column.
  • kokakiwi 9 hours ago
    ImHex (https://imhex.werwolv.net/) is also a really nice Hex editor with tons of plugins (patterns, file support, etc.) and even an embedded language for adding more patterns easily
    • aa-jv 7 hours ago
      +1 for ImHex - I've had a lot of fun with Pattern Language lately .. it's quite fruitful to, for instance, have an AI take a few headers worth of input to produce some PL code which highlights all the things very well. For a while I almost forget I'm in a hex editor and not in some custom data wrangling environment ..
  • nticompass 9 hours ago
    I used to use wxHexEditor and that had a feature where I could select a section of the file and highlight it in a color. When I was working to decode a certain file format, I used that to color-code different sections of the file and it was super useful. Those color-codes were stored in a separate file so you could load them back in.
  • ChrisRR 9 hours ago
    What a bad way to illustrate your point by using such similar looking pastel colours
    • altairprime 8 hours ago
      Ironically, my complaint about the article was that the author apparently only uses typical human vision ranges here, rather than mapping 00..FFh onto an OKlab gradient of hue 0..359° that rewards those few of us with impeccable color fidelity with even better highlighting than most can see :) No doubt there’s value in contentful highlighting but I’d rather just have a straight hue translation on the circle at a fixed luminosity. There’s only 256 hues to discern, after all! And what a pleasure it would be to learn to read hex code by hue alone :D
  • orphea 8 hours ago
    I get the idea but those specific examples are awful - not enough contrast.
  • ape4 2 hours ago
    As a next level beyond coloring... how about adding some interactivity? How about a slider to control the brightness of each type of byte? Turn everything but text to 10% when you're looking for some words in a binary file.
  • Archelaos 11 hours ago
    This article made me think how I could use similar techniques to colour code the data in database tables. Has anyone here tried that and has some recommendations where to start, etc.?
  • nickwanninger 4 hours ago
    I added type-based color printing to my hexdump in my kernel [1] if anyone wants to have that code. It was instrumental in finding bugs quickly in the wee hours of the night sometimes, especially if you have heap corruption.

    [1] https://github.com/ChariotOS/chariot/blob/e046849c668458d25e...
  • xvilka 4 hours ago
    Rizin[1][2] does exactly this, also there is a compact hex-II[3] mode.

    [1] https://rizin.re

    [2] https://github.com/rizinorg/rizin

    [3] https://speakerdeck.com/ange/no-more-dumb-hex

  • js8 12 hours ago
    I think semantic coloring (based on structure) is more useful. Also (can't help as someone working with z/OS), if you really want to make hex output readable, I recommend using a big-endian machine.
  • dhosek 4 hours ago
    I found the coloring in most of the examples to be more distracting than helpful. I can see cases where it could be helpful (e.g., highlighting bytes in the 0x20–0x7E range for spotting ASCII strings, or a fancier one that can identify UTF-8 strings, or better still, invalid sequences in what might otherwise be UTF-8), but most of the cases here didn’t really help all that much for me.
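
    Roughly what I have in mind, as a Python sketch. The class names and boundaries are my own choice and not taken from any particular tool, and the UTF-8 check simply leans on Python's strict decoder:

```python
def classify(value: int) -> str:
    """Assign a byte to a coarse class that a viewer could color separately."""
    if value == 0x00:
        return "null"
    if 0x20 <= value <= 0x7E:
        return "printable"
    if value in b"\t\n\r":
        return "whitespace"
    if value < 0x80:
        return "control"
    return "non-ascii"


def is_valid_utf8(data: bytes) -> bool:
    """Report whether `data` decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print([classify(b) for b in b"Hi\n\x00\xc0"])
print(is_valid_utf8(b"\xf0\x9f\xa6\x84"), is_valid_utf8(b"\xc0\x80"))
```

    A viewer would then need only five colours, with an extra marker for spans where the UTF-8 check fails.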
  • randusername 7 hours ago
    I think this is a cool idea.

    I'd want to take it further by using full RGB and cycling through some colormaps with different properties. Sequential, diverging, cyclic like in matplotlib.

    https://matplotlib.org/stable/users/explain/colors/colormaps...

    Can't think of a specific use-case off the top of my head, but sometimes I just want the "feel" of the data when I'm plotting something, and maybe the same scattershot approach would pay off at some point on unknown hex data if it was an option.

  • psychoslave 11 hours ago
    That said, even colored, these dumps still feel unappealing to me, so yes, this is admittedly subjective gut jumping in the conversation. I get that occult form can also be an attractive force.

    The post puts on the table an interesting point about how to improve the presentation layer to fit what human cognition is good at spotting (in general, or at least for the expected audience with some training). And it does start proposing something with these color schemes. But isn’t it kind of missing the forest for the trees? Actually why do we even have rendering with [0123456789ABCDEF], when a specific set of (colored/imaged?) glyphs would be able to make more obvious what’s on the table? Or even beyond the hexadecimal grouping, wouldn’t it be more relevant to render something "intuitively" far easier to grasp without several layers of internalized interpretation through acculturation?

    • GuB-42 10 hours ago
      I can't think of anything better than a hex dump for representing raw binary data. I don't mean that there are no other, equally good representations, but hex dumps win because of familiarity.

      Of course, if you know about the format, there are better ways, but it goes beyond the scope of a hex editor, though the most advanced ones support things like template files and can display structured data, disassembly, etc...

    • q3k 9 hours ago
      > Actually why do we even have rendering with [012345678ABCDEF], when a specific set of (colored/imaged?) glyphs would be able to make more obvious what’s on the table?

      Most of us have internalized the relationship between digits in [0-9] for a very long time. Adding 6 more glyphs after that is quite easy (and they're also somewhat well known in the world), and after a while you stop even thinking about the glyphs consciously anyway. A hex 'C' intuitively means to me '4 from the end'. A hex 'F' intuitively means to me 'all 4 bits are set to 1'. I don't see any advantage to switching to a different glyph set for this base, other than disruption for disruption's sake.

      > Or even beyond the hexadecimal grouping, wouldn’t be more relevant to render something "intuitively" far more easy to grap without several layer of internalized interpretation through acculturation?

      Modern computers deal with 8-bit bytes, and their word sizes are a multiple of bytes - unless you're dealing with bit-packed data, which is comparatively rare (closest is bit twiddling of MMIO registers, which is when you sometimes switch to binary; although for a 4-bit hex nibble you can still learn arbitrary combinations of bits on/off into its value).

      This means you can group 8 bits into 1 digit of 8 bits as one glyph (alphabet too large to be useful), 2 digits of 4 (hex), 4 digits of 2 (alphabet too small to give a benefit over binary) and 8 digits of 1 (binary). Hex just works really well as a practical middle ground.

      Back when computers used 12 bit words (PDP-8 and friends) octal (4 digits of 3 bits represented in the 0-7 alphabet) was more popular.

      • js8 9 hours ago
        I thought about a similar concept for fun - each hex digit was replaced by 4x4 pixel matrix, where amount of pixels roughly corresponded to the value. So dot for 0, two dots for 1, checkerboard for 8 etc.

        Then byte was represented as 16x16 matrix where each 4x4 area had the lower digit pattern, and these were arranged in the shape of the higher digit.

        But at the end of the day, it wasn't really more readable.
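
        For anyone curious, a minimal sketch of the idea in Python. The fill order is an assumption (plain row-by-row, with value + 1 lit cells), so it won't reproduce shapes like the checkerboard, and both function names are made up.

```python
def digit_pattern(d: int) -> list[list[int]]:
    """4x4 grid with d + 1 lit cells, filled row by row (fill order is an assumption)."""
    return [[1 if 4 * r + c <= d else 0 for c in range(4)] for r in range(4)]

def byte_pattern(b: int) -> list[list[int]]:
    """16x16 grid: the low digit's pattern placed wherever the high digit's pattern is lit."""
    hi, lo = digit_pattern(b >> 4), digit_pattern(b & 0xF)
    return [[hi[R // 4][C // 4] & lo[R % 4][C % 4] for C in range(16)] for R in range(16)]

for row in byte_pattern(0xC0):
    print("".join("#" if cell else "." for cell in row))
```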

  • soegaard 6 hours ago
    This was a great article and inspired me to add support for binary files in `peek`.

    https://soegaard.github.io/peek/#%28part._binary-files%29

    For me the key insight is that similar values should get similar colors. And since Fx and 0x are "similar" the color palette should be cyclic.
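
    A minimal sketch of what a cyclic palette could look like, assuming Python and a terminal with 24-bit color. This is not how `peek` does it; the saturation value and the function names are picked for illustration.

```python
import colorsys

def byte_color(b: int) -> tuple[int, int, int]:
    """Map a byte to an RGB color by walking once around the hue wheel."""
    r, g, bl = colorsys.hsv_to_rgb(b / 256, 0.6, 1.0)
    return round(r * 255), round(g * 255), round(bl * 255)

def paint(b: int) -> str:
    """Hex digits wrapped in a 24-bit ANSI foreground color escape."""
    r, g, bl = byte_color(b)
    return f"\x1b[38;2;{r};{g};{bl}m{b:02X}\x1b[0m"

print(" ".join(paint(b) for b in (0x00, 0x01, 0x7F, 0x80, 0xFE, 0xFF)))
```

    Because hue wraps around, FF lands right next to 00 while 80 sits on the opposite side of the wheel.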

  • red_admiral 9 hours ago
    My hex editor should let me turn syntax highlighting on and off; follow my personal color theme (and not produce light gray on white in the terminal); and let me highlight specific things I'm searching for like 0D 0A or FF FE.
  • taeric 6 hours ago
    I find it funny that I found the "single C0" pretty much instantly.

    I grant that the post largely has a point, mind you. But scanning for a needle in a haystack is something that you just don't often do?

    I am, of course, now very curious how often folks are using hex editors. And itching for an excuse to open a file that way. :D

    • traderj0e 1 hour ago
      The first time I used one was in high school when I was playing some video game in an emulator and wanted to convert the game save to another emulator's format, cause halfway through I realized the first emu had issues with the game. There was no conversion software, but with the hex editor I figured out that you needed to remove some header and change the endianness.
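
      Roughly what that conversion amounts to, as a Python sketch. The header length and word size below are invented for the example, since the real values depend on the emulators involved.

```python
def convert_save(data: bytes, header_len: int, word_size: int) -> bytes:
    """Drop `header_len` leading bytes, then reverse the bytes in each `word_size` word."""
    body = data[header_len:]
    if len(body) % word_size:
        raise ValueError("body length is not a whole number of words")
    return b"".join(body[i:i + word_size][::-1] for i in range(0, len(body), word_size))

# Made-up 4-byte header followed by two 16-bit words.
print(convert_save(b"HDR!\x12\x34\xab\xcd", header_len=4, word_size=2).hex())
```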
  • MisterTea 6 hours ago
    > compare that to one with colors:

    The colors make it worse as I'm red-green colorblind. Looking at that mess is eye strain.

    Honestly I mostly prefer syntax highlighting turned off as it causes eye strain. I have found the black on light yellow theme of the Acme editor to be a very comfortable monochrome color scheme.

  • fleebee 8 hours ago
    > having more colors makes it possible to recognize more complex patterns

    The implicit cost here is that the simple patterns become harder to recognize when every byte is only subtly differently colored. Rather than give everything a different color, I'd rather have the important stuff highlighted.

    In the comparisons given, I think hexyl's highlighting scheme is significantly more useful.

  • stronglikedan 6 hours ago
    > go ahead, try to find the single C0 in these bytes:

    Ctrl+C, Ctrl+F, Ctrl+V... Easy!

    • oblio 3 hours ago
      That part isn't literal. It's meant to be: "find the outlier in this sea of hex-coded bytes".
  • whizzter 7 hours ago
    Anyone tried using Kaitai descriptions? It seems like a fairly flexible system that would be an excellent starting point for a hex-editor that wants to add good higher level coloring (and perhaps even editing).
  • sidewndr46 6 hours ago
    compare a simple high contrast display to one that makes it difficult to read and hurts my eyes? Sure! absolutely!

    I'll pass thank you

  • PunchyHamster 8 hours ago
    I wonder how hard it would be to color code repeating sequences
  • azalemeth 11 hours ago
    I really like hexyl [1], which does this by default.

    https://github.com/sharkdp/hexyl

    • kqr 10 hours ago
      The author uses hexyl as an example of trying, but not doing it right.
  • greatgib 10 hours ago
    To me, the random colors on each byte mess with my brain, making it hard to quickly identify C0 or any other value that I could more easily identify in all black.

    But color based more on the logic of the bytes would be nice.

    Possibly the 00 in a shaded grey instead of black and, in the best case, coloring by logical unit based on your protocol. In the worst case, by groups of words or so.

  • asibahi 11 hours ago
    When I read this article a few days ago it inspired me to create my own hex viewer : https://ar-ms.me/thoughts/3sl-a-sweet-hex-utility/

    The cool thing about it imo (outside of colors) is a `--windows` flag. Which separates the hex view into partitions: so `-w 2:-3:5` shows the first two bytes on a line, then skips three bytes, then shows the next 5 bytes on a line, then the rest of the file. Easy to use combined with a terminal's up arrow.
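
    A rough Python sketch of how a spec like `2:-3:5` could be interpreted, going only by the description above and not by the tool's actual code. `split_windows` is a hypothetical name.

```python
def split_windows(data: bytes, spec: str) -> list[bytes]:
    """Positive numbers emit that many bytes as one line; negative numbers skip bytes."""
    lines, pos = [], 0
    for part in spec.split(":"):
        n = int(part)
        if n > 0:
            lines.append(data[pos:pos + n])
        pos += abs(n)
    if pos < len(data):
        lines.append(data[pos:])  # whatever is left becomes the final chunk
    return lines

for line in split_windows(bytes(range(16)), "2:-3:5"):
    print(line.hex(" "))
```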

  • xyx0826 11 hours ago
    If you analyze binary files often, I highly recommend binvis - http://binvis.io/. It creates a colored minimap for files it loads and has two available arrangements. Pixel color is based on range of bytes, eg ASCII/null bytes/FF bytes. Besides, it’s a pretty basic hex viewer that runs in your browser. The minimap is extremely powerful for identifying interesting areas and patterns in unknown data.
    • pratyahava 10 hours ago
      > it’s a pretty basic hex viewer that runs in your browser

      excuse me? "basic" and "runs in your browser" together sound very contradictory to me. while doing things i actually feel (yes, emotionally) much better when there is no browser open on my machine, but only text editors, vcs gui and file managers, and terminals of course. and sometimes i reject an idea to start a browser just thinking how much ram it will take (ha, what a progress we have done - one github issue tab, with text only and no images, takes 180mb of ram).

      • franga2000 10 hours ago
        It's basic because it does like two things. It's not advanced or complex. HN is also a basic forum, even though it runs in a browser.
  • adv_zxy 10 hours ago
    radare2 also has excellent hex viewing/editing support, if one manages to grok the usage of it.
    • snvzz 7 hours ago
      The iaito GUI suddenly got good recently.
  • mplanchard 6 hours ago
    Anyone know of a good emacs package for this?
  • a_t48 12 hours ago
    I've started doing this with hashes in a CLI I'm working on. For slow prints, it's somewhat helpful https://asciinema.org/a/aD38Pk88CZgSZqtq but for debug dumps with many many hashes it really helps readability and tracking hashes across lines.
  • 7bit 10 hours ago
    > it’s much easier to pick out the unique byte when it’s a different color! human brains are really good at spotting visual patterns—given the right format

    Don't really see the advantage. Unique bytes have no unique meaning across data types.

    The only good syntax highlight to me is 00 and perhaps FF. But that's my opinion of course.

    Anything else that has no direct relation to what you're looking at is meaningless.

    • masklinn 9 hours ago
      > The only good syntax highlight to me is 00 and perhaps FF. But that's my opinion of course.

      Would probably make the most sense to have various ranges you can enable depending on what you’re looking for (or to look for patterns) e.g. for single byte coloration I could see

      - nul

      - printable / non-printable ascii

      - non-ascii

      - UTF8 leading / continuation

      - separators

      - start/end pairs (both printable and non printable)
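
      As a minimal sketch, a per-byte classifier for a few of those ranges might look like this in Python. The class names are illustrative, and a single byte can't tell you whether the data really is UTF-8, so the last three classes are only hints.

```python
def classify(b: int) -> str:
    """Name the single-byte class a value falls in (class names are illustrative)."""
    if b == 0x00:
        return "nul"
    if 0x20 <= b <= 0x7E:
        return "printable"
    if b < 0x80:
        return "control"            # non-printable ASCII, including DEL
    if 0x80 <= b <= 0xBF:
        return "utf8-continuation"
    if 0xC2 <= b <= 0xF4:
        return "utf8-lead"
    return "invalid-in-utf8"        # 0xC0, 0xC1, 0xF5..0xFF never appear in valid UTF-8

for b in b"A\n\x00\xe2\x82\xac\xc0":
    print(f"{b:02X} {classify(b)}")
```

      A viewer could then let you toggle which classes get a color at all.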

    • gblargg 9 hours ago
      It would be interesting to do a heat map coloring based on frequency of that value.
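
      Something like the following Python sketch, assuming a terminal that supports the xterm 256-color grayscale ramp. The level count and function names are arbitrary choices.

```python
from collections import Counter

def heat_levels(data: bytes, levels: int = 24) -> dict[int, int]:
    """Map each byte value present in `data` to a level in 0..levels-1 by frequency."""
    counts = Counter(data)
    if not counts:
        return {}
    top = max(counts.values())
    return {value: (n * (levels - 1)) // top for value, n in counts.items()}

def render(data: bytes) -> str:
    """Color each byte with the xterm 256-color grayscale ramp (codes 232..255)."""
    levels = heat_levels(data)
    return " ".join(f"\x1b[38;5;{232 + levels[b]}m{b:02X}\x1b[0m" for b in data)

print(render(b"\x00\x00\x00\x00\xff\xff\xc0"))
```

      Here common values come out bright and rare ones dim; inverting the ramp would make the rare bytes stand out instead.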
  • wang_li 5 hours ago
    Color coding is a simplistic, half-assed way of doing semantic analysis. Go full assed and make an /etc/magic aware file decoder that decodes and highlights anomalies.
  • shmerl 5 hours ago
    Any good neovim plugin recommendation for hex editing?
  • TheRealPomax 4 hours ago
    It'd be nicer if the hex coloring actually matched the "ascii" coloring as well. Orange on the left but green on the right does not help find things.
  • ralferoo 9 hours ago
    I actually stopped reading after the intro because I fundamentally disagreed with its premise. The "find the C0" took me about 1/4 second with the uncoloured version. Looking at the coloured one, it took my eyes about 3 seconds to recover from the colour overload, then I was scanning down and found the colours so distracting with the constant switching between orange, pink and yellows that it took me a total of about 5 seconds to scan down as far as the blue C0. Maybe if it was all uncoloured and blue just for that, I might have actually noticed it looking different earlier.

    It's been a while since I used hexedit on Linux, but I think that highlighted search results in reverse colours, just like less does for text search. Personally, I'd prefer that to colours.
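
    A minimal Python sketch of that style of highlighting, using the ANSI reverse-video attribute. This is not taken from hexedit or less, and `highlight` is a made-up name.

```python
def highlight(data: bytes, needle: bytes) -> str:
    """Hex dump on one line with every occurrence of `needle` in reverse video."""
    marked = [False] * len(data)
    start = data.find(needle) if needle else -1
    while start != -1:
        for i in range(start, start + len(needle)):
            marked[i] = True
        start = data.find(needle, start + 1)
    return " ".join(
        f"\x1b[7m{b:02X}\x1b[0m" if hit else f"{b:02X}" for b, hit in zip(data, marked)
    )

print(highlight(b"line one\r\nline two\r\n", b"\r\n"))
```

    Everything else stays in the terminal's default colours, so it follows whatever theme is already set.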