5 comments

  • dcrazy 47 minutes ago
    More info on how ASIF differs from the decades-old sparseimage format: https://news.ycombinator.com/item?id=44259132
  • albertzeyer 45 minutes ago
    Small self-advertisement: as an alternative to dissect.cstruct, a fun side project of mine (C parser + C interpreter in Python) can do a very similar thing:

    https://github.com/albertz/PyCParser/blob/master/demos/dissect_cstruct.py
  • saagarjha 5 hours ago
    I have to admit that using C syntax as a string to parse something from Python is definitely a choice. I'm not even sure I would use C structs to lay things out in C…
    • quietbritishjim 4 hours ago
      I asked an LLM to rewrite it for me using the Python built-in struct module, and it gave me this:

          import sys
          import struct
          from collections import namedtuple

          # Bake the layout once into a reusable, precompiled object.
          HEADER = struct.Struct(">4sIIIQQ16sQQIIII")

          # struct only knows positions, not names; pair it with a namedtuple
          # to recover the named-field access that cstruct gives you for free.
          Header = namedtuple("Header", [
              "magic", "field4", "field8", "fieldC", "field10", "field18",
              "field20", "field30", "field38", "field40", "field44", "field48",
              "field4C",
          ])

          with open(sys.argv[1], "rb") as fh:
              header = Header._make(HEADER.unpack(fh.read(HEADER.size)))

          print(header)

      To me, this seems significantly less readable... less Pythonic, even. The printed output is also less readable.
      • saagarjha 4 hours ago
        No, I think Python's struct module is also really bad. My point is: if you are making a new DSL for laying out arbitrary formats, why not do something better than what we have?
        • schamper 3 hours ago
          Author here. This is a valid point, but there are also valid reasons to choose C structures. The larger framework that this is a part of is primarily targeted towards people working in cybersecurity, not software engineers. Cybersecurity people are very often not great software engineers, and there is a high throughput of “throwaway” scripts, or “make a quick hacky change”. C is commonly already well understood; a bespoke DSL usually is not and requires a learning step. You can “hit the ground running”, so to say.

          And, as a bonus, creating, say, a filesystem implementation is now often as easy as copy/pasting existing C structure definitions, either from the original source (which is usually C) or from reversing tools such as IDA/Ghidra.

          There’s no right or wrong way in my opinion, just preferences.
        • toast0 4 hours ago
          I would assume dissect.cstruct was written for interop with C programs using C structs, or to use formats documented as C structs. Not as a greenfield tool for arbitrary formats.

          C structs seem less bad than Python structs, so why not use them? Especially, why write a struct parser and create a DSL for it when there's already one you can use, with a well-known DSL you might already understand?
        • quietbritishjim 2 hours ago
          OK, so what's your alternative then? It's easy to say you don't like something, but the onus is on you to show there's something actually better.

          The library used in the author's post seems perfectly readable to me, enough that it didn't even register until I read your comment. Could it be tweaked slightly to not use C syntax? Sure, but it's still going to need roughly the same pattern of identifier + type (including size). Types in C are straightforward so long as you don't have functions/pointers (which have the "inside out" problem, but they're not needed for binary encodings), so you're going to be looking at pretty trivial changes to syntax. Certainly not enough to warrant this level of quibbling.
    • goeiedaggoeie 4 hours ago
      In the video/image space, most code we deal with day to day is still C. There are lots more Rust plugins in the GStreamer ecosystem, but 90%+ is still C.
      • flohofwoe 3 hours ago
        The article is about disk images though :)

        But yeah, while I think the `cstruct` helper function to describe a binary data layout in Python is more elegant than the builtin alternatives, it would have been much less painful to just go with a minimal C command line program (or any other programming language where a struct directly maps to memory). Python and most other scripting languages have been built for manipulating text data, but suck when working with binary data.
      • saagarjha 4 hours ago
        Sure, but it's a greenfield Python script
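A minimal sketch related to the struct-versus-cstruct exchange above: one way to keep each field's name next to its type, using only the standard library, is to build the format string from a list of (name, code) pairs. The field names and format codes are copied from the `struct` example in the thread; `FIELDS`, `HEADER`, `Header`, and `parse_header` are hypothetical names chosen for illustration, and nothing here reflects how dissect.cstruct works internally.

```python
import struct
from collections import namedtuple

# (name, struct format code) pairs, copied from the format string and
# field names in the thread's example; what each field means is unknown.
FIELDS = [
    ("magic", "4s"),
    ("field4", "I"),
    ("field8", "I"),
    ("fieldC", "I"),
    ("field10", "Q"),
    ("field18", "Q"),
    ("field20", "16s"),
    ("field30", "Q"),
    ("field38", "Q"),
    ("field40", "I"),
    ("field44", "I"),
    ("field48", "I"),
    ("field4C", "I"),
]

# ">" selects big-endian with no padding, as in the thread's example.
HEADER = struct.Struct(">" + "".join(code for _, code in FIELDS))
Header = namedtuple("Header", [name for name, _ in FIELDS])


def parse_header(data: bytes) -> Header:
    """Unpack the first HEADER.size bytes of data into a named tuple."""
    return Header._make(HEADER.unpack_from(data))


if __name__ == "__main__":
    # Synthetic bytes for demonstration only; not a real disk image header.
    sample = HEADER.pack(b"demo", 1, 2, 3, 4, 5, bytes(16), 6, 7, 8, 9, 10, 11)
    print(parse_header(sample))
```

This still carries the same identifier + type pattern that a C struct definition would, which is roughly the point made above: the information needed is the same, and only the surface syntax changes.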
  • fragmede 6 hours ago
    I like a good jaunt with IDA Pro as much as the next RE, but my question is: what does ASIF do that Qcow2 doesn't?

    My other question is why does it take so long to copy an app out of a dmg and into /Applications. Like, just change some pointers to pointers to data on disk and shit.
    • donatj 6 hours ago
      > what does ASIF do that Qcow2 doesn't

      Mount natively in macOS

      > why does it take so long to copy [...] out of a dmg

      Compression mostly. DMG contents can optionally be compressed using zlib, lzfse, or slow-as-molasses bzip2.

      Also Gatekeeper.
      • 1e1a 5 hours ago
        Additionally, while I don't know much about APFS, I don't think it would be beneficial to point the extracted app to blocks that are also part of the dmg file, i.e. some copying has to happen anyway.
        • fragmede 4 hours ago
          In a perfect theoretical filesystem, copy-on-write means copying is as cheap as moving a file, though the decompression time makes sense.
          • xoa 3 hours ago
            A perfect theoretical filesystem can still have subjective, user-configurable choices though, right? Like case sensitivity, UTF normalization, checksum hash function, extra copies of data/metadata to store for redundancy/healing, etc. (as well as compression/encryption). I think ZFS is a pretty strong real-world example of a CoW FS, but you can still set a lot of different properties between sub-fs and then need to copy when you go between them to get the structural changes.

            Disk images are supposed to function as if they're attached storage, I think, and can have different properties from whatever FS you're running on boot or your home folder (which themselves can be different; I run my home folder on my main Mac off a NAS via iSCSI). I'm not sure any underlying FS would avoid a copy operation there in general?
          • galad87 4 hours ago
            Of course, but that works only if the files are already in the same partition. A dmg is a virtual image; even if it's stored in the same partition, once mounted it acts like another partition.
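The same-partition condition raised above can be checked from a script. As a rough sketch (standard library only): two paths can only share blocks through a copy-on-write clone, or be moved with a plain rename, when they live on the same mounted filesystem, and comparing `os.stat(...).st_dev` is a common way to test that. `same_filesystem` is a hypothetical helper name, and passing the check is a necessary condition only; it does not mean the filesystem actually supports cloning.

```python
import os
import tempfile


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True when both paths report the same device ID (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Demonstration with two paths inside one temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst_dir = os.path.join(tmp, "dest")
        os.mkdir(dst_dir)
        with open(src, "wb") as fh:
            fh.write(b"payload")
        print(same_filesystem(src, dst_dir))
```

On macOS, a mounted disk image would be expected to report a different `st_dev` than /Applications, which is consistent with the copy described in the comments above.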
  • ARTKILL 5 hours ago
    Worth noting ASIF's compression tradeoff also affects Spotlight indexing: since the content is opaque until mounted, you lose searchability on unmounted disk images that you'd get with a regular folder structure.