
The author might have had better luck by using an external storage device to boot the Mac and delete unneeded files on the internal disk from there:

Use an external storage device as a Mac startup disk https://support.apple.com/en-us/111336

Was surprised to learn that with Apple silicon-based Macs, not all ports are equal when it comes to external booting:

If you're using a Mac computer with Apple silicon, your Mac has one or more USB or Thunderbolt ports that have a type USB-C connector. While you're installing macOS on your storage device, it matters which of these ports you use. After installation is complete, you can connect your storage device to any of them.

* Mac laptop computer: Use any USB-C port except the leftmost USB-C port when facing the ports on the left side of the Mac.

* iMac: Use any USB-C port except the rightmost USB-C port when facing the back of the Mac.

* Mac mini: Use any USB-C port except the leftmost USB-C port when facing the back of the Mac.

* Mac Studio: Use any USB-C port except the rightmost USB-C port when facing the back of the Mac.

* Mac Pro with desktop enclosure: Use any USB-C port except the one on the top of the Mac that is farthest from the power button.

* Mac Pro with rack enclosure: Use any USB-C port except the one on the front of the Mac that's closest to the power button.



The author tried essentially what you suggest: he booted into recoveryOS (a separate partition) and from there tried to delete files from the main system partition, but rm failed with the same "No space left on device" error. So, as others have suggested, truncating a file might have worked: "echo -n > file"
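
For example, from the recoveryOS Terminal, something like this might have done it (the volume and file paths are purely illustrative, and whether a truncate(1) binary is present there is an assumption):

    # truncating in place should free the file's data blocks without unlinking it
    echo -n > "/Volumes/Macintosh HD/Users/me/Movies/huge.mov"
    # or, where a truncate(1) command is available:
    truncate -s 0 "/Volumes/Macintosh HD/Users/me/Movies/huge.mov"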


The next step I have used and seen recommended after recoveryOS is single user mode, which I think is what I used to solve the same issue on an old mac. I vaguely remember another case where single user mode worked and recovery mode failed, but I don't remember the details.

My bet is that you can get nearly the same functionality with single user mode vs booting from external media, but I only have a vague understanding of the limitations of all three modes from 3-5 uses via tutorials.


If the filesystem itself got into a deadlocked state, booting from anything and going through the FS driver to delete files from it won't work.


What do you mean by "deadlocked state" for a filesystem?


Modern (well, post-ZFS) filesystems operate by moving the filesystem through state changes where data is not (immediately) destroyed, but older versions of the data are still available for various purposes. Similar to an ACID-compliant database, something like a backup or recovery process can still access older snapshots of the filesystem, for various values of "older" that might range from milliseconds to seconds to years.

With that in mind, you can see how we get in a scenario where deleting a file will require a minor bit of storage for recordkeeping the old and new states, before it can actually free up the storage by releasing the old state. There is supposed to be an escape hatch for getting yourself out of a situation where there isn't even enough storage for this little bit of record keeping, but either the author didn't know whatever trick is needed or the filesystem code wasn't well-behaved in this area (it's a corner-case that isn't often tested).


I'm most surprised by the lack of testing. Macs tend to ship with much smaller SSDs than other computers because that's how Apple makes money ($600 for 1.5TB of flash vs. $100/2TB if you buy an NVMe SSD), so I'd expect that people run out of space pretty frequently.


And if you make the experience broken and frustrating people will throw the whole computer away and buy a new one since the storage can’t be upgraded.


Not to mention potentially paying for your cloud storage for life.


Don't forget to make it so nothing they have works properly with any other brand's devices, so the next one they buy must also be Apple.

Rinse and repeat.


It feels like insanity that the default configuration of any filesystem intended for laymen can fail to delete a file due to anything other than an I/O error. If you want to keep a snapshot, at least bypass it when disk space runs out? How many customers do the vendors think would prefer the alternative?!


It's not really just keeping snapshots that is the issue, usually. It's just normal FS operation, meant to prevent data corruption if any of these actions is interrupted, as well as various space-saving measures. Some FSs link files together when saving mass data so that identical blocks between them are only stored once, which means any of those files can only be fully deleted when all of them are. Some FSs log actions onto disk before and after doing them so that they can be restarted if interrupted. Some FSs do genuinely keep files on disk if they're already referenced in a snapshot even if you delete them – this is one instance where a modal about the issue should probably pop up if disk space is low. And some OSes really really really want to move things to .Trash1000 or something else stupid instead of deleting them.


Pretty much by the time you get to 100% full on ZFS, the latency is going to get atrocious anyway, but from my understanding there are multiple steps (from simplest to worst case) that ZFS permits if you do hit the error (a rough command sketch follows the list):

1. Just remove some files - ZFS will attempt to do the right thing

2. Remove old snapshots

3. Mount the drive from another system (so nothing tries writing to it), then remove some files, reboot back to normal

4. Use `zfs send` to copy the data you want to keep to another, bigger drive temporarily, then either prune the data or, if you already filtered out any old snapshots, wipe the original pool and reload it from the earlier `zfs send`.
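
A rough sketch of what steps 2 and 4 look like (pool, dataset, and snapshot names are placeholders):

    # 2. find the biggest snapshots and drop the ones you no longer need
    zfs list -t snapshot -o name,used -s used
    zfs destroy tank/data@2023-01-01

    # 4. replicate what you want to keep elsewhere, then rebuild the pool
    zfs snapshot -r tank/data@migrate
    zfs send -R tank/data@migrate | zfs receive -F spare/data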


Modern defrag seems very cumbersome xD


Defragmentation, and the ability to do it, are not free.

You can have cheap defrag but comparatively brittle filesystems by making things modifiable in place.

You can have a filesystem whose primary value is "never lose your data", but in exchange defragmentation is expensive.


I don't buy this? What does defragmentation have to do with snapshotting? Defragmentation is just a rearrangement of the underlying blocks. Wouldn't snapshots just get moved around?


The problem is that you have to track down all pointers pointing to a specific block.

With snapshotting, especially with filesystems that can only write data through snapshots (like ZFS), blocks can be referred to by many pointers.

It's similar to evaluating liveness of objects in a GC, except you're now operating on a possibly gigantic heap with very... pointer-ful objects that you have to rewrite - which goes against the core principle of ZFS, which is data safety. You're essentially doing a huge history rewrite on something like a git repo with billions of small objects, and doing it safely means you have to rewrite every metadata block that in any way refers to a given data block - and rewrite every metadata block pointing to those metadata blocks.


But more pointers just means more cost, not an outright inability to do it. The debate wasn't over whether defragmentation itself is costly. The question was whether merely making defragmentation possible would impose a cost on the rest of the system. So far you've only explained why defragmentation on a snapshotting volume would be expensive with typical schemes, which is entirely uncontroversial. But you haven't explained why you believe defragmentation would be impossible (no "ability to do it") with your scheme, nor why you believe it's impossible for other schemes to make it possible "for free".

In fact, the main difficulty with garbage collectors is maintaining real-time performance. Throw that constraint out, and the game changes entirely.


I never claimed it's impossible - I claimed it's expensive. Prohibitively expensive, as the team at Sun found out when they attempted it; meanwhile, offline defrag becomes easy with a two-space approach, which is essentially "zfs send to a separate device".

You can attempt to add an extra indirection layer, but it does not really reduce fragmentation; it just lets you remap existing blocks to another location at the cost of an extra lookup. This is in fact implemented in ZFS as the solution for erroneous addition of a vdev, allowing device removal, though due to the performance cost it's oriented mostly at "oops, I added the device wrongly, let me quickly revert".


If by "not able to" you meant "prohibitively expensive" - well, I also don't see why it's prohibitively expensive even without indirection. Moving blocks would seem to be a matter of (a) copy the data, (b) back up the old pointers, (c) update the pointers in-place, (d) mark the block move as committed, (e) and delete the old data/backups. If you crash in the middle you have the backup metadata journaled there to restore from. No indirection. What am I missing? I feel like you might have unstated assumptions somewhere?


My bad - I'm a bit too into the topic and sometimes forget what other people might not know ^^;

You're missing the part where (c) is forbidden by the design of the filesystem, because ZFS is not just "Copy on Write" by default (like BTRFS, which has an in-place rewrite option, IIRC), nor an LVM/device-mapper snapshot, which similarly doesn't have strong invariants on CoW.

ZFS writes data to disk in two ways: a (logically) write-ahead log called the ZFS Intent Log (which handles synchronous writes and is only read back on pool import), and the transaction group sync (txgsync), where all newly written data is linked into a new metadata tree that shares structure with the previous TXG's metadata tree (so unchanged branches are shared), and the pointer to the head of the tree is committed into an on-disk circular buffer of at least 128 pointers.

Every snapshot in ZFS is essentially a pointer to such a metadata tree - all writes in ZFS are done by creating a new snapshot; the named snapshots are just rooted in different places in the filesystem. This means recovery is sometimes possible even after a catastrophic software bug (for example, the master branch had for a few commits a bug that accidentally changed the on-disk layout of some structures - one person running master hit it and ended up with a pool that could not be imported... but the design meant they could tell ZFS import to "rewind" to a TXG sync number from before the bug).

Updating the blocks in place violates design invariants - once you violate them, the data safety guarantees are no longer guarantees. And this makes it into, at minimum, an offline operation, and at that point the kind of client that needs in-place defragmentation can reasonably do the two-space trick (if you're big enough for that to be infeasible, you're probably big enough to throw in at least an extra JBOD and relieve the fragmentation pressure).

To make the later paragraphs understandable (beware, ZFS internals as I remember them):

ZFS is constructed of multiple layers[1] - from the bottom (somewhat simplified):

1. SPA (Storage Pool Allocator) - what implements "vdevs" - the only layer that actually deals with blocks. It implements access to block devices, mirroring, RAIDz, draid, etc., and exposes a single block-oriented interface upwards.

2. DMU (Data Management Unit) - an object-oriented storage system. Turns a bunch of blocks into an object-oriented PUT/GET/PATCH/DELETE-like setup, with 128-bit object IDs. Also handles base metadata - the immutable/write-once trees for turning "here's a 1GB blob of data" into 512B to 1MB portions on disk. For any given metadata tree/snapshot there are no in-place changes - modifying an object "in place" means that the new txgsync has, for a given object ID, a new tree of blocks that shares as much structure with the previous one as possible.

3. DSL / ZIL / ZAP - provide basic structures on top of the DMU - DSL is what gives you "naming" ability for datasets and snapshots, ZIL handles the write-ahead log for dsync/fsync, ZAP provides a key-value store in DMU objects.

4. ZPL / ZVOL / Lustre / etc. - these are the parts that implement the user-visible filesystem. ZPL is the ZFS POSIX Layer, a POSIX-compatible filesystem implemented over the object storage. ZVOL does something similar but presents an emulated block device. Lustre-on-ZFS similarly talks directly to the ZFS object layer instead of implementing ODT/OST on top of POSIX files again.

You could, in theory, add an extra indirection layer just for defragmentation, but this in turn creates a problematic layering violation (something Sun found when they tried to implement BPR, block pointer rewrite) - because suddenly the SPA layer (the layer that actually handles block-level addressing) needs to understand the DMU's internals (or a layer between the two needs bi-directional knowledge). This makes for possibly brittle code, so again - possible, but against the overarching goals of the project.

The "vdev removal indirection" works because it doesn't really care about location - it allocates space from other vdevs and just ensures that all SPA addresses that have ID of the removed vdev, point to data allocated on other vdevs. It doesn't need to know how the SPA addresses are used by DMU objects


I appreciate the long explanation of ZFS, but I don't feel most of it really matters for the discussion here:

> Updating the blocks in place violates design invariants - once you violate them, the data safety guarantees are no longer guarantees.

Again - you can copy blocks prior to deleting anything, and commit them atomically, without losing safety. The fact that you (or ZFS) don't wish to do that doesn't mean it's somehow impossible.

> the type of client that needs in-place defragmentation can reasonably do the two-space trick (if you're big enough, to make that infeasible, you're probably big enough to easily throw in an extra JBOD at least and relieve fragmentation pressure).

You're moving the goalposts drastically here. It's quite a leap to go from "has a bit of free space on each drive" to "can throw in more disks at whim", and the discussion wasn't about "only for these types of clients".

And, in any case, this is all pretty irrelevant to whether ZFS could support defragmentation.

> this makes it into minimally offline operation

See, that's an underlying assumption you never stated. You want defragmentation to happen fully online, while the volume is still in use. What you're really trying to argue is "fully online defragmentation is prohibitive for ZFS", but you instead made the sweeping claim that "defragmentation is prohibitive for snapshotted filesystems in general".


You're hung on the word "impossible" which I never used.

I did say that there are trade offs and that some goals can make things like defragmentation expensive.

ZFS' main design goal was that nothing short of (extensive) physical damage should allow destruction of users' data. Everything else was secondary. As such, the project was not interested, ever, in supporting in-place updates.

You can design a system with other goals, or ones that are more flexible. But I'd argue that's why BTRFS got its undying reputation for data loss - they were more flexible, and that unfortunately also opened the way for more data-loss bugs.


> You're hung on the word "impossible" which I never used.

That's not true. It was only at the beginning - "impossible" was just how I originally took (and would still take, but I digress) your initial comment that the "ability to defragment is not free". It's literally saying that if you don't pay a cost (presumably, performance or reliability), then you become unable to defragment. That sounded like impossibility, hence the initial discussion.

Later you said you actually meant it'd be "prohibitively expensive". Which is fine, but then I argued against that too. So now I'm arguing against 2 things: impossibility and prohibitive-expensiveness, neither of which I'm hung up on.

> ZFS' main design goal was that nothing short of (extensive) physical damage should allow destruction of users' data. Everything else was secondary.

Tongue only halfway in cheek, but why do you keep referring to ZFS like it's GodFS? The discussion was about "filesystems" but you keep moving the goalposts to "ZFS". Somehow it appears you feel that if ZFS couldn't achieve something then nothing else possibly could?

Analogy: imagine if you'd claimed "button interfaces are prohibitively expensive for electric cars", I had objected to that assertion, and then you kept presenting "but Tesla switched to touchscreens because they turned out cheaper!" as evidence. That's how this conversation feels. Just because Tesla/ZFS has issues with something that doesn't mean it's somehow inherently prohibitive.

> As such, the project was not interested, ever, in supporting in-place updates.

Again: are we talking online-only, or are you allowing offline defrag? You keep avoiding making your assumptions explicit.

If you mean offline: it's completely irrelevant what the project is interested in doing. By analogy, Microsoft was not interested, ever, in allowing NTFS partitions to be moved or split or merged either, yet third-party vendors have supported those operations just fine. And on the same filesystem too, not merely a similar one!

If you mean online: there would probably be some intrinsic trade-off eventually, but I'm skeptical it's at this particular juncture. Just because ZFS may have made something infeasible with its current implementation, that doesn't mean another implementation couldn't have... done an even better job? e.g., even with the current on-disk structure of ZFS (let alone a better one), even if a defragmentation-supporting implementation might not achieve 100% throughput while a defragmentation is ongoing, surely it could at least sustain some throughput during a defrag so that it doesn't need to go entirely offline? That would be a strict improvement over the current situation.

> But I'd argue that's why BTRFS got undying reputation for data loss - they were more flexible, and that unfortunately also opened way for more data loss bugs.

Hang on... a bug in the implementation is a whole different beast. We were discussing design features. Implementation bugs are... not in that picture. I'm pretty sure most people reading your earlier comments would get the impression that by "brittleness" you were referring to accidents like I/O failures & user error, not bugs in the implementation!

Finally... you might enjoy [1]. ;)

[1] https://www.reddit.com/r/zfs/comments/1826lgs/psa_its_not_bl...


i've filled up a zfs array to the point where i could not delete files.

the trick is to truncate a large enough file, or enough small files, to zero.

not sure if this is a universal shell trick, but it worked on the systems i tried: "> filename"


For reasons I am completely unwilling to research, just doing `> filename` has not worked for me in a while.

Since then I memorized this: `cat /dev/null >! filename`, and it has worked on systems with zsh and bash.


That seems to be zsh-specific syntax that is like ">" except that it overrides the CLOBBER setting[1].

However, it won't work in bash. It will create a file named "!" with the same contents as "filename". It is equivalent to "cat /dev/null filename > !". (Bash lets you put the redirection almost anywhere, including between one argument and another.)

---

[1] See https://zsh.sourceforge.io/Doc/Release/Redirection.html


Yikes, then I've been remembering bash wrong, thank you.

In that case I'll just always use `truncate -s0`. It seems the safest option to remember without having to carry around context about which shell is running the script.


"truncate -s0 filename"

I believe "> filename" only works correctly if you're root (at least in my experience, if I remember correctly).

EDIT: To remove <> from filename placeholder which might be confusing, and to put commands in quotes.


Oh yes, that one also worked everywhere I tried, thanks for reminding me.


Pleasure.

It saved me just yesterday when I needed to truncate hundreds of gigabytes of Docker logs on a system that had been having some issues for a while, but where I didn't want to recreate the containers.

"truncate -s 0 /var/lib/docker/containers/**/*-json.log"

Will truncate all of the json logs for all of the containers on the host to 0 bytes.
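
One caveat if you run that in bash: `**` is treated like a plain `*` unless globstar is enabled (zsh recurses by default), which happens to be enough for Docker's one-level layout but not for deeper trees:

    # bash: enable recursive ** globbing first
    shopt -s globstar
    truncate -s 0 /var/lib/docker/containers/**/*-json.log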

Of course the system should have had logging configured better (rotation, limits, remote log) in the first place, but it isn't my system.

EDIT: Missing double-star.


Simple to verify with strace -f bash -c "> file":

    openat(AT_FDCWD, "file", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
man 2 openat:

    O_TRUNC
        If the file already exists and is a regular file and the
        access mode allows writing (i.e., is O_RDWR or O_WRONLY) it
        will be truncated to length 0.
        ...


Sure, but I just get an interactive prompt when I type `> file` and I honestly don't care to troubleshoot. ¯\_(ツ)_/¯


Probably you are using zsh: with a bare redirection and no command, zsh runs $NULLCMD (cat by default), which would explain the prompt sitting there reading stdin. You can get the sh behavior with:

    setopt SH_NULLCMD
- zsh isn't POSIX compatible by default


I see. But in this case it's best to just memorize `truncate -s0` which is shell-neutral.


Ok, we'll leave that a mystery then!


Depending on the environment you can also use the truncate command. This will work if the file is open as well.

https://man7.org/linux/man-pages/man1/truncate.1.html


It'd be better to do ": >filename"

: is a shell built-in for most shells that does nothing.


Some filesystems may require allocating metadata to delete a file. AFAIK it's a non-issue with traditional Berkeley-style filesystems, since metadata and data come from separate pools. Notably, ZFS has this problem.


btrfs has this problem too it seems. but there it is usually easy to add a usb stick to extend the filesystem and fix the problem.

i find it really frustrating though. why not just reserve some space?
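
roughly, assuming the stick shows up as /dev/sdb1 and the filesystem is mounted at /mnt:

    # temporarily grow the filesystem onto the stick
    btrfs device add /dev/sdb1 /mnt
    # delete what needs deleting, then shrink back off the stick
    btrfs device remove /dev/sdb1 /mnt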


btrfs does reserve some space for exactly this issue, although it might not always be enough.

https://btrfs.readthedocs.io/en/latest/btrfs-filesystem.html

> GlobalReserve is an artificial and internal emergency space. It is used e.g. when the filesystem is full. Its total size is dynamic based on the filesystem size, usually not larger than 512MiB, used may fluctuate.
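
You can see it on a mounted filesystem (the mount point here is illustrative):

    btrfs filesystem usage /mnt
    # look for the "GlobalReserve, single:" line in the output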


Yeah, with ZFS some will make an unused dataset with a small reservation (say 1G) that you can then shrink to delete files if the disk is full.
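
Something like this, with the pool name as a placeholder:

    # set aside space up front
    zfs create -o reservation=1G tank/reserved
    # when the pool fills up, release it, delete files, then put it back
    zfs set reservation=none tank/reserved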


This hasn't been a problem you should be able to hit in ZFS in a long time.

It reserves a percent of your pool's total space precisely to avoid having 0 actual free space and only allows using space from that amount if the operation is a net gain on free space.
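
If I remember right, on OpenZFS/Linux that reserve ("slop space") is pool size >> spa_slop_shift, with a default shift of 5 (roughly 3%) and a floor of 128MiB; the tunable is visible at:

    cat /sys/module/zfs/parameters/spa_slop_shift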


For more details about this slop space, see this comment:

https://github.com/openzfs/zfs/blob/99741bde59d1d1df0963009b...


Yeah, a situation where your pool gets suspended due to no space and you can't delete files is considered a bug by OpenZFS.


I mean, the pool should never have gotten suspended by that, even before OpenZFS was forked; just ENOSPC on rm.


Oh, that's good to know. I hit it in the past, but it was long enough ago that ZFS still had format versions.


Yeah, the whole dance around slop space, if I remember my archaeology, went in shortly after the fork.


The recommended solution is to apply a quota on the top-level dataset, but that's mainly for preventing fragmentation or runaway writes.
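
For example, something like this for a hypothetical 10T pool named tank, keeping roughly 10% of it off-limits:

    zfs set quota=9T tank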


I think the solution is to not use a filesystem that is broken in this way.


Note that ZFS explicitly has safeguards against total failure. No filesystem handles a near-full state well when it comes to fragmentation.


This is whataboutism. Being unable to use the filesystem when it's full, without arcane knowledge, is not the same as "not working well".

This is a broken implementation.


You're misunderstanding. See the sibling thread where p_l says that this problem has been resolved, and any further occurrence would be treated as a bug. Setting the quota is only done now to reduce fragmentation (ZFS's fragmentation avoidance requires sufficient free space to be effective).


No, I'm not. They said the "recommended solution" for this issue is to use a quota.

They also said it was mainly used for other issues, such as fragmentation. In other words, this was stated as a fix for the file delete issue.

How does this invalidate my comment, that this was a broken implementation?

It doesn't matter if it will be fixed in the future, or was just fixed.


According to rincebrain, the "disk too full to delete files" issue was fixed "shortly after the fork", which means shortly after 2012. My information was quite out of date.


Well, I'm glad they fixed a bug that made the filesystem unusable. Good on them, and thank you for the clarification.


Where `rm`, or more technically unlink(2), fails due to ENOSPC, like in the article...

Hilariously this failure case doesn't seem to be listed in the docs. https://developer.apple.com/library/archive/documentation/Sy...


Why do you think that would work, if using recoveryOS or starting the Mac in Share Disk/Target Disk mode didn't?


This is the kind of comment someone is going to be very happy to read in 8 years when they’re looking for answers for their (then) ancient Mac.


Anyone know why you can’t use the first usb-c port on a Mac laptop to make the bootable os?


The ports mentioned expose the serial interface that can be used to restore/revive the machine in DFU mode

https://support.apple.com/en-us/108900

That said, no idea why they can’t be used in this case


> That said, no idea why they can’t be used in this case

My intuitive guess here has to do with how the ports are connected to the T2 security chip. One port is, as you said, a console port that allows access to perform commands to flash/recover/re-provision the T2 chip. Same as an OOB serial port on networking equipment.

For the rest of the ports, the T2 chip has read/write access to the devices connected to them. Since this is an OS drive, I'm guessing it needs to be encrypted, and the T2 chip handles this function.


That doesn't make it technically impossible to implement booting from that port.


Sure, but it also doesn't make it necessary or useful to implement booting from that port - booting from a port IMHO is not a feature that Apple wants to offer to its target audience at all, so it's sufficient if some repair technician can do that according to a manual which says which port to use in which scenario.


The firmware is based on the iPhone boot process, from my understanding, and simply does not have space in ROM to implement booting from external storage.

The rest of the code necessary to boot from external sources is located on the main flash.


Yes, but the decision to use this firmware was made by Apple.

This is like saying my software did not work because it was based on an incompatible version of some library. Maybe so, but that is a bad excuse. Implementing systems is hard, and like the rest of us, Apple should not get away with bad excuses. And this is even more true because they control more of the stack.


OTOH, the current implementation works and is sufficient, so Apple could easily decide that it's not worth modifying firmware that already works to solve a nonexistent issue.


That's how you end up with a UX that works only for programmers (see e.g. Linux in the 90s).


We are talking about booting from USB. On an M2 Mac. That's literally the most power-user feature of a MacBook.

The 3 users of this feature on this planet are already happy that it's even possible at all. The only thing Apple could do is document this clearly, like adding a note in the boot drive selector.


> Mac laptop computer: Use any USB-C port except the leftmost USB-C port when facing the ports on the left side of the Mac.

Also, on my MacBook Pro at least, the mentioned port is the one closest to the MagSafe connector and perhaps has funny electrical connections to it.


What if it's through a USB-C adapter to a USB-A thumb stick?


> Was surprised to learn that with Apple silicon-based Macs, not all ports are equal when it comes to external booting

iirc, not all ports were equal when it came to charging with the m1 macs, so this is actually not so surprising.


But charging through many ports requires extra circuitry to support more power on every port, while booting from multiple ports just requires the boot sequence firmware to talk to more than one USB controller (like PC motherboards do, for example)


Or boot it into Target Disk Mode using another machine.


you can't boot the arm Macs into target disk mode, you can only boot to the recovery os and share the drive - it shows up as a network share iirc. I was super annoyed by this a few weeks ago because you can, for example, use spotlight to search for "target disk mode" and it will show up and look like it will take you to the reboot-into-target-disk-mode option, but once you're there it's just the standard "choose a boot drive" selector.



