[kwlug-disc] Flush to persist on Apple's NVMe drives

Chris Irwin chris at chrisirwin.ca
Tue Feb 22 17:11:26 EST 2022


On Tue, Feb 22, 2022 at 09:28:39AM -0500, Mikalai Birukou via kwlug-disc wrote:
>I start to have a funny view on storage, as boundaries start to blur.
>
>When I get SATA device, its cache memory is on that PCB inside, i.e. 
>flash, and caches are on that PCB. I hope that such device has 
>capacitors to flush "flushed" cache in case of power loss.

You shouldn't need to hope capacitors exist: fsync is an instruction 
telling the drive to flush to permanent storage, and the operation 
returns only when that is complete. After an fsync you should be able 
to assume that the data is either *on disk* or, with some of the 
weirder storage options, will at least survive a power outage.
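
For illustration, here is a minimal sketch of what "write, then fsync" 
looks like from a program's point of view. The helper name 
durable_write is made up for this example, and it only covers the file 
contents (a careful program would typically also fsync the containing 
directory after creating a new file).

```python
import os


def durable_write(path, data):
    """Write bytes to path and return only after fsync has completed.

    Hypothetical helper for illustration; not from any library.
    """
    with open(path, "wb") as f:
        written = f.write(data)
        # Push Python's userspace buffer down to the kernel...
        f.flush()
        # ...then ask the kernel to push it to the storage device.
        os.fsync(f.fileno())
    return written
```

Whether the bytes are truly on stable media once os.fsync returns 
still depends on the OS and the drive honouring the request, which is 
the whole point of this thread.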

Flushing the drive's cache is inherently slow. There are a variety of 
ways to work around the slowness, such as:

* Write to a battery-backed buffer (like hardware RAID cards)

* Write to faster storage, move later (like HDDs w/ cache SSD)

* Write to/as SLC, then move to MLC later (some SSDs internally)

* Just write normally, incurring the speed penalty (most drives)

* Don't do it and say you did (some drives, per Doug's twitter thread)

>When RAM and flash chips are soldered to main board, should I treat it 
>as storage device with CPU? RAM can be used for flash's needs and for 
>programs. Then RAM is moved into CPU. Nice boundaries are gone.

Actually, Apple's NVMe drive is totally fine and reliable from what I'm 
reading (albeit curiously slow when flushing writes). Apple's drive 
doesn't lie about whether data is written, like some others do (Doug's 
twitter thread).

Apple instead took a new approach vs. the above options:

* Break the whole OS to work around a slow drive, and add a separate 
   call (fcntl with F_FULLFSYNC) to actually do what you thought fsync 
   did.
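
As a rough sketch of what that means for portable code: on macOS the 
stronger flush is requested through fcntl with the F_FULLFSYNC command, 
while elsewhere plain fsync is the best available call. The function 
name full_flush is invented for this example, and the fallback 
behaviour is my assumption about what a reasonable program would do, 
not something Apple prescribes.

```python
import fcntl
import os


def full_flush(fd):
    """Flush fd as durably as the platform allows.

    Hypothetical helper. Returns the name of the mechanism that was used.
    """
    # F_FULLFSYNC is only defined by the fcntl module on macOS.
    full_fsync = getattr(fcntl, "F_FULLFSYNC", None)
    if full_fsync is not None:
        try:
            fcntl.fcntl(fd, full_fsync)
            return "F_FULLFSYNC"
        except OSError:
            # Assumption: some filesystems may not support the command,
            # so fall through to an ordinary fsync.
            pass
    os.fsync(fd)
    return "fsync"
```

On Linux this always takes the os.fsync path; on a Mac it should 
normally report F_FULLFSYNC, and that is the call that is curiously 
slow on Apple's drive.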

-- 
Chris Irwin

email:   chris at chrisirwin.ca
  xmpp:   chris at chrisirwin.ca
   web: https://chrisirwin.ca



