Your comment, along with those of other users, suggests that TLC is a positive attribute for consumers; however, the transition from SLC and MLC NAND to TLC and QLC 3D-NAND actually marked a decline in the longevity of SSDs.

Using any mode other than SLC on current SSDs is insane because, unlike planar NAND, current 3D-NAND consumes writes for everything.

On 3D-NAND, reading data consumes writes [0]:

    " Figure 1a plots the average SSD lifetime consumed by the read-only workloads across 200 days on three SSDs (the detailed parameters of these SSDs can be found from SSD-A/-B/-C in Table 1). As shown in the figure, the lifetime consumed by the read (disturbance) induced writes increases significantly as the SSD density increases. In addition, increasing the read throughput (from 17MBps to 56/68MBps) can greatly accelerate the lifetime consumption. Even more problematically, as the density increases, the SSD lifetime (plotted in Figure 1b) decreases. In addition, SSD-aware write-reduction-oriented system software is no longer sufficient for high-density 3D SSDs, to reduce lifetime consumption. This is because the SSDs entered an era where one can wear out an SSD by simply reading it."
On 3D-NAND, data retention consumes writes [1]:

    " 3D NAND flash memory exhibits three new error sources that were not previously observed in planar NAND flash memory:

    (1) layer-to-layer process variation, 
    a new phenomenon specific to the 3D nature of the device, where the average error rate of each 3D-stacked layer in a chip is significantly different;

    (2) early retention loss, 
    a new phenomenon where the number of errors due to charge leakage increases quickly within several hours after programming; and

    (3) retention interference, 
    a new phenomenon where the rate at which charge leaks from a flash cell is dependent on the data value stored in the neighboring cell. "

[0] https://dl.acm.org/doi/10.1145/3445814.3446733

[1] https://ghose.cs.illinois.edu/papers/18sigmetrics_3dflash.pd...



Even datacenter-grade drives scarcely use SLC or MLC anymore, since TLC has matured to the point of being more than good enough even in most server workloads. What possible need would 99% of consumers have for SLC/MLC nowadays?

If you really want a modern SLC drive there's the Kioxia FL6, which has a whopping 350,400 TB of write endurance in the 3TB variant, but it'll cost you $4320. Alternatively you can get 4TB of TLC for $300 and take your chances with "only" 2400 TB endurance.
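To put those two price points side by side, here is a rough sketch of the arithmetic in Python. The capacities, endurance ratings and prices are the ones quoted above; the helper names and the 100 GB/day figure in the note below are my own assumptions, not vendor metrics.

```python
# Figures are the ones quoted in the comment above; the helper names are
# illustrative and are not vendor metrics.
DRIVES = {
    "Kioxia FL6 (SLC)": {"capacity_tb": 3, "endurance_tb": 350_400, "price_usd": 4320},
    "4TB TLC": {"capacity_tb": 4, "endurance_tb": 2_400, "price_usd": 300},
}


def full_drive_writes(drive):
    """How many times the whole drive can be overwritten within its rating."""
    return drive["endurance_tb"] / drive["capacity_tb"]


def usd_per_tb_written(drive):
    """Purchase price divided by rated write endurance."""
    return drive["price_usd"] / drive["endurance_tb"]


def years_to_exhaust(drive, gb_written_per_day):
    """Years until the rating is used up at a constant daily write volume."""
    return drive["endurance_tb"] * 1000 / gb_written_per_day / 365


for name, drive in DRIVES.items():
    print(name, full_drive_writes(drive), round(usd_per_tb_written(drive), 4))
```

With these inputs the SLC drive is rated for 116,800 full-drive writes against 600 for the TLC one, and costs about a tenth as much per TB written. At an assumed 100 GB written per day, `years_to_exhaust` puts the TLC drive at roughly 66 years, so the rating is unlikely to be the limiting factor for a typical desktop.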


Current tech meets the needs of normie home users (most of whom will never see even a full drive write) and big enterprise (servers have specific retirement schedules, each and every server is fully disposable, complete redundancies abound), but it leaves out in the cold those of us running small businesses and smaller data centers, where machines are overprovisioned and not on a depreciation schedule, hopefully with RAID 1(0) or ZFS/LVM as the safety mechanism in place.


I got 4TB of TLC for $230 (Silicon Power UD90). It even has SLC caching (can use parts of the flash in SLC mode for short periods of time).


True, I was looking at the prices for higher end drives with on-board DRAM, but DRAM-less drives like that UD90 are also fine in the age of NVMe. Going DRAM-less was a significant compromise on SATA SSDs, but NVMe allows the drive to borrow a small chunk of system RAM over PCIe DMA, and in practice that works well enough.

(Caveat: that DMA trick doesn't work if you put the drive in a USB enclosure, so if that's your use-case you should ideally still look for a drive with its own DRAM)


TLC cannot mature as long as it keeps relying on 3D-NAND without more advanced materials science. Both reading and retaining data consume writes, which degrades the memory, because the traces in the vertical stack of the circuit create interference.

Perhaps there are techniques to separate the traces, but that would ultimately increase the surface area, which seems to be something they are trying to avoid.

You should not use datacenter SSD disks as a reference, as they typically do not last more than two and a half years. It appears to be a profitable opportunity for the SSD manufacturers, and increasing longevity does not seem to be a priority.

To be more specific, we are talking about planned obsolescence for consumer and enterprise SSD disks.

> If you really want a modern SLC drive there's the Kioxia FL6, which has a whopping 350,400 TB of write endurance in the 3TB variant, but it'll cost you $4320.

Did you read the OP article?


I agree wholeheartedly. It’s not something a large enterprise can do, but for my own home and multiple small business needs I purchased a good number of Samsung 960/970 Pro NVMe drives when they came out with the TLC 980 Pro.

I’m still rocking some older Optanes and scavenge them from retired builds. They’ll last longer than a new 990 Pro.


> Your comment, along with those of other users, suggests that TLC is a positive attribute for consumers; however, the transition from SLC and MLC NAND to TLC and QLC 3D-NAND actually marked a decline in the longevity of SSDs.

The bit that you're pointedly ignoring and that none of your quotes address is the fact that SLC SSDs had far more longevity than anyone really needed. Sacrificing longevity to get higher capacity for the same price was the right tradeoff for consumers and almost all server use cases.

The fact that 3D NAND has some new mechanisms for data to be corrupted is pointless trivia on its own, bordering on fearmongering the way you're presenting it. The real impact these issues have on overall drive lifetime, compared to realistic estimates of how much lifespan people actually need from their drives, is not at all alarming.

Not using SLC is not insane. Insisting on using SLC everywhere is what's insane.


> Your comment, along with those of other users, suggests that TLC is a positive attribute for consumers

TLC is better than QLC, which is specifically what my comment was addressing; I never implied that it's better than SLC though, so just don't, please.

It's interesting to see that 3D-NAND has other issues even when run in SLC mode, though.


> I never implied that it's better than SLC though, so just don't, please.

My apologies.

> It's interesting to see that 3D-NAND has other issues even when run in SLC mode, though.

Basically, SSD manufacturers are increasing capacity by adding more layers (3D-NAND). When one cell in the vertical stack is read, the interference produced by the nearby traces increases the number of cells that need to be rewritten, which consumes the life of the device, by design.


> When one cell in the vertical stack is read, the interference produced by the nearby traces increases the number of cells that need to be rewritten, which consumes the life of the device, by design.

You should try being honest about the magnitude of this effect. It takes thousands of read operations at a minimum to cause a read disturb that can be fixed with one write. What you're complaining about is the NAND equivalent of DRAM rowhammer. It's not a serious problem in practice.
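For a sense of scale, here is a back-of-the-envelope sketch in Python. It is a toy model, not how any real controller works: `reads_per_disturb=1000` stands in for "thousands of read operations at a minimum", and `pages_rewritten_per_disturb` is a hypothetical knob for how many pages one disturb forces the drive to rewrite. Both values are assumptions, not measurements.

```python
def induced_writes(page_reads, reads_per_disturb=1000, pages_rewritten_per_disturb=1):
    """Toy estimate of page writes caused purely by reading.

    Assumes one read-disturb event per `reads_per_disturb` page reads, each
    repaired by rewriting `pages_rewritten_per_disturb` pages. Both
    parameters are illustrative assumptions.
    """
    if reads_per_disturb <= 0:
        raise ValueError("reads_per_disturb must be positive")
    disturb_events = page_reads // reads_per_disturb
    return disturb_events * pages_rewritten_per_disturb


# One million page reads under the default assumptions.
print(induced_writes(1_000_000))
# Same reads, but assuming each disturb forces 64 pages to be rewritten.
print(induced_writes(1_000_000, pages_rewritten_per_disturb=64))
```

Under the default assumptions, a million page reads cost about a thousand page writes. Raising the rewrite size to 64 pages gives 64,000 writes, still well under a tenth of the read count, so whether this matters depends entirely on the real values of those two parameters for a given drive.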


It is not the NAND equivalent: the larger the stack, the more contiguous cells have to be rewritten, not just a single cell.

The dishonest ones here are the SSD manufacturers of the last decade, and they feel comfortable enough to bring QLC to market.

> It's not a serious problem in practice.

It's serious enough that simply reading data wears out the disk, and the faster it's read, the faster it's consumed [0]. You should have noticed that SSD disks no longer come with a 10-year warranty.

    "under low throughput read-only workloads, SSD-A/-B/-C/-D/-E/-F extensively rewrite the potentially-disturbed data in the background, to mitigate the read (disturbance) induced latency problem and sustain a good read performance. Such rewrites significantly consume the already-reduced SSD lifetime. "
Under low throughput read-only workloads.

It is a paper from 2021, which means sci-hub can be used to read it.

[0] https://dl.acm.org/doi/10.1145/3445814.3446733


> It's serious enough that simply reading data wears out the disk, and the faster it's read, the faster it's consumed

Numbers, please. Quantify that or GTFO. You keep quoting stuff that implies SSDs are horrifically unreliable and burning through their write endurance alarmingly fast. But the reality is that even consumer PCs with cheap SSDs are not experiencing an epidemic of premature SSD failures.

EDIT:

> You should have noticed that SSD disks no longer come with a 10-year warranty.

10-year warranties were never common for SSDs. There was a brief span of time where the flagship consumer SSDs from Samsung and SanDisk had 10-year warranties because they were trying to one-up each other and couldn't improve performance any further because they had saturated what SATA was capable of. The fact that those 10-year warranties existed for a while and then went away says nothing about trends in the true reliability of the storage. SSD warranties and write endurance ratings are dictated primarily by marketing requirements.


From a 2-minute search:

https://www.reddit.com/r/DataHoarder/comments/150orlb/enterp...

    "So, on page 8's graphs, they show that 800GB-3800GB 3D-TLC SSDs had a very low "total drive failure" rate. But as soon as you got to 8000GB and 15000GB, the drives had a MASSIVE increase in risk that the entire drive has hardware errors and dies, becomes non-responsive, etc."
Study: https://www.usenix.org/system/files/fast20-maneas.pdf

(with video): https://www.usenix.org/conference/fast20/presentation/maneas


Would you care to explain how any of that supports the points you're actually making here?

Some of what you're spamming seems to directly undermine your claims, e.g.:

> Another finding is that SLC (single level cell), the most costly drives, are NOT more reliable than MLC drives. And while the newest high density 3D-TLC (triple level cell) drives have the highest overall replacement rate, the difference is likely not caused by the 3D-TLC technology


"Likely" not caused by. In any case, should I delete that "spamming" link?

> Would you care to explain how any of that supports the points you're actually making here?

Another day, if you don't mind.


On page 7 of the USENIX study:

    "The last column in Table 1 allows a comparison of ARRs across flash types. A cursory study of the numbers indicates generally higher replacement rates for 3D-TLC devices compared to the other flash types. Also, we observe that 3D-TLC drives have consumed 10-15X more of their spare blocks."
Later it continues:

    "we observe that SLC models are not generally more reliable than eMLC models that are comparable in age and capacity. For example, when we look at the ARR column of Table 1, we observe that SLC models have similar replacement rates to two eMLC models with comparable capacities [...] This is consistent with the results in a field study based on drives in Google’s data centers [29], which does not find SLC drives to have consistently lower replacement rates than MLC drives either. Considering that the lithography between SLC and MLC drives can be identical, their main difference is the way cells are programmed internally, suggesting that controller reliability can be a dominant factor."
And then it concludes:

    "Overall, the highest replacement rates in our study are associated with 3D-TLC SSDs. However, no single flash type has noticeably higher replacement rates than the other flash types studied in this work, indicating that other factors, such as capacity or lithography, can have a bigger impact on reliability."
So planned obsolescence is present in the drivers, as well as in the 3D-NAND that degrades over time with reads (because of the chosen trace design, not the layers themselves). Interesting.

China, are you reading this? You have the opportunity to shake up the market and dominate it globally, just by shipping a well-designed product with honest drivers and a modest process node (not shrinking to today's sizes, just enough to ensure decent energy efficiency and good speed).


* Where I wrote "drivers", read "controllers and firmware".


The massive increase is still 1/500 chance per year.
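To see what 1/500 per year adds up to over a drive's service life, here is a minimal sketch in Python. It assumes a constant annual failure rate and independence between years, which is my simplification and not something the study claims.

```python
def cumulative_failure_probability(annual_rate, years):
    """Chance of at least one failure over `years`, assuming a constant,
    independent annual failure rate (a simplifying assumption)."""
    if not 0.0 <= annual_rate <= 1.0:
        raise ValueError("annual_rate must be between 0 and 1")
    return 1.0 - (1.0 - annual_rate) ** years


# 1/500 per year over an assumed five-year service life.
print(round(cumulative_failure_probability(1 / 500, 5), 4))
```

Under that assumption the five-year figure comes out at just under 1% per drive.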


> When one cell in the vertical stack is read, the interference produced by the nearby traces increases the number of cells that need to be rewritten, which consumes the life of the device

So 3D-NAND suffers interference between the stacked layers? (Introducing Columnhammer... /s)



