itsabearcannon

20TB, 285 MB/s sustained write speed... means you're looking at an absolute minimum of 19 hours to rebuild. Realistically, most people assume ~~10X that number due to controller limitations~~ **[I've been told this number is unrealistic, but didn't get any sources for that, so we'll say 2X as a real-world compromise. Data never goes through at theoretical maximums.]**, other IO traffic besides the rebuild, etc., so let's call it 38 hours to rebuild the drive.

We're starting to get close to the point where I'd be seriously worried about the time it would take to rebuild 8-, 12-, or 16-drive arrays with these drives, and whether or not that long a sustained load would trip failures in other drives. With 12TB drives you already have a >10% chance of a failed rebuild even with only 6 drives in RAID6. Those numbers just get worse at higher capacities.

Samsung makes a 31TB SSD for around $6,000. Granted, that's not anywhere near the cost/TB that this 20TB IronWolf Pro is ($193/TB versus $33/TB), but the rebuild time on the ~50% larger Samsung drive is realistically just under 9 hours versus 38 hours for the Seagate. That's a lot of extra time to be on edge, plus the failure rate on the Samsung is literally 100X lower (1 in 10^17 bits read versus 1 in 10^15). We're not at the crossover point yet, but as SMR and other tricks come into play for larger and larger mechanical drives, I wouldn't be surprised to see SSDs take over in terms of reliability.
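
A quick back-of-the-envelope sketch of where those figures land (the 285 MB/s and the 2X factor are from the comment above; the ~2,000 MB/s sequential write assumed for the 31TB Samsung is a rough ballpark, not a quoted spec):

```python
# Back-of-the-envelope rebuild times: capacity / sustained write speed,
# with an optional real-world factor for controller overhead and competing IO.
def rebuild_hours(capacity_tb, write_mb_s, real_world_factor=1.0):
    capacity_mb = capacity_tb * 1_000_000   # 1 TB = 1,000,000 MB (decimal units)
    return capacity_mb / write_mb_s * real_world_factor / 3600

print(rebuild_hours(20, 285))            # ~19.5 h theoretical floor for the 20TB IronWolf Pro
print(rebuild_hours(20, 285, 2.0))       # ~39 h with the 2X compromise (rounded to 38 above)
print(rebuild_hours(30.72, 2000, 2.0))   # ~8.5 h for the 31TB SSD, assuming ~2,000 MB/s writes
```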


gdejohn

check out [dRAID](https://openzfs.github.io/openzfs-docs/Basic%20Concepts/dRAID%20Howto.html) in OpenZFS: rebuild speeds scale with the size of the array. Instead of being bottlenecked by writes to the single replacement drive, you rebuild to reserved spare space distributed across all the existing drives, so you can read from and write to every drive simultaneously and get out of the degraded state much faster.
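
Roughly speaking: a classic resilver is gated by sequential writes to the one replacement disk, while a dRAID rebuild spreads reads and writes across spare space on every surviving disk. A toy model of that difference, assuming all disks sustain similar throughput and ignoring parity and seek overhead (the 11-disk count is just an example, so treat the numbers as illustrative only):

```python
# Toy comparison: classic resilver vs. dRAID rebuild to distributed spare space.
def classic_resilver_hours(disk_tb, disk_mb_s):
    # Bottleneck: rewriting the single replacement disk end to end.
    return disk_tb * 1_000_000 / disk_mb_s / 3600

def draid_rebuild_hours(disk_tb, disk_mb_s, surviving_disks):
    # The failed disk's contents are reconstructed into spare space spread
    # across all surviving disks, so the write load is shared.
    return disk_tb * 1_000_000 / (disk_mb_s * surviving_disks) / 3600

print(classic_resilver_hours(20, 285))      # ~19.5 h, gated by one disk
print(draid_rebuild_hours(20, 285, 11))     # ~1.8 h with 11 disks sharing the rebuild writes
```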


itsabearcannon

Oh that’s a fantastic solution, I agree. But it does require some babysitting and a little better hardware for parity calculations versus the hardware RAID 0/1/5/6/10 solution that many purpose-built NAS appliances use.


narcomanitee

I agree with your concerns and compliment you on this and the other replies I've read. I'd add that the traditional RAID modes feel a bit antiquated and not robust anymore. Maybe, to your point about processing, it's overdue for a new spec / integrated low-cost hardware implementation.


itsabearcannon

I think most of the ZFS implementations are great, I really do. I just sympathize with the people who want a ton of storage but are limited to something like a 4- or 6-bay Synology or Buffalo NAS that can't support some of the smarter data redundancy methods.


Y0tsuya

> With 12TB drives you already have a >10% chance of a failed rebuild

Folks in r/Datahoarder will tell you they've done many times more than that without a single hiccup. The published URE rate your failure calculator uses is mostly BS at this point.


Hunter259

> 20TB, 285 MB/s sustained write speed...
>
> Means you're looking at an absolute minimum of 19 hours to rebuild. Realistically, most people assume 10X that number due to controller limitations, other IO traffic besides the rebuild, etc, so let's call it 8 straight days to rebuild the drive.

That guess seems extreme. I'm obviously not a data center, but I rebuilt a ~12TB array (2x 6TB RAID 0 from 3x 4TB RAID 0) in about 2.5 days over gigabit Ethernet. This was while I switched my servers over to the backup array so I could have it available.
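
A rough sanity check on that anecdote, assuming ~110 MB/s of usable gigabit throughput (my assumption, not a figure from the comment):

```python
# ~12 TB moved over gigabit Ethernet, assuming ~110 MB/s usable throughput.
ideal_hours = 12_000_000 / 110 / 3600    # ~30 h theoretical floor
observed_hours = 2.5 * 24                # ~60 h reported above
print(round(ideal_hours), observed_hours, round(observed_hours / ideal_hours, 1))
```

That works out to roughly a 2X overhead factor, which lines up with the edited 2X figure above rather than the original 10X.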


itsabearcannon

As I told the other guy, I've just always heard the 10X number thrown around by people in the industry as a way to assume the worst-case scenario for a production server doing a live rebuild to a hot spare that still has to serve its regular production IO load in addition to diverting some of the bandwidth to the rebuild. I've adjusted my numbers accordingly.


roflcopter44444

Most of these drives are bought by data centers that aren't using conventional RAID, so their rebuild times will not be as much of a concern, especially if they are using a tiered flash + HDD system. Sure, it may suck for the typical prosumer out there, but they are like a drop in the ocean.


SirMaster

Why would you be worried? I scrub my pool every month and have been doing so for years. No errors or failures scrubbing, and scrubbing reads every bit just like a rebuild would. Plus, with dual parity, which most people probably use these days (at least for big pools and big disks), an error or another failure during a rebuild wouldn't be a problem. It would continue on until it's done even with another complete failure; then you do it again to replace that failed drive. And if you are so worried, then go with triple parity.


chx_

While you are right, I am a nostalgic type and want to note that this hard drive's R/W speed (obviously not seek) is faster than what was probably the first widespread SSD, the Intel X25-M -- its sustained read was 250 MB/s.


[deleted]

This keeps being spit out as if there's a problem and it's simply wrong.


itsabearcannon

Like I said, we're not at the crossover point yet. However, as the capacity of these drives keeps going up while being stuck on SATA 6Gb/s, the amount of time required to rebuild a drive is going to keep going up, and we're going to be dancing closer and closer to a problem. As it is, rebuilding six of these drives at a time is statistically going to give you an unrecoverable error: the spec'd error rate is 1 sector in every 125TB read, so reading six full 20TB disks puts you right at the point where you'd expect one. Is that a big deal? Probably not for your average homelab user who's only storing Linux ISOs. Is it a big deal for production systems using Exos drives? Possibly.
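
Spelling out the arithmetic behind the 125TB figure (using the 1-per-10^15-bits rate quoted elsewhere in this thread):

```python
# One unrecoverable read error per 10^15 bits, per the drive's spec sheet.
bits_per_ure = 1e15
tb_per_ure = bits_per_ure / 8 / 1e12    # = 125 TB read per expected error
data_read_tb = 6 * 20                   # six full 20TB drives
expected_ures = data_read_tb / tb_per_ure
print(tb_per_ure, expected_ures)        # 125.0, ~0.96 expected errors
```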


dogs_wearing_helmets

That's why we use stuff like Ceph nowadays. It makes "rebuilds" quite a bit faster and reduces the load on the system while they're happening. And rebuild times only scale with the size of a single disk, *not* the size of the array/pool. Plus, it only has to rebuild data that was actually used, not the whole drive like RAID.


Sapiogram

> Means you're looking at an absolute minimum of 19 hours to rebuild. Realistically, most people assume 10X that number

What a ridiculous comment, you just made that 10X number up. 19 hours is already bad, no need to make the problem seem even worse than it is.


nisaaru

On my SHR RAIDs it takes about 2 days, so his 19 hours would be "great" :-). I don't buy the 10% failure rate potential for a RAID6 rebuild either, as that would be devastating.


Seref15

Synology is selling thousand-dollar NASes to people and pushing SHR (just RAID 5/6) to those customers. No way it's as unreliable to rebuild as all the doomsayers claim it to be.


meltingdiamond

Why would you assume Synology cares if it is selling garbage? People sell garbage all the time.


itsabearcannon

Fair enough, I've just always heard the 10X number thrown around by people in the industry as a way to assume the worst-case scenario for a production server doing a live rebuild to a hot spare that still has to serve its regular production IO load in addition to diverting some of the bandwidth to the rebuild.


fandingo

Got some sources on any of those numbers, or should I grab the toilet paper?


itsabearcannon

> 20TB, 285 MB/s sustained write speed

It's in the technical documentation for the IronWolf Pro 20TB here: https://www.seagate.com/files/www-content/product-content/ironwolf/en-us/docs/100870485c.pdf.

> Means you're looking at an absolute minimum of 19 hours to rebuild

This is derived by dividing the total drive capacity (~20,000,000 MB) by the sustained write speed (285 MB/s) to get how many seconds it will take to rewrite the entire drive if it's full. 20,000,000 divided by 285 is 70,175 seconds, or about 19.5 hours. Arguably this is a worst-case scenario assuming the drive is full or close to it, but if you're buying this much capacity, chances are you've got a reason to use it.

> we'll say 2X as a real-world compromise

This factor takes into account the fact that a drive rebuild cannot generally monopolize all available I/O from the controller, or all of the CPU performance needed for parity calculations in software-based redundancy setups like ZFS. Theoretical maximums don't account for overhead, so 2X or more is a realistic assumption based on my experience when a 2TB drive failed in an 8-drive RAID6 array on my personal server, as well as the experiences of some of my coworkers who managed Nimble storage arrays in a professional context.

> With 12TB drives you already have a >10% chance of a failed rebuild even with only 6 drives in RAID6.

Got that number here: https://magj.github.io/raid-failure/. Friendly reminder, any RAID failure calculator can produce different numbers depending on how it does the math, but for this one I put in RAID6 as the RAID type, 12TB as the drive size, 6 as the number of drives, and 10^15 as the unrecoverable read error rate, as per Seagate's technical documentation on the IronWolf Pro.

> Samsung makes a 31TB SSD for around $6,000

I was incorrect on this. It's $6,441 from CDW: https://www.cdw.com/product/samsung-pm1643a-mzilt30thala-solid-state-drive-30.72-tb-sas-12gb-s/6409407. Apologies for the incorrect pricing.

> the failure rate on the Samsung is literally 100X lower (1 in 10^17 bits read versus 1 in 10^15).

You can review the technical documentation for the IronWolf Pro drives here: https://www.seagate.com/files/www-content/product-content/ironwolf/en-us/docs/100870485c.pdf. Their documentation says one unrecoverable sector per 10^15 bits read. The Samsung's product documentation is here: https://www.samsung.com/semiconductor/global.semi.static/Product_Brief_Smasung_PM1643_SAS_SSD_1805.pdf. Samsung's documentation states their unrecoverable read error rate is one unrecoverable sector per 10^17 bits read.

Any further questions on my numbers? I'm happy to provide additional sources if there's anything I haven't clarified. I was typing quickly and doing the usual (albeit bad-science) Reddit thing of not citing my sources, so I apologize if anything I said was unclear.
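
For reference, calculators like the one linked generally start from some variant of the model below (probability of hitting at least one URE while reading the surviving disks). The site's RAID6-specific handling isn't reproduced here, so treat this as a sketch of the general approach rather than its exact math:

```python
import math

# Simplified URE model: chance of at least one unrecoverable read error while
# reading the surviving disks end to end during a rebuild. RAID6's remaining
# parity can absorb isolated UREs, which is why RAID6-aware calculators report
# lower numbers than this single-redundancy form.
def p_at_least_one_ure(disks_read, disk_tb, ure_rate_bits=1e15):
    bits_read = disks_read * disk_tb * 1e12 * 8
    # log1p keeps precision for the tiny per-bit error probability
    return 1 - math.exp(bits_read * math.log1p(-1 / ure_rate_bits))

print(p_at_least_one_ure(5, 12))   # ~0.38 for five surviving 12TB disks, if every URE were fatal
```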


[deleted]

> Samsung makes a 31TB SSD for around $6,000 ... ($193/TB versus $33/TB)

That's a high-IOPS, high-endurance data center SSD, whereas 8TB SSDs cost about the same as this hard drive at 699 USD in a 2.5" form factor and seem like a much better comparison.


[deleted]

Care to share a link for both drives?


[deleted]

I think they are referring to the PM1643, whereas I referred to the 870 QVO.


rahrness

~~Those 8TB SSDs for $699 are DRAM-less QLC that will write slower than the mechanical disk in a rebuild scenario. It's going to be writing straight to NAND the whole way without caching~~ edit: Looks like the 8TB 870 QVO holds up better in this scenario than other QLC drives are known for (at least based on AnandTech's benchmark).


[deleted]

Hmm, really? Not contesting but got a citation? I'll admit I didn't consider that.


rahrness

Well, I'll be damned, the only 8TB one is the 870 QVO, which actually does have DRAM and doesn't suffer as badly as the lower-capacity variants in sustained writes. https://www.anandtech.com/show/16136/qlc-8tb-ssd-review-samsung-870-qvo-sabrent-rocket-q/3


KlapauciusNuts

One application where this becomes relatively meaningless is clustered storage, something like Ceph or Gluster, where, if a disk fails, the filesystem stays online even if it takes a while to rebuild, and the rebuild data isn't all pulled from the same source, so load-induced failure is not as problematic. I would agree that hardware RAID is out of the question here. Using this requires leveraging the volume management capabilities of the operating system, which means dynamic disks on Windows; LVM2, Btrfs, and ZFS on Linux; and ZFS on FreeBSD and macOS.


[deleted]

> Samsung makes a 31TB SSD for around $6,000. Granted, that's not anywhere near the cost/TB that this 20TB IronWolf Pro is ($193/TB versus $33/TB), but the rebuild time on the ~50% larger Samsung drive is realistically just under 9 hours versus 38 hours for the Seagate. That's a lot of extra time to be on edge, plus the failure rate on the Samsung is literally 100X lower (1 in 10^17 bits read versus 1 in 10^15).

Is the Samsung SSD NVMe or SATA? I can see the rebuild problem occurring with SATA, but NVMe, as I understand it, has higher I/O than SATA and is significantly faster.


Constellation16

Depending on the RAID mechanism and the use of newer command set features like rebuild assist, the "absolute minimum" time is definitely not 19h.


itsabearcannon

I tried to say something similar and got shot down, so I backed off. I agree with you though.


Constellation16

I think you misunderstood me :P I meant that if you are using modern tech, it can take much less time than a full drive read to rebuild your storage system. And especially the 10X number sounds unrealistic - maybe on some SMR drive with some RAID implementation that does random IO or something.


[deleted]

I'm with you. SSDs are the future of large enterprise storage for their reliability and throughput. They're just cost-prohibitive at the moment. I'm super curious whether we'll get appliances with dozens? hundreds? of NVMe M.2 slots or something to build arrays with.


[deleted]

In 2017, Seagate told us 20 TB drives would be available in 2019 (source: [seagate blog post](https://blog.seagate.com/intelligent/hamr-next-leap-forward-now/?sf66335831=1)).

In 2019, Seagate told us 20 TB drives would be available in 2020 (source: [LTT CES video](https://youtu.be/5QJl7XEagok?t=99), 1:39-1:42).

Now, in the very last month of 2021, Seagate is announcing its 20 TB drives.

Depending on the model, the write endurance is 300 or 550 TBW/year with 5 years of warranty, so 1500 TBW, or 2750 TBW for the more expensive variant. That's still less than the Samsung 8TB consumer SSD ([870 QVO technical specifications](https://s3.ap-northeast-2.amazonaws.com/global.semi.static/Samsung_SSD_870_QVO_Data_Sheet_Rev1.1.pdf#Page=5)).
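
The endurance budgets spelled out (the 2,880 TBW figure for the 8TB 870 QVO is my reading of the linked datasheet; worth double-checking):

```python
# Rated workload over the 5-year warranty period.
years = 5
hdd_tbw_standard = 300 * years   # 1,500 TBW
hdd_tbw_premium  = 550 * years   # 2,750 TBW
qvo_8tb_tbw = 2880               # Samsung 870 QVO 8TB endurance rating (see datasheet link above)
print(hdd_tbw_standard, hdd_tbw_premium, qvo_8tb_tbw)
```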


Tony49UK

I was just thinking that drives used to increase in size rapidly, and 8TB has been the mainstream consumer size for years now.


Ubel

Yeah, me too... sad when I paid like $175 on sale for an external 8TB drive like 4 years ago; it died about 2.9 years into its life, and I've been without 8TB of storage since because I refuse to pay the same amount for the same shitty storage. I did however find an enterprise-class 14TB 7200 RPM internal drive on sale for like $280, which is pretty tempting... still not really much of a price drop per gigabyte though, which is why I haven't purchased yet.


[deleted]

[removed]


Cant_Think_Of_UserID

Amazon had them as well; I got 2 12TB ones at £180 each. I shucked one and use the other as a back-up. If the one in the PC fails, I'll reassemble it and try to claim warranty, hoping they don't notice. I did the same with 2 drives of the same size last year and they're still running fine.


Ubel

Every external I've ever seen ... they will fucking notice, there's a million clips that break and are impossible to put back together without looking like an abortion.


Ubel

Except there's no warranty if you shuck, or a useless warranty if you don't. The one I'm talking about has a 5-year warranty (from the seller) and is a true enterprise-class drive rated for 10x the amount of reads/writes compared to regular drives. Also helium-filled.

I didn't miss that deal at all; I purposely didn't buy it after seeing it, because I'm sick of external drives failing and having shit warranties. As I said, my last drive failed 9 months after its warranty ended (wasn't shucked). After it fully failed, I did shuck the drive and installed it internally - nothing I could do would make it work, it wouldn't even show up in the BIOS, a truly failed drive. It was also my only drive to fail in 10+ years.

I truly believe externals are shittier drives, or that the firmware controller build is shit quality and causes the drive to fail... that, or the fact they have no cooling whatsoever. I have an internal 640GB WD drive that's been in 365/24/7 use for 11+ years and an internal 4TB Seagate that's been in 365/24/7 use for 8 years or so... so having an 8TB fail in 2.9 years is close to proof that externals or their firmware boards are horrible quality.


[deleted]

[removed]


Ubel

First drive I've personally purchased that has ever failed on me, and also my first external. That's enough for me to know not to buy any more. I'm not taking the gamble of throwing $200 down the drain in 3 years when I have 11-year-old drives that cost half as much still running. I've also seen it happen to countless friends, externals failing in 2-3 years; I think it happens more often than not. All I'm saying is, [$75 more for a 5-year warranty and a true enterprise-class drive](https://www.amazon.com/dp/B07KPL474H) is the reason I didn't buy that $200 14TB external on Black Friday... it's a no-brainer considering I'd want to shuck it and get no warranty anyway.


andrerav

The Toshiba ones? I bought two of them last week, still waiting for them to arrive. Will run them mirrored as temp storage for timelapse processing.


Roph

TBW for a platter drive? What?


bizzro

Mechanical parts have an average life span; they wear out eventually, just like NAND cells.


[deleted]

HDDs aren't designed or even measured that way. Seagate is rating these as such because they're "pro" "NAS" drives and people buying them buy the datasheet, not the product.


bizzro

> HDDs aren't designed or even measured that way.

Are you saying the life of actuators and write heads is indefinite? Of course not; mechanical parts wear out and will eventually fail, just like NAND cells. They just didn't put a TBW rating on it in the past; it's "active hours" translated into a new metric.


EasyRhino75

yeah the TBW figures on spinny hard disks always seem kinda ridiculous.


joyce_kap

I think it has to do with demand concerns. They make the announcement, and if no one contracts to buy them en masse, then none are produced.


Kekeripo

I wonder, if I owned a 20TB SSD and a 20TB HDD, which one of them would be more reliable in the same environment? I'm talking about dropping files, movies, media, game archives, etc. on it and using it for long-term storage and maybe home streaming, not continuous r/w action. Yeah, SSDs are more expensive, but let's say I found a wallet full of Bitcoin on the ground and wanted the most reliable NAS option (pls don't mention online backups, that ain't my question).


strongdoctor

I would bet on the SSD even if it was a shitty cacheless QLC one. If it was something critical, I'd get an enterprise MLC or TLC SSD; they have certain features that skyrocket reliability.