Amazing level of transparency and detail about their custom storage servers. HN discussion at http://news.ycombinator.com/item... (discusses why this is appropriate for backup, but perhaps not generic storage needs)
- Bret Taylor
from Bookmarklet
45 drives per unit and many units means they must be constantly replacing failed hard drives - just due to the sheer quantity of them in use
- Jacob Old
It wasn't entirely clear to me from the blog post what you have to do to replace a drive. Looks like at minimum you have to remove the unit from the rack, and I don't see any drawer guides or similar to assist with that. And do they have to take the unit offline to replace a single drive?
- Jason Wehmhoener
Geez. Back in 1998, Microsoft was bragging about their 1 TB cloud... :-) Millions of $ then I think.
- Mitchell Tsai
One happy Backblaze customer checking in.
- Russellreno
sounds neat - now what to do with 67 TB of storage...
- Matt Ellsworth
Seriously Matt! Lots and Lots and Lots of video? HD video!
- Rick Cogley
So, they store their data "securely" in Palo Alto? That makes me scared.
- Jonas S Karlsson
Quoted from blog- "Backblaze Storage Pods are building blocks upon which a larger system can be organized that doesn’t allow for a single point of failure." They have indicated an amazing amount of cost savings.
- Wins Fern
Mitchell: I don't think 1TB was "millions of dollars" in 1998.
- Steve de Mena
Nice idea. Pity that it only supports an HTTPS interface, though that's not surprising at that cost (the software that runs the filesystems on NetApp and other devices isn't exactly cheap to write). Anyone see if they quoted transfer speeds? I'm wondering what impact the four SATA cards, each with SATA port multipliers on them, have when it comes to access speeds.
- Russ
Steve: according to http://www.littletechshoppe.com/ns1625... disk cost ~$0.08 / MB in 1998, which comes out to ~$80,000 for 1 TB, or just over $100,000 in today's dollars. So not millions, and not even a million!
- Karl Rosaen
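For anyone checking the 1998 figure: a minimal sketch of the arithmetic, assuming decimal units (1 TB = 1,000,000 MB) and taking the ~$0.08/MB price quoted above at face value. The variable names are made up for illustration, and no inflation adjustment is attempted.

```python
# Rough cost of 1 TB of raw disk at a quoted 1998 price per megabyte.
PRICE_PER_MB_USD = 0.08   # figure quoted in the comment above
MB_PER_TB = 1_000_000     # assumption: decimal units, not 1,048,576

cost_per_tb_usd = PRICE_PER_MB_USD * MB_PER_TB

print(f"1 TB at ${PRICE_PER_MB_USD:.2f}/MB is about ${cost_per_tb_usd:,.0f}")
```

This covers bare drives only; controllers, enclosures, and power would have pushed the real 1998 price higher.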
Russ: It runs Debian. If you were rolling your own (and they don't sell these units), you could turn on NFS or some other protocol (CIFS, iSCSI). They only use HTTP because it's cloud storage. NFS license is a major expense on NetApp, but all the major Linux distributions can act as NFS servers, CIFS servers, and probably iSCSI targets.
- Andy Dustman
Andy: I know that you could do that on them, but it leaves the problem of what to do with the storage. You could merge the 3 volumes into an LVM VG, but performance could become an issue with any load on it. It seems I wasn't the only one to question the performance. The views of a Sun engineer aren't exactly unbiased, but this does highlight some of the downsides: http://www.c0t0d0s0.org/archive...
- Russ
Fascinating article; but more questions: "In rough terms, every time one of our customers buys a hard drive, Backblaze needs another hard drive." -- so what happens when a drive fails; how much redundancy is there? What happens when a meteorite destroys the whole building; is there off site backup too? (I know this *is* the off-site backup, but still...) I wonder how much data flows in and out over time. Maybe I should just read their website.
- Rob Fisher
Rob: they mention using 15-drive RAID6 volumes, each of which can lose up to 2 drives before failure
- Mike Chelen
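To put numbers on the layout described in this thread: a rough sketch, assuming the 67 TB figure is raw capacity spread evenly over the 45 drives, and that every drive sits in a 15-drive RAID6 group with two drives' worth of parity. Filesystem overhead is ignored, so real usable space would be somewhat lower.

```python
# Usable capacity of one pod under the assumptions stated above.
DRIVES_PER_POD = 45       # from the thread
RAID_GROUP_SIZE = 15      # from the thread: 15-drive RAID6 volumes
PARITY_PER_GROUP = 2      # RAID6 spends two drives' worth of space on parity
RAW_CAPACITY_TB = 67      # assumption: the quoted 67 TB is raw capacity

raid_groups = DRIVES_PER_POD // RAID_GROUP_SIZE
data_drives = raid_groups * (RAID_GROUP_SIZE - PARITY_PER_GROUP)
usable_fraction = data_drives / DRIVES_PER_POD
usable_tb = RAW_CAPACITY_TB * usable_fraction

print(f"{raid_groups} RAID6 groups, {data_drives} of {DRIVES_PER_POD} drives hold data")
print(f"usable: about {usable_tb:.1f} TB ({usable_fraction:.0%} of raw)")
```

Each group tolerates two failed drives on its own, so a pod can survive up to six failures if they happen to be spread two per group, but only two are guaranteed.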
The worst part about this cluster design is the fact that I couldn’t shut up about it for the first couple days after finding out about it. It was the solution I proposed to every problem. There were complaints.
- A Mitchell
IMO RAID6 is not that great. Granted, losing 3 drives at the same time is highly unlikely, but it's still a possibility. Besides, for a write-intensive app, parity calculation is quite time-consuming. I personally prefer RAID 10 (a striped array of RAID1 pairs). Yes, the usable space is only half the total capacity, but for backups -- which will sooner or later be used to restore something -- I prefer data integrity over usage efficiency.
- Pandu ● IT Optimizer
from fftogo
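One way to compare the two layouts on simultaneous failures: a sketch assuming drives fail independently and uniformly at random, which drives from the same batch may not do. RAID6 survives any two failures in a group, while RAID 10 survives several failures only if no mirror pair loses both members. The 7-pair array size is an arbitrary example, not something from the article, and the function names are made up for illustration.

```python
from math import comb

def raid6_loss_probability(failed: int) -> float:
    """Chance of data loss in one RAID6 group with `failed` dead drives."""
    return 0.0 if failed <= 2 else 1.0

def raid10_loss_probability(pairs: int, failed: int) -> float:
    """Chance that `failed` random dead drives include both halves of a mirror."""
    total_drives = 2 * pairs
    if not 0 <= failed <= total_drives:
        raise ValueError("failed must be between 0 and the number of drives")
    # Survivable outcomes: pick `failed` distinct pairs, then one drive from each.
    survivable = comb(pairs, failed) * 2 ** failed
    return 1 - survivable / comb(total_drives, failed)

for failed in (1, 2, 3):
    print(
        f"{failed} failed: RAID6 loss {raid6_loss_probability(failed):.3f}, "
        f"RAID 10 (7 pairs) loss {raid10_loss_probability(7, failed):.3f}"
    )
```

Under these assumptions RAID 10 can already lose data with two failures, so its case rests more on rebuild time and write speed than on how many failures it is guaranteed to survive.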