Joyent Weblog
Oh Thumpers, Oh Thumpers
I can’t tell you what a pleasure it is to have your storage run the same operating system as your “normal” servers. So much so that I have to share it with you all after doing some more tooling around on one.
So when you have 47 discs sitting around
$ zpool create joyous1 \
    raidz2 c{5,4,7,6,1,0}t4d0 c{4,7,6,1,0}t0d0 \
    raidz2 c{5,4,7,6,1,0}t5d0 c{4,7,6,1,0}t1d0 \
    raidz2 c{5,4,7,6,1,0}t6d0 c{4,7,6,1,0}t2d0 \
    raidz2 c{5,4,7,6,1,0}t7d0 c{4,7,6,1,0}t3d0 \
    spare c5t{1,2,3}d0
Each raidz2 group can lose two discs, and notice how each one cuts across controllers. The last line adds three hot spares.
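For anyone who doesn't read shell brace syntax fluently, here is a small Python sketch (not part of the original setup) that expands the device patterns the way the shell would and tallies the layout. The helper name expand is made up for illustration, and it only handles the simple, un-nested braces used in the command above. It reads cXtYdZ as controller X, target Y, disk Z.

```python
import re
from collections import Counter
from itertools import product

def expand(pattern):
    """Expand shell-style {a,b,c} groups (hypothetical helper, no nesting)."""
    parts = re.split(r"\{([^}]*)\}", pattern)
    # re.split with a capture group alternates literal text and brace contents.
    choices = [p.split(",") if i % 2 else [p] for i, p in enumerate(parts)]
    return ["".join(combo) for combo in product(*choices)]

# The four raidz2 groups from the zpool command: targets 4-7 paired with 0-3.
groups = [
    expand("c{5,4,7,6,1,0}t%dd0" % t) + expand("c{4,7,6,1,0}t%dd0" % (t - 4))
    for t in (4, 5, 6, 7)
]
spares = expand("c5t{1,2,3}d0")

all_discs = [d for g in groups for d in g] + spares
# raidz2 keeps two discs' worth of parity per group.
data_discs = sum(len(g) - 2 for g in groups)
# Most discs any single controller contributes to one group.
worst_per_controller = max(
    max(Counter(d.split("t")[0] for d in g).values()) for g in groups
)

print("discs per group:", [len(g) for g in groups])
print("total discs:", len(all_discs), "unique:", len(set(all_discs)))
print("data discs:", data_discs, "spares:", len(spares))
print("max discs per controller in a group:", worst_per_controller)
```

Under those assumptions the tally comes to 11 discs per group, 47 discs in total with no duplicates, and 36 discs' worth of data capacity. No controller contributes more than two discs to any one group, so losing a whole controller stays within what raidz2 can tolerate.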
And instantly you get
But the thing isn’t just about space.
It’s really about spindles.
For many things (mail, postgresql or any database for that matter), it’s all about the number of spindles: can the disk I/O (that’s in-out) be as fast as say … one’s network connection (gigabit+ in a good world)? cpu <-> memory (generally ~22Gbps)?
And how do you scale disc IO? That’s correct, you add spindles.
So let’s take filebench and just look at what the “webserver” profile does (90% reads, 10% writes).
[private:/] $ /opt/filebench/filebench
filebench> load webserver
26073: 43.569: Webserver Version 1.13 2005/06/21 21:18:53 personality successfully loaded
filebench> set $dir=/joyous/jason/
filebench> run 60
IO Summary: 5544761 ops 91746.2 ops/s, (29594/2961 r/w) 499.3mb/s, 60us cpu/op, 0.0ms latency
26073: 142.218: Shutting down processes
Hmm … nearly 100,000 operations/sec and disc IO of ~500MB/sec.
My God.
That’s 4Gbps.
Which is pretty good for a little, light benchmark.
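The arithmetic behind those numbers can be checked with a few lines of Python. This is only a sketch: the three constants are copied from the IO Summary line, and it assumes filebench's "mb" means decimal megabytes (10^6 bytes), which the output itself does not confirm.

```python
# Figures copied from the filebench IO Summary line.
ops_total = 5544761
ops_per_sec = 91746.2
mb_per_sec = 499.3

def mb_s_to_gbps(mb_s):
    """Megabytes/sec to gigabits/sec, assuming decimal units throughout."""
    return mb_s * 8 / 1000

gbps = mb_s_to_gbps(mb_per_sec)
# How long the run must have lasted to reach the reported op count.
elapsed = ops_total / ops_per_sec
# Average payload moved per operation, in kilobytes.
kb_per_op = mb_per_sec * 1000 / ops_per_sec

print(f"throughput: {gbps:.2f} Gbps")
print(f"implied run time: {elapsed:.1f} s")
print(f"average per op: {kb_per_op:.2f} KB")
```

The implied run time lands just over the 60 seconds requested with run 60, which is a quick sanity check that the ops and ops/s figures agree with each other.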
And apparently the limits are “approximately 1 GBps from disks to network and approximately 2 GBps from disk to memory”. Note the capital B: that’s gigabytes, so in gigabits the figures are 8x larger (8-16 Gbps).
I mean, isn’t a 2 hour DVD less than 4GB? So if I, say, had this in my house … it would take 4 seconds for a DVD-worth of movie to head out of that server? If only they didn’t sound like a Boeing 747.
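The bytes-versus-bits conversion and the DVD estimate work out as in the sketch below. The function names are made up for illustration, the 4 GB DVD size is the upper bound used above, and the rates are the quoted approximate limits rather than anything measured here.

```python
def gbytes_to_gbits(gb_per_s):
    """Gigabytes/sec to gigabits/sec (8 bits per byte)."""
    return gb_per_s * 8

def transfer_seconds(size_gb, rate_gb_per_s):
    """Seconds to move size_gb gigabytes at a steady rate_gb_per_s."""
    return size_gb / rate_gb_per_s

DVD_GB = 4.0  # upper bound for a 2 hour DVD, as assumed in the text
limits_gb_per_s = {"disk to network": 1.0, "disk to memory": 2.0}

for path, rate in limits_gb_per_s.items():
    print(
        f"{path}: {gbytes_to_gbits(rate):.0f} Gbps, "
        f"DVD in {transfer_seconds(DVD_GB, rate):.0f} s"
    )
```

At the quoted disk-to-network limit that is the 4 seconds mentioned above. It assumes the network on the other end can actually take data that fast, which a single gigabit link could not.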
By the way I’m talking one of these

There’s one funny thing about all zfs posts I read on the net:
They never explain what the issued commands mean.
Sure most of the time the basics can be inferred, but occasionally one wonders about the intricacies. It wouldn’t hurt to briefly state what you’re doing.
— cch 1040 days ago
The manual is pretty good.
— Jason Hoffman 1040 days ago
Hm, are users allowed to take snapshots of stuff on their container?
Though I probably wouldn’t use it. My backups are going to strongspace.
Would be neat to come up with some rsync wrapper script that was installed on each container that allowed users to do snapshot backups to strongspace.
— Joe 1039 days ago
Hmm, well, it’s awesome that you guys have a Thumper… however, I’d be curious about how you’d set this sort of thing up if you had a bunch of smaller boxes that you wanted to use with ZFS.
I’m mainly curious because I’m expecting to be in a situation soon where I’m going to need some serious storage space, but it’s for data that will start out small (by small I mean 500GB-1TB) and grow from there. The data in question should be highly compressible, so ZFS-like compression would be awesome. But I’d rather just buy the boxes one at a time as I need them, though it’d be a real pain in the neck to have to deal with the storage as separate volumes. Basically, I’m looking for a poor-man’s SAN. I get the impression that ZFS can’t do this, but perhaps something else on top of ZFS?
— Bob Aman 1039 days ago
Bob, try iSCSI. In fact, I think there was a previous Joyeur post about iSCSI+ZFS.
— Wes Felter 1039 days ago
We’ve heard that for large stripe/sequential reads/writes the Thumper performs as you described above. But for random access, its performance is pretty poor. Do you have any numbers on that?
— Lenny Tropiano 1012 days ago