I already did the long conference entry here, so just a quick update: slides from PGEast are posted and next week I’ll be at the increasingly misnamed MySQL Conference in Santa Clara, California.
One thing I’m known for now is ranting about cheap Solid State Drives and how they suck for database use. The Reliable Writes wiki page collects most of the background here. The situation the last few years has been that no inexpensive drive on the market has had a write cache that is safe for database use. For example, every customer of mine who has purchased one of Intel’s SSDs, either the X25-M or the not-enterprise-at-all X25-E, has suffered at least one massive data corruption loss.
In order to make a flash drive safe, you need to have a battery backup on the unit, for the same reasons one is needed on high-performance RAID controllers. When the database writes data and uses the fsync system call to make sure it’s flushed to disk, you cannot physically write that data fast enough to make people happy, onto either spinning disk or flash. The situation is even somewhat worse on flash, because writing tiny commits of data out without a cache will actually wear out the drive faster, too. Add a battery, make the drive’s controller flush all pending data when power drops, and you can make SSD reliable enough for databases.
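If you want to see the fsync cost described above on your own hardware, here is a minimal Python sketch that counts how many small write-plus-fsync cycles a directory's underlying drive completes per second. The function name `fsync_rate` and its parameters are my own for illustration, not part of any tool mentioned in this post, and this is a rough probe rather than a proper benchmark.

```python
import os
import tempfile
import time


def fsync_rate(directory, seconds=1.0, payload=b"x" * 512):
    """Return write+fsync cycles per second for a temp file in `directory`.

    Hypothetical helper for illustration: each loop iteration mimics one
    tiny database commit (a small write followed by an fsync).
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        count = 0
        start = time.perf_counter()
        deadline = start + seconds
        while time.perf_counter() < deadline:
            os.write(fd, payload)   # tiny commit-sized write
            os.fsync(fd)            # ask the OS to push it to the drive
            count += 1
        elapsed = time.perf_counter() - start
        return count / elapsed
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # Measures whatever drive holds the current directory.
    print(f"{fsync_rate('.'):.0f} fsync commits/second")
```

A 7200 RPM disk that honestly waits for the platter can only manage on the order of one commit per rotation, roughly 120 per second. A number in the thousands means a cache is absorbing the writes somewhere. This sketch cannot tell you whether that cache survives a power loss; only a pull-the-plug test or the manufacturer's documentation can.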
Really expensive enterprise drives have gotten this right for a while now, but hardware suitable for home use or small business has been scarce. OCZ released their Vertex 2 Pro drive with a super-capacitor and proper write flushing last year. The capacitor is the “Pro” part, and don’t confuse this with the regular Vertex 2. Those have been running around $650 for 100GB of SSD, and they are really fast. But you can’t have only one fast drive: they fail, same as any other component in your computer. And $1300 for a pair of drives has left them still outside of the range of small shops, and even a single one has exceeded my personal home hardware tinkering budget.
Well, now there’s another choice. Intel has finally cleaned up their act here. Their new 320 series drives integrate a set of small capacitors and proper shutdown logic into the drive. They’ve even made it part of the marketing now that they’re doing it right, including a fancy briefing on how it works. That’s where this subject is at now, by the way: if the manufacturer does write caching correctly, they will brag about it. If you don’t hear any bragging, that means they’ve screwed it up, and the drive will eat your database.
There’s a whole product line of these new Intel drives available, starting at a sub-$100 40GB model, all with the same write reliability. The larger drives are faster though, and I wanted something faster in every way than the regular hard drive it was replacing. That point doesn’t come until the $220 120GB model, which has a sequential write speed faster than the terabyte drives I use most of the time. One of the 120GB Intel 320 drives arrived in my excited hands earlier this week.
You can find the full numbers from my initial review on pgsql-performance. Basic performance parameters are as expected: 253MB/s reads, 147MB/s writes, and a respectable 5000 commits/second, all matching specifications and expectations. The only thing I can gripe about is the random read/write results. Despite claims of much higher numbers, I’m only getting around 3500 IOPS, translating to 27MB/s on a mixed workload. This is acceptable, spanking any regular drive, but it’s on the low side as SSDs go. Can’t complain given the price–if I want faster, I can always spend 3X as much for the OCZ Vertex 2 Pro–but it’s something to be aware of. There are a bunch of shameless Intel-loving reviews that get this wrong; the only review I’ve seen so far that caught the same issue and put it into proper perspective is the one from Anand. It shows the 300GB 320 series drive (which is even faster than the one I have) delivering middle to bottom of the pack speeds on random work, which is where it realistically is at. That’s not unacceptable, it’s just important to understand the set of trade-offs these drives deliver.
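For anyone wondering how 3500 IOPS turns into 27MB/s, the arithmetic can be sketched in a few lines of Python. The 8KB block size is my assumption here (it is PostgreSQL's default page size, and it is the size that makes the two figures line up); the function name is hypothetical.

```python
def iops_to_mib_per_sec(iops, block_bytes=8192):
    """Convert an I/O operations-per-second figure to MiB/s.

    block_bytes=8192 is an assumed 8KB block, matching PostgreSQL's
    default page size; change it if your test used another size.
    """
    return iops * block_bytes / (1024 * 1024)


print(round(iops_to_mib_per_sec(3500), 1))  # 27.3
```

The same conversion is handy when comparing reviews, since some quote random I/O in IOPS at 4KB and others in MB/s at a different block size.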
If your data fits in 120GB, this drive is a very compelling alternative to the traditional high-performance database setup. Getting a RAID controller with battery-backed write cache and a pair of drives normally adds up to around $600, and you get only fair random I/O performance from the result. Buy a pair of these drives for around $450, use software RAID for redundancy, and you will be way ahead most of the time. Just make sure you follow good SMART monitoring practices for these drives. They don’t last forever, with the write limit being a known failure point even if nothing breaks before then. There are plenty of consumables with the older technology here too, though, including replacement drives, replacement batteries, and sometimes needing to have extra controllers around as spare parts for critical systems. There should be cost savings with SSD now, so long as your data fits in the available size. And performance is going to be a big step up if you are hitting disk right now. The best way to boost performance is to add more RAM, but since that data eventually needs to go to and from disk, that may not always be good enough.
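As a starting point for the SMART monitoring mentioned above, here is a rough Python sketch that parses the attribute table printed by smartmontools' `smartctl -A` and flags a drive whose wear attribute is approaching its threshold. It assumes smartmontools is installed and that the drive reports wear under the name `Media_Wearout_Indicator`, which is how smartctl labels it on Intel SSDs I'm aware of; check the actual output for your drive. The helper names and the `margin` parameter are hypothetical.

```python
import subprocess


def parse_smart_attributes(text):
    """Map attribute name -> (normalized value, threshold, raw value).

    Expects the table layout of `smartctl -A`: ID#, ATTRIBUTE_NAME, FLAG,
    VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE.
    """
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have >= 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        try:
            value, thresh = int(fields[3]), int(fields[5])
        except ValueError:
            continue  # skip rows whose VALUE/THRESH are not numeric
        attrs[fields[1]] = (value, thresh, " ".join(fields[9:]))
    return attrs


def wear_warning(attrs, name="Media_Wearout_Indicator", margin=10):
    """True if the attribute is within `margin` of its failure threshold.

    Returns None when the drive does not report the attribute at all.
    The attribute name and margin are assumptions, not vendor guidance.
    """
    if name not in attrs:
        return None
    value, thresh, _raw = attrs[name]
    return value <= thresh + margin


def read_attributes(device):
    """Run `smartctl -A` on a device (usually needs root) and parse it."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_smart_attributes(result.stdout)
```

Something like `wear_warning(read_attributes("/dev/sda"))` run from cron, with an email on `True`, would cover the wear-out case. It does not replace watching reallocated sectors and the other usual failure indicators.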
Intel, you get my official thumbs-up here: you have finally done the right thing, and I’ll be happy to recommend you as a vendor now that you have. I’m still trying to figure out what I’m going to do with my now faster than ever server at home, and that’s a good problem to have.