
October 30, 2007

Comments


Barry Whyte

The point I was making wasn't that these products were the latest and greatest, but that the tool isn't purely dependent on drive numbers. If that were the case, there would be very little deviation from the trend lines.

As for the details in the full disclosure, yes, it does make for an interesting read to see what was done. I'd speculate that each vendor will set up their rig to get the best out of it, as they understand it at that time. This in itself can show you how well they understand fine-tuning their product, and indeed how well they will be able to aid the proof-of-concept style setups that we all agree are the real benchmark by which customers can truly prove that the boxes they are considering will give them what they need.

SPC, however, does give a proof point where all vendors are set against the same benchmark to verify that the box can do what is claimed of it. The rules and tests are the same for everyone. You don't see people complaining about graphics card benchmarks, for instance. 3DMark has just as many configurable options when it comes to memory speeds, bus speeds, CPU, overclocking, etc. And it may well be the case that, even though 3DMark shows great frame rates, game X does not play well on your setup. However, it does give you a measuring stick by which to compare one gfx card and associated components with another. That's what SPC is about too.

I don't think we will ever agree. I can see a place for SPC, as does IBM as a whole, HP, SUN, Hitachi, Fujitsu... I suspect that with each subsequent SPC result published by each vendor, the same question will surface, and the same arguments will be rolled out by EMC. If indeed, as you speculate, DMX can saturate its 2400 DDMs, then what have you got to lose?

open systems storage guy

I suppose the question I would ask is what real life workload should we use to compare systems? I know SPC is messed up because it allows people to spend unrealistic amounts of money on hardware to get the numbers they want, but as long as all the vendors are doing it, at least it can be used to compare, if not to quantify.

If someone came up with a full disclosure benchmark test that did not allow any of the nonsense we see in SPC, would you get behind it and support testing?

I am not much of an expert, and I certainly wouldn't be able to write a benchmark specification; however, what I'd like to see is:
-actual IO request streams from production servers of various types: these could be gotten by recording streams at your clients' sites
-configuration requirements that reflect common best practices (no short stroking, cache mirroring on, etc)
-independent (or opposed) benchmarkers: either EMC and a competitor doing the same tests the same way until they agree to the same answer, or testing done by Tom's Hardware or some such
-benchmark specification review: if something is borked with the benchmark that's favoring one vendor's machines, then another vendor should be able to force a change to the benchmark
-editorial section taking the benchmark beyond just numbers: the vendor, a neutral third party, and a competitor would be able to hash out some things that might not affect performance but do affect the final price a company pays (for example superior unique system algorithms for NetApp, or single points of failure for IBM)

Some of these items are not going to fly and I know that, but any one of them would make a better benchmark than the SPC, and if the SPC is what you're objecting to, then maybe some of these would make a good replacement.

Also, as a postscript, IBM Barry claimed on EMC Barry's blog that the DS8300 is not being EOL'd. While it certainly doesn't have any new features, a denial from an IBM manager is something.

dotconnector

Hi Barry,
I absolutely agree that there is nothing wrong with optimizing the host to get the best result. I myself would not hesitate to increase queue_depth if I had to tune a system. It almost appears that this optimization was missed for the DS8300, not because of an intent to deceive, but through sheer oversight. I am definitely not accusing IBM of manipulating the results here. My personal opinion is that the older DS8300 could have done better with these optimizations.

What does EMC have to lose if we can show no scaling violation to 2400 drives? Something very critical, IMHO: this would send the message that all storage systems are alike, and that architecture and features like cache partitioning, QoS, etc. do not matter. SPC-1 cannot exercise these features. This puts the DMX on par with anyone else, and I believe this would be a disservice to the DMX and our customers. The DMX is a cut above.

And I do fear that we will never agree on the place of SPC-1 in the storage world, Barry! And that's OK. I assure you I still hold you in very high regard, and hope we meet some day and have a chance to shoot the breeze face to face!

Cheers, K.

dotconnector

Hi OSSG,
Excellent comment!

I am definitely not against benchmarking - just the marketing misrepresentation thereof. As I have stated earlier (and several of my colleagues have as well), the real meat behind the SPC-1 is not the SPC-1 IOPS measurement itself, but the $/SPC-1 IOPS. The absolute magnitude does not mean much, but the cost to achieve it does.
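To make the distinction concrete, here is a minimal sketch of the arithmetic. The system names, prices, and IOPS figures are hypothetical placeholders chosen for illustration, not published SPC-1 results, and `price_performance` is just a helper name used in this example.

```python
def price_performance(total_price_usd: float, spc1_iops: float) -> float:
    """Return dollars per SPC-1 IOPS for one tested configuration."""
    if spc1_iops <= 0:
        raise ValueError("spc1_iops must be positive")
    return total_price_usd / spc1_iops


# Hypothetical figures only: (total tested price in USD, reported SPC-1 IOPS).
systems = {
    "System A": (3_000_000, 200_000),
    "System B": (1_000_000, 100_000),
}

# Rank by cost per IOPS (lower is better), not by absolute IOPS.
ranked = sorted(systems, key=lambda name: price_performance(*systems[name]))

for name in ranked:
    price, iops = systems[name]
    print(f"{name}: {iops:,} IOPS at ${price_performance(price, iops):.2f} per IOPS")
```

In this made-up case System A posts twice the IOPS of System B but costs three times as much, so it ranks behind System B on $/SPC-1 IOPS.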

The quest to create a better and more usable benchmark (one that, for example, gives weight to NetApp intellectual property) and not just contrived performance numbers is worthy of a separate thread. I'll collect my thoughts and do a separate blog post, and hopefully spark some good discussion around this difficult problem.

Thanks again,
Cheers, K.

open systems storage guy

That's what I'm here for ;)

David Vellante

Hi Connector,

On a recent conference call, IBM made the same assertion as above, that SPC suggests SVC performs about the same as the USP V, to which many of us on the call said, 'Huh, look at the knee of the RT curve... how's that the same?' To which IBM said: 'That's a fair point.' It was a strange exchange. Thoughts?

dotConnector

Hi Dave,
The SVC response time curve compared to the USP V response time curve shows that the USP V was not saturated in any component other than the number of spindles. At the 1024-drive mark, the SVC likewise does not show the saturation it shows at the 1576-drive mark; its RT degrades a bit at 1024 drives compared to the USP V, but not by much.

So that would suggest that the SVC and USP V are in fact comparable in that spindle count regime.

Which I believe to be an artifact of the way SPC-1 is crafted, and not a reflection of the true capabilities of the platforms themselves. And the response time metric itself is highly suspect, as it is a result of short-stroking the drives to reduce the seek time.

The USP V used only 34% (52 TB) of its raw storage capacity (147 TB) to post these results. The SVC used 44% (46 TB) of its raw storage (110 TB) to get good response time metrics. This is an artificial way to reduce the seek time and increase the number of spindles, and thereby boost the IOPS and reduce the RT, which is not reflective of how customers intend to use the storage. Our customers want to use the storage they buy, not have it sit idle.
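The utilization figures above can be checked with a short sketch. It uses only the rounded TB values quoted in this comment, and `utilization_pct` is a helper name introduced for illustration. Because the inputs are rounded, the computed percentages come out a point or two away from the ones quoted, which were presumably derived from the unrounded capacities in the full disclosure reports (an assumption).

```python
def utilization_pct(used_tb: float, raw_tb: float) -> float:
    """Return the share of raw capacity actually used, in percent."""
    if raw_tb <= 0:
        raise ValueError("raw_tb must be positive")
    return 100.0 * used_tb / raw_tb


# Rounded capacities as quoted in the comment: (used TB, raw TB).
configs = {
    "USP V": (52, 147),
    "SVC": (46, 110),
}

for name, (used, raw) in configs.items():
    pct = utilization_pct(used, raw)
    print(f"{name}: {pct:.1f}% of raw capacity used, {raw - used} TB left idle")
```

Either way, well under half of the purchased capacity is doing work in both configurations.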

This is what leads to SPC-1 results being used to "prove" highly questionable points in marketing blitzes. This is exactly why I believe the SPC-1 is of dubious merit as a way to compare storage systems.

I have stated this before: I think HDS did the USP V a disservice by allowing it to participate in the SPC-1. There are many customers who probably now think that the SVC and USP V are equivalent, and unless they peel the covers off as in these posts and comments, they are being misled.

Thanks for your comments!
Cheers, dotConnector.
