OSSG shared some interesting ideas about who does the benchmarking, and how to manage these results. The core question here is: are vendors the only ones who can benchmark? Why not allow customers who own the equipment to benchmark and publish their own results?
Perhaps the right approach is to create a support structure for this. A Web 2.0 collaborative portal with a wiki, such as the one from Wikibon, could give guidance on what constitutes a benchmark, how to run it, what the pitfalls are, and how useful the result is in making a storage platform decision. The results can then be commented on and sliced and diced by the world at large and, in true internet anarchy style, converge on the “truth” and “validity” of each result.
I have misgivings about this, although the approach is very intriguing. First, most customers I know don't benchmark storage; they benchmark an entire stack. For example, a customer of mine recently re-platformed (Unix variant change) one of their critical OLTP applications running on a Progress database. Their benchmark was the wall clock time of a particular piece of batch code, which we used as a basis for tuning storage performance. Can this be used as a general purpose result? Likely not.
Another fundamental problem here is constructing a benchmark for storage that can bring an array to its knees without the IO driver infrastructure saturating first. Most customers don't have the capability to use, say, 12 Linux systems with a distributed IO-driving piece of code to really test such large arrays, so they run the risk of saturating the driving server before the array chokes.
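To make the saturation risk concrete, here is a rough back-of-envelope sketch in Python. The function names, the headroom factor and every IOPS figure are hypothetical placeholders, not measurements from any real array or server; the point is only to show the kind of sanity check worth doing before trusting a result.

```python
import math


def drivers_needed(array_iops: float, per_driver_iops: float, headroom: float = 0.7) -> int:
    """Estimate how many load-generating hosts are needed to push an array
    to `array_iops` while keeping each host at or below `headroom` of its
    own ceiling. All inputs are assumptions supplied by the tester."""
    if array_iops <= 0 or per_driver_iops <= 0:
        raise ValueError("IOPS figures must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    return math.ceil(array_iops / (per_driver_iops * headroom))


def bottleneck(array_iops: float, per_driver_iops: float, n_drivers: int) -> str:
    """Report which side gives out first: the driving hosts or the array."""
    driver_ceiling = per_driver_iops * n_drivers
    return "driver" if driver_ceiling < array_iops else "array"


if __name__ == "__main__":
    # Hypothetical figures: an array assumed to top out at 400,000 IOPS,
    # and driver hosts assumed to manage 50,000 IOPS each.
    print(drivers_needed(400_000, 50_000))   # hosts needed with 30% headroom
    print(bottleneck(400_000, 50_000, 4))    # what a 4-host test would really measure
```

If the second call reports "driver", the published number describes the test rig rather than the array, which is exactly the trap a customer-run benchmark has to avoid.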
But there is something appealing about anarchy... and regardless of whether customer benchmarks should be collated, interpreted and governed, a central body of guidelines and definitions would be very helpful.
Thoughts and comments?
Sidebar
Marc Farley asked why the SPC couldn't be modified to satisfy EMC's objections, and whether EMC's objections were religious or technical.
Disclaimer: These are my opinions only, and may not reflect the opinions of EMC as a company.
Well, I don't believe EMC will (re)join the SPC, because SPC-1 set a direction that has proven to be unproductive. The reasons are technical, and well described in earlier posts in this blog. In a nutshell, SPC-1's cache-hostile workload profile reduces its utility to counting the number of spindles in an array, and therefore a) provides no new usable information and b) deliberately "dumbs down" storage arrays and the intelligence inherent in them. It has no discriminating power between offerings from different vendors. EMC left the SPC for this reason, and I suspect (now, this is only my personal opinion) EMC won't join again. So by all means we can discuss modifying the SPC - I'm not holding my breath that it will be the standard benchmark from EMC's point of view. This whole effort is to discuss the creation of a new set of guidelines on how to characterize performance outside the SPC.
Cheers,
.Connector.
SPC is unlikely to change. More to the point, it's run by vendors. Users are in a better position to create benchmarks because aside from technical caveats that would be discovered, it's unlikely that they would try to deliberately show a high or low number (by short stroking, for example).
Wikis are a good first step, but what we need is an organization that will collect and publish benchmark submissions that conform to a certain standard of documentation and repeatability.
I imagine it would work like this:
-Client performs benchmark using real world workload, documents all variables (config, data used, etc), and then submits it for review.
-The vendors and bloggers chew over it to point out any caveats that make the data less useful or general.
-Other people replicate the benchmark to see if they get similar results.
Once all these things have happened, and everyone agrees that under these conditions, this performance is reasonable, the benchmark goes from "submitted for review" to "published", and can be used for comparison.
Posted by: open systems storage guy | November 13, 2008 at 10:13 AM
I don't agree on the appeal of anarchy. It would look more like a presidential campaign than a benchmark.
Posted by: marc farley | November 13, 2008 at 12:10 PM
What if SPC developed a new mixed workload benchmark, say SPC-3 or 4 as Barry Whyte has suggested? I assume religion is a larger factor in this scenario?
Posted by: marc farley | November 13, 2008 at 01:40 PM
My thoughts on anarchy - Wikipedia is anarchy - it has no centralized leader or power center, but it works. So it's not all bad. The regulating force is community self-policing. So I ain't knocking OSSG's idea yet.
It's meaningless for me to comment on a hypothetical SPC-4 and EMC's possible reaction to it. Let's see it and we'll see how to respond to it. I too think SPC is unlikely to change.
Cheers,
.Connector
Posted by: .Connector | November 13, 2008 at 04:51 PM
Analogies to Linux, Wikipedia or any other open collaborative effort aren't very good here as the context for a benchmark is by nature much more competitive. We ought to acknowledge this in our assumptions and expectations.
The "best result" would be for this effort to take on a life of its own - independent of vendor steering. But that is also probably not realistic. I guess that's why I like the university approach suggested by inch in the previous discussion. Not that influence couldn't be applied unevenly there, but there might be ways to make it more transparent.
Posted by: marc farley | November 13, 2008 at 07:33 PM