Hi folks,
Very nice ideas from a lot of people. I'll summarize them in this post, and put forth some of my own. But first, from my dim and distant past...
Everything I need to know I learned in Particle Physics
My soul is that of a physicist. In my younger days, I studied high-energy collisions of elementary particles to understand their internal structure and the interactions between them.
We know of four kinds of interactions: Strong, Weak, Electromagnetic and Gravitational. (Strictly speaking, Electromagnetic and Weak are now unified into the Electroweak interaction.) Elementary particles come in three kinds - quarks, leptons (electrons, for example) and the carriers of the interactions, the gauge bosons (the photon is the best known).
Charting the properties of an unknown entity or structure meant using a probe to see the effects of a particular interaction. For example, if I wanted to probe electromagnetic structure, I'd use an electromagnetic probe, like the electron. If I wanted to probe weak structure, I would use a weak probe, like the neutrino. Now quark structure was tricky, as it could participate in strong, weak, electromagnetic and gravitational interactions.
So the trick to getting a solid composite view of the quark was to use different probes to build out a picture of all the aspects of this beast - not unlike using the input of six blind men to figure out the structure of an elephant. Any single probe would give incomplete information, and the results of all the probes had to be triangulated to get the correct interpretation.
I believe that any approach to storage benchmarking is no different, and has to take into account that a probe can only reveal substructure through its dominant interaction. In the storage world, particles are storage arrays; probes are workloads. So here is my first draft of the first of the Storage Benchmarking Postulates...
Postulate #1: A usable benchmark must subject the storage array to multiple workloads, testing different aspects of response while keeping the physical configuration static.
For example, I think the same physical array configuration should be subjected to application workloads that stress the back end (cache-hostile), the cache (throughput) and the front end (cache-friendly), among others. Then vendor doctoring or optimizing for a single measurement should die out - something optimized for OLTP may not do well against data-warehouse workloads.
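To make that concrete, here is a minimal sketch of how such a set of probes might be described - purely illustrative Python; the block sizes, read/write mixes and working-set ratios are my own placeholder assumptions, not anything that has been agreed:

    # Illustrative workload "probes" against a single fixed array configuration.
    # All numbers are placeholders, not proposed standards.
    WORKLOAD_PROBES = {
        # Cache-hostile: small random reads over a working set far larger than
        # the array cache, so most I/Os must be serviced by the back-end disks.
        "backend_random": {"pattern": "random", "read_pct": 100,
                           "block_kb": 8, "working_set_x_cache": 50},
        # Throughput: large sequential transfers that exercise internal
        # bandwidth and the cache's streaming behaviour.
        "sequential_throughput": {"pattern": "sequential", "read_pct": 70,
                                  "block_kb": 256, "working_set_x_cache": 10},
        # Cache-friendly: a working set small enough to live entirely in cache,
        # so the front-end ports and controllers dominate the result.
        "frontend_cached": {"pattern": "random", "read_pct": 80,
                            "block_kb": 4, "working_set_x_cache": 0.5},
    }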
The figure-of-merit would be a composite of these results for a fixed configuration. The issue with the SPC-1 wasn't so much that thought didn't go into it, but rather that it ended up testing just one of these aspects, leading to skewed interpretations. What should a minimal set of probes be to fully characterize an array?
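One way to fold the per-probe results into a single number - again just a sketch of the idea, not a proposal - is a geometric mean of scores normalized to some reference, so that no single workload can dominate the composite:

    import math

    def composite_figure_of_merit(results, baselines):
        # results and baselines map workload name -> measured value (e.g. IOPS
        # or MB/s, higher is better). The choice of baselines and of the
        # geometric mean is an assumption for illustration only.
        ratios = [results[name] / baselines[name] for name in baselines]
        return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

    # Example: an array that shines on cached I/O but limps on the back end
    # gets pulled down, rather than hiding behind its best single number.
    print(composite_figure_of_merit(
        {"backend_random": 12000, "sequential_throughput": 900, "frontend_cached": 95000},
        {"backend_random": 10000, "sequential_throughput": 1000, "frontend_cached": 50000}))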
I have other postulates in my head, but let me solicit some feedback here first, and see if this is an avenue people think is worth pursuing.
The great ideas from you
As I mentioned earlier, many readers had great suggestions.
The Anarchist and Chuck Hollis commented that size or scale is a critical factor. Storage arrays come in many sizes, and for effective normalization of the results, one should really have small, medium and large (and maybe "Ginormous no-holds-barred") configurations to plan tests around.
BarryB also suggested that NAS be included in this (seconded by OSSG), so even though NFS and CIFS introduce some unique challenges, they do belong in the realm of devices to be tested. OSSG proposed a way to be inclusive - "to allow access to its storage using SCSI disk LUNs" - and skirt the issue of host connection protocol. I like that, and would like to modify my definition of an array to reflect it.
The germ of Postulate #1 was contained in TechEnki's and BarryW's comments as well. They suggested layered tests, starting with a baseline of array properties before moving to higher levels of functionality. So one would test the array's envelope performance for components, layer on application workloads, and then move on to advanced functionality tests running concurrently with the workload. Chuck also wanted the addition of standard failure scenarios and the system's response to them, like drive rebuilds or cache disabling.
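A layered test plan along those lines might look something like the sketch below - the layer names and their contents are my paraphrase of the suggestions above, not an agreed structure:

    # Rough sketch of the layered approach suggested by TechEnki, BarryW and Chuck.
    # The individual tests are illustrative placeholders.
    TEST_LAYERS = [
        {"layer": 1, "name": "component envelope",
         "tests": ["port bandwidth", "single-LUN IOPS", "cache read/write latency"]},
        {"layer": 2, "name": "application workloads",
         "tests": ["cache-hostile random", "sequential throughput", "cache-friendly random"]},
        {"layer": 3, "name": "advanced functionality under load",
         "tests": ["snapshot while workload runs", "replication while workload runs"]},
        {"layer": 4, "name": "failure scenarios under load",
         "tests": ["drive rebuild", "cache disabled", "controller failover"]},
    ]

    for layer in TEST_LAYERS:
        print("Layer %d: %s -> %s" % (layer["layer"], layer["name"], ", ".join(layer["tests"])))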
Chuck pointed out that coming up with "standardized" workloads would be tough (in the spirit of Postulate #1, that would be multiple standardized workloads!), and that making this a suite of tests an end user could run would probably make it even more usable. OSSG also pointed out that "for this benchmark to have any real meaning, any result must be agreed upon by at least two parties, at least one of which must be neutral or a competitor".
These are tough governance issues. Should there be an external body (outside of the vendor community) that governs such a benchmark? Or should this be an open source set of workload drivers with configuration guidelines, plus a results database populated by end-user testers? My inclination is the latter - let our customers tell us how well they think the arrays do.
David Vellante and OSSG offered to support such an effort, perhaps in the form of end-user driven wikis or other collaboration tools. Perhaps vendors can help to create a support structure in the move to return power to the customers.
Please ask others you know to contribute their ideas. I am a big fan of collaborative thinking. The more the merrier!
Ta-ta for now!
Cheers, dotConnector.
I like the idea of having an open standard benchmark that anyone can perform. Since there are so many variables that will need to be measured and there will probably end up being a plethora of benchmarks, having an open standard is the best way to make sure all ideas get considered.
Posted by: Open Systems Guy | November 05, 2007 at 09:08 PM
Hi OSSG,
I say let the community decide on usability. This is an experiment, but I would say that if enough people participate, it will converge. Look at Wikipedia - one would naively think that an edit-by-anyone forum would be pure anarchy (no offence - BarryB!), but au contraire!
A large enough social community seems to reach stable equilibrium pretty well.
Will an open source open usage benchmark suite fare differently? Maybe - maybe not.
I see this discussion leading to a framework for a good benchmark - not necessarily a code base. Where the community takes that remains to be seen.
Let's see if participation and support for this discussion is passionate enough to develop its own inertia. We may end up concluding that the task is too complex and too daunting.
Thanks for the input!
Cheers, .Connector.
Posted by: dotconnector | November 05, 2007 at 10:14 PM
G'day,
Firstly, I would like to say that I'm so glad to finally see a storage blog with no petty "He said she said my array goes faster not your array" crap.
Other bloggers who shall remain nameless, working for various organisations, don't do themselves or the companies they work for any justice with the constant slamming of each other's products.
Anyway, to the topic at hand....
I think it is about time that a conversation started about tools and methodologies for storage performance testing.
I have a bit of experience evaluating various storage technologies and generally have to concoct my own tools on the fly depending on the product.
Most of these tools are a mash of various open source products which are not meant for testing disk arrays, NAS equipment or switches - they are generally made for filesystem or protocol testing.
I'm sure each vendor has their own tools for burning in disk arrays and generating load, so why not open source the lot... :-)
I will also raise my hand along with David and OSSG to assist where I can
Cheers
Posted by: inch | November 06, 2007 at 04:50 AM
No offense taken. As you note, the Wikipedia proves that anarchy can work. And importantly, anarchy isn't about not having any rules, it's about not having a RULER. This was actually the foundation of my blog's title - I'm an activist trying to ensure that in the absence of a ruler in the storage blogosphere (by definition), we all still play by common rules of fairness and judgement.
And I TOTALLY support the community approach to this effort...but as you say, it really depends upon a commitment to participate from a broad enough community. What the effort really needs soon are:
1) participants from other storage vendors (HP, Sun, Hitachi, Dell, 3par, etc.). We even need more from EMC and IBM, given the breadth of our respective product lines.
2) participation from the storage network infrastructure vendors. HBAs, switches and the like. Whether or not we decide these play a significant role, we need to consider the impact of these components on the benchmark configurations and results.
3) Server vendors...and this one will undoubtedly be controversial. But the power of the server driving a workload frequently plays a role in the results. In order to be repeatable, the benchmark is going to have to account for the differences in server platforms. Involving the server vendors may (or may not) help address this aspect more efficiently.
4) Most importantly, we need CUSTOMER participation - and this is critical. In fact, I'd like to see the CUSTOMER definition of what's needed before we go much further. As technologists, we can imagine a plethora of dimensions and angles - but it's possible that many or even most of those are irrelevant to the customer. So getting the voice of the customer involved should be of highest priority.
I'll be recruiting every opportunity I get, but let me know if there's anything else I can do...
Posted by: the storage anarchist | November 06, 2007 at 06:42 AM
Here's a start: http://www.wikibon.org/Benchmark_storage
I have not finished laying out the format, and have to do some real work for a bit ;). I'll take another crack at it tonight, but in the meantime if someone wants to go and start adding content (or copy-pasting .Connector's post if we have permission), then please do :)
You may need to sign up for a wikibon acct to edit, but that's probably a good thing- keeps people accountable, more or less. Also, if you get a "server not responding" error, just try again- it seems to be an intermittent problem. Make sure you copy the text of your edits to notepad or something before clicking submit though...
Posted by: open systems storage guy | November 06, 2007 at 10:03 AM
Inch,
Thanks for your kind words - and I really appreciate your willingness to help.
I second Anarchist's call for more participation. This effort is pointless unless we have strong customer and vendor participation. I too will recruit tirelessly.
Gotta run now - more later.
Cheers, .Connector.
Posted by: dotConnector | November 06, 2007 at 12:44 PM