
November 03, 2007

Comments


the storage anarchist

A couple of things you should probably consider from the start.

First is the notion of the workload test "size." Given that there are rather significant differences in the number of ports, hosts, LUNs, GB of RAM, and drives that various arrays can support, I think you'll have to plan on having more than one "test size." Maybe two or three standard size classes (small, medium, large), each with a specific MAXIMUM number of ports/hosts/RAM, etc. And maybe a Ginormous, no-holds-barred max configuration class.
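To make the idea concrete, here's a rough sketch (Python, purely illustrative) of how such size classes might be parameterized. The class names and every ceiling value below are placeholder assumptions, not numbers anyone here has proposed:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SizeClass:
    name: str
    max_ports: int      # front-end ports the array may expose
    max_hosts: int      # hosts allowed to drive the workload
    max_cache_gb: int   # GB of array RAM/cache
    max_drives: int     # back-end drives

# Illustrative ceilings only; a real standard would negotiate these values.
SIZE_CLASSES = [
    SizeClass("small",  max_ports=4,  max_hosts=4,  max_cache_gb=16,  max_drives=48),
    SizeClass("medium", max_ports=16, max_hosts=16, max_cache_gb=128, max_drives=240),
    SizeClass("large",  max_ports=64, max_hosts=64, max_cache_gb=512, max_drives=1024),
    # "Ginormous": no ceilings at all, whatever the vendor can field.
]

def classify(ports: int, hosts: int, cache_gb: int, drives: int) -> str:
    """Return the smallest class whose ceilings the tested configuration fits under."""
    for c in SIZE_CLASSES:
        if (ports <= c.max_ports and hosts <= c.max_hosts
                and cache_gb <= c.max_cache_gb and drives <= c.max_drives):
            return c.name
    return "ginormous"

print(classify(ports=8, hosts=10, cache_gb=64, drives=120))  # -> "medium"
```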

Second, regarding your restricting this to block protocols only: clearly, that would include iSCSI (and other block-over-network protocols). But you should also consider that NFS/CIFS are characteristically different protocols that should be included in the test domain. That is, a Celerra can front-end a Symmetrix, and NetApp supports a similar configuration. Running standard database or file applications over CIFS/NFS undoubtedly puts a different strain/stress on a storage array than running those same applications directly over block FC. Thus, when you get around to discussing the workloads you want to model, I think you should include NFS/CIFS-based front-end servers as one of the host types.

Looking forward to the discussion...

David Vellante

Connector,

Thanks for taking the initiative to write up your thoughts on this. I see your proposed effort as having real potential, and the notion of extending the discourse to customers is fundamental to that. For our part, the Wikibon community is committed to working with folks like you to advance the metrics and standards by which customers can obtain useful decision-making tools. Our users are asking for this, and we'll commit to recruiting them into the process, hosting meetups/telebriefings on the topic, moderating discussions, writing about this effort, making the press aware of it so more users will be informed, and lending a helping hand wherever it makes sense.

dotConnector

Hey Barry,
Both valid points.

Having multiple "scales" for tests makes a lot of sense. It keeps the smaller guys from being crushed by the highly scalable large systems, so having the McDonald's model of small, medium, large, and Biggie should help delineate the tested systems.

On the NFS/CIFS attach: I struggled with whether or not to include it, for exactly the reasons you stated. The network attach means that two distinct pieces have to be specified (the access to the storage and the access to the network), which would increase the complexity of such a benchmark. My inclination was to have a separate set of metrics for that, as there is significant intertwining with other standards like SPECnfs, and variables like network topology and switching gear.

But, I agree, NFS/CIFS have a definite place alongside iSCSI in a comprehensive benchmark. We should discuss if a common framework works for both.

Cheers,
dotConnector.

dotConnector

Hi Dave,
Thank you very much for your support and offer to assist. As you did, BarryB also pointed out in his recent post (http://storageararchist.typepad.com) that the real question is whether such testing and certification should belong in the end-user domain, as opposed to industry consortiums. We, as vendors, should be doing what the customer feels they need.

Building out a grass-roots, user-driven, externalized framework for performance measurement, using Web 2.0 collaborative technologies like blogs and wikis, is, IMHO, the lasting approach. And even though some of it may succumb to the tyranny of the majority, it also offers a distinct voice to everyone, and hopefully the convergence will be rapid and meaningful.

In my day job, I have the privilege of working with hundreds of customers at many scales, and I can collect a lot of good ideas by word of mouth. I will pass the word around and get them to participate. I'd rather have this be a customer-driven vendor-facilitated effort than a vendor-driven effort.

I find efforts like Wikibon gratifying for that reason, and will definitely collaborate in making that successful as well.

Thanks,
Cheers, dotConnector.

Barry Whyte

I'd certainly be happy to get involved. The undertaking is not small, but the end result could be a wonderful tool for end users and vendors alike.

The benchmark should include specific tests for today's feature-rich products, covering snapshots, replication, and data migration and their impact on the 'background' processing, as well as running nightly backup/restore processing, etc.

I could imagine something that has a base 'background' workload and then a series of delta-like benchmarks to run on top.
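Something along these lines, perhaps: a minimal Python sketch of the base-plus-delta idea. The feature list and the simulated numbers are placeholder assumptions of mine, not measurements from any real array:

```python
from typing import Dict, List

def measure_iops(active_features: List[str]) -> float:
    """Stand-in for the real load generator; returns a simulated IOPS figure."""
    iops = 50_000.0  # hypothetical baseline
    penalty = {"snapshot": 0.05, "replication": 0.15,
               "data_migration": 0.10, "backup_restore": 0.20}
    for f in active_features:
        iops *= 1.0 - penalty.get(f, 0.0)  # assumed per-feature overhead
    return iops

def run_suite(deltas: List[str]) -> Dict[str, float]:
    """Measure the bare background workload, then re-measure with each delta on top."""
    results = {"baseline": measure_iops([])}
    for feature in deltas:
        results[feature] = measure_iops([feature])
    return results

if __name__ == "__main__":
    for name, iops in run_suite(["snapshot", "replication",
                                 "data_migration", "backup_restore"]).items():
        print(f"{name:>15}: {iops:,.0f} IOPS")
```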

I like the idea.

dotConnector

BarryW,
Wonderful! Let's put aside our competitive postures for a while, and I believe something very useful for our customers could emerge.
Thanks for the support!
Cheers, K.

TechEnki

The complexities of benchmarking mean that the best approach is probably a suite of benchmarks broken into several classes. The classes should be tackled easiest first; for example, why worry about different front-end host types if you can't create a valid back-end test?

The first would be storage-vendor-centric (e.g., a cache-hostile test like SPC would be one, a cache-friendly test would be another, etc.).

The second class would be usage model driven. Ideally we would recruit the application vendors to provide some of these components.

The third would cover special features such as snapshots or different front-end host types.
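To illustrate, here's a tiny sketch of how those three classes might hang together as a layered suite. All class and test names are placeholders of my own, not a proposed specification:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class BenchmarkClass:
    name: str
    tests: List[str] = field(default_factory=list)

# Illustrative layering: array-centric first, then usage models, then features.
SUITE = [
    BenchmarkClass("array-centric",      ["cache_hostile_random", "cache_friendly_sequential"]),
    BenchmarkClass("usage-model-driven", ["oltp_database", "mail_server", "file_serving"]),
    BenchmarkClass("special-features",   ["snapshot_overhead", "nfs_cifs_frontend"]),
]

def run_order() -> List[str]:
    """Tackle the easiest class first; later classes assume the earlier ones pass."""
    return [f"{c.name}/{t}" for c in SUITE for t in c.tests]

print(run_order())
```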

Our friends in the consumer PC space have been using suites for years; admittedly, their problems are simpler.

Example of hard drive suite: http://www.tomshardware.com/2007/11/05/the_terabyte_battle/

From this simple test series, I can see what different products are optimized for (bandwidth, access time, or power consumption) and how those trade-offs play out in different scenarios (database simulation, Windows startup). Benchmark users can gain useful information about what to expect and what the trade-offs are, which is hard to do with SPC.

dotConnector

TechEnki,
Excellent post! I like the idea of layered benchmarks (akin to the baseline+delta idea from BarryW).

Between your suggestions and the others (BarryB and BarryW), I believe there is a seed for a substantial, solid approach to develop.

Let me summarize your contributions, and put forth a consolidated approach in my next blog post, and we can build on that.

Thanks for your support!
Cheers, dotConnector.

open systems storage guy

This is great :) I hope that by putting competitive doublespeak aside for long enough to get an agreed-upon benchmark, a real service will be done for the companies that use storage products.

I have a few things I'd like to start with. First, and most importantly, I believe that for this benchmark to have any real meaning, any result must be agreed upon by at least two parties, at least one of which must be neutral or a competitor.

Second, I think the definition of a storage array should simply state that it must either be able to allow access to its storage using SCSI disk LUNs, or be compatible with most standard email and database packages. This would simplify the definition and avoid the whole question of NFS and iSCSI.

Thirdly, why don't we start a Wikipedia-style wiki? It's the easiest way to allow collaboration, and it also has tools that allow us to police for vandalism and enforce a neutral point of view.

I'll volunteer to get the thing off the ground, but if anyone's interested in helping, please email me at opensystemsguy@gmail.com. I can provide the hosting and bandwidth; however, I could use a hand with the wiki software, as I've never used it before.

Chuck Hollis

Good discussion.

I agree with the folks who want different "sizes" of tests. Based on experience, the scaling factors are relevant.

I also like the idea of a skilled practitioner being able to run the test themselves. Results would not be as comparable, but I believe it would ultimately generate more value for the industry and customers.

The challenge will be in coming up with a "standardized" multi-threaded workload.

I think one of the central arguments here is that "storage arrays" have to be able to handle multiple flavors of workload simultaneously, and cope with rapid changes.

Similarly, there would ideally be a notion of "standardized" failure scenarios, e.g. drive bad, link fails, cache card fails, etc.

Perhaps we could combine the thoughts as follows:

"Small" workloads do not have much variability in either performance characteristics, or envisioned failure characteristics.

"Medium" has a bit more. And "large" has a very wide range of dynamic behavior in terms of workload and envisioned failure modes.

Man, I hope this goes somewhere. Having to respond to the umpteenth person as to why SPCs are irrelevant (and actually harmful) is getting tiresome.

dotConnector

Chuck, OSSG,
Thanks for the input. Scale is obviously something we have to consider in this effort.

The idea of workload generation is thorny, and I agree it will be very challenging. But I have some ideas that I can throw out in my next post.

A wiki for this would be very useful. In fact, Dave Vellante has his Wikibon community and has already indicated that he would try to help as much as he can. Dave, I don't want to put you on the spot, but is hosting a wiki under Wikibon viable?

Cheers, Kartik.

Barry Whyte

OK, so Mr. Burke has commented that nobody wants to join in. Where are you then, Dr...? I'm ready, willing, and happy to join the merry band, but it's going to take more than just us. Maybe you guys need to join back with the SPC, and together we can create an SPC-4 (EMC-happy) benchmark...
