
November 12, 2008

Comments


Craig Randall

Welcome back! Like you, I'm also of the school of pragmatic thinking; so, I appreciate your discussion of what's "not right" and making it "right" instead of throwing the baby out with the bathwater. But I'm just a content management guy. Cheers...

.Connector

Thanks, Craig! Welcome to the party... I'm a closet Content Management aficionado myself, so I won't hold it against you!

Cheers.

open systems storage guy

Welcome back!

My main question would be how the benchmarks are administered. Ideally, the organization that does this would be self-sufficient, so that it doesn't rely on any vendor to fund it.

Also, I would like there to be a way for companies that own equipment to publicly release their own benchmarking numbers, and to try to duplicate others' benchmarks. They would have to follow a specific format for disclosure that would ensure that their results could be duplicated.
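To make that concrete, a disclosure format might need to capture at least something like the following. This is a rough sketch in Python purely for illustration; the field names are my own assumptions, not any existing SPC or vendor standard:

```python
# Illustrative sketch only -- these fields are my own guesses at what a
# "full disclosure" record might capture, not any existing format.
from dataclasses import dataclass

@dataclass
class BenchmarkDisclosure:
    array_model: str           # vendor and model of the storage system under test
    firmware_version: str
    drive_count: int
    drive_type: str            # e.g. "15K FC" or "7.2K SATA"
    raid_layout: str           # e.g. "RAID 5 (7+1)"
    cache_gb: int
    host_count: int            # servers used to generate the workload
    hba_model: str
    queue_depth_per_host: int
    workload_tool: str         # load generator name and version
    workload_profile: str      # e.g. "8 KiB, 70/30 read/write, 100% random"
    raw_results_url: str       # link to unedited output so others can reproduce it
```

The point is simply that every knob that affects the result - layout, cache, host horsepower, workload definition - gets written down, so someone else can set up the same test.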

.Connector

Hey OSSG,
Good to be back!

Good points - but does there have to be a governing body? If the tests are such that anyone can run them, vendors can keep each other on the straight and narrow by cross-checking the results independently, right?

This is time- and resource-intensive, and may be difficult to implement, so perhaps there does need to be a non-vendor-affiliated body like the SPC that can verify vendor claims. The question then is: who funds them?

Good points - worth getting more input from others.

Cheers.

open systems storage guy

There needs to be some sort of body that will at the least release guidelines about what information to disclose, as well as keep a record of disclosed benchmarks. Organizationally, it would be very light, however.

.Connector

OSSG,
Agreed! A record keeper, and someone to actually develop and maintain the benchmark code. Rather like the SPC today.

I'm OK conceptually with customers publishing their own benchmarks on equipment they own, but the disclosure of the configurations has to be very detailed or it's difficult to evaluate the results.

I have had several customers who ran single-threaded, 100% random read benchmarks through file systems and saturated the backplane of the rather weak server they used to generate the workload - concluding that they had actually saturated the storage.

This is a common pitfall - the workload generators and server hardware need some serious oomph to ensure that the measurement is really of the storage system.

This is especially true for modern arrays, whose processing power is immense compared to most commodity servers. There is an art to constructing such punishing workloads...
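To illustrate the kind of thing I mean, here is a minimal sketch of a driver that issues direct, page-aligned random reads from many threads at once, so the host page cache can't absorb the workload. It assumes Linux and Python 3.7+; the device path, block size and thread count are illustrative, and a real load generator would add asynchronous queue depth, mixed block sizes and writes:

```python
# Minimal sketch of a multi-threaded random-read driver (assumptions: Linux,
# Python 3.7+, a scratch block device at DEVICE with no live data on it).
import os
import mmap
import random
import time
from concurrent.futures import ThreadPoolExecutor

DEVICE = "/dev/sdX"   # hypothetical test LUN -- never point this at live data
BLOCK = 4096          # 4 KiB aligned reads, as required by O_DIRECT
THREADS = 16          # raise until the array, not the host, is the bottleneck
SECONDS = 60

def worker(dev_size: int) -> int:
    """Issue synchronous 4 KiB random reads for SECONDS; return the I/O count."""
    fd = os.open(DEVICE, os.O_RDONLY | os.O_DIRECT)
    buf = mmap.mmap(-1, BLOCK)                 # page-aligned buffer for O_DIRECT
    ios, deadline = 0, time.time() + SECONDS
    try:
        while time.time() < deadline:
            offset = random.randrange(dev_size // BLOCK) * BLOCK
            os.preadv(fd, [buf], offset)       # blocking read; releases the GIL
            ios += 1
    finally:
        os.close(fd)
    return ios

if __name__ == "__main__":
    fd = os.open(DEVICE, os.O_RDONLY)
    size = os.lseek(fd, 0, os.SEEK_END)        # device capacity in bytes
    os.close(fd)
    with ThreadPoolExecutor(max_workers=THREADS) as pool:
        total = sum(pool.map(worker, [size] * THREADS))
    print(f"{total / SECONDS:.0f} read IOPS across {THREADS} threads")
```

Even then, you have to watch the host's own CPU, HBA and bus utilization while the test runs - if any of those are pegged, you are benchmarking the server, not the array.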

Cheers.

open systems storage guy

Not "develop and maintain the benchmarks code" per se- more offer guides on how to make a benchmark relevant to the question you're trying to answer.

Mostly, they would be responsible for indexing benchmark results and filtering out results that are unverifiable or unrepeatable.

As for badly designed benchmarks, I imagine there would be a community of storage people (vendors, bloggers, users, etc.) who would be diverse enough to call out nonsense when they see it. So long as the organization can ensure that a single-server, 100% read-miss test is clearly documented as such, someone would call bull-hockey.

Storagezilla

I'm already sick of you lording your superior intellect over me and it's only been one post.

Back to sleep before I have you put to sleep. ;-)

.Connector

OSSG,
So are you suggesting that there should be no standard set of benchmarks, but rather, guidelines on how to construct one to answer the question at hand?

And 'Zilla,
Always good to hear from you! And you sorely overestimate my intellectual capabilities..... mongo pawn in game of life!

Cheers.

open systems storage guy

Yes- I believe any benchmarking system should support (and moderate) user submissions. The more data, the better. Also, this means that people benchmarking their gear will get proper guidance and feedback about configurations, which should help reduce the amount of general misconfiguration and ignorance.

.Connector

OSSG,
Fascinating idea! I am warming up to the approach... however, I worry that it will take a long time to separate the wheat from the chaff - and there will be a LOT of chaff in the beginning.

And at some point, someone will have to provide a good IO driver and a way to construct specific workloads. But I think it's an idea that we should discuss further in this forum - the possibilities are very intriguing.

Let me do a post on this to attract opinions.

Cheers.

open systems storage guy

Remember - it's far easier to call BS on a benchmark than it is to post one. Also, there will be more people interested in calling BS on a submission than there will be people wanting it to be repeatable and verifiable.

Lastly, no benchmark will be anonymous. If someone continually provides garbage data with the aim of manipulating opinion, their voice will eventually stop being heeded.

It's the basic concept of the wisdom of the crowds. Doesn't work for everything, but should work for this :)

marc farley

I don't think it makes sense to have a benchmark "free for all" that depends on collective knowledge to sort out the results. That's just asking for a total mess.

Please explain why we can't consider creating a new SPC benchmark that addresses EMC's objections. Also, are those objections based on religious or technical grounds? If the problem is religion, I'd say "get over it". If it's technical, then it's worth continuing the discussion.

Martin G

I think there are some good technical reasons for continuing this discussion. EMC are right in that customers run mixed workloads with hundreds of servers attached to enterprise arrays; so perhaps there needs to be a benchmark which reflects this.

For example, I have an application for which, taken in isolation, the best array would probably be a CX, but I can't really afford to silo applications like that. So benchmarks which reflect multiple-application occupancy would be very useful for me. Complicated for you guys to create tho'...

open systems storage guy

@marc
It's a much smaller collaborative effort than, say, building an operating system, and Linux is hardly a "mess".

SPC is not going to change - they've made that clear. The reality is that benchmarks should not be managed by vendors - they all have too much at stake. The process should include them, but be driven by buyers.

marc farley

I agree that a large scale mixed workload benchmark would be valuable.

inch

Interesting idea.

However, I'm sure there would be a lot of upset vendors and CTOs when some customer came out and said, "holy crap, the storage device we bought for $1mil runs like a pig".

I'm really not too sure it would be a truly honest picture - I don't think the storage guy would say he has a lemon. Would that vendor give him such a good discount on other products in the future? Will the CTO ever listen to him again and agree to pay for any new kit?

What are people's thoughts on a neutral third party like a University performing the benchmarks? A university has cheap labour (students), people who can write good documents (PhD folk), an openness to learning (it's a Uni!) and no real vendor bias.

.Connector

Hi Inch,
Good to hear back from you!

About your university thread... What's in it for the University? Writing the IO driver and the benchmarks could yield a graduate thesis project, I suppose... but on a sustained basis?

marc farley

@OSSG, why the skepticism about SPC's unwillingness to do things differently? Everybody has to deal with change. Also, there is a big difference between the development of Linux and this effort. Yes, Linux was/is a much larger effort, but less divisive than this is likely to be. :)

Many storage buyers depend heavily on their vendor's professional services for installation, tuning, change management and most other life-cycle events. A vendor hands-off approach is not realistic - although my co-workers at 3PAR would probably really like to see THAT sort of benchmark.

Along those lines, we'd like to ensure that the effort to tune and prep storage for the benchmark is well documented. That alone can be a significant task for some products and an unexpected burden to buyers wanting to participate.

There is no question buyers will have a keen interest in seeing their application mix modeled and benchmarked. They are also in the best position to know the challenges their load mixes give them.

That said, Inch's idea to work at the university level is very interesting. Dr. Kartik's concern about project longevity is spot on, but perhaps there is an opportunity to establish something similar to what UNH does in the networking world?
