[erlang-questions] benchmarks game harsh criticism (was Learning Erlang from the scratch)

Bengt Kleberg bengt.kleberg@REDACTED
Fri Nov 23 13:26:33 CET 2007


greetings,

this is seriously off topic for erlang-questions, so i would recommend 
each and every one of you to stop reading now.


kvw(*) is not a benchmark report. it is a paper about experiments on how 
to do benchmarks. they tested some ideas and reached a few general 
principles. the one pertinent to this discussion says "Memory-related 
issues and the effects of memory hierarchies are pervasive: how memory 
is managed, from hardware caches to garbage collection, can change 
runtimes dramatically". to see this it is necessary to vary the input to 
such an extent as to find the dramatic runtime changes. the shootout 
does not do this.
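
as a minimal sketch of what "vary the input" means in practice (this is not code from kvw or from the shootout; the function names and the workload are hypothetical and chosen only for illustration), one can time the same workload over input sizes spanning several orders of magnitude and look at the cost per element. if memory effects matter, the per-element cost will typically not stay flat across the range.

```python
import time


def time_per_element(workload, sizes, repeats=3):
    """Return {size: best observed seconds per element} for workload(size)."""
    results = {}
    for n in sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            workload(n)
            elapsed = time.perf_counter() - start
            best = min(best, elapsed)  # keep the least-disturbed run
        results[n] = best / n
    return results


def sum_list(n):
    # Hypothetical workload: allocate a list of n integers and sum it.
    return sum(list(range(n)))


if __name__ == "__main__":
    # Sizes from 10^2 to 10^6: five orders of magnitude, not 3 or 4 nearby values.
    sizes = [10 ** k for k in range(2, 7)]
    for n, per_elem in time_per_element(sum_list, sizes).items():
        print(f"{n:>8} elements: {per_elem * 1e9:8.1f} ns/element")
```

the numbers this prints depend entirely on the machine and the runtime, so none are shown here; the point is only the shape of the experiment.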

isaac gouy has previously stated that the shootout is not, and shall not 
be, about the kind of wide spectrum of inputs that kvw recommends 
investigating. now he is instead saying that the shootout is better than 
what kvw recommends for this kind of investigation.
this time he is wrong.


bengt
Those were the days...
    EPO guidelines 1978: "If the contribution to the known art resides
    solely in a computer program then the subject matter is not
    patentable in whatever manner it may be presented in the claims."


On 11/18/07 21:59, Isaac Gouy wrote:
> On 2007-09-3 Bengt Kleberg wrote:
>> my main problem with the alioth shootout is that it has thrown away 
>> one of the main ideas/insights from the paper(*) that was the
>> inspiration for the original shootout. namely that it is very
>> important to look at how the timing changes with the size of the
>> input. the alioth shootout takes only 3 very similar size values.
>> to make things worse these 3 values must give results for the major
>> languages (no timeout, etc).
> 
>> (*)Timing Trials, or, the Trials of Timing, 
>> http://cm.bell-labs.com/cm/cs/who/bwk/interps/pap.html
> 
> 
>>> On 2007-08-31 20:54, Michael Campbell wrote:
> -snip-
>>> Be careful with that.  Alioth's shootouts are for how quickly a
>>> language can run a particular *algorithm*, which can at times be
>>> VERY DIFFERENT from how you would normally do it in that language.
>>>
>>> So some of the code on that will be weirdly contorted to fit the
>>> particular algorithm, rather than what the prevailing idiom is for
>>> that language.
>>>
>>> A somewhat more harsh criticism can be found here:
>>>
> http://yarivsblog.com/articles/2006/07/11/erlang-yaws-vs-ruby-on-rails/#comment-70
> 
> 
> 
> My apologies for digging up this 2-month-old comment, but sometimes I'm
> just taken aback by criticism of the benchmarks game. I'm well aware of
> my limitations and value informed criticism - to a great extent we rely
> on others to notice our mistakes and suggest improvements and
> alternatives. 
> 
> Sometimes the criticism misleads - I never know if that's the
> intention.
> 
> 
> According to Bengt Kleberg "the alioth shootout takes only 3 very
> similar size values" which "has thrown away one of the main
> ideas/insights" of 'Timing Trials, or, the Trials of Timing'.
> 
> Can you guess how many different input sizes were used for "Timing
> Trials, or, the Trials of Timing"? Do you guess 20? Do you guess 10?
> 
> No. The comparisons in "Timing Trials, or, the Trials of Timing" were
> based on just 4 input values! The Benchmarks Game has slipped from the
> insightful 4 to the miserable 3 :-)
> 
> As for "very similar size values" the timing range for different input
> values is ~10x to ~100x, in comparison to mostly < 10x in "Timing
> Trials, or, the Trials of Timing".
> 
> 
> 
> Michael Campbell points to Yariv's Blog, and I guess to Austin
> Ziegler's  comment. The specific problem he raises - "... must be set
> at the user’s shell [ulimit]. They do not do this and report that the
> Ruby program doesn’t run" - was raised a year earlier on the Ruby
> mailing-list. 
> 
> That problem was fixed by November 2005, 9 months before Austin Ziegler
> ranted on Yariv's Blog - his repeated "harsh criticism" had been untrue
> for 9 months and by then the ackermann benchmark he complains about had
> been replaced.
> 
> 
> 
> Rather than a general warning about weirdly contorted code, wouldn't it
> be more helpful to say which of the Erlang programs you think are
> weirdly contorted?
> 
> 


