Benchmark guidelines (kickstart) - Task 005
Through a combination of discussions with researchers, the SHAREing performance assessment seminar, and a review of the benchmarking literature, this task has collated a proposed set of guidelines for researchers and developers designing their own benchmarks. Here we list a condensed set of recommendations. We believe that a benchmark should:
- capture several components of the mathematical or algorithmic features of the software
- be representative of a typical job run, e.g., including realistic I/O calls
- be configurable by the performance analyst, i.e., the example job should be easily scalable and should allow features such as I/O to be switched on or off (see the sketch after this list)
- include documentation that explains how to build and run the benchmark, including details of compilers, libraries, etc. and their versions
- not be too small, as a very small benchmark behaves more like a synthetic benchmark. It is usually easier for a performance analyst to reduce the size of a benchmark than to increase it in any meaningful way.

For the full document discussing these guidelines further, please see the document linked below.
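To illustrate the configurability recommendation above, the following is a minimal sketch of a benchmark driver whose problem size and I/O behaviour can be controlled by the performance analyst. It is not taken from the guidelines document: the script, flag names, and the dense matrix-product workload are illustrative assumptions only, standing in for whatever kernel and output step a real benchmark would exercise.

```python
#!/usr/bin/env python3
"""Illustrative (hypothetical) benchmark driver: the problem size can be
scaled and the representative I/O step switched on or off from the command
line, as the guidelines recommend."""
import argparse
import time

import numpy as np


def compute_kernel(n):
    # Stand-in for the mathematical/algorithmic core of the real software:
    # here, a dense n x n matrix-matrix product.
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    return a @ b


def write_results(result, path):
    # Representative I/O call, kept separate so it can be disabled cleanly.
    np.save(path, result)


def main():
    parser = argparse.ArgumentParser(description="Configurable benchmark sketch")
    parser.add_argument("--size", type=int, default=1024,
                        help="problem size (scales the job up or down)")
    parser.add_argument("--io", action="store_true",
                        help="enable the representative output step")
    parser.add_argument("--output", default="result.npy",
                        help="where to write results when --io is set")
    args = parser.parse_args()

    start = time.perf_counter()
    result = compute_kernel(args.size)
    compute_time = time.perf_counter() - start

    io_time = 0.0
    if args.io:
        start = time.perf_counter()
        write_results(result, args.output)
        io_time = time.perf_counter() - start

    print(f"compute: {compute_time:.3f} s, io: {io_time:.3f} s")


if __name__ == "__main__":
    main()
```

A run such as `python benchmark.py --size 4096 --io` scales the job up with realistic output, while `python benchmark.py --size 512` isolates the compute kernel, the kind of control the guidelines ask benchmark authors to expose.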
Outcomes
This is a first iteration of the SHAREing benchmarking guidelines; suggestions are highly encouraged!
- Document outlining the proposed SHAREing benchmark guidelines
- Attachment: SHAREing proposed benchmark guidelines LaTeX
Fit to programme
This task has been identified by the working groups as part of the agenda behind WP 1.2.
The task number is 005.
Description
SHAREing is working toward a generic and simple performance assessment service. To do so, the project has to curate guidelines. The objective of this task is to kickstart the curation of a benchmarking guide that SHAREing can continue to iterate on and improve. These guidelines describe a proposed set of characteristics for a benchmark. In brief, this task posits that a benchmark should accurately capture the features of the research software, whilst also being configurable enough to support a rigorous analysis.