Analysis Effort #
This section describes which metrics and procedures should be reported when performing data analysis. It also proposes a standardised way of reporting them, so that data are easier to compare and share among projects.
The aim is to define the descriptions that should appear in the methods section of a paper, so that research can be reproduced, compared and improved. It is also important to justify each choice as far as possible, either with citations from similar work or with the ecological reasons behind the decision.
Effort #
How much data was analysed
Unit: Bytes
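One way to obtain this figure is to sum the on-disk size of the files that were analysed. The following is a minimal sketch, assuming the recordings sit under a single directory; the function name `total_bytes_analysed` and the extension list are illustrative assumptions, not part of this guideline.

```python
from pathlib import Path

# Hypothetical set of audio extensions; adjust to the formats in your project.
AUDIO_EXTENSIONS = {".wav", ".flac", ".mp3"}


def total_bytes_analysed(root, extensions=AUDIO_EXTENSIONS):
    """Sum the on-disk size, in bytes, of all matching files under `root`."""
    root = Path(root)
    return sum(
        path.stat().st_size
        for path in root.rglob("*")          # walk sub-directories too
        if path.is_file() and path.suffix.lower() in extensions
    )
```

Calling `total_bytes_analysed("recordings")` on a hypothetical `recordings` folder would return the value to report here. Note that compressed formats report compressed size, so it may help to state the file format alongside the number.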
How was the data analysed? #
Description of how the data was analysed
Example:
- Data was manually annotated
- Acoustic indices were calculated
- Species presence/absence data was collected
- Minutes were clustered, etc.
Note: for more information on annotations see this link
Tool #
Which software, version and package were used to analyse the data? Provide all the information required to make the research reproducible.
Example:
- Raven was used for annotations;
- AnalysisPrograms.exe was used for generating indices;
- Audacity was used for detecting species presence/absence;
- Kaleidoscope was used for clustering and Raven was used for species identification in each cluster.
How much effort was put into it? #
How long did it take to analyse the reported amount of data, in the way described, using that tool?
Ideally, this information should be as precise as possible. However, we understand that a precise figure might not always be available. Even a rough estimate, or only the computational effort, helps to compare methods and informs the development of better tools.
Example:
- X bytes of data were manually annotated in Y hours of effort.
- U bytes of data were analysed using AnalysisPrograms.exe to calculate acoustic indices. On a computer with ABC specifications, the analysis took D hours to run.
- F hours were necessary to go through E bytes of files and collect bird species presence/absence information.
- Clustering of Q bytes took P minutes, with N clusters being generated. Identification of those clusters took O hours.
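For the computational part of the effort, the run time and basic machine details can be captured automatically. The sketch below is one possible approach, assuming the analysis can be wrapped in a Python callable; `timed_analysis` and the keys of the returned record are hypothetical names chosen for illustration.

```python
import os
import platform
import time


def timed_analysis(analyse, data_bytes):
    """Run `analyse()` and return a small effort record for the methods section."""
    start = time.perf_counter()
    analyse()
    elapsed_s = time.perf_counter() - start
    return {
        "data_bytes": data_bytes,
        "elapsed_hours": elapsed_s / 3600,
        # Throughput makes runs on different data volumes comparable.
        "bytes_per_second": data_bytes / elapsed_s if elapsed_s > 0 else None,
        # Coarse machine description; add RAM, GPU, etc. by hand if relevant.
        "system": platform.system(),
        "machine": platform.machine(),
        "cpu_count": os.cpu_count(),
        "python_version": platform.python_version(),
    }
```

Manual effort (annotation, cluster identification) cannot be measured this way and still needs to be logged by the analyst.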
What was the analysis protocol? #
Was a protocol used to analyse the data in a systematic way? In this section it is important to justify every choice, either with the ecological/biological reason behind it or with a relevant citation that used a similar method to answer related questions.
Example:
- Species were identified in the first minute of every hour;
- Indices were calculated for every minute of recording;
- Species presence was collected by analysing 5 minutes every 10 minutes;
- 60 initial clusters were generated and the first 5 minutes of each one were identified.
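Subsampling protocols like those in the example can be written down unambiguously as the list of minute offsets that were analysed. The following is a minimal sketch; the function names are hypothetical, and it assumes that "5 minutes every 10 minutes" means the first 5 minutes of each 10-minute block, which a real protocol should state explicitly.

```python
def first_minute_of_every_hour(total_minutes):
    """0-based minute offsets of the first minute of each hour."""
    return list(range(0, total_minutes, 60))


def minutes_in_window(total_minutes, window=5, period=10):
    """Offsets covering the first `window` minutes of every `period` minutes."""
    return [m for m in range(total_minutes) if m % period < window]


print(first_minute_of_every_hour(180))  # [0, 60, 120]
print(minutes_in_window(20))            # [0, 1, 2, 3, 4, 10, 11, 12, 13, 14]
```

Publishing the selected offsets, or the code that produced them, alongside the justification lets others repeat exactly the same sampling.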
Who performed the analysis #
Provide information on the person who performed the analysis (ideally ORCID).
Data repository #
If the data or any step of the analysis is available in a repository, provide a link/DOI for access.
Any additional information that was not covered in this document is valuable, as it could improve the usability of the data in different contexts. Information on techniques that were tested and did not work is also extremely valuable, along with any information or hypothesis on how and why they failed. A lot of time could be saved if we shared not only the success stories but also the “fiascos”.
Checklist #
- Analysis effort
- Data analysis description
- Information on software/method used
- How much effort was put into analysis
- Analysis protocol
  - Description
  - Ecological/biological justification
  - Important references/citations
- Information on who performed the analysis
- Data repository information
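The checklist can also be kept as a small machine-readable record, which makes it easy to spot missing items and to share the information with the data. This is only a sketch of one possible layout: the class `AnalysisReport`, its field names and the helper `missing_items` are assumptions made for illustration, not a schema defined by this guideline.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AnalysisReport:
    data_bytes: int                 # how much data was analysed
    description: str                # how the data was analysed
    tools: list                     # e.g. [{"name": "...", "version": "..."}]
    effort_hours: float             # manual and/or computational effort
    protocol: str                   # what was done, systematically
    justification: str              # ecological/biological reason
    references: list = field(default_factory=list)
    analyst_orcid: str = ""
    repository: str = ""            # link/DOI, if available


def missing_items(report):
    """Return the names of checklist fields that were left empty."""
    return [name for name, value in asdict(report).items()
            if value in ("", [], None)]


def to_json(report):
    """Serialise the report so it can be stored next to the data."""
    return json.dumps(asdict(report), indent=2)
```

Running `missing_items` on a report before submission gives a quick reminder of which checklist entries still need to be filled in.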