Wrangling Sound Files #

Dr Anthony Truskinger, QUT Ecoacoustics


With enough data, every possible thing that can go wrong will go wrong.

Topics #

  1. Storing data
  2. Repairing data
  3. Segmenting files

Storage #

Scheduling #

  • DO: what your experiment design needs
  • PREFER: longer recordings
  • AVOID: contiguous short recordings
    • e.g. recording consecutive minute-long samples is bad

Many small files seriously affect the performance of file systems.
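
To make the cost of short recordings concrete, here is a small sketch that counts how many files a continuous deployment produces for a given recording length. The 30-day deployment and the function name are illustrative assumptions, not figures from the talk.

```shell
# Number of files produced by continuous recording.
# Usage: files_for_deployment <days> <seconds_per_file>
files_for_deployment() (
  days=$1
  seconds_per_file=$2
  total_seconds=$((days * 86400))
  # Round up so a trailing partial recording still counts as a file.
  echo $(( (total_seconds + seconds_per_file - 1) / seconds_per_file ))
)

files_for_deployment 30 60      # one-minute files: prints 43200
files_for_deployment 30 3600    # one-hour files:   prints 720
```

The same month of audio is 60 times more files when recorded in one-minute pieces.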

Storing data #

  • DO: Use hard drives
    • portable HDDs are a good choice
    • DO NOT: rely on unpowered SSDs; they can lose data when left without power for a long time
  • DO: Have a backup
    • Try to have at least one backup off site
  • DO: Upload data to an external repository
  • AVOID: ‘Box’-like file-sync services for audio
    • e.g. OneDrive, Dropbox, Google Drive
    • Tend to be slow for very large sets of data
    • Also suffer performance penalties for many small files
    • Are optimised for documents, not audio; their change detection can slow down your computer
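
A backup is only useful if it matches the original. One possible way to check is sketched below: it compares every file under a source directory with the file at the same relative path in the backup, using the standard find and cmp tools. The function name and the directory arguments are hypothetical.

```shell
# Compare every file under <source> with the same relative path under <backup>.
# Prints one line per missing or differing file; returns non-zero if any are found.
# Usage: verify_backup <source_dir> <backup_dir>
verify_backup() (
  # Resolve the backup path first, because we change directory below.
  backup_dir=$(cd "$2" && pwd) || exit 2
  cd "$1" || exit 2
  find . -type f -exec sh -c '
    backup_dir=$1; shift
    status=0
    for file in "$@"; do
      if [ ! -f "$backup_dir/$file" ]; then
        echo "MISSING: $file"; status=1
      elif ! cmp -s "$file" "$backup_dir/$file"; then
        echo "DIFFERS: $file"; status=1
      fi
    done
    exit "$status"
  ' sh "$backup_dir" {} +
)

# Example (hypothetical paths):
# verify_backup /data/project /mnt/backup/project && echo "backup matches"
```

This reads both copies in full, so it is slow on large collections but makes no assumptions about timestamps or file sizes.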

Directory Structure #

  • DO: Be consistent
  • DO: Follow a pattern
    • For example: {project}/{deployment}/{site}/[{memory_card}/]
  • DO: Keep all files produced by the sensor
  • DO NOT: Mix files from sensors/memory cards into one directory
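
As an illustration of following a pattern, the sketch below builds the directory for one memory card from its parts, so every deployment is laid out the same way. The function name and the example values are made up for this sketch.

```shell
# Build the storage directory for one memory card's files.
# Usage: deployment_dir <project> <deployment> <site> [memory_card]
deployment_dir() (
  path="$1/$2/$3"
  # The memory-card level is optional in the pattern.
  if [ -n "${4:-}" ]; then
    path="$path/$4"
  fi
  printf '%s\n' "$path"
)

# Example with hypothetical names; each card gets its own directory:
# mkdir -p "$(deployment_dir koala_survey 2023_spring site_A card_1)"
```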

Formats #

  • DO: keep your files in their original formats
    • embedded metadata is often not kept during conversion
  • DO: keep all files produced by the sensor
    • log files
    • auxiliary support files
    • schedules
  • DO: Embrace compression
    • FLAC compression for the A2O sensors results in ≈50% reduction in file size
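
To check what reduction you get on your own recordings, a rough sketch like the following compares the size of an original file with a compressed copy. The function and file names are hypothetical.

```shell
# Percentage reduction in size after compression, rounded down.
# Usage: reduction_percent <original_bytes> <compressed_bytes>
reduction_percent() (
  original=$1
  compressed=$2
  echo $(( (original - compressed) * 100 / original ))
)

# Example: compare a WAVE file with a FLAC copy of it (hypothetical file names):
# reduction_percent "$(wc -c < recording.wav)" "$(wc -c < recording.flac)"
```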

Remote repositories #

Several are available:

  • Arbimon
  • Wildlife Acoustics’ Cloud Storage
  • Ecosounds

Ecosounds now supports remote uploading for any approved user.

Repairing data #

  • Sensors produce all sorts of faulty files.
  • Problems are documented in an open-source known-problems repository
  • Categorizing problems allows us to describe them in a common language

Some Examples:

ID     Vendor              Description
OE001  N/A                 No date in filename
FL008  Frontier Labs       Invalid datestamps in file names (space instead of a zero)
FL011  Frontier Labs       Partial files named data
WA002  Wildlife Acoustics  Generating files with no data

Introducing EMU #

The Ecoacoustics Metadata Utility.

  • Renames files
  • Fixes problems
  • Extracts metadata
  • Open source
  • Cross platform
  • QutEcoacoustics/emu

emu help page

Using EMU to rename files #

Can convert dates:

> emu rename **/*.WAV
Looking for targets...
-   Renamed 5B07FAC0.WAV
        to 20180525T120000Z.WAV
1 files, 1 renamed, 0 unchanged, 0 failed
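
The original name in this example, 5B07FAC0, decodes as a Unix timestamp written in hexadecimal, and that value matches the date in the new name. The sketch below reproduces the conversion by hand. It is an inference from this one example, not a description of how EMU works, and it assumes GNU date (for the -d @ form).

```shell
# Convert a hexadecimal Unix timestamp (as in 5B07FAC0.WAV)
# to a compact ISO 8601 date in UTC.
# Assumes GNU date, which accepts "-d @<seconds>".
hex_stem_to_iso() (
  hex=$1
  seconds=$((0x$hex))
  date -u -d "@$seconds" +%Y%m%dT%H%M%SZ
)

hex_stem_to_iso 5B07FAC0   # prints 20180525T120000Z
```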

Can add timezone offsets [1]:

> emu rename **/*.wav --offset "+11:00"
Looking for targets...
-   Renamed PILLIGA_20121204_234600.wav
        to PILLIGA_20121204T234600+1100.wav
1 files, 1 renamed, 0 unchanged, 0 failed

  [1] There is one true date format: ISO8601
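
For readers who want to see the shape of that name change, here is a hand-rolled sketch of the same transformation using sed. It only illustrates the before and after pattern from the example; it is not EMU's implementation, and the function name is invented for this sketch.

```shell
# Rewrite "<prefix>_YYYYMMDD_HHMMSS<ext>" as "<prefix>_YYYYMMDDTHHMMSS<offset><ext>".
# Names that do not match the pattern are printed unchanged.
# Usage: iso_name <file_name> <offset, e.g. +11:00>
iso_name() (
  name=$1
  # The compact form drops the colon: +11:00 becomes +1100.
  offset=$(printf '%s' "$2" | tr -d ':')
  printf '%s\n' "$name" |
    sed -E "s/([0-9]{8})_([0-9]{6})(\.[^.]*)?\$/\1T\2${offset}\3/"
)

iso_name PILLIGA_20121204_234600.wav +11:00
# prints PILLIGA_20121204T234600+1100.wav
```

A name that has already been converted no longer matches the pattern, so running the function on its own output changes nothing.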

Can read metadata from the files to use in rename:

$ emu rename --template "{StartDate}_{SampleRateHertz}{Extension}" --scan-metadata
Looking for targets...
-   Renamed /mnt/f/tmp/fixes/renames/20210621T205706-0300.wav
        to /mnt/f/tmp/fixes/renames/20210621T205706-0300_256000.wav
-   Renamed /mnt/f/tmp/fixes/renames/20220331T094902-0300.flac
        to /mnt/f/tmp/fixes/renames/20220331T094902-0300_44100.flac

Real use case: recovering dates from corrupted memory card:

$ emu rename --template "{StartDate}{Extension}" --scan-metadata **/F*
Looking for targets...
-   Renamed /mnt/f/tmp/fixes/renames/F4622343428908
         to /mnt/f/tmp/fixes/renames/20220331T094902-0300.flac
-   Renamed /mnt/f/tmp/fixes/renames/F4623864286243
         to /mnt/f/tmp/fixes/renames/20210621T205706-0300.wav
2 files, 2 renamed, 0 unchanged, 0 failed

See what emu can fix #

Using EMU to fix problems #

FL010: Repairing an invalid duration

OE004, FL001, WA002: Renaming empty (or near-empty) files

Command used: ~/emu/emu fix apply -f OE004 -f FL001 -f WA002 .
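
Before applying a fix like this, it can help to preview which files are empty or nearly empty. The sketch below lists them with find. The 44-byte default is an assumption (the size of a canonical PCM WAVE header with no audio after it), and none of this reflects how EMU itself detects these problems.

```shell
# List files at or below a size threshold so they can be reviewed
# before any fix is applied.
# Usage: list_near_empty <directory> [max_bytes]
list_near_empty() (
  dir=$1
  max_bytes=${2:-44}   # assumption: a bare canonical WAVE header is 44 bytes
  # "-size -Nc" matches files strictly smaller than N bytes.
  find "$dir" -type f -size "-$((max_bytes + 1))c" | sort
)

# Example (hypothetical path):
# list_near_empty /data/project/deployment_1
```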

Why EMU? #

“I could fix this myself”

  • Can you do it for 10,000 files in 1,000 folders?
  • Is your fix destructive?
  • Does it destroy metadata?
  • Is it idempotent?
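
To show what those questions mean in practice, here is a minimal sketch of a rename that is non-destructive (it never overwrites a file) and idempotent (running it a second time does nothing). It is a hand-written illustration with an invented function name, not code from EMU.

```shell
# Rename a file without overwriting anything.
# Running it again after a successful rename is a no-op.
# Usage: safe_rename <current_path> <desired_path>
safe_rename() (
  from=$1
  to=$2
  if [ "$from" = "$to" ]; then
    exit 0                      # already has the desired name
  fi
  if [ ! -e "$from" ] && [ -e "$to" ]; then
    exit 0                      # already renamed by an earlier run
  fi
  if [ -e "$to" ]; then
    echo "refusing to overwrite: $to" >&2
    exit 1
  fi
  mv -- "$from" "$to"
)
```

Even a fix this small needs three guards, which is part of the argument for using a tested tool across thousands of files.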

EMU is used to clean and repair files ingested into Ecosounds and the A2O.

It has scanned > 1 million files, and fixed ≈400,000 of them.

Segmenting #

Tools to use:

  • ffmpeg: the best solution for most tasks
  • sox: high quality resampling and spectrogram generation
  • AP’s audiocutter: for simple tasks with good defaults
    • useful if you already have AP installed
  • <your favourite method>: after all the tricky stuff is done
    • e.g. readWave

Lots of ways to achieve the same outcome.

  • Ecosounds and the A2O use ffmpeg and sox under the hood
  • AP uses ffmpeg and sox under the hood
  • R can run ffmpeg through the system() function
  • The av package in R uses FFmpeg

Fun fact: WAVE is the name of the audio format; .wav is the extension commonly used for WAVE files.

Using ffmpeg #

  • General format: ffmpeg -i <input_file> <arguments> <output_file>
  • ffmpeg infers the format you want from the extension on your output file

# Convert a FLAC file to a WAVE file:
> ffmpeg -i 20191026T000000+1000_REC.flac 20191026T000000+1000_REC.wav

# Cut out the 10th minute (it starts 9 × 60 = 540 seconds in):
> ffmpeg -i 20191026T000000+1000_REC.flac -ss 540 -t 60 20191026T000000+1000_REC_10th_minute.wav

# mix down multiple channels into one single channel
> ffmpeg -i 20191026T000000+1000_REC.flac -ac 1 20191026T000000+1000_REC_mixdown.flac

# downsample to a different sample rate (22050 Hz)
> ffmpeg -i 20191026T000000+1000_REC.flac -ar 22050 20191026T000000+1000_REC_22050Hz.flac

# putting it all together (10th minute, one channel, 22050 Hz):
> ffmpeg -i 20191026T000000+1000_REC.flac -ss 540 -t 60 -ac 1 -ar 22050 20191026T000000+1000_REC_10th_minute.wav

See our ffmpeg guide for more examples.
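
Off-by-one mistakes are easy to make when working out where a given minute starts. As a small sketch, the helper below (a hypothetical name) computes the start offset and prints the matching ffmpeg command instead of running it, so the command can be checked first.

```shell
# Print the ffmpeg command that cuts the Nth minute (1-based) out of a recording.
# Minute N starts at (N - 1) * 60 seconds, so the 10th minute starts at 540 s.
# Note: the printed command is not quoted, so it suits names without spaces.
# Usage: cut_minute_command <input_file> <minute> <output_file>
cut_minute_command() (
  input=$1
  minute=$2
  output=$3
  start=$(( (minute - 1) * 60 ))
  printf 'ffmpeg -i %s -ss %s -t 60 %s\n' "$input" "$start" "$output"
)

cut_minute_command 20191026T000000+1000_REC.flac 10 20191026T000000+1000_REC_10th_minute.wav
# prints: ffmpeg -i 20191026T000000+1000_REC.flac -ss 540 -t 60 20191026T000000+1000_REC_10th_minute.wav
```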

Segmenting with AP #

  • Uses the same process as audio2csv
  • Has useful defaults
    • mixes down channels
    • resamples to 22050 Hz
    • emits 1-minute WAVE blocks
  • Can customize sample rate, segment size, start and end offsets, segment overlap, segment minimum durations

Example (2 hour FLAC file):

#                <input_file>                  <output_directory>
> AP audiocutter 20191026T000000+1000_REC.flac 20191026T000000+1000_REC_segments/
# <...snip...>
Took 00:01:31.0498711. Done.
> ls 20191026T000000+1000_REC_segments/
20191026T000000+1000_REC_0-60.wav       20191026T000000+1000_REC_3120-3180.wav  20191026T000000+1000_REC_5340-5400.wav
20191026T000000+1000_REC_1020-1080.wav  20191026T000000+1000_REC_3180-3240.wav  20191026T000000+1000_REC_540-600.wav
20191026T000000+1000_REC_1080-1140.wav  20191026T000000+1000_REC_3240-3300.wav  20191026T000000+1000_REC_5400-5460.wav
# <...snip...>

See our AP cutting guide for more examples.
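
The segment names in that listing follow a {name}_{start}-{end}.wav pattern, with offsets in seconds. The sketch below generates the same kind of names for a recording of known length, which can be handy for checking that no segment is missing. It only mirrors the naming seen in the listing; how AP treats a short final segment is not covered here, and clipping it to the end of the recording is an assumption of this sketch.

```shell
# Print one "<stem>_<start>-<end>.wav" name per segment of a recording.
# Usage: segment_names <stem> <duration_seconds> [segment_seconds]
segment_names() (
  stem=$1
  duration=$2
  step=${3:-60}
  start=0
  while [ "$start" -lt "$duration" ]; do
    end=$((start + step))
    # Assumption: the final segment is clipped to the end of the recording.
    if [ "$end" -gt "$duration" ]; then end=$duration; fi
    printf '%s_%s-%s.wav\n' "$stem" "$start" "$end"
    start=$end
  done
)

# A 2-hour (7200 s) recording yields 120 one-minute names:
segment_names 20191026T000000+1000_REC 7200 | wc -l
```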

Golden rules #

  1. There is one true date format: ISO8601
    • Compact format is valid and is very useful for filenames: YYYYMMDDTHHMMSS+ZZ (+ZZ is the UTC offset, e.g. 20191026T000000+1000)
  2. Never delete your originals
  3. Keep your code simple—make use of other tools to do the heavy lifting
  4. Cut on demand
  5. Embrace the shell and the command line tools
    • Knowledge transfer!
    • Common abstraction means easy automation
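
As a small example of rule 1 in practice, the sketch below checks whether a file name contains a compact ISO 8601 timestamp with a UTC offset. It is a pattern check only (it does not reject impossible dates such as month 13), and the function name is made up for this sketch.

```shell
# Succeeds when a file name contains a compact ISO 8601 timestamp
# followed by a UTC offset: Z, +HH, +HHMM, -HH or -HHMM.
# Usage: has_iso_date <file_name>
has_iso_date() (
  printf '%s\n' "$1" |
    grep -Eq '[0-9]{8}T[0-9]{6}(Z|[+-][0-9]{2}([0-9]{2})?)'
)

# Example: report files in the current directory that still need renaming.
# for f in *; do has_iso_date "$f" || echo "no ISO 8601 date: $f"; done
```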


Next: the practical.