Mixing music requires both songs to be at the same tempo.
If the tempos differ, one of them needs to be adjusted.
Many different algorithms exist for this task.
OLA (Overlap-Add): basic algorithm in digital signal processing
All more specialized algorithms build on OLA
OLA produces significant artifacts in the output signal, which are especially noticeable in harmonic structures.
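A minimal sketch of plain OLA time stretching, to make the idea concrete. It assumes a Hann window with 50% overlap; the names `ola_stretch`, `speed` and `frame_size` are hypothetical and not taken from the implementation linked in these slides. Frames are read from the input at one hop size and written to the output at another, with no attempt to align their waveforms, which is where the artifacts come from.

```python
import math

def ola_stretch(signal, speed, frame_size=1024):
    """Change the tempo of `signal` (list of floats) by `speed` using plain OLA.

    speed = 1.2 means 20% faster, speed = 0.8 means 20% slower.
    """
    if len(signal) < frame_size:
        return []
    hop_out = frame_size // 2          # synthesis hop: 50% overlap in the output
    hop_in = hop_out * speed           # analysis hop: how fast we read the input
    # Periodic Hann window; overlapping copies at 50% sum to 1.
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / frame_size)
              for i in range(frame_size)]
    n_frames = int((len(signal) - frame_size) / hop_in) + 1
    out_len = (n_frames - 1) * hop_out + frame_size
    out = [0.0] * out_len
    norm = [0.0] * out_len             # summed window weights, for edge normalization
    for k in range(n_frames):
        src = int(k * hop_in)          # where the frame is read from
        dst = k * hop_out              # where the frame is written to
        for i in range(frame_size):
            out[dst + i] += signal[src + i] * window[i]
            norm[dst + i] += window[i]
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]
```

With `speed = 1.0` the sketch reproduces its input, since every frame lands where it came from. For any other value, neighbouring frames are merged with arbitrary phase offsets, which shows up as modulation in sustained tones.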
[Audio examples, speech and music: original (48 kHz); played 20% faster (57.6 kHz); OLA 20% faster (48 kHz); played 20% slower (38.4 kHz); OLA 20% slower (48 kHz)]
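The sample rates in the examples above follow from scaling the original 48 kHz rate by the speed factor. As a rough worked example, and assuming the usual equal-tempered definition of a semitone, simply changing the playback rate also shifts the pitch, which OLA avoids by keeping the output at 48 kHz:

$$
1.2 \cdot 48\,\text{kHz} = 57.6\,\text{kHz}, \qquad 12 \cdot \log_2 1.2 \approx +3.16 \text{ semitones}
$$

$$
0.8 \cdot 48\,\text{kHz} = 38.4\,\text{kHz}, \qquad 12 \cdot \log_2 0.8 \approx -3.86 \text{ semitones}
$$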
github.com/Itja/ola
WSOLA: developed by W. Verhelst and M. Roelands in 1993 at Vrije Universiteit Brussel [VR93]
Still used today through audio processing libraries found in programs such as Foobar2000, Audacity, Rhythmbox, Firefox and Chrome [DMDP16]
Idea: before merging two overlapping frames, shift one of them by a small offset so that their waveforms are as similar as possible
Space complexity $\mathcal O(n)$ (with $n$ being the frame size)
Time complexity $\mathcal O(n \cdot \log_2n)$ [DMDP16]
Therefore suitable for real-time use, given adequate hardware.
Many proposals exist to further reduce the complexity of WSOLA (e.g. by estimating the optimal shift [KLK+10])
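The idea can be sketched as follows. This is an illustrative version under stated assumptions, not the implementation from [VR93] or from any of the libraries mentioned: the names `wsola_stretch` and `tolerance` are hypothetical, and the best shift is found by a direct cross-correlation search, which is simpler but slower than the FFT-based search behind the complexity quoted above.

```python
import math

def wsola_stretch(signal, speed, frame_size=1024, tolerance=256):
    """Change the tempo of `signal` (list of floats) by `speed` using a basic WSOLA.

    Each frame may move by up to `tolerance` samples from its ideal position.
    """
    hop_out = frame_size // 2                  # synthesis hop (50% overlap)
    hop_in = hop_out * speed                   # ideal analysis hop
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / frame_size)
              for i in range(frame_size)]
    max_start = len(signal) - frame_size       # last valid frame start
    out, norm = [], []
    target = None                              # where the previous frame would naturally continue
    k = 0
    while True:
        ideal = int(k * hop_in)
        if ideal > max_start or (target is not None and target > max_start):
            break
        if target is None:
            start = ideal                      # first frame: nothing to align with yet
        else:
            lo = max(0, ideal - tolerance)
            hi = min(max_start, ideal + tolerance)
            # Pick the candidate frame most similar (highest cross-correlation)
            # to the natural continuation of the previously chosen frame.
            start = max(range(lo, hi + 1),
                        key=lambda s: sum(signal[s + i] * signal[target + i]
                                          for i in range(frame_size)))
        dst = k * hop_out
        grow = dst + frame_size - len(out)
        out.extend([0.0] * grow)
        norm.extend([0.0] * grow)
        for i in range(frame_size):
            out[dst + i] += signal[start + i] * window[i]
            norm[dst + i] += window[i]
        target = start + hop_out
        k += 1
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]
```

For a periodic input the search can always find a frame that is in phase with the previous one, provided the tolerance spans at least half a period on each side. The overlapping parts then add up without the phase jumps that plain OLA produces.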
[Audio examples: original; OLA 20% faster; WSOLA 20% faster; OLA 20% slower; WSOLA 20% slower]
Name | Algorithm | Audio Artifacts |
---|---|---|
Vexwarp | Phase Vocoder | Metal Tunnel |
tempo-sox.js | WSOLA | Unknown |
PhaseVocoder.JS | Phase Vocoder | Smeared Transients |
OLA-TS.JS | Modified OLA | Modulation in harmonic structures |
[D11] J. Driedger, "Time-Scale Modification Algorithms for Music Audio Signals", M.Sc. thesis, Saarland University, 2011
[DMDP16] B. Dias, D. M. Matos, M. Davies and H. S. Pinto, "Time Stretching & Pitch Shifting with the Web Audio API: Where are we at?", in Proceedings of Web Audio Conference (WAC), 2016
[KLK+10] D. S. Kim et al., "Complexity Reduction of WSOLA-Based Time-Scale Modification Using Signal Period Estimation", in Future Generation Communication and Networking (FGCN), 2010, pp. 155–557
[S03] S. W. Smith, "FFT Convolution", in Digital Signal Processing, Newnes, USA, 2003, pp. 311–318
[S13] R. Schmakeit, "Destiny Can Wait", music, 2013 (https://youtu.be/J1FX7Klafng)
[VR93] W. Verhelst and M. Roelands, "An overlap-add technique based on waveform similarity (WSOLA) for high quality time-scale modification of speech", in Acoustics, Speech, and Signal Processing (ICASSP), vol. 2, 1993, pp. 554–557
Questions?