The convolution audio effect is traditionally used to sample a room in order to create artificial reverb. Others have used it extensively for creative purposes, for example convolving guitars with angle grinders and trains. The technique normally requires recording a sound, analyzing it, and finally loading the analyzed impulse response (IR) into an effect before it can be used. The Liveconvolver3 lets you live sample the impulse response and start convolving even before the recording is finished.
In the context of the crossadaptive project, convolution can be a nice way of imprinting the characteristics of one audio source on another. The live sampling of the IR is necessary to facilitate using it in an improvised manner, reacting immediately to what is played here and now.
There are some aesthetic challenges, namely how to avoid everything turning into a (somewhat beautiful) mush. This is because in convolution every sample of one sound is multiplied with every sample of the other sound. If we sample a long melodic line as the IR, a mere click of the tongue on the other audio channel will fire the whole melodic segment once. Several clicks will create separate echoes of the melody, and a continuous sound will create literally thousands of echoes. What is nice is that only frequencies that the two signals have in common will come out of the process. So a light whisper will create a high frequency whispering melody (with the long IR described above), while a deep and resonant drone will let only those (spectral) parts of the IR through. Since the IR contains a recording not only of spectral content but also of its evolution over time, it can lend spectrotemporal morphing features from one sound to another. To reduce the mushiness of the processed sound, we can enhance the transients and reduce the sustained parts of the input sound. Even though this kind of (exaggerated) transient designer processing might sound artificial on its own, it can work well in the context of convolution. The current implementation, Liveconvolver3, does not include this kind of transient processing, but we have done this earlier, so it will be easy to add.
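To make the "one click fires the whole melody" behaviour concrete, here is a minimal sketch of direct (time-domain) convolution in plain Python. The four-sample `melody_ir` and the click patterns are made-up stand-ins for real recordings, and the plugin itself uses partitioned convolution rather than this naive loop.

```python
def convolve(signal, ir):
    """Direct convolution: every input sample scales and adds a copy of the IR."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for n, x in enumerate(signal):
        if x == 0.0:
            continue  # silent input samples trigger nothing
        for k, h in enumerate(ir):
            out[n + k] += x * h
    return out

# Hypothetical 4-sample "melody" standing in for a recorded IR.
melody_ir = [1.0, 0.5, 0.25, 0.125]

# A single click fires the whole melody once.
one_click = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
single = convolve(one_click, melody_ir)

# Two clicks, six samples apart, give two separate copies (echoes).
two_clicks = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
double = convolve(two_clicks, melody_ir)

# A sustained input stacks overlapping copies on top of each other,
# which is where the "mush" comes from.
sustained = [1.0] * 8
smeared = convolve(sustained, melody_ir)
```

In `single` the melody appears once, in `double` it appears at offsets 0 and 6, and in `smeared` the overlapping copies pile up into a plateau instead of a recognizable shape.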
There are also some technical challenges to using this technique in a live setting. These are related to amplitude control, and to the risk of feedback when playing on larger speaker systems. The feedback risk occurs because we are taking a spectral snapshot (the impulse response) of the room we are currently playing in (well, of an instrument in that room, but nevertheless, the room is there), and then we process sound coming from (another source in) the same room. The output of the process will enhance those frequencies that the two sources have in common, hence the characteristics of the room (and the speaker system) will be amplified, and this generally creates a risk of feedback. Once we have unwanted feedback with convolution, it will also generally take a while (a few seconds) to get rid of, since the nature of the process creates a reverb-like tail to every sound. To reduce the risk of feedback we use a very small frequency shift of the convolver output. This is not usually perceptible, but it disturbs the feedback chain sufficiently to significantly reduce the feedback potential.
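As an illustration of what a small frequency shift does, the sketch below moves every component of a signal up by a fixed number of Hz using an analytic signal (single-sideband modulation). This is one common way to build a frequency shifter and is an assumption here, not a description of how the plugin does it. The naive DFT, the 200 Hz sample rate and the 20 Hz test tone are chosen only to keep the example small and dependency-free.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) DFT, good enough for a short illustration."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def analytic_signal(x):
    """Keep the positive frequencies (doubled) and drop the negative ones."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if 0 < k < n / 2:
            spectrum[k] *= 2.0
        elif k > n / 2:
            spectrum[k] = 0j
    return dft(spectrum, inverse=True)

def frequency_shift(x, shift_hz, sample_rate):
    """Shift all components of x upwards by shift_hz (not a pitch shift)."""
    z = analytic_signal(x)
    return [(z[t] * cmath.exp(2j * math.pi * shift_hz * t / sample_rate)).real
            for t in range(len(x))]

def peak_bin(x):
    """Index of the strongest bin in the lower half of the spectrum."""
    mags = [abs(c) for c in dft(x)[: len(x) // 2]]
    return mags.index(max(mags))

sample_rate = 200  # hypothetical low rate so the naive DFT stays quick
n = 200            # one second of signal, so each bin is 1 Hz wide
tone = [math.sin(2 * math.pi * 20 * t / sample_rate) for t in range(n)]
shifted = frequency_shift(tone, 1.0, sample_rate)
```

With these values the spectral peak moves from bin 20 to bin 21. Because each pass through a speaker, room and microphone loop would be shifted again, energy cannot keep building up at one fixed resonant frequency.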
The challenge of the overall amplitude control can be tackled by using the sum of all amplitudes in the IR as a normalization factor. This works reasonably well, and is how we do it in the liveconvolver. One obvious exception is the case where the IR and the input sound contain overlapping strong resonances (or single lone notes). Then we will get a lot of energy in those overlapping frequency regions, and very little else. We will work on algorithms to attempt normalization in these cases as well.
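A minimal sketch of this kind of normalization follows, assuming that "sum of all amplitudes" means the sum of the absolute sample values of the IR. The function names and the tiny IR are hypothetical and are not taken from the csd.

```python
def ir_gain(ir):
    """Gain that scales the IR so its absolute sample values sum to 1."""
    total = sum(abs(h) for h in ir)
    return 1.0 / total if total > 0.0 else 0.0

def convolve(signal, ir):
    """Direct convolution, used here only to check the resulting level."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(ir):
            out[n + k] += x * h
    return out

ir = [0.8, -0.6, 0.4, -0.2]           # hypothetical recorded IR
gain = ir_gain(ir)                    # 1 / (0.8 + 0.6 + 0.4 + 0.2)
normalized_ir = [gain * h for h in ir]

# Worst case for a full-scale input: the sign of every input sample
# lines up with the sign of the IR sample it is multiplied with.
worst_case_input = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
peak = max(abs(v) for v in convolve(worst_case_input, normalized_ir))
```

Under this assumption the output of a full-scale input cannot exceed full scale. The bound is only reached when the input lines up perfectly with the IR, which is roughly the overlapping-resonance situation; most other material ends up well below it.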
The effect uses two separate audio inputs, one for the impulse response sampling, and one for the live input to be convolved. We have made it as a stereo effect, but do not expect it to convolve a stereo input. It also creates a mono output in the current implementation (the same signal on both stereo outputs). In the figure we see two input sources. Track 1 receives external audio and routes it to an aux send to the liveconvolver track, panned left so that it will enter only input 1 of the effect. Track 2 receives external audio and similarly routes it to an aux send to the liveconvolver track, but panned right so the audio is only sent to input 2 of the effect.
The effect itself has controls for input level, highpass filtering (hpFreq), lowpass filtering (lpFreq) and output volume (convVolume). These controls basically do what the control name says. Then we have controls to set the start time (IR_start) of the impulse response (this allows skipping a certain number of seconds into the recording), and the impulse response length (IR_length), determining how many seconds of the IR recording we want to use. There are also controls for fading the IR in and out. Without fading, we might experience clicks and pops in the output. The partition length sets the partition size for the partitioned convolution. Higher settings require less CPU but also make the effect respond more slowly. Usually just leave this at the default 2048. The big green button IR_record enables recording of an impulse response. The current max duration is 5.9 seconds at 44.1 kHz sampling rate. If the maximum duration is exceeded during recording, the recording simply stops and is treated as complete. The convolution process will keep running while recording, using parts of the newly recorded IR as they become available. The IR_release knob controls the amount of overlap between the new instances of convolution created during recording. When recording is done, we fall back to using just one instance again. The switch_inputs button lets us (surprise!) switch the two inputs, so that input 1 will be the IR record and input 2 will be the convolver input. If you want to convolve a source with itself, you would first record an IR and then switch the inputs, so that the same source is convolved with its own (previously recorded) IR. Finally, to reduce the potential for audio feedback, the f_shift control can be adjusted. This shifts the entire output upwards by the amount selected. Usually around 1 Hz is sufficient. Extreme settings will create artificial sounding effects and cascading delays.
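The start, length and fade controls can be pictured as cutting a segment out of the recording and ramping its ends. The sketch below is a hypothetical helper, not the plugin's code: the parameter names only mirror the controls, the fades are assumed to be linear, and the tiny sample rate is chosen so the numbers are easy to follow.

```python
def extract_ir(recording, sample_rate, ir_start, ir_length, fade_in, fade_out):
    """Cut a segment out of a recording and apply linear fades.

    All times are in seconds. Linear fades are an assumption made for
    this illustration.
    """
    start = min(round(ir_start * sample_rate), len(recording))
    end = min(start + round(ir_length * sample_rate), len(recording))
    segment = list(recording[start:end])

    n_in = min(round(fade_in * sample_rate), len(segment))
    n_out = min(round(fade_out * sample_rate), len(segment))
    for i in range(n_in):
        segment[i] *= i / n_in                      # ramp up from 0
    for i in range(n_out):
        segment[len(segment) - 1 - i] *= i / n_out  # ramp down to 0 at the end
    return segment

sample_rate = 10                 # hypothetical, 10 samples per second
recording = [1.0] * 50           # 5 seconds of constant signal
ir = extract_ir(recording, sample_rate,
                ir_start=1.0, ir_length=2.0, fade_in=0.5, fade_out=0.5)
```

Here `ir` holds 20 samples that start and end at zero, so the segment no longer begins or ends with a sudden jump, which is what would otherwise show up as clicks and pops.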
The effect is written in the audio programming language Csound, and compiled into a VST plugin using a tool called Cabbage. The actual program code is just a small text file (a csd) that you can download here.
You will need to download Cabbage (the bleeding edge version can be found here), then open the csd file in Cabbage and export it as a plugin effect. Put the exported plugin somewhere in your VST path so that your favourite DAW can find it. Then you’re all set.
Routing in other hosts
As a short update, it occurred to me that some users might find it complicated to translate the Reaper routing setup to other hosts. I know a lot of people are using Ableton Live, so here’s a screenshot of how to route for the liveconvolver in Live:
- The aux sends are “post” (otherwise the sound would not go through the pan pot, and we need that).
- Because the sends are post, the volume fader has to be up. We will probably not want to hear the direct unprocessed sound, so the “Audio To” selector on the channels is set to “Sends only”.
- Both input channels send to the same effect
- The two input channels are panned hard left (ch 1) and hard right (ch 2)
- The monitor selector for the channels is set to “in”, activating the input regardless of arm/recording
With all that set up, you can hit “IR_record” and record an IR (of the sound you have on channel 1). The convolver effect will be applied to the sound on channel 2.