As we’ve now done some initial experiment sessions and gotten to know the terrain a little better, it seems like a good time to sum up the different working methods for cross adaptive processing in a live performance setting. Some of this is based on my previous article for DAFx-15, but extended with the practical work we’ve done on the subject during the last months. I’ve also written about this in an earlier blog post here.
This type of effect takes two (mono) signals as input, and applies a transformation to one signal according to the characteristics of the other. In this category we typically find spectral morphing effects, resonators and convolvers. It is quite a practical effects type with regard to signal routing, as it can easily be inserted into any standard audio mixing signal flow (use the effect on a stereo aux channel, send to the effect from two sources, each source panned hard left and right).
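As a minimal sketch of one such effect, consider a spectral morph that frame by frame imposes the magnitude spectrum of one input onto the phases of the other. The function name and the rectangular, non-overlapping framing are simplifications for illustration; a real implementation would use windowed, overlapping frames.

```python
import numpy as np

def spectral_morph(sig_a, sig_b, frame=1024):
    """Impose the magnitude spectrum of sig_a onto the phases of sig_b,
    one frame at a time (rectangular window, no overlap, for brevity)."""
    n = min(len(sig_a), len(sig_b)) // frame * frame
    out = np.zeros(n)
    for i in range(0, n, frame):
        spec_a = np.fft.rfft(sig_a[i:i + frame])
        spec_b = np.fft.rfft(sig_b[i:i + frame])
        # magnitude from A, phase from B
        morphed = np.abs(spec_a) * np.exp(1j * np.angle(spec_b))
        out[i:i + frame] = np.fft.irfft(morphed, n=frame)
    return out
```

The same insert-on-an-aux routing described above applies: the two hard-panned sources become `sig_a` and `sig_b`.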
We have some proposals/ideas for new effects in this category, including streaming convolution, cross adaptive resonators, effects inspired by modal reverberation techniques, and variations on common spectral processors adapted to the crossadaptive domain.
This type of signal interaction can be found in conventional audio production, most commonly for dynamics processing (e.g. the genre-typical kick-drum-ducking-synth-pads). Multiband variations of sidechaining can be seen in de-essing applications, where the detector signal is filtered so that dynamics processing is only applied when there is significant energy in a specific frequency region. The application has traditionally been limited to fixing cosmetic errors (e.g. dampening the too pronounced “s” sounds of a vocal recording with high sibilance). We’ve experimented with using multiband sidechaining with a wider selection of effects types. The technique can be used to attenuate or amplify the signal going into any effects processor, for example boosting the amount of vocal signal going into a pitch shifter according to the amount of energy in the low frequencies of the guitar track. This workflow is practical in that it can be done with off-the-shelf, commercially available tools, just applying them creatively and bending their intended use a bit. Quite complex routing scenarios can be created by combining different sidechains. As an extension of the previous example, the amount of vocal signal going into the pitch shifter increases with the low-frequency energy of the guitar, but only if there is no significant energy in the high frequency region of the vocal signal. The workflow is limited to adjusting the amplitude of signals going into an effects processor. In all its simplicity, adjusting the amplitude of the input signal is remarkably effective and can create dramatic effects. It cannot, however, directly modify the parameters of the processing effect, so the effect itself will always stay the same.
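The core of the multiband sidechain can be sketched offline in a few lines: band-limit the detector, follow its envelope, and use the inverted envelope as a gain on the carrier. All names here are hypothetical, and the crude FFT masking stands in for the proper filters a real-time implementation would use.

```python
import numpy as np

def envelope(x, sr, attack=0.01, release=0.1):
    """One-pole envelope follower on the rectified signal."""
    a_coef = np.exp(-1.0 / (attack * sr))
    r_coef = np.exp(-1.0 / (release * sr))
    env = np.zeros_like(x)
    prev = 0.0
    for i, v in enumerate(np.abs(x)):
        coef = a_coef if v > prev else r_coef
        prev = coef * prev + (1.0 - coef) * v
        env[i] = prev
    return env

def multiband_duck(carrier, detector, sr, lo=60.0, hi=200.0, depth=1.0):
    """Attenuate carrier according to the detector's energy in [lo, hi] Hz.
    Band limiting by FFT masking is fine for an offline sketch."""
    spec = np.fft.rfft(detector)
    freqs = np.fft.rfftfreq(len(detector), 1.0 / sr)
    spec[(freqs < lo) | (freqs > hi)] = 0.0
    band = np.fft.irfft(spec, n=len(detector))
    env = envelope(band, sr)
    gain = 1.0 - depth * np.clip(env / (env.max() + 1e-12), 0.0, 1.0)
    return carrier * gain
```

With the example from the text, the guitar would be the `detector` (with `lo`/`hi` set to its low register), the vocal the `carrier`, and `depth` negative to boost rather than duck.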
Envelope follower as modulation source
As a slight extension of the sidechaining workflow, we can use envelope followers (on the full signal or a band-limited part of it) as modulation sources. This is in fact already done in sidechaining processors, but there the envelope followers are generally hardwired to control a compressor or a gate. In some DAWs (e.g. Reaper and Live), the output of the envelope follower can be routed more freely and thus used as a modulator for any processing parameter in the signal chain. This can be seen as a middle ground among crossadaptive workflows, combining significant complexity with widely available commercial tools. The workflow is not available in all DAWs, and it is limited to analyzing signal energy (in any separate frequency band of the input signal). For complex mappings it might become cumbersome or unwieldy, as the mapping between control signals and effects parameters is distributed: it happens a little bit here and a little bit there in the processing chain. Still, it is a viable option if one needs to stick to commercially available tools.
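Routing an envelope follower to an arbitrary parameter essentially amounts to scaling the normalized control signal into that parameter's range, optionally with a response curve. A small illustrative helper (the name and `curve` parameter are my own, not from any particular DAW):

```python
import numpy as np

def env_to_param(env, out_min, out_max, curve=1.0):
    """Map a normalized envelope signal (0..1) to a parameter range.
    curve > 1 de-emphasizes low envelope values, < 1 emphasizes them."""
    norm = np.clip(np.asarray(env, dtype=float), 0.0, 1.0) ** curve
    return out_min + norm * (out_max - out_min)
```

For example, `env_to_param(env, 200.0, 8000.0, curve=2.0)` would drive a filter cutoff from 200 Hz to 8 kHz, opening up mostly on loud passages.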
Analysis and automation
This is the currently preferred method, and the one that gives the most flexibility in terms of what features can be extracted by analysis of the source signal and how the resulting modulator signal can be shaped. The current implementation of the interprocessing toolkit allows any analysis method available in Csound to be used, and the plugins are interfaced to the DAW as VST plugins. The modulator signal is currently output as MIDI or OSC, but other automation protocols will be investigated. Csound’s analysis methods can be extended by interfacing to existing technologies used in e.g. music information retrieval, like VAMP plugins, or analysis libraries like Essentia and Aubio. It is still an open question how best to organize the routing, mixing and shaping of the modulator signals. Ideally, the routing system should give an intuitive insight into the current mapping situation, provide very flexible mapping strategies, and be able to dynamically change from one mapping configuration to another. Perhaps we should look into the modulation matrix, as used in Hadron?
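The modulation matrix idea can be sketched very compactly: a weight matrix mixes the analysis signals into parameter modulators, and interpolating between two such matrices gives the dynamic change between mapping configurations mentioned above. This is only a toy model loosely inspired by Hadron's state interpolation, not its actual implementation.

```python
import numpy as np

class ModMatrix:
    """Mix n analysis signals into m parameter modulators via a weight
    matrix, with linear interpolation between two matrix 'states'."""

    def __init__(self, state_a, state_b):
        self.a = np.asarray(state_a, dtype=float)  # shape (m, n)
        self.b = np.asarray(state_b, dtype=float)  # same shape as a

    def __call__(self, features, mix=0.0):
        # mix=0 gives state_a's mapping, mix=1 gives state_b's
        w = (1.0 - mix) * self.a + mix * self.b
        return w @ np.asarray(features, dtype=float)
```

One appeal of this form is that the whole mapping situation is visible in one place (the matrix), rather than distributed across the processing chain as in the envelope-follower workflow.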
Mapping analyzed features to processing parameters
When using the analyzer plugin, a big question is how to design the mapping of analyzed features to effect processing parameters. Here we have at least three methods:
- manual mapping design
- autolearn features to control signal routing
- autolearn features to control effects parameters
For method 1) we have to continue experimenting to familiarize ourselves with the analyzed features and how they can be utilized in musically meaningful ways to control effects parameters. The process is cumbersome, as the analyzed features behave quite differently on different types of source signals. As an example, spectral flux will quite reliably indicate the balance between noise and tone in a vocal signal; it does so on a guitar signal too, but there the indication is more ambiguous and thus less reliable as a precise modulation source for parameter control. On the positive side, the complex process of familiarizing ourselves with the analysis signals will also give us an intuitive relation to the material, which is one crucial aspect of being able to use it as performers.
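To make the spectral flux example concrete, here is one common formulation of the feature: the summed positive change in the normalized magnitude spectrum between consecutive frames. Steady tones give values near zero, noisy material gives high values, which is what makes it usable as a noise/tone indicator.

```python
import numpy as np

def spectral_flux(prev_frame, frame):
    """Summed positive change in the normalized magnitude spectrum
    between two consecutive frames. Low for steady tones, high for
    noisy or rapidly changing material."""
    def norm_mag(x):
        m = np.abs(np.fft.rfft(x * np.hanning(len(x))))
        return m / (np.sum(m) + 1e-12)
    diff = norm_mag(frame) - norm_mag(prev_frame)
    return float(np.sum(diff[diff > 0]))
```

The ambiguity noted above shows up here too: a plucked guitar attack is both noisy and transient, so flux spikes on attacks as well as on sustained noisiness.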
Method 2) has been explored earlier by Brecht De Man et al. in an intelligent audio switch box. This plugin learns features (centroid, flatness, crest and roll-off) of an audio source signal and uses the learned features to classify incoming audio, routing it to either output A or B of the plugin. We could adapt this learning method to interpolate between effects parameter presets, much in the way the different states are interpolated in the Hadron synthesizer.
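The principle of such a switch box can be illustrated with two of the listed features and a nearest-neighbour decision; this is my own toy reconstruction of the idea, not De Man et al.'s implementation, and it omits crest and roll-off for brevity.

```python
import numpy as np

def features(frame, sr):
    """Spectral centroid (Hz) and spectral flatness of one frame,
    two of the features used in the switch box example."""
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) + 1e-12
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    centroid = np.sum(freqs * mag) / np.sum(mag)
    flatness = np.exp(np.mean(np.log(mag))) / np.mean(mag)
    return np.array([centroid, flatness])

def classify(frame, sr, learned_a, learned_b):
    """Route to output A or B by distance to learned feature vectors."""
    f = features(frame, sr)
    dist_a = np.linalg.norm(f - learned_a)
    dist_b = np.linalg.norm(f - learned_b)
    return "A" if dist_a <= dist_b else "B"
```

Replacing the hard A/B decision with a crossfade weighted by the two distances is one simple route toward the preset interpolation suggested above.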
Method 3) can most probably be solved in a number of ways. Our current approach is based on a more extensive use of artificial intelligence methods than the classification in 2) above. The method is currently being explored by NTNU/IDI master student Iver Jordal, and his work-in-progress toolkit is available here. The task for the AI is to figure out the mapping between analyzed features and effects parameters based on which sound transformations are most successful. Objectively measuring “successful” is obviously a challenge, as a number of aesthetic considerations normally apply to what we would term a successful transformation of a piece of audio. As a starting point, we assume that in our context a successful transformation makes the processed sound pick up some features of the sound being analyzed. Here we label the analyzed sound as sound A, the input sound for processing as sound B, and the result of the processing as sound C. We can use a similarity measure between sounds A and C to determine the degree of success for any given transformation (B is transformed into C, and C should become more similar to A than B already is). Iver Jordal implemented this idea using a genetic algorithm to evolve the connections of a neural network controlling the mapping from analyzed features to effect control parameters. Evolved networks can later be transplanted to a realtime implementation, where the same connections can be used on potentially different sounds. This means we can evolve modulation mappings suitable for different instrument combinations and try those out on a wider range of sounds. This application would also potentially gain from being able to interpolate between different modulator routings as described above.
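The similarity-based success measure can be written down directly: extract a feature vector from each of A, B and C, and score a transformation by how much closer C is to A than B was. This is a deliberately tiny stand-in (two features, hypothetical names), not the fitness function used in Iver Jordal's toolkit.

```python
import numpy as np

def feature_vector(sound, sr, frame=1024):
    """Mean spectral centroid and RMS over frames: a tiny stand-in
    for a fuller analysis front end."""
    feats = []
    for i in range(0, len(sound) - frame, frame):
        f = sound[i:i + frame]
        mag = np.abs(np.fft.rfft(f * np.hanning(frame))) + 1e-12
        freqs = np.fft.rfftfreq(frame, 1.0 / sr)
        feats.append([np.sum(freqs * mag) / np.sum(mag),
                      np.sqrt(np.mean(f ** 2))])
    return np.mean(feats, axis=0)

def fitness(sound_a, sound_b, sound_c, sr):
    """Positive when C (the processed B) is closer to A than B was,
    zero when the processing changed nothing, negative when it moved
    the sound away from A."""
    fa, fb, fc = (feature_vector(s, sr) for s in (sound_a, sound_b, sound_c))
    return float(np.linalg.norm(fa - fb) - np.linalg.norm(fa - fc))
```

A score like this is what the genetic algorithm would maximize while evolving the network connections; the aesthetic caveats above still apply, since feature similarity is only a proxy for a musically successful transformation.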