Seminar 16. December

Philosophical and aesthetical perspectives

–report from meeting 16/12 Trondheim/Skype

Andreas Bergsland, Trond Engum, Tone Åse, Simon Emmerson, Øyvind Brandtsegg, Mats Claesson

The performers' experiences of control:

In the last session (the Trondheim December session), Tone and Carl Haakon (CH) worked with rhythmic regularity and irregularity as parameters in the analysis. They worked with the same kind of analysis, and the same kind of mapping from analysis to effect parameters. After first trying the opposite, they ended up with: regularity = less effect, irregularity = more. They also included a sample/hold/freeze effect in one of the exercises. Øyvind commented on how Tone in the video stated that she thought it would be hard to play with so little control, but that she experienced that they worked intuitively with this parameter, which he found was an interesting contradiction. Tone also expressed in the video that on the one hand she would sometimes hope for some specific musical choices from CH (“I hope he understands”), but on the other hand that she “enjoyed the surprises”. These observations became a springboard for a conversation about a core issue in the project: the relationship between control and surprise, or between controlling and being controlled. We try to point here to the degree of specific and conscious intentional control, as opposed to “what just happens” due to technological, systemic, or accidental reasons. The experience from the Trondheim December session was that the musicians preferred what they experienced as an intuitive connection between input and outcome, and that this facilitated the process in the sense that they could “act musically”. (This “intuitive connection” is easily related to Simon's comment about “making ecological sense” later in this discussion.) Mats commented that in the first Oslo session the performers stated that they felt a similarity to playing with an acoustic instrument. He wondered if this experience had to do with the musicians' involvement in the system setup, while Trond pointed out that the Trondheim December session and the Oslo session were pretty similar in this respect. A further discussion about what “control”, “alienation” and “intuitive playing” can mean in these situations seems appropriate.

Aesthetic and individual variables

This led to a further discussion about how we should be aware that the need for generalising and categorising – which is necessary at some point to actually be able to discuss matters – can lead us to overlook important variable parameters such as:

  • Each performer’s background, skills, working methods, aesthetics and preferences
  • That styles and genres relate differently to this interplay

A good example of this is Maja's statement in the Brak/Rug session that she preferred the surprising, disturbing effects, which gave her new energy and ideas. Tone noted that this is very easy to understand when you have heard Maja's music, and even easier if you know her as an artist and person. And it can be looked upon as a contrast to Tone/CH, who seek a more “natural” connection between action and sounding result: in principle, they want the technology to enhance what they are already doing. But, as cited above, Tone commented that this is not the whole truth; surprises are also welcome in the Tone/Carl Haakon collaboration.

Because of these variables, Simon underlined the need to pin down in each session what actually happens, and not necessarily set up dialectical pairs. Øyvind, on the other hand, pointed out the need to lay out possible extremes and oppositions to create some dimensions (and terms) along which language can be used to reflect on the matters at hand.

Analysing or experiencing?

Another individual variable, both for the audience and for the performer, is the need to analyse, to understand what is happening while perceiving a performance. One example brought up related to this was Andreas' experience of how his audience perspective changed after he studied ear training. This new knowledge led him to take an analysing perspective, wanting to know what happened in a composition when performed. He also said: “as an audience you want to know things, you analyse”. Simon referred to “the inner game of tennis” as another example: how it is possible to stop appreciating playing tennis because you become too occupied with analysing the game – thinking of the previous shot rather than clearing the mind ready for the next. Tone pointed at the individual differences between performers, even within the same genre (like harmonic jazz improvisation) – some are very analytic, also in the moment of performing, while others are not. This also goes for the various groups of audiences: some are analytic, some are not – and there is most likely a continuum between the analytic and the intuitive musician/audience. Øyvind mentioned experiences from presenting the crossadaptive project to several audiences over the last few months. One of the issues he would usually present is that it can be hard for the audience to follow the crossadaptive transformations, since it is an unfamiliar mode of musical (ex)change. However, some of the simpler examples he then played (e.g. amplitude controlling reverb size and delay feedback) yielded the response that they were not hard to follow. One of the places where this happened was Santa Barbara, where Curtis Roads commented that he thought it quite simple and straightforward to follow. Then again, in the following discussion, Roads also conceded that it was simple because the mapping between analysis parameter and modulated effect was known. Most likely it would be much harder to deduce what the connection was by listening alone, since the connection (mapping) can be anything. Cross-adaptive processing may be a complicated situation, not easy to analyse either for the audience or the performer. Øyvind pointed towards differences in parameters, as we had also discussed collectively: some are more “natural” than others, like the connection between amplitude and effect, while others are more abstract, like the balance between noise and tone, or regular/irregular rhythms.

Making ecological sense/playing with expectations

Simon pointed out that we have a long history of connections to sound: some connections are musically intuitive because we have used them perhaps for thousands of years; they make ‘ecological’ sense to us. He referred to Eric Clarke's “Ways of Listening: An Ecological Approach to the Perception of Musical Meaning” (2005) and William Gaver's “What in the World Do We Hear?: An Ecological Approach to Auditory Event Perception” (1993). We come with expectations towards the world, and one way of making art is playing with those expectations. In modernist thinking there is a tendency to think of all musical parameters as equal – or at least equally organised – which may easily undermine their “ecological validity” – although that need not stop the creation of ‘good music’ in creative hands.

Complexity and connections

So, if the need for conscious analysis and understanding varies between musicians, is this the same for the experienced connection between input and output? And what about the difference between playing and listening as part of the process, or just listening, either as a musician, musicologist, or an audience member? For Tone and Carl Haakon it seemed like a shared experience that playing with regularity/non-regularity felt intuitive for both – while this was actually hard for Øyvind to believe, because he knew the current weakness in how he had implemented the analysis. Parts of the rhythmic analysis methods implemented are very noisy, meaning they produce results that can sometimes have significant (even huge) errors in relation to a human interpretation of the rhythms being analysed. The fact that the musicians still experienced the analyses as responding intuitively is interesting, and it could be connected to something Mats said later on: “the musicians listen in another way, because they have a direct contact with what is really happening”. So, perhaps, while Tone & CH experienced that some things really made musical sense, Øyvind focused on what didn't work – which would be easier for him to hear? So how do we understand this, and how is the analysis connected to the sounding result? Andreas pointed out that there is a difference between hearing and analysing: you can learn how the sound behaves and work with that. It might still be difficult to predict exactly what will happen. Tone's comment here was that you can relate to a certain unpredictability and still have a sort of control over some larger “groups of possible sound results” that you can relate to as a musician. There is not only an urge to “make sense” (= to understand and “know” the connection) but also an urge to make “aesthetic sense”.

With regard to the experienced complexity, Øyvind also commented that the analysis of a real musical signal is in many ways a gross simplification, and by trying to make sense of the simplification we might actually experience it as more complex. The natural multidimensionality of the experienced sound is lost, due to the singular focus on one extracted feature. We are reinterpreting the sound as something simpler. An example mentioned was vibrato, which is a complex input and a complex analysis that could in some analyses be reduced to a simple “more or less” dimension. This issue also relates to the project's need to construct new methods of analysis, so that we can try to find analysis dimensions that correspond to perceptual or experiential features.

Andreas commented: “It is quite difficult to really know what is going on without having knowledge of the system and the processes. Even simple mappings can be difficult to grasp only by ear”. Trond reminded us after the meeting about a further complexity that was perhaps not so present in our discussion: we do not improvise with only one parameter “out of control” (adaptive processing). In the cross-adaptive situation someone else is processing our own instrument, so we do not have full control over this output, and at the same time we do not have any control over what we are processing, the input (cross-adaptive processing). In both cases this could represent an alienation and perhaps a disconnection from the input–result relation. And of course the experience of control is also connected to “understanding” the processing and analysis you are working with.

The process of interplay:

Øyvind referred to Tone's experience of a musical “need” during the Trondheim session, expressed as “I hope he understands…”, when she talked about the processes in the interplay. This points at how you realise during the interplay that you have very clear musical expectations and wishes towards the other performer. This is not in principle different from a lot of other musical improvising situations. Still, because you are dependent on the other's response in a way that defines not only the whole, but your own part in it, this thought seemed to be more present than usual in this type of interplay.

Tools and setup

Mats commented that many of the effects used are about room size, and that he felt this had some – to him – unwanted aesthetic consequences. Øyvind responded that he wanted to start with effects that are easy to control and easy to hear the control of; delay feedback and reverb size are such effects. Mats also suggested that it is an important aesthetic choice not to have effects all the time, and thereby have the possibility to hear the instrument itself. So to what extent should you be able to choose? We discussed the practical possibilities here: some of the musicians (for example Bjørnar Habbestad) have suggested a foot pedal, where the musician could control the degree to which their actions will inflict changes on the other musician's sound (or, the other way around, control the degree to which other forces can affect their own sound). Trond suggested one could also have control over the output level of the effects, adjusting the balance between processed and unprocessed sound. As Øyvind commented, these types of control could be a pedagogical tool for rehearsing with the effect, turning the processing on and off to understand the mapping better. The tools of course partly define the musician's balance between control, predictability and alienation. Connected to this, we had a short discussion about amplified sound in general: the instrumental sound coming from a speaker located elsewhere in the room could in itself already represent an alienation. Simon referred to the Lawrence Casserley/Evan Parker principle of “each performer's own processor”, and the situation before the age of the big PA, where the electronic sound could be localised to each musician's individual output. We discussed possibilities and difficulties with this in a cross-adaptive setting: which signal should come out of your speaker? The processed sound of the other, or the result of the other processing you? Or both? And then what would the function be – the placement of the sound is already disturbed.

Rhythm

New in this session was the use of the rhythmical analysis. This is very different from all other parameters we have implemented so far. Other analyses relate to the immediate sonic character, but rhythmic analysis tries to extract temporal features, patterns and behaviours. Since much of the music played in this project is not based on a steady pulse, and even less confined to a regular grid (meter), the traditional methods of rhythmic analysis are not appropriate. Traditionally one will find the basic pulse, then deduce some form of meter based on the activity, and after this is done one can relate further activity to this pulse and meter. In our rhythmical analysis methods we have tried to avoid the need to first determine pulse and meter, and have instead looked into the immediate time relationships between neighbouring events. This gives much less support for any hypothesis the analyser might have about the rhythmical activity, but also allows much greater freedom of variation (stylistically, musically) in the input. Øyvind is really not satisfied with the current status of the rhythmic analysis (even if he is the one mainly responsible for the design), but he was eager to hear how it worked when used by Tone and Carl Haakon. It seems that live use by real musicians allowed the weaknesses of the analyses to be somewhat covered up. The musicians reported that they felt the system responded quite well (and predictably) to their playing. This indicates that, even if refinements are much needed, the current approach is probably a useful one. One thing we can say for sure is that some sort of rhythmical analysis is an interesting area for further exploration, and that it can encode some perceptual and experiential features of the musical signal in ways that make sense to the performers. And if it makes sense to the performers, we might guess that it also has the possibility of making sense to the listener.
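As an illustration of the grid-free approach described above – the actual Analyzer implementation differs, and the function and formula here are purely illustrative assumptions – a minimal Python sketch of an irregularity measure based only on neighbouring inter-onset intervals could look like this:

```python
# Illustrative sketch only (not the project's actual analysis): estimate rhythmic
# irregularity directly from the relationships between neighbouring inter-onset
# intervals (IOIs), without first deducing a pulse or a meter.

def ioi_irregularity(onset_times):
    """Return a 0..1 irregularity estimate from a list of onset times (seconds)."""
    iois = [b - a for a, b in zip(onset_times, onset_times[1:])]
    if len(iois) < 2:
        return 0.0
    deviations = []
    for a, b in zip(iois, iois[1:]):
        lo, hi = sorted((a, b))
        deviations.append(1.0 - lo / hi if hi > 0 else 0.0)  # 0 when neighbouring IOIs are equal
    return sum(deviations) / len(deviations)

print(ioi_irregularity([0.0, 0.5, 1.0, 1.5, 2.0]))   # steady pulse -> 0.0
print(ioi_irregularity([0.0, 0.2, 0.9, 1.0, 1.8]))   # irregular -> clearly higher
```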

Andreas: How do you define regularity (e.g. clave-based musics)? How much “less regular” is that than a steady beat?

Simon: If you ask a difficult question with a range of possible answers this will be difficult to implement within the project.

As a follow-up to the refinement of rhythmic analysis, Øyvind asked: how would *you* analyse rhythm?

Simon: I wouldn't analyse rhythm – take for example the timeline in African music: a guiding pulse that is not necessarily performed and may exist only in the performer's head. (This relates directly to Andreas's next point – Simon later withdrew the idea that he would not analyse rhythm and acknowledged its usefulness in performance practice.)

Andreas: Rhythm is a very complex phenomenon, which involves multiple interconnected temporal levels, often hierarchically organised. Perceptually, we have many ongoing processes involving present, past and anticipations about future events. It might be difficult to emulate such processes in software analysis. Perhaps pattern recognition algorithms can be good for analysing rhythmical features?

Mats: What is rhythm? In our examples, gesture may be more useful than rhythm.

Øyvind: Rhythm is repeatability, perhaps? Maybe we interpret this in the second after.

Simon: No, I think we interpret virtually at the same time.

Tone: I think of it as a bodily experience first and foremost. (Thinking about this in retrospect, Tone adds: The impulses – when they are experienced as related to each other – create a movement in the body. I register that through the years (working with non-metric rhythms in free improvisation) there is less need for a periodic set of impulses to start this movement. (But when I look at babies and toddlers, I recognise this bodily reaction to irregular impulses.) I recognise what Andreas says – the movement is triggered by the expectation of more to come. (Think about the difference in your body when you wait for the next impulse to come (anticipation) and when you know that it is over….))

Andreas: When you listen to rhythms, you group events on different levels and relate them to what was already played. The grouping is often referred to as «chunking» (psychology). Thus, it works both on an immediate level (now) and on a more overarching level (bar, subsection, section), because we have to relate what we hear to what came earlier. You can simplify or make it complex.

Oslo, First Session, October 18, 2016

First Oslo Session. Documentation of process
18.11.2016

Participants
Gyrid Kaldestad, vocal
Bernt Isak Wærstad, guitar
Bjørnar Habbestad, flute

Observer and Video
Mats Claesson

The session took place in one of the sound studios at the Norwegian Academy of Music, Oslo, Norway.

Gyrid Kaldestad (vocal) and Bernt Isak Wærstad (guitar) had one technical/setup meeting beforehand, and numerous emails about technical issues went back and forth before the session.
Bjørnar Habbestad (flute) was invited into the session.

The observer decided to make a video documentation of the session.
I'm glad I did, because I think it gives a good insight into the process. And a process it was!
The whole session lasted almost 8 hours, and it was not until the very last 30 minutes that playing started.

I (Mats Claesson) am not going to comment on the performative musical side of the session. The only reason for this is that the music making happened at the very end of the session, was very short, and was not recorded, so I could not evaluate it “in depth”. However, just watch the comments from the participants at the end of the video. They are very positive…
I think that from the musicians' side it was rewarding and highly interesting. I am confident that the next session will generate a musical outcome that is substantial enough to be commented on, from both a performative and a musical side.

In the video there is no processed sound from the very last playing, due to the use of headphones, but you can listen to excerpts posted below the video.

Here is a link to the video

Reflections on the process given from the perspective of the musicians:

We agreed to make a limited setup to have better control over the processing, starting with basic sounds and basic processing tools so that we could more easily control the system in a musical way. We started with a tuning analysis for each instrument (voice, flute, guitar).

Instead of choosing analysis parameters up front, we analysed different playing techniques, e.g. non-tonal sounds (sss, shhh), multiphonics etc., and saw how the analyser responded. We also recorded short samples of the different techniques that each of us usually plays, so that we could investigate the analysis several times.

These are the analysis results we got:

analysis

Since we’re all musicians experienced with live processing, we made a setup based on effects that we already know well and use in our live-electronic setup (reverb, filter, compression, ring modulation and distortion).

To set up meaningful mappings, we chose an approach that we called “spectral ducking”, where a certain musical feature on one instrument would reduce the same musical feature on the other – e.g. a sustained tonal sound produced by the vocalist would reduce the harmonic musical features of the flute by applying ring modulation. Here is a complete list of the mappings used:

mapping
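To make the “spectral ducking” idea above more concrete, here is a minimal sketch in Python; the function name, the 0..1 “tonalness” scale and the clamping are illustrative assumptions, not the mappings actually used in the session:

```python
# Minimal sketch of "spectral ducking": a musical feature on one instrument
# reduces the corresponding feature on the other. The function name, the
# 0..1 "tonalness" scale and the effect range are illustrative assumptions.

def duck_flute_ringmod(vocal_tonalness, max_ringmod_mix=1.0):
    """The more sustained/tonal the vocal (0..1), the more ring modulation is
    applied to the flute, reducing its harmonic character."""
    tonalness = max(0.0, min(1.0, vocal_tonalness))
    return max_ringmod_mix * tonalness

for tonalness in (0.1, 0.5, 0.9):   # from a noisy, breathy vocal to a sustained sung tone
    print(tonalness, "->", round(duck_flute_ringmod(tonalness), 2))
```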

Excerpt #1 – Vocal and flute

Excerpt #2 – Vocal and flute

Excerpt #3 – Vocal and flute

Excerpt #4 – Vocal and flute

Due to the lack of conclusive and precise analysis results from the guitar, in combination with time limitations, it wasn't possible to set up mappings for the guitar and flute. We did, however, test the guitar and flute in the last minutes of the session, where the guitar simply took the role of the vocal in terms of processing and mapping. Knowledge of the vocal analysis and mapping made it possible to perform with the same setup even though the input instrument had changed. Some short excerpts from this performance can be heard below.

Excerpt #5 – Guitar and flute

Excerpt #6 – Guitar and flute

Excerpt #7 – Guitar and flute

Reflections and comments:

  • We experienced the importance of exploring new tools like this on a known system. Since none of us knew Reaper from before, we spent quite a lot of time learning a new system (both while preparing and during the session)
  • Could the analyser's meters be turned the other way around? They are a bit difficult to read sideways.
  • It would be nice to be able to record and export control data from the analyser tool, which would make it possible to use the data later in a synthesis.
  • Could it be an idea to have more analyser sources per channel? The Keith McMillen SoftStep mapping software could possibly be something to look at for inspiration.
  • The output is surprisingly musical – maybe this is a result of all the discussions and reflections we did before we did the setup and before we played?
  • The outcome is something else than playing with live electronics – it is immediate and you can actually focus on the listening – very liberating from a live electronics point of view!
  • The system is merging the different sounds in a very elegant way.
  • Knowing that you have an influence on your fellow musician's output forces you to think in new ways when working with live electronics.
  • Our experience is that this is similar to working acoustically.

Brief system overview and evaluation

As preparation for upcoming discussions about technical needs in the project, it seems appropriate to briefly describe the current status of the software developed so far.

analyzer_2016_10
The Analyzer

The plugins

The two main plugins developed are the Analyzer and the MIDIator. The Analyzer extracts perceptual features from a live audio signal and transmits signals representing these features over a network protocol (OSC) to the MIDIator. The job of the MIDIator is to combine different analyzed features (scaling, shaping, mixing, gating) into a controller signal that we ultimately use to control some effect parameter. The MIDIator can run on a different track in the same DAW, in another DAW, or on another computer entirely.
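As a rough illustration of this signal flow – not the plugins' actual OSC namespace; the port, address and feature names below are assumptions – a minimal Python sketch using the python-osc package could look like this:

```python
# Illustrative sketch of the Analyzer -> MIDIator link over OSC (Python,
# using the python-osc package). The port, OSC address and feature names
# are assumptions for illustration, not the plugins' actual namespace.
from pythonosc.udp_client import SimpleUDPClient
from pythonosc.dispatcher import Dispatcher
from pythonosc.osc_server import BlockingOSCUDPServer

# "Analyzer" side: transmit one frame of extracted features (e.g. rms, transient density)
client = SimpleUDPClient("127.0.0.1", 9000)
client.send_message("/analyzer/track1", [0.42, 0.77])

# "MIDIator" side: receive the features and combine them into a MIDI controller value
def on_features(address, rms, density):
    cc_value = int(127 * max(0.0, min(1.0, 0.5 * rms + 0.5 * density)))
    print(address, "->", cc_value)

dispatcher = Dispatcher()
dispatcher.map("/analyzer/track1", on_features)
server = BlockingOSCUDPServer(("127.0.0.1", 9000), dispatcher)
# server.serve_forever()  # blocking; in practice the receiver runs in its own process
```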

Strong points

The feature extraction generally works reasonably well for the signals it has been tested on. Since only a limited set of signals is readily available during implementation, some overfitting to these signals can be expected. Still, a large set of features is extracted, and these have been selected and tweaked for use as intentional musical controllers. This can sometimes differ from the more purely mathematical and analytical descriptions of a signal. The quality of our feature extraction can best be measured by how well a musician can utilize it to intentionally control the output. No quantitative measurement of that sort has been done so far. The MIDIator contains a selection of methods to shape and filter the signals, and to combine them in different ways. Until recently, the only way to combine signals (features) was by adding them together. As of the past two weeks, mix methods for absolute difference, gating, and sample/hold have been added.

midiator_modules_2016_10
MIDIator modules

Weak points

The signal transmission from the Analyzer to the MIDIator, and then again from the MIDIator to the control signal destination, each incurs at least one sample block of latency. The size of a sample block can vary from system to system, but regardless of the size used, our system will have three times this latency before an effect parameter value changes in response to a change in the audio input. For many types of parameter changes this is not critical, but it is still a notable limitation of the system.
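To get a feeling for the magnitude, a quick back-of-the-envelope calculation (block size and sample rate are example values, not properties of the system):

```python
# Rough estimate of the accumulated block latency described above.
# Block size and sample rate are example values, not properties of the system.
sample_rate = 44100     # Hz
block_size = 256        # samples per block (host dependent)
blocks_of_latency = 3   # DAW -> Analyzer, Analyzer -> MIDIator, MIDIator -> destination

latency_ms = blocks_of_latency * block_size / sample_rate * 1000
print(f"{latency_ms:.1f} ms")   # ~17.4 ms here; ~69.7 ms with 1024-sample blocks
```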

The signal transmission latency points at another general problem: interfacing between technologies. Each time we transfer signals from one paradigm to another, we have the potential for degraded performance, less stability and/or added latency. In our system, the interface from the DAW to our plugins incurs a sample block of latency, and the interface between Csound and Python can sometimes incur performance penalties if large chunks of data need to be transmitted from one to the other. Likewise, the communication between the Analyzer and the MIDIator is such an interface.

Some (many) of the feature extraction methods produce somewhat noisy signals. By noise, we mean here that the analyzer output can intermittently deviate from the value we perceptually assume to be “correct”. We can also look at this deviation statistically, by feeding the analyzer relatively (perceptually) consistent signals and looking at how stable the output of each feature extraction method is. Many of the features show activity generally in the right register, and a statistical average of the output corresponds with general perceptual features. While the average values are good, we will oftentimes see spurious values with relatively high deviation from the general trend. From this, we can assume that the feature extraction model generally works, but intermittently fails. Sometimes filtering is used as an inherent part of the analysis method, and in all cases the MIDIator has a moving exponential average filter with separate rise and fall times. Filtering can be used to cover up the problem, but better analysis methods would give us more precise and faster response from the system.
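A minimal sketch of the kind of smoothing filter mentioned above, with separate rise and fall behaviour; the smoothing factors are illustrative assumptions (the actual plugin exposes rise/fall times rather than raw factors):

```python
# Sketch of a smoothing filter with separate rise and fall behaviour, in the
# spirit of the MIDIator's exponential average filter. The smoothing factors
# are illustrative assumptions (the plugin exposes rise/fall times instead).
class RiseFallFilter:
    def __init__(self, rise=0.5, fall=0.05):
        self.rise = rise      # factor used when the input moves upwards (fast attack)
        self.fall = fall      # factor used when the input moves downwards (slow release)
        self.value = 0.0

    def process(self, x):
        coeff = self.rise if x > self.value else self.fall
        self.value += coeff * (x - self.value)
        return self.value

f = RiseFallFilter()
noisy = [0.1, 0.9, 0.15, 0.85, 0.2]                 # intermittent spurious jumps
print([round(f.process(x), 2) for x in noisy])      # smoothed controller signal
```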

Audio separation between instruments can sometimes be poor. In the studio we can isolate each musician, but if we want them to be able to play together naturally in the same room, significant bleed from one instrument to the other will occur. For live performance this situation is obviously even worse. The bleed gives rise to two kinds of problems: signal analysis is disturbed, and signal processing is cluttered. For the analysis, even perfect analysis methods would not help if the signal to be analyzed is a messy combination of opposing perceptual dimensions. For the effect processing, controlling an effect parameter for one instrument leads to a change in the processing of the other instrument, simply because the other instrument's sound bleeds into the first instrument's microphones.

Useful parameters (features extracted)

In many of the sessions up until now, the most used features have been amplitude (rms) and transient density. One reason for this is probably that they are conceptually easy to understand; another is that their output is relatively stable and predictable in relation to the perceptual quality of the sound analyzed. Here are some suggestions of other parameters that can be expected to work effectively in the current implementation:

  • envelope crest (env_crest): the peakiness of the amplitude envelope; for sustained sounds this will be low, for percussive onsets with silence between events it will be high
  • envelope dynamic range (env_dyn): goes low for signals operating at a stable dynamic level, high for signals with a high degree of dynamic variation
  • pitch: well known
  • spectral crest (s_crest): goes low for tonal sounds, medium for pressed tones, high for noisy sounds
  • spectral flux (s_flux): goes high for noisy sounds, low for tonal sounds
  • mfccdiff: measure of tension or pressedness, described here

There is also another group of extracted features that are potentially useful but still have some stability issues:

  • rhythmic consonance (rhythm_cons) and rhythmic irregularity (rhythm_irreg): described here
  • rhythm autocorr crest (ra_crest) and rhythm autocorr flux (ra_flux): described here

The rest of the extracted features can be considered more experimental; in some cases they might still yield effective controllers, especially when combined with other features in reasonable proportions.

MIDIator mix methods

The first instances of the MIDIator could sum two analysis signals, with separate scaling and sign inversion for each one. Recently we've added two new methods for mixing those two signals, which warrants this post explaining how they work and which problems they are intended to solve.

For all mix methods there is a MIDI output configuration to the right. All modules output MIDI controller values. We can set the MIDI channel and controller number, enable/disable the module, and also enter notes about the mapping/use of the module. Notes will be saved with the project.

Do take care, however: for your own records, take a screenshot of your settings and save it with your project. The plugins can and will change during the further development of the project. If this leads to changes in the GUI configuration (i.e. the number of user interface elements), there is a high probability that not everything will be recalled correctly. In that case you must reconstruct your settings from the (previously saved) screenshot.

Add

Each of the two signals has separate filtering, with separate settings for the rise and fall times. The two signals are scaled (with the scale range being from -1 to +1) and then added together. If both signals are scaled positively, each of them affects the output positively. If one is scaled negatively and the other positively, more complex interactions between them will form. For example, with rms (amplitude) scaled negatively while transient density is scaled positively, the output will increase with high transient density, but only if we are not playing loud.

midiator_add
Example of the “add” mix method. Higher amplitude (rms) will decrease the output, while higher transient density will increase the output. This means that soft fast playing will produce the highest possible output with these settings.
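A minimal sketch of the add mix method as described above (the names, the clamping and the MIDI scaling are illustrative assumptions, not the plugin's exact internals):

```python
# Sketch of the "add" mix method: two (already filtered) analysis signals,
# each scaled in the range -1..+1, are summed and mapped to a MIDI controller
# value. Names, the clamping and the MIDI scaling are illustrative assumptions.
def mix_add(sig_a, sig_b, scale_a, scale_b):
    """sig_a, sig_b: normalised analysis signals (0..1); scales: -1..+1."""
    mixed = scale_a * sig_a + scale_b * sig_b
    mixed = max(0.0, min(1.0, mixed))        # clamp to the usable controller range
    return int(round(127 * mixed))           # MIDI controller value

# Reproducing the caption's example: rms scaled negatively, transient density
# positively, so soft but fast playing gives the highest output.
print(mix_add(sig_a=0.1, sig_b=0.9, scale_a=-1.0, scale_b=1.0))   # soft + dense -> 102
print(mix_add(sig_a=0.9, sig_b=0.9, scale_a=-1.0, scale_b=1.0))   # loud + dense -> 0
```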

Abs_diff

This somewhat cryptic term refers to the absolute difference between two signals. We can use it to create an interaction model between two signals, where the output goes high only if the two analysis signals are very different. For example, if we analyze amplitude (rms) from two different musicians, the resulting signal will be low as long as they both play in the same dynamic register. If one plays loud while the other plays soft, the output will be high, regardless of which of the two plays loud. It could of course also be applied to two different analysis signals from the same musician, for example the difference between pitch and spectral centroid.

midiator_absdiff
Example of the abs_diff mix method. Here we take the difference in amplitude between two different acoustic sources. The higher the difference, the higher the output, regardless of which of the two inputs is loudest.
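The corresponding sketch for abs_diff (same illustrative assumptions as above):

```python
# Sketch of the abs_diff mix method: the output goes high only when the two
# analysis signals differ, regardless of which one is larger.
def mix_abs_diff(sig_a, sig_b):
    """sig_a, sig_b: normalised analysis signals (0..1)."""
    return int(round(127 * abs(sig_a - sig_b)))

print(mix_abs_diff(0.7, 0.7))   # both play at the same dynamic level -> 0
print(mix_abs_diff(0.9, 0.2))   # one loud, one soft -> 89
print(mix_abs_diff(0.2, 0.9))   # same, with roles swapped -> 89
```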

Gate

The gate mix method can be used to turn things on or off. It can also be used to enable/disable the processing of another MIDIator module, effectively acting as a sample and hold gate. The two input channels are now used for different purposes: one channel turns the gate on, the other channel turns it off. Each channel has a separate activation threshold (and a selection of whether the signal must pass the threshold moving upwards or downwards to activate). For simple purposes, this can act like a Schmitt trigger, also termed hysteresis in some applications. This can be used to reduce jitter noise in the output, since the activation and deactivation thresholds can be different.

midiator_gate_same
Gate mix method used to create a Schmitt trigger. The input signal must go higher than the activation threshold to turn on. Then it will stay on until the input signal crosses the lower deactivation threshold.
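A minimal sketch of the gate used as a Schmitt trigger (thresholds and names are illustrative; the actual plugin additionally lets you choose whether each threshold responds to an upward or downward crossing):

```python
# Sketch of the gate mix method used as a Schmitt trigger: separate activation
# and deactivation thresholds give hysteresis, so jitter around a single
# threshold does not make the gate flap on and off. Names are illustrative,
# and the threshold-crossing directions are fixed here for simplicity.
class Gate:
    def __init__(self, on_threshold, off_threshold):
        self.on_threshold = on_threshold
        self.off_threshold = off_threshold
        self.active = False

    def process(self, on_signal, off_signal=None):
        off_signal = on_signal if off_signal is None else off_signal
        if not self.active and on_signal > self.on_threshold:
            self.active = True       # crossed the activation threshold upwards
        elif self.active and off_signal < self.off_threshold:
            self.active = False      # crossed the deactivation threshold downwards
        return self.active

gate = Gate(on_threshold=0.6, off_threshold=0.3)   # on above 0.6, stays on until below 0.3
print([gate.process(x) for x in [0.2, 0.7, 0.5, 0.35, 0.25, 0.4]])
# -> [False, True, True, True, False, False]
```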

It can also be used to create more untraditional gates. A simple variation lets us create a gate that is activated only if the input signal is within a specified band. To do this, the activation threshold must be lower than the deactivation threshold, like this:

midiator_gate_same_band
Band-activated gate. The gate will be activated once the signal crosses the (low) activation threshold. Then it will be turned off once the signal crosses the (higher) deactivation threshold in an upward direction. To activate again, the signal must go lower than the activation threshold.

The up/down triggers can be adjusted to fine tune how the gate responds to the input signal. For example, looking at the band-activated gate above: If we change the deactivation trigger to “down”, then the gate will only turn off after the signal has been higher than the deactivation threshold and then is moving downwards .

So far, we've only looked at examples where the two input signals to the gate are the same signal. Since the two input signals can be different (they can even come from two different acoustic sources), highly intricate gate behaviour can be constructed. Even though the conception of such signal-interdependent gates can be complex (inventing which signals could interact in a meaningful way), the actual operation of the gate is technically no different. Just for the sake of the example, here's a gate that will turn on if the transient density goes high, then turn off when the pitch goes high. To activate the gate again, the transient density must first go low, then high.

midiator_gate_different
Gate with different activation and deactivation signals. Transient density will activate the gate, while pitch will deactivate it.

Sample and hold:

The gate mix method can also affect the operation of another MIDIator module. This is currently hardcoded, so that it will only affect the next module (the one right below the gate). This means that, when the gate is on, the next MIDIator module will work as normal, but when the gate is turned off, the module will retain the value it had reached at the moment the gate was turned off. In traditional signal processing terms: sample and hold. To enable this function, turn on the button labeled “s/h”.

midiator_gate_sh
The topmost of these two modules acts as a sample and hold gate for the lower module. The lower module maps amplitude to positively affect the output value, but it is only updated when the topmost gate is activated. The situation in the figure shows the gate enabled.
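A minimal sketch of the sample-and-hold behaviour described above (illustrative, not the plugin code):

```python
# Sketch of the sample-and-hold behaviour: while the gate is on, the downstream
# module's output follows its input; when the gate turns off, the last value is held.
class SampleHold:
    def __init__(self):
        self.held = 0.0

    def process(self, value, gate_on):
        if gate_on:
            self.held = value   # follow the input while the gate is open
        return self.held        # otherwise keep returning the frozen value

sh = SampleHold()
inputs = [0.1, 0.4, 0.8, 0.3, 0.6]
gates  = [True, True, False, False, True]
print([round(sh.process(v, g), 2) for v, g in zip(inputs, gates)])
# -> [0.1, 0.4, 0.4, 0.4, 0.6]
```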

Evolving Neural Networks for Cross-adaptive Audio Effects

I’m Iver Jordal and this is my first blog post here. I have studied music technology for approximately two years and computer science for almost five years. During the last 6 months I’ve been working on a specialization project which combines cross-adaptive audio effects and artificial intelligence methods. Øyvind Brandtsegg and Gunnar Tufte were my supervisors.

A significant part of the project has been about developing software that automatically finds interesting mappings (neural networks) from audio features to effect parameters. One thing that the software is capable of is making one sound similar to another sound by means of cross-adaptive audio effects. For example, it can process white noise so it sounds like a drum loop.

Drum loop (target sound):

White noise (input sound to be processed):

Since the software uses algorithms that are based on random processes to achieve its goal, the output varies from run to run. Here are three different output sounds:

These three sounds are basically white noise that has been processed by distortion and a low-pass filter. The effect parameters were controlled dynamically in a way that made the output sound like the drum loop (target sound).
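As a highly simplified sketch of the underlying idea – a neural network mapping analysis features to effect parameters frame by frame – the following Python fragment uses a fixed random network; the real system evolves both topology and weights with NEAT, and all names and dimensions here are illustrative:

```python
# Highly simplified illustration: a small neural network maps analysis features
# to effect parameters, frame by frame. The real system evolves the network
# topology and weights with NEAT; here the weights are just random.
import numpy as np

rng = np.random.default_rng(0)
n_features, n_hidden, n_params = 4, 6, 2       # e.g. 2 params: distortion drive, filter cutoff

W1 = rng.normal(size=(n_hidden, n_features))
W2 = rng.normal(size=(n_params, n_hidden))

def map_features_to_params(features):
    """features: analysis values for one frame (0..1); returns effect parameters (0..1)."""
    hidden = np.tanh(W1 @ features)
    return 1.0 / (1.0 + np.exp(-(W2 @ hidden)))  # sigmoid keeps parameters in 0..1

frame = np.array([0.3, 0.8, 0.1, 0.5])           # e.g. rms, spectral centroid, flux, pitch
print(map_features_to_params(frame))
```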

This software that I developed is open source, and can be obtained here:

https://github.com/iver56/cross-adaptive-audio

It includes an interactive tool that visualizes output data and lets you listen to the resulting sounds. It looks like this:

visualization-screenshot
For more details about the project and the inner workings of the software, check out the project report:

Evolving Artificial Neural Networks for Cross-adaptive Audio (PDF, 2.5 MB)

Abstract:

Cross-adaptive audio effects have many applications within music technology, including for automatic mixing and live music. The common methods of signal analysis capture the acoustical and mathematical features of the signal well, but struggle to capture the musical meaning. Together with the vast number of possible signal interactions, this makes manual exploration of signal mappings difficult and tedious. This project investigates Artificial Intelligence (AI) methods for finding useful signal interactions in cross-adaptive audio effects. A system for doing signal interaction experiments and evaluating their results has been implemented. Since the system produces lots of output data in various forms, a significant part of the project has been about developing an interactive visualization tool which makes it easier to evaluate results and understand what the system is doing. The overall goal of the system is to make one sound similar to another by applying audio effects. The parameters of the audio effects are controlled dynamically by the features of the other sound. The features are mapped to parameters by using evolved neural networks. NeuroEvolution of Augmenting Topologies (NEAT) is used for evolving neural networks that have the desired behavior. Several ways to measure fitness of a neural network have been developed and tested. Experiments show that a hybrid approach that combines local euclidean distance and Nondominated Sorting Genetic Algorithm II (NSGA-II) works well. In experiments with many features for neural input, Feature Selective NeuroEvolution of Augmenting Topologies (FS-NEAT) yields better results than NEAT.