Skype on philosophical implications

Skype with Solveig Bøe, Simon Emmerson and Øyvind Brandtsegg.

Our intention for the conversations was to start sketching out some of the philosophical implications of our project, partly as a means to understand what we are doing and what it actually means, and partly as a means of informing our further work and choosing appropriate directions and approaches. Before we got to these overarching issues, we also discussed some implications of the very here and now of the project. We come back to the overarching issues in the latter part of this post.


Initially, we commented on rhythm analysis not based on the assumption of an underlying constant meter or pulse. Work of Ken Fields was mentioned, on internet performance and the nature of time. Also John Young’s writing on form and phrase length (in Trevor Wishart’s music); Ambrose Seddon’s chapter on time structures in Emmerson/Landy’s recent book “Expanding the Horizon of Electroacoustic Music Analysis”. In live performance there is also George Lewis’ rhythm analysis (or event analysis) in his interactive pieces (notably Voyager). See

Score as guide

In the context of Oeyvind being at UCSD, where one of their very strong fields of competence is contemporary music performance, he thought loosely about a possible parallel between a composed score and the cross-adaptive performance setting. One could view the situation designed for/around the performers in a cross-adaptive setting as a composition, in that it sets a direction for the interplay and thus also poses certain limits (or obstacles/challenges) to what the performer can and cannot do. On the possible analogy between the traditional score and our crossadaptive performance scenario, one could flag the objection that a classical performer does not so much just follow the do’s and don’ts described in a score, but rather uses the score as a means to engage with the composer’s intention. This may or may not apply to the musics created in our project, as we strive not to limit ourselves to specific musical styles, while still leaning solidly towards quite freely improvised expression in a somewhat electroacoustic timbral environment. In some of our earlier studio sessions, we noted that there are two competing modes of attention:

  • To control an effect parameter with one’s audio signal, or
  • to respond musically to the situation

… meaning that what the performer chooses to play would traditionally be based on an intended musical statement contributing to the current musical situation, whereas now she might rather play something that (via its analyzed features) will modulate the sound of another instrument in the ensemble. This new something, being a controller signal rather than a self-contained musical statement, might actually express something of its own, in contradiction to the intended (and presumably achieved) effect it has on the sound of another instrument. On top of this, consider that some performers (Simon was quoting the as yet unpublished work of postgraduate and professional bass clarinet performer Marij van Gorkom) experience the use of electronics as perhaps disturbing the model of engagement with the composer’s intention through the written score. This is something that one needs to adjust to and accommodate on top of the other layers of mediation. We might then consider our crossadaptive experiments as containing the span of pain to love in one system.

…and then to philosophy

A possible perspective suggested by Oeyvind was a line connecting Kant-Heidegger-Stiegler, possibly starting offside but nevertheless somewhere to start. The connection to Kant is his notion of the noumenal world (where the things in themselves are) and the phenomenal world (which is where we can experience and to some extent manipulate the phenomena). The line continues to Heidegger with his thoughts on the essence of technology. Rather than viewing technology as a means to an end (instrumental), or as a human activity (anthropological), he refers to a bringing-forth, or a bringing out of concealment into unconcealment. Heidegger then sees technology as a threat to this bringing-forth (of truth), bypassing or denying active human participation (with the real events, i.e. the revealing of truth). He also suggests that art can be a way of navigating this paradox and actively sidestepping the potential threat of technology. Then on to Stiegler, with his view of technics as organized inorganic matter, to some degree with an immanent drive or dynamic of its own. This possibly constitutes an impediment to socialization, individuation and intersubjectivization. In hindsight (after today’s discussion), this is perhaps a rather gray, or bleak, approach to the issues of technology in the arts. In the context of our project, it was intended to portray the difficulty of controlling something with something else, the reaching into another dimension while acting from the present one.

Solveig countered with a more phenomenological approach, bringing in Merleau-Ponty’s perspectives on technology as an extension of the body, on communication and communion, and on learning and experiencing together, acting on a level below language. On this level the perceptive and the corporeal, including extensions that are incorporated into the body schema, are intertwined and form a unity. This seems like a path for further investigation. One could easily say that we are learning to play a new instrument, or learning to navigate a new musical environment, so our current endeavors must be understood as baby steps in this process (as also reflected on earlier), and we are this intertwined unity taking form. In this context, Merleau-Ponty’s ideas about the learning of language as something that takes place by learning a play of roles, where understanding how to change between roles is an essential part of the learning process, also come into view as appropriate.

Simon also reported on another research student project: cellist Audrey Riley was a long-time member of the Merce Cunningham ensemble and is engaged in examining a (post-Cage) view of performance – including ideas of ‘clearing the mind’ of thought and language – as they might ‘get in the way’.  This relates to the idea of ‘communion’ where there is a sense of wholeness and completeness.  This also leads us to Wittgenstein’s ideas that both in art and philosophy there is something that is unutterable, that may just be demonstrated.

How does this affect what we do?

We concluded our meeting with a short discussion on how the philosophical aspects can be brought into play to aid our further work in the project. Is it just high-and-wide rambling, or do we actually make use of it? Setting up some dimensions of tension, or oppositions (which may sometimes not be oppositions but rather different perspectives on the same thing), may be of help in practically approaching the problems we face. Take the assumed opposition of communion versus disturbance: do the crossadaptive interventions into the communication between performers act as a disturbance, or can they be a way of enabling communion? Can we increase the feeling of connectivity? We can also use these lines of thought to describe what happened (in an improvisation), or to try to convey the implications of the project as a whole. Most important, however, is probably the growth of a vocabulary for reflection through the probing of our own work in the light of the philosophical implications.

Documentation as debugging

This may come as no surprise to some readers, but I thought it appropriate to note anyway. During the blog writing about the rhythmic analysis – part 1, I noticed that I would significantly tighten up the definition of terms, and also the actual implementation of the algorithm. I would start writing something, look at what I had just written and think “this just does not make sense” or “this implementation must be off, or just plainly wrong”. Then, to be able to write sensibly about the ideas, I went back and tidied up my own mess, in the process making a much more reliable method for analysing the features I wanted to extract. What was surprising was not that this happened, but the degree to which it happened. In the context of artistic research and the reflection embedded in that process, similar events may occur.

Mixing with Gary

During our week in London we had some sessions with Gary Bromham, first at the Academy of Contemporary Music in Guildford on June 7th, then at QMUL later in the week. We wanted to experiment with cross-adaptive techniques in a traditional mixing session, using our tools/plugins within a Logic session to work similarly to traditional sidechaining, but with the expanded palette of analysis and modulator mappings enabled by the tools developed in the project. Initially we tried to set this up with Logic as the DAW. It kind of works, but seems utterly unreliable. Logic would not respond to learned controller mappings after we closed the session and reopened it. It does receive the MIDI controller signal (and can re-learn), but in all cases refuses to respond to the received automation control. In the end we abandoned Logic altogether and went for our safe always-does-the-job Reaper.

As the test session for our experiments we used Sheryl Crow’s “Soak up”, with stems for the backing tracks and the vocals.

2016_6_soakup mix1 pitchreverb

Example 1: Vocal pitch to reverb send and reverb decay time.

2016_6_soakup mix2 experiment

Example 2: Vocal pitch as above. Adding vocal flux to hi cut frequency for the rest of the band. Rhythmic analysis (transient density) of the backing track controls a peaking EQ sweep on the vocals, creating a sweeping effect somewhat like a phaser. This is all somewhat odd taken together, but useful as a controlled experiment in polyphonic crossadaptive modulation. The

* The first thing Gary asked for was to process one track according to the energy in a specific frequency band in another. For example “if I remove 150Hz on the bass drum, I want it to be added to the bass guitar”. Now, it is not so easy to analyze what is missing, but easier to analyze what is there. So we thought of another thing to try: sibilants (e.g. S’es) on the vocals can be problematic when sent to reverb or delay effects. Since we don’t have a multiband envelope follower (yet), we tried to analyze for spectral flux or crest, then use that control signal to duck the reverb send for the vocals.

* We had some latency problems, relating to pitch tracking of vocals, the modulator signal arriving a bit late to precisely control the reverb size for the vocals. The tracking is ok, but the effect responds *after* the high pitch. This was solved by delaying the vocal *after* it was sent to the analyzer, then also delaying the direct vocal signal and the rest of the mix accordingly.

* Trond’s idea for later: Use vocal amp to control bitcrush mix on drums (and other programmed tracks)

* Trond’s idea for later: Use vocal transient density to control delay setting (delay time, … or delay mix)

* Bouncing the mix: Bouncing does not work, as we need the external modulation processing (analyzer and MIDIator) to also be active. Logic seems to disable the “external effects” (like Reaper here running via Jack, like an outboard effect in a traditional setting) when bouncing.

* Something good: Pitch controlled reverb send works quite well musically, and is something one would not be able to do without the crossadaptive modulation techniques. Well, it is actually just adaptive here (vocals controlling vocals, not vocals controlling something else).

* Notable: do not try to fix (old) problems, but try to be creative and find new applications/routings/mappings. For example, the initial ideas from Gary were related to common problems in a mixing situation, problems that one can already fix (with de-essers or similar).

* Trond: It is unfamiliar in a post production setting to hear the room size change, as one is used to static effects in the mix.

* It would be convenient if we could modulate the filtering of a control signal depending on analyzed features too. For example changing the rise time for pitch depending on amplitude.

* It would also be convenient to have the filter times as sync’ed values (e.g. 16th) relative to the master tempo
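The last two wishes above (a rise time that depends on another analyzed feature, and filter times given as note values relative to the tempo) can be sketched as follows. This is only an illustrative sketch in Python, not how the MIDIator is implemented; the function names, the control rate and the `amp_scaling` parameter are all assumptions made for the example.

```python
import math

def note_to_seconds(bpm, division):
    """Duration of a 1/division note at the given tempo (quarter note = one beat)."""
    return (60.0 / bpm) * (4.0 / division)

def smoothing_coeff(time_s, control_rate):
    """One-pole smoothing coefficient for a time constant; 0 means no smoothing."""
    if time_s <= 0.0:
        return 0.0
    return math.exp(-1.0 / (time_s * control_rate))

def smooth(values, amps, bpm, control_rate=100.0,
           rise_div=16, fall_div=4, amp_scaling=1.0):
    """Smooth a control signal with separate, tempo-synced rise and fall times.

    Hypothetical mapping: the rise time shrinks as the amplitude (0..1) grows,
    so loud input makes the modulator respond faster.
    """
    rise_base = note_to_seconds(bpm, rise_div)
    fall_coeff = smoothing_coeff(note_to_seconds(bpm, fall_div), control_rate)
    out = []
    state = 0.0
    for value, amp in zip(values, amps):
        if value > state:
            amp = min(max(amp, 0.0), 1.0)
            rise = rise_base * (1.0 - amp_scaling * amp)
            coeff = smoothing_coeff(rise, control_rate)
        else:
            coeff = fall_coeff
        state = coeff * state + (1.0 - coeff) * value
        out.append(state)
    return out
```

With `bpm=120`, a 16th-note rise time is 0.125 seconds. Feeding the same step input with amplitude 1.0 and with amplitude 0.0 shows the difference: the loud version jumps to the target at once, the quiet one approaches it gradually.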


– Add multiband rms analysis.

– Check the roundtrip latency of the analyzer-modulator, i.e. the time it takes from when an audio signal is sent until the modulator signal comes back.

– Add modulation targets (e.g. rise time). This most probably just works, but we need to open the MIDI feed back into Reaper.

– Add sync to the filter times. Cabbage reads bpm from host, so this should also be relatively straightforward.
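As a rough illustration of the roundtrip latency check, and of the compensation we did by hand when the reverb modulation arrived late, here is a minimal Python sketch. It assumes we can record both a test signal as sent and the signal that comes back, at the same sample rate; the function names are hypothetical and this is not part of our toolkit.

```python
def estimate_latency(sent, returned, max_lag):
    """Return the lag in samples (0..max_lag) that maximizes the cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = sum(s * returned[i + lag]
                    for i, s in enumerate(sent)
                    if i + lag < len(returned))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def delay(signal, n_samples):
    """Delay a signal by n_samples, padding with silence and keeping the length."""
    if n_samples <= 0:
        return list(signal)
    return ([0.0] * n_samples + list(signal))[:len(signal)]
```

Once the lag is known, the direct signal and the rest of the mix can be delayed by the same number of samples, so that the modulation lines up with the sound that caused it.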



Seminar and meetings at Queen Mary University of London

June 9th and 10th we visited QMUL and met Joshua Reiss and his eminent colleagues there. We were very well taken care of and had a pleasant and interesting stay. On June 9th we had a seminar presenting the project and discussing related issues with a group of researchers and students. The seminar was recorded on video, to be uploaded to the QMUL YouTube channel. The day after, we had a meeting with Joshua, going into more detail. We also got to meet several PhD students and got insight into their research.

Seminar discussion

Here are some issues that were touched upon in the seminar discussion:

* Analyze gestures inherent in the signal, e.g. a crescendo, and use this as a trigger to turn some process on or off, flip a preset etc. We could also analyze for very specific patterns, like a melodic fragment, but it is probably better to try to find gestures that can be performed in several different ways, so that the musician can have freedom of expression while the system still provides a very clear interface for controlling the processes.

* Analyze features related to specific instruments. It is easier to find analysis methods to extract very specific features, rather than asking “how can we analyze this to extract something interesting…”. This is perhaps a lesson for us to be a bit more specific about what we want to extract. It is somewhat opposite to our current exploratory effort of just trying to learn how the currently implemented analysis signals work and what we can get from them.

* Look for deviations from a quantized value. For example pitch deviations within a semitone, and rhythmic deviations from a time grid.

* Semantic spaces. Extract semantic features from the signal, could be timbral descriptors, mood, directions etc. Which semantics? Where to take the terminology from? We should try to develop examples of useful semantics, useful things to extract.

* Semantic descriptors are not necessarily a single point in a multidimensional space; each is more like a blob, an area. Interpolation between these blobs may not be linear in all cases. We don’t have to use all implied dimensions, so we can actually just select features/semantics/descriptors that will give us the possibility of linear interpolation. At least in those situations where we need to interpolate…

* Look at old speech codecs. Open source. Excitation/resonator model, LPC. This is time domain, so will be really fast/low-latency.

* Cepstral techniques can also be used to separate resonator and exciter. The smoothed cepstrum being the resonator. Take the smoothed cepstrum and subtract it from the full cepstrum to get the excitation.

* The difficulty of control. The challenge to the performer, limiting the musical performance,  inhibiting the natural ways of interaction. This is a recurring issue, and something we might want to take care to handle carefully. This is also really what the project is about: creating *new* ways of interaction.

* The performer will adapt to imperfections of the analysis. Normally, MIR signals are not “aware” that they are being analyzed. They are static and prerecorded. In our case, the performer, being aware of how the analysis method works and what it responds to, can adapt the playing to trigger the analysis method in highly controllable ways. This way the cross-adaptivity is not only technical, related to the control of parameters, but also adaptive in relation to how the performer shapes her phrases and in turn also what she chooses to play.

* Measuring collective features. Features of the mix. Each instrument contributes equally to the modulator signal. One signal can push others down or single them out. Relates to game theory. What is the most favorable behavior over time: suppress others or negotiate and adapt.
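The idea of looking for deviations from a quantized value, mentioned in the list above, can be made concrete with a small Python sketch. It assumes equal temperament with A4 = 440 Hz for pitch and a regular grid derived from a known tempo for rhythm; both are assumptions for the example, and the function names are hypothetical.

```python
import math

def pitch_deviation_cents(freq_hz, ref_hz=440.0):
    """Signed deviation in cents (about -50..+50) from the nearest semitone."""
    semitones = 12.0 * math.log2(freq_hz / ref_hz)
    return 100.0 * (semitones - round(semitones))

def grid_deviation(onset_s, bpm, division=16):
    """Signed deviation in seconds of an onset from the nearest grid point."""
    step = (60.0 / bpm) * (4.0 / division)
    return onset_s - round(onset_s / step) * step
```

A note played slightly sharp gives a small positive number of cents, and an onset slightly behind the beat gives a small positive time offset. Either value could serve as a modulator signal that the performer controls through intonation or timing rather than through the choice of notes.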

Meeting with Joshua

* Josh mentioned a few researchers that we might be interested in. Brecht De Man: Intelligent audio switcher. Emmanouil Chourdakis: Feature based reverb. Dave Ronan: Groups, stems, automatic mixing. Vincent Verfaille: effect classification and A-DAFX. Bryan Pardo: interfaces for music performance/production, visualization, semantics, machine learning. Pedro Pestana: sound engineer, best practices (PhD), automatic mixing. Ryan Stables: SAFE plugins.

* Issues relating to publishing our work and getting an audience for it. As we could in some respect claim to create a new field, creating a community for it might be essential for further use of our research. Increase visibility. Promote also via QMUL press and NTNU info. Among connected fields are Human Computer Interaction, New Interfaces for Musical Expression, Audio Engineering Society.

* QMUL has considerable experience in evaluation studies, user experience tests, listening tests etc. Some of this may be beneficial as a perspective on our otherwise experiential approach.

* Collective features (meaning individual signals in relation to each other and to the ensemble mix): Masking (spectral overlap). Onset times in relation to other instruments, lagging. Note durations, percussive/sustained etc.

* We currently use spectral crest, but crest is also useful in the time domain for rhythmic analysis (rhythmic density, percussiveness, dynamic range). It will work better with a loudness matching curve (dB).

* Time domain filterbank faster than FFT. Logarithmically spaced bands

* FFT of the time domain amp envelope

* Separate silence from noise. Automatic gain control. Automatic calibration of noise floor (use peak to average measure to estimate what is background noise and what is actual signal)

* Look for patterns: playing the same note, also pitch classes (octaves), also collectively (between instruments).

* Use log freq spectrum, and amps in dB, then do centroid/skew/flux etc

* How to collaborate with others under QMUL, adapting their plugins. Port the techniques to Csound or re-implement ours in C++? For the prototype and experimentation stage maybe modify their plugins to output just control signals? Describe clearly our framework so their code can be plugged in.
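To illustrate the time-domain crest measure mentioned in the list above, here is a minimal Python sketch that computes the peak-to-RMS ratio of a frame and expresses it in dB. The frame handling and the small `floor` value (used to avoid taking the logarithm of zero) are assumptions for the example, not settings taken from our analyzer.

```python
import math

def crest_factor_db(frame, floor=1e-9):
    """Peak-to-RMS ratio of one frame of samples, in dB."""
    peak = max(abs(x) for x in frame)
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return 20.0 * math.log10(max(peak, floor) / max(rms, floor))

def crest_per_frame(signal, frame_size):
    """Crest factor in dB for consecutive non-overlapping frames."""
    return [crest_factor_db(signal[i:i + frame_size])
            for i in range(0, len(signal) - frame_size + 1, frame_size)]
```

A sustained tone gives a low crest value (about 3 dB for a sine wave), while a frame containing a single sharp transient gives a high one, which is why the measure can say something about percussiveness.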



Seminar at De Montfort


Simon and Leigh in Leigh’s office at De Montfort

Wednesday June 8th we visited Simon Emmerson at De Montfort and also met Director Leigh Landy. We were very well taken care of and had a pleasant and interesting stay. One of the main objectives was to do a seminar with a presentation of the project and discussion among the De Montfort researchers. We found that their musical preferences seem to overlap considerably with our own, in the focus on free improvisation and electroacoustic art music. As this is the most obvious and easy context in which to implement experimental techniques (like the crossadaptive ones), we had taken care to also present examples of use within other genres. This could be interpreted as if we were more interested in traditional applications/genres than in the free improvised genres. Now knowing the environment at Leicester better, we could probably have put more emphasis on the free electroacoustic art music applications. But indeed this led to interesting discussions about applicability, for example:

* In metric/rhythmic genres, one could more easily analyze and extract musical features related to bar boundaries and rhythmic groupings.

* Interaction itself could also create meter, as the response time (both human and technical) has a rhythm and periodicity that can evolve musically due to the continuous feedback processes built into the way we interact with such a system, and with each other in the context of such a system.

* Static and deterministic versus random mappings. Several people were interested in more complex and more dynamic controller mappings, expressing interest and curiosity towards playing within a situation where the mapping could quickly and randomly change. References were made to Maja S.K. Ratkje, and that her kind of approach would probably make her interested in situations that were more intensely dynamic, her ability to respond to the challenges of a quickly changing musical environment (e.g. changes in the mapping) also correlating with an interest in exploring this kind of complex situation. Knowing Maja from our collaborations, I think they may be right. I take note to discuss this with her and to try to make some challenging mapping situations for her to try out.

* It was discussed whether the crossadaptive methods could be applied to the “dirty electronics” ensemble/course situation, and there was an expressed interest in exploring this. Perhaps it will be crossadaptivity in other ways than what we use directly in our project, as the analysis and feature extraction methods do not necessarily transfer easily to the DIY (DIT: do it together, DIWO: do it with others) domain. The “do it with others” approach resonates well with our general approach, by the way.

* The complexity is high even with two performers. How many performers do we envision this to be used with? How large an ensemble? As we have also noticed ourselves, following the actions of two performers somehow creates a multi-voice polyphonic musical flow (2 sources, each source’s influence on the other source and the timbral change resulting thereof, and the response of the other player to these changes). How many layers of polyphony can we effectively hear and distinguish when experiencing the music (as performers or as audience)? References were made to the laminal improvisation techniques of AMM.

* Questions of overall form. How will interactions under a crossadaptive system change the usual formal approach of a large overarching rise and decay form commonly found in “free” improvisation? At first I took the comment to suggest that we could also apply more traditional MIR techniques of analyzing longer segments of sound to extract “direction of energy” and/or other features evolving over longer time spans. This could indeed be interesting, but also poses problems of how the parametric response to long-term changes should act (i.e. we could accidentally turn up a parameter way too high, and then it would stay high for a long time before the analysis window would enable us to bring it back down). Now, in some ways this would also resemble using extremely long attack and decay times for the low pass filter we already have in place in the MIDIator, creating very slow responses and needing continued excitation over a prolonged period before the modulator value will respond. After the session, I discussed this more with Simon, and he indicated that the large form aspects were probably meant just as much with regard to the perception of the musical form as to the filtering and windowing in the analysis process. There are interesting issues of drama and rhetoric posed by bringing these issues in, whether one tackles them on the perception level or at the analysis and mapping stage.

* Comments were made that performing successfully on this system would require immense effort in terms of practicing and getting to know the responses and the reactions of the system in such an intimate manner that one could use it effectively for musical expression.  We agree of course.
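As a thought experiment on the random mappings discussed in the list above, a quickly changing mapping could be sketched in Python as follows. The feature and parameter names are purely hypothetical, and this is not how mappings are stored in our tools; the point is only to show how little machinery is needed to reshuffle which analyzed feature controls which effect parameter, and with what scaling.

```python
import random

def random_mapping(features, parameters, rng):
    """Give each effect parameter a randomly chosen feature and a random scaling."""
    return {p: (rng.choice(features), rng.uniform(-1.0, 1.0)) for p in parameters}

def apply_mapping(mapping, feature_values):
    """Compute modulation values (clamped to -1..1) from analyzed feature values."""
    return {p: max(-1.0, min(1.0, feature_values[f] * scale))
            for p, (f, scale) in mapping.items()}

# Hypothetical names, for illustration only.
features = ["amplitude", "pitch", "flux", "transient_density"]
parameters = ["reverb_send", "delay_time", "filter_cutoff"]
rng = random.Random(2016)  # seeded so that a session can be repeated
mapping = random_mapping(features, parameters, rng)
```

Calling `random_mapping` again at musically chosen moments (or at random intervals) would give the performers a new set of connections to discover, which is the kind of intensely dynamic situation the discussion pointed towards.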