Theoretical/philosophical issues regarding the development of the project

Skype with Solveig Bøe, Simon Emmerson and Øyvind Brandtsegg.

The starting point and main focus of our conversation was the session that took place on 20-21 September in Studio Olavskvartalet, NTNU, which is the theme of an earlier post on this blog. That post also contains audio takes and a video where the musicians discuss their experiences. Øyvind told us about what had been going on and about the reactions and reflections of the participants and himself. Simon had seen the video and listened to the takes, and Solveig will look at the material and try to use Merleau-Ponty’s language to describe the interactions going on.


One interesting case is when two musicians (two, to keep the system simple) control each other, so that, for example, the louder one instrument plays, the less of some effect is applied to the other instrument, while the louder the other instrument plays, the more of an effect is applied to the first. Simon had noted in the documentation that the participating musicians mentioned that this way of controlling each other could lead to confusion. An analogy was made to playing football with two footballs, with the ball changing colour each time someone touched it. While not entirely similar in timing and pace, the image sheds some light on the added complexity. We discussed how this confusion could be reduced and how it could create interesting music. There is a learning curve involved, and this must be tied to the musicians getting to know each other’s interaction patterns well, learning (by doing) how effects are created in each other’s sounds. Learning by listening and doing. But also by getting visual technical feedback? We all noted that the video documentation from the session made it particularly easy to gain insight into the session, more so than listening to the audio without video. There is an aspect of seeing the physical action, and also seeing the facial and bodily expression of the performers. Naturally, this in itself is not particular to our project; it can also be said of many recordings of performances. Still, since the phenomenon of cross-adaptiveness in the interaction can be quite hard to grasp fully, this extra dimension of perception seems to help us engage when reviewing the documentation. That said, the video on the blog post was also very effectively edited by Andreas to extract significant points of interest and points of reflection. The editing also affects the perception of the event significantly, of course.
With that perspective, of the extra sensory dimension letting us engage more readily in the perception and understanding of the event, how could this be used to aid the learning process (for the performers)? Some of the performers also noted in the session interviews that some kind of visual feedback would be nice to have. Simon thought that visual feedback of the monitoring (of the sound) could be of good help. Øyvind agreed, while at the same time stressing that it could make the interactions between the musicians even more taxing, because they would then also have to look at the screen, in addition to each other. In computer music performance, there has commonly been a concern to get the performer’s focus away from the computer screen, as in many cases it has been detrimental to the experience both for the performer and the audience. Providing a visual interface could lead to similar problems. Then again, the visual interface could potentially be used with success during the learning stage, to speed up the understanding of how the instrument (i.e. the full system of acoustic instrument plus electronic counterparts) works.

We discussed the confusion felt when the effects seemed difficult to control, the feeling of “ghostly transformations” taking place, and the disturbance of the performers’ awareness of themselves and their “place”. Could confusion here also be viewed as something not entirely negative? Could some degree of confusion be experienced as a pregnant complexity, stimulating curiosity and leading to deeper engagement? How could we enable this flipping of sign for the confusion, making it more of a positive feature than a negative disorientation? Perhaps one aspect of this process is to provide just enough traction for the perception (in all senses) to hold on to the complex interactions. One way of reducing confusion would be to simplify the interactions. Then again, the analysis dimensions in this session were already simplified (just using amplitude and event density), and the affected modulation (relative balance between effects) was also relatively simple. With respect to getting traction for perception to grasp the situation, experimentation with a visual feedback interface is definitely something that needs to be explored.
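To make the kind of simplified mapping discussed above more concrete, here is a minimal sketch in Python. It is not the code used in the session. The function names, the two-second density window, the `max_density` ceiling and the choice of which feature drives which effect are all assumptions made for illustration. It only shows how two analysis dimensions (amplitude and event density) from one player could set the relative level of two effects on the other player, with one mapping inverted so that louder playing gives less effect.

```python
import math


def rms_amplitude(samples):
    """Root-mean-square level of one analysis frame of normalized audio."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def event_density(onset_times, now, window=2.0):
    """Detected events per second within the last `window` seconds (assumed window)."""
    recent = [t for t in onset_times if now - window < t <= now]
    return len(recent) / window


def clamp01(x):
    """Limit a value to the range 0..1."""
    return max(0.0, min(1.0, x))


def effect_balance(amplitude, density, max_density=8.0):
    """Map two features of player A to the levels of two effects on player B.

    Hypothetical mapping: effect 1 is inverted (louder playing gives less
    effect), effect 2 is direct (denser playing gives more effect).
    """
    effect_1 = 1.0 - clamp01(amplitude)
    effect_2 = clamp01(density / max_density)
    return effect_1, effect_2
```

In a two-player setup the same function would be called twice, once with A's features to set B's effects and once the other way around, which is where the "two footballs" feeling comes from.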


According to Merleau-Ponty, interactions with others are variations in the same matter: something is the same for all participants, even if it is experienced (as it always is) from different viewpoints. What is the same in this type of interaction? Solveig said that to interact successfully there has to be something that is the same for all of the performers; they have to be directed towards something that is the same. But what could be a candidate for “the same” in a session where the participants interact with each other’s interactions? The session seems to be an evolving unity encompassing the perceptions and performances of the participants as they learn how their instruments work in the “network” of instruments. Simon pointed to the book ‘Sonic Virtuality: Sound as Emergent Perception’ (Mark Grimshaw and Tom Garner, OUP 2015), which argues that no two sounds are ever the same. Neurophysiologically, sounds are experienced differently in each brain, so we can’t really say that we hear the same sound. Solveig’s response was that what is the same is the situation in which the participants create the total sound. In this situation one is placed, even if displaced by the effects created on one’s produced sounds. Displacement became a theme that we tried to reflect upon: being “there”, not “here”, through being controlled by the other instrument(s). The dialectic between dislocation and relocation is important in this connection. Dislocation of the sound could feel like dislocation of oneself. How does amplification (generally in electroacoustic and electronic music) change the perspectives of oneself? How do we perceive the sound of a saxophone and the music performed on it differently when it is amplified so that the main acoustic impression of the instrument is coming to us through speakers?
The speakers are usually not positioned in the same location as the acoustic instrument, and even if they were, the acoustic radiation patterns of the speakers would differ radically from those of the acoustic instrument. In our day and age, this type of sound reproduction and amplification is so common that we sometimes forget how it affects perception of the event. Even with unprocessed, as clean as possible or “natural” sound reproduction, the perceptual effect is still significant. With processed sound it is even more so, and with the crossadaptive interactions the potential for dislocation and disconnection is manifold. As we (the general music loving population) have learned to love and connect to amplified and processed musics, we assume a similar process needs to take place for our new means of interaction. Similarly, intentionally exploring the dislocation effects for expressive purposes can also be a powerful resource.

Sound is in the brain, but also in the haptic, visual, auditory, in general, sensual, space. Phenomenologically what is going on in this space is the most interesting, according to Solveig, but the perspective from neuroscience is also something that could bring productive insights to the project.

We returned to the question of monitoring: How much information should the performers get? What should they be able to control? Visualization of big data could help in the interaction and the interventions, but which form should the visualization have? Øyvind showed us an example of visualization from the Analyzer plugin developed in the project. Here, the different extracted features are plotted in three dimensions (x, y and colour) over time. It provides a way of getting insight into how the actual performed audio signal and the resulting analysis correlate. It was developed as a means of evaluating and selecting which features should be included in the system, but can potentially also be used directly by the performer trying to learn how the instrumental actions result in control signals.


Skype on philosophical implications

Skype with Solveig Bøe, Simon Emmerson and Øyvind Brandtsegg.

Our intention for the conversations was to start sketching out some of the philosophical implications of our project. Partly as a means to understand what we are doing and what it actually means, and partly as a means of informing our further work, choosing appropriate directions and approaches. Before we got to these overarching issues, we also discussed some implications of the very here and now of the project. We come back to the overarching issues in the latter part of this post.


Initially, we commented on rhythm analysis not based on the assumption of an underlying constant meter or pulse. Work of Ken Fields was mentioned, on internet performance and the nature of time. Also John Young’s writing on form and phrase length (in Trevor Wishart’s music); Ambrose Seddon’s chapter on time structures in Emmerson/Landy’s recent book “Expanding the Horizon of Electroacoustic Music Analysis”. In live performance there is also George Lewis’ rhythm analysis (or event analysis) in his interactive pieces (notably Voyager). See

Score as guide

In the context of Oeyvind being at UCSD, where one of their very strong fields of competence is contemporary music performance, he thought loosely about a possible parallel between a composed score and the cross-adaptive performance setting. One could view the situation designed for/around the performers in a cross-adaptive setting as a composition, in that it sets a direction for the interplay and thus also poses certain limits (or obstacles/challenges) to what the performer can and cannot do. Against the possible analogy between the traditional score and our crossadaptive performance scenario, one could object that a classical performer does not so much just follow the do’s and don’ts described in a score, but rather uses the score as a means to engage with the composer’s intention. This may or may not apply to the musics created in our project, as we strive not to limit ourselves to specific musical styles, while still leaning solidly towards quite freely improvised expression in a somewhat electroacoustic timbral environment. In some of our earlier studio sessions, we noted that there are two competing modes of attention:

  • To control an effect parameter with one’s audio signal, or
  • to respond musically to the situation

… meaning that what the performer chooses to play would traditionally be based on an intended musical statement to contribute to the current musical situation, whereas now she might rather play something that (via its analyzed features) will modulate the sound of another instrument in the ensemble. This new something, being a controller signal rather than a self-contained musical statement, might actually express something of its own in contradiction to the intended (and assumedly achieved) effect it has on changing the sound of another instrument. Now on top of this consider that some performers (Simon was quoting the as yet unpublished work of postgraduate and professional bass clarinet performer Marij van Gorkom) experience the use of electronics as perhaps disturbing the model of engagement with the composer’s intention through the written score. This being something that one needs to adjust to and accommodate on top of the other layers of mediation. Then we might consider our crossadaptive experiments as containing the span of pain to love in one system.

…and then to philosophy

A possible perspective suggested by Oeyvind was a line connecting Kant-Heidegger-Stiegler, possibly starting offside but nevertheless somewhere to start. The connection to Kant is his notion of the noumenal world (where the things in themselves are) and the phenomenal world (which is where we can experience and to some extent manipulate the phenomena). Further, to Heidegger with his thoughts on the essence of technology. Rather than viewing technology as a means to an end (instrumental), or as a human activity (anthropological), he refers to a bringing-forth or a bringing out of concealment into unconcealment. Heidegger then regards technology as a threat to this bringing-forth (of truth), bypassing or denying active human participation (with the real events, i.e. the revealing of truth). Again, he suggests that art can be a way of navigating this paradox and actively sidestepping the potential threat of technology. Then on to Stiegler with his view of technics as organized inorganic matter, with something of an immanent drive or dynamic of its own. This possibly constitutes an impediment to socialization, individuation and intersubjectivization. In hindsight (after today’s discussion), this is perhaps a rather gray, or bleak, approach to the issues of technology in the arts. In the context of our project, it was intended to portray the difficulty of controlling something with something else, the reaching into another dimension while acting from the present one.

Solveig countered with a more phenomenological approach, bringing in Merleau-Ponty’s perspectives of technology as an extension of the body, of communication and communion, of learning and experiencing together, acting on a level below language. On this level the perceptive and the corporeal, including extensions that are incorporated into the body schema, are intertwined and form a unity. Now this seems like a path for further investigation. One could easily say that we are learning to play a new instrument, or learning to navigate a new musical environment, so our current endeavors must be understood as baby steps in this process (as also reflected on earlier), and we are this intertwined unity taking form. In this context, Merleau-Ponty’s ideas about the learning of language as something that takes place by learning a play of roles, where understanding how to change between roles is an essential part of the learning process, also come into view as appropriate.

Simon also reported on another research student project: cellist Audrey Riley was a long-time member of the Merce Cunningham ensemble and is engaged in examining a (post-Cage) view of performance – including ideas of ‘clearing the mind’ of thought and language – as they might ‘get in the way’.  This relates to the idea of ‘communion’ where there is a sense of wholeness and completeness.  This also leads us to Wittgenstein’s ideas that both in art and philosophy there is something that is unutterable, that may just be demonstrated.

How does this affect what we do?

We concluded our meeting with a short discussion on how the philosophical aspects can be set into play to aid our further work in the project. Is it just high-and-wide ramblings or do we actually make use of it? Setting up some dimensions of tension, or oppositions (which may sometimes not be oppositions but rather different perspectives on the same thing), may be of help in practically approaching the problems we face. Take the assumed opposition of communion and disturbance: do the crossadaptive interventions into the communication between performers act as a disturbance, or can they be a way of enabling communion? Can we increase the feeling of connectivity? We can also use these lines of thought to describe what happened (in an improvisation), or try to convey the implications of the project as a whole. Most important, however, is probably the growth of a vocabulary for reflection through the probing of our own work in the light of the philosophical implications.

Documentation as debugging

This may come as no surprise to some readers, but I thought it appropriate to note anyway. During the blog writing about the rhythmic analysis – part 1, I noticed I would tighten up the definition of terms, and also the actual implementation of the algorithm significantly. I would start writing something, seeing what I had just written and thinking “this just does not make sense” or “this implementation must be off, or just plainly wrong”. Then, to be able to write sensibly about the ideas, I went back and tidied up my own mess. In the process, making a much more reliable method for analysing the features I wanted to extract. What was surprising was not that this happened but the degree to which it happened. In the context of artistic research and the reflection embedded in that process, similar events may occur.

Seminar at De Montfort


Simon and Leigh in Leigh’s office at De Montfort

Wednesday June 8th we visited Simon Emmerson at De Montfort and also met Director Leigh Landy. We were very well taken care of and had a pleasant and interesting stay. One of the main objectives was to hold a seminar with a presentation of the project and discussion among the De Montfort researchers. We found that their musical preferences seem to overlap considerably with our own, in the focus on free improvisation and electroacoustic art music. As this is the most obvious and easy context in which to implement experimental techniques (like the crossadaptive ones), we had taken care to also present examples of use within other genres. This could be interpreted as if we were more interested in traditional applications/genres than the free improvised genres. Now knowing the environment at Leicester better, we could probably have put more emphasis on the free electroacoustic art music applications. But indeed this led to interesting discussions about applicability, for example:

* In metric/rhythmic genres, one could more easily analyze and extract musical features related to bar boundaries and rhythmic groupings.

* Interaction itself could also create meter, as the response time (both human and technical) has a rhythm and periodicity that can evolve musically due to the continuous feedback processes built into the way we interact with such a system and each other in the context of such a system.

* Static and deterministic versus random mappings. Several people were interested in more complex and more dynamic controller mappings, expressing interest and curiosity towards playing within a situation where the mapping could quickly and randomly change. References were made to Maja S.K. Ratkje, suggesting that her kind of approach would probably make her interested in situations that were more intensely dynamic. Her ability to respond to the challenges of a quickly changing musical environment (e.g. changes in the mapping) also correlates with an interest in exploring this kind of complex situation. Knowing Maja from our collaborations, I think they may be right. I take note to discuss this with her and to try to make some challenging mapping situations for her to try out.

* It was discussed whether the crossadaptive methods could be applied to the “dirty electronics” ensemble/course situation, and there was an expressed interest in exploring this. Perhaps it will be crossadaptivity in other ways than what we use directly in our project, as the analysis and feature extraction methods do not necessarily transfer easily to the DIY (DIT: do it together, DIWO: do it with others) domain. The “do it with others” approach resonates well with how we generally work, by the way.

* The complexity is high even with two performers. How many performers do we envision this to be used with? How large an ensemble? As we have also noticed ourselves, following the actions of two performers somehow creates a multi-voice polyphonic musical flow (two sources, each source’s influence on the other source and the resulting timbral change, and the response of the other player to these changes). How many layers of polyphony can we effectively hear and distinguish when experiencing the music (as performers or as audience)? References were made to the laminal improvisation techniques of AMM.

* Questions of overall form. How will interactions under a crossadaptive system change the usual formal approach of a large overarching rise and decay form commonly found in “free” improvisation? At first I took the comment to suggest that we could also apply more traditional MIR techniques of analyzing longer segments of sound to extract “direction of energy” and/or other features evolving over longer time spans. This could indeed be interesting, but it also poses problems of how the parametric response to long-term changes should act (i.e. we could accidentally turn up a parameter way too high, and then it would stay high for a long time before the analysis window would enable us to bring it back down). Now, in some ways this would also resemble using extremely long attack and decay times for the low pass filter we already have in place in the MIDIator, creating very slow responses that need continued excitation over a prolonged period before the modulator value will respond. After the session, I discussed this more with Simon, and he indicated that the large form aspects were probably meant just as much with regard to the perception of the musical form as to the filtering and windowing in the analysis process. There are interesting issues of drama and rhetoric posed by bringing these issues in, whether one tackles them on the perception level or at the analysis and mapping stage.

* Comments were made that performing successfully on this system would require immense effort in terms of practicing and getting to know the responses and the reactions of the system in such an intimate manner that one could use it effectively for musical expression.  We agree of course.
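The remark in the list above about very long attack and decay times on a modulator's low pass filter can be illustrated with a small sketch. This is not the MIDIator code, whose implementation is not described here. It is an assumed one-pole smoother with separate rise and fall times, with hypothetical names (`AttackDecayFilter`, `control_rate`), meant only to show why a long time constant needs prolonged excitation before the modulator value moves.

```python
import math


def smoothing_coefficient(time_seconds, control_rate):
    """One-pole coefficient for a given time constant; 0 means no smoothing."""
    if time_seconds <= 0:
        return 0.0
    return math.exp(-1.0 / (time_seconds * control_rate))


class AttackDecayFilter:
    """Low pass filter for a control signal with separate attack and decay times."""

    def __init__(self, attack, decay, control_rate=100.0):
        # control_rate is the assumed number of control values per second
        self.attack_coef = smoothing_coefficient(attack, control_rate)
        self.decay_coef = smoothing_coefficient(decay, control_rate)
        self.value = 0.0

    def process(self, target):
        """Move the output towards `target`, slowly or quickly depending on direction."""
        rising = target > self.value
        coef = self.attack_coef if rising else self.decay_coef
        self.value = target + coef * (self.value - target)
        return self.value
```

With an attack time of 30 seconds, one full second of maximum input moves the output only a few percent of the way, which is the slow, form-level response described above. It also shows the risk mentioned there: once the value is high, a long decay time keeps it high.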



Project start meeting in Trondheim


Monday June 6th we had a project start meeting with the NTNU based contributors: Andreas Bergsland, myself, Solveig Bøe, Trond Engum, Sigurd Saue, Carl Haakon Waadeland and Tone Åse. This gave us the opportunity to present the current state of affairs and our regular working methods to Solveig. Coming from philosophy, she has not taken part in our earlier work on live processing. As the last few weeks have been relatively rich in development, this also gave us a chance to bring all of the team up to speed. Trond and I also gave a live demo of a simple crossadaptive setup where vocals control delay time and feedback on the guitar, while the guitar controls reverb size and hi-freq damping for the vocal. We had discussions and questions interspersed within each section of the presentation. Here’s a brief recounting of the issues we touched upon.

Roles for the musician

The role of the musician in crossadaptive interplay has some extra dimensions when compared to a regular acoustic performance situation. A musician will regularly formulate her own musical expression and relate this to what the other musician is playing. On top of this comes the new mode of response created by live processing, where the instrument’s sound constantly changes due to the performative action of a live processing musician. In the cross-adaptive situation, these changes are directly controlled by the other musician’s acoustic signal, so the musical response is two-fold: responding to the expression and responding to the change in one’s own sound. As these combine, we may see converging or diverging flow of musical energy between the different incentives and responses at play. Additionally, her own actions will influence changes in the other musician’s sound, so the expressive side is also two-fold: creating the (regular) musical statement and also considering how the changes inflicted on the other’s sound will affect both how the other one sounds and how that affects their combined effort. Indeed, yes, this is complex. Perhaps a bit more complex than we had anticipated. The question was raised whether we do this only to make things difficult for ourselves. Quite justly. But we were looking for ways to intervene in the regular musical interaction between performers, to create as yet unheard ways of playing together. It might appear complex just now because we have not yet figured out the rules and physics of this new situation, and it will hopefully become more intuitive over time. Solveig put it this way: we put the regular modes of perception in parentheses. For good or for bad, I think she may be correct.

Simple testbeds

It seems wise to initially set up simplified interaction scenarios, like the vocal/reverb guitar/delay example we tried in this meeting. It puts emphasis on exploring the combinatorial parameter modulation space. Even in a simple situation of extracting two features from each sound source, controlling two parameters on each other’s sound, the challenges to the musical interaction are prominent. Controlling two features of one’s own sound to modulate the other’s processing is reasonably manageable while also concentrating on the musical expression.
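A simple testbed of this kind can be written down as a small routing table. The sketch below is an illustration only: the destination parameters follow the demo described above (vocals controlling delay time and feedback on the guitar, guitar controlling reverb size and high-frequency damping on the vocal), but which features drive them, the feature names `amplitude` and `density`, and all the numeric ranges are assumptions.

```python
def scale(value, out_min, out_max):
    """Map a normalized feature (0..1) onto a parameter range, clamping the input."""
    value = max(0.0, min(1.0, value))
    return out_min + value * (out_max - out_min)


# (source, feature, destination, parameter, min, max); all values are hypothetical
ROUTING = [
    ("vocal", "amplitude", "guitar", "delay_time", 0.05, 1.0),
    ("vocal", "density", "guitar", "delay_feedback", 0.0, 0.9),
    ("guitar", "amplitude", "vocal", "reverb_size", 0.2, 0.95),
    ("guitar", "density", "vocal", "reverb_damping", 0.1, 0.8),
]


def apply_routing(features, routing=ROUTING):
    """Compute effect parameters for each player from the other player's features."""
    params = {}
    for source, feature, dest, param, lo, hi in routing:
        params.setdefault(dest, {})[param] = scale(features[source][feature], lo, hi)
    return params
```

Keeping the routing as plain data makes the combinatorial space easy to explore: swapping two rows, or inverting a range by exchanging its minimum and maximum, gives a new interaction scenario without touching the rest of the code.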

Interaction as composition

An interaction scenario can be thought of as a composition. In this context we may define a composition as something that guides the performers to play in a certain way (think of the text pieces from the 60’s for example, setting the general mood or each musician’s role while allowing a fair amount of freedom for the performer as no specific events are notated). As the performers formulate their musical expression to act as controllers just as much as to express an independent musical statement, the interaction mode has some of the same function as a composition has in written music, namely to determine or to guide what the performers will play. In this setting, the specific performative action is freely improvised, but the interaction mode emphasizes certain kinds of action to such an extent that the improvisation is in reality not really free at all, but guided by the possibilities (the affordances) of the system. The intervention into the interaction also sheds light on regular musical interaction. We become acutely aware of what we normally do to influence how other musicians play together with us. Then this is changed and we can reflect on both the old (regular, unmodulated) kind of interaction and the new crossadaptive mode.

Feature extraction, performatively relevant features

Extracting musically salient features is a Big Question. What is musically relevant? Carl Haakon suggested that some feature related to the energy could be interesting. Energy, for the performer can be induced into the musical statement in several ways. It could be rhythmic activity, loudness, timbre, and other ways of expressing energetic performance. As such it could be a feature taking input from several mathematical descriptions. It could also be a feature allowing a certain amount of expressive freedom for the performer, as energy can be added by several widely different performative gestures, leaving some sort of independence from having to do very specific actions in order to trigger the control of the destination parameter. Mapping the energy feature to a destination parameter that results in a more rich and energetic sound could lead to musically convergent behavior, and conversely, controlling a parameter that makes the resulting sound more sparse could create musical and interactive tension. In general, it might be a good idea to use such higher level analysis. This simplifies the interaction for the musician, and also creates several alternative routes to inflict a desired change in the sound. The option to create the same effect by several independent routes/means, also provides the opportunity for doing so with different kinds of side effects (like in regular acoustic playing too), e.g. creating energy in this manner or that manner gives very different musical results but in general drives the music in a certain direction.
Machine learning (e.g. via neural networks) could be one way of extracting such higher level features, such as different performance situations or distinct expressions of a performer. We could expect some potential issues of recalibration due to external conditions, such as slight variations in the signal due to a different room, miking situation etc. Will we need to re-learn the features for each performance, or could we find robust classification methods that are not so sensitive to variations between instruments and performance situations?
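As a much simpler alternative to a learned feature, the idea of an energy feature "taking input from several mathematical descriptions" could be sketched as a hand-weighted combination. This is only a sketch under assumptions: the descriptor names (`loudness`, `density`, `brightness`), the equal default weights and the normalization to 0..1 are made up for illustration, and it is not a method used in the project.

```python
def clamp01(x):
    """Limit a value to the range 0..1."""
    return max(0.0, min(1.0, x))


def energy(features, weights=None):
    """Combine several normalized descriptors into one higher level energy value.

    `features` maps descriptor names to values in 0..1. Missing descriptors
    count as 0. The result is a weighted average, also in 0..1.
    """
    if weights is None:
        # Hypothetical descriptors with equal weight
        weights = {"loudness": 1.0, "density": 1.0, "brightness": 1.0}
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(w * clamp01(features.get(name, 0.0)) for name, w in weights.items())
    return weighted / total_weight
```

The point of the sketch is the expressive freedom described above: a loud but sparse gesture and a quiet but rhythmically dense gesture can produce the same energy value, so the performer has several independent routes to the same control signal.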

Meta mapping, interpolations between mapping situations

Dynamic mappings would allow the musicians to change the mapping modes during different sections of the performed piece. If the interaction mode becomes limited or “worn out” after a while of playing, the modulation mappings could be gradually changed. This can be controlled by an external musician or sound engineer, or it can be mapped to yet other layers of modulations. So, a separate feature of the analyzed sound is mapped to a modulator changing the mappings (the preset or the general modulation “situation” or “system state”) of all other modulators, creating a layered meta-mapping-modulator configuration. At this point this is just an option, still too complex for our initial investigation. It brings attention to the modulator mapping used in the Hadron Particle Synthesizer, where a simple X-Y pad is used to interpolate between different states of the instrument, each state containing modulation routing and mapping in addition to parameter values. The current Hadron implementation allows control over 209 parameters and 54 modulators via a simple interface. This enables a simplified multidimensional control in Hadron. Maybe the cross-adaptive situation can be thought of as somehow similar. The instrumental interface of Hadron behaves in highly predictable ways, but it is hardly possible to decode intellectually; one has to interact by intuitive control and listening.
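Interpolating between mapping states with an X-Y position can be sketched as follows. The text above does not describe how Hadron actually interpolates, so this is an assumption: four states placed at the corners of the pad, blended bilinearly. The corner names and the parameter names in the states are hypothetical, and every state is assumed to contain the same set of parameters.

```python
def interpolate_states(x, y, corners):
    """Bilinear blend of four mapping states from an X-Y position in 0..1.

    `corners` has the keys 'bottom_left', 'bottom_right', 'top_left' and
    'top_right'. Each value is a dict of parameter name to number.
    """
    x = max(0.0, min(1.0, x))
    y = max(0.0, min(1.0, y))
    # Each corner's weight grows as the position moves towards it
    weights = {
        "bottom_left": (1.0 - x) * (1.0 - y),
        "bottom_right": x * (1.0 - y),
        "top_left": (1.0 - x) * y,
        "top_right": x * y,
    }
    names = corners["bottom_left"].keys()
    return {
        name: sum(weights[c] * corners[c][name] for c in weights)
        for name in names
    }
```

In a meta-mapping configuration, `x` and `y` would not come from a physical pad but from two further analyzed features, so that the playing itself moves the system gradually from one mapping situation to another.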


The influence of the direct/unprocessed sound

With acoustic instruments, the direct sound from the instrument will be heard clearly in addition to the processed sound. In our initial experiments, we’ve simplified this by using electric guitar and close miked vocals. We mostly hear the result of the effects processing. Still, the analysis of features is done on the dry signal. This creates a situation where it may be hard to distinguish which features control which modulations, because the modulation source is not heard clearly as a separate entity in the sound image. It is easy to mix the dry sound higher, but then we hear less of the modulations. It is also possible to let the modulated sound be the basis of analysis (creating the possibility for even more complex cross-adaptive feedback modulations, as the signals can affect each other’s source for analysis). This would possibly make it even harder for the musicians to have intentional control over the analyzed features and thus the modulations. So the current scheme, if not the final answer, is a reasonable starting point.

Audience and fellow musicians’ perception of the interplay

How will the audience perceive this? Our current project does not focus on this question, but it is still relevant to visit it briefly. It also relates to expectations, to schooling of the listener. Do we want the audience to know? Does knowledge of the modulation interaction impede the (regular) appreciation of the musical expression? One could argue that a common symphony orchestra concert-goer does not necessarily know the written score, or have analyzed the music, but appreciates it as an aesthetic object on its own terms. The mileage may vary; some listeners know more and are more invested in details and tools of the trade. Still, the music itself does not require knowledge of how it is made to be appreciated. For a schooled listener, and also for ourselves, we can hope to be able to play with expectations within the crossadaptive technique. Andreas mentioned that listening to live crossadaptive processing as we demonstrated it is like listening to an unfamiliar spoken language, trying to extract meaning. There might be some frustration over not understanding how it works. Also, expecting a fantastic new interaction mode and then not hearing it can lead to disappointment. Using it as just another means of playing together, another parameter of musical expression, alleviates this somewhat. The listener does not have to know, but will probably get the opportunity for an extra layer of appreciation with understanding of the process. In any case, our current research project does not directly concern the audience’s appreciation of the produced music. We are currently at a very basic stage of exploration and we need to experiment at a much lower level to sketch out how this kind of music can work before starting to consider how (or even if) it can be appreciated by the audience.