Playing or being played – the devil is in the delays

Since the crossadaptive project involves designing relationships between performative actions and sonic responses, it is also about instrument design in a wide definition of the term. Some of these relationships can be seen as direct extensions of traditional instrument features, like the relationship between energy input and the resulting sonic output. We can call this the mapping between input and output of the instrument. Other relationships are more complex, and involve how the actions of one performer affect the processing of another. That relationship can be viewed as one performer changing the mapping between input and output of another instrument. Maybe another instrument is not the correct term to use, since we can view all of this as one combined super-instrument. The situation quickly becomes complex. Let’s take a step back and contemplate some of the separate aspects of a musical instrument and what constitutes its “moving parts”.

Playing on the sound

One thing that has always inspired me about electric and electronic instruments is how the sound of the instrument can be tweaked and transformed. I know many fellow performers will join me in saying that changing the sound of the instrument completely changes what you can and will play. This is of course true for acoustic instruments too, but it is even clearer when you can keep the physical interface identical while changing the sonic outcome drastically. The comparison becomes clearer, since the performative actions, the physical interaction with the instrument, do not need to change significantly. Still, when the sound of the instrument changes, even the physical gestures used to produce it change, and so does what you will play. There is a connection and an identification between the performer and the current sound of the instrument. This also extends to the amplification system and to the room where the sound comes out. Performers who have a lot of experience playing on big PA systems know the difference between “just” playing your instrument and playing the instrument through the sound system and in the room.

Automation in instruments

In this context, I have also mused on the subject of how much an instrument ‘does for you’ automatically, for example “smart” computer instruments that will give back (in some sense) more than you put in. The same goes for “creative” packages like GarageBand and much of what comes with programs like Ableton Live, where we can shuffle around templates of material made by others, like Photoshop beautifying filters for music. This description is not intended to paint a bleak picture of the future of creativity, but it is indeed something to be aware of. In the context of our current discussion it is relevant because of the relation between input and output of the instrument; GarageBand and Live, as instruments, will transform your input significantly according to their affordances. The concept is not necessarily limited to computer instruments either, as all instruments add ‘something’ that the performer could not have done by himself (without the external instrument). Consider an example many are familiar with: playing through a delay effect, creating beautiful rhythmic textures out of a simple input. There may be a fine line between the moment you are playing the instrument and the moment the instrument is suddenly playing you, and all you can do is try to keep up. The devil, as they say, is in the delays!

Flow and groove

There is also a common concept among musicians of the music flowing so easily that it is as if the instrument is playing itself. Being in the groove, in flow, transcendent, totally in the moment, or other descriptions may apply. One might argue that this phenomenon is also a result of training, muscle memory, gut reaction, instinct. These are in some ways automatic processes. Any fast human reaction relies in some aspect on a learned response; processing a truly unexpected event takes several hundred milliseconds. Even if it is not automated to the same degree as a delay effect, we can say that there is no clean division between automated and contemplated responses. We could probably delve deep into psychology to investigate this matter in detail, but for our current purposes it is sufficient to say that automation is present to some degree at this level of human performance, as well as in the instrument itself.

Another aspect of automation (if automation can include external events that trigger actions that would not have happened otherwise), or of “falling into the beat”, is the synchronizing action of playing in rhythm with another performer. This has some similarity to the situation of “being played” by the delay effect. The delay processor has even more of a “chasing” effect, since it will always continue, responding to every new event, non-stop. Playing with another performer does not have that self-continuing perpetual motion, but in some cases, the resulting groove might.

Adaptive, in what way?

So when performing in a crossadaptive situation, what attitude could or should we adopt towards the instrument and the processes therein? Should the musicians just focus on the acoustic sound, and play together more or less as usual, letting the processing unfold in its own right? From a traditionally trained performer’s perspective, one could expect the processing to adapt to the music that is happening, adjusting itself to create something that “works”. However, this is not the only way it could work, and perhaps not the mode that will produce the most interesting results. Another approach is to listen closely to what comes out of the processing, perhaps to the degree that we disregard the direct sound of the (acoustic) instrument and just focus on how the processing responds to the different performative gestures. In this mode, the performer would continually adjust to the combined system of acoustic instrument, processing, interaction with the other musician, and signal interaction between the two instruments, also including any contribution from the amplification system and the ambience (e.g. playing on headphones or on a PA). This is hard for many performers, because the complete instrument system is bigger, has more complex interactions, and sometimes has a delay from when an action occurs to when the system responds (this might be a musical and desirable delay, or a technical artifact); plainly, there is a larger distance to all the “moving parts” of the machinery that enables the transformation of a musical intent into a sounding result. In short, we could describe it as having a lower control intimacy. There is also, of course, a question of the willingness of the performer to put himself in a position where all these extra factors are allowed to count, as it will naturally leave most of us in a position where we are again amateurs, not knowing how the instrument works. For many performers this is not immediately attractive.
Then again, it is an opportunity to find something new and to be forced to abandon regular habits.

One aspect that I haven’t seen discussed so much is the instrumental scope of the performer. As described above, the performer may choose to focus on the acoustic and physical device that was traditionally called the instrument, and operate this with proficiency to create coherent musical statements. On the other hand, the performer may take into account the whole system of sound generation and transformation (where does that end? is it even contained in the room in which we perform the music?). Many expressive options and possibilities lie within the larger system, and the position of the listener/audience also often lies somewhere in the bigger space of this combined system. These reflections of course apply just as much to any performance on a PA system, or in a recording studio, but I’d venture to say they are crystallized even more clearly in the context of crossadaptive performance.

Intellectualize the mapping?

To what degree should the performers know and care about the details of the crossadaptive modulation mappings? Would it make sense to explore the system without knowing the mapping? Just play. It is an attractive approach for many, as any musical performance situation is in any case complex, with many unknown factors, so why not just throw in these ones too? This can of course be done, and some of our experiments in San Diego have followed this line of investigation (Kyle and I played this way with complex mappings, and the Studio A session between Steven and Kyle leaned towards this approach). The rationale for doing so is that with complex crossadaptive mappings, the intellectual load of just remembering all the connections can override any impulsive musical incentive. Now, after doing this on some occasions, I begin to see that as a general method this is perhaps not the best way to do it. The system’s response to a performative action is in many cases so complex, and relates to so many variables, that it is very hard to figure out “just by playing”. Some sort of training, or an explorative process to familiarize the performer with the new expressive dimensions, is needed in most cases. With complex mappings, this will be a time-consuming process. Just listing and intellectualizing the mappings does not work for making them available as expressive dimensions during performance. This may be blindingly obvious after the fact, but it is indeed a point worth mentioning. Familiarization with the expressive potential takes time, and is necessary in order to exploit it. We’ve seen some very clear pedagogical approaches in some of the Trondheim sessions, and these take on the challenge of getting to know the full instrument in a step-by-step manner. We’ve also seen some very fruitful explorative approaches to performance in some of the Oslo sessions.
Similarly, when Miller Puckette in our sessions in San Diego chooses to listen mainly to the processing (not to the direct sound of his instrument, and not to the direct sound of his fellow musician’s instrument, but to the combined result), he actively explores the farthest reaches of the space constituted by the instrumental system as a whole. Miller’s approach can work even if all the separate dimensions have not been charted and familiarized separately, basically because he focuses almost exclusively on those aspects of the combined system output. As often happens in conversations with Miller, he captures the complexity and the essence of the situation in clear statements:

“The key ingredients of phrasing is time and effort.”

What about the analytical listener?

In our current project we don’t include any proper research on how this music is experienced by a listener. Still, we as performers and designers/composers also experience the music as listeners, and we cannot avoid wondering how (or if) these new dimensions of expression affect the perception of the music “from the outside”. The different presentations and workshops of the project afford opportunities to hear how outside listeners perceive it. One recent and interesting such opportunity came when I was asked to present something for Katharina Rosenberger’s Composition Analysis class at UCSD. The group comprised graduate composition students, critical and highly reflective listeners, who in the context of this class aimed their listening especially towards analysis. What is in there? How does it work musically? What is this composition? Where is the composing? In the discussions with this class, I got to ask them if they perceived it as important for the listener to know and understand the crossadaptive modulation mappings. Do they need to learn the intricacies of the interaction and the processing in the same pedagogical manner? The output from this class was quite clear on the subject:

It is the things they make the performers do that is important

In one way, we could understand this as a modernist stance: if it is in there, it will be heard, and thus it matters. We could also understand it to mean that the changes in the interaction, the things performers will do differently in this setting, are what is most interesting. When we hear surprise (in the performer), and a subsequent change of direction, we can follow that musically without knowing the exact details that led to the surprise.



The entrails of Open Sound Control, part one

Many of us are very used to employing the Open Sound Control (OSC) protocol to communicate with synthesisers and other music software. It’s very handy and flexible for a number of applications. In the cross adaptive project, OSC provides the backbone of communications between the various bits of programs and plugins we have been devising.

Generally speaking, we do not need to pay much attention to the implementation details of OSC, even as developers. User-level tasks only require us to decide the names of the message addresses, their types, and the source of the data we want to send. At the programming level, it’s not very different: we just employ an OSC implementation from a library (e.g. liblo, PyOSC) to send and receive messages.

It is only when these libraries are not doing the job as well as we’d like that we have to get our hands dirty. That’s what happened in the past weeks in the project. Oeyvind diagnosed some significant delays and a higher-than-usual cost in OSC message dispatch. This, when we looked, seemed to stem from the underlying implementation we have been using in Csound (liblo, in this case). We tried to get around this by implementing asynchronous operation, which seemed to improve the latencies but did nothing to help with the computational load. So we had to change tack.

OSC messages are transport-agnostic, but in most cases they use the User Datagram Protocol (UDP) transport layer to package and send messages from one machine (or program) to another. So it appeared to me that we could simply write our own sender implementation using UDP directly. I got down to programming an OSCsend opcode that would be a drop-in replacement for the original liblo-based one.

OSC messages are quite straightforward in their structure, based on 4-byte blocks of data. They start with an address, which is a null-terminated string like, for instance, “/foo/bar”  :

'/' 'f' 'o' 'o' '/' 'b' 'a' 'r' '\0'

This, we can count, has 9 characters – 9 bytes – and, because of the 4-byte structure, needs to be padded to the next multiple of 4, 12, by inserting some more null characters (zeros). If we don’t do that, an OSC receiver would probably barf at it.
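The padding rule can be sketched in a few lines of Python (an illustration, not the actual Csound implementation):

```python
def osc_pad(data: bytes) -> bytes:
    """Pad a byte string with null bytes up to the next multiple of 4."""
    return data + b'\x00' * (-len(data) % 4)

# "/foo/bar" plus its terminating null is 9 bytes, padded up to 12
address = b'/foo/bar\x00'
print(len(osc_pad(address)))  # -> 12
```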

Next, we have the data types, e.g. ‘i’, ‘f’, ‘s’ or ‘b’ (the basic types). The first two are numeric, 4-byte integers and floats, respectively. These are to be encoded as big-endian numbers, so we will need to byteswap in little-endian platforms before the data is written to the message. The data types are encoded as a string with a starting comma (‘,’) character, and need to conform to 4-byte blocks again. For instance, a message containing a single float would have the following type string:

',' 'f' '\0'

or “,f”. This will need another null character to make it a 4-byte block. Following this, the message takes in a big-endian 4-byte floating-point number.  Similar ideas apply to the other numeric type carrying integers.
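The big-endian encoding can be illustrated with Python’s struct module, whose `'>f'` format produces the 4-byte big-endian float OSC expects, regardless of the host machine’s endianness:

```python
import struct

# encode 1.5 as a big-endian IEEE 754 single-precision float
payload = struct.pack('>f', 1.5)
print(payload.hex())  # -> '3fc00000'
```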

String types (‘s’) denote a null-terminated string, which as before, needs to conform to a length that is a multiple of 4-bytes. The final type, a blob (‘b’), carries a nondescript sequence of bytes that needs to be decoded at the receiving end into something meaningful. It can be used to hold data arrays of variable lengths, for instance. The structure of the message for this type requires a length (number of bytes in the blob) followed by the byte sequence. The total size needs to be a multiple of 4 bytes, as before. In Csound, blobs are used to carry arrays, audio signals and function table data.

If we follow this recipe, it is pretty straightforward to assemble a message, which will be sent as a UDP packet. Our example above would look like this:

'/' 'f' 'o' 'o' '/' 'b' 'a' 'r' '\0' '\0' '\0' '\0'
',' 'f' '\0' '\0' 0x3F800000  (here the float 1.0, big-endian)
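Putting the pieces together, a minimal Python sketch of this recipe for a single-float message (an illustration, not the actual OSCsend code; the port number is arbitrary) might look like this:

```python
import socket
import struct

def osc_pad(data: bytes) -> bytes:
    # pad with null bytes up to the next multiple of 4
    return data + b'\x00' * (-len(data) % 4)

def osc_message(address: str, value: float) -> bytes:
    # address and type-tag string are null-terminated and padded;
    # the float argument is written big-endian
    msg = osc_pad(address.encode() + b'\x00')
    msg += osc_pad(b',f\x00')
    msg += struct.pack('>f', value)
    return msg

packet = osc_message('/foo/bar', 1.0)
# send the assembled message as a single UDP datagram
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(packet, ('127.0.0.1', 9000))
```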

This is what OSCsend does, as well as its new implementation. With it, we managed to provide a lightweight (low computation cost) and fast OSC message sender. In the followup to this post, we will look at the other end, how to receive arbitrary OSC messages from UDP.

Docmarker tool


During our studio sessions and other practical research work, we noted that we needed a tool to annotate documentation streams. The stream could be an audio file, a video, or some other line of timed events. Audio editors and DAWs have tools for dropping markers into a file, and there are also tools for annotating video. However, we wanted an easy way of recording timed comments from many users, allowing these to be tied to any sequence of events, whether recorded as audio, video or in some other form. We also wanted each user to be able to make comments without necessarily having access to the original file, and for several users to be able to make comments simultaneously. By allowing comments from several users to be merged, one can also use the tool to do several “passes” of making comments, merging with one’s own previous comments.

Presumably, one can use this for other kinds of timed comments too: taking notes on one’s own audio mixes, making edit lists from long interviews, … even marking student compositions…
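As a hypothetical illustration of the core idea (not the actual Docmarker code), timestamped commenting and merging could be sketched like this: each user records comments relative to a shared start time, and comment lists from several users (or several passes) are merged by sorting on the timestamp.

```python
import time

def make_comment(start_time: float, user: str, text: str) -> dict:
    """Record a comment stamped with the time elapsed since start_time."""
    return {'time': time.time() - start_time, 'user': user, 'text': text}

def merge_comments(*comment_lists):
    """Merge comment lists from several users/passes into one timeline."""
    merged = [c for lst in comment_lists for c in lst]
    return sorted(merged, key=lambda c: c['time'])
```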

The tool is a simple Python script, and the code can be found at
Download, unzip, and run from the terminal with:


Crossadaptive session NTNU 12. December 2016


Trond Engum (processing musician)

Tone Åse (vocals)

Carl Haakon Waadeland (drums and percussion)

Andreas Bergsland (video)

Thomas Henriksen (sound technician)

Video digest from session:

Session objective and focus:

The main focus in this session was to explore analysis methods other than those used in earlier sessions (focusing on rhythmic consonance for the drums, and spectral crest on the vocals). These analysis methods were chosen to gain a wider understanding of their technical functionality, but also of their possible use in musical interplay. In addition, there was an intention to include the sample/hold function of the MIDIator plug-in. The session was also set up with a large screen in the live room so that all participants could monitor the processing instrument at all times. The idea was to democratize the processing musician’s role during the session, opening up for discussion and tuning of the system as a collective process based on a mutual understanding. This would hopefully communicate a better understanding of the functionality of the system, and of how the musicians can individually navigate within it through their musical input. At the same time, this also opens up a closer dialogue around the choice of effects and parameter mapping during the process.

Earlier experiences and process

Following up on experiences documented in earlier sessions and previous blog posts, the session was prepared to avoid the most obvious shortcomings. First of all, separation between instruments, to avoid bleed through the microphones, was arranged by placing vocals and drums in separate rooms. Bleed between microphones had in earlier sessions affected both the analysed signals and the effects. The system was prepared beforehand to be as flexible as possible, containing several effects to map to; flexibility in this context meaning the possibility of making fast changes and tuning the system depending on the thoughts of the musicians. Since the group of musicians remained unchanged during the session, this flexibility was also seen as a necessity for going into details and more subtle changes, both in the MIDIator and in the effects in play, to reach common aesthetic intentions.

Due to technical problems in the studio (not connected with the cross adaptive set-up or software) the session was delayed for several hours, leaving us with less time than originally planned. We therefore chose to concentrate only on rhythmic consonance (referred to as rhythmical regularity in the video) as the analysis method for both drums and vocals. To familiarize ourselves with this analysis tool, we started with the drums, trying out different playing techniques with both regular and irregular strokes while monitoring the visual feedback from the analyser plug-in without any effect. Regular strokes resulted in a high, stable value; irregular strokes resulted in a low value.


Figure 1. Consonance (regularity) visualized in the upper graph.

What became evident was that when the input stopped, the analyser stayed at the last measured value, and in that way could act as a sort of sample/hold function on the last value and in that sense stabilise a setting in an effect until an input was introduced again. Another aspect was that the analysing method worked well for regularity in rhythm, but had more unpredictable behaviour when introducing irregularity.

After learning the analyser’s behaviour, it was mapped to a delay plugin as an adaptive effect on the drums. The parameter controlled the time range of 14 delays: the more regularity, the larger the delay time range, and vice versa.

After fine-tuning the delay range we agreed that the connection between the analyser, MIDIator and choice of effect worked musically in the same direction. (This was changed later in the session when trying out cross-adaptive processing).

The same procedure was followed for the vocals, but this time concentrating the visual monitoring mostly on the last stage of the chain, the delay effect. This was experienced as more intuitive once all settings were mapped, since the musician could then interact visually with the input during performance.

Cross-adaptive processing.

When starting the cross-adaptive recording, everyone had followed the process and tried out the chosen analysis method on their own instruments. Even though the focus was mainly on the technical aspects, the process had already given the musicians the possibility to rehearse and get familiar with the system.

The system we ended up with was set up in the following way:

Both drums and vocals were analysed for rhythmical consonance (regularity). The drums controlled the send volume to a convolution reverb and a pitch shifter on the vocals: the more regular the drums, the less of the effects; the less regular the drums, the more of the effects.

The vocals controlled the time range of the echo plugin on the drums: the more regular the pulses from the vocals, the smaller the echo time range on the drums; the less regular the pulses, the larger the echo time range.
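As an illustration only (the actual scaling was done with the MIDIator plug-in, and the parameter ranges here are invented), the two inverse mappings can be sketched as linear scalings of the regularity value:

```python
def scale(value: float, out_min: float, out_max: float, invert: bool = False) -> float:
    """Map an analysis value in the range 0..1 to an effect parameter range."""
    v = 1.0 - value if invert else value
    return out_min + v * (out_max - out_min)

# drums: more regularity -> less reverb/pitch-shift send on the vocals
drum_regularity = 0.8
vocal_send_level = scale(drum_regularity, 0.0, 1.0, invert=True)

# vocals: more regularity -> smaller echo time range on the drums
vocal_regularity = 0.3
drum_echo_range_ms = scale(vocal_regularity, 50.0, 1000.0, invert=True)
```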

Sound example (improvisation with cross adaptive setup): 


Concerts and presentations, fall 2016

A number of concerts, presentations and workshops were given during October and November 2016. We could call it the 2016 Crossadaptive Transatlantic tour, but we won’t. This post gives a brief overview.

Concerts in Trondheim and Göteborg

BRAK/RUG was scheduled for a concert (with a preceding lecture/presentation) at Rockheim, Trondheim on 21 October. Unfortunately, our drummer Siv became ill and could not play. At 5 in the afternoon (concert start at 7) we called Ola Djupvik to ask if he could sit in with us. Ola has experience from playing in a musical setting with live processing and crossadaptive processing, for example the session of 20.–21 September, and also from performing with music technology students Ada Mathea Hoel, Øystein Marker and others. We were very happy and grateful for his courage in stepping in on such short notice. Here’s an excerpt from the presentation that night, showing vocal pitch controlling reverb on the drums (higher pitch means smaller reverb size), and transient density on the drums controlling delay feedback on the vocals (faster playing means less feedback).

There is a significant amount of crossbleed between vocals and drums, so the crossadaptivity is quite flaky. We still have some work to do on source separation to make this work well when playing live with a PA system.

Crossadaptive demo at Rockheim mix_maja_ola_cross_rockheim_ptmstr


Thanks to Tor Breivik for recording the Rockheim event. The clip here shows only the crossadaptive demonstration. The full concert is available on Soundcloud

Brandtsegg, Ratkje, Djupvik trio at Rockheim

The day after the Trondheim concert, we played at the Göteborg Art Sounds festival. Now, Siv was feeling better and was able to play. Very nice venue at Stora Teatern. This show was not recorded.

And then we take… the US

The crossadaptive project was presented at the Transatlantic Forum in Chicago on October 24, in a special session titled “Sensational Design: Space, Media, and the Senses”. Sigurd Saue, Trond Engum and myself (Øyvind Brandtsegg) all took part in the presentation, showing the many-faceted aspects of our work. Being a team of three people also helped the networking effort that is naturally a part of such a forum. During our stay in Chicago, we also visited the School of the Art Institute of Chicago, meeting Nicolas Collins, Shawn Decker, Lou Mallozzi, and Bob Snyder to start working on exchange programs for both students and faculty. Later in the week, Brandtsegg did a presentation of the crossadaptive project during a SAIC class on audio projects.

Sigurd Saue and Bob Snyder at SAIC

After Chicago, Engum and Saue returned to Trondheim, while I traveled on to San Francisco, Los Angeles, Santa Barbara, and then finally to San Diego.
In the Bay Area, after jamming with Joel Davel in Paul Dresher’s studio, and playing a concert with Matt Ingalls and Ken Ueno at Tom’s Place, I presented the crossadaptive project at CCRMA, Stanford University on November 2. The presentation seemed well received and spurred a long discussion where we touched on the use of MFCCs, ratios and critical bands, stabilizing the peaks of rhythmic autocorrelation, the difference of the correlation between two inputs (to get to the details of each signal), and more. Getting the opportunity to discuss audio analysis with this crowd was a treat. I also got the opportunity to go back the day after to look at student projects, which I find gives a nice feel for the vibe of the institution. There is a video of the presentation here

After Stanford, I also did a presentation at the beautiful CNMAT at UC Berkeley, with Ed Campion, Rama Gottfried, and a group of enthusiastic students. There I also met my colleague P.A. Nilsson from Göteborg, who was on a residency there. P.A.’s current focus on technology to intervene in and structure improvisations is closely related to some of the implications of our project.

CNMAT, UC Berkeley

On November 7 and 8, I did workshops at California Institute of the Arts, invited by Amy Knoles. In addition to presenting the technologies involved, we did practical studies where the students played in processed settings and experienced the musical potential and also the different considerations involved in this kind of performance.

Calarts workshops

Clint Dodson and Øyvind Brandtsegg experimenting together at CalArts

At UC Santa Barbara, I did a presentation in Studio Xenakis on November 9. There, I met with Curtis Roads, Andres Cabrera, and a broad range of their colleagues and students. With regard to listening to crossadaptive performances, Curtis Roads made a precise observation: it is relatively easy to follow if one knows the mappings, but it can be hard to decode the mapping just by listening to the results. In Santa Barbara I also got to meet Owen Campbell, who did a master’s thesis on crossadaptive processing, and I got insight into his research and software solutions. His work on ADEPT was also presented at the AES workshop on intelligent music production at Queen Mary University this September, where Owen also met our student Iver Jordal, who presented his research on artificial intelligence in crossadaptive processing.

San Diego

Back in San Diego, I did a combined presentation and concert for the computer music forum on November 17.  I had the pleasure of playing together with Kyle Motl on double bass for this performance.

Kyle Motl and Øyvind Brandtsegg, UC San Diego

We demonstrated both live processing and crossadaptive processing between voice and bass. There was a rich discussion with the audience. We touched on issues of learning (one parameter at a time, or learning a combined and complex parameter set as one would do on an acoustic instrument), etudes, inverted mappings sometimes being more musically intuitive, how this can make a musician pay more attention to the other than to the self (frustrating or liberating?), and tuning of the range and shape of parameter mappings (still seems to be a bit on/off sometimes, with relatively low resolution in the middle range).

First we did an example of a simple mapping:
Vocal amplitude reduces reverb size for the bass,
Bass amplitude reduces delay feedback on the vocals

Kyle and Oeyvind Simple mix_sd_nov_17_3_cross_ptmstr


Then a more complex example:
Vocal transient density -> Bass lowpass filter frequency
Vocal pitch -> Bass delay filter frequency
Vocal percussiveness -> Bass delay feedback
Bass transient density -> Vocal reverb size (less)
Bass pitch+centroid -> Vocal tremolo speed
Bass noisiness -> Vocal tremolo grain size (less)

K&O complex mapping mix_sd_nov_17_4_cross_ptmstr


We also demonstrated another, more direct kind of crossadaptive processing: convolution with a live sampled impulse response. Oeyvind manually controlled the IR live sampling of sections of Kyle’s playing, and also triggered the convolver by tapping and scratching on a small wooden box with a piezo microphone. The wooden box source is not heard directly in the recording, but the resulting convolution is. No other processing is done, just the convolution process.

K&O, convolution mix_sd_nov_17_2b_conv_ptmstr


We also played a longer track of regular live processing this evening. This track is available on Soundcloud

Thanks to UCSD and recording engineers Kevin Dibella and James Forest Reid for recording the Nov 17 event.