The entrails of Open Sound Control, part one

Many of us are very used to employing the Open Sound Control (OSC) protocol to communicate with synthesisers and other music software. It’s very handy and flexible for a number of applications. In the cross adaptive project, OSC provides the backbone of communications between the various bits of programs and plugins we have been devising.

Generally speaking, we do not need to pay much attention to the implementation details of OSC, even as developers. User-level tasks only require us to decide the names of messages addresses, its types and the source of data we want to send. At Programming level,  it’s not very different: we just employ an OSC implementation from a library (e.g. liblo, PyOSC) to send and receive messages.

It is only when these libraries are not doing the job as well as we’d like that we have to get our hands dirty. That’s what happened in the past weeks at the project. Oeyvind has diagnosed some significant delays and higher than usual cost in OSC message dispatch. This, when we looked, seemed to stem from the underlying implementation we have been using in Csound (liblo, in this case). We tried to get around this by implementing an asynchronous operation, which seemed to improve the latencies but did nothing to help with computational load. So we had to change tack.

OSC messages are transport-agnostic, but in most cases use the User Datagram Protocol transport layer to package and send messages from one machine (or program) to another. So, it appeared to me that we could just simply write our own sender implementation using UDP directly. I got down to programming an OSCsend opcode that would be a drop-in replacement for the original liblo-based one.

OSC messages are quite straightforward in their structure, based on 4- byte blocks of data. They start with an address, which is a null-terminated string like, for instance, “/ foo / bar ”  :

'/' 'f' 'o' 'o' '/' 'b' 'a' 'r' '\0'

This, we can count, has 9 characters – 9 bytes – and, because of the 4-byte structure, needs to be padded to the next multiple of 4, 12, by inserting some more null characters (zeros). If we don’t do that, an OSC receiver would probably barf at it.

Next, we have the data types, e.g. ‘i’, ‘f’, ‘s’ or ‘b’ (the basic types). The first two are numeric, 4-byte integers and floats, respectively. These are to be encoded as big-endian numbers, so we will need to byteswap in little-endian platforms before the data is written to the message. The data types are encoded as a string with a starting comma (‘,’) character, and need to conform to 4-byte blocks again. For instance, a message containing a single float would have the following type string:

',' 'f' '\0'

or “,f”. This will need another null character to make it a 4-byte block. Following this, the message takes in a big-endian 4-byte floating-point number.  Similar ideas apply to the other numeric type carrying integers.

String types (‘s’) denote a null-terminated string, which as before, needs to conform to a length that is a multiple of 4-bytes. The final type, a blob (‘b’), carries a nondescript sequence of bytes that needs to be decoded at the receiving end into something meaningful. It can be used to hold data arrays of variable lengths, for instance. The structure of the message for this type requires a length (number of bytes in the blob) followed by the byte sequence. The total size needs to be a multiple of 4 bytes, as before. In Csound, blobs are used to carry arrays, audio signals and function table data.

If we follow this recipe, it is pretty straightforward to assemble a message, which will be sent as a UDP packet. Our example above would look like this:

'/' 'f' 'o' 'o' '/' 'b' 'a' 'r' '\0' '\0' '\0' '\0'
',' 'f' '\0' '\0' 0x00000001

This is what OSCsend does, as well as its new implementation. With it, we managed to provide a lightweight (low computation cost) and fast OSC message sender. In the followup to this post, we will look at the other end, how to receive arbitrary OSC messages from UDP.

Cross adaptive session with 1st year jazz students, NTNU, March 7-8

This is a description of a session with first year jazz students at NTNU recorded March 7 and 8. The session was organized as part of the ensemble teaching that is given to jazz students at NTNU, and was meant to take care of both the learning outcomes from the normal ensemble teaching, and also aspects related to the cross adaptive project.

Musicians:

Håvard Aufles, Thea Ellingsen Grant, Erlend Vangen Kongstorp, Rino Sivathas, Øyvind Frøberg Mathisen, Jonas Enroth, Phillip Edwards Granly, Malin Dahl Ødegård and Mona Thu Ho Krogstad.

Processing musician:

Trond Engum

Video documentation:

Andreas Bergsland

Sound technician:
Thomas Henriksen

Video digest from the session:

Preparation:

Based on our earlier experiences with bleeding between microphones we located instruments in separate rooms. Since there was quit a big group of different performers it was important that changing set-up took as little time as possible. There was also prepared a system set-up beforehand based on the instruments in use. To gain an understanding of the project from the performer side as early in the process as possible we used the same four step chronology when introducing the performers to the set-up.

  1. Start with individual instruments trying different effects through live processing and decide together with the performers what effects most suitable to add to their instrument.
  2. Introducing the analyser and decide, based on input form the performers, which methods best suited for controlling different effects from their instrument.
  3. Introducing adaptive processing were one performer is controlling the effects on the other, and then repeat vice versa.
  4. Introducing cross-adaptive processing were all previous choices and mappings are opened up for both performers.

Session report:

Day 1. Tuesday 7 th March

Trumpet and drums

Sound example 1: (Step 1) Trumpet live processed with two different effects, convolution (impulse response from water) and overdrive.

The performer was satisfied with the chosen effects, also because the two were quite different in sound quality. The overdrive was experienced as nice, but he would not like to have it present all the time. We decided to save these effects for later use on trumpet, and be aware of dynamic control on the overdrive.

Sound example 2: (Step 1) Drums live processed with dynamically changing delay and a pitch shift 2 octaves down. The performer found the chosen effects interesting, and the mapping was saved for later use.

Sound example 3: (Step 1) Before entering the analyser and adaptive processing we wanted to try playing together with the effects we had chosen to see if they blended well together. The trumpet player had some problems with hearing the drums during the performance, felt as they were a bit in the background. We found out that the direct sound of the drums was a bit low in the mix, and this was adjusted. We discussed that it is possible to make the direct sound of both instruments louder or softer depending what the performer wants to achieve.

Sound example 4. (Step 2/3) For this example we entered into the analyser using transient density on drums. This was tried out by showing the analyser at the same time as doing an accelerando on drums. This was then set up as an adaptive control from drums on the trumpet. For control, the trumpet player had a suggestion that the more transient density the less convolution effect was added to the trumpet (less send to a convolution effect with a recording of water). The reason for this was that it could make more sense to have more water on slow ambient parts than on the faster hectic parts. At the same time he suggested that the opposite should happen when adding overdrive to the trumpet by transient density meaning that the more transient density the more overdrive on the trumpet. During the first take a reverb was added to the overdrive in order to blend the sound more into the production. It felt like the dynamical control over the effects was a bit difficult because the water disappeared to easily, and the overdrive was introduced to easily. We agreed to fine-tune the dynamical control before doing the actual test that is present as sound example 4.

Sound example 5: For this example we changed roles and enabled the trumpet to control the drums (adaptive processing). We followed a suggestion from the trumpet player and used pitch as an analyses parameter. We decided to use this to control the delay effect on the drums. Low notes produced long gaps between delays, whereas high notes produced small gap between delays. This was maybe not the best solution for getting good dynamical control, but we decide to keep this anyway.

Sound example 6: Cross adaptive performance using the effects and control mappings introduced in example 4 and 5. This was a nice experience for the musicians. Even though it still felt a bit difficult to control it was experienced as musical meaningful. Drummer: “Nice to play a steady grove, and listen to how the trumpet changed the sound of my instrument”.

Vocals and piano

Sound example 7: We had now changed the instrumentation over to vocals and piano, and we started with a performance doing live processing on both instruments. The vocals were processed using two different effects using a delay, and convolution through a recording of small metal parts. The piano was processed using an overdrive and convolution through water.

Sound example 8: Cross adaptive performance where the piano was analysed by rhythmical consonance controlling the delay effect on vocals. The vocal was analysed by transient density controlling the convolution effect on the piano. Both musicians found this difficult, but musically meaningful. Sometimes the control aspect was experienced as counterintuitive to the musical intention. Pianist: It felt like there was a 3rd musician present.

Saxophone self-adaptive processing

Sound example 9: We started with a performance doing live processing to familiarize the performer with the effects. The performer found the augmentation of extended techniques as clicks and pops interesting since this magnified “small” sounds.

Sound example 10: Self-adaptive processing performances where the saxophone was analysed by transient density and then used to control two different convolution effects (recording of metal parts and recording of a cymbal). The first one resulting in a delay effect the second as a reverb. The higher transient density in the analyses the more delay and less reverb and vice versa. The performer experienced the quality of the effects quit similar so we removed the delay effect.

Sound example 11: Self-adaptive processing performances using the same set-up but changing the delay effect to overdrive. The use of overdrive on saxophone did not bring anything new to the table the way it was set up since the acoustic sound of the instrument could sound similar to the effect when putting in strong energy.

Day 2. Wednesday 8 th March

Saxophone and piano

Sound example 12: Performance with saxophone and live processing, familiarizing the performer with the different effects and then choose which of the effects to bring further into the session. Performer found this interesting and wanted to continue with reverb ideas.

Sound example 13: Performance with piano and live processing. The performer especially liked the last part with the delays – Saxophonist: “It was like listening to the sound under water (convolution with water) sometimes, and sometimes like listening to an old radio (overdrive)”. Piano wanted to keep the effects that were introduced.

Sound example 14: Adaptive processing, controlling delay on saxophone from the piano by using analyses of the transient density. The higher transient density, the larger gap between delays on the saxophone. The saxophone player found it difficult to interact since the piano had a clean sound during performance. The piano on the other hand felt in control over the effect that was added.

Sound example 15: Adaptive processing using saxophone to control piano. We analyzed the rhythmical consonance on saxophone. The higher degree of consonance, the more convolution effect (water) was added to piano and vice versa. Saxophone didn’t feel in control during performance, and guessed it was due to not holding a steady rhythm over a longer period. The direct sound of the piano was also a bit loud in the mix making the added effect a bit low in the mix. Piano felt that saxophone was in control, but agreed to the point that the analyses was not able to read to the limit because of the lack of a steady rhythm over a longer time period.

Sound example 16: Crossadptive performance using the same set-up as in example 14 and 15. Both performers felt in control, and started to explore more of the possibilities. Interesting point when the saxophone stops to play since the rhythmical consonance analyses will make a drop as soon as it starts to read again. This could result in strong musical statements.

Sound example 17: Crossadaptive performance keeping the same setting but adding rms analyses on the saxophone to control a delay on the piano (the higher rms the less delay and vice versa).

Vocals and electric guitar

Sound example 18: Performance with vocals and live processing. Vocalist: “It is fun, but something you need to get use to, needs a lot of time”.

Sound example 19: Performance with Guitar and live processing. Guitarist: “Adapted to the effects, my direct sound probably sounds terrible, feel that I`m loosing my touch, but feels complementary and a nice experience”.

Sound example 20: Performance with adaptive processing. Analyzing the guitar using rms and transient density. The higher transient density the more delay added to the vocal, and higher rms the less reverb added to the vocal. Guitar: I feel like a remote controller and it is hard to focus on what I play sometimes. Vocalist: “Feels like a two dimensional way of playing”.

Sound example 21: Performance with adaptive processing. Controlling the guitar by vocals. Analyzing the rhythmical consonance on the vocal to control the time gap between delays inserted on the guitar. Higher rhythmical consonance results in larger gaps and vice versa. The transient density on vocal controls the amount of pitch shift added to the guitar. The higher transient density the less volume is sent to the pitch shift.

Sound example 22: Performance with cross adaptive processing using the same settings as in sound example 20 and 21.

Vocalist: “It is another way of making music, I think”. Guitarist: “I feel control and I feel my impact, but musical intention really doesn’t fit with what is happening – which is an interesting parameter. Changing so much with doing so little is cool”.

Observation and reflections

The sessions has now come to a point were there is less time used on setting up and figuring out how the functionality in the software works, and more time used on actual testing. This is an important step taking in consideration working with musicians that are introduced to the concept the first time. A good stability in software and separation between microphones makes the workflow much more effective. It still took some time to set up everything the first day due to two system crashes, the first one related to the midiator, the second one related to video streaming.

Since preparing the system beforehand there was a lot of reuse both concerning analyzing methods and the choice of effects. Even though there were a lot of reuse on the technical side the performances and results has a large variety in expressions. Even though this is not surprising we think it is an important aspect to be reminded of during the project.

Another technical workaround that was discussed concerning the analyzing stage was the possibility to operate with two different microphones on the same instrument. The idea is then to use one for reading analyses, and one for capturing the “total” sound of the instrument for use in processing. This will of course depend on which analyzing parameter in use, but will surely help for a more dynamical reading in some situations both concerning bleeding, but also for closer focus on wanted attributes.

The pedagogical approach using the four-step introduction was experienced as fruitful when introducing the concept to musicians for the first time. This helped the understanding during the process and therefor resulted in more fruitful discussions and reflections between the performers during the session. Starting with live processing says something about possibilities and flexible control over different effects early in the process, and gives the performers a possibility to be a part of deciding aesthetics and building a framework before entering the control aspect.

Quotes from the the performers:

Guitarist: “Totally different experience”. “Felt best when I just let go, but that is the hardest part”. “It feels like I’m a midi controller”. “… Hard to focus on what I’m playing”. “Would like to try out more extreme mappings”

Vocalist: “The product is so different because small things can do dramatic changes”. “Musical intention crashes with control”. “It feels like a 2-dimensional way of playing”

Pianoist: “Feels like an extra musician”