Cross adaptive processing as musical intervention
http://crossadaptive.hf.ntnu.no
Exploring radically new modes of musical interaction in live performance

Cross adaptive session with 1st year jazz students, NTNU, March 7-8
Posted 6 April 2017

This is a description of a session with first-year jazz students at NTNU, recorded March 7 and 8. The session was organized as part of the ensemble teaching given to jazz students at NTNU, and was meant to cover both the learning outcomes of the normal ensemble teaching and aspects related to the cross adaptive project.

Musicians:

Håvard Aufles, Thea Ellingsen Grant, Erlend Vangen Kongstorp, Rino Sivathas, Øyvind Frøberg Mathisen, Jonas Enroth, Phillip Edwards Granly, Malin Dahl Ødegård and Mona Thu Ho Krogstad.

Processing musician:

Trond Engum

Video documentation:

Andreas Bergsland

Sound technician:
Thomas Henriksen

 

Video digest from the session:

Preparation:

Based on our earlier experiences with bleeding between microphones, we located the instruments in separate rooms. Since this was quite a big group of different performers, it was important that changing the set-up took as little time as possible, so a system set-up was prepared beforehand based on the instruments in use. To give the performers an understanding of the project as early in the process as possible, we used the same four-step chronology when introducing them to the set-up.

  1. Start with the individual instruments, trying different effects through live processing, and decide together with the performers which effects are most suitable to add to their instrument.
  2. Introduce the analyser and decide, based on input from the performers, which analysis methods are best suited for controlling different effects from their instrument.
  3. Introduce adaptive processing, where one performer controls the effects on the other, and then repeat vice versa.
  4. Introduce cross-adaptive processing, where all previous choices and mappings are opened up for both performers.

 

Session report:

Day 1. Tuesday 7th March

Trumpet and drums

Sound example 1: (Step 1) Trumpet live processed with two different effects, convolution (impulse response from water) and overdrive.

 

The performer was satisfied with the chosen effects, also because the two were quite different in sound quality. The overdrive was experienced as nice, but he would not want it present all the time. We decided to save these effects for later use on the trumpet, and to be aware of dynamic control of the overdrive.

 

Sound example 2: (Step 1) Drums live processed with dynamically changing delay and a pitch shift 2 octaves down. The performer found the chosen effects interesting, and the mapping was saved for later use.

 

Sound example 3: (Step 1) Before entering the analyser and adaptive processing, we wanted to try playing together with the chosen effects to see if they blended well. The trumpet player had some problems hearing the drums during the performance and felt they were a bit in the background. We found that the direct sound of the drums was a bit low in the mix, and this was adjusted. We discussed that it is possible to make the direct sound of either instrument louder or softer depending on what the performer wants to achieve.

 

Sound example 4. (Step 2/3) For this example we brought in the analyser, using transient density on the drums. This was tried out by showing the analyser output while the drummer played an accelerando. The analysis was then set up as adaptive control from the drums over the trumpet. The trumpet player suggested that the higher the transient density, the less convolution effect should be added to the trumpet (less send to a convolution effect loaded with a recording of water). The reasoning was that it makes more sense to have more water on slow, ambient parts than on faster, hectic parts. At the same time he suggested that the opposite should apply to the overdrive: the higher the transient density, the more overdrive on the trumpet. During the first take a reverb was added after the overdrive in order to blend the sound better into the production. Dynamic control over the effects felt a bit difficult because the water disappeared too easily and the overdrive came in too easily. We agreed to fine-tune the dynamic control before doing the actual take that is presented as sound example 4.
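The mapping logic here boils down to two scalings of the same analysis value: an inverted one for the convolution send and a direct one for the overdrive send. The following minimal Python sketch only illustrates that idea; the 0-1 value range and the function name are assumptions made for the example, not the actual analyser/MIDIator implementation running in the DAW.

# Hypothetical illustration: transient density (assumed normalized to 0-1) controls two sends.
def map_transient_density(density):
    convolution_send = 1.0 - density  # inverted: more density -> less "water" convolution
    overdrive_send = density          # direct: more density -> more overdrive
    return convolution_send, overdrive_send

# Slow, sparse playing gives mostly water; fast, hectic playing gives mostly overdrive.
print(map_transient_density(0.25))  # (0.75, 0.25)
print(map_transient_density(0.75))  # (0.25, 0.75)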

 

Sound example 5: For this example we changed roles and let the trumpet control the drums (adaptive processing). Following a suggestion from the trumpet player, we used pitch as the analysis parameter and mapped it to the delay effect on the drums: low notes produced long gaps between delays, whereas high notes produced short gaps. This was perhaps not the best solution for good dynamic control, but we decided to keep it anyway.

 

Sound example 6: Cross-adaptive performance using the effects and control mappings introduced in examples 4 and 5. This was a nice experience for the musicians. Even though it still felt a bit difficult to control, it was experienced as musically meaningful. Drummer: "Nice to play a steady groove and listen to how the trumpet changed the sound of my instrument."

 

Vocals and piano

Sound example 7: We then changed the instrumentation to vocals and piano, and started with a performance using live processing on both instruments. The vocals were processed with two different effects: a delay, and convolution with a recording of small metal parts. The piano was processed with an overdrive and convolution with a recording of water.

 

Sound example 8: Cross-adaptive performance where the piano was analysed for rhythmical consonance, controlling the delay effect on the vocals, while the vocals were analysed for transient density, controlling the convolution effect on the piano. Both musicians found this difficult, but musically meaningful. Sometimes the control aspect was experienced as counterintuitive to the musical intention. Pianist: "It felt like there was a third musician present."

 

Saxophone self-adaptive processing

Sound example 9: We started with a performance using live processing to familiarize the performer with the effects. The performer found the augmentation of extended techniques such as clicks and pops interesting, since it magnified "small" sounds.

 

Sound example 10: Self-adaptive processing performance where the saxophone was analysed for transient density, which controlled two different convolution effects (a recording of metal parts and a recording of a cymbal). The first acted as a delay effect, the second as a reverb: the higher the transient density, the more delay and the less reverb, and vice versa. The performer experienced the two effects as quite similar in quality, so we removed the delay effect.

 

Sound example 11: Self-adaptive processing performance using the same set-up but with the delay effect replaced by overdrive. The overdrive did not bring anything new to the saxophone the way it was set up, since the acoustic sound of the instrument can sound similar to the effect when played with strong energy.

 

Day 2. Wednesday 8th March

 

Saxophone and piano

Sound example 12: Performance with saxophone and live processing, familiarizing the performer with the different effects and choosing which of them to bring further into the session. The performer found this interesting and wanted to continue with the reverb ideas.

 

Sound example 13: Performance with piano and live processing. The performer especially liked the last part with the delays. Saxophonist: "It was like listening to the sound under water (convolution with water) sometimes, and sometimes like listening to an old radio (overdrive)". The pianist wanted to keep the effects that were introduced.

 

Sound example 14: Adaptive processing, controlling a delay on the saxophone from the piano by analysing transient density: the higher the transient density, the larger the gap between delays on the saxophone. The saxophone player found it difficult to interact since the piano kept a clean sound during the performance. The pianist, on the other hand, felt in control of the effect that was added.

 

Sound example 15: Adaptive processing using the saxophone to control the piano. We analysed the rhythmical consonance of the saxophone: the higher the consonance, the more convolution effect (water) was added to the piano, and vice versa. The saxophonist didn't feel in control during the performance, and guessed it was due to not holding a steady rhythm over a longer period. The direct sound of the piano was also a bit loud in the mix, making the added effect a bit low. The pianist felt that the saxophonist was in control, but agreed that the analysis never reached its full range because of the lack of a steady rhythm over a longer time period.

 

Sound example 16: Cross-adaptive performance using the same set-up as in examples 14 and 15. Both performers felt in control and started to explore more of the possibilities. An interesting point arises when the saxophone stops playing, since the rhythmical consonance analysis will drop as soon as it starts reading again. This can result in strong musical statements.

 

Sound example 17: Cross-adaptive performance keeping the same settings but adding RMS analysis of the saxophone to control a delay on the piano (the higher the RMS, the less delay, and vice versa).

 

Vocals and electric guitar

Sound example 18: Performance with vocals and live processing. Vocalist: "It is fun, but something you need to get used to; it needs a lot of time."

 

Sound example 19: Performance with guitar and live processing. Guitarist: "Adapted to the effects, my direct sound probably sounds terrible, feel that I'm losing my touch, but it feels complementary and is a nice experience."

 

Sound example 20: Performance with adaptive processing, analysing the guitar using RMS and transient density: the higher the transient density, the more delay added to the vocals, and the higher the RMS, the less reverb added to the vocals. Guitarist: "I feel like a remote controller, and it is sometimes hard to focus on what I play." Vocalist: "Feels like a two-dimensional way of playing."

 

Sound example 21: Performance with adaptive processing, controlling the guitar from the vocals. The rhythmical consonance of the vocals controls the time gap between delays on the guitar: higher rhythmical consonance results in larger gaps, and vice versa. The transient density of the vocals controls the amount of pitch shift added to the guitar: the higher the transient density, the less volume is sent to the pitch shifter.

 

Sound example 22: Performance with cross adaptive processing using the same settings as in sound example 20 and 21.

Vocalist: “It is another way of making music, I think”. Guitarist: “I feel control and I feel my impact, but musical intention really doesn’t fit with what is happening – which is an interesting parameter. Changing so much with doing so little is cool”.

 

Observation and reflections

The sessions have now come to a point where less time is used on setting up and figuring out how the functionality in the software works, and more time is used on actual testing. This is an important step considering that we are working with musicians who are introduced to the concept for the first time. Good stability in the software and separation between microphones make the workflow much more effective. It still took some time to set up everything the first day due to two system crashes, the first related to the MIDIator, the second to the video streaming.

 

Since the system was prepared beforehand, there was a lot of reuse, both of analysis methods and of the choice of effects. Even though there was a lot of reuse on the technical side, the performances and results show a large variety of expression. Even though this is not surprising, we think it is an important aspect to be reminded of during the project.

 

Another technical workaround that was discussed concerning the analysis stage was the possibility of using two different microphones on the same instrument: one for the analysis, and one for capturing the "total" sound of the instrument for use in processing. This will of course depend on which analysis parameter is in use, but it would surely help towards a more dynamic reading in some situations, both with respect to bleeding and for a closer focus on the desired attributes.

 

The pedagogical approach using the four-step introduction was experienced as fruitful when introducing the concept to musicians for the first time. It helped the understanding during the process and therefore resulted in more fruitful discussions and reflections between the performers during the session. Starting with live processing demonstrates the possibilities and the flexible control over different effects early in the process, and gives the performers the chance to take part in deciding the aesthetics and building a framework before entering the control aspect.

 

Quotes from the performers:

Guitarist: “Totally different experience”. “Felt best when I just let go, but that is the hardest part”. “It feels like I’m a midi controller”. “… Hard to focus on what I’m playing”. “Would like to try out more extreme mappings”

Vocalist: “The product is so different because small things can do dramatic changes”. “Musical intention crashes with control”. “It feels like a 2-dimensional way of playing”

Pianist: "Feels like an extra musician"

 

 

 

Session with classical percussion students at NTNU, February 20, 2017
Posted 10 March 2017

Introduction:

This session was a first attempt at trying out cross-adaptive processing with pre-composed material. Two percussionists, Even Hembre and Arne Kristian Sundby, students at the classical section, were invited to perform a composition written for two tambourines. The musicians had already performed this piece in rehearsals and concerts. As preparation for the session, the musicians were asked to make a sound recording of the composition so that analysis methods and choice of effects could be prepared before the session. A performance of the piece in its original form can be seen in this video: "Conversation for two tambourines" by Bobby Lopez, performed by Even Hembre and Arne Kristian Sundby (recorded by Even Hembre).

Preparation:

Since both performers had limited experience with live electronics in general, we decided to introduce the cross adaptive system gradually during the session. The session started with headphone listening, followed by introducing different sound effects while giving visual feedback to the musicians, then performing with adaptive processing, before finally introducing cross-adaptive processing. As a starting point, we used analysis methods which had already proved effective and intuitive in earlier sessions (RMS, transient density and rhythmical consonance). These methods also made it easier to communicate and discuss the technical process with the musicians during the session. The system was set up to control time-based effects such as delays and reverbs, but also typical insert effects like filters and overdrive. The effect control included both dynamic changes of different effect parameters and a sample/hold function through the MIDIator. We had also brought a foot pedal so the performers could change the effects on the different parts of the composition during the performance.

Session:

After we had prepared and set up the system, we discovered severe latency on the outputs. Input signals seemed to function properly, but the cause of the output latency was not found. To solve the problem, we made a fresh set-up using the same analysis methods and effects, and after checking that the latency was gone, the session proceeded.

We started with a performance of the composition without any effects, but with the performers using headphones to get familiar with the situation. The direct sound of each tambourine was panned hard left/right in the monitoring system to make it easier to identify the two performers. After an initial discussion it was decided that both tambourines should be located in the same room, since the visual communication between the performers was important in this particular piece. The microphones were separated with an acoustic barrier/screen and set to a cardioid characteristic in order to avoid as much bleeding between the two as possible. During the performance the MIDIator was adjusted to the incoming signals. It became clear that bleeding was already affecting the analyser at this stage, but we nevertheless retained the set-up to maintain the focus on the performance aspect.

The composition had large variations in dynamics, and also in movement of the instruments. This was seen as a challenge considering the microphones' static placement and the consequently large differences in input signal. Because of the movement, even small variations in the distance between instrument and microphone would have a great impact on how the analysis methods read the signals. During the set-up, the visual feedback from the screen was a very welcome contribution to the performers' understanding of the set-up.

While setting up the MIDIator to control the effects, we played through the composition again, trying out different effects. Adding effects made a big impact on the performance. It became clear that the performers tried to "block out" the effects while playing in order not to lose track of how the piece was composed. In this case the effects almost created a filter between the performers and the composition, resulting in a gap between what they expected and what they got. This could of course be a consequence of the chosen effects, but the situation demanded another angle to narrow everything down and create a better understanding and connection between the performance and the technology. Since the composition consisted of different parts, we selected one of the quieter parts where the musicians could see how their playing affected the analysers, and how this could be mapped further to different effects using the MIDIator.

There was still a large amount of overlap between the instruments in the analyser because of bleeding, so we needed to take a break and rearrange the physical set-up in the room to further clarify the connection between musical input, analyser, MIDIator and effects. Avoiding the microphone bleeding helped both the system and the musicians to clarify how the input reacted to the different effects. Since the performers were interested in how this changed the sound of their instruments, we agreed to abandon the composition and instead test out different set-ups, both adaptive and cross-adaptive.

Sound examples:

1. Trying different effects on the tambourines, with the processing musician controlling all parameters. Tambourine 1 (Even) is convolved with recordings of water and a cymbal. Tambourine 2 (Arne Kristian) is processed with a delay, convolution with a recording of small metal parts, and a pitch delay.

 

2. Tambourine 1 (Even) is analysed for transient density, which controls a delay plug-in on tambourine 2 (Arne Kristian).

 

3. Tambourine 2 (Arne Kristian) is analysed for transient density, controlling a send from tambourine 1 convolved with the cymbal: the higher the transient density, the less send.

 

4. Keeping the mapping settings from examples 2 and 3, but adding rhythmical consonance analysis on tambourine 2 to control another send level from tambourine 1, convolving it with the recording of water: the higher the consonance, the more send. The transient density analysis on tambourine 1 is in addition mapped to control a send from tambourine 2, convolving it with the metal parts: the higher the density, the more send.

 

Observations:

Even though we worked with a composed piece, it would have been a good idea to have a "rehearsal" with the performers beforehand, focusing on different directions through processing. This could open up thoughts on how to make a new and meaningful interpretation of the same composition with the new elements.

 

It was a good idea to record the piece beforehand in order to construct the processing system, but this recording did not have any separation between the instruments either. The result was a prepared system that in theory was unable to be cross adaptive, since it both analysed and processed the sum of both instruments, leaving much less control to the individual musicians. This aspect, together with bleeding between microphones in more controlled environments, challenges the concept of fully controlling a cross adaptive performance. The challenge will probably be further magnified in a concert situation performing through speakers. The musicians also noted that the separation between microphones was crucial for understanding the process and for getting a feeling of control.

In retrospect, the time-based effects prepared for this session could also have been changed, since several of them often worked against the intention of the composition, especially in the most rhythmical parts. Even noted: "Sometimes it's like trying to speak with headphones that play your voice right after you have said the word, and that makes you unable to continue."

This particular piece could probably benefit from more subtle changes in the processing. The sum of these factors reduced the interaction between the performers and the technology. This became clearer when we abandoned the composition and concentrated on interaction in a "freer" setting. One way of going further into this particular composition could be to take a mixed music approach, "recomposing" and interpreting it again with the processing element as a more integral part of the composition process.

In the following and final part of the session, the musicians were allowed to improvise freely while being connected to the processing system. This was experienced as much more fruitful by both performers. The analysis algorithms focusing on rhythmical aspects, namely transient density and rhythmical consonance, were both experienced as meaningful and connected to the performers' playing. These control parameters were mapped to effects like convolution and delay (cf. the explanation of sound examples 1-4). The performers focused on issues of control, the differences between "normal" and inverse mapping, headphone monitoring and microphone bleeding when discussing their experiences of the session (see the video digest below for some highlights).

Video digest from session February 20, 2017

Crossadaptive session NTNU 12. December 2016
Posted 16 December 2016

Participants:

Trond Engum (processing musician)

Tone Åse (vocals)

Carl Haakon Waadeland (drums and percussion)

Andreas Bergsland (video)

Thomas Henriksen (sound technician)

Video digest from session:

Session objective and focus:

The main focus in this session was to explore other analysis methods than those used in earlier sessions, focusing on rhythmic consonance for the drums and spectral crest on the vocals. These analysis methods were chosen to gain a wider understanding of their technical functionality, but also of their possible use in musical interplay. In addition, there was an intention to include the sample/hold function of the MIDIator plug-in. The session was also set up with a large screen in the live room so that all participants could monitor the processing instrument at all times. The idea was to democratize the processing musician's role during the session and open up the discussion and tuning of the system as a collective process based on a mutual understanding. This would hopefully communicate a better understanding of the functionality of the system and of how the musicians can individually navigate within it through their musical input. At the same time, it opens up a closer dialogue around the choice of effects and parameter mapping during the process.

Earlier experiences and process

Following up on experiences documented in earlier sessions and previous blog posts, the session was prepared to avoid the most obvious shortcomings. First of all, separation between instruments to avoid bleeding between microphones was arranged by placing vocals and drums in separate rooms; bleeding had earlier affected both the analysed signals and the effects. The system was prepared to be as flexible as possible beforehand, containing several effects to map to, flexibility in this context meaning the possibility of making fast changes and tuning the system depending on the musicians' thoughts. Since the group of musicians remained unchanged during the session, this flexibility was also seen as a necessity for going into details and more subtle changes, both in the MIDIator and in the effects in play, to reach common aesthetic intentions.

Due to technical problems in the studio (not connected to the cross adaptive set-up or software) the session was delayed for several hours, resulting in less time than originally planned. We therefore chose to concentrate only on rhythmic consonance (referred to as rhythmical regularity in the video) as the analysis method for both drums and vocals. To familiarize ourselves with this analysis tool, we started with the drums, trying out different playing techniques with both regular and irregular strokes while monitoring the visual feedback from the analyser plug-in without any effect. Regular strokes resulted in a high, stable value; irregular strokes resulted in a low value.


Figure 1. Consonance (regularity) visualized in the upper graph.

What became evident was that when the input stopped, the analyser stayed at the last measured value. In that way it could act as a sort of sample/hold function, stabilising a setting in an effect until an input was introduced again. Another aspect was that the analysis method worked well for regularity in rhythm, but behaved more unpredictably when irregularity was introduced.
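The hold behaviour can be thought of as the analyser only updating its output while there is actually signal to measure, and otherwise repeating its last value. A minimal Python sketch of that idea is given below; the threshold and the names are assumptions for illustration only, not the actual analyser plug-in code.

# Hypothetical illustration of the observed sample/hold behaviour.
class HeldAnalyser:
    def __init__(self, silence_threshold=0.01):
        self.silence_threshold = silence_threshold
        self.held_value = 0.0

    def update(self, input_level, analysis_value):
        # Only take a new reading when there is signal to analyse.
        if input_level > self.silence_threshold:
            self.held_value = analysis_value
        return self.held_value  # during silence the previous value is repeated

analyser = HeldAnalyser()
print(analyser.update(0.5, 0.8))  # playing: output follows the analysis -> 0.8
print(analyser.update(0.0, 0.0))  # input stopped: last value is held -> 0.8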

Having learned the analyser's behaviour, we mapped it to a delay plug-in as an adaptive effect on the drums. The parameter controlled the time range of 14 delays: the more regularity, the larger the delay time range, and vice versa.

After fine-tuning the delay range we agreed that the connection between the analyser, MIDIator and choice of effect worked musically in the same direction. (This was changed later in the session when trying out cross-adaptive processing).

The same procedure was followed for the vocals, but then concentrating the visual monitoring mostly on the last stage of the chain, the delay effect. This was experienced as more intuitive once all settings were mapped, since the musician could then interact visually with the input during the performance.

Cross-adaptive processing.

When we started the cross-adaptive recording, everyone had followed the process and tried out the chosen analysis method on their own instrument. Even though the focus was mainly on the technical aspects, the process had already given the musicians the opportunity to rehearse and get familiar with the system.

The system we ended up with was set up in the following way:

Both drums and vocals were analysed for rhythmical consonance (regularity). The drums controlled the send volume to a convolution reverb and a pitch shifter on the vocals: the more regular the drums, the less of these effects; the less regular the drums, the more of the effects.

The vocals controlled the time range of the echo plug-in on the drums: the more regular the pulses from the vocals, the smaller the echo time range on the drums; the less regular the pulses, the larger the time range.
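Schematically, the set-up amounts to two inverted mappings crossing over between the instruments. The Python sketch below only summarizes that routing, assuming both regularity values are normalized to 0-1; it is not the actual MIDIator configuration.

# Hypothetical summary of the cross-adaptive routing described above.
def cross_adaptive(drum_regularity, vocal_regularity):
    # Drums -> vocals: more regular drums give less reverb and pitch shift on the vocals.
    vocal_reverb_send = 1.0 - drum_regularity
    vocal_pitchshift_send = 1.0 - drum_regularity
    # Vocals -> drums: more regular vocals give a smaller echo time range on the drums.
    drum_echo_range = 1.0 - vocal_regularity
    return vocal_reverb_send, vocal_pitchshift_send, drum_echo_range

# A steady drum groove against free, irregular vocals gives small vocal sends
# and a large echo time range on the drums.
print(cross_adaptive(drum_regularity=0.75, vocal_regularity=0.25))  # (0.25, 0.25, 0.75)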

Sound example (improvisation with cross adaptive setup): 

 

Multi-camera recording and broadcasting
Posted 21 November 2016

Audio and video documentation is often an important component of projects that analyse or evaluate musical performance and/or interaction. This is also the case in the Cross Adaptive project, where every session was to be recorded on video and multi-track audio, and this material was then to be organized and edited into a 2-5 minute digest within two days. However, when documenting sessions in this project we face the challenge of having several musicians located in one room (the studio recording room), often facing each other rather than being oriented in the same direction. Thus, an ordinary single-camera recording would have difficulties including all musicians in one image. Our challenge, then, has been to find some way of doing this that has high enough quality to detect salient aspects of the musical performances, that keeps both video streams and audio in sync, but that isn't too expensive (since the budget for this work has been limited) or too complicated to handle. In this blog post I will present some of the options I have considered, both in terms of hardware and software, and present some of the tests I have done in the process.

Commercial hardware/software solutions
One of the things we considered in the first investigative phase was to check the commercial market for multi-camera recording solutions. After some initial searches on the web I made a few inquiries, which resulted in two quotes for two different systems:

1. A system based on the Matrox VS4 Recorder pro (http://www.matrox.com/video/en/products/vs4/vs4recorderpro/)
2. StreamPix 7, multi-camera version (https://www.norpix.com/products/streampix/streampix.php).

Both included all hardware and software needed for in-sync multi-camera recording, except for the cameras themselves. The StreamPix quote also included a computer with suitable specs. Both quotes were immediately deemed inadequate since the prices were well beyond our budget (> $5000).

I also have to mention that after having set up a system for the first session in September 2016, we came across the Apollo multi-camera recorder/switcher, which looks like a very lightweight and convenient way of doing multi-camera recording:

https://www.convergent-design.com/apollo

At $2,995 (the price listed on the US web site), it might be a possibility to consider in the future, even if it is still at the high end of our budget range.

Using PCs and USB3 cameras
A project currently running at NTNU (in collaboration with NRK and UNINETT) is Nettmusikk (Net Music). This project is testing solutions for high-resolution, low-latency streaming of audio and video over the internet. The project has bought four USB3 cameras (Point Grey Grasshopper, GS3-U3-41C6C-C) and four dual-boot (Linux/Windows) PCs with high-performance specs. These cameras deliver a very high quality image with low latency. However, one challenge is that the USB3 standard is still not so well developed that cameras can "plug-and-play" on all platforms; specifically, the software and drivers for the Point Grey cameras are proprietary, and therefore won't provide images for all software out of the box.

After doing some research on the Point Grey camera solution, we found that there were several issues that made this option unsuitable and/or impractical.
1. To get full sync between the cameras, they needed an optical strobe sync, which required a master/slave configuration with all cameras in the same room.
2. There was no software that could do multi-camera recording from USB3 out of the box. Rather, Point Grey provided an SDK (FlyCapture) with working examples that allowed us to build our own applications. While this was an option we considered, it looked like it would demand a fair bit of programming work (C++).
3. Because the cameras stream and record raw data, we would also need very fast SSDs to record four video files at once. Recording four video streams at 8-bit 1920×1080 @ 30 FPS would amount to 248.8 MB/s (62.2 × 4) of data; a rough calculation is given below. This would also fill up hard drive space pretty fast.
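For reference, a back-of-the-envelope version of that calculation in Python (assuming raw 8-bit frames at one byte per pixel, before any debayering or compression):

# Rough data-rate estimate for four raw 8-bit 1920x1080 streams at 30 FPS.
width, height, fps, bytes_per_pixel = 1920, 1080, 30, 1
per_camera_mb = width * height * fps * bytes_per_pixel / 1e6
total_mb = per_camera_mb * 4
print(f"{per_camera_mb:.1f} MB/s per camera, {total_mb:.1f} MB/s in total")
# -> 62.2 MB/s per camera, 248.8 MB/s in total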

USB2 HD cameras
It turned out that one of the collaborators in the Nettmusikk project had access to four USB2 HD cameras. The USB2 protocol is a cross-platform standard that provides plug-and-play and usually has few compatibility issues. The quality of these Tandberg Precision HD cameras was far from professional production quality, but they gave reasonably clear images. When viewing the camera image on a computer, there was barely noticeable latency. Here is a review made by The Gadgeteer:

http://the-gadgeteer.com/2010/03/26/tandberg-precisionhd-usb-video-camera-review/

Without high-quality lenses, however, there were no possibilities for adjusting zoom, angle and light sensitivity, and particularly the latter proved to be an issue in the sessions. We especially ran into problems if direct light sources were in the image, so careful placement of light sources was necessary. It was also sometimes difficult to get enough distance between the cameras and the performer, since the room wasn't very big, and an adjustable lens would probably have helped here. Furthermore, the cameras have an autofocus function that sometimes got confused in the first session, making the image blurry for a second or two until it was appropriately adjusted. Lastly, the cameras have a foot suitable for table placement, but no possibility of fastening them to a standard stand, like the Point Grey cameras have. I therefore had to use duct tape to fasten the cameras, which worked OK, but looked far from professional.

A second concern was whether these cameras could provide images over long cable lengths. What I learned was that they would operate perfectly with a 5 m USB extension if they were connected to a powered hub at the end, providing the cameras with sufficient power. It turned out that two cameras per hub did work, but might sometimes introduce some latency and drop-outs. Thus, taking the signal from four cameras via four hubs and extension cords into the four USB ports on the PC seemed like the best solution (although we only used three hubs in the first session). This gave us fairly high flexibility in terms of placing the cameras.

Mosaic image
Due to the very high amounts of data involved in dealing with multiple cameras, using a composite/mosaic image gathering all four images into one seemed like a possible solution. Admittedly, each camera would then go from full HD to a quarter of that (e.g. 960×540 per quadrant in a 1920×1080 mosaic). Still, this option had the great advantage of making post-production switching between cameras superfluous, and since the goal in this project was to produce a 2-5 minute digest within two days, this seemed like a very attractive option. The question was then how to collate and sync the four images into one without any glitches or hiccups. In the following, I will present the solutions that were considered:

1. ffmpeg (https://www.ffmpeg.org/)
ffmpeg is an open source command line tool that can perform a number of different operations on audio and video streams and files: recording, converting, filtering, resampling, muxing, demuxing, etc. It is highly flexible and customizable, at the cost of a steep learning curve and cumbersome operation, since everything has to be run from the command line. The flexibility and the fact that it is free still made it an option to consider.

ffmpeg can easily be installed on OSX from pre-built binaries. On Windows it has to be built from source with the mingw-w64 project (see https://trac.ffmpeg.org/wiki/CompilationGuide/MinGW). This seemed like a bit of work, but at the time it still sounded like a viable option.

After some initial tests based on examples in the ffmpeg wiki (https://trac.ffmpeg.org/wiki) I was able to run the following script as a test on OSX:

ffmpeg -f avfoundation -framerate 30 -pix_fmt:0 uyvy422 -i "0" -f avfoundation -framerate 30 -pix_fmt:0 uyvy422 -i "1" -f avfoundation -framerate 30 -pix_fmt:0 uyvy422 -i "0" -f avfoundation -framerate 30 -pix_fmt:0 uyvy422 -i "0" -filter_complex "nullsrc=size=640x480 [base]; [0:v] setpts=PTS-STARTPTS, scale=320x240 [upperleft]; [1:v] setpts=PTS-STARTPTS, scale=320x240 [upperright]; [2:v] setpts=PTS-STARTPTS, scale=320x240 [lowerleft]; [3:v] setpts=PTS-STARTPTS, scale=320x240 [lowerright]; [base][upperleft] overlay=shortest=1 [tmp1]; [tmp1][upperright] overlay=shortest=1:x=320 [tmp2]; [tmp2][lowerleft] overlay=shortest=1:y=240 [tmp3]; [tmp3][lowerright] overlay=shortest=1:x=320:y=240" -t 5 -target film-dvd -y ~/Desktop/output.mpg

The script produced four streams of video, one from an external web camera and three from the internal FaceTime HD camera. However, the individual images were far from in sync, and seemed to lose the stream and/or lag behind by several seconds at a time. This initial test, the sync issues and the prospect of a time-consuming build process on Windows made me abandon the ffmpeg solution.

2. Processing
Using Processing, described as "a flexible software sketchbook and a language for learning how to code within the context of the visual arts" (https://processing.org/), seemed like a more promising path. Again I relied on examples I found on the web and put together a sketch that seemed to do what we were after (I also had to install the video library to be able to run the sketch).

import processing.video.*;

Capture camA;
Capture camB;
Capture camC;
String[] cameras;

void setup() {
  size(1280, 720);
  cameras = Capture.list();
  println("Available cameras:");
  for (int i = 0; i < cameras.length; i++) {
    println(cameras[i], i);
  }

  // Choose cameras with the appropriate number from the list and corresponding resolution
  // Here I had to look in the printout list to find the correct numbers to enter
  camA = new Capture(this, 640, 360, cameras[18]);
  camB = new Capture(this, 640, 360, cameras[3]);
  camC = new Capture(this, 640, 360, cameras[33]);
  camA.start();
  camB.start();
  camC.start();
}

void draw() {
  image(camA, 0, 0, 640, 360);
  image(camB, 640, 0, 640, 360);
  image(camC, 0, 360, 640, 360);
}

void captureEvent(Capture c) {
  if (c == camA) {
    camA.read();
  } else if (c == camB) {
    camB.read();
  } else if (c == camC) {
    camC.read();
  }
}

This sketch nicely gathered the images from three cameras on my Mac, with little latency (approx. 50-100 ms). This made me want to go further and port the sketch to Windows. After installing Processing on the Windows PC and running the sketch there, however, I could only get one image out at a time. My hypothesis was that the problem came from all the different cameras having the same number in the list of drivers. These problems made me abandon Processing in search of simpler solutions.

3. IP Camera Viewer
After some more searching, I came across IP Camera Viewer (http://www.deskshare.com/ip-camera-viewer.aspx), a free and lightweight application for different Windows versions, with support for over 2000 cameras according to their web site. After some initial tests, I found that this application was a quick and easy solution to what we wanted in the project. I could easily gather up to four camera streams in the viewer, and the quality seemed good enough to capture details of the performers. It was also very easy to set up and use, and seemed stable and robust. Thus, this solution turned out to be what we used in our first session, and it gave us results that were good enough in quality for performance analysis.

4. VLC (http://www.videolan.org/vlc/)
The leader of the Nettmusikk project, Otto Wittner, made a VLC script to produce a mosaic image, albeit with only two images:

# Webcam no 1
new ch1 broadcast enabled
setup ch1 input v4l2:///dev/video0:chroma=MJPG:width=320:height=240:fps=30
setup ch1 output #mosaic-bridge{id=1}

# Webcam no 2
new ch2 broadcast enabled
setup ch2 input v4l2:///dev/video1:chroma=MJPG:width=320:height=240:fps=30
setup ch2 output #mosaic-bridge{id=2}

# Background surface (image)
new bg broadcast enabled
setup bg input bg.png
# Add mosaic on top of image, encode everything, display the result as well as stream udp
setup bg output #transcode{sfilter=mosaic{width=640,height=480,cols=2,rows=1,position=1,order="1,2",keep-picture=enabled,align=1},vcodec=mp4v,vb=5000,fps=30}:duplicate{dst=display,dst=udp{dst=streamer.uninett.no:9400}}
setup bg option image-duration=-1  # Make background image be streamed forever

#Start everything
control bg play
control ch1 play
control ch2 play

While it seemed to work well on Linux with two images, the fact that we already had a solution in place made it natural not to pursue this option further at the time.

5. Other software
Other software I located that could gather several images into one included CaptureSync (http://bensoftware.com/capturesync/). This software is for Mac only, and was therefore not tested.

Screen capture and broadcast
Another important function that still wasn't covered by IP Camera Viewer was recording. Another search uncovered many options, most of them directed towards game capture. The first one I tested was OBS, Open Broadcaster Software (https://obsproject.com/). This open source software turned out to do exactly what we wanted and more, so this became the solution we used at the first session. The software was easily set up to record the desktop at a 30 fps frame rate and close to HD quality, with the output set to .mp4. There was, however, a noticeable lag between audio and video, with the video lagging behind by approximately 50-100 ms. This was corrected in post-production editing, but could potentially have been solved by inserting a delay on the audio stream until it was in sync. There were also occasional clicks in the audio stream, but I did not have time to figure out whether this was caused by the software or by the (internal) sound device we used for recording. We will do more tests on this later. The recordings were started with a clap to allow for post-sync.

Surprisingly, OBS was very easy to set up for live streaming. I used my own YouTube account, where I set up the Stream Now option (https://support.google.com/youtube/answer/2853700?hl=en). This gave me a key code that I could copy into OBS before simply pressing the "Start streaming" button. The stream had a lag of about 10 seconds, but had good quality and consistency. Thereby we had easily set up something we had only considered optional to begin with. Øyvind was very pleased to be able to follow the last part of the first session from his breakfast table in San Diego!

 

Live stream from session with jazz students

Custom video markup solution
Øyvind wrote a small command line program in Python that can generate a time-code based list of markers to go with the video. By pressing a number, one can indicate that a comment should be inserted that many seconds before the moment of pressing, relative to a custom-set timer. This made it possible to synchronize easily with the clock that IP Camera Viewer printed on top of the video images. Moreover, it allowed rating the significance of each marker. Although it required a little practice, I found it useful for making notes during the session, as well as when going through the post-session interview videos. One possible improvement would be a program that could merge and order (in time) comments made either in different run-throughs or by different commentators.
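A minimal sketch of how such a marker tool might look is shown below. This is not Øyvind's actual program, only an illustration of the idea: typing a number states how many seconds back the comment refers to, an optional second number gives a significance rating, and everything is printed ordered in time at the end.

# Hypothetical sketch of a time-code marker tool (not the program used in the project).
import time

def marker_tool():
    start = time.time()  # custom timer, started in sync with the clock shown on the video
    markers = []
    while True:
        entry = input("seconds-back [rating], blank line to quit: ").strip()
        if not entry:
            break
        parts = entry.split()
        seconds_back = float(parts[0])
        rating = int(parts[1]) if len(parts) > 1 else 1
        timecode = time.time() - start - seconds_back
        comment = input("comment: ")
        markers.append((timecode, rating, comment))
    # Print the markers ordered in time so they can be matched against the video clock.
    for timecode, rating, comment in sorted(markers):
        print(f"{timecode:8.1f}s  [{rating}]  {comment}")

if __name__ == "__main__":
    marker_tool()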

Editing
After two days of recording, the videos were to be edited down to a single five-minute video. After some testing with free software like iMovie (Mac) and MovieMaker (Windows), I abandoned both options due to their limited features and unintuitive use. After a little searching I discovered Filmora from Wondershare (http://filmora.wondershare.com/), which I tried out as a free demo before deciding to buy it ($29.90 for a 1-year license). In my view, it is lightweight, has sufficient options for simple editing, and is quick and easy to use.

Conclusions
We ended up with a multi-camera recording and live-streaming solution that was easy to use and very cheap to set up. The only expenses we have had so far have been USB extension cords and hubs, as well as the Filmora editor, which was also cheap. Although we do not have our own cameras yet, the price of new USB2 cameras would not imply a big cut into the budget if we need to buy four of them. Moreover, finding free software that gave us what we wanted out of the box was a huge relief after the initial tests with software like ffmpeg, Processing and VLC.
