The effect of differing user interface presentation styles on audio mixing

by JOSH MYCROFT, JOSHUA D. REISS, TONY STOCKMAN

Background

When mixing on a Digital Audio Workstation (DAW), the user needs to grasp the global structure of the project, i.e. how many tracks there are, how many are active, song duration etc. (Gohlke et al., 2010), while making iterative adjustments to individual tracks in terms of level, frequency, dynamics, panning, stereo balance and overall coherence. As only a limited number of channels can be displayed at one time, the user must navigate frequently to check and adjust each channel. This places a heavy load on working memory, which can detract from attending to subtle changes in audio (Mycroft et al., 2013). Current DAW designs use not only traditional mixing controls (peak and VU meters, dials and faders) but also a host of new and visually complex metering and analysis tools (Bennett, 2011). While these tools provide useful quantitative information, they also increase the visual load of the mixing process, which can potentially detract from focused aural acuity.

Aims

The authors’ previous work found that the way in which mix information is accessed can affect critical listening skills (Mycroft et al., 2013). This study aims to assess to what extent the design of the UI elements can help ameliorate this effect: does the design of the UI affect critical listening skills, and do certain designs improve or detract from the ability to hear subtle changes to the audio content while undertaking visual interface tasks?

Method

Participants

Ten participants (seven male and three female, aged 18-42), all with at least one year's prior mixing experience, were recruited from Music Technology staff and students at City and Islington College, London. The results of two participants were withdrawn from the analysis because they were unable to differentiate the named instruments from the rest of the mix. All participants were required to give informed consent to participate in the study. The study was conducted in accordance with the guidelines of the University, and was approved by The Ethics Committee of Queen Mary, University of London.

Listening Task

The participants were required to listen to a two-minute, eight-channel mix. During this task they were asked to identify which of three specified instruments (guitar, snare or shaker) was decreased in volume over the course of the excerpt. The excerpt was played twelve times in total, with each of the specified instruments attenuated four times (the order was randomised for each participant). As the attenuated instrument diminished from full volume at the start to inaudibility at the end, the attenuation became easier to hear further into the excerpt. The participants were asked to identify which instrument was being attenuated as soon as they discerned it. While undertaking this listening task, they were presented with one of four visual interfaces displayed on a 10” by 5.8” screen (see below).

Visual task

A group of 16 channels was created in Max/MSP. Each channel had four parameters with a range of 16 values (1-16). The 16 channels were presented in four different UI designs (Fig 1), namely numbers, dials, faders and colours (the 16 hues used for the colours were created using an online Colour Ramp creator). While listening to the audio, participants were asked to look at channel one and compare each of the subsequent 15 channels with it, indicating whether they were the same or different by clicking the ‘same/different’ buttons below channels 2-16. Due to the number of channels, scrolling was required to view all the channels in all four designs. The participants were presented with twelve interfaces (three occurrences of each of the four interface designs), with the order and parameter values randomised for each participant. Participants were asked to begin comparing the channels as soon as they began the audio. They were told to press the appropriate key on the QWERTY keyboard as soon as they heard the track attenuation, which would stop the audio and proceed directly to the next interface. The time taken to hear the audio changes and the number and accuracy of the channels compared were recorded.
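The trial structure described above can be illustrated with a short Python sketch. It is not the authors' Max/MSP patch. It assumes that each of the four designs is paired once with each of the three attenuated instruments (giving the twelve trials), and that roughly half of channels 2-16 copy channel one. The function names and the p_same parameter are hypothetical.

```python
import itertools
import random

DESIGNS = ["numbers", "dials", "faders", "colours"]
INSTRUMENTS = ["guitar", "snare", "shaker"]
N_CHANNELS = 16   # channel one is the reference, channels 2-16 are compared to it
N_PARAMS = 4      # parameters per channel
MAX_VALUE = 16    # each parameter takes a value from 1 to 16


def make_schedule(rng):
    """Return the twelve (design, instrument) trials in a random order."""
    trials = list(itertools.product(DESIGNS, INSTRUMENTS))
    rng.shuffle(trials)
    return trials


def make_channels(rng, p_same=0.5):
    """Build 16 channels of four parameters each.

    p_same, the chance that a channel copies the reference, is an
    assumption; the paper does not report this proportion.
    """
    reference = [rng.randint(1, MAX_VALUE) for _ in range(N_PARAMS)]
    channels = [reference]
    for _ in range(N_CHANNELS - 1):
        other = list(reference)
        if rng.random() >= p_same:
            # Redraw until the channel genuinely differs from the reference.
            while other == reference:
                other = [rng.randint(1, MAX_VALUE) for _ in range(N_PARAMS)]
        channels.append(other)
    return channels


def score_responses(channels, responses):
    """Return (channels attempted, channels judged correctly).

    responses maps a zero-based channel index (1..15) to True for a
    'same' click or False for a 'different' click.
    """
    reference = channels[0]
    correct = sum(
        1 for index, said_same in responses.items()
        if (channels[index] == reference) == said_same
    )
    return len(responses), correct
```

Seeding a separate random.Random instance per participant would reproduce the per-participant randomisation of order and parameter values.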

Fig 1

Figure 1. The four interface designs used for each of the sixteen interface channels. These included faders, numbers, dials and colours. Each channel consisted of four parameters, each with a range of 16 values.

Results

The time taken to correctly identify the attenuated audio in each interface design was analysed for each participant. From this, the mean and standard deviation were calculated and used to generate 95% confidence intervals for the true population mean. The number of UI objects compared for each of the four interface designs was also calculated. Any channels that were incorrectly matched were discounted from the analysis. The number of correctly matched channels was used to generate means and 95% confidence intervals. As all three of the specified instruments were attenuated in each of the interface types, it was possible to directly compare the response times and channel matching for each instrument across each interface type.
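As a minimal sketch of this calculation, the following Python computes a mean, sample standard deviation and 95% confidence interval. The paper does not state how the interval was derived; Student's t distribution is assumed here because only eight participants remained in the analysis. The response times are hypothetical placeholders, not study data.

```python
import math
import statistics

# Two-tailed 95% critical values of Student's t, keyed by degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def mean_ci95(values):
    """Return (mean, sample SD, (lower, upper)) for a small sample."""
    n = len(values)
    if n - 1 not in T_CRIT_95:
        raise ValueError("this sketch supports samples of 2 to 11 values")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    half_width = T_CRIT_95[n - 1] * sd / math.sqrt(n)
    return mean, sd, (mean - half_width, mean + half_width)


# Hypothetical response times in seconds for one interface design.
times = [62.0, 71.5, 58.0, 66.0, 74.5, 69.0, 60.5, 64.5]
mean, sd, (low, high) = mean_ci95(times)
print(f"mean={mean:.2f}s sd={sd:.2f}s 95% CI=({low:.2f}, {high:.2f})")
```

The same function applies to the counts of correctly matched channels per interface design.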

The analysis of the reaction times to the audio attenuation shows that there was no significant time difference between the four interface designs (Figure 2), suggesting that none of the interface designs diverted attention from the auditory task more or less than any other. However, the analysis for the number of channels successfully compared reveals that participants were able to compare significantly more channels with UI designs using colours, faders and numbers compared to dials (Figure 3).
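One way to read such a comparison is to check which pairs of 95% confidence intervals fail to overlap. The sketch below assumes that this was the criterion, which the text suggests but does not state. Non-overlap is a conservative test: overlapping intervals do not by themselves rule out a difference. The intervals shown are hypothetical and are not taken from the study.

```python
from itertools import combinations


def intervals_overlap(a, b):
    """Return True if closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def non_overlapping_pairs(intervals):
    """Return the pairs of labels whose intervals do not overlap."""
    return [
        (x, y) for x, y in combinations(intervals, 2)
        if not intervals_overlap(intervals[x], intervals[y])
    ]


# Hypothetical 95% intervals for channels correctly compared per design.
channel_cis = {
    "numbers": (9.0, 12.0),
    "dials": (4.0, 7.5),
    "faders": (8.5, 11.5),
    "colours": (9.5, 13.0),
}
for first, second in non_overlapping_pairs(channel_cis):
    print(f"{first} and {second}: intervals do not overlap")
```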

Fig 2

Figure 2. The mean time taken to correctly identify the changes to the audio using the four different interface types. The analysis (at 95% CI level) shows there is no significant difference in the time taken to hear the changes when using different UI designs.

Fig 3

Figure 3. The mean number of channels compared using the four interface designs. The analysis (at 95% CI level) shows that participants were able to successfully compare significantly more channels when the UI was presented as colours and numbers as opposed to dials.

Conclusions

The analysis revealed that dials produced significantly fewer matched channels than the other UI designs. This poor performance may be attributable to visual perception. Quantitative information in dials can be problematic to interpret because the human eye has difficulty estimating area and comparing angles (Chawla & Whitman, 2011), specifically underestimating acute angles and overestimating obtuse angles (Robbins, 2005, p. 49). With dials remaining a major part of DAW UI design in both established DAWs and new touch screen interfaces, this finding may be of value in informing designs that better support effective task sharing between audio and visual tasks.

In comparison to dials, the fader design performed well. Though the faders also required visual comparison, the human eye can compare the two-dimensional positions of objects (such as the ends of bars) or their lengths more easily and precisely than angles (Few, 2007). This may explain the increased channel matching and lower error rate found in the fader UI design. However, the implementation of faders in DAWs is potentially compromised when viewed at zoomed-out levels, as their size and resolution become reduced to a point where they are hard to interpret accurately (Hlatky et al., 2009). This is also true of numbers, which become illegible beyond certain zoom levels.

Colours, as well as performing well in the study, can also be interpreted easily when displayed at reduced sizes (Stone, 2006), which makes them useful for conveying UI information at a global perspective (i.e. when displaying the mix at a zoomed-out view). In this respect, colour provides some interesting possibilities for DAW design, especially when used in conjunction with other UI designs. However, there are both perceptual and physiological caveats that need to be considered. Colour discrimination can be compromised by a variety of factors, such as the lighting conditions, display position, display quality, and viewing angle (Yeh et al., 2013), while colour vision deficiencies (such as colour-blindness) affect approximately nine percent of the population (Galitz, 1997). Colours also need to be selected carefully to ensure that they are sufficiently different and easily discriminable from each other (Smith & Mosier, 1986).

In screen-based mixing, reaching an optimum balance between visual and auditory modalities is essential. When visual displays are used, they must be designed in ways that do not detract from the aural task (compared to other display types) while allowing quick access to mix information. This study suggests that, due to problems of visual perception, dials may not be an efficient way to convey multiple mix parameters, especially when navigating the interface. While faders and numbers are more efficient in this respect and can convey precise quantitative information, they may perform poorly at smaller views. Colours, on the other hand, perform well at global zoom levels and can also convey quantitative information (though this may be prone to errors).

Recognising and quantifying how different UI designs meet the perceptual and workflow needs of DAW users may help provide design heuristics that minimise visual load and aid interface navigation. However, more work is needed, both to assess how UI designs affect users’ ability to interpret mix information at different zoom levels and to quantify how many values can be efficiently encoded in colours without increasing error rates or overloading visual perceptual limits.

References

Bennet, R. (2011). Visual Mixing Aids. Sound on Sound Magazine. http://www.soundonsound.com/sos/mar11/articles/logic-tech-0311.htm. Accessed June 2014.

Chawla, V. & Whitman, L. (2011). Finding the Best Display Type for Your Data. SAS global forum.

Few, S. (2007). Save the Pies for Dessert. http://www.perceptualedge.com/articles/visual_business_intelligence/save_the_pies_for_dessert.pdf. Accessed May 2014.

Galitz, W. O. (1997). Essential Guide to User Interface Design. An Introduction to GUI Design Principles and Techniques. New York: Wiley.

Gohlke, K., Hlatky, M., Heise, S., Black, D., & Loviscach, J. (2010, May). Track displays in DAW software: Beyond waveform views. In Audio Engineering Society Convention 128. Audio Engineering Society, London, UK.

Hlatky, M., Gohlke, K., Black, D., & Loviscach, J. (2009, May). Enhanced Control of On-Screen Faders with a Computer Mouse. In Audio Engineering Society Convention 126. Audio Engineering Society, Munich, Germany.

Mycroft, J., Reiss, J. D., & Stockman, T. (2013). The Influence of Graphical User Interface Design on Critical Listening Skills. Sound and Music Computing (SMC), Stockholm, July/Aug., 2013.

Robbins, N. (2005). Creating More Effective Graphs. Wiley.

Smith, S. L., & Mosier, J. N. (1986). Guidelines for designing user interface software. Report ESD-TR-86-278. Bedford, MA: The MITRE Corporation.

Stone, M. (2006). Choosing Colors for Data Visualization. http://www.perceptualedge.com/articles/b-eye/choosing_colors.pdf. Accessed May 2014.

Yeh, M., Jo, Y. J., Donovan, C., & Gabree, S. (2013). Human Factors Considerations in the Design and Evaluation of Flight Deck Displays and Controls (No. DOT/FAA/TC-13/44).