Sam Ferguson is a musician, researcher and programmer who is a lecturer at the University of Technology, Sydney. His research focus is to understand the relationship between, and the effects of, sound and music on human beings.
He has around 40 publications in areas ranging from spatial hearing and loudness research to data sonification, emotion, and tabletop computing. He has been a research fellow or assistant on more than 6 ARC research projects, and continues to maintain several open source code projects. He has taught numerous subjects at the postgraduate and undergraduate level at the University of Technology, Sydney, the University of Sydney and UWS, and is currently a lecturer at UTS in the Faculty of Engineering and IT.
Can supervise: YES
Sam teaches or has taught subjects such as:
31265 Communications for IT Professionals
95569 Digital Media Studio
95566 Digital Information and Interaction Design
31080 Digital Multimedia
32027 Multimedia Systems Design
50858 Audio Production
50846 Situated Media Installation Studio
Sam is currently heavily involved in the Software Development Studio, which is an inter-disciplinary studio where students can get hands-on experience developing software in a collaborative environment with industry mentors. He also acts as an academic advisor for IT students enrolled in the Enterprise Systems Development major.
This paper proposes the term "media multiplicities" to describe contemporary media artworks that create multiples of "internet of things" devices. It discusses the properties that distinguish media multiplicities from other forms of media artwork, provides parameters for categorizing media multiplicities, and discusses aesthetic and creative factors in the production of media multiplicities.
Internet of Things (IoT) technologies enable new forms of media artworks. ‘Media multiplicities’ are defined here as creative media experiences made up of multiples of interacting and coordinated devices. In this paper, we review the state of the art of multiplicitous media artworks and provide a systematic analysis of the novel affordances and different forms such artworks can take, specifically that they are spatial, scalable, scatterable and sensing. We consider the analysis of media multiplicities from the point of view of both user experience and creative production. We offer three primary axes through which a categorisation of multiplicitous media forms can be framed: substrate versus object, composed versus self-organised, and homogeneous versus heterogeneous. We also analyse how the number of elements in the multiplicities (from tens to tens of thousands and beyond) affects the qualities of the experience.
Creativity can be considered one of the key competencies for the twenty-first century. It provides us with the capacity to deal with the opportunities and challenges that are part of our complex and fast-changing world. The question as to what facilitates creative cognition (the ability to come up with creative ideas, problem solutions and products) is as old as the human sciences, and various means to enhance creative cognition have been studied. Despite earlier scientific studies demonstrating a beneficial effect of music on cognition, the effect of music listening on creative cognition has remained largely unexplored. The current study experimentally tests whether listening to specific types of music (four classical music excerpts systematically varying on valence and arousal), as compared to a silence control condition, facilitates divergent and convergent creativity. Creativity was higher for participants who listened to 'happy music' (i.e., classical music high on arousal and positive mood) while performing the divergent creativity task, than for participants who performed the task in silence. No effect of music was found for convergent creativity. In addition to the scientific contribution, the current findings may have important practical implications. Music listening can be easily integrated into daily life and may provide an innovative means to facilitate creative cognition in an efficient way in various scientific, educational and organizational settings when creative thinking is needed.
Ferguson, S, Kenny, DT, Mitchell, HF, Ryan, M & Cabrera, D 2013, 'Change in messa di voce characteristics during 3 years of classical singing training at the tertiary level', Journal of Voice, vol. 27, no. 4, pp. 35-48.
A 3-year longitudinal study was conducted to investigate changes in vocal quality as a result of singing training at a tertiary level conservatorium in Australia. Singers performed a messa di voce (MDV) at intervals of 6 months over the 3-year period of training. The study investigated the evolving acoustic features of the singers' voices exhibited during the MDV, including sound pressure level (SPL), short-term energy ratio (STER), duration, and vibrato parameters of the fundamental frequency (F0), SPL, and STER. The maximum SPL exhibited a marginal systematic increase over the training period, but the maximum STER did not systematically change. F0 vibrato extent increased significantly, whereas the extent of SPL and STER vibrato did not change significantly.
Ferguson, S, Beilharz, KA & Calo, CA 2012, 'Navigation of interactive sonifications and visualisations of time-series data using multi-touch computing', Journal on Multimodal User Interfaces, vol. 5, no. 3-4, pp. 97-109.
This paper discusses interaction design for interactive sonification and visualisation of data in multi-touch contexts. Interaction design for data analysis is becoming increasingly important as data becomes more openly available. We discuss how navigation issues such as zooming, selection, arrangement and playback of data relate to both the auditory and visual modality in different ways, and how they may be linked through the modality of touch and gestural interaction. For this purpose we introduce a user interface for exploring and interacting with representations of time-series data simultaneously in both the visual and auditory modalities.
Ferguson, S, Schubert, E & Dean, R 2011, 'Continuous subjective loudness responses to reversals and inversions of a sound recording of an orchestral excerpt', Musicae Scientiae, vol. 15, no. 3, pp. 387-401.
Twenty-four respondents continuously rated the loudness of the first 65 seconds of a Dvorak Slavonic Dance, which was known to vary considerably in loudness. They also rated the same excerpt when the sound file was digitally treated so that (1) the sound pressure level (SPL) was inverted or (2) it was temporally reversed or (3) both 1 and 2. Specifically we wanted to see if acoustic intensity was processed into the percept of loudness primarily using a bottom-up (indifferent to timbral environment and thematic cues) or top-down style (where musical context, such as instrument identity and musical expectation affects the loudness rating). Comparing the different versions (conditions) allowed us to ascertain which style they were likely to be using. A single, six-second region was located as being differentiated across two conditions, where loudness seemed to be increased due to expectation of the instrument and orchestral texture, despite the lower SPL. We named this effect an auditory loudness stroop. A second region was differentiated between the two conditions, but its explanation appears to involve two factors, auditory looming perception and the reversal of stimulus note ramps. The overall conclusion was that the predominant processing style for loudness rating was bottom-up. Implications for further research and application to models of loudness are discussed.
Ferguson, S, Kenny, DT & Cabrera, D 2010, 'Effects of training on time-varying spectral energy and sound pressure level in nine male classical singers', Journal of Voice, vol. 24, no. 1, pp. 39-46.
Ferguson, S & Cabrera, D 2005, 'Vertical localization of sound from multiway loudspeakers', AES: Journal of the Audio Engineering Society, vol. 53, no. 3, pp. 163-173.
Practical wide-range loudspeakers are usually implemented with multiple drivers, but the systematic effect of the signal frequency upon the vertical localization of sound is scarcely used for loudspeaker enclosure design. Tendencies in vertical localization for the frequency bands characteristic of woofers and tweeters in loudspeakers are shown. Using vertical arrays of individually controlled loudspeakers, synchronous and asynchronous bands of noise were presented to subjects. The frequency of the source affected the vertical position of the low-and high-frequency auditory image pairs significantly and systematically, in a manner broadly consistent with previous studies concerned with single auditory images. Lower frequency sources are localized below their physical positions whereas high-frequency sources are localized at their true positions. This effect is also shown to occur for musical signals. It is demonstrated that low-frequency sources are not localized well when presented in exact synchrony with high-frequency sources, or when they only include energy below 500 Hz.
Ferguson, SJ & Cabrera, D 2005, 'Vertical Localization of Sound from Multiway Loudspeakers', Journal of the Audio Engineering Society, vol. 53, no. 5, pp. 163-173.
Loke, L & Khut, GP 2014, 'Intimate Aesthetics and Facilitated Interaction' in Candy, L & Ferguson, S (eds), Interactive Experience in the Digital Age, Springer, pp. 91-108.
Tan, C & Ferguson, S 2014, 'The Role of Emotions in Art Evaluation' in Candy, L & Ferguson, S (eds), Interactive Experience in the Digital Age, Springer, Switzerland, pp. 139-152.
With contributions from artists, scientists, curators, entrepreneurs and designers engaged in the creative arts, this book is an invaluable resource for both researchers and practitioners, working in this emerging field.
Ferguson, S, Martens, W & Cabrera, D 2011, 'Statistical Sonification for Exploratory Data Analysis' in Hermann, T, Hunt, A & Neuhoff, JG (eds), The Sonification Handbook, Logos Verlag Berlin GmBH, Berlin, Germany, pp. 175-196.
At the time of writing, it is clear that more data is available than can be practically digested in a straightforward manner without some form of processing for the human observer. This problem is not a new one, but has been the subject of a great deal of practical investigation in many fields of inquiry. Where there is ready access to existing data, there have been a great many contributions from data analysts who have refined methods that span a wide range of applications, including the analysis of physical, biomedical, social, and economic data. A central concern has been the discovery of more or less hidden information in available data, and so statistical methods of data mining for 'the gold in there' have been a particular focus in these developments. A collection of tools that have been amassed in response to the need for such methods form a set that has been termed Exploratory Data Analysis, or EDA, which has become widely recognized as constituting a useful approach. The statistical methods employed in EDA are typically associated with graphical displays that seek to 'tease out' a structure in a dataset, and promote the understanding or falsification of hypothesized relationships between parameters in a dataset.
Abu Ul Fazal, M, Karim, S, Ferguson, S & Johnston, A 2019, 'Vinfomize: A framework for multiple voice-based information communication', ACM International Conference Proceeding Series, pp. 143-147.
In this paper, we discuss investigations conducted with 10 visually challenged users (VCUs) and 8 sighted users (SUs) that aimed to determine users' experience of, interest in and expectations of concurrent information communication systems. In the first study, we concurrently played two voice-based streams in continuous form in both ears, and in the second study, we concurrently communicated one stream continuously in one ear and three news headlines as interval-based short interruptions in the other ear. We first report the participants' experience qualitatively and then, based on the feedback received from the users, propose a framework that may help in developing systems to communicate multiple streams of voice-based information to users. It is expected that the application of this new framework to information systems that provide multiple concurrent communication will provide a better user experience for users, subject to their contextual and perceptual needs and limitations.
Bown, O, Ferguson, S, Bray, L, Fraietta, A & Loke, L 2019, 'Facilitating Creative Exploratory Search with Multiple Networked Audio Devices Using HappyBrackets', http://www.nime.org/proceedings/2019/, New Interfaces for Musical Expression, Porto Alegre, Brazil, pp. 286-291.
Bajwa, MA, Prior, J, Leaney, J & Ferguson, S 2018, 'Enterprise IT Governance impact on Agile Software Development Project Success', Designing Digitalization (ISD2018 Proceedings), International Conference on Information Systems Development, AIS, Lund, Sweden, pp. 1-3.
Enterprise IT (EIT) governance has become the primary approach in leveraging the IT function to achieve business objectives. We found in previously published work that decision making is the core of EIT governance. We collected quantitative data from professionals on decision making in Agile Software Development (ASD) projects, which we analyzed using Spearman’s Ranked Correlation Coefficient. Decision-making clarity in implementation and decision-making distribution in the organization layers positively impact ASD project success. However, our finding that tailoring the decision-making process does not impact ASD project success was most surprising. We conclude that the impact of decision-making factors in an ASD project’s success needs to be explored more deeply.
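The abstract above names Spearman's rank correlation coefficient as its analysis method. As a rough illustration only, the sketch below computes that coefficient in plain Python by ranking both variables (ties receive the average rank) and taking the Pearson correlation of the ranks. The function names and the survey scores are invented for this example; they are not the authors' code or data.

```python
import math


def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented 5-point survey responses (NOT data from the study).
decision_clarity = [2, 3, 3, 4, 5, 1, 4]
project_success = [1, 3, 2, 4, 5, 2, 5]
rho = spearman(decision_clarity, project_success)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based coefficient suits ordinal survey responses because it depends only on the ordering of the answers, not on the spacing between scale points.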
Abu Ul Fazal, M, Ferguson, S & Johnston, A 2018, 'Investigating concurrent speech-based designs for information communication', ACM International Conference Proceeding Series, Audio Mostly on Sound in Immersion and Emotion, ACM, Wrexham, United Kingdom, pp. 1-8.
Speech-based information is usually communicated to users in a sequential manner, but users are capable of obtaining information from multiple voices concurrently. This fact implies that the sequential approach is possibly under-utilizing human perception capabilities to some extent and preventing users from performing optimally in an immersive environment. This paper reports on an experiment that aimed to test different speech-based designs for concurrent information communication. Two audio streams from two types of content were played concurrently to 34 users, in either a continuous or an intermittent form, with the manipulation of a variety of spatial configurations (i.e. Diotic, Diotic-Monotic, and Dichotic). In total, 12 concurrent speech-based design configurations were tested with each user. The results showed that the concurrent speech-based information designs involving intermittent form and the spatial difference in information streams produce comprehensibility equal to the level achieved in sequential information communication.
Ul Fazal, MA, Ferguson, S, Karim, MS & Johnston, A 2018, 'Concurrent voice-based multiple information communication: A study report of profile-based users' interaction', 145th Audio Engineering Society International Convention, AES 2018, Audio Engineering Society International Convention, Audio Engineering Society, New York, New York.
This paper reports a study conducted with 10 blind and 8 sighted participants using a prototype system for communicating multiple information streams concurrently, using two methods of presentation. In the first method, the prototype system played two continuous voice-based articles diotically, differing by voice gender and content. In the second method, the prototype communicated one continuous article in a female voice and three headlines as interval-based short interruptions in a male voice dichotically. In this investigation the continuous method remained more effective in communicating multiple information streams than the interval-based interruption method, and the users who possessed at least a tertiary qualification performed better in comprehending the multiple concurrent information streams than the non-tertiary qualified users.
Loke, L, Bown, O, Ferguson, S, Bray, L, Fraietta, A & Packham, K 2018, 'Your move sounds so predictable!', CHI PLAY 2018 - Proceedings of the 2018 Annual Symposium on Computer-Human Interaction in Play Companion Extended Abstracts, Chi Play, ACM, Melbourne, VIC, Australia, pp. 121-125.
Your Move Sounds So Predictable is a semi-improvised two-player movement and sound game, based around a pair of bespoke motion-sensing sonic balls. Players pull a card and follow the instruction on where to place the ball in relation to their body. The sonic behavior of each ball has been programmed to exhibit a moderately complex and hard-to-predict set of responses to the user input that challenge the user’s expectation and the experience of autonomy and causality. The balls also communicate with each other, adding additional causal flows. Each player explores this relationship between movement and sound through play, whilst at the same time attending to the emergent sonic composition created by the group. Chaos or harmony will ensue.
Ferguson, SJ 2017, 'Creative Coding for the Raspberry Pi using the HappyBrackets Platform', ACM SIGCHI Conference on Creativity and Cognition, ACM, Singapore, Singapore.
This workshop will introduce creative coding audio for the Raspberry Pi, using the 'beads' platform for audio programming, and the 'HappyBrackets' platform for inter-device communication and sensor data acquisition. We will demonstrate methods to allow each self-contained battery-powered device to acquire sensor data about its surroundings and the way it is being interacted with, as well as methods for designing systems where groups of these devices wirelessly communicate their state, allowing new interaction possibilities and approaches.
Ferguson, SJ, Bown, O, Rowe, A, Birtles, L & Bennewith, C 2017, 'Networked Pixels: Strategies for Building Visual and Auditory Images with Distributed Independent Devices', Proceedings of the 2017 ACM SIGCHI Conference on Creativity and Cognition, ACM Creativity and Cognition, ACM, Singapore, pp. 299-299.
This paper describes the development of the hardware and software for Bloom, a light installation installed at Kew Gardens, London in December of 2016. The system is made up of a set of nearly 1000 distributed pixel devices each with LEDs, GPS sensor, and sound hardware, networked together with WiFi to form a display system. Media design for this system required consideration of the distributed nature of the devices. We outline the software and hardware designed for this system, and describe two approaches to the software and media design, one whereby we employ the distributed devices themselves for computation purposes (the approach we ultimately selected), and another whereby the devices are controlled from a central server that is performing most of the computation necessary. We then review these approaches and outline possibilities for future research.
Ferguson, SJ, Rowe, A, Bown, O, Birtles, L & Bennewith, C 2017, 'Sound Design for a System of 1000 Distributed Independent Audio-Visual Devices', Proceedings of the international conference on new interfaces for musical expression, Copenhagen, Denmark, 2017, New Interfaces for Musical Expression 2017, NIME, Copenhagen, Denmark, pp. 245-250.
This paper describes the sound design for Bloom, a light and sound installation made up of 1000 distributed independent audio-visual pixel devices, each with RGB LEDs, WiFi, accelerometer, GPS sensor, and sound hardware. These types of systems have been explored previously, but only a few systems have exceeded 30-50 devices and very few have included sound capability, and therefore the sound design possibilities for large systems of distributed audio devices are not yet well understood. In this article we describe the hardware and software implementation of sound synthesis for this system, and the implications for design of media for this context.
Prior, J, Ferguson, S & Leaney, J 2016, 'Reflection is hard: teaching and learning reflective practice in a software studio', Proceedings of the Australasian Computer Science Week Multiconference, Australasian Computing Education Conference, ACM, Canberra, Australia.
We have observed that it is a non-trivial exercise for undergraduate students to learn how to reflect. Reflective practice is now recognised as important for software developers and has become a key part of software studios in universities, but there is limited empirical investigation into how best to teach and learn reflection. In the literature on reflection in software studios, there are many papers that claim that reflection in the studio is mandatory. However, there is inadequate guidance about teaching early stage students to reflect in that literature. The essence of the work presented in this paper is a beginning to the consideration of how the teaching of software development can best be combined with teaching reflective practice for early stage software development students. We started on a research programme to understand how to encourage students to learn to reflect. As we were unsure about teaching reflection, and we wished to change our teaching as we progressively understood better what to do, we chose action research as the most suitable approach. Within the action research cycles we used ethnography to understand what was happening with the students when they attempted to reflect. This paper reports on the first 4 semesters of research.
We have developed and tested a reflection model and process that provide scaffolding for students beginning to reflect. We have observed three patterns in how our students applied this process in writing their reflections, which we will use to further understand what will help them learn to reflect. We have also identified two themes, namely, motivation and intervention, which highlight where the challenges lie in teaching and learning reflection.
Bown, O & Ferguson, S 2016, 'A musical game of bowls featuring the DIADs', http://www.nime.org/proceedings/2016, New Interfaces for Musical Expression, Griffith University, Brisbane, Australia, pp. 371-372.
We describe a project in which a game of lawn bowls was recreated using Distributed Interactive Audio Devices (DIADs), to create an interactive musical experience in the form of a game. This paper details the design of the underlying digital music system, some of the compositional and design considerations, and the technical challenges involved. We discuss future directions for our system and compositional method.
Bown, O, Loke, L, Ferguson, SJ & Reinhardt, D 2015, 'Distributed Interactive Audio Devices: Creative strategies and audience responses to novel musical interaction scenarios', http://isea2015.org/publications/proceedings-of-the-21st-international-…, International Symposium on Electronic Art, ISEA, Vancouver, Canada.
With the rise of ubiquitous computing come new possibilities for experiencing audio, visual and tactile media in distributed and situated forms, disrupting modes of media experience that have been relatively stable for decades. We present the Distributed Interactive Audio Devices (DIADs) project, a set of experimental interventions to explore future ubiquitous computing design spaces in which electronic sound is presented as distributed, interactive and portable. The DIAD system is intended for creative sound and music performance and interaction, yet it does not conform to traditional concepts of musical performance, suggesting instead a fusion of music performance and other forms of collaborative digital interaction. We describe the thinking behind the project, the state of the DIAD system’s technical development, and our experiences working with user interaction in lab-based and public performance scenarios.
Ferguson, SJ 2015, 'Using audio feature extraction for interactive feature-based sonification of sound', https://smartech.gatech.edu/bitstream/handle/1853/54106/ICAD%20Proceedi…, International Conference on Auditory Display, Georgia Institute of Technology, Graz, Austria, pp. 66-72.
Murray-Leslie, A, Ferguson, S & Johnston, A 2014, 'Colour Tuning', http://www.tuttocongressi.it/website/congresses/congressDetail2.aspx?id…, Costume Colloquium IV: Colors in Fashion, Life Beyond Tourism, Florence, Italy.
Colour Tuning is practice-based research into the relationship between colour, dance, fashion and music, in the form of an app (an iPad application) developed for use in a performance fashion or Live Art context. The app is used during a performance, encouraging the audience and performers to tune into each other, via the iPad app, to compose acoustic compositions or “Colour Music”.
Colour Tuning enables environments and bodies in space to tune in and out of each other, by using the iPad as a digital viewfinder through the app. The app player (e.g. an audience member) points the iPad in the direction in which he or she would like to compose music and create colour feedback, and selects colours (e.g. a collection of coloured clothing worn by dancers, models or actors on stage) on the iPad screen, which are then tracked. Each colour denotes a different sound space (each colour being mapped to an acoustic generative algorithm). Once the colours on the screen (colour fields denote the bodies of the actors) start moving, the sounds change according to which colour/actor comes close to another colour/actor, and when the colours/actors overlap or make contact, new sounds are generated, like mixing coloured paint, meaning the performative composition of colour and sound is always in flux and never sounds the same.
Colour Tuning addresses multiple themes of the conference, including: “Symbolism of colours in dress and fashion”, by translating colour and fashion into metaphorical sounds and timbres in music, to create a larger synesthesia Live Art experience. Colour Tuning presents a participatory dialogue between audience and designer, through the interactive nature of the Colour Tuning app’s mode of presentation, by inviting audience members to use the iPad app to tune into the colours they want to hear whilst watching the actors on stage or on the street. Colour Tuning presents a critical view on the history of colors in style and fashion; questioning the powerful...
Ferguson, SJ, Johnston, A & Murray-Leslie, A 2014, 'Methodologies with fashion acoustics Live on Stage!', Proceedings of the International Conference on New Interfaces for Musical Expression, New Interfaces for Musical Expression, Creativity and Cognition Workshop, Goldsmiths, University of London, pp. 1-4.
Tan, CT, Johnston, A, Bluff, A, Ferguson, S & Ballard, KJ 2014, 'Retrogaming as visual feedback for speech therapy', Proceeding SA'14 SIGGRAPH Asia 2014 Mobile Graphics and Interactive Applications, International Conference on Computer Graphics and Interactive Techniques in Australasia and Southeast Asia, ACM, Shenzhen Convention & Exhibition Center.
A key problem in speech therapy is the motivation of patients in repetitive vocalization tasks. One important task is the vocalization of vowels. We present a novel solution by incorporating formant speech analysis into retro games to enable intrinsic motivation in performing the vocalization tasks in a fun and accessible manner. The visuals in the retro games also provide a simple and instantaneous feedback mechanism to the patients' vocalization performance. We developed an accurate and efficient formant recognition system to continuously recognize vowel vocalizations in real time. We implemented the system into two games, Speech Invaders and Yak-man, published on the iOS App Store in order to perform an initial public trial. We present the development to inform like-minded researchers who wish to incorporate real-time speech recognition in serious games.
Tan, CT, Johnston, AJ, Bluff, A, Ferguson, S & Ballard, KJ 2014, 'Speech invaders & yak-man: retrogames for speech therapy', Proceeding SA '14 SIGGRAPH Asia 2014 Mobile Graphics and Interactive Applications, SIGGRAPH Asia 2014 Mobile Graphics and Interactive Applications, Shenzhen Convention & Exhibition Center.
Speech therapy is used for the treatment of speech disorders and commonly involves a patient attending clinical sessions with a speech pathologist, as well as performing prescribed practice exercises at home [Ruggero et al. 2012]. Clinical sessions are very effective (the speech pathologist can carefully guide and monitor the patient's speech exercises), but they are also costly and time-consuming. However, the more inexpensive and convenient home practice component is often not as effective, as it is hard to maintain sufficient motivation to perform the rigid repetitive exercises.
Ferguson, S, Schubert, E & Stevens, CJ 2014, 'Dynamic dance warping: Using dynamic time warping to compare dance movement performed under different conditions', Proceedings of the 1st International Workshop on Movement and Computing, International Workshop on Movement and Computing, ACM, Paris, France, pp. 94-99.
Ferguson, S, Johnston, AJ & Martin, AG 2013, 'A corpus-based method for controlling guitar feedback', Proceedings of the International Conference on New Interfaces for Musical Expression, New Interfaces for Musical Expression, Korea Advanced Institute of Science and Technology, Daejeon & Seoul, Korea Republic, pp. 541-546.
The use of feedback created by electric guitars and amplifiers is problematic in musical settings. For example, it is difficult for a performer to accurately obtain specific pitch and loudness qualities. This is due to the complex relationship between these quantities and other variables such as the string being fretted and the positions and orientations of the guitar and amplifier. This research investigates corpus-based methods for controlling the level and pitch of the feedback produced by a guitar and amplifier. A guitar-amplifier feedback system was built in which the feedback is manipulated using (i) a simple automatic gain control system, and (ii) a band-pass filter placed in the signal path. A corpus of sounds was created by recording the sound produced for various combinations of the parameters controlling these two components. Each sound in the corpus was analysed so that the control parameter values required to obtain particular sound qualities can be recalled in the manner of concatenative sound synthesis. As a demonstration, a recorded musical target phrase is recreated on the feedback system.
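The recall step described in the abstract above (looking up the control parameter values that produced a desired sound quality) can be pictured as a nearest-neighbour search over the analysed corpus. The following is a minimal sketch under stated assumptions, not the authors' implementation: the corpus entries, field names, distance measure and `level_weight` value are all hypothetical.

```python
import math

# Hypothetical corpus: each entry pairs the control settings used for one
# recording (gain and band-pass centre frequency) with the pitch and level
# measured from that recording. All values are invented for illustration.
CORPUS = [
    {"gain_db": -6.0, "filter_hz": 110.0, "pitch_hz": 110.2, "level_db": 78.0},
    {"gain_db": -3.0, "filter_hz": 220.0, "pitch_hz": 219.5, "level_db": 84.0},
    {"gain_db": -3.0, "filter_hz": 330.0, "pitch_hz": 331.0, "level_db": 82.0},
    {"gain_db": 0.0, "filter_hz": 440.0, "pitch_hz": 438.7, "level_db": 90.0},
]


def distance(entry, target_pitch_hz, target_level_db, level_weight=0.1):
    """Distance between a corpus entry and a target pitch/level.

    Pitch error is measured in semitones; level error in dB is scaled by
    level_weight (an assumed trade-off, not taken from the paper).
    """
    semitones = 12.0 * math.log2(entry["pitch_hz"] / target_pitch_hz)
    level_error = entry["level_db"] - target_level_db
    return math.hypot(semitones, level_weight * level_error)


def recall_settings(target_pitch_hz, target_level_db):
    """Return (gain_db, filter_hz) of the corpus entry closest to the target."""
    best = min(
        CORPUS, key=lambda e: distance(e, target_pitch_hz, target_level_db)
    )
    return best["gain_db"], best["filter_hz"]


def settings_for_phrase(phrase):
    """Map a target phrase of (pitch_hz, level_db) pairs to control settings."""
    return [recall_settings(pitch, level) for pitch, level in phrase]


if __name__ == "__main__":
    target_phrase = [(440.0, 90.0), (220.0, 85.0), (110.0, 78.0)]
    for settings in settings_for_phrase(target_phrase):
        print(settings)
```

Selecting the closest stored unit for each target in turn mirrors the general idea of concatenative synthesis that the abstract mentions; a real system would likely use richer audio features than pitch and level alone.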
Tan, C, Johnston, AJ, Ballard, KJ, Ferguson, S & Perera-Schulz, D 2013, 'sPeAK-MAN: towards popular gameplay for speech therapy', Proceedings of 9th Australasian Conference on Interactive Entertainment IE'13, Interactive Entertainment, ACM, Melbourne, VIC, Australia, pp. 1-4.
Current speech therapy treatments are not easily accessible to the general public due to cost and demand. Therapy sessions are also laborious and maintaining motivation of patients is hard. We propose using popular games and speech recognition technology for speech therapy in an individualised and accessible manner. sPeAK-MAN is a Pac-Man-like game with a core gameplay mechanic that incorporates vocalisation of words generated from a pool commonly used in clinical speech therapy sessions. Other than improving engagement, sPeAK-MAN aims to provide real-time feedback on the vocalisation performance of patients. It also serves as an initial prototype to demonstrate the possibilities of using familiar popular gameplay (instead of building one from scratch) for rehabilitation purposes.
Ferguson, S 2013, 'Sonifying every day: Activating everyday interactions for ambient sonification systems', Website Proceedings of the 2013 International Conference on Auditory Display, International Conference on Auditory Display, Lodz University of Technology Press, Lodz, Poland, pp. 77-84.
Ferguson, S, Emery, S, Lee, D, Cabrera, D & McPherson, GE 2013, 'A comparison between continuous categorical emotion responses and stimulus loudness parameters', 4th International Conference on Information, Intelligence, Systems and Applications, International Conference on Information, Intelligence, System and Applications, IEEE, Piraeus, Greece.
This paper investigates the use of psychoacoustic loudness analysis as a method for determining the likely emotional responses of listeners to musical excerpts. Nineteen excerpts of music were presented to 86 participants (7 randomly chosen excerpts per participant), who were asked to rate the emotion category using the emotion-clock-face continuous response interface. The same excerpts were analysed with a loudness model, and time series results were summarised as both loudness median and standard deviation. Comparisons indicate that the median and standard deviation of loudness play an important role in determining the emotion category responses.
Ferguson, S, Nagai, Y, Hewett, T, Yi-Luen Do, E, Dow, S, Ox, J, Smith, S, Nishimoto, K & Tan, C 2013, 'Proceedings of the 9th ACM Conference on Creativity & Cognition', Proceedings of the 9th ACM Conference on Creativity & Cognition, ACM, Sydney, NSW, Australia.
Schubert, E, Ferguson, S, Farrar, N, Taylor, D & McPherson, GE 2012, 'The Six Emotion-Face Clock as a Tool for Continuously Rating Discrete Emotional Responses to Music', From Sounds to Music and Emotions (LNCS), CMMR: International Symposium on Computer Music Modeling and Retrieval, Springer, 9th International Symposium, pp. 1-18.
Recent instruments measuring continuous self-reported emotion responses to music have tended to use dimensional rating scale models of emotion such as valence (happy to sad). However, numerous retrospective studies of emotion in music use checklist-style responses, usually in the form of emotion words (such as happy, angry, sad) or facial expressions. A response interface based on six simple sketch-style emotion faces aligned into a clock-like distribution was developed with the aim of allowing participants to quickly and easily rate emotions in music continuously as the music unfolded. We tested the interface using six extracts of music, one targeting each of the six faces: 'Excited' (at 1 o'clock), 'Happy' (3), 'Calm' (5), 'Sad' (7), 'Scared' (9) and 'Angry' (11). Thirty participants rated the emotion expressed by these excerpts on our 'emotion-face-clock'. By demonstrating how continuous category selections (votes) changed over time, we were able to show that (1) more than one emotion-face could be expressed by music at the same time, (2) the emotion face that best portrayed the emotion the music conveyed could change over time, and (3) the change could be attributed to changes in musical structure. Implications for research on orientation time and mixed emotions are discussed.
Ferguson, S, Johnston, AJ, Ballard, KJ, Tan, C & Perera-Schulz, D 2012, 'Visual feedback of acoustic data for speech therapy: model and design parameters', Proceedings of the 7th Audio Mostly Conference: A Conference on Interaction with Sound, Audio Mostly Conference: A Conference on Interaction with Sound, ACM, Corfu, Greece, pp. 135-140.
Feedback, usually of a verbal nature, is important for speech therapy sessions. Some disadvantages exist, however, with traditional methods of speech therapy, and visual feedback of acoustic data is a useful alternative that can be used to complement typical clinical sessions. Visual feedback has been investigated before, and in this paper we propose several new prototypes. From these prototypes we develop an iterative model of analysing the design of feedback systems by examining the feedback process. From this iterative model, we then extract methods to inform design of visual feedback systems for speech therapy.
Schubert, E, Ferguson, S, Farrar, N, Taylor, D & McPherson, GE 2012, 'Continuous Response to Music using Discrete Emotion Faces', Proceedings of the 9th International Symposium on Computer Music Modelling and Retrieval, International Symposium on Computer Music Modeling and Retrieval (CMMR), Queen Mary University of London, London, UK, pp. 3-19.
An interface based on simple graphical face expressions, aligned in a clock-like distribution, was developed with the aim of allowing participants to quickly and easily rate emotions in music continuously. We tested the interface using six extracts of music, one targeting each of the six faces: 'Excited' (at 1 o'clock), 'Happy' (3), 'Calm' (5), 'Sad' (7), 'Scared' (9) and 'Angry' (11). Thirty participants rated the emotion expressed by these excerpts on our 'emotion-face-clock'. By demonstrating how continuous category selections (votes) changed over time, we were able to show that (1) more than one emotion-face could be expressed by music at the same time and (2) the emotion face that best portrayed the emotion the music conveyed could change over time, and that the change could be attributed to changes in musical structure.
Taylor, D, Schubert, E, Ferguson, S & McPherson, GE 2012, 'The Role of Musical Features in the Perception of Initial Emotion', Proceedings of the 9th International Symposium on Computer Music Modelling and Retrieval, International Symposium on Computer Music Modeling and Retrieval (CMMR), Queen Mary University of London, London, pp. 136-143.
170 participants were played short excerpts of orchestral music and instructed to move a mouse cursor as quickly as possible to one of six faces that best corresponded to the emotion they thought the music expressed. Excerpts were analysed and the musical cues coded. Relationships between the number of cues and participants' response times were investigated and reported. No relationship between the number of cues available to the listener and the speed of response was found. Findings suggest that the initial response to ecologically plausible musical excerpts is quite complex, and requires further investigation to provide emotion-retrieval models of music with psychologically driven data.
Johnston, AJ, Beilharz, KA, Chen, Y & Ferguson, S 2010, 'Proceedings of the 2010 Conference on New Interfaces for Musical Expression (NIME 2010)', Proceedings of the 2010 Conference on New Interfaces for Musical Expression (NIME 2010), University of Technology Sydney, Sydney, Australia.
Ferguson, S, Cabrera, D & Schubert, E 2010, 'Comparing continuous subjective loudness responses and computational models of loudness for temporally varying sounds', 129th Audio Engineering Society Convention 2010, pp. 857-864.
There are many ways in which loudness can be objectively estimated, including simple weighted models based on physical sound level, as well as complex and computationally intensive models that incorporate many psychoacoustical factors. These complex models have been generated from principles and data derived from listening experiments using highly controlled, usually brief, artificial stimuli, whereas the simple models tend to have a real-world emphasis in their derivation and validation. Loudness research has recently also focused on modelling time-varying loudness, as temporal aspects can have a strong effect on loudness. In this research, continuous subjective loudness responses are compared to time-series outputs of loudness models. We use two types of stimuli: a sequence of sine tones, and a sequence of band-limited noise bursts. The stimuli were analyzed using a variety of loudness models, including those of Glasberg and Moore; Chalupper and Fastl; and Moore, Glasberg and Baer. Continuous subjective responses were obtained from 24 university students, who rated loudness continuously in time over the period of the experiment, while using an interactive interface.
Beilharz, KA & Ferguson, S 2009, 'An Interface and Framework Design for interactive Aesthetic Sonification', Proceedings of the 15th International Conference on Auditory Display, International Conference on Auditory Display, Re:New Digital Arts Forum, Copenhagen, Denmark, pp. 1-8.
This paper describes the interface design of our AeSon (Aesthetic Sonification) Toolkit motivated by user-centred customisation of the aesthetic representation and scope of the data. The interface design is developed from 3 premises that distinguish our approach from more ubiquitous sonification methodologies. Firstly, we prioritise interaction both from the perspective of changing scale, scope and presentation of the data and the user's ability to reconfigure spatial panning, modality, pitch distribution, critical thresholds and granularity of data examined. The user, for the majority of parameters, determines their own listening experience for real-time data sonification, even to the extent that the interface can be used for live data-driven performance, as well as traditional information analysis and examination. Secondly, we have explored the theories of Tufte, Fry and other visualization and information design experts to find ways in which principles that are successful in the field of information visualization may be translated to the domain of sonification. Thirdly, we prioritise aesthetic variables and controls in the interface, derived from musical practice, aesthetics in information design and responses to experimental user evaluations to inform the design of the sounds and display. In addition to using notions of meter, beat, key or modality and emphasis drawn from music, we draw on our experiments that evaluated the effects of spatial separation in multivariate data presentations.
Ferguson, S & Beilharz, KA 2009, 'An Interface for Live Interactive Sonification', Proceedings of New Interfaces for Musical Expression, New Interfaces for Musical Expression, NIME, Pittsburgh PA, pp. 35-36.
Ferguson, SJ & Cabrera, D 2009, 'Auditory Spectral Summarisation for Audio Signals with Musical Applications', 10th International Society for Music Information Retrieval Conference, International Symposium for Music Information Retrieval, International Society on Music Information Retrieval, Kobe, Japan, pp. 567-572.
Methods for spectral analysis of audio signals and their graphical display are widespread. However, assessing music and audio in the visual domain involves a number of challenges in the translation of auditory images into mental or symbolically represented concepts. This paper presents a spectral analysis method that exists entirely in the auditory domain, and results in an auditory presentation of a spectrum. It aims to strip a segment of audio signal of its temporal content, resulting in a quasi-stationary signal that possesses a similar spectrum to the original signal. The method is extended and applied for the purpose of music summarisation.
Cabrera, D, Ferguson, S & Schubert, E 2008, 'PsySound3: An integrated environment for the analysis of sound recordings', Annual Conference of the Australian Acoustical Society, AAS'08, pp. 286-292.
This paper presents possibilities offered by a computer program for analysing features of sound recordings, PsySound3. A wide variety of spectral and sound level analysis methods are implemented, together with models of loudness, roughness, pitch and binaural spatial analysis. In addition to providing access to these analysis methods, this analysis environment provides a context for easy comparison between analysis methods, which is very useful both for teaching and for the testing and development of models for research applications. The paper shows some of the potential for this by way of example. The software is structured so as to be easily extensible (using the Matlab programming environment), and many extensions are envisaged. Written by the authors and colleagues, PsySound3 is freely available via www.psysound.org.
Beilharz, KA & Ferguson, S 2007, 'Gestural Hyper Instrument Collaboration with Generative Computation for Real Time Creativity', Creativity and Cognition Conference '07, ACM Creativity and Cognition, Sheridan Printing, Washington DC, USA, pp. 213-222.
This paper describes the performance, mapping, transformation and representation phases of a model for gesture-triggered musical creativity. These phases are articulated in an example creative environment, Hyper-Shaku (Border-Crossing), an audio-visually augmented shakuhachi performance to demonstrate the adaptive, empathetic response of the generative systems. The shakuhachi is a Japanese traditional end-blown bamboo Zen flute. Its 5 holes and simple construction require subtle and complex gestural movements to produce its diverse range of pitches, vibrato and pitch inflections, making it an ideal candidate for gesture capture. The environment uses computer vision, gesture sensors and computer listening to process and generate electronic music and visualization in real-time response to the live performer. The integration of looming auditory motion and Neural Oscillator Network (NOSC) generative modules is implemented in this example.
Beilharz, KA, Jakovich, J & Ferguson, S 2006, 'Hyper-shaku [Border-crossing]: Towards the multi-modal gesture-controlled hyper-instrument', NIME06: Sixth International Conference on New Interfaces for Musical Expression 2006, International Conference on New Interfaces for Musical Expression, Ircam - Centre Pompidou, Paris, France, pp. 352-357.
Cabrera, D & Ferguson, S 2006, 'Auditory display of audio', Audio Engineering Society - 120th Convention Spring Preprints 2006, pp. 455-461.
In this paper, we consider applications of auditory display for representing audio system and audio signal characteristics. Conventional analytic representations of system characteristics, such as impulse response or non-linear distortion, rely on numeric and graphic communication. Alternatively, simply listening to the system under test can also reveal important aspects of its performance. Given that auditioning systems is so effective, it seems useful to develop higher-level auditory representations (auditory displays) of system performance parameters to exploit these listening abilities. For this purpose, we consider ways in which audio signals can be further transformed for auditory display, beyond the simple act of playing the sound.
Cabrera, D, Ferguson, S & Maria, R 2006, 'Using sonification for teaching acoustics and audio', 1st Australasian Acoustical Societies' Conference 2006, ACOUSTICS 2006: Noise of Progress, pp. 383-390.
In this paper we develop examples of how the understanding of acoustic and audio phenomena can be enhanced through sonification, especially with a view to application in education. The term sonification refers to the process of converting data into non-speech audio, and is distinct from auralization in that the process does not aim to simulate an actual or imagined sound environment. Measurements of audio and acoustical systems are most commonly represented numerically and graphically, and these two methods each have distinct advantages. However, display of such data using sound not only conveys important information, but also may provide an experience of important aspects of the phenomenon under consideration. When used in an education context, this method of data display should improve listening skills. We demonstrate various data transformations that allow a sonification of acoustical measurements or phenomena to bring out features of interest. We also demonstrate more abstract sonifications (auditory graphs) that can be usefully applied to this context. Copyright © (2006) by the Australian Acoustical Society.
Ferguson, S, Cabrera, D, Beilharz, KA & Song, H 2006, 'Using Psychoacoustical Models for Information Sonification', Proceedings of 12th International Conference on Auditory Display ICAD 2006, International Conference on Auditory Display, ICAD, London, UK, pp. 113-120.
Cabrera, D, Ferguson, S & Laing, G 2005, 'Development of auditory alerts for air traffic control consoles', Audio Engineering Society - 119th Convention Fall Preprints 2005, pp. 1-21.
This paper documents a project that developed a hierarchical auditory alert scheme for air traffic control consoles, replacing a basic system of auditory alerts. Alerts are designed to convey the level of urgency, not provoke annoyance, be easily distinguished, minimize speech interference, and be easily localized. User evaluations indicate that the new alert scheme is highly advantageous, especially when combined with improved visual coding of alerts. The alert scheme was implemented in Australian air traffic control centers in July 2005.
Ferguson, S, Moere, AV & Cabrera, D 2005, 'Seeing sound: Real-time sound visualisation in visual feedback loops used for training musicians', Proceedings of the International Conference on Information Visualisation, pp. 97-102.
Musicians in training need to understand the sound they are producing in order to improve its deficient aspects. Verbal feedback from musical masters is the usual method used for attaining this understanding. However, using real-time sound visualisation as a complementary form of feedback allows the large amounts of data typical of real-time acoustic analysis to be employed within training. This improves the efficiency of the feedback loop normally present within musical training and pedagogy. The implementation and effect of such a system are discussed. © 2005 IEEE.
Subkey, A, Cabrera, D & Ferguson, S 2005, 'Localization and image size effects for low frequency sound', Audio Engineering Society - 118th Convention Spring Preprints 2005, pp. 1974-1983.
Using four subwoofers, this study investigates horizontal auditory image characteristics for one-third octave bands of pink noise in the frequency range 25 Hz to 100 Hz. The subwoofers were located at 90 degree intervals: 45 degrees to the left and right, and in front of and behind the subject. Single noise bands, coherent pairs, and incoherent pairs were subjectively assessed. Subjects drew the auditory image as an ellipse on a response sheet. Results indicate that left-right discrimination occurs even at the lowest frequencies of human hearing - a finding consistent with other recent research. Image width and depth are correlated, increasing at low frequencies for the stimuli tested, and for simultaneous presentation of coherent or incoherent signals. Like other recently published studies using multiple channels of low frequency sound, this study indicates that multiple subwoofers should be beneficial in multichannel audio systems.
Haeusler, M, Beilharz, KA, Ferguson, S & Barker, T, 'Polymedia Pixel', Media Architecture Biennale 2010, Media Architecture Institute, Kuenstlerhaus Vienna.