Dr Do Heon Lee


Dr. Do Heon Lee received his Bachelor's degree (with Honours) from the Australian Institute of Music, and his Master's (with Honours) and PhD degrees from the University of Sydney. Previously, he worked at the University of New South Wales as a research associate, and at the University of Sydney as an honorary research associate.

His main research interest is the application of psychoacoustic algorithms for modelling the human perception of room acoustic conditions. His research outcomes have been published in various refereed international journals and conferences. He is currently working at UTS as a postdoctoral research associate. 

Postdoctoral Research Associate, School of Mechanical and Mechatronic Engineering
Audio Engineering and Sound Production, Audio Technology, Acoustics

Research Interests

  • Room Acoustics
  • Psychoacoustics


Lee, D., Gong, E., Cabrera, D., Yadav, M. & Martens, W.L. 2015, 'Intelligibility of reverberant speech with amplification: Limitation of speech intelligibility metrics, and a preliminary examination of an alternative approach', 2015 Conferences on New Advances in Acoustics, Shanghai, China.
This study examines the effect of speech level on intelligibility in different reverberation conditions, and explores the potential of loudness-based reverberation parameters proposed by Lee et al. [J. Acoust. Soc. Am., 131(2), 1194-1205 (2012)] to explain the effect of speech level on intelligibility in various reverberation conditions. Listening experiments were performed with three speech levels (LAeq of 55 dB, 65 dB and 75 dB) and three reverberation conditions (T20 of 1.0 s, 1.9 s and 4.0 s), and subjects listened to speech stimuli through headphones. Collected subjective data were compared with two conventional speech intelligibility parameters (Speech Intelligibility Index and Speech Transmission Index) and two loudness-based reverberation parameters (EDTN and TN). Results reveal that the effect of speech level on intelligibility changes with a room's reverberation conditions, and that increased level results in reduced intelligibility in highly reverberant conditions. EDTN and TN explain this finding better than do STI and SII, because they consider many psychoacoustic phenomena important for the modeling of the effect of speech level varying with reverberation.
Yadav, M., Cabrera, D., Miranda, L., Martens, W.L., Lee, D. & Collins, R. 2013, 'Investigating auditory room size perception with autophonic stimuli', The 135th Audio Engineering Society Convention, New York, USA.
Although looking at a room gives a visual indicator of its 'size', auditory stimuli alone can also provide an appreciation of room size. This paper investigates such aurally perceived room size by allowing listeners to hear the sound of their own voice in real-time through two modes: natural conduction and auralization. The auralization process involved convolution of the talking-listener's voice with an oral-binaural room impulse response (OBRIR; some from actual rooms, and others manipulated), which was output through head-worn ear-loudspeakers, and thus augmented natural conduction with simulated room reflections. This method allowed talking-listeners to rate room size without additional information about the rooms. The subjective ratings were analyzed against relevant physical acoustic measures derived from OBRIRs. The results indicate an overall strong effect of reverberation time on the room size judgments, expressed as a power function, although energy measures were also important in some cases.
Ferguson, S., Emery, S., Lee, D., Cabrera, D. & McPherson, G.E. 2013, 'A comparison between continuous categorical emotion responses and stimulus loudness parameters', The 4th International Conference on Information, Intelligence, Systems and Applications, Piraeus, Greece.
This paper investigates the use of psychoacoustic loudness analysis as a method for determining the likely emotional responses of listeners to musical excerpts. Nineteen excerpts of music were presented to 86 participants (7 randomly chosen excerpts per participant) who were asked to rate the emotion category using the emotion-clock-face continuous response interface. The same excerpts were analysed with a loudness model, and time series results were summarised as both loudness median and standard deviation. Comparisons indicate that the median and standard deviation of loudness play an important role in determining the emotion category responses.
Cabrera, D., Lee, D., Yadav, M. & Martens, W.L. 2011, 'Decay envelope manipulation of room impulse responses: Techniques for auralization and sonification', Acoustics 2011: Proceedings of Australian Acoustical Society Conference, Gold Coast, Australia.
Room impulse responses (RIRs) are very commonly used to represent the acoustic response of rooms for the derivation of acoustical parameters and for auralization. This paper presents a set of signal processing techniques that can be used to enhance the usefulness of recorded RIRs for convolution-based room simulations (which could be classed as a type of auralization), which include using the noise floor to extend the decay, and manipulating the RIR to represent arbitrarily different yet plausible room conditions. The paper also considers how the manipulation of the decay slope can be used to make other features of RIRs more audible, which could have applications in RIR sonification.
Lee, D., Cabrera, D. & Martens, W.L. 2010, 'Equal reverberance matching of running musical stimuli having various reverberation times and SPLs', The 20th International Congress on Acoustics, Sydney, Australia.
This paper examines effects of listening level and reverberation time on the reverberance of running musical stimuli. A listening test was conducted which tested an anechoic music stimulus convolved with synthetic RIRs having a range of listening levels and reverberation times: in the test, subjects adjusted the reverberance of a musical stimulus (by adjusting the decay rate of an impulse response convolved with dry music) to match that of reference stimuli. In this way, we constructed equal reverberance contours as a function of sound pressure level and reverberation time. The experiment results confirm that the listening level and reverberation time both have a significant effect on reverberance: increased listening level or reverberation time leads to greater reverberance. Loudness-based predictors of reverberance outperform the conventional reverberance predictors.
William, W.W., Guru, A. & Lee, D. 2010, 'Effects of individualised headphone response equalization on front/back hemifield discrimination for virtual sources displayed on the horizontal plane', The 20th International Congress on Acoustics, Sydney, Australia.
In the most demanding virtual auditory display applications, in which individualised Head Related Transfer Functions (HRTFs) are used for the presentation of virtual sound sources via headphones, there is controversy regarding how important it may be for individualised Headphone Transfer Function (HpTF) measurements to be used in equalizing the headphone response for each listener. In order to test what impact the use of such individualized HpTF-based correction might have on directional judgments, filtered noise bursts were presented with and without such headphone correction during a test of front/back hemifield discrimination for virtual sound sources positioned on six sagittal planes offset from the median plane by 15°, 30°, and 45° to either side. While perfect discrimination performance was observed given repeated two-interval forced choice discrimination trials in which a pair of short noise bursts were presented using individualised HRTFs, within-trial variation in the spectrum of the source submitted to HRTF-based processing made the task quite difficult, reducing performance to chance levels for 7 of the 17 listeners tested. For the remaining listeners who showed above-chance performance under all conditions tested, performance levels were well below the perfect performance that had been observed when the spectrum of the HRTF-processed source was held constant. Through inter-stimulus variation in source spectra, which functioned to remove the so-called 'known-source-spectrum' ceiling effect associated with simple laboratory tests of virtual auditory display technology, it was possible to show that front/back discrimination performance was clearly affected when sources were processed using headphone correction filters that were based upon each individual's measured HpTF.
Guru, A., Martens, W.L. & Lee, D. 2010, 'Effects of individualised headphone correction on front/back discrimination of virtual sound sources displayed using individualised head related transfer functions', The AES 40th International Conference, Tokyo, Japan.
Individualised Head Related Transfer Functions (HRTFs) were used to process brief noise bursts for a 2-interval forced choice (2IFC) front/back discrimination of virtual sound source locations presented via two models of headphones, the frequency responses of which could be made nearly flat for each of 21 listeners using individualised headphone correction filters. In order to remove virtual source timbre as a cue for front/back discrimination, the spectral centroids of sources processed using rearward HRTFs were manipulated so as to be more or less similar to those of sources processed using frontward HRTFs. As this manipulation reduced front/back discrimination to chance levels for 12 out of 21 listeners, the performance of the 9 listeners showing "good discrimination" was analysed separately. For these 9 listeners, the virtual sources presented using individualised headphone correction filters supported significantly better front/back discrimination rates than did virtual sources presented without correction to headphone responses.
Lee, D., Cabrera, D. & Martens, W.L. 2009, 'Equal reverberance matching of music', Acoustics 2009: Proceedings of the Australian Acoustical Society Conference, Adelaide, Australia.
This study explores the reverberance of music in simulated auditoria. It investigates the effects of gain on the reverberance of an anechoic music recording convolved with auditorium impulse responses. Based on objective loudness modelling, our hypothesis is that gain has a positive effect on reverberance (even though it has no effect on reverberation time). In a subjective experiment, participants adjusted decay rate of auditorium impulse responses convolved with an anechoic music sample in order to match the reverberance of each music stimulus to that of a reference music sample. Results support the hypothesis, and are similar to those of a previous study in which auditorium impulse responses (without convolution) were matched similarly for reverberance.
Lee, D. & Cabrera, D. 2007, 'Matching the reverberance of room impulse responses', The 10th Western Pacific Acoustics Conference, Beijing, China.
This study investigates the subjective reverberance of room impulse responses (RIRs) when directly listened to (rather than convolved with a dry signal such as speech or music). We investigate the effects of gain (or listening level) and background noise level on the reverberance of RIRs measured in concert auditoria. The task of the subjective experiment was to match the reverberance of RIRs to that of a reference RIR by adjusting the exponential decay rate using a slider. Based on objective loudness modeling, gain should have a positive effect on reverberance and background noise have a negative effect. This is confirmed in the results of the experiment.
Lee, D. & Cabrera, D. 2007, 'Nonlinear effects in airborne sound insulation measurement', The 14th International Congress on Sound and Vibration, Cairns, Australia.
The level difference (D) between a source and receiving room should be independent of the source's sound power level if the acoustic system (including the building fabric) is linear and time-invariant (LTI) and the signal to noise ratio is adequate. Furthermore, various measurement signal types such as white noise, maximum length sequence, and swept sinusoid (to derive impulse responses) should also yield equivalent results in an LTI system with adequate signal to noise ratio. This study investigates the presence of non-linear effects in a case study of a real building by measuring D using a range of sound power levels and signal types. This follows on from previous work which suggested that substantial non-linearities could affect measurements in the very low frequency range (20-100 Hz), so the present study includes these very low frequencies, but also investigates the usual frequency range for airborne sound insulation measurement. In this study, fixed source and receiver positions were used to measure D (without spatial averaging) between a pair of adjacent rooms. Three test signals were used: maximum length sequence (used both as white noise, and deconvolved to impulse responses), a linear swept sinusoid and a logarithmic swept sinusoid (deconvolved to impulse responses).

Journal articles

Cabrera, D., Lee, D., Leembruggen, G. & Jimenez, D. 2014, 'Increasing robustness in the calculation of the speech transmission index from impulse responses', Building Acoustics, vol. 21, no. 3, pp. 181-198.
There are many factors that can affect the measured values of the speech transmission index (STI), and this paper examines how and why identical inputs into STI calculation software are yielding varying results. The study involved a survey of a number of software implementations of the Indirect Method for computing the STI from an impulse response, one of which was written by the authors. Results are presented for artificial and measured impulse responses, and for signal and noise spectra that were designed to test particular aspects of the STI calculation. While most deviations between implementations were within 0.01 STI, some were not, revealing a need for greater robustness in the design of software and greater clarity in the STI standard (IEC 60268-16), including more support for validation. This paper provides some data for such validation.
Jeong, C.-H., Lee, D., Santurette, S. & Ih, J.-G. 2014, 'Influence of impedance phase angle on sound pressures and reverberation times in a rectangular room', Journal of the Acoustical Society of America, vol. 135, no. 2, pp. 712-723.
In most room acoustic predictions, phase shift on reflection has been overlooked. This study aims to quantify the effects of the surface impedance phase angle of the boundary surfaces on room acoustic conditions. As a preliminary attempt, a medium-sized rectangular room is simulated by a phased beam tracing model, after verifying it numerically against boundary element simulations. First, the absorption characteristic of the boundary surfaces varies uniformly from 0.2 to 0.8, but with various impedance phase angles. Second, typical non-uniform cases having hard walls and floor, but with an absorptive ceiling are investigated. The zero phase angle, which has commonly been assumed in practice, is regarded as reference and differences in the sound pressure level and early decay time from the reference are quantified. As expected, larger differences in the room acoustic parameters are found for larger impedance phase angles. Additionally, binaural impulse responses are compared in a listening test for the uniform absorption cases, revealing that non-zero impedance phase angle cases can be perceptually different from the reference condition in terms of reverberance perception. For the non-uniform settings, the change in the impedance phase angle of the ceiling does not affect the acoustic conditions significantly.
Lee, D., Cabrera, D. & Martens, W.L. 2014, 'Accounting for listening level in the prediction of reverberance using early decay time', Acoustics Australia, vol. 40, no. 2, pp. 103-110.
Reverberance, which is an auditory attribute describing the extent to which a room or system is reverberant, is conventionally estimated using early decay time (similar to reverberation time). In a series of recent studies, the authors have shown that reverberance is better estimated using loudness decay parameters, i.e., parameters derived from the decay function of a room impulse response analysed using an objective time-varying loudness model. This approach is based on the notion that the experience of sound decaying in a room is an experience of loudness decay. One reason for the success of this approach is that the loudness decay rate depends on listening level, and this dependency corresponds to subjective experimental data on reverberance. However, loudness-based analysis is neither simple nor computationally efficient, and so this paper proposes a simplified approach to reverberance estimation, using listening level to modify early decay time or reverberation time values.
Lee, D., Cabrera, D. & Martens, W.L. 2012, 'The effect of loudness on the reverberance of music: Reverberance prediction using loudness models', Journal of the Acoustical Society of America, vol. 131, no. 2, pp. 1194-1205.
This study examines the auditory attribute that describes the perceived amount of reverberation, known as 'reverberance'. Listening experiments were performed using two signals commonly heard in auditoria: excerpts of orchestral music and western classical singing. Listeners adjusted the decay rate of room impulse responses prior to convolution with these signals, so as to match the reverberance of each stimulus to that of a reference stimulus. The analysis examines the hypothesis that reverberance is related to the loudness decay rate of the underlying room impulse response. This hypothesis is tested using computational models of time varying or dynamic loudness, from which parameters analogous to conventional reverberation parameters (early decay time and reverberation time) are derived. The results show that listening level significantly affects reverberance, and that the loudness-based parameters outperform related conventional parameters. Results support the proposed relationship between reverberance and the computationally predicted loudness decay function of sound in rooms.
Lee, D., Cabrera, D. & Martens, W.L. 2011, 'Equal reverberance contours for synthetic room impulse responses listened to directly: Evaluation of reverberance in terms of loudness decay parameters', Building Acoustics, vol. 18, no. 1,2, pp. 189-206.
This paper examines effects of listening level and reverberation time on the perceived decay rate of synthetic room impulse responses (RIRs). A listening test was conducted with synthetic RIRs having a range of listening levels and reverberation times: in the test, subjects adjusted a physical decay rate of the RIRs to match the perceived decay rate of reference stimuli. In this way, we constructed equal reverberance contours as a function of sound pressure level and reverberation time. The experiment results confirm that listening level and reverberation time both significantly affect reverberance. The study also supports our previous finding: that the loudness decay function can be used to predict reverberance better than the conventional reverberance predictors.
Cabrera, D., Lee, D., Collins, R., Hartmann, B., Martens, W.L. & Sato, H. 2011, 'Variation in oral-binaural room impulse responses for horizontal rotations of a head and torso simulator', Building Acoustics, vol. 18, no. 1,2, pp. 227-251.
Oral-binaural room impulse responses (OBRIRs) describe the room acoustical response from the mouth to the ears of a head or dummy head. In this study, we measured OBRIRs in ten rooms, ranging from small to large. In each room, a head and torso simulator (HATS) was rotated at 2 degree increments to sample the room response at the selected measurement position. In rotating the HATS, the radiation pattern of the mouth rotates with the reception pattern of ears. This paper characterises the variation in room gain and interaural response of the tested rooms, and in doing so, we consider how OBRIRs can be usefully understood in terms of acoustical parameters.
Lee, D. & Cabrera, D. 2010, 'Effect of listening level and background noise on the subjective decay rate of room impulse responses: Using time-varying loudness to model reverberance', Applied Acoustics, vol. 71, no. 9, pp. 801-811.
Cabrera, D., Sato, H., Martens, W.L. & Lee, D. 2009, 'Binaural measurement and simulation of the room acoustical response from a person's mouth to their ears', Acoustics Australia, vol. 37, no. 3, pp. 98-103.
This paper outlines methods to simulate the sound of one's own voice as it is affected by room acoustics, using binaural technology. An oral-binaural room impulse response (OBRIR) measurement can be made of a real room environment from the mouth to the ears of the same head. For simulation, a talker's voice is convolved in real-time with the OBRIR, so that they can hear the sound of their own voice in the simulated room environment. We show by example how OBRIR measurements can be made using human subjects (by measuring the transfer function of speech) or by a head and torso simulator (HATS), and we illustrate the differences between individualised measurements and HATS measurements. We extend the HATS measurement method through binaural room scanning, which allows the simulation system to produce natural changes in the OBRIR as subjects rotate their heads while listening to their own voice.
Lee, D. & Cabrera, D. 2009, 'Basic considerations for loudness-based analysis of room impulse responses', Building Acoustics, vol. 16, no. 1, pp. 31-46.
Room impulse responses (RIRs) are used to characterise the acoustical conditions inside sound-critical rooms such as auditoria. The analysis of RIRs typically involves octave-band filtering, with parameters such as reverberation time, early decay time, temporal energy ratios and spatial parameters derived from this. This paper explores the potential for applying auditory models for the analysis of RIRs – incorporating auditory temporal integration (and masking), auditory filterbank analysis, and loudness calculation. The purpose of this is to produce analysis results that are closely related to the sound experienced by listeners. A preliminary step for such analysis is to filter RIRs so that their power spectrum is similar to that of typical material that would be listened to in the rooms (e.g. music or speech), and this paper proposes a music filter suitable for orchestral music, derived from long term power spectra of anechoic music recordings. Dynamic loudness analysis of RIRs yields loudness decay functions that are approximately exponential, which should provide a useful analogy with conventional analysis methods applied to RIRs.
Marshall, S., Lee, D. & Cabrera, D. 2007, 'Comparison of low frequency sound insulation field measurement methods', New Zealand Acoustics, vol. 20, no. 4, pp. 23-33.
The reliable field measurement of airborne sound insulation between rooms in the very low frequency range (20 Hz – 100 Hz) presents a substantial challenge for several reasons. Sound source and microphone placement can have a strong effect on the transmission, and diffuse field conditions are usually not possible to establish in medium-sized rooms. In this study we compare three methods that have been proposed previously for transducer placement with each other, and with mass law theory. Our results show that substantially different values may be obtained from each method of measurement. Furthermore, we examine the influence of the test signal on the measurement, and find that non-linearities in the building fabric can also substantially affect the apparent sound reduction index in the very low frequency range. We discuss how measurement techniques might be refined to increase their reliability.