The issue of auditory segregation of simultaneous sound sources has been addressed in speech research but has received less attention in musical acoustics. In the perception of concurrent speech, or of speech in noise, time-frequency masking has often been used as a research tool. In this work, an extension of time-frequency masking, leading to the removal of spectro-temporal overlap between sound sources, was applied to musical instruments playing together. The perception of the original mixture was compared with the perception of the same mixture with all spectral overlap electronically removed. Experiments differed in the method of listening (headphones or a loudspeaker), the sets of instruments mixed, and the populations of participants. The main findings were: (i) in one of the experimental conditions the removal of spectro-temporal overlap was imperceptible, (ii) perception of the effect increased when removal of spectro-temporal overlap was performed in larger time-frequency regions rather than in small ones, (iii) perception of the effect decreased in loudspeaker listening. The results support both the multiple-looks hypothesis and the “glimpsing” hypothesis known from speech perception.
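The overlap-removal manipulation can be illustrated with a minimal sketch: each time-frequency bin is assigned to whichever source has the larger magnitude there and zeroed in the other (a winner-take-all binary mask), so the two spectrograms no longer overlap. The STFT parameters and the masking rule below are assumptions for illustration, not the procedure used in the experiments.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Frame the signal with a Hann window and take the DFT of each frame."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win
              for i in range(0, len(x) - n_fft + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1)

def remove_overlap(X_a, X_b):
    """Zero each time-frequency bin in the weaker of the two sources,
    removing all spectro-temporal overlap (binary, winner-take-all)."""
    mask = np.abs(X_a) >= np.abs(X_b)
    return X_a * mask, X_b * ~mask
```

After masking, no bin carries energy from both sources; resynthesis (overlap-add of the inverse DFT) would then yield the "overlap removed" mixture.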
Independent Component Analysis (ICA) can be used for single-channel audio separation if a mixed signal is transformed into the time-frequency domain and the resulting matrix of magnitude coefficients is processed by ICA. Previous works used only frequency (spectral) vectors and the Kullback-Leibler distance measure for this task. New decomposition bases are proposed: time vectors and time-frequency components. The applicability of several different measures of distance between components is analysed. An algorithm for clustering of components is presented and tested on mixes of two and three sounds. The perceptual quality of separation obtained with the proposed distance measures was evaluated by listening tests, indicating the "beta" and "correlation" measures as the most appropriate. The "Euclidean" distance is shown to be appropriate for sounds with varying amplitudes. The perceptual effect of the amount of variance retained was also evaluated.
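One of the distance measures named above, "correlation", and a component-clustering step can be sketched as follows. This is a hedged illustration of the general idea, not the authors' algorithm: the greedy merge rule and the threshold value are assumptions.

```python
import numpy as np

def correlation_distance(u, v):
    """1 - Pearson correlation between two component vectors
    (0 = identical up to scale, ~1 = uncorrelated)."""
    u = u - u.mean()
    v = v - v.mean()
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def cluster_components(components, threshold=0.5):
    """Greedy clustering: merge each component into the first cluster whose
    representative lies within `threshold`, otherwise start a new cluster."""
    clusters = []
    for c in components:
        for cl in clusters:
            if correlation_distance(cl[0], c) < threshold:
                cl.append(c)
                break
        else:
            clusters.append([c])
    return clusters
```

Components grouped into the same cluster would then be summed and inverted back to the time domain to reconstruct one of the separated sounds.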
Whenever the recording engineer uses stereo microphone techniques, he/she has to consider, besides other acoustic factors, the recording angle resulting from the positioning of the microphones relative to the sound sources. The recording angle, the width of the captured acoustic scene and the properties of a particular microphone technique are closely related. We propose a decision-supporting method based on mapping the actual position of a sound source to its position in the reproduced acoustic scene. This research resulted in a set of localisation curves characterising the four most popular stereo microphone techniques. The curves were obtained by two methods: calculation, based on appropriate engineering formulae, and an experiment consisting of the recording of sources and the estimation of their perceived positions in listening tests. The analysis of the curves leads to several conclusions important in recording practice.
In virtual acoustics or artificial reverberation, impulse responses can be split so that the direct and reflected components of the sound field are reproduced via separate loudspeakers. The authors previously investigated the perceptual effect of angular separation of those components in the commonly used 5.0 and 7.0 multichannel systems, with one and three sound sources respectively (Kleczkowski et al., 2015, J. Audio Eng. Soc. 63, 428-443). In that work, each of the front channels of the 7.0 system was fed with only one sound source. In this work a similar experiment is reported, but with phantom sound sources between the front loudspeakers. The perceptual advantage of separation was found to be more consistent than in the condition of discrete sound sources. The results were analysed both for pooled listeners and in three groups according to experience. The advantage of separation was highest in the group of experienced listeners.
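The splitting of an impulse response into direct and reflected parts can be sketched as a cut at a time boundary shortly after the direct-sound peak; each part then drives its own loudspeaker feed. The 5 ms boundary below is an assumed value for illustration, not the one used in the experiments.

```python
import numpy as np

def split_impulse_response(ir, fs, direct_ms=5.0):
    """Split an impulse response into direct and reflected components at a
    boundary `direct_ms` milliseconds after the direct-sound peak."""
    peak = int(np.argmax(np.abs(ir)))
    boundary = peak + int(fs * direct_ms / 1000.0)
    direct = np.zeros_like(ir)
    reflected = np.zeros_like(ir)
    direct[:boundary] = ir[:boundary]
    reflected[boundary:] = ir[boundary:]
    return direct, reflected
```

By construction the two parts sum back to the original response, so convolving a dry source with each part and routing the results to different loudspeakers preserves the overall sound field while separating its components in angle.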