.. currentmodule:: brian.hears

.. index::
   pair: auditory; modelling

.. _brianhears:

Brian hears
===========

.. image:: images/brianhearslogo.png
   :align: right

Brian hears is an auditory modelling library for Python. It is part of the
neural network simulator package Brian, but can also be used on its own. To
download Brian hears, simply :ref:`download Brian`: Brian hears is included
as part of the package.

Brian hears is primarily designed for generating and manipulating sounds, and
for applying large banks of filters. We import the package by writing::

    from brian import *
    from brian.hears import *

Then, for example, to generate a tone or a white noise we would write::

    sound1 = tone(1*kHz, .1*second)
    sound2 = whitenoise(.1*second)

These sounds can then be manipulated in various ways, for example::

    sound = sound1+sound2
    sound = sound.ramp()

If you have the `pygame <http://www.pygame.org>`__ package installed, you can
also play these sounds::

    sound.play()

We can filter these sounds through a bank of 3000 gammatone filters covering
the human auditory range as follows::

    cf = erbspace(20*Hz, 20*kHz, 3000)
    fb = Gammatone(sound, cf)
    output = fb.process()

The output of this would look something like this (zoomed into one region):

.. image:: images/cochleagram.png

Alternatively, if we are interested in modelling auditory nerve fibres, we
could feed the output of this filterbank directly into a group of neurons
defined with Brian::

    # Half-wave rectification and compression [x]^(1/3)
    ihc = FunctionFilterbank(fb, lambda x: 3*clip(x, 0, Inf)**(1.0/3.0))
    # Leaky integrate-and-fire model with noise and refractoriness
    eqs = '''
    dv/dt = (I-v)/(1*ms)+0.2*xi*(2/(1*ms))**.5 : 1
    I : 1
    '''
    anf = FilterbankGroup(ihc, 'I', eqs, reset=0, threshold=1, refractory=5*ms)

This model would give output something like this:

.. image:: images/auditory-nerve-fibre-rasterplot.png

The human cochlea applies the equivalent of 3000 auditory filters, which
causes a technical problem for modellers that this package is designed to
address. At a typical sample rate, the output of 3000 filters would saturate
the computer's RAM in a few seconds. To deal with this, we use online
computation, that is, we only ever keep in memory the output of the filters
for a relatively short duration (say, the most recent 20 ms), do our
modelling with these values, and then discard them. Although this requires
that some models be rewritten for online rather than offline computation, it
allows us to easily handle models with very large numbers of channels. 3000
or 6000 channels for human monaural or binaural processing is
straightforward, and even much larger banks of filters can be used (for
example, around 30,000 in `Goodman DFM, Brette R (2010). Spike-timing-based
computation in sound localization. PLoS Comput. Biol. 6(11): e1000993.
doi:10.1371/journal.pcbi.1000993
<http://dx.doi.org/10.1371/journal.pcbi.1000993>`__). Techniques for online
computation are discussed below in the section `Online computation`_.

Brian hears consists of classes and functions for defining `sounds`_,
`filter chains`_, cochlear models, neuron models and
`head-related transfer functions`_. These classes are designed to be modular
and easily extendable. Typically, a model will consist of a chain starting
with a sound, which is plugged into a chain of filter banks, which are then
plugged into a neuron model.

The two main classes in Brian hears are :class:`Sound` and
:class:`Filterbank`, which function very similarly.
Each consists of multiple channels (typically just 1 or 2 in the case of
sounds, and many in the case of filterbanks, but in principle any number of
channels is possible for either). The difference is that a filterbank has an
input source, which can be either a sound or another filterbank.

All scripts using Brian hears should start by importing the Brian and Brian
hears packages as follows::

    from brian import *
    from brian.hears import *

.. seealso::

    Reference documentation for :ref:`brian-hears-reference`, which covers
    everything in this overview in detail, and more. List of
    :ref:`examples of using Brian hears`.

.. index::
   single: sound
   pair: sound; wav
   pair: sound; aiff

.. _sounds_overview:

Sounds
------

Sounds can be loaded from a WAV or AIFF file with the :func:`loadsound`
function (and saved with the :func:`savesound` function or the
:meth:`Sound.save` method), or by initialising a :class:`Sound` with a
filename::

    sound = loadsound('test.wav')
    sound = Sound('test.aif')
    sound.save('test.wav')

Various standard types of sounds can also be constructed, e.g. pure tones,
white noise, clicks and silence::

    sound = tone(1*kHz, 1*second)
    sound = whitenoise(1*second)
    sound = click(1*ms)
    sound = silence(1*second)

You can pass a function of time or an array to initialise a sound::

    # Equivalent to Sound.tone
    sound = Sound(lambda t: sin(50*Hz*2*pi*t), duration=1*second)

    # Equivalent to Sound.whitenoise
    sound = Sound(randn(int(1*second*44.1*kHz)), samplerate=44.1*kHz)

Multiple channel sounds can be passed as a list or tuple of filenames, arrays
or :class:`Sound` objects::

    sound = Sound(('left.wav', 'right.wav'))
    sound = Sound((randn(44100), randn(44100)), samplerate=44.1*kHz)
    sound = Sound((Sound.tone(1*kHz, 1*second), Sound.tone(2*kHz, 1*second)))

A multi-channel sound is also a numpy array of shape ``(nsamples, nchannels)``,
and can be initialised this way (or converted to a standard numpy array)::

    sound = Sound(randn(44100, 2), samplerate=44.1*kHz)
    arr = array(sound)

Sounds can be added and multiplied::

    sound = Sound.tone(1*kHz, 1*second)+0.1*Sound.whitenoise(1*second)

For more details on combining and operating on sounds, including shifting
them in time, repeating them, resampling them, ramping them, finding and
setting intensities, plotting spectrograms, etc., see :class:`Sound`.

Sounds can be played using the :func:`play` function or the
:meth:`Sound.play` method::

    play(sound)
    sound.play()

.. index::
   pair: sound; sequence

Sequences of sounds can be played as::

    play(sound1, sound2, sound3)

.. index::
   pair: sound; stereo
   single: sound; multiple channels

The number of channels in a sound can be found using the ``nchannels``
attribute, and individual channels can be extracted using the
:meth:`Sound.channel` method, or using the ``left`` and ``right`` attributes
in the case of stereo sounds::

    print sound.nchannels
    print amax(abs(sound.left-sound.channel(0)))

As an example of using this, the following swaps the channels in a stereo
sound::

    sound = Sound('test_stereo.wav')
    swappedsound = Sound((sound.right, sound.left))
    swappedsound.play()

.. index::
   pair: sound; level
   pair: sound; dB

The level of a sound can be computed and changed with the ``sound.level``
attribute. Levels are returned in dB, which is a special unit in Brian hears:
for example, ``10*dB+10`` will raise an error because ``10`` does not have
units of dB. The multiplicative gain corresponding to a value in dB can be
computed with the function ``gain(level)``. All dB values are measured as RMS
dB SPL, assuming that the values of the sound object are measured in Pascals.
Some examples::

    sound = whitenoise(100*ms)
    print sound.level
    sound.level = 60*dB
    sound.level += 10*dB
    sound *= gain(-10*dB)
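
Putting several of these operations together, here is a short sketch of
building a calibrated stimulus using only the functions shown above (the
frequencies, durations, level and filename are arbitrary choices)::

    # Build a tone-in-noise stimulus
    signal = tone(1*kHz, 200*ms)
    noise = whitenoise(200*ms)
    noise *= gain(-10*dB)          # attenuate the noise by 10 dB
    stimulus = signal+noise
    stimulus = stimulus.ramp()     # apply onset/offset ramps
    stimulus.level = 65*dB         # set the overall RMS level to 65 dB SPL
    stimulus.save('stimulus.wav')  # hypothetical output filename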

.. index::
   single: filter
   single: filter bank
   pair: cochlea; modelling

Filter chains
-------------

The standard way to set up a model based on filterbanks is to start with a
sound and then construct a chain of filterbanks that modify it. For example,
a common model of cochlear filtering is to apply a bank of gammatone filters
and then half-wave rectify and compress the output (for example, with a 1/3
power law). This can be achieved in Brian hears as follows (for 3000 channels
in the human hearing range from 20 Hz to 20 kHz)::

    cfmin, cfmax, cfN = 20*Hz, 20*kHz, 3000
    cf = erbspace(cfmin, cfmax, cfN)
    sound = Sound('test.wav')
    gfb = Gammatone(sound, cf)
    ihc = FunctionFilterbank(gfb, lambda x: clip(x, 0, Inf)**(1.0/3.0))

The :func:`erbspace` function constructs an array of centre frequencies on
the ERB scale. The ``Gammatone(source, cf)`` class creates a bank of
gammatone filters with inputs coming from ``source`` and centre frequencies
given by the array ``cf``. The ``FunctionFilterbank(source, func)`` class
creates a bank of filters that applies the given function ``func`` to the
inputs in ``source``.

Filterbanks can be added and multiplied, for example to combine a linear and
a nonlinear path::

    sum_path_fb = 0.1*linear_path_fb+0.2*nonlinear_path_fb

A filterbank must have an input with either a single channel or an equal
number of channels. In the former case, the single channel is duplicated for
each of the output channels. However, you might want to apply gammatone
filters to a stereo sound, for example, and in this case it is not clear how
to duplicate the channels, so you have to specify it explicitly. You can do
this using the :class:`Repeat`, :class:`Tile`, :class:`Join` and
:class:`Interleave` filterbanks. For example, if the input is a stereo sound
with channels LR, then you can get an output with channels LLLRRR or LRLRLR
by writing (respectively)::

    fb = Repeat(sound, 3)
    fb = Tile(sound, 3)

To combine multiple filterbanks into one, you can either join them in series
(the channels of one source followed by those of the next) or interleave
their channels, as follows::

    fb = Join(source1, source2)
    fb = Interleave(source1, source2)

For a more general (but more complicated) approach, see
:class:`RestructureFilterbank`.

Two of the most important generic filterbanks (upon which many of the others
are based) are :class:`LinearFilterbank` and :class:`FIRFilterbank`. The
former is a generic digital filter for FIR and IIR filters, while the latter
is specifically for FIR filters. FIR filters can also be implemented with
:class:`LinearFilterbank`, but the :class:`FIRFilterbank` implementation is
optimised using FFTs, which can often be hundreds of times faster,
particularly for long impulse responses. IIR filterbanks can be designed
using :class:`IIRFilterbank`, which is based on the syntax of the
``iirdesign`` scipy function.

You can change the input source to a :class:`Filterbank` by modifying its
``source`` attribute, e.g. to change the input sound of a filterbank ``fb``
you might do::

    fb.source = newsound

Note that the new source should have the same number of channels.
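
For instance, the following sketch reuses a single gammatone filterbank for
two different single-channel stimuli by swapping its ``source`` attribute
(the centre frequencies and durations are arbitrary)::

    cf = erbspace(100*Hz, 5*kHz, 500)
    fb = Gammatone(whitenoise(100*ms), cf)
    response_to_noise = fb.process()
    # Swap in a new source with the same number of channels (one)
    fb.source = tone(1*kHz, 100*ms)
    response_to_tone = fb.process()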

.. index::
   pair: filtering; control path

You can implement control paths (using the output of one filter chain path to
modify the parameters of another filter chain path) using
:class:`ControlFilterbank` (see the reference documentation for more
details). For examples of this in action, see the following:

* :ref:`example-hears_time_varying_filter1`.
* :ref:`example-hears_time_varying_filter2`.
* :ref:`example-hears_dcgc`.

Connecting with Brian
---------------------

To create spiking neuron models based on filter chains, you use the
:class:`FilterbankGroup` class. This acts exactly like a standard Brian
:class:`~brian.NeuronGroup`, except that you give a source filterbank and
choose a state variable in the target equations to receive the output of the
filterbank. A simple auditory nerve fibre model would take the inner hair
cell model from earlier and feed it into a noisy leaky integrate-and-fire
model, as follows::

    # Inner hair cell model as before
    cfmin, cfmax, cfN = 20*Hz, 20*kHz, 3000
    cf = erbspace(cfmin, cfmax, cfN)
    sound = Sound.whitenoise(100*ms)
    gfb = Gammatone(sound, cf)
    ihc = FunctionFilterbank(gfb, lambda x: 3*clip(x, 0, Inf)**(1.0/3.0))
    # Leaky integrate-and-fire model with noise and refractoriness
    eqs = '''
    dv/dt = (I-v)/(1*ms)+0.2*xi*(2/(1*ms))**.5 : 1
    I : 1
    '''
    G = FilterbankGroup(ihc, 'I', eqs, reset=0, threshold=1, refractory=5*ms)
    # Run, and raster plot of the spikes
    M = SpikeMonitor(G)
    run(sound.duration)
    raster_plot(M)
    show()

And here is the output (after 6 seconds of computation on a 2 GHz laptop):

.. image:: images/auditory-nerve-fibre-rasterplot.png

Plotting
--------

Often you want to use log-scaled axes for frequency in plots, but the
built-in matplotlib axis labelling for log-scaled axes does not work well for
frequencies. We provide two functions (:func:`log_frequency_xaxis_labels` and
:func:`log_frequency_yaxis_labels`) to automatically set useful axis labels.
For example::

    cf = erbspace(100*Hz, 10*kHz, 100)
    ...
    semilogx(cf, response)
    axis('tight')
    log_frequency_xaxis_labels()

.. index::
   pair: online; computation

Online computation
------------------

Typically in auditory modelling, we precompute the entire output of each
channel of the filterbank ("offline computation") and then work with it. This
is straightforward, but it puts a severe limit on the number of channels we
can use or the length of time we can work with (otherwise the RAM would be
quickly exhausted). Brian hears allows us to use a very large number of
channels in filterbanks, at the cost of only storing the output of the
filterbanks for a relatively short period of time ("online computation").
This requires a slight change in the way we use the output of the
filterbanks, but is actually not too difficult.

For example, suppose we wanted to compute the vector of RMS values for each
channel of the output of a filterbank. Traditionally, or if we just use the
syntax ``output = fb.process()`` in Brian hears, we have an array ``output``
of shape ``(nsamples, nchannels)``, and we could compute the vector of RMS
values as::

    rms = sqrt(mean(output**2, axis=0))

To do the same thing with online computation, we simply store a vector with
the running sum of squares and update it for each buffered segment as it is
computed. At the end of the processing, we divide the sum of squares by the
number of samples and take the square root.

The :meth:`Filterbank.process` method allows us to pass an optional function
``f(output, running)`` of two arguments. In this case,
:meth:`~Filterbank.process` will first call ``running = f(output, 0)`` for
the first buffered segment ``output``. It will then call
``running = f(output, running)`` for each subsequent segment. In other words,
it will "accumulate" the output of ``f``, passing the output of each call to
the subsequent call. To compute the vector of RMS values, we then simply
do::

    def sum_of_squares(input, running):
        return running+sum(input**2, axis=0)

    # nsamples is the total number of samples in the source
    rms = sqrt(fb.process(sum_of_squares)/nsamples)
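
The same accumulation pattern works for other channel-wise statistics. For
example, here is a sketch (not from the original examples) of computing a
running per-channel peak amplitude for the same filterbank ``fb``::

    def channel_peak(output, running):
        # running starts at 0, then holds the per-channel maximum so far
        return maximum(running, abs(output).max(axis=0))

    peaks = fb.process(channel_peak)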

If the computation you wish to perform is more complicated than can be
achieved with the :meth:`~Filterbank.process` method, you can derive a class
from :class:`Filterbank` (see that class' reference documentation for more
details on this).

.. index::
   pair: buffering; interface

Buffering interface
-------------------

The :class:`Sound`, :class:`OnlineSound` and :class:`Filterbank` classes (and
all classes derived from them) all implement the same buffering mechanism.
The purpose of this is to allow for efficient processing of multiple channels
in buffers. Rather than precomputing the application of filters to all
channels (which for large numbers of channels or long sounds would not fit in
memory), we process small chunks at a time.

The entire design of these classes is based on the idea of buffering, as
defined by the base class :class:`Bufferable` (see section
:ref:`brian-hears-class-diagram`). Each class has two methods,
``buffer_init()`` to initialise the buffer, and ``buffer_fetch(start, end)``
to fetch the portion of the buffer from samples with indices from ``start``
to ``end`` (not including ``end`` as standard for Python). The
``buffer_fetch(start, end)`` method should return a 2D array of shape
``(end-start, nchannels)`` with the buffered values.

From the user point of view, all you need to do, having set up a chain of
:class:`Sound` and :class:`Filterbank` objects, is to call
``buffer_fetch(start, end)`` repeatedly. If the output of a
:class:`Filterbank` is being plugged into a :class:`FilterbankGroup` object,
everything is handled automatically. For cases where the number of channels
is small or the length of the input source is short, you can use the
:meth:`Filterbank.fetch(duration)` method to automatically handle the
initialisation and repeated application of ``buffer_fetch``.

To extend :class:`Filterbank`, it is often sufficient just to implement the
``buffer_apply(input)`` method. See the documentation for :class:`Filterbank`
for more details.
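
As a hedged sketch of this extension mechanism (the class below is invented
for illustration, and it assumes the default :class:`Filterbank` constructor,
which takes just a source), a simple compressive nonlinearity could be
written and used as follows::

    class CubeRootCompression(Filterbank):
        # buffer_apply receives a (samples, channels) array for one buffered
        # segment and returns the transformed array of the same shape
        def buffer_apply(self, input):
            return clip(input, 0, Inf)**(1.0/3.0)

    sound = whitenoise(100*ms)
    cf = erbspace(100*Hz, 5*kHz, 300)
    fb = CubeRootCompression(Gammatone(sound, cf))
    # fetch() takes care of buffer_init() and repeated buffer_fetch() calls
    output = fb.fetch(sound.duration)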

Library
-------

Brian hears comes with a package of predefined filter classes to be used as
basic blocks by the user. All of them are implemented as filterbanks.

First, a series of standard filters widely used in audio processing are
available:

.. tabularcolumns:: |p{3cm}|p{15cm}|p{3cm}|

+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+
| Class                         | Description                                                                             | Example                                                     |
+===============================+=========================================================================================+=============================================================+
| :class:`IIRFilterbank`        | Bank of low, high, bandpass or bandstop filters of type Chebyshev, elliptic, etc.      | :ref:`example-hears_IIRfilterbank`                          |
+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+
| :class:`Butterworth`          | Bank of low, high, bandpass or bandstop Butterworth filters                            | :ref:`example-hears_butterworth`                            |
+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+
| :class:`LowPass`              | Bank of lowpass filters of order 1                                                     | :ref:`example-hears_cochleagram`                            |
+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+

Second, the library provides linear auditory filters developed to model the
middle ear transfer function and the frequency analysis of the cochlea:

.. tabularcolumns:: |p{3cm}|p{15cm}|p{3cm}|

+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+
| Class                         | Description                                                                             | Example                                                     |
+===============================+=========================================================================================+=============================================================+
| :class:`MiddleEar`            | Linear bandpass filter, based on middle-ear frequency response properties              | :ref:`example-hears-tan_carney_2003_tan_carney_simple_test` |
+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+
| :class:`Gammatone`            | Bank of IIR gammatone filters (based on Slaney implementation)                         | :ref:`example-hears_gammatone`                              |
+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+
| :class:`ApproximateGammatone` | Bank of IIR gammatone filters (based on Hohmann implementation)                        | :ref:`example-hears_approximate_gammatone`                  |
+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+
| :class:`LogGammachirp`        | Bank of IIR gammachirp filters with logarithmic sweep (based on Irino implementation)  | :ref:`example-hears_log_gammachirp`                         |
+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+
| :class:`LinearGammachirp`     | Bank of FIR chirp filters with linear sweep and gamma envelope                         | :ref:`example-hears_linear_gammachirp`                      |
+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+
| :class:`LinearGaborchirp`     | Bank of FIR chirp filters with linear sweep and Gaussian envelope                      |                                                             |
+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+
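
As a quick illustration of chaining one of these basic filters onto a
cochlear filterbank, here is a hedged sketch of an envelope-extraction stage:
half-wave rectification followed by first-order lowpass smoothing. The
``LowPass(source, fc)`` signature and the 10 Hz cutoff are assumptions to be
checked against the reference documentation::

    sound = whitenoise(500*ms)
    cf = erbspace(100*Hz, 5*kHz, 100)
    gfb = Gammatone(sound, cf)
    # Half-wave rectify, then smooth each channel to extract its envelope
    rectified = FunctionFilterbank(gfb, lambda x: clip(x, 0, Inf))
    envelope = LowPass(rectified, 10*Hz)  # assumed signature: LowPass(source, fc)
    env = envelope.process()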

Finally, Brian hears comes with a series of complex nonlinear cochlear models
developed to model nonlinear effects such as filter bandwidth level
dependency, two-tone suppression, peak position level dependency, etc.:

.. tabularcolumns:: |p{3cm}|p{15cm}|p{3cm}|

+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+
| Class                         | Description                                                                             | Example                                                     |
+===============================+=========================================================================================+=============================================================+
| :class:`DRNL`                 | Dual resonance nonlinear filter as described in Lopez-Poveda and Meddis, JASA 2001     | :ref:`example-hears_drnl`                                   |
+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+
| :class:`DCGC`                 | Compressive gammachirp auditory filter as described in Irino and Patterson, JASA 2001  | :ref:`example-hears_dcgc`                                   |
+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+
| :class:`TanCarney`            | Auditory phenomenological model as described in Tan and Carney, JASA 2003              | :ref:`example-hears-tan_carney_2003_tan_carney_simple_test` |
+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+
| :class:`ZhangSynapse`         | Model of an inner hair cell -- auditory nerve synapse (Zhang et al., JASA 2001)        | :ref:`example-hears-tan_carney_2003_tan_carney_simple_test` |
+-------------------------------+---------------------------------------------------------------------------------------+-------------------------------------------------------------+

.. index::
   single: HRTF
   pair: IRCAM; HRTF
   pair: database; HRTF

Head-related transfer functions
-------------------------------

You can work with head-related transfer functions (HRTFs) using the three
classes :class:`HRTF` (a single pair of left/right ear HRTFs),
:class:`HRTFSet` (a set of HRTFs, typically for a single individual), and
:class:`HRTFDatabase` (for working with databases of individuals). At the
moment, we have included only one HRTF database, the :class:`IRCAM_LISTEN`
public HRTF database. However, we will add support for the CIPIC and
MIT-KEMAR databases in a subsequent release. There is also one artificial
HRTF database, :class:`HeadlessDatabase`, used for generating HRTFs with
artificially introduced ITDs.

An example of loading the IRCAM database, selecting a subject and plotting
the pair of impulse responses for a particular direction::

    hrtfdb = IRCAM_LISTEN(r'F:\HRTF\IRCAM')
    hrtfset = hrtfdb.load_subject(1002)
    hrtf = hrtfset(azim=30, elev=15)
    plot(hrtf.left)
    plot(hrtf.right)
    show()

:class:`HRTFSet` has a set of coordinates, which can be accessed via the
``coordinates`` attribute, e.g.::

    print hrtfset.coordinates['azim']
    print hrtfset.coordinates['elev']

You can also generate filterbanks associated either with a single
:class:`HRTF` or with an entire :class:`HRTFSet`.
Here is an example of doing this with the IRCAM database, applying the
filterbank to some white noise and plotting the response as an image::

    # Load database
    hrtfdb = IRCAM_LISTEN(r'D:\HRTF\IRCAM')
    hrtfset = hrtfdb.load_subject(1002)
    # Select only the horizontal plane
    hrtfset = hrtfset.subset(lambda elev: elev==0)
    # Set up a filterbank
    sound = whitenoise(10*ms)
    fb = hrtfset.filterbank(sound)
    # Extract the filtered response and plot
    img = fb.process().T
    img_left = img[:img.shape[0]/2, :]
    img_right = img[img.shape[0]/2:, :]
    subplot(121)
    imshow(img_left, origin='lower', aspect='auto', extent=(0, sound.duration/ms, 0, 360))
    xlabel('Time (ms)')
    ylabel('Azimuth')
    title('Left ear')
    subplot(122)
    imshow(img_right, origin='lower', aspect='auto', extent=(0, sound.duration/ms, 0, 360))
    xlabel('Time (ms)')
    ylabel('Azimuth')
    title('Right ear')
    show()

This generates the following output:

.. image:: images/hrtfset_response_plot.png

For more details, see the reference documentation for :class:`HRTF`,
:class:`HRTFSet`, :class:`HRTFDatabase`, :class:`IRCAM_LISTEN` and
:class:`HeadlessDatabase`.
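
Finally, here is a hedged sketch of combining a single HRTF pair with the
filterbanks described earlier into a simple binaural periphery model. The
``hrtf.filterbank`` method and all parameter values used here are assumptions
to be checked against the reference documentation above::

    hrtfdb = IRCAM_LISTEN(r'D:\HRTF\IRCAM')
    hrtfset = hrtfdb.load_subject(1002)
    hrtf = hrtfset(azim=90, elev=0)      # one direction
    sound = whitenoise(100*ms)
    binaural = hrtf.filterbank(sound)    # assumed to give 2 channels (left, right)
    cf = erbspace(100*Hz, 5*kHz, 200)
    # Repeat duplicates each ear's channel len(cf) times (LL...RR...), so the
    # centre frequency array is tiled once per ear
    fb = Gammatone(Repeat(binaural, len(cf)), tile(cf, 2))
    output = fb.process()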