Yost Chapter 3: Sound transmission pp 22 - 38 [22 January 2003]

This chapter begins to introduce concepts from physics that are important for hearing. Some of it is very important to our later work, some not so important. I will try to indicate which is which.

Sound propagation: Yost makes the point that we do not "hear" vibrations (I think he may mean this in the sense that we feel vibrations with our skin); instead we hear the effect of the vibration transmitted to our ears through (usually) air: certainly, never through a vacuum. This is because the vibration is transmitted through the banging together of molecules, each pushed out of its usual place by the pressure wave and passing the disturbance on to the next molecule in the line. "Propagation," Yost says, is the same as "transfer" -- the energy is transferred from one molecule to another through their collision, and then finally to the ear through their collision with the tympanic membrane. The basic impetus to the sound wave is the rarefaction and condensation of molecules next to the vibrating object, condensation increasing the density of the gas (air) and thus increasing its pressure; but then as the vibrating object retreats a partial vacuum occurs into which the molecules return, thus lowering the density, and hence the pressure, in the area of rarefaction. Normally air is under standard "static" atmospheric pressure, which is then modified (slightly) by the repeated cycles of condensation and rarefaction, these affecting the space adjacent to them, and hence propagating the pressure wave. This is seen more or less clearly in Figure 3.1, in what Yost calls "the billiard ball model of sound transmission."

One remarkable attribute of the acoustic signal is how small it is. Changes in acoustic pressure are small modifications of the atmospheric pressure. Atmospheric pressure is about 15 pounds per square inch (actually 14.7 at sea level), which in other units is about 10^5 Newtons/meter^2, or 10^5 Pascals. In contrast a threshold sound has an RMS pressure of 2 X 10^-5 Pascals, about 10 orders of magnitude less than atmospheric pressure, and a sound wave with a pressure about 4 orders of magnitude less than atmospheric pressure would damage the ear.
[Please think about the reason why atmospheric pressure does not damage the ear, in contrast to a sound wave having about 1/10,000 as much pressure].
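To get a feel for these sizes, here is a minimal sketch that simply redoes the arithmetic with the round numbers quoted above. The function and variable names are my own, not Yost's.

```python
import math

ATMOSPHERIC_PA = 1.0e5   # approximate static atmospheric pressure (from the notes)
THRESHOLD_PA = 2.0e-5    # RMS pressure of a threshold sound (from the notes)


def orders_of_magnitude(larger, smaller):
    """Return log10 of the ratio larger/smaller."""
    return math.log10(larger / smaller)


# How many powers of ten separate atmospheric pressure from a threshold sound?
gap = orders_of_magnitude(ATMOSPHERIC_PA, THRESHOLD_PA)
print(f"threshold is about {gap:.1f} orders of magnitude below atmospheric")

# A sound pressure 4 orders of magnitude below atmospheric pressure, in Pascals.
damaging_pa = ATMOSPHERIC_PA / 1.0e4
print(f"4 orders below atmospheric is {damaging_pa:.0f} Pa")
```

The gap comes out a little under 10 because the threshold is 2 X 10^-5 rather than 1 X 10^-5 Pascals.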

The distance between successive waves in the sound wave is the wavelength (lambda), usually expressed as some fraction or multiple of a meter. The wavelength is in part a function of the frequency of the sound wave, but also it is a function of the speed of sound in that medium, which is related to its density. Thus the wavelength of a particular sinusoid is longer in water compared to air because the sound travels faster in water (by a factor of about 4 to 1), and thus the pressure variations go further in space per unit time. It is as a result of this effect of density of the medium on the speed of sound, and thus the wavelength that underwater mammals are more sensitive to high frequencies than terrestrial mammals for the same head size, as it is related to sound localization (to point ahead a little bit). Formula 3.1 captures this as Lambda = c/f, where c is the speed of sound in that medium, usually m/sec. and frequency is given in Hz. Wave length is important for hearing because it determines our ability to detect objects in space on the basis of phase differences (large wavelength) or sound shadows (small wave length).

The speed of sound at sea level is about 340 m/sec, or about 1 ft/millisec. This is a good number to remember. It means that a 1 kHz tone has a wave length of about 1 foot. (Or otherwise, a 340 Hz tone has a wavelength of about 1 meter, and a 3400 Hz tone a wave length of about 10 cm: this is important because of resonant frequencies in the ear canal; see below.)
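Formula 3.1 is easy to check by hand, but a small sketch makes the air/water comparison concrete. The water speed here is only the rough "4 to 1" ratio mentioned above, not a measured value, and the names are mine.

```python
SPEED_OF_SOUND_AIR = 340.0                      # m/sec, value used in these notes
SPEED_OF_SOUND_WATER = 4 * SPEED_OF_SOUND_AIR   # rough 4-to-1 ratio (assumption)


def wavelength(frequency_hz, speed=SPEED_OF_SOUND_AIR):
    """Formula 3.1: lambda = c / f, in meters."""
    return speed / frequency_hz


for f in (340.0, 1000.0, 3400.0):
    in_air = wavelength(f)
    in_water = wavelength(f, SPEED_OF_SOUND_WATER)
    print(f"{f:6.0f} Hz: {in_air:.2f} m in air, {in_water:.2f} m in water")
```

For the same frequency the wavelength in water is four times that in air, which is the point about underwater hearing made earlier.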

Yost next describes what he means by "pressure" vs "intensity". This is a subtle difference, and most people outside the purist inner circle of the auditory science societies don't really worry about it. Obviously the greater the amplitude of the vibration (which is its displacement) the greater is the variation in pressure from condensation at the peak to rarefaction at the trough. Figure 3.2 gives the relationship among three related variables: the momentary displacement of the vibrating object in time; the momentary velocity of the object (this is NOT the velocity of the sound wave, but the velocity of the vibrating object and thus of the air molecules as they move back and forth from their normal intermediate position); and the momentary pressure exerted on the surrounding environment. Note that velocity is 90 degrees out of phase with displacement, because when the displacement reaches its peak and then is reversed, velocity is at that point zero; in fact velocity is at its greatest as displacement passes through zero on its way to the trough. But then pressure is proportional to velocity. Pressure is related to the molecules bashing into each other: obviously if they are not moving, they are not bashing! This is given without enough explanation in equation 3.2. Instantaneous pressure p(t) [pressure at some given time] is directly proportional to velocity and inversely proportional to the area on which the object is pushing (so pressure at the end of a pointed instrument is greater than that at the end of a blunt instrument, because pressure is force/area). This is important because it illuminates the way in which the middle ear increases the pressure of the sound wave as it moves towards the inner ear. So the formula is p(t) = mv/(tAr) [or mass X velocity divided by (time X Area)]: So where did time come from?
It is that F = ma [force = mass X acceleration] and pressure = F/Area and also acceleration = distance/(time X time), and velocity = distance/time, so acceleration = velocity/time and so substituting every place one can ends up with equation 3.2. But pressure = Force/Area is good to remember.
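Written out as a chain of substitutions, the argument just given looks like this. It is only a restatement of the steps above, with A standing for the area that the notes write as "Ar".

```latex
\begin{align*}
p &= \frac{F}{A} && \text{(pressure is force per unit area)} \\
F &= m a, \qquad a = \frac{v}{t} && \text{(acceleration is velocity per unit time)} \\
p(t) &= \frac{m a}{A} = \frac{m v}{t A} && \text{(equation 3.2)}
\end{align*}
```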

Right after specifying that pressure is force/area (equation 3.3) Yost argues that if pressure is being exerted and displacement then occurs, it means that work is being done (i.e., force has been applied through some distance), and energy (E) is the ability to do work; and another concept, power (P) is the rate at which work is being done:

equations 3.4 P = E/T and thus E = PT where T is measured in seconds.

Now we get to the measure of Sound Intensity (I), which is the measure of sound power: it is proportional to the square of the average (RMS) pressure divided by the product of the density of the medium (air for example) and the speed of sound in that medium. Intensity (or acoustic power) = p^2/(ρ_0 c), where ρ_0 is the density and c is the speed of sound in that medium. So you could imagine that if you put a certain amount of power in a signal and tried to drive the resulting sound wave through water, which is dense, and in which sound travels very fast, then the amount of sound intensity would not be great, compared to putting that same amount of power into air, which is less dense and in which sound travels less rapidly. Hence, for a given amount of power in a wave travelling through air, and having a given amount of sound intensity, imagine that it hits the inner ear, which is full of salt water: the power transmitted to the new medium is going to be very much attenuated.
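A rough numerical sketch of the intensity formula shows how much the medium matters. The densities and the speed of sound in water below are round textbook-style values that I am assuming for illustration; they are not given in Yost's chapter.

```python
def intensity(p_rms, density, speed):
    """I = p^2 / (rho_0 * c), with p in Pascals, rho_0 in kg/m^3, c in m/sec."""
    return p_rms ** 2 / (density * speed)


# Assumed round values for the two media (not from the chapter).
AIR = {"density": 1.2, "speed": 340.0}
WATER = {"density": 1000.0, "speed": 1480.0}

p_rms = 1.0  # the same RMS pressure, in Pascals, in both media

i_air = intensity(p_rms, **AIR)
i_water = intensity(p_rms, **WATER)

print(f"air:   {i_air:.3e} W/m^2")
print(f"water: {i_water:.3e} W/m^2")
print(f"ratio air/water: {i_air / i_water:.0f}")
```

With these assumed values the same pressure corresponds to several thousand times less intensity in water than in air, which is the mismatch the middle ear has to overcome.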

Yost is a purist on terms (page 25, near the bottom): sound intensity is when the measurement of the sound is in energy units or power units; while "amplitude" refers to measures of pressure or displacement. We need not be that critical.

The next section on Page 25 is really important, as it introduces the term "decibel" [abbreviated dB], named for Alexander Graham Bell, the inventor of the telephone. There are two justifications for this measure of amplitude. One is that the range of hearing is immense, from intensities at threshold of 0.0002 dynes/cm^2 (in a range of about 1 kHz to 4 kHz) to intensities that are about 200,000,000,000 dynes/cm^2 -- a very big difference. However, if these numbers are converted into logarithms (base 10), then if the lowest threshold value is just 1, the highest (near pain) value is just 15: not too bad a compression of the range, and in fact perhaps just a bit too much of a compression. And so then, if we called this log (base 10) number a bel, one could imagine a unit one-tenth the size, to be called a "decibel", which has a range of numbers from 1 to 150 (and actually has some negative numbers as well, as we shall see).

The second reason has to do more with the characteristics of hearing (and any other sensory system). It is that our ability to hear a difference between one sound intensity and a second is determined in large measure by their relative intensities. Imagine for example trying to determine if one weight, which weighs just one-ounce, is different from another. Probably you could tell a difference of some fraction of an ounce. Suppose though that the initial weight weighed 10 pounds. Now it would require a difference of some appreciable fraction of a pound to tell that another weight was more or less heavy than the first, but relatively as a ratio, the differences are much the same. This law of relative sensory judgements was discovered (while lying in bed, on 22 October 1850) by Gustav Fechner (1801 -1887) as he was pondering how he could scientifically study the mind. He remembered some data published by E. H. Weber for which Weber had suggested that noticing that one stimulus was just different from another in a series of stimuli required a constant ratio of their physical intensities (or weights, etc.). To Fechner this meant that the way to connect the mental with the physical world was to make "the relative increase of bodily energy the measure of the increase of the corresponding mental intensity." [This section is taken from Boring, p 28.] Fechner, trained in medicine, then self trained in physics and mathematics, at one time a distinguished professor of physics at Leipzig, and later a religious mystic, began this pursuit with great vigor at the age of 50, and in so doing founded the field of experimental psychology.

Fechner's Law is "S = k log R", where S is the magnitude of a sensation and R is a measure of the stimulus relative to its threshold value (Reiz means stimulus in German). This in some sense then justifies taking a log of a physical unit, say, intensity, and thinking that now one has a measure of a sensation. And the decibel measure is exactly this (at least for one type of measure of the stimulus). The intensity at the threshold for hearing is taken as the base value and then all other intensities are given as a ratio: thus, if the threshold is at some particular value of intensity, say, I_r [where I_r is a reference value], then other stimulus intensities, say, I_x, might be expressed as the ratio (I_x / I_r); and then in true Fechnerian spirit we could say that this stimulus level is related to log(I_x / I_r), which would give the answer in Bels; and to turn this into decibels, the level would be 10 log(I_x / I_r).

The complication is that this formula for intensity is given in units of power; but usually the measures that we take of sounds are more conveniently pressure measures, rather than power measures. So we go back to equation 3.5 to see that I = f(p^2) (that is, intensity goes up as the square of the pressure). And that then leads to the equation:

dB = 10 log(p_x^2 / p_r^2).

And then we do the logarithm trick in which log(x^y) = y log(x) to arrive at dB = 20 log(p_x / p_r). So when the units of measurement are pressure units, the formula is 20 log(a/b), and when power units are used it is 10 log(a/b).
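The two versions of the formula can be sketched as a pair of small functions. The names are mine; the only point is that a tenfold change in pressure and the corresponding hundredfold change in power give the same number of decibels.

```python
import math


def db_from_power(i_x, i_r):
    """Level in dB when the measurements are in power (intensity) units."""
    return 10.0 * math.log10(i_x / i_r)


def db_from_pressure(p_x, p_r):
    """Level in dB when the measurements are in pressure units."""
    return 20.0 * math.log10(p_x / p_r)


# Ten times the pressure is one hundred times the power: same level either way.
print(db_from_pressure(10.0, 1.0))
print(db_from_power(100.0, 1.0))

# A stimulus below the reference gives a negative number of decibels.
print(db_from_pressure(0.5, 1.0))
```

The last line shows where the negative decibel values mentioned earlier come from: any stimulus weaker than the reference has a ratio less than 1, and the log of that is negative.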

The reference pressure (or power), following Fechner, is a threshold value. But another complication is that there are different kinds of threshold values. One is the average threshold for people with no known hearing loss in the area where they are thought to be most sensitive (or were thought to be most sensitive in the 1930s, at 1000 Hz). When that is used as the value then the measure is given in units of dB (SPL, meaning Sound Pressure Level); but some other threshold could be used. It could, for example, be that person's threshold for that particular stimulus. When this is used then the stimulus level is given in units of dB (SL, meaning Sensation Level). Or it could be the average threshold for a large group of normal hearing people for that particular stimulus. When this is used then the stimulus level is given in units of dB (HL, meaning Hearing Level). So saying "This stimulus is 30 dB" doesn't tell us enough: it must say "This stimulus is 30 dB SPL (or SL, or HL)". It must also tell us whether it was measured as an average (rms) measure, or a peak measure, or a peak-to-peak measure.
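To see why "30 dB" alone is ambiguous, here is a sketch that expresses one and the same stimulus against three different references. The stimulus pressure and the two thresholds other than the SPL reference are made-up numbers, chosen only for illustration.

```python
import math

SPL_REFERENCE_PA = 2.0e-5  # threshold pressure quoted earlier in these notes


def level_db(pressure_pa, reference_pa):
    """Level of a pressure relative to a reference pressure, in dB."""
    return 20.0 * math.log10(pressure_pa / reference_pa)


stimulus_pa = 2.0e-3             # hypothetical stimulus pressure
group_threshold_pa = 4.0e-5      # hypothetical average threshold for this stimulus
listener_threshold_pa = 6.0e-5   # hypothetical threshold of one particular listener

spl = level_db(stimulus_pa, SPL_REFERENCE_PA)
hl = level_db(stimulus_pa, group_threshold_pa)
sl = level_db(stimulus_pa, listener_threshold_pa)

print(f"{spl:.1f} dB SPL, {hl:.1f} dB HL, {sl:.1f} dB SL")
```

The physical stimulus never changes; only the reference does, and so the number of decibels does too.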

The next topic is that of interference, which happens when several sound waves propagate at the same time in more or less the same space. However the section begins not with an idea about two sound waves, but with just one: what happens when a sound wave propagates through space? Obviously, its level changes (that is, depending on the shape of the space -- here Yost is talking about 3-D space. For 1-D space its level doesn't change, at least to a first approximation, as is shown in "whispering galleries" and speaking tubes). So how and why does the level change in 3-D space? Some point source is emitting a constant energy per unit time which is being propagated in all three dimensions, that is, through spherical space. The surface of this sphere varies as a function of the radius of the sphere according to the rule Area = 4πr^2, where the radius is 'r'. To measure the intensity of the sound for any particular distance from the source we have to know the power per unit area at that distance. The power emitted by the source is constant, and that means that as the sphere increases in radius the same power must be spread over an increasing surface. The surface of the sphere for radius x is (4πx^2) and the surface of the sphere at radius y is (4πy^2). So the intensity at y, relative to the intensity at x, must be given by the ratio of the two surface areas, which is (4πx^2 / 4πy^2) = (x^2/y^2). And then the level of the sound in dB at radius y relative to radius x would be given by the formula dB = 10 log(x^2/y^2), which is 20 log(x/y). And now suppose x = 2y: then this formula would result in the expression 20 log(2); log(2) = 0.3, and so the difference in the level of the stimulus would be 6 dB. And from this comes the famous principle that when the distance from the sound source is doubled, the level of the sound is decreased by 6 dB.
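The 6 dB rule can be checked with a few lines. This is a sketch for the idealized point source in open space only, and the function name is my own.

```python
import math


def level_drop_db(near_distance, far_distance):
    """Drop in level (dB) on moving from the near to the far distance,
    assuming a point source radiating freely in all directions."""
    return 20.0 * math.log10(far_distance / near_distance)


print(f"doubling the distance: {level_drop_db(1.0, 2.0):.2f} dB")
print(f"four times the distance: {level_drop_db(1.0, 4.0):.2f} dB")
print(f"ten times the distance: {level_drop_db(1.0, 10.0):.2f} dB")
```

Only the ratio of the two distances matters, so going from 1 m to 2 m costs the same 6 dB as going from 10 m to 20 m.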

Oh well, it would be in perfect conditions. But conditions are not perfect, because typically we listen to sounds in rooms, the sounds reflect off objects, and so there is interference (as in Figure 3.4). What then? [Well really it is not interference -- it is addition and subtraction at a point -- in fact waves do not affect each other, but cross each other without interacting. If there really was interference then we would never hear anything at all, or see anything for that matter!]

One concept which is very important is that of the sound shadow, which is responsible for some of our ability to localize objects in space. When a sound runs into an object some of the sound will go around the object and some will bounce backwards. In general high frequencies (short wave lengths) bounce back, and low frequencies (long wave lengths) go around. A short wave length is short relative to the size of the object: if the object's width is more than one-half the wavelength then the wave will bounce back. That means that on the other side of the object there is no sound at those high frequencies, and the shadow is on the order of 2 wave lengths in length. This is critical for the spatial location of high frequency (short wave length) tones.
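The half-wavelength rule of thumb can be turned into a rough frequency estimate. This is only a sketch of the rule as stated above, and the head width of 18 cm is my own assumed figure, not one from the chapter.

```python
SPEED_OF_SOUND_AIR = 340.0  # m/sec, value used in these notes


def shadow_cutoff_hz(object_width_m, speed=SPEED_OF_SOUND_AIR):
    """Lowest frequency at which the object is wider than half a wavelength.

    width > lambda / 2  means  lambda < 2 * width  means  f > c / (2 * width).
    """
    return speed / (2.0 * object_width_m)


head_width_m = 0.18  # assumed width of a human head, for illustration only
print(f"shadow expected above roughly {shadow_cutoff_hz(head_width_m):.0f} Hz")
```

By this rule a smaller object (or head) casts a shadow only at higher frequencies, since the cutoff rises as the width shrinks.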

The "Sound field" is another important concept. When continuous sounds bounce off walls they may add or subtract from each other, and thus they have points at which the sound is especially loud or especially soft. These stationary patterns of amplitude are called standing waves. The nodes of a standing wave are places where the sound pressure is at zero and antinodes are places where the sound pressure is high. The actual places of the nodes and antinodes depend on the frequency. Standing waves exist in stringed instruments and in pipes. In a pipe open at one end and closed at the other, the antinode is at the closed end and the node is at the open end for a wave length that is 4 times the length of the pipe; the corresponding frequency is then a resonant frequency which is amplified in the pipe. The outer ear can be modeled as a pipe that is about 3 cm long (a little over an inch, on average), and so the resonant frequency has a wave length of about 12 cm, and thus a frequency of roughly 3 kHz. The pipe is not rigid and so the peak is quite wide, but for frequencies around 3 kHz a pressure peak is always at the tympanic membrane. This is important in determining the shape of the human audiogram.
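The ear canal estimate is a one-line calculation, sketched here for the simple pipe model only (a rigid tube closed at one end, which the real ear canal is not). The names are mine.

```python
SPEED_OF_SOUND_AIR = 340.0  # m/sec, value used in these notes


def quarter_wave_resonance_hz(pipe_length_m, speed=SPEED_OF_SOUND_AIR):
    """Lowest resonance of a pipe closed at one end: wavelength = 4 * length."""
    return speed / (4.0 * pipe_length_m)


for length_cm in (2.5, 3.0):
    f = quarter_wave_resonance_hz(length_cm / 100.0)
    print(f"{length_cm} cm pipe: about {f:.0f} Hz")
```

With the 340 m/sec figure a 3 cm pipe comes out near 2.8 kHz and a 2.5 cm pipe at 3.4 kHz, so the estimate lands in the neighborhood of 3 kHz for any plausible canal length.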