
File Version: 12.11.2005

Acoustic Basics and Digital Recording

  • dynamic and frequency range
    dynamic range
    - Bel (B) = unit of measure for ratios
    - dB (deciBel) = 1/10 Bel (more practical for everyday use); uses a logarithmic scale in base 10
    - ratio in dB = 20 x log10 (U1 / U0), where U0 is the reference level and U1 the level being measured
    - the ref. level can be a sound pressure level, like 0.00002 Pa; or a voltage amplitude, like 775 mV
    - dB SPL = unit of measure for Sound Pressure Level; ref. level = 0 dB SPL, defined as 0.00002 Pa
    - 1 Pa (Pascal) = unit of measure for pressure = 1 N/m² (1 Newton per square meter)
    - standard atmospheric pressure at sea level = 101 325 Pa = 1 013.25 millibar (1 millibar = 100 Pa)
    - dBu = unit of measure for the amplitude of analogue audio signals (as electrical voltage),
      with ref. level = 0 dBu, defined as 0.775 V (775 mV)
    - dBV = unit of measure for the amplitude of analogue audio signals (as electrical voltage),
      with ref. level = 0 dBV, defined as 1 V (1 000 mV)
    - threshold of hearing (or threshold of audibility) = 0 dB SPL = 0.00002 Pa (pressure deviation)
    - threshold of pain = 137.5 dB SPL = 150 Pa (pressure deviation)
    - human hearing max dynamic range: about 137.5 dB from the threshold of hearing to the threshold of pain
    - Phon = unit of measure for the perceived loudness
    - 1 Phon corresponds to 1 dB at 1 000 Hz in the dB SPL scale; it does not correspond to the dB SPL scale at any other frequency; see also the Fletcher/Munson diagram for reference
    - examples of dB calculation:
      what is the level of a 2 V electrical signal relative to the reference 0 dBu?
      20 x log10 (2 / 0.775) = 8.235 dB
      what is the level of a 2 V electrical signal relative to the reference 0 dBV?
      20 x log10 (2 / 1) = 6.021 dB
      how much difference in level is there between 0 dBV and 0 dBu?
      20 x log10 (1 / 0.775) = 2.21 dB
    - remember: the difference in level between -10 dBV and +4 dBu is [14 - 2.21 = 11.79 dB], and not 14 dB!
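    The dB examples above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper names level_db and dbv_to_dbu are made up for illustration, while the reference values are the ones listed above.

```python
import math

DBU_REF_V = 0.775      # 0 dBu reference voltage in volts
DBV_REF_V = 1.0        # 0 dBV reference voltage in volts
SPL_REF_PA = 0.00002   # 0 dB SPL reference pressure in pascals


def level_db(value, reference):
    """Level of an amplitude (voltage or pressure) relative to a reference, in dB."""
    return 20 * math.log10(value / reference)


def dbv_to_dbu(dbv):
    """Convert a dBV level to dBu by adding the fixed offset between the two references."""
    return dbv + level_db(DBV_REF_V, DBU_REF_V)


print(round(level_db(2.0, DBU_REF_V), 2))    # 2 V expressed in dBu  -> 8.23
print(round(level_db(2.0, DBV_REF_V), 2))    # 2 V expressed in dBV  -> 6.02
print(round(level_db(150, SPL_REF_PA), 1))   # 150 Pa in dB SPL      -> 137.5
print(round(4 - dbv_to_dbu(-10), 2))         # +4 dBu vs. -10 dBV    -> 11.79
```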
    frequency range
    - Hz (Hertz) = unit of measure for frequency
    - 1 Hz (Hertz) = 1 cycle per second; 1 kHz = 1 000 Hz; 1 MHz = 1 000 kHz; 1 GHz = 1 000 MHz
    - human hearing freq. range: about 16 - 20 000 Hz (for young individuals that did not suffer any trauma, did not spend too much time at the disco or at rock concerts, etc. ...)
    - freq. below 16 Hz = infrasound; freq. beyond 20 kHz = ultrasound
    - doubling of frequency in Hz = 1 octave
    - human hearing freq. range = about 10 octaves bandwidth (from 16 to 16 384 Hz there are exactly 10 octaves)
    - the freq. range in the hearing of animals can vary greatly; some examples:
      dog (60 - 45 000 Hz)
      cat (45 - 64 000 Hz)
      horse (55 - 33 500 Hz)
      mouse (1 000 - 91 000 Hz)
      bat (2 000 - 110 000 Hz)
      beluga whale (1 000 - 123 000 Hz)
      owl (200 - 12 000 Hz)
      chicken (125 - 2 000 Hz)
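
    The octave counts follow directly from the "doubling of frequency = 1 octave" rule: the number of octaves is the base-2 logarithm of the frequency ratio. A small sketch (the function name octaves_between is an illustrative choice; the ranges are taken from the list above):

```python
import math


def octaves_between(f_low_hz, f_high_hz):
    """Number of octaves (frequency doublings) between two frequencies."""
    return math.log2(f_high_hz / f_low_hz)


print(octaves_between(16, 16384))            # exactly 10 octaves -> 10.0
print(round(octaves_between(16, 20000), 2))  # nominal human hearing range
print(round(octaves_between(60, 45000), 2))  # dog, using the range listed above
```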

  • speed of sound
    - speed of sound (propagation in Earth's atmosphere) at sea level, at 20° C temperature = about 343 m/s (often rounded to 340 m/s, as in the examples below)
    - speed of sound at sea level, at 0° C temperature = about 331 m/s
    - wavelength (in m) = c / f (where "c" is the speed of sound in m/s, and "f" the frequency in Hz)
    - for example: the wavelength of 440 Hz = 340 / 440 = 0.773 m
    - frequency (in Hz) = c / L (where "c" is the speed of sound in m/s, and "L" the wavelength in m)
    - for example: the frequency of a sound with 6 m wavelength = 340 / 6 = 56.7 Hz
    - quick reference: sound travels 340 m in 1 sec; 34 m in 100 ms; 3.4 m in 10 ms; 34 cm in 1 ms
    - distance between ears = about 17 cm = max 0.5 ms delay between left and right ear for an incident wave
    - speed of sound depends on the stiffness and density of the medium; in air it is determined mostly by temperature (and slightly by humidity), not by pressure as such, so altitude affects it mainly because the air gets colder higher up
    - speed of sound propagation in some other materials at 0° C:
      air: 331 m/s
      water: 1485 m/s
      copper: 3710 m/s
      iron: 5100 m/s
      wood: 3000-4000 m/s
      glass: 5000 m/s
      hard rubber: 1500 m/s
    - sound propagation in free field for a sound source that is not moving is spherical
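
    The wavelength, frequency and travel-time figures above can be checked with the following sketch. It assumes the rounded value of 340 m/s used in the examples; the function names are illustrative only.

```python
SPEED_OF_SOUND_M_S = 340.0  # rounded value used in the examples above


def wavelength_m(freq_hz, c=SPEED_OF_SOUND_M_S):
    """Wavelength in metres of a tone with the given frequency."""
    return c / freq_hz


def frequency_hz(length_m, c=SPEED_OF_SOUND_M_S):
    """Frequency in Hz of a sound with the given wavelength."""
    return c / length_m


def travel_time_ms(distance_m, c=SPEED_OF_SOUND_M_S):
    """Time in milliseconds that sound needs to cover a distance."""
    return 1000.0 * distance_m / c


print(round(wavelength_m(440), 3))     # -> 0.773 (m)
print(round(frequency_hz(6), 1))       # -> 56.7 (Hz)
print(round(travel_time_ms(0.17), 2))  # ear-to-ear distance -> 0.5 (ms)
print(round(travel_time_ms(34), 1))    # -> 100.0 (ms)
```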

  • sampling, sampling rate, quantization, Nyquist theorem, aliasing, oversampling
    - sampling: the information of an analogue signal is reduced from continuous time, non quantized state to discrete time, quantized state, resulting in a finite amount of digital (= numerical) information
    - this digital information can be processed in real time (by DSP systems or general purpose computer CPUs) or stored in digital form on a magnetic medium (DAT, ADAT tape, hard disk) or optical medium (CD/DVD-R) for later processing
    - at playback time the process is inverted: the digital information is used to reconstruct the original analogue signal; provided sampling rate and quantization are accurate enough, the regenerated analogue signal will be virtually identical to the original
    - ADC: analogue to digital converter; DAC: digital to analogue converter
    - typical conversion time required by a modern ADC or DAC converter with 128x oversampling: about 1 ms
    - sampling frequency or sampling rate (in Hz) = temporal accuracy: how many samples (discrete amplitude measurements) per sec. of the analogue signal are being taken
    - bandwidth depends on the sampling rate (see Nyquist Theorem below)
    - quantization (in bits) = amplitude accuracy: how accurate is the amplitude measurement of each sample
    - the dynamic range and S/N ratio depend mostly on the quantization accuracy
    - Nyquist Theorem: the sampling rate (in Hz) must be more than 2 times the highest frequency to be captured (the desired bandwidth, in Hz); in other words: to represent a given frequency it is necessary to have at least one sample per positive and one sample per negative phase of the wave cycle
    - Nyquist frequency = 1/2 the sampling rate
    - aliasing = distortion in the form of signal artifacts not present in the original signal, and not related to it in a harmonic way (this is not THD, Total Harmonic Distortion!); this happens when signals above the Nyquist frequency enter the ADC without proper "antialiasing filtering" and get "mirrored" around the Nyquist frequency itself, appearing back in the audible freq. range
    - for example, in a system with 44 100 Hz sampling rate, the Nyquist frequency is 22 050 Hz; without filtering, a signal of 30 000 Hz would be mirrored at 22 050 - (30 000 - 22 050) = 14 100 Hz
    - an antialiasing filter is generally a low pass filter with very sharp slope; as practical filters cannot be manufactured with "infinite slope", a higher sampling rate is required in order to effectively remove all frequencies above the Nyquist, without affecting the desired bandwidth
    - for example, in a typical "CD quality" recording (44 100 Hz sampling rate) the antialiasing filter must not affect signals under 20 000 Hz, but must filter all signals above 22 050 Hz to avoid aliasing, leaving just 2 050 Hz bandwidth to change from "full pass" to "full cut" operation
    - a similar (inverse) process occurs in the DAC: after DA conversion, an analogue low pass filter removes all undesired artifacts generated by the sampling process over 20 000 Hz
    - modern ADC and DAC use a different approach, for example "128x oversampling"
    - in these oversampling converters, the signal is first filtered by a much more "relaxed" analogue antialiasing filter and sampled at a much higher frequency (for example 128 x 44 100 = 5 644 800 Hz); then a digital low pass filter is used (which is easier and cheaper to implement than a similar analogue antialiasing filter, and can also be very accurate); finally the signal is downsampled to the desired rate (for example 44 100 Hz) before further processing
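
    The "mirroring" around the Nyquist frequency can be sketched as a small function. This is a simplified model for a single sine tone entering an ideal sampler with no antialiasing filter; the name alias_frequency is an illustrative choice.

```python
def alias_frequency(f_signal_hz, sample_rate_hz):
    """Frequency at which an unfiltered tone appears after sampling.

    Tones below the Nyquist frequency come out unchanged; tones above it
    are folded back into the range 0 .. sample_rate / 2.
    """
    f = f_signal_hz % sample_rate_hz
    return min(f, sample_rate_hz - f)


print(alias_frequency(30000, 44100))  # example from the text -> 14100
print(alias_frequency(10000, 44100))  # below Nyquist, unchanged -> 10000
print(128 * 44100)                    # 128x oversampling rate -> 5644800
```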

  • digital recording - dynamic and frequency range
    - 44.1 kHz digital recording freq. range allows for a 20 - 20 000 Hz bandwidth (however, usually also freq. under 20 Hz can be recorded and reproduced, depending on the converters)
    - digital recording dynamic range and signal to noise ratio (simplified formula):
      S/N ratio in dB = 6 N + 1.8 (where N is the number of quantization bits)
    - 16-bit digital recording max. dyn. range: (6 x 16) + 1.8 = 97.8 dB
    - 24-bit digital recording max. dyn. range (in the digital domain): (6 x 24) + 1.8 = 145.8 dB
    - but: typical dyn. range of good quality 24-bit ADC/DAC units is just about 110-120 dB
    - this means: the 24-bit signal theoretical dynamic range (145.8 dB) can only be achieved in the digital domain; actual recordings will only manage up to 110-120 dB, which is the dynamic range offered by most good quality 24-bit ADC/DAC units
    - AFAIK there is just one ADC on the market which is capable of more than 144 dB true dynamic range: the StageTec "TrueMatch" converter (with up to 28-bit precision); an 8-channel unit costs about 12 000 EUR, so it is not exactly a "budget" solution ...
    - dB fs = dB "full scale" = unit of measure for the amplitude of digital audio signals
    - the reference level is "0 dB fs", which is also the maximum signal amplitude that can be stored digitally in a typical digital audio recording system (for example: DAT, ADAT, DTRS; also 16 or 24-bit WAV/AIFF files)
    - signals louder than "0 dB fs" just produce "clipping" (= truncation of the waveform, hence distortion), except in a 32-bit float system (for example: the audio engine of HDR systems such as Cubase SX, Nuendo and Logic)
    - in a digital audio recording system most signal levels are defined as a negative dB fs amount; for example, -6 dB fs = 6 dB quieter than a "full scale" signal (at 0 dB), or 6 dB from clipping
    - even in systems supporting 32-bit float resolution (where theoretically the full 24-bit resolution of the signal is maintained throughout the signal path), the output should never exceed 0 dB fs before DA conversion, or clipping occurs in the converter, causing distortion
    - to avoid clipping of the DA stage in an HDR system that internally supports 32-bit float resolution, it is usually enough to reduce the level of the master output fader, unless clipping already occurs earlier in some plugin that does not support the 32-bit float format
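
    A short sketch of the simplified dynamic range formula and of the dB fs scale described above. It assumes samples normalised so that full scale = 1.0; the function names are illustrative and not taken from any particular audio library.

```python
import math


def dynamic_range_db(bits):
    """Simplified formula from the text: S/N ratio in dB = 6 N + 1.8."""
    return 6 * bits + 1.8


def dbfs(sample, full_scale=1.0):
    """Level of a sample amplitude relative to full scale, in dB fs."""
    if sample == 0:
        return -math.inf  # digital silence
    return 20 * math.log10(abs(sample) / full_scale)


def hard_clip(sample, full_scale=1.0):
    """What a fixed-point stage or a DAC does to samples beyond 0 dB fs."""
    return max(-full_scale, min(full_scale, sample))


print(round(dynamic_range_db(16), 1))  # -> 97.8
print(round(dynamic_range_db(24), 1))  # -> 145.8
print(round(dbfs(0.5), 2))             # half of full scale -> -6.02
print(hard_clip(1.5))                  # over full scale, truncated -> 1.0
```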

  • relation between sound power (intensity), sound pressure level and loudness
    - sound power or intensity is a measure of the sound energy that passes through a given area each second
    - energy per second is measured in Watt (1 W = 1 Joule per second)
    - intensity is related to the sound pressure amplitude: specifically the energy in a wave is proportional to the square of the pressure amplitude
    - formula: I ~ P² (I is proportional to P²), where "I" is the sound power (intensity), and "P" the sound pressure amplitude
    - examples:
      double sound pressure = 4 times the power (or intensity)
      1/2 sound pressure = 1/4 the power (or intensity)
    - the formula to translate sound power (intensity) (in W) to dB is:
      Li = 10 x log10 (I1 / I0), where I0 is the reference intensity and I1 the intensity being measured
    - while the formula to translate sound pressure (in Pa) to dB is:
      Lp = 20 x log10 (U1 / U0), where U0 is the reference level and U1 the level being measured
    - therefore, a doubling of sound power (intensity) corresponds to +3.01 dB [10 x log10 (2 / 1) = 3.01 dB] ...
    - ... while a doubling of sound pressure (in Pa) corresponds to +6.02 dB [20 x log10 (2 / 1) = 6.02 dB]
    - perceived loudness doubles about every 10 Phon (= every 10 dB at 1 000 Hz)
    - however: perceived loudness is frequency dependent, and varies quite a lot between individuals (see Fletcher/Munson diagram!)
    - examples:
      10 times the power (in W) = +10 dB SPL = 3.162 times the sound pressure in Pa, but is just perceived as "twice as loud"
      100 times the power (in W) = +20 dB SPL = 10 times the sound pressure in Pa, but is just perceived as "4 times as loud"
      1000 times the power (in W) = +30 dB SPL = 31.62 times the sound pressure in Pa, but is just perceived as "8 times as loud"
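
    The relations between power, pressure and perceived loudness can be put side by side in a short sketch. The loudness function only encodes the rule of thumb "twice as loud every 10 dB" from the list above, so it is an approximation valid around 1 000 Hz; all function names are illustrative.

```python
import math


def power_ratio_to_db(ratio):
    """dB change for a given ratio of sound power (intensity)."""
    return 10 * math.log10(ratio)


def pressure_ratio_to_db(ratio):
    """dB change for a given ratio of sound pressure."""
    return 20 * math.log10(ratio)


def db_to_pressure_ratio(db):
    """Sound pressure ratio that corresponds to a dB change."""
    return 10 ** (db / 20)


def loudness_factor(db_change):
    """Rule of thumb: perceived loudness doubles about every 10 dB."""
    return 2 ** (db_change / 10)


print(round(power_ratio_to_db(2), 2))     # double power    -> 3.01
print(round(pressure_ratio_to_db(2), 2))  # double pressure -> 6.02
for power_ratio in (10, 100, 1000):
    db = power_ratio_to_db(power_ratio)
    print(power_ratio, round(db, 1), round(db_to_pressure_ratio(db), 2),
          round(loudness_factor(db), 1))
```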

  • relation between sound pressure level, sound power (intensity) and distance
    - in a free field, the sound pressure is inversely proportional to the distance from the sound source
    - formula: p ~ 1/d (p = sound pressure, d = distance)
    - examples:
      double distance = 1/2 sound pressure (in Pa) = -6.02 dB
      4 times the distance = 1/4 sound pressure (in Pa) = -12.04 dB
      1/2 the distance = double sound pressure (in Pa) = +6.02 dB
    - in a free field, the sound intensity is inversely proportional to the square of the distance from the sound source
    - formula: i ~ 1/d² (i = intensity, d = distance)
    - think of it like this: the sound waves carry energy; when the distance is doubled, this energy is spread over an area that is 4 times as large
    - examples:
      double distance = 1/4 intensity = -6.02 dB (remember: 1/2 intensity is only -3 dB!)
      4 times the distance = 1/16 intensity = -12.04 dB
      1/2 the distance = 4 times the intensity = +6.02 dB

    - example: a loudspeaker has an efficiency of 90 dB SPL /W at 1 m distance; what sound pressure level will it produce at 32 m distance, in a free field (an ideal open space with no boundaries and no reflections)?
    p ~ 1/d = 1/32; to calculate the ratio in dB: 20 x log10 (1/32) = -30.10 dB
    90 - 30.10 = 59.90 dB. So at 32 m distance this loudspeaker just produces about 60 dB SPL
    - how much more power is required to still produce 90 dB SPL at 32 m distance?
    For each additional 10 dB SPL we need 10 times the power (in W); for +30 dB SPL we need 1 000 times the power (exactly 32² = 1 024 times for +30.10 dB)! If this loudspeaker needed 1 W to produce 90 dB SPL at 1 m, we are going to need about 1 000 W to produce 120 dB SPL at 1 m, corresponding to 90 dB SPL at 32 m distance.
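
    The loudspeaker example can be reproduced as follows. This is a sketch for an ideal free field only (no reflections, point source); the function names are illustrative.

```python
import math


def spl_at_distance(spl_ref_db, d_ref_m, d_m):
    """Free-field SPL at distance d_m, given the SPL measured at d_ref_m."""
    return spl_ref_db - 20 * math.log10(d_m / d_ref_m)


def power_factor_for_gain(db_gain):
    """How many times more power is needed to raise the level by db_gain dB."""
    return 10 ** (db_gain / 10)


spl_32m = spl_at_distance(90, 1, 32)
print(round(spl_32m, 2))                              # -> 59.9 (dB SPL)
print(round(power_factor_for_gain(90 - spl_32m)))     # -> 1024
print(round(spl_at_distance(90, 1, 2), 2))            # double distance -> 83.98
```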

  • non-linearity of the human ear's sensitivity
    - Phon = unit of measure for the perceived loudness
    - only at 1000 Hz 1 dB SPL = 1 Phon
    - our hearing response shows reduced sensitivity in the low and in the very high frequency ranges, and increased sensitivity in the range between 2 and 4 kHz (which is very important for speech recognition)
    - our subjective hearing response for a given constant dB SPL varies more than 40 Phon depending on the frequency!
    - even the difference in dB required to give the impression of "twice as loud" is not constant: at 1000 Hz, about 10 dB are required to give this impression, but at very low frequencies variations of just 6 dB can produce the same effect
    - this non-linearity is more pronounced at very low listening levels: this is why home stereos often have a "loudness" button, which boosts low and high frequencies to compensate for the non-linearity of the ear when listening at very low levels
    - that's why it is important to work at specific levels when mixing and mastering, usually around 80-85 dB SPL: at this level the ear works more linearly (or less non-linearly ...)
    - it is also good to sometimes switch from the standard listening level to very soft, or very loud levels; or to listen through a door or from a staircase, to check if the most important elements in a mix are still clearly audible ...

  • hard disk usage
    to calculate HD usage recording at different resolutions (stereo, multitrack, 16 and 24 bit):
    - HD usage in Bytes/min = [Bytes of quantization per channel] x [n. of audio channels] x [sampling freq. in Hz] x 60 sec
    - "CD quality" recording (16-bit stereo, 44.1 kHz): 2 Bytes x 2 Ch. x 44 100 x 60 = 10 584 000 Bytes/min = 10.094 MB/min
    - 24-bit recording, 8 track, 96 kHz: 3 Bytes x 8 Ch. x 96 000 x 60 = 138 240 000 Bytes/min = 131.836 MB/min
    - remember: for HD manufacturers, 10 584 000 Bytes equal 10.58 MB, but for your Operating System this is just 10.094 MB
    - 1 KB = 1024 (and not 1000) Bytes; 1 MB = 1024 KB; 1 GB = 1024 MB; 1 TB = 1024 GB; etc.
    - disk manufacturers want to advertise larger disk capacities, so they sell a hard disk as a 120 GB part, while the effective capacity as seen by the OS and programs is only about 111.8 GB ...
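
    The hard disk usage formula translates directly into code. A minimal sketch, assuming uncompressed PCM audio with whole bytes per sample; the names bytes_per_minute, MB and GB are illustrative, with MB and GB counted in powers of 1024 as in the text.

```python
MB = 1024 * 1024         # megabyte as counted by the operating system
GB = 1024 * 1024 * 1024  # gigabyte as counted by the operating system


def bytes_per_minute(bits, channels, sample_rate_hz):
    """Disk usage of uncompressed PCM audio for one minute of recording."""
    bytes_per_sample = bits // 8
    return bytes_per_sample * channels * sample_rate_hz * 60


cd_quality = bytes_per_minute(16, 2, 44100)
multitrack = bytes_per_minute(24, 8, 96000)

print(cd_quality, round(cd_quality / MB, 3))  # -> 10584000 10.094
print(multitrack, round(multitrack / MB, 3))  # -> 138240000 131.836
print(round(120e9 / GB, 2))                   # "120 GB" disk as seen by the OS -> 111.76
```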

Microphones and Recording Techniques

  • microphone types
    - microphones can be classified according to several criteria, such as:
    a) transducer type (condenser, dynamic, ribbon)
    b) membrane size (small, large) and position (front or side address, PZM)
    c) polar pattern (omni, cardioid, figure-of-eight, variable polar pattern)

    - proximity effect: when a microphone is positioned very close to the sound source, the bass frequencies tend to be unnaturally boosted; large membrane microphones suffer less from the "proximity effect": this is why they are ideal for close vocal or instrumental recording; other types of microphone might have a "low cut" filter switch to reduce the sensitivity in the low range, or optimized freq. response for close miking (like the SM 57/58)
    - frequency response: to obtain the most natural recorded sound, the frequency response of a microphone should be as linear as possible through the entire hearing range (20 to 20 000 Hz); large membrane mics do not necessarily have better freq. response in the bass; in fact, some of the most linear mics have very small membranes (for example, DPA 4006 with B&K capsules); generally, omni condenser microphones have the most linear frequency response, while cardioid dynamic microphones the least (this is due to the construction principle)
    - impulse response: the capacity to react to instant changes in amplitude (for example, like in the fast "transients" generated by percussion or plucked instruments)
    - comb filtering: microphones should never be placed in proximity of large reflecting surfaces (walls, floor, ceiling), as the reflected wave front will reach the microphone with just a small delay compared to the wave front coming directly from the sound source; this produces interference, audible as a very unnatural form of "coloration" (comb filter effect)
    - exception: boundary layer mics (Grenzflächenmikrofone), or PZM (Pressure Zone Microphone) are designed to work best when placed directly on surfaces; they use the boost in level in proximity of a surface to optimize sensitivity, and because they just cannot get the reflected wave from the surface (as they are placed on the surface itself), they sound very natural and uncolored (no comb filter effect)
    - high headroom: the capacity to record with low distortion at very high input levels; higher headroom is better
    - sensitivity: the ratio between the output signal level and the sound pressure at the input; higher sensitivity is better
    - S/N ratio: ratio (difference in dB) between signal and noise; lower noise floor (= higher S/N ratio) is better
    - THD: Total Harmonic Distortion
    - THD+N: Total Harmonic Distortion + Noise

  • condenser (or capacitor) microphones
    Transducer Principle
    - the microphone diaphragm acts as one plate of a capacitor; the plates are biased with a fixed charge (typically 40-200 V) through a large resistance (>500 M Ohm); the diaphragm vibrations produce changes in the distance between the plates, which in turn change the "capacitance" and accordingly produce changes in voltage across the resistance; this signal cannot be used directly, due to the very high output impedance: for this purpose a preamp transforms the impedance to about 200 Ohm and amplifies the audio signal
    - condenser microphones require phantom power for the capsule bias charge, as well as for the preamp; exception: in "electret" type microphones the plates are permanently charged, so the phantom power is only required to operate the preamp
    Characteristics
    - very linear frequency response (especially the omni types), both in the very low and very high range
    - very accurate impulse response (reacts well to fast transients, like in percussive sounds)
    - usually better S/N ratio than dynamic mics
    - low distortion
    - overall: more transparent, detailed sound than dynamic mics, especially in the high range
    - high sensitivity, sometimes lower headroom
    - require phantom power, except for models with an internal battery ("electret condenser" types need it only for the preamp)
    - double membrane condenser mics are very flexible, as you can switch the polar characteristic between omni, cardioid and figure-of-eight; in some types, you can seamlessly blend through the different characteristics, with virtually infinite variations in polar pattern response
    - problematic for live use, as they are very delicate - also: easier to get feedback because of wide frequency response, high sensitivity, etc.
    - nevertheless: should be used live for instruments with very strong energy in the high freq. range, for example cymbals ("drums overhead" configuration), tambourine, triangle, shaker, cabasa, etc.
    - examples of large membrane condenser: Neumann U87, U89, M147, M149, TLM103; Audio Technica AT4050, AT3035; AKG C414; RODE NT 1000; Brauner VLM1, Brauner Phantom, etc.
    - examples of small membrane condenser: Neumann KM 183/184/185; AKG C480, C391B; RODE NT5, NT3; DPA 4006 B&K, etc.


  • dynamic microphones
    Transducer Principle
    - the microphone diaphragm is connected to a ring shaped induction coil, which is positioned around a permanent magnet; when the diaphragm vibrates, the coil moves in the magnetic field, which induces a varying voltage in the coil; the coil already has an impedance of about 200 Ohm, so its output can be used directly as audio signal; the principle is similar to the loudspeaker, only reversed
    Characteristics
    - usually the response in the very high frequency range is not very good (roll off might start around 15 kHz or earlier)
    - can be sometimes "rolled off" in the low range to reduce "proximity effect" (undesired boost of the low frequencies), especially on vocal mics like the SM58
    - not very accurate impulse response (do not react well to fast transients, due to the high mass of membrane+coil)
    - generally worse S/N ratio than condenser mics
    - overall: less transparent and detailed sound than condenser mics
    - OTOH: rounder, softer sound than condenser, therefore good for harsh signals (for example, a distorted e-guitar cabinet)
    - lower sensitivity than condenser mics, but often higher headroom (can be used in direct proximity for very loud instruments, like close miking of drums, without clipping or being damaged)
    - do not require phantom power, or batteries - the principle is the same as an "inverted loudspeaker"
    - very robust, easy to use in live P.A. situations (hard to get feedback)
    - examples of dynamic microphones: Shure SM57, SM58 (for vocals); AKG D300, D112 (for bass drum), Sennheiser MD421, E865, etc.

  • microphone polar pattern (= richtcharakteristik)
    Patterns
    - omnidirectional (= kugel): theoretically, omni pattern microphones have the same sensitivity from sound coming from all directions, over the complete freq. range; this is not always true: some omnis (for example, the DPA 4006) are more sensitive to high freq. for sound coming on-axis (0°), and could therefore be defined as being "mildly directional" in the high range; omnis usually have a very flat response throughout the frequency spectrum
    - mildly directional: the wide cardioid or sub-cardioid (= breite niere) types are a middle-stage between omni and cardioid
    - directional: the cardioid (= niere) types have maximum sensitivity on-axis (0°), -6 dB from the sides (90° and 270°) and minimum sensitivity (theoretically -oo dB) from the back (180°); this is one of the most common capsule types
    - strongly directional: the supercardioid (= superniere) and hypercardioid (= hyperniere) types have more attenuation from the sides than the cardioid type, but react also moderately to signals from the back (180°), however with inverted phase; they are often used together with video cameras (extreme example: a "shotgun" microphone, that can be used to interview somebody meters away in the middle of a noisy crowd)
    - bidirectional: the figure-of-eight (= 8-charakteristik) types have two symmetrical sensitivity lobes, with max sensitivity on-axis (0°) and from the back (180°), minimum sensitivity (theoretically -oo dB) from the sides (90° and 270°); note: the back lobe response is inverted in phase; example of usage: a bidirectional mic (S+/- signals) is used together with a cardioid (M signal) to realise the M-S stereo recording technique
    Principles
    - omnidirectional mics are "pressure transducers": the output voltage is proportional to the variations in air pressure
    - bidirectional mics are "pressure gradient transducers": the output voltage is proportional to the difference between the variations in pressure from the front and from the back
    - directional mics are conceptually a superposition of an "omni" and a "figure of eight" polar diagram: when sounds come from the back (180°), the negative phase response of the "figure of eight" cancels the positive response from the "omni" pattern, resulting in minimum sensitivity; when sounds come from the front (0°), the positive phase response of the "figure of eight" is added to the positive response of the "omni", resulting in maximum sensitivity; when sounds come from the side, the "figure of eight" does not contribute at all to the "omni" pattern, resulting in 6 dB less sensitivity than for sounds coming from the front
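
    The "omni + figure of eight" superposition can be written as: sensitivity = a + (1 - a) x cos(angle), where "a" is the share of the omni component. The sketch below uses this idealised, frequency-independent model; the names PATTERNS, sensitivity and sensitivity_db are illustrative choices, and real capsules deviate from it (especially at high frequencies).

```python
import math

# Share of the omni (pressure) component in each idealised pattern;
# the rest is the figure-of-eight (pressure gradient) component.
PATTERNS = {"omni": 1.0, "cardioid": 0.5, "figure-of-eight": 0.0}


def sensitivity(omni_share, angle_deg):
    """Relative sensitivity; a negative value means inverted phase."""
    return omni_share + (1.0 - omni_share) * math.cos(math.radians(angle_deg))


def sensitivity_db(omni_share, angle_deg):
    """Relative sensitivity in dB (0 dB = on-axis level of the pattern)."""
    gain = abs(sensitivity(omni_share, angle_deg))
    if gain < 1e-12:
        return -math.inf  # theoretical null of the pattern
    return 20 * math.log10(gain)


for name, share in PATTERNS.items():
    levels = [f"{sensitivity_db(share, angle):.2f}" for angle in (0, 90, 180)]
    print(name, levels)  # levels at 0, 90 and 180 degrees
```

    For the cardioid this gives 0 dB on-axis, about -6 dB from the side and a null from the back, as described above.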

  • stereo recording techniques overview
    - Interaural Time Difference (ITD) stereophony (= laufzeitstereophonie): for example, A-B; the principle is that signals coming from one side will hit one microphone before the other, so there is a time delay between L and R channel;
    - Interaural Amplitude (or Level) Difference (IAD, ILD) stereophony (= intensitätsstereophonie): for example, X-Y; the principle is that signals coming from one side are louder in one of the two microphones;
    - combinations of the ITD and IAD principles (for example in ORTF, OSS);
    - our hearing system works in a similar way to OSS (see below for details); the max ITD between our ears is about 0.5 ms (calculated for 17 cm distance).

  • A-B: Interaural Time Difference (ITD) stereophony
    - wide stereo image but poor localization;
    - setup: two cardioid or omni mics; typical distance: 40 to 80 cm; angle: typically parallel (= 0°) for cardioids;
    - the L-R signals are not coherent, therefore not mono-compatible (for mono, use just one channel in this case);
    - a "hole in the middle" might occur when the mics are very far apart (2-3 m), but this can be fine when recording ambience as A-B.

  • X-Y and M/S: Interaural Amplitude Difference (IAD) stereophony
    - relatively narrow stereo image but excellent localization;
    - setup for X-Y: two cardioid mics; distance: 0 cm; angle: 60° to 120° (typical 90°);
    - setup for M/S (Mid/Side): one cardioid mic for the Mid signal, one figure-of-eight mic for the Side signal (therefore, the membranes are 90° to each other);
    - remember: to "decode" a M/S group you use three mixer channels: one for the Mid signal, panned center; one for the S+ signal, panned hard Left; one for the S- signal, panned hard Right (= again the Side signal, reversed in phase); this might not be possible on cheap mixers with no phase polarity switch;
    - adjusting the balance between Mid and Side, you can blend between mono and full stereo signal; you have a greater degree of control than with X-Y;
    - with X-Y, the L-R signals are correlated (same phase, as the microphones are coincident), therefore you have excellent mono-compatibility;
    - with M/S, the S+ and S- signals are opposite in phase and erase each other when the final L-R stereo (decoded) channels are mixed together, leaving out only the Mid signal; so this is the most mono-compatible system you can use, excellent for radio and TV recordings.
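
    The three-channel M/S decoding described above amounts to L = M + S and R = M - S. The following sketch works on plain lists of sample values; the function names and the side_gain parameter (standing in for the Side fader) are illustrative assumptions, not part of any specific mixer or DAW.

```python
def ms_decode(mid, side, side_gain=1.0):
    """Decode Mid/Side signals to Left/Right: L = M + S, R = M - S."""
    left = [m + side_gain * s for m, s in zip(mid, side)]
    right = [m - side_gain * s for m, s in zip(mid, side)]
    return left, right


def mono_sum(left, right):
    """Sum of the decoded channels, as in a mono playback of L + R."""
    return [l + r for l, r in zip(left, right)]


mid = [0.5, 0.25, -0.5]       # a few example sample values
side = [0.25, -0.125, 0.125]

left, right = ms_decode(mid, side)
print(left)                   # -> [0.75, 0.125, -0.375]
print(right)                  # -> [0.25, 0.375, -0.625]
print(mono_sum(left, right))  # Side cancels, only 2 x Mid remains -> [1.0, 0.5, -1.0]
```

    With side_gain = 0 both channels carry only the Mid signal (mono); raising side_gain widens the stereo image.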

  • ORTF, OSS (Jecklin-Disk): combination of ITD and IAD stereophony
    - setup for ORTF: two cardioid mics; standard distance: 17 cm (it also works with distances between 15 and 25 cm); standard angle: 110° (it also works with angles between 60° and 120°)
    - note: to maintain a similar recording stereo base, distance and angle between capsules should be adjusted in opposite directions! smaller distance, greater angle; greater distance, smaller angle
    - setup for OSS (Optimal Stereo Signal): two omni mics; distance: about 17-20 cm; angle: 60° - the "Jecklin Disk" is placed between the mics to dampen mid/high freq. from the sides and create the required interaural amplitude differences (we would otherwise have an ordinary A-B setup);
    - balanced stereo image, good localization;
    - nice deep and spatial sound (OSS), due to omni mic type;
    - although the principle might appear to be similar, OSS should not be mixed up with recordings done with a "dummy-head", which are only compatible for reproduction over headphones!
    - the L-R signals are still quite correlated, due to small ITDs between the mics, therefore still acceptable mono compatibility; with greater distance between the capsules, stronger coloration (comb filter effect) might occur.

Mixer - Signal Routing

  • setting the input level
    - make sure you use the proper microphone or line switch at the input;
    - remember to activate phantom power for condenser mics;
    - when possible, use balanced cables (symmetrische kabel) to minimize undesired noises (especially ground hum);
    - set the channel fader to 0 dB (= unity) and use trim/gain to set the channel input level, making sure that the signal neither clips (= you get distortion) nor is too quiet (= you get undesired noise);
    - you might want to press the "solo" switch with PFL (Pre Fader Listening) to adjust level using the master meters on a mixer that does not have separate meters for each input;
    - for line signals, start from complete left (counter-clockwise) and move right (clockwise) until the desired level is reached;
    - for microphone signals, you might try to start with the knob in the middle position (12 o'clock) and then adjust left or right accordingly;
    - remember: usually condenser mics have higher sensitivity than dynamic ones;
    - do not use the channel fader (= channel output) or even worse the bus fader to set levels! Distortion or noise that happens at the input stage cannot be eliminated later in the path.

  • pan and balance
    - difference between panorama (on mono signals) and balance (on stereo signals):
    - panorama positions a mono signal across the stereo field;
    - balance just adjusts the relative volume between the L and R signals, that are always panned "hard L" and "hard R";
    - to adjust the stereo-width of a stereo signal, you have to route it to two separate mono inputs and use the pan controls (more below).

  • adding effects: when to use insert and when aux send/return
    - generally, EQs, filters and dynamics (compressor, gate, etc.) are used as channel insert (if stereo, also as master insert), to use specific settings for each track;
    - delays and reverbs are usually used as aux send/return, to spare processing power and/or to make the same effect available on more channels;
    - modulation effects (chorus, flanger, phaser, etc.) and distortion types can be used in both ways; if they are used as insert, you use the "mix" parameter to adjust the amount of dry and effect signal; otherwise you use the "aux send" to add effected signal to the dry one; note: chorus, flanger etc. usually work best at around 50% dry + 50% effect settings;
    - there are of course exceptions! For example, you can send to the same aux send a whole drum-kit (and even the bass), then compress the hell out of it and add the result to the original uncompressed signal - can sound terrific!
    - note: in Logic Audio, the aux sends are called "bus send" and the effects are inserted directly in the "bus returns" (these use the "bus objects" as standard, but since Logic 5 you can also replace them with the "aux objects", which are more flexible); in other words, in Logic a "bus" has the double function of "group" (like when you route the output of a number of channels to the same bus, so that they can be controlled with a single fader and you can add insert effects to the whole group) and of "aux return" for the effects;
    - important: when you use any effect as aux send/return, you should always make sure to have the "mix" to 100%, in other words to get back effect signal only!
    - especially VST plugins in programs like Logic do not "know" whether they are used as insert or aux.
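
    Why the "mix" should be at 100% on an aux effect can be shown with a little arithmetic. The sketch below is a simplified model (single sample values, no plugin latency) and the function names are made up for illustration.

```python
def insert_mix(dry, wet, mix):
    """Insert effect: the plugin's own 'mix' blends dry and effect signal."""
    return (1.0 - mix) * dry + mix * wet

def aux_mix(dry, wet, send, mix):
    """Aux send/return: the dry channel stays in the mix, the return is added."""
    effect_return = insert_mix(dry, wet, mix)  # what the plugin outputs
    return dry + send * effect_return

def dry_gain_on_aux(send, mix):
    """Total gain on the dry signal when the effect sits on an aux."""
    return 1.0 + send * (1.0 - mix)
```

    With mix = 1.0 the dry gain stays at 1.0 whatever the send level; with mix = 0.5 and send = 0.5 it becomes 1.25, so turning up the send also turns up the dry signal.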

  • using an external effect unit
    - you can of course use the aux return (as standard) to bring the signal from an external effect unit (for example a reverb) back into the mixer, but it might be more convenient to use a free pair of mono channel inputs, because:
    - you can finely adjust the stereo width with the separate pan controls on the L and R signal;
    - you can change the color of the effect sound with the EQ (works great on reverbs and delays!);
    - you can also send again the FX signal to another effect processor with another aux send (for example send to aux 1 = delay, and send again some of the delay return signal to aux 2 = reverb);
    - note: to achieve this in Logic Audio you have to use the "aux object" instead of a standard "bus object" as FX return.

  • connecting an 8-bus mixer to recording devices
    - use the 3x 8 bus outputs to connect to 3x 8 Ch. multitrack device (note: you can group and submix signals, but you can record max 8 tracks at once);
    - use the 24 direct outs if you need to record all 24 channels at once;
    - use the master output (main mix) to connect to a stereo mastering device;
    - use the Control Room out (= Regie) for the monitors of the control room;
    - use the Studio out for the speakers or headphones in the recording room.

  • in-line mixer details
    - there are effectively two inputs per channel, so a 24 Ch. mixer has 48 inputs; this saves a lot of space, but then you do not have full features on every input;
    - for example on Mackie 8-bus: you have 24 A-channels and 24 B-channels, also called "tape return" as they are normally used to listen to the playback of the multitrack tape-machines;
    - using the "flip" switch you can toggle between A and B channels, so you can for example use the main EQ on the tape return signal;
    - you can also "split" the EQ between A- and B-channels, for example use the 2 peak parametric filters on the A-channels and the two shelving filters on the B-channels.

Filters, Effects and Plugins

  • shelving EQ
    - used for general tone correction; works like the bass and treble controls on a standard amplifier or car stereo;
    parameters: usually only the gain (boost or cut); frequency on hardware mixers is typically set at 80 or 100 Hz for the low shelving and 10 or 12 kHz for the high shelving;
    - sometimes (mostly on digital versions) also the frequency and slope can be set, allowing a greater flexibility.

  • peak (bell) EQ
    - used for more accurate tone shaping, to remove or emphasize specific formants, to change the character of a sound, etc.;
    parameters:
    - a semi-parametric peak filter has only frequency and gain controls;
    - a full-parametric peak filter has frequency, gain and bandwidth (Q);
    - when you look for formants in an instrument, set the peak filter to "boost", medium Q, and sweep around for the desired frequency until you can spot it: then you can set the filter to "cut" , if you wish to remove the undesired formant, or to moderate "boost", if you want the instrument to come better through the mix without altering the channel volume.
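
    The three parameters of a full-parametric peak filter map directly onto a digital "biquad" filter. The sketch below uses the widely published Audio EQ Cookbook formulas (Robert Bristow-Johnson); it is one common digital implementation, not necessarily what a given mixer or plugin does, and the 48 kHz sample rate is an assumption.

```python
import cmath
import math

def peak_eq_coeffs(freq_hz, gain_db, q, sample_rate=48000.0):
    """Biquad coefficients (b0, b1, b2, a0, a1, a2) for a peak (bell) filter."""
    a = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * freq_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    return (1.0 + alpha * a, -2.0 * cos_w0, 1.0 - alpha * a,
            1.0 + alpha / a, -2.0 * cos_w0, 1.0 - alpha / a)

def response_db(coeffs, freq_hz, sample_rate=48000.0):
    """Gain of the filter at one frequency, in dB."""
    b0, b1, b2, a0, a1, a2 = coeffs
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate)  # z^-1
    h = (b0 + b1 * z1 + b2 * z1 * z1) / (a0 + a1 * z1 + a2 * z1 * z1)
    return 20.0 * math.log10(abs(h))
```

    For example, response_db(peak_eq_coeffs(1000.0, 6.0, 1.0), 1000.0) gives the set boost at the center frequency, while frequencies far away from 1 kHz are left almost untouched.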

  • low cut (= high pass) and high cut (= low pass) filters
    parameters: frequency, sometimes slope (steepness of the filter curve);
    - normally you use a low cut to eliminate rumble noise (for example, mechanical noise transmitted through the floor to the mic stand);
    - a high cut might be used on instruments that have to sound only in the bass range (like BD or bass) but also to "repair" a take where you have some distortion (clipping), provided there is not much energy in the high freq. range, or for special effects (especially a low pass with resonance, very much used in dance, techno, trance, drum'n'bass, etc.).

  • notch filter
    - to cut single frequencies out, for example a 50 Hz noise from a bad ground loop, without affecting the rest of the spectrum;
    parameters: like a full-parametric peak, but it only works in "cut" mode, the gain loss can be as much as -36 dB or more and the Q is much narrower.
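
    A narrow notch at 50 Hz can be sketched with the notch formulas from the same Audio EQ Cookbook family. The Q of 30 and the 48 kHz sample rate are assumptions chosen for illustration only.

```python
import cmath
import math

def notch_coeffs(freq_hz, q, sample_rate=48000.0):
    """Biquad coefficients (b0, b1, b2, a0, a1, a2) for a notch filter."""
    w0 = 2.0 * math.pi * freq_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    return (1.0, -2.0 * cos_w0, 1.0,
            1.0 + alpha, -2.0 * cos_w0, 1.0 - alpha)

def magnitude(coeffs, freq_hz, sample_rate=48000.0):
    """Linear gain of the filter at one frequency (1.0 = unchanged)."""
    b0, b1, b2, a0, a1, a2 = coeffs
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate)  # z^-1
    return abs((b0 + b1 * z1 + b2 * z1 * z1) / (a0 + a1 * z1 + a2 * z1 * z1))
```

    With notch_coeffs(50.0, 30.0) the gain at 50 Hz is practically zero, while 100 Hz and 1 kHz pass almost unchanged.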

  • compressor
    - to change the character of the sound (for example on BD or bass), making it thicker, punchier, compact and/or louder;
    - to keep the changes of volume under control (for example, on vocals) and help sounds to "cut through the mix";
    parameters:
    - threshold: defines which part of the dynamic range should be compressed (all signals louder than the set threshold); the lower you set the threshold, the larger portion of the dynamic range gets compressed; usually set to about -10 to -20 dB for most signals;
    - compression ratio: the amount of compression relative to input (for example, 2:1 means that above the threshold you need 2 dB more at the input for 1 dB more at the output); ratios of 4:1 or more can be used for vocals, sometimes 8:1 or more for e-bass or e-guitar; more moderate on drums, unless you are looking for a special sound;
    - soft or hard knee: hard knee is more efficient, but can make compression detectable; soft knee works more "musically", as it starts compressing lightly before the threshold level and reaches the set ratio much after that, but is not as efficient at compressing loud peaks (you could have clipping in some cases);
    - attack time: how fast the compressor reacts after the signal has crossed the threshold (usually set shorter for percussive sounds, but avoid "0 ms" on digital compressors!); try settings between 2 and 20 ms;
    - release time: how fast the compressor returns to the original gain after the signal has dropped under the threshold level (usually set to about 20 to 100 ms, even longer for guitars and bass);
    - make up, or output level: you can compensate for the volume "lost" through the compression, and make the whole signal louder again.
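
    How threshold, ratio and make up interact can be sketched as a static level curve. This is a hard-knee model only: attack, release and soft knee are left out, and the function names are made up for illustration.

```python
def compressor_output_db(input_db, threshold_db, ratio, makeup_db=0.0):
    """Hard-knee static curve: the part above the threshold is divided by the ratio."""
    if input_db <= threshold_db:
        return input_db + makeup_db
    return threshold_db + (input_db - threshold_db) / ratio + makeup_db

def gain_reduction_db(input_db, threshold_db, ratio):
    """How many dB the compressor takes away at this input level."""
    return input_db - compressor_output_db(input_db, threshold_db, ratio)
```

    Example: with threshold -20 dB and ratio 4:1, an input peak at -8 dB is 12 dB over the threshold, so it comes out 3 dB over it, at -17 dB (9 dB of gain reduction); a signal at -30 dB passes unchanged.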

  • limiter
    - a compressor working with hard-knee, infinite compression ratio and very fast (theoretically undetectable) attack and release times;
    - to avoid signal clipping; usually the last effect in a chain (like in mastering);
    - to reduce in an undetectable way the level of short signal peaks and make the whole track louder without losing perceived dynamics like with a compressor;
    - beware of too short attack/release times, as they can create distortion.

  • transient designer
    - it's a kind of combination of compressor and expander;
    - can be used to change shape and expression of percussive material, drums, loops;
    - can be used to give the impression that there is less ambience in a drum-loop.

  • de-esser
    - a special kind of compressor that reacts and also affects only the frequencies in the specific range of "S", "T" and other consonants (normally, 5 to 7 kHz);
    - parameters: usually you can set exactly which frequencies it should react on and affect, the threshold and the compression ratio.

  • gate
    - to remove undesired, low level parts of a signal;
    - example: to reduce "leaking" from the different microphones when recording a drum-kit;
    parameters:
    - threshold: only when the signal is louder than the set threshold, the gate opens (otherwise it stays closed and you hear nothing);
    - reduction: normally a gate reduces the level to -∞ when closed, but if this parameter is available you can also choose to have the gate just drop the level by a given amount of dB;
    - attack time: how fast does the gate open after a signal reaches the threshold;
    - hold time: how long does the gate stay fully open;
    - release time: how long does it take for the gate to gradually close again, after the signal has dropped under the threshold level again;
    - high cut and low cut in the "side chain": to define which frequency range should control the gate operation;
    - example: if you have a drum loop and want to isolate the snare only, you set the gate to react only to mid-high frequencies, and very strong level peaks; if you want to isolate only the bass drum, it should react only to very low frequencies, and strong peaks.
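
    The threshold and reduction parameters can be sketched as a static decision per level reading. Attack, hold, release and the side-chain filters are deliberately left out, and the function names are made up for illustration.

```python
def gate_gain_db(level_db, threshold_db, reduction_db=None):
    """Static gate: open (0 dB) above the threshold, otherwise attenuate.
    reduction_db=None models a gate that closes completely (-inf dB)."""
    if level_db > threshold_db:
        return 0.0
    return float("-inf") if reduction_db is None else -abs(reduction_db)

def db_to_linear(gain_db):
    """Convert a gain in dB to a linear factor (-inf dB becomes 0.0)."""
    return 0.0 if gain_db == float("-inf") else 10.0 ** (gain_db / 20.0)
```

    With a threshold of -30 dB, a snare hit at -10 dB passes at full level, while microphone leakage at -40 dB is either muted or, with reduction set to 20 dB, only turned down to a tenth of its amplitude.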

  • chorus, flanger, phaser and other modulation effects
    - can be used to "thicken up" and add some modulation to a sound that is too flat, static and / or uninteresting (especially chorus);
    - for example, a simple synth pad with just one oscillator per voice can sound way nicer with some added chorus and flanger
    - can be used on acoustic and electric guitars, on vocals etc. to make them "thicker"
    - does not sound very good on complex acoustic instruments (like piano) or ensembles (like strings, brass, etc.);
    - with extreme settings, you can create special effects (especially with the flanger and phaser)
    parameters:
    - delay (chorus and flanger only): how much are the modulated lines delayed from the dry signal; chorus uses around 10 ms delay, flanger uses shorter delay; more than 20 ms are perceived as short echo; phaser uses no delay;
    - speed: the speed of the LFO(s) controlling the modulation;
    - depth (chorus and flanger only): how far the signal is modulated above and below the reference pitch;
    - feedback (normally in flanger only): sends parts of the signal back into the effect input, creating the typical flanger "sweep", liquid sounds;
    - phase: defines the phase of the delayed lines;
    - colour, sweep floor and ceiling (on phaser only): control the colour of the phaser effect;
    - mix: defines the proportion between dry and effect signal.

  • reverb, echo and other delay based effects
    - you use these to add "depth of field", dimension and space to a mix, or as "special effects";
    - be careful not to use too much reverb, as it can make the whole track sound very muddy, make the vocals unclear, and also sound "old fashioned" (like some late seventies / early eighties productions);
    - if you can, use a shorter instead of a longer reverb - avoid more than one long reverb per song;
    - try to use different sound spaces with different character (long, short, bright, dark, etc.);
    - if you can, try delays instead of reverbs: they make the recording less muddy and they can be "timed" to the song tempo;
    - if you can, use EQ on the rev. return: for example, you could take away all bass frequencies and even the lower mids on a snare reverb, leaving enough room free for the vocals;
    - you can also decide to have some reverbs in mono, or at least not 100% stereo, to leave some more space free for other instruments;
    - try to avoid too much reverb or delay on low frequency sounds (like bass and bass drum), except for intros, soundtracks or as special effect;
    reverb parameters:
    - pre-delay: the time before the early reflections start (defines distance between sound source and walls);
    - reverb time: the average decay time of the reverb;
    - bass and treble multiply (or ratio): how low/high frequencies decay compared to the average (for a 3 sec. reverb time, setting bass to 2X would mean 6 sec. decay in the low range); simulates the different way most materials absorb low and high frequencies;
    - room size: defines the size of the emulated virtual space; you could also have a small room size, but long rev. time (bathroom reverb)
    - room shape: in some effects this defines the type of decay envelope, and/or the colour of the reverb;
    - stereo spread: how much the reverb lines are spread in stereo;
    - density: the distance between the single delay lines; higher values usually mean a smoother decay; always use high density on percussive sounds!
    - high cut: the highest frequencies reverberated; for an acoustic sounding rev. you might already cut at 4 or 6 kHz;
    - high damping: the speed at which very high frequencies are absorbed (decaying);
    - ER level: the level of the early reflection part of the reverb;
    - reverb level: the level of the reverb tail.
    delay parameters:
    - delay time: usually separate for L and R channel in stereo delays; this can be in millisecond, or in note values related to the song tempo;
    - feedback: controls the number of delay repetitions;
    - cross feedback: feeds the output of the L delay into the input of the R delay, and vice versa; good for ping-pong type of delay, when setting different times for the two channels;
    - high and low cut filters: control the colour of the delayed signal; especially when emulating vintage equipment, you should make sure the delay does not sound like the original signal;
    - mix: defines the proportion between dry and effect signal.
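
    Setting a delay time in note values comes down to a simple conversion from tempo to milliseconds. The sketch below assumes that one beat is a quarter note; the function name and its parameters are made up for illustration.

```python
def delay_time_ms(bpm, note_fraction=0.25, dotted=False, triplet=False):
    """Delay time in ms for a note value at a given tempo.
    note_fraction is a fraction of a whole note (0.25 = quarter, 0.125 = eighth)."""
    quarter_ms = 60000.0 / bpm  # assumption: one beat = one quarter note
    ms = quarter_ms * 4.0 * note_fraction
    if dotted:
        ms *= 1.5
    if triplet:
        ms *= 2.0 / 3.0
    return ms
```

    At 120 BPM a quarter note delay is 500 ms, an eighth is 250 ms and a dotted eighth is 375 ms; setting L and R to different note values gives the ping-pong patterns mentioned above.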

  • overdrive, distortion, bit-crusher, etc.
    - overdrive and distortion can be used to simulate the saturation of analogue tube amplifiers in an e-guitar speaker cabinet; but the "real thing" sounds so much better, so please try to use a real guitar distortion when you can;
    - bit-crusher and similar effects can be used to simulate the "bad sound" of early digital systems (for example, an old Sound Blaster, or the DX7 DA, etc.);
    - in general, you use these together with EQ to get that trendy "lo-fi" sound used in so many musical styles (for example on some sounds in trance, drum'n bass, etc.)
    distortion and overdrive parameters:
    - drive: controls the input stage, and how hard the amp is driven - therefore higher levels = more distortion;
    - output: the more you drive the input, the more the signal becomes loud, so you have to compensate lowering the output;
    - tone or colour: the character of the distortion (often this is just a low pass filter)
    bit-crusher parameters:
    - drive: see above;
    - bit resolution: chops the LSB (least significant bits) from the digital signal: so setting to "8" limits the resolution to 8-bit and the dynamic range to 48 dB; as a result you also get more "quantizing noise";
    - downsampling: in this case, the sampling frequency is artificially reduced without using proper antialiasing filters, so the sound deteriorates in quality and you get "aliasing" artifacts; in detail: frequencies higher than the Nyquist (= ½ the sampling rate) get "mirrored" and create random-like "alias" frequencies in the range below the Nyquist;
    - clip level: at what level should the signal peaks be "chopped"; you can also select the kind of clipping distortion you want to get.
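
    The bit resolution and downsampling parameters can be illustrated with a few lines of arithmetic. This is a rough sketch with made-up function names: samples are assumed to be floats in the range -1.0 to +1.0, and no dither or filtering is applied.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of a given bit depth (about 6 dB per bit)."""
    return 20.0 * math.log10(2.0 ** bits)

def reduce_bits(sample, bits):
    """Requantize a sample in -1.0..+1.0 to fewer bits by dropping the low bits."""
    steps = 2 ** (bits - 1)
    quantized = math.floor(sample * steps)            # truncation, no rounding
    quantized = max(-steps, min(steps - 1, quantized))  # stay inside the range
    return quantized / steps

def alias_frequency(freq_hz, sample_rate):
    """Frequency that results when freq_hz is sampled without an antialiasing filter."""
    folded = freq_hz % sample_rate
    return min(folded, sample_rate - folded)
```

    For 8 bits the dynamic range comes out at about 48 dB, as stated above. For downsampling to 11 025 Hz, a 7 000 Hz partial lies above the new Nyquist (5 512.5 Hz) and is mirrored down to 4 025 Hz.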

Mixing

  • what is mixing
    - actually mixing is not just a technique, it is more like an art through which the musical idea of an artist/composer can be shaped into something special, that will awake emotions in the listener and make it unforgettable .... at least, ideally; but (bad) mixing can also be the way to completely ruin a decent recording.

  • important mix parameters
    - balance (the volume relation between the musical elements)
    - panorama, width (panorama position within the stereo field, stereo width)
    - height (position of an instrument in the frequency range)
    - colour (spectrum, formants, use of filters and EQs, etc.)
    - depth, dimension (dry/wet balance, use of ambience effects such as hall, delay, etc.)
    - dynamics (use of compression, volume envelopes, etc.)
    use all these to "find the right space" for each instrument in the mix!
    From the musical point of view, these are very important aspects to consider:
    - focus: keep the listener attention on the most important elements, and avoid too many things "happening" at once: it can be confusing and/or cause listening fatigue;
    - interest: avoid that the song/composition becomes boring, introduce new elements as the track develops and keep enough variation elements throughout the track;
    - personality: make the mix sound personal, unique and unlike anything else! At an advanced stage you should forget all standard "rules" and just follow your instinct.

  • places from where to start the mix
    - there is really no "rule" for this, but you might try these ideas out:
    - from the bass drum or snare drum, from the bass, from the lead vocals or main instrument ...
    - if it is a typical song, the vocals should be added as soon as possible, as all other instrument will relate to the vocal track sooner or later anyway;
    - if it is a soundtrack with orchestral sounds, you might want to start from the most important melody line (for example, the violins), or from the bass, which is the fundament of the harmony;
    - if it is a dance track, you almost certainly want to start with drums, then bass line and the most important rhythmical elements.

  • sound of single instruments in a mix
    - they do not necessarily have to sound "nice" when listened to in "solo" mode; it is more likely that an instrument that sounds nice and full in solo does not fit in a mix;
    - they must integrate in a complementary way with all the other elements to create a balanced, full overall sound (unless you are looking for that '80s ultra-bombastic sound ...);
    - for example: you might almost completely cut the bass freq. away from a guitar track, to leave more space to the bass drum and the bass guitar, so that if listened alone it would sound very thin and bodiless, but would complement perfectly with the other instruments.

  • how to prevent instruments "fighting" with each other
    - changing the arrangement (always the best way!);
    - muting one of the instruments (do not let them play at the same time);
    - lowering the level of one of the two instruments;
    - using very different EQ settings, emphasizing different formants (sometimes called "frequency juggling");
    - using the pan to position the instruments in a different place of the stereo basis;
    - using different level of ambience (dry, or with different types and amount of reverb, delay, etc.), so that you can have sounds more to the front and others more to the back.

  • positioning sounds in the stereo field
    - avoid panning mono signals hard Left or Right: it sounds very unnatural on headphones and it is not necessary, because when you pan something 90% R or L, it already sounds like coming out from one loudspeaker only anyway ...
    - try to use many intermediate pan positions as well;
    - try to keep the most important elements close to the center, or in some listening situations the balance between them might be completely wrong;
    - typical instruments to keep middle: BD, SD, Bass, Vocals (solo);
    - typical instruments to have open in stereo: piano, strings, pads, background vocals, the return lines of stereo effects, etc.;
    - typical instruments to position at different degrees left or right: guitars, synth lines, toms, percussion, cymbals, etc.

  • setting stereo width on stereo signals
    - do not pan all stereo signals hard L/R: there are not just mono and stereo signals, but also different degrees of stereo width;
    - it is not good to just mix all keyboards and synths in stereo - you get something called "big mono", with absolutely no L/R definition and causing listening fatigue;
    - so either use two mono inputs for stereo signals, so that you can control the stereo width with the separate channel pans, or try a plugin like the Waves S1 Imager (width parameter) or the Logic Dir. Mixer (basis parameter) to reduce or enhance the stereo width as needed.
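
    One common way to implement a width control is mid/side processing. This is a swapped-in technique shown for illustration; it is not claimed to be what the plugins named above do internally, and the function name is made up.

```python
def set_stereo_width(left, right, width):
    """Mid/side width control on one stereo sample pair.
    width 0.0 = mono, 1.0 = unchanged, above 1.0 = wider than the original."""
    mid = 0.5 * (left + right)   # what both channels have in common
    side = 0.5 * (left - right)  # what differs between the channels
    return mid + width * side, mid - width * side
```

    A width of 0.5 turns a hard-left signal (1.0, 0.0) into (0.75, 0.25), which is the same result as panning the two channels halfway toward the center on a mixer with a linear pan law.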

  • the mixdown (or master mix)
    - try to keep the audio resolution as high as possible throughout the signal path - so if possible record at 24-bit, mix at 32-bit float and mix down at 24-bit;
    - it will be up to the mastering engineer to maximize the dynamics of the track, set the proper compression, and finally dither down to 16-bit for CD production;
    - you might use some sum-compression, but be moderate: additional compression can always be added at the mastering stage, too much compression can not be taken away;
    - if you mix digitally, make really sure that the master out is not clipping (which can easily happen if you use a lot of tracks); on some programs like Logic, just lowering the level of the master fader will fix the problem (as the program works internally with 32-bit float, there cannot be internal clipping); best is of course to set the level of the single channels properly; as an orientation, you might try to have BD, SD and bass all hitting around -5 to -7 dB in the channel output;
    - another solution might be to split the signals in different groups: you could use one stereo out for drums and percussion, one for vocals, one for all instruments and one for the effects ... and then mix everything analogue externally and master to DAT; in this way you can also easily adjust the level of the most important groups in the song;
    - analogue or digital master tape? they certainly sound different; if you do have the possibility, try to make the mixdown at the same time to DAT (possibly 24-bit) and 1/2" tape machine - some songs might sound better on DAT, some on tape;
    - for safety you might want to do a "vocal up" version, with the vocals about 0,8 dB louder, and a vocal down version, with the vocals about 0,4 dB quieter: so if the mastering engineer has a problem with the level of the vocal line at any point in the song, you will not need to redo the mix.

  • don't worry, we'll fix it in the mix!
    ... probably the biggest lie in the recording industry!
    - some "mistakes" happen already at the composition/arrangement/recording stage and can only be fixed if some parts are muted and/or replaced by other, compatible ones;
    - for example, you might have a strings arrangement that "fights" with the vocals, which would force you to push the vocal too much up, use too much compression, etc.; the right solution in this case might be to take away the strings when the vocal line is there, and just leave them between lines ... or to replace the strings with a darker pad sound that leaves enough room for the vocals.
    - "bad sound" captured during recording cannot always be improved or corrected using EQ and other effects. If it does not sound right during recording, something is wrong.

Mastering

  • what is mastering
    - the process of optimizing the frequency and dynamic range of a recording so that it sounds best on most reproducing systems (including home stereo, hi-end systems, car stereo, ghetto-blaster, walkman, disco sound system, etc.);
    - the process of preparing a recording for the final support media (for example, CD, DVD-Audio, Tape, etc.): this includes trimming the tracks to the exact length, setting fade ins and outs, pauses of the proper length between the tracks, setting the relative volume and balance of the single tracks, setting the track start and end markers (PQ editing), etc.
    - mastering is also the last chance to fix things that went horribly wrong during the production process! sometimes small edits and corrections might be performed at this stage, as well as "surgical" DSP processing to fix problems with sound, disturbing noises (like a 50 Hz hum), possible distortion, clicks, etc.
    - re-mastering usually refers to restoring and polishing an old or damaged master tape, using different techniques, such as denoising, decrackling, etc.
    Mastering (like mixing) has a lot to do with music and style, and with taste as well; different musical styles often require very different approaches and the sound-aesthetics can vary considerably (just think for example of the difference in requirements between a Jazz-Avant-garde and a Hard Rock production, or between a classical orchestra and a techno/trance production ...).

  • typical tools used for mastering
    - a (phase linear) mastering EQ to even out the freq. range;
    - a (multi-band) compressor to control the dynamics;
    - a brickwall limiter to avoid clipping and maximize loudness;
    - a stereo imager, or psychoacoustic processors like the SPL Vitalizer to adjust the stereo width;
    - bass/treble enhancer/exciter to "refresh" a dull sounding recording;
    - a DAW (Digital Audio Workstation) with excellent AD/DA converters;
    - an audio mastering program (affordable: Wavelab, WaveBurner, Peak, Spark, etc. - less affordable: Sonic Solutions audio systems, Sadie Disk Editor, etc.);
    - very often a combination of analogue and digital processing is used – just choose what sounds best for the track!

  • you will also need
    - a pair of excellent studio monitors, possibly specifically made for mastering purposes (usually this is expensive stuff, but a pair of full-range near fields might still be affordable); for example, Dynaudio, Genelec, Quested, etc.
    - a very neutral (uncolored, even at all frequencies) and natural (pleasant) sounding mastering suite - this might require a lot of work, time and money;
    - a pair of really good ears and ... good taste!

  • what can and should be fixed/adjusted
    - setting the track start: make sure there is no unwanted pause (or noises) before the music begins, but avoid trimming the track so tight that any breath, air or whatever is there before the sound starts is cut away! Typically the music waveform should start 50 to 500 ms after the beginning of the sample (shorter for pop/rock and longer for classical tracks);
    - setting the track end: make sure you do not cut the track too short (especially if there is some ambience or delay at the end) and use a nice fade-out even if it is only for the background noise;
    - on many pop/rock tracks there is a longer fade out on the chorus: make sure to perform this as the artist/band/producer desires to have it; it is always better to do this at the mastering stage, and not as mixdown (you might get noise at the end of the fade out);
    - types of fade ins and outs: exp. and log. curves sound more "musical" than linear ones; use inverted "S" like curves for longer fade outs;
    - adjusting the track volume and L-R balance, also in relation to the other tracks: make sure there is no undesired difference between the L and R channel, and that the track has the "proper" loudness in relation to the others; in no case should you just normalize every track on the CD! If you do, a track performed "quietly" might be perceived as way louder than a track performed "loudly";
    - adjust subtle differences in balance between the different frequency ranges (see EQ tips);
    - adjust the stereo width when the track sounds too "narrow" or too "wide", using a stereo imager (like Waves S1) or a psychoacoustic processor (like the SPL Vitalizer, or Behringer Ultrafex);
    - to refresh a dull recording if nothing else helps: use (with moderation) a bass/treble enhancer or exciter (but try with the EQ first);
    - to remove undesired tape hiss: try a denoiser, possibly with "fingerprint" function to identify the exact spectrum of the noise to be removed; make sure you do not cut vital high frequency parts of the signal (better noisy than dull);
    - adjust the dynamic range if it does not fit the final medium and/or the final listening environment (so that you might end up having to regulate the volume all the time as some parts are too quiet, and some are too loud);
    - compare the overall sound also with other productions (your "reference" CDs) to make sure you are within the range of possible variations.
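
    The fade shapes mentioned in the list above can be compared with a small gain function. The concrete curves are assumptions chosen for illustration: the "exp" shape falls linearly in dB down to -60 dB, and the "s_curve" shape is a raised cosine.

```python
import math

def fade_out_gain(t, shape="linear"):
    """Gain (1.0 down to 0.0) at position t of a fade out, with t from 0.0 to 1.0."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be between 0.0 and 1.0")
    if shape == "linear":
        return 1.0 - t
    if shape == "exp":
        # assumption: falls linearly in dB to -60 dB, then full silence at the end
        return 0.0 if t == 1.0 else 10.0 ** (-60.0 * t / 20.0)
    if shape == "s_curve":
        # raised cosine: slow start, faster middle, slow end
        return 0.5 * (1.0 + math.cos(math.pi * t))
    raise ValueError("unknown shape: " + shape)
```

    Halfway through the fade the linear curve is still at 0.5 (about -6 dB), while the "exp" curve is already at about 0.03 (-30 dB), which is why the two sound so different.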

  • EQ tips
    - adjusting the low freq. end: too little of it and the recording sounds thin, powerless; too much and it sounds boomy and distorted on most loudspeakers; you can use a mild low shelving filter for this purpose, or a wide peak param. EQ centered around 50 Hz;
    - adjusting the high freq. end: too little of it and the recording sounds dull, unclear, especially when listening at low volume; too much and it sounds very harsh and unpleasant, especially at high volume levels; you can use a mild high shelving filter for this purpose, or a wide peak param. EQ centered around 16 kHz;
    - if you need some more power, try boosting 16 to 60 Hz, but check this out with a system that does respond down to 16 Hz, or with good headphones, or you might overdo it! Remember that energy in this range can "eat up" a good deal of the whole available dynamic range;
    - if you need some more BD punch, try boosting 50-60 Hz;
    - if the bass is too loud, try cutting 100 to 150 Hz;
    - if the sound is boomy (especially obvious when listening on small multimedia speakers or a ghetto blaster), try to cut around 250 Hz;
    - if the overall sound lacks some warmth and fullness, try boosting 250-400 Hz, or cut this range if the overall sound is muddy;
    - if the vocals are not getting through the mix, you might try to enhance the range where the vocals have the most important overtones (800 Hz to 1,5 kHz);
    - if the guitars are too sharp, you might reduce a bit the range between 2,5 and 4 kHz;
    - if the mix is unclear (= not transparent), you could try to boost around 2-3 kHz; beware, 2 to 4 kHz is the range we are most sensitive to: if this is boosted too much, at high listening levels it can cause listening fatigue or even hearing damage!
    - if the vocals lack presence, you might boost a little around 5 kHz (but take care of the "S"!);
    - if the "S" and "T" are too sharp, you might cut around 6-7 kHz (but it is better to use a de-esser when mixing, and better yet to position the microphone not directly in front of the singer when recording);
    - if the cymbals and hi-hat sound harsh and too metallic, you might cut 10 kHz and boost a little over 12-15 kHz; it sounds more elegant;
    - if the recording lacks some spark and finish, you might add some "air" with a shelving at 16 kHz;
    - if available, use "phase linear EQs"! They do not add frequency-dependent phase shift, so the phase relationships around the band you are working on stay intact.

  • Compression
    - if the overall dynamic range is too wide (too much difference between the quietest and the loudest passages), you can use a compressor at moderate compression ratio (1.2:1 to 1.6:1), relatively low threshold (-20 dB or even lower) and "soft knee", to "adapt" the overall dynamic range of the recording to the final audio support (for example CD);
    - if just the signal peaks are too loud, you can try a compressor with high ratio (2:1 to 4:1), high threshold (-10 dB or higher), hard-knee and fast attack/release times, to just control those peaks; in this case you might also want to try a limiter instead;
    - sometimes you just want to adjust the level of different song parts (like the intro): in this case use a volume curve instead of the compressor;
    - generally: classical and jazz productions are compressed very little or not at all (especially if there was already compression on the single channels during mixing); for pop, rock, dance etc. you might need some additional sum compression, but beware that if you overdo it, it will sound terrible when broadcast over the radio (where it is additionally compressed like hell) as it will have no dynamics left at all!
    - if you have a good sounding analogue compressor and decent AD/DA converters, don't be afraid to go out and in of your DAW and use that instead of your favorite plugin.

  • Multi-Band Compression
    I am personally not a big fan of this tool, however:
    - it can be used to specifically optimize the dynamic of different ranges (therefore also instruments) in the spectrum;
    - if used correctly, it can let you reach a higher perceived volume without distortion;
    - it can also be used instead of an EQ to balance lows, mids and highs in a track;
    - unfortunately, most cheap multi-band compressors create artifacts in the crossover areas between the set freq. zones, which results in poorer sound quality than a good-sounding full-band compressor; in that case use EQ + single-band compression instead!

  • Limiter
    - in any case, you should use a limiter as the last effect on your chain: a "brickwall" limiter, to avoid any clipping of the signal;
    - typically you will set the max level to -0.2 dB (relative to digital full scale), as some older DA converters sound bad when the signal reaches 0 dB;
    - here you can maybe try to gain 1 or 2 dB additional loudness, but make sure there is no distortion on the loudest passages!
    - if available, use the "look ahead" function of the limiter so that it sees in advance when peaks are coming, and starts controlling the volume in time with no distortion;
    - set the release time as short as possible to avoid audible "pumping" artifacts, but beware that very short release times can lead to unwanted distortion!
    - clipping on the master signal sucks and should be avoided like the plague: do not try to boost the volume of your production to be the loudest of all, but go for the best sound quality you can get, at the loudest level you can reach without distortion/clipping;
    - most playback devices DO have some option to adjust the volume, don't they? So the listener can turn the volume up to match the average level of your recording! Remember: a less-compressed recording played back louder has more "punch" than a heavily compressed recording played back quieter ...
    - clipping on single tracks during mixing (like on drums) while keeping other tracks like the vocals "clean" and unclipped is a completely different matter - it's totally ok if it is that aggressive sound on the drums you are after (like in hip-hop, techno, etc.)... but ruining all the other instruments for this does not make sense.
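
    To make the "brickwall" ceiling and the "look ahead" idea more concrete, here is a deliberately simplified sketch in Python. It is not how any real limiter plugin works: the function names, the 32-sample look-ahead and the release coefficient are assumptions chosen for illustration, samples are assumed to be floats where 1.0 is digital full scale, and the gain drops in one step as soon as a peak enters the look-ahead window (a real limiter would smooth that drop).

```python
def db_to_linear(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def lookahead_limit(samples, ceiling_db=-0.2, lookahead=32, release=0.999):
    """Keep every output sample at or below the ceiling.

    lookahead: how many samples ahead the limiter inspects (assumed value).
    release:   closer to 1.0 means a slower return to unity gain (assumed value).
    """
    ceiling = db_to_linear(ceiling_db)
    gain = 1.0
    out = []
    for i in range(len(samples)):
        # loudest sample between now and the end of the look-ahead window
        window_peak = max(abs(s) for s in samples[i:i + lookahead + 1])
        # gain that would bring that peak exactly down to the ceiling
        target = min(1.0, ceiling / window_peak) if window_peak > 0 else 1.0
        if target < gain:
            gain = target                               # react before the peak arrives
        else:
            gain = target + (gain - target) * release   # recover gradually
        out.append(samples[i] * gain)
    return out


# a quiet signal with one peak well above full scale
signal = [0.1] * 50 + [1.5] + [0.1] * 50
limited = lookahead_limit(signal)
print(max(abs(x) for x in limited) <= db_to_linear(-0.2) + 1e-12)
```

    A ceiling of -0.2 dB corresponds to a linear amplitude of about 0.977. A release value closer to 1.0 lets the gain recover more slowly, and a smaller value lets it recover faster, which is the release-time setting discussed above.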


  • no problem, the mastering engineer will fix that!
    ... probably the second greatest lie in the record industry!
    - of course, like in mixing, some things cannot be fixed;
    - for example, if you have two instruments in the same freq. range and one is too loud/too quiet, there is nothing you can do at this point; example: bass is too loud, covers the bass drum - nothing can bring this bass drum to life and make it punch ... but if the bass drum is too loud and the bass is too quiet, you can try to compress the bass drum and bring the bass (low level signal) up;
    - if the mixdown was already distorted or clipped, it is almost impossible to fix this later: in this case, re-do the mixdown!
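
    A quick way to check a mixdown for clipping before sending it to mastering is to look for runs of consecutive samples stuck at full scale. The following Python sketch is only a rough heuristic under stated assumptions: the function name is hypothetical, samples are assumed to be floats where 1.0 is digital full scale, and the minimum run length of 3 samples is an arbitrary choice. It cannot detect a mix that was clipped and then lowered in level afterwards.

```python
def count_clipped_runs(samples, full_scale=1.0, min_run=3):
    """Count runs of at least min_run consecutive samples at or beyond full scale."""
    runs = 0
    current = 0                      # length of the run currently being scanned
    for s in samples:
        if abs(s) >= full_scale:
            current += 1
        else:
            if current >= min_run:
                runs += 1
            current = 0
    if current >= min_run:           # a run that lasts until the end of the file
        runs += 1
    return runs


mix = [0.2, 1.0, 1.0, 1.0, 0.5, -1.0, -1.0, 0.1]
print(count_clipped_runs(mix))       # one run of three; the run of two is ignored
```

    If this reports anything other than zero, the safest fix is the one given above: go back and re-do the mixdown at a lower level.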

© 2003-2005 Michele "Xenomorph" Gaggia – DNS Studios – all rights reserved
 