There is a lot of confusion surrounding the terms audio compression, audio encoding, and audio decoding. This section gives you an overview of what audio coding (yet another of these terms) is all about.
The purpose of audio compression
Up to the advent of audio compression, high-quality digital audio
data took a lot of hard disk space to store. Let us go through a
short example.
Suppose you want to sample your favorite 1-minute song and store it
on your hard disk. Because you want CD quality, you sample at
44.1 kHz, stereo, with 16 bits per sample. 44100 Hz means that
you have 44100 values per second coming in from your sound card (or
input file). Multiply that by two because you have two
channels. Multiply by another factor of two because you have two bytes
per value (that's what 16 bit means). The song will take up 44100
samples/s * 2 channels * 2 bytes/sample * 60 s/min ~ 10 Mbytes of
storage space on your hard disk. If you wanted to download that over
the internet, given a good 56k modem connected at 44k, it would
take you (at least) 10000000 bytes * 8 bits/byte / (44000 bits/s) /
(60 s/min) ~ 30 minutes just to download one minute of music!
Digital audio coding, which in this context is also called digital
audio compression, is the art of minimizing storage space (or
channel bandwidth) requirements for audio data. Modern perceptual
audio coding techniques (like MPEG Layer III) exploit the properties
of the human ear (the perception of sound) to achieve a size reduction
by a factor of 11 with little or no perceptible loss of
quality. Such schemes are therefore the key technology for
high-quality, low bit-rate applications, like sound tracks for CD-ROM
games, solid-state sound memories, Internet audio, digital audio
broadcasting systems, and the like.
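The storage and download arithmetic above can be checked with a short sketch. The constants are the ones from the example (44.1 kHz, stereo, 16 bit, one minute, 44000 bits/s). The function names are made up for illustration only.

```python
# Worked version of the CD-quality example from the text.
# All names here are illustrative; only the numbers come from the text.
SAMPLE_RATE_HZ = 44_100      # samples per second and channel
CHANNELS = 2                 # stereo
BYTES_PER_SAMPLE = 2         # 16 bits
DURATION_S = 60              # one minute of music
MODEM_BITS_PER_S = 44_000    # 56k modem connected at 44k


def uncompressed_size_bytes(sample_rate_hz, channels, bytes_per_sample, duration_s):
    """Size of raw PCM audio in bytes."""
    return sample_rate_hz * channels * bytes_per_sample * duration_s


def download_minutes(size_bytes, link_bits_per_s):
    """Ideal transfer time in minutes, ignoring any protocol overhead."""
    return size_bytes * 8 / link_bits_per_s / 60


size = uncompressed_size_bytes(SAMPLE_RATE_HZ, CHANNELS, BYTES_PER_SAMPLE, DURATION_S)
minutes = download_minutes(size, MODEM_BITS_PER_S)
print(f"{size} bytes ({size / 1e6:.1f} Mbytes)")
print(f"{minutes:.1f} minutes at {MODEM_BITS_PER_S} bits/s")
```

Without rounding, the song takes 10584000 bytes and about 32 minutes to download. The text rounds the size down to 10 Mbytes, which gives the figure of about 30 minutes.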
The two parts of audio compression
Audio compression really
consists of two parts. The first part, called encoding,
transforms the digital audio data that resides, say, in a WAVE file,
into a highly compressed form called a bitstream. To play the
bitstream on your sound card, you need the second part, called
decoding. Decoding takes the bitstream and re-expands it to a
WAVE file. The program that performs the first part is called an audio
encoder. LAME is such an encoder. The program that performs
the second part is called an audio decoder. Decoders can be
found on http://www.mp3-tech.org.
Compression ratios, bitrate, and quality
It has not been
explicitly mentioned up to now: what you end up with after encoding
and decoding is not the same sound file anymore. All superfluous
information has been squeezed out, so to speak. It is not the same
file, but it will sound the same, more or less,
depending on how much compression has been performed on it. Generally
speaking, the lower the compression ratio, the better the sound
quality will be in the end, and vice versa. The table below gives
you an overview of the achievable quality. Because compression ratio is
a somewhat unwieldy measure, experts use the term bitrate when
speaking of the strength of compression. Bitrate denotes the average
number of bits that one second of audio data will take up in your
compressed bitstream. The usual unit is kbps, that is, kbits/s,
where 1 kbit/s is 1000 bits/s. To calculate the number of bytes per
second of audio data, simply divide the number of bits per second by
eight.
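As a rough sketch of how bitrate, bytes per second, and compression ratio relate, the conversion can be written out as follows. The function names are illustrative, and the comparison source is assumed to be CD-quality audio (44.1 kHz, stereo, 16 bit).

```python
# Bitrate arithmetic, assuming CD-quality PCM as the uncompressed source.
CD_BITRATE_BPS = 44_100 * 2 * 16   # 1411200 bits/s for 44.1 kHz stereo 16 bit


def bytes_per_second(bitrate_kbps):
    """Bytes taken up by one second of audio at the given bitrate (1 kbps = 1000 bits/s)."""
    return bitrate_kbps * 1000 / 8


def encoded_size_bytes(bitrate_kbps, duration_s):
    """Approximate bitstream size, ignoring headers and tags."""
    return bytes_per_second(bitrate_kbps) * duration_s


def compression_ratio(bitrate_kbps, source_bps=CD_BITRATE_BPS):
    """How many times smaller the bitstream is than the source."""
    return source_bps / (bitrate_kbps * 1000)


for kbps in (16, 32, 96, 128, 256):
    print(f"{kbps:>3} kbps: {bytes_per_second(kbps):>7.0f} bytes/s, "
          f"ratio {compression_ratio(kbps):.1f}:1")
```

At 128 kbps this gives 16000 bytes per second and a ratio of about 11:1 against CD audio, so one minute of music takes roughly 960000 bytes.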
Table: Bitrate versus sound quality
Bitrate        Bandwidth   Quality comparable to or better than
16 kbps        4.5 kHz     short-wave radio
32 kbps        7.5 kHz     AM radio
96 kbps        11 kHz      FM radio
128 kbps       16 kHz      near CD
160-180 kbps   20 kHz      perceptual transparency
256 kbps       22 kHz      studio
Table: MPEG version versus sample rate
MPEG1      MPEG2      MPEG2.5
44100 Hz   22050 Hz   11025 Hz
48000 Hz   24000 Hz   12000 Hz
32000 Hz   16000 Hz   8000 Hz
Table: Valid bitrates in kbit/s
MPEG1                            MPEG2, MPEG2.5
Layer I    Layer II   Layer III  Layer I    Layer II and III
32         32         32         32         8
64         48         40         48         16
96         56         48         56         24
128        64         56         64         32
160        80         64         80         40
192        96         80         96         48
224        112        96         112        56
256        128        112        128        64
288        160        128        144        80
320        192        160        160        96
352        224        192        176        112
384        256        224        192        128
416        320        256        224        144
448        384        320        256        160
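The two tables above can be combined into a small lookup that checks whether a sample rate and bitrate fit together. The sketch below covers Layer III only and copies its values from the tables. The function and variable names are made up for illustration and are not part of Lame or m3w.

```python
# Layer III lookup built from the sample rate and bitrate tables above.
SAMPLE_RATES_HZ = {
    "MPEG1": (44100, 48000, 32000),
    "MPEG2": (22050, 24000, 16000),
    "MPEG2.5": (11025, 12000, 8000),
}

_LOW_RATE_LAYER3 = (8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160)

LAYER3_BITRATES_KBPS = {
    "MPEG1": (32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320),
    "MPEG2": _LOW_RATE_LAYER3,
    "MPEG2.5": _LOW_RATE_LAYER3,
}


def mpeg_version_for(sample_rate_hz):
    """Return the MPEG version that defines this sample rate, or None."""
    for version, rates in SAMPLE_RATES_HZ.items():
        if sample_rate_hz in rates:
            return version
    return None


def is_valid_layer3(sample_rate_hz, bitrate_kbps):
    """True if the bitrate is listed for Layer III at this sample rate."""
    version = mpeg_version_for(sample_rate_hz)
    return version is not None and bitrate_kbps in LAYER3_BITRATES_KBPS[version]


print(mpeg_version_for(44100), is_valid_layer3(44100, 128))
print(mpeg_version_for(22050), is_valid_layer3(22050, 320))
```

For example, 144 kbit/s is listed for Layer III at 22050 Hz (MPEG2) but not at 44100 Hz (MPEG1), while 320 kbit/s is listed only for MPEG1.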
The mp3 Standard
The reason MP3 took off and became the audio
standard on the Web is that the original patent holders made it freely
available for anyone to develop a decoder, or player, for it. So the
early MP3 innovators hacked around and developed players and other
cool software that spread fast and wide. By contrast, several other
digital audio formats, which are more efficient or sound better than
MP3, are proprietary formats, developed by companies like Lucent,
Yamaha, and Microsoft, which have restrictions on how outside
developers can employ their technology. These other audio formats may
gain wider acceptance in the future, as record companies use them to
distribute popular music, but for now MP3 still has the momentum.
Lame is just one of many different encoders for the
mp3 format. The name stands for "Lame Ain't an MP3 Encoder" and
has historical reasons. While the format is
public and decoding is a standardized process, encoding is neither
standardized nor public, so many variations in speed and quality
exist. For patent reasons, Lame was therefore originally not an
encoder but a patch to the reference implementation, an
implementation that illustrated the principle but wasn't
particularly good. Later, Lame developed into a full implementation.
Details can be found at lame.sourceforge.net.
The implementation of m3w uses the Lame encoder DLL, called lame_enc.dll,
and loads it at program start from either the current working
directory or one of the default directories for DLLs, such as
c:\windows. If you want to get the latest version, download it
and copy it to a place on
your machine where m3w can find it. m3w always shows the version
loaded in the main window. Lame comes with many options, and not all of
them are available in m3w.
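To check where a copy of lame_enc.dll would be picked up, the lookup described above can be approximated with a few lines. This is only a sketch of the two-step description in the text (current working directory first, then a default directory such as c:\windows). It is not m3w's actual loading code, and the real Windows DLL search order involves more locations. The function names are hypothetical.

```python
# Simplified search for lame_enc.dll, following the order described in the text.
# Not m3w's real code; the Windows loader consults more directories than this.
from pathlib import Path

DLL_NAME = "lame_enc.dll"


def default_search_dirs():
    """Current working directory first, then the example default directory."""
    return [Path.cwd(), Path(r"c:\windows")]


def find_encoder_dll(search_dirs, dll_name=DLL_NAME):
    """Return the first existing path to the DLL, or None if it is not found."""
    for directory in search_dirs:
        candidate = Path(directory) / dll_name
        if candidate.is_file():
            return candidate
    return None


found = find_encoder_dll(default_search_dirs())
print(found if found is not None else f"{DLL_NAME} not found")
```

Because the working directory is searched first, a newer copy placed next to the program takes precedence over an older one in a system directory.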
The differences between encoders that matter most are the
differences in the quality they can achieve. Quality depends not only
on the available bitrate but also on the psycho-acoustic model
of the encoder. All encoders will produce good quality at 256 kbit/s
and lousy quality at 16 kbit/s, but in between there are noticeable
differences. Lame (since version 3.7) is considered to be one of the
better encoders, about as good as the encoder of the Fraunhofer
Institute, the inventors of mp3. Lame, however, is free software; that
is, you can download it and use it without paying anything for
it. That, and some philosophical concerns, is why m3w uses Lame. The
psycho-acoustic model is a model of the human ear and hearing
process that is used to determine which parts of the sound are not
audible and can therefore be dropped, if necessary, to achieve the
desired compression. A good model will discard exactly those parts of
the signal that cause the least distortion to the audible result. Of
course, this is all a matter of subjective judgment.
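To give a feeling for what "not audible" can mean, here is a toy sketch that drops signal components lying below the threshold of hearing in quiet. It uses Terhardt's widely quoted approximation of that threshold and assumes levels are given in dB SPL. This is not Lame's psycho-acoustic model: real encoders also model how loud sounds mask quieter ones nearby. The function names are made up for illustration.

```python
# Toy illustration of one ingredient of a psycho-acoustic model:
# the absolute threshold of hearing (Terhardt's approximation).
# This is NOT the model used by Lame; masking is ignored entirely.
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate threshold of hearing in dB SPL at the given frequency."""
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def audible_components(components):
    """Keep only (frequency_hz, level_db_spl) pairs above the threshold."""
    return [(freq, level) for freq, level in components
            if level > threshold_in_quiet_db(freq)]


# Hypothetical components: a quiet 100 Hz hum, a 1 kHz tone, an 18 kHz tone.
signal = [(100, 10), (1000, 40), (18000, 40)]
print(audible_components(signal))
```

In this example only the 1 kHz tone survives. The quiet hum and the 18 kHz tone fall below the approximate threshold, so a model of this kind would let the encoder spend no bits on them.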