A Detour into Encoding
With digitization becoming the common method of production, one must consider the implications of moving from analog to digital recording techniques. Analog recordings are captured continuously, as the sound happens, onto a physical medium. In principle this gives them an edge in fidelity, because there are no gaps between measurements. However, they are relatively difficult to copy without degradation, and few high-quality analog recordings are made on a durable medium. Digital recordings, on the other hand, are extremely easy to copy, edit, and process. The catch is that digital recording, by definition, samples the signal at discrete instants, from one fraction of a second to the next, and is therefore, at least in theory, inferior to analog recording.
The reality is that sufficiently high-quality digital recordings are audibly indistinguishable from analog ones. Modern hardware is fast enough to sample and process audio at the rates this requires. Because of this, digital recording has become the medium of choice for mainstream audio. Although this has had the unfortunate side effect of making piracy simple and therefore rampant, it has also led to a higher-quality and more durable method of storage. That being said, it’s time for the details.
The baseline standard for audio is PCM, or Pulse Code Modulation. Developed all the way back in the late 1930s, PCM stores each sample as an independent number. In its most familiar form, the one used on audio CDs, that number is 16 bits wide (that’s 65,536 possible amplitude levels to the laymen out there, or 2^16 for us geeks) and samples are taken at a rate of 44,100 per second. Although this is the most thorough method of encoding, because each sample is independent of the others, it works out to 88,200 bytes per second per channel, making it both the bulkiest and the most demanding form of encoding in common use. It is usually just treated as “raw” audio, and used for high-fidelity editing before conversion into a less cumbersome format.
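The figures above can be checked with a few lines of Python. This is only a minimal sketch: it assumes a single channel, signed 16-bit samples, and input amplitudes normalized to the range -1.0 to 1.0. The function names are made up for illustration and do not come from any audio library.

```python
import math

SAMPLE_RATE = 44_100    # samples per second, as in the text
BITS_PER_SAMPLE = 16    # sample width, as in the text


def pcm_levels(bits: int) -> int:
    """Number of distinct values a sample of the given width can hold."""
    return 2 ** bits


def pcm_bytes_per_second(sample_rate: int, bits: int, channels: int = 1) -> int:
    """Uncompressed data rate in bytes per second."""
    return sample_rate * bits * channels // 8


def quantize(amplitude: float, bits: int = BITS_PER_SAMPLE) -> int:
    """Map an amplitude in [-1.0, 1.0] to a signed integer sample.

    Assumption: symmetric scaling by the largest positive code.
    """
    max_code = 2 ** (bits - 1) - 1
    amplitude = max(-1.0, min(1.0, amplitude))  # clip out-of-range input
    return round(amplitude * max_code)


def sine_pcm(freq_hz: float, n_samples: int) -> list[int]:
    """PCM samples of a full-scale sine tone; each one is computed independently."""
    return [
        quantize(math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
        for n in range(n_samples)
    ]


if __name__ == "__main__":
    print(pcm_levels(BITS_PER_SAMPLE))
    print(pcm_bytes_per_second(SAMPLE_RATE, BITS_PER_SAMPLE))
    print(sine_pcm(440.0, 5))
```

Passing channels=2 to pcm_bytes_per_second doubles the rate, which is why stereo recordings take twice the space of the single-channel figure.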
Another common standard is DPCM, or Differential Pulse Code Modulation. This method stores each sample as its difference from the previous sample, and can therefore get by with as little as 4-bit (16 possible values) encoding at the same rate of 44,100 samples per second. Although this leads to much more compact storage and lower resource demand (22,050 bytes per second), its chained referencing and limited range of variation mean that it is susceptible both to data loss (one damaged value throws off everything decoded after it) and to extreme variances (a jump larger than 4 bits can express cannot be followed in a single step).
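The difference-based idea can be sketched as follows. This is an illustrative toy, not any standardized codec: it assumes integer samples, signed 4-bit codes in the range -8 to 7, and a fixed step parameter (an assumption added here) that sets how much signal change one code unit represents.

```python
def clamp(value: int, lo: int, hi: int) -> int:
    """Limit value to the closed range [lo, hi]."""
    return max(lo, min(hi, value))


def dpcm_encode(samples: list[int], bits: int = 4, step: int = 1) -> list[int]:
    """Encode each sample as a clamped difference from the decoder's last value."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    codes = []
    predicted = 0  # what the decoder will have reconstructed so far
    for sample in samples:
        code = clamp(round((sample - predicted) / step), lo, hi)
        codes.append(code)
        predicted += code * step  # track the decoder, not the true input
    return codes


def dpcm_decode(codes: list[int], step: int = 1) -> list[int]:
    """Rebuild samples by accumulating the coded differences."""
    samples = []
    value = 0
    for code in codes:
        value += code * step
        samples.append(value)
    return samples


if __name__ == "__main__":
    gentle = [0, 3, 6, 5, 2]   # small changes fit in 4 bits
    print(dpcm_decode(dpcm_encode(gentle)))
    abrupt = [0, 100]          # a jump too large for one 4-bit step
    print(dpcm_decode(dpcm_encode(abrupt)))
```

The gentle sequence survives the round trip exactly, while the abrupt one is cut off at the largest step the code can express. That second case is the "extreme variance" weakness in miniature.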
One of the most efficient (and therefore most common) refinements is ADPCM, or Adaptive Differential Pulse Code Modulation. This method adapts the size of the steps it uses from moment to moment. Common variants such as IMA ADPCM keep a fixed 4 bits per sample, but shrink the step size when the signal is changing slowly and grow it when the signal is changing quickly, so the same handful of bits can describe both fine detail and large swings. Although it is very efficient for storage, the decoder must track the step size in lockstep with the encoder, and therefore relies on more complex translating code to convert the data back into audio.
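The adaptive idea can be sketched in a simplified form in which the code width stays at 4 bits and the step size changes from sample to sample. The doubling and halving rule, the thresholds, and the starting step below are all assumptions chosen for readability. This is not the IMA ADPCM algorithm or any other standardized variant, which use lookup tables instead.

```python
MIN_STEP, MAX_STEP = 1, 2048   # assumed limits on the step size
START_STEP = 16                # assumed initial step size


def next_step(step: int, code: int) -> int:
    """Grow the step after large codes, shrink it after small ones."""
    magnitude = abs(code)
    if magnitude >= 6:      # near the limit: signal is moving fast
        step *= 2
    elif magnitude <= 1:    # barely moving: refine the resolution
        step //= 2
    return max(MIN_STEP, min(MAX_STEP, step))


def adpcm_encode(samples: list[int], bits: int = 4) -> list[int]:
    """Encode samples as fixed-width codes with an adaptive step size."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    codes, predicted, step = [], 0, START_STEP
    for sample in samples:
        code = max(lo, min(hi, round((sample - predicted) / step)))
        codes.append(code)
        predicted += code * step
        step = next_step(step, code)  # same rule the decoder applies
    return codes


def adpcm_decode(codes: list[int]) -> list[int]:
    """Rebuild samples, re-deriving the step size from the codes alone."""
    samples, value, step = [], 0, START_STEP
    for code in codes:
        value += code * step
        samples.append(value)
        step = next_step(step, code)
    return samples


if __name__ == "__main__":
    jump = [0, 1000, 1000, 1000, 1000, 1000]
    print(adpcm_decode(adpcm_encode(jump)))
```

Because the step size is derived only from codes the decoder also sees, it never has to be stored. The price is that encoder and decoder must apply exactly the same update rule, which is the extra complexity the text refers to.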
It is important to note that all of these methods other than PCM tend to be susceptible to data loss because they reference the previous measurement and merely record the change from it. One way to guard against this is to enclose multiple copies of the data in a single file. Although it defies common sense, a 4-bit ADPCM recording of a particular sound bite is compact enough that enclosing, say, four copies takes no more space than encoding a single 16-bit PCM copy.
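Both points can be illustrated with a short sketch: how a single damaged difference shifts every later sample, and how the storage arithmetic works out. It assumes a fixed 4 bits per differential sample and 16 bits per PCM sample (other code widths change the comparison), and the helper names are invented for this example.

```python
def decode_differences(diffs: list[int], start: int = 0) -> list[int]:
    """Rebuild samples from a chain of differences."""
    samples, value = [], start
    for diff in diffs:
        value += diff
        samples.append(value)
    return samples


def bytes_per_second(sample_rate: int, bits_per_sample: int, copies: int = 1) -> int:
    """Storage rate for a number of redundant copies of one stream."""
    return sample_rate * bits_per_sample * copies // 8


if __name__ == "__main__":
    clean = [1, 2, -1, 3]
    damaged = [1, 7, -1, 3]   # the second difference was corrupted

    good = decode_differences(clean)
    bad = decode_differences(damaged)
    # Per-sample error: zero before the damage, constant after it.
    print([b - g for g, b in zip(good, bad)])

    # Four copies of a 4-bit stream versus one 16-bit PCM stream.
    print(bytes_per_second(44_100, 4, copies=4))
    print(bytes_per_second(44_100, 16))
```

In PCM the same kind of damage would affect only the one sample it landed on. In a differential stream the error persists until the decoder is given a fresh reference point, which is what a redundant copy provides.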