Signals such as speech and image data are stored and processed in computers as files or collections of bits. Big files have many disadvantages in comparison with small files: they fill storage devices, they take longer to transmit to users, and they can overwhelm algorithms designed to draw conclusions from the raw data. Anyone who has patiently waited for an image to be transmitted through a modem appreciates the value of a speedup of 10 times, or even 2 times. It is clearly desirable and sometimes necessary to reduce the number of bits required to describe a signal, but it is equally clear that this must be done with great care if the reduction results in a loss of information. Signal compression or data compression is concerned with reducing the number of bits required to describe a signal to a prescribed accuracy. Because compression is often one of the first processes applied to a digital signal produced by a source of information such as a microphone, sensor, or camera, the combination of quantization and compression is called source coding in the theoretical literature, and its applications to speech and image data are called speech coding and image coding, respectively.
Signal compression is a key technology for multimedia applications on the NII. Many current applications treat sound, images and even video as uncompressed bitstreams, but this situation will change when their use becomes widespread. Storage and bandwidth limitations call for powerful compression methods.
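Compression algorithms fall into two general classes: lossless compression, in which the original signal can be recovered exactly from the compressed bits, and lossy compression, in which some information is discarded and the signal is recovered only approximately.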
The main shortcoming of lossless compression is that the amount of compression is limited, with typical compression on computer data files being about 2:1. Commercial, freeware, and shareware programs for lossless compression abound throughout the NII; they usually succeed in halving memory requirements for text and program files.
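As a rough illustration of lossless compression, the following minimal sketch uses the zlib module from the Python standard library to compress a byte string and recover it exactly. The sample text is a hypothetical, highly repetitive string chosen only for convenience, so its ratio will be far better than the roughly 2:1 quoted above for typical text and program files.

```python
import zlib

# Hypothetical sample data: a short sentence repeated many times.
# Repetition makes this much more compressible than ordinary files.
text = ("Signal compression reduces the number of bits needed "
        "to describe a signal. ") * 50
original = text.encode("ascii")

# Compress at the highest zlib level, then decompress.
compressed = zlib.compress(original, 9)
restored = zlib.decompress(compressed)

# Lossless means the round trip is bit-exact.
ratio = len(original) / len(compressed)
print(f"{len(original)} bytes -> {len(compressed)} bytes ({ratio:.1f}:1)")
print("bit-exact recovery:", restored == original)
```

Running the same code on real text or program files, rather than on this artificial sample, is a simple way to observe how limited lossless ratios usually are.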
Quantization causes loss of information, even if that loss is not perceptible. Quantization can occur in the original A/D conversion, in the digital arithmetic used to transform a signal, or in the mapping of a digital signal with a high bit rate into a compressed digital signal with a low bit rate. Such quantization can operate either on individual samples or pixels (scalar quantization) or on groups, blocks, or vectors of samples or pixels (vector quantization). In his pioneering development of the mathematical theory of communication, Shannon showed that better performance can be achieved by using vector quantization, but scalar quantization is simpler and often adequate. The international JPEG (Joint Photographic Experts Group) standard achieves compression of digitized images by 16:1 with reasonable quality using scalar quantization.
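Scalar quantization as described above can be sketched as a uniform quantizer that maps each sample to the index of the nearest multiple of a step size. This is only a minimal sketch: the sample values and the step size are hypothetical, and practical coders generally use more elaborate quantizers.

```python
import math

def quantize(x, step):
    """Return the integer index of the multiple of step nearest to x."""
    return math.floor(x / step + 0.5)

def dequantize(index, step):
    """Reconstruct the sample value represented by an index."""
    return index * step

samples = [0.12, -0.57, 0.98, 0.33, -0.05]   # hypothetical input samples
step = 0.25                                  # hypothetical step size

indices = [quantize(s, step) for s in samples]
recon = [dequantize(i, step) for i in indices]

# The information lost is the rounding error, at most half a step per sample.
max_err = max(abs(s - r) for s, r in zip(samples, recon))
print(indices)
print(recon)
print(max_err)
```

Only the small integer indices need to be stored or transmitted. A larger step gives fewer distinct indices, and therefore fewer bits, at the cost of a larger reconstruction error.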
Image compression is already widely available on small computers through compressed versions of standard image formats like TIFF and GIF as well as the JPEG standard and Apple QuickTime. The JPEG standard for still imagery uses a discrete cosine transform (DCT) scheme with scalar quantization combined with lossless coding and achieves reasonable quality with 16:1 compression. Software implementations are widely available, and existing special-purpose chips implementing this technique operate at 10 Mbps. Other related standards such as the H.261 standard (popularly referred to as p × 64) and MPEG 2 are now well established, but current standards committees developing future standards such as MPEG 4 for low rate video have not yet agreed on which compression approach to adopt. The MPEG 2 standard for video compression is the basis for the U.S. HDTV broadcast standard currently being tested by the FCC in conjunction with the AC-3 multichannel audio compression algorithm.
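The transform-plus-scalar-quantization idea behind JPEG, mentioned above, can be illustrated with a simplified one-dimensional sketch. This is not the JPEG algorithm itself, which applies a two-dimensional DCT to 8 × 8 pixel blocks, uses a table of per-frequency step sizes, and follows quantization with lossless entropy coding. The pixel values and the single step size below are hypothetical.

```python
import math

N = 8  # block length

def scale(k):
    """Normalization factor that makes the transform orthonormal."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def dct(block):
    """Orthonormal DCT-II of a length-N block."""
    return [scale(k) * sum(x * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                           for n, x in enumerate(block))
            for k in range(N)]

def idct(coeffs):
    """Inverse of dct (orthonormal DCT-III)."""
    return [sum(scale(k) * c * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k, c in enumerate(coeffs))
            for n in range(N)]

block = [52, 55, 61, 66, 70, 61, 64, 73]   # hypothetical row of pixel values
step = 10.0                                # hypothetical quantizer step size

coeffs = dct(block)
# Scalar quantization: round each coefficient to the nearest multiple of step.
indices = [math.floor(c / step + 0.5) for c in coeffs]
# Decoder side: rescale the indices and invert the transform.
recon = [round(v) for v in idct([i * step for i in indices])]

print("quantized indices:", indices)
print("reconstruction:   ", recon)
```

For smooth image data, most of the energy tends to fall in the first few coefficients, so many of the quantized indices are zero or small. The lossless coding stage exploits exactly this.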
Proprietary vector quantization algorithms provide fast decompression of video even on personal computers, and are included in such popular software as Apple QuickTime and IBM Photomotion. The current code-excited linear-predictive (CELP) international speech compression (or speech coding) standards combine sophisticated signal processing with vector quantization, and accomplish compression ratios of about 12:1. Speech coding has enjoyed a recent rebirth in importance because of the growth of telecommunications networks, cellular telephony, and personal communications systems. The coming worldwide mobile telephony systems will add even more importance to this technology. Speech coding also plays a critical role in voice mail and in satellite paging systems, where voice messages are downloaded to pagers. Speech coding is moving into personal computers and hence to immediate NII access. For example, Microsoft's new Windows release (nicknamed Chicago) will have a speech coding feature called TrueSpeech.
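One reason vector quantization decompresses quickly is that decoding reduces to a table lookup, while the search for the best codeword happens only at the encoder. The sketch below illustrates this with a tiny, hypothetical codebook of two-dimensional vectors. It is not the algorithm used by any of the products or standards named above, and practical codebooks are designed from training data.

```python
# Hypothetical codebook: each index stands for one two-dimensional vector.
codebook = [(0.0, 0.0), (1.0, 1.0), (1.0, -1.0), (-1.0, 1.0)]

def encode(vec):
    """Return the index of the codeword nearest to vec (squared error)."""
    return min(range(len(codebook)),
               key=lambda i: sum((v - c) ** 2
                                 for v, c in zip(vec, codebook[i])))

def decode(index):
    """Decoding is a single table lookup."""
    return codebook[index]

# Hypothetical input: pairs of neighboring samples grouped into vectors.
vectors = [(0.9, 1.2), (0.1, -0.2), (-0.8, 0.7), (1.1, -0.9)]

indices = [encode(v) for v in vectors]
recon = [decode(i) for i in indices]
print(indices)
print(recon)
```

With four codewords, each pair of samples is described by a 2-bit index, and the decoder does no arithmetic at all.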
Lossy compression algorithms are important for the rapid transmission of voice and image data across the NII and are widely available. They are often components of far more complex systems such as Mosaic, the increasingly popular software for finding text, sound, images, and video on the World Wide Web through the NII. By providing scalable transmission, they permit the user to balance signal quality against data rate. Without such flexibility, browsing is not possible, and massive archives lose their utility.
Existing popular compression algorithms are based on ideas that have matured over the last 10-15 years in academia and industry. During recent years the emphasis in the compression community has shifted from developing new techniques to implementing time-proven techniques in software and hardware, and to developing and establishing standards that achieve interoperability of equipment. Of course, as existing systems improve and new compression systems are devised, these standards will be modified and updated. At the same time, the choice of these standards will have a major impact on the future development and growth of various bandwidth compression techniques. These standards have already had a tremendous influence on the "digitalization" of the consumer electronics world, and they have led to digital HDTV, where the U.S. currently leads. In high quality audio coding (CD-quality digital audio compression), advances in filter banks, perceptual coding, and masking have led to standards such as MUSICAM, which in turn will permit digital audio broadcast to replace FM radio. The technology of high-speed and high-resolution analog/digital conversion, especially in digital implementations, has been instrumental in all applications whose data originates in the real world.