Perceptual Coding





Compression schemes often operate on raw signal values, such as the amplitude of speech at a specific instant (a sample) or the intensity of an image at a specific location (a pixel), without regard to how the reproduced signal will be heard or seen by a human user. This is appropriate for data such as measurements or text, but it fails to exploit potentially useful information when the reconstructed signal is intended for human perception. If, for example, greater compression can be achieved at the cost of only those losses that are imperceptible to the human ear or eye, then a lossy system can appear to perform as well as a lossless system that achieves far less compression. Compression methods that take advantage of these perceptual phenomena are referred to collectively as perceptual coding, and seminal work during the past decade promises significant improvements in compression.

Perceptual coding can be accomplished in a variety of ways, but it usually relies on models of human perception, such as a model of the human auditory system or of the human visual system. These models can be quite complex and their incorporation into compression algorithms quite involved, often requiring cooperative work among psychologists, computer scientists, and engineers. The potential gains have been estimated at 10-50% improvements in compression efficiency with no perceptual distortion.

One approach is to use the perceptual model to transform the raw data into features deemed important for perception. It is these features that are then explicitly compressed and used to reconstruct the signal. Another approach is to incorporate the perceptual knowledge into the measures of distortion and fidelity used to design the codes. Regardless of the specific method, sensible incorporation of quantitative aspects of human perception is likely to provide substantial improvements in compression performance for speech, audio, images, and video with a modest increase in cost or complexity.
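
The second approach, building perceptual knowledge into the distortion measure, can be illustrated with a minimal sketch. It assumes the signal has already been split into a few transform coefficients (for example, frequency bands) and that each band carries a sensitivity weight. The function name weighted_distortion and all weights and values below are hypothetical, chosen only for illustration and not taken from any real auditory or visual model.

```python
def weighted_distortion(original, reconstructed, weights):
    """Mean of squared errors, each scaled by a perceptual sensitivity weight."""
    if not (len(original) == len(reconstructed) == len(weights)):
        raise ValueError("all inputs must have the same length")
    total = sum(w * (x - y) ** 2
                for x, y, w in zip(original, reconstructed, weights))
    return total / len(original)


# Hypothetical sensitivities: the first band is assumed to matter most to
# the listener or viewer, the last band is assumed to be nearly imperceptible.
weights = [1.0, 0.5, 0.1, 0.01]
uniform = [1.0, 1.0, 1.0, 1.0]  # reduces the measure to ordinary MSE

original = [10.0, 8.0, 6.0, 4.0]
error_in_sensitive_band = [9.0, 8.0, 6.0, 4.0]    # error of 1 in band 0
error_in_insensitive_band = [10.0, 8.0, 6.0, 3.0]  # error of 1 in band 3

# Ordinary mean squared error cannot tell the two reconstructions apart.
mse_a = weighted_distortion(original, error_in_sensitive_band, uniform)
mse_b = weighted_distortion(original, error_in_insensitive_band, uniform)

# The weighted measure penalizes the error in the sensitive band far more.
d_a = weighted_distortion(original, error_in_sensitive_band, weights)
d_b = weighted_distortion(original, error_in_insensitive_band, weights)

print(mse_a, mse_b)
print(d_a, d_b)
```

A code designed to minimize the weighted measure would tend to spend its bits where errors are assumed to be noticeable and tolerate larger errors elsewhere, which is the sense in which perceptual knowledge can enter the design of the code itself.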






Vijay K. Madisetti
Mon Jan 30 11:05:18 EST 1995