S. Voran and S. Wolf, "Objective Estimation of Video and Speech Quality to Support Network QoS Efforts", 1st Department of Energy / Internet2 Quality of Service Workshop, Houston, TX, February 2000.

One of the questions that ongoing QoS efforts seek to answer is: "Given fixed network resources, how does one provide the highest possible quality of service to the maximal number of users in a fair way, even when those users are generating competing real-time and non-real-time network traffic?" Two important sources of real-time network traffic are speech transmission applications and video transmission applications. This presentation describes perception-based algorithms that estimate received speech or video quality as it would be judged by a user. We believe that these algorithms could provide valuable feedback for the evaluation and optimization of the lower-level protocols that modify per-hop behaviors to meet QoS requirements. We propose that these algorithms be considered as a partial answer to the stated open Internet2 design space question: "How should the network and/or applications be instrumented to provide the measurements necessary to debug, audit, and analyze the performance of new DiffServ services?"

The perception-based estimation algorithms work by comparing transmitted speech and video streams with received speech and video streams. Both streams are transformed in ways that extract auditory or visual features of high perceptual relevance. The transformed streams are then compared using mathematical processes that seem to emulate the comparison processes that humans perform. The resulting numerical values indicate how far apart the transmitted and received streams are, in a perceptual sense, and are closely related to the perceived quality of the received stream. The estimation algorithms have been validated against numerous formal subjective speech quality tests and video quality tests. The estimators show good correlation with subjective test results for a broad range of speech and video transmission systems. Continuing work includes the extension of the speech quality estimation algorithm to deal properly with delay jitter, and the development of a software video quality measurement toolbox for distribution to industry.
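
The compare-after-transform structure described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the feature used here (per-frame log energy), the 160-sample frame length, the exponential mapping, and all function names are assumptions chosen only to show the general shape of a full-reference estimator, in which both streams pass through the same transform and a distance between the transformed streams is mapped to a quality value.

```python
import math


def frame_log_energies(samples, frame_len=160):
    """Hypothetical feature transform: log energy (dB) of each non-overlapping frame.

    A real perceptual transform would model hearing or vision far more closely;
    log energy is only a stand-in. frame_len=160 assumes 20 ms at 8 kHz.
    """
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # Small constant avoids log10(0) on silent frames.
        features.append(10.0 * math.log10(energy + 1e-10))
    return features


def perceptual_distance(transmitted, received, frame_len=160):
    """Mean absolute difference between the transformed streams (in dB)."""
    tx = frame_log_energies(transmitted, frame_len)
    rx = frame_log_energies(received, frame_len)
    n = min(len(tx), len(rx))
    if n == 0:
        raise ValueError("signals are shorter than one frame")
    return sum(abs(a - b) for a, b in zip(tx[:n], rx[:n])) / n


def estimated_quality(distance, scale=10.0):
    """Hypothetical mapping from distance to a 0..1 score (1.0 means identical).

    The scale value is an arbitrary assumption, not a fitted parameter.
    """
    return math.exp(-distance / scale)


# Usage: a 440 Hz tone sampled at 8 kHz, and a copy attenuated by half.
transmitted = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(1600)]
received = [0.5 * x for x in transmitted]

distance = perceptual_distance(transmitted, received)
quality = estimated_quality(distance)
```

In practice, the mapping from distance to estimated quality would be fitted to subjective test scores rather than fixed by hand, and the two streams would need to be time-aligned before comparison, a step this sketch omits.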