Browsing by Subject "Perceptual quality"
Now showing 1 - 2 of 2
Item: Cross-layer perceptual optimization for wireless video transmission (2013-12)
Author: Abdel Khalek, Amin Nazih
Advisors: Heath, Robert W., Jr., 1973-; Caramanis, Constantine

Bandwidth-intensive video streaming applications occupy an overwhelming fraction of traffic on bandwidth-limited wireless networks. Compressed video data are highly structured, and the psycho-visual perception of distortions and losses depends closely on that structure. This dissertation exploits the inherent structure of video data to develop perceptually optimized transmission paradigms at different protocol layers that improve video quality of experience, introduce error resilience, and support more video users.

First, we consider the problem of network-wide perceptual quality optimization, in which video users with (possibly different) real-time delay constraints share wireless channel resources. Because of the inherently stochastic nature of wireless fading channels, we provide statistical delay guarantees using the theory of effective capacity. We derive the resource allocation policy that maximizes the sum video quality and show that, at the optimal operating point for each user, the rate-distortion slope equals the inverse of the supported video source rate per unit bandwidth, termed the source spectral efficiency. We further propose a scheduling policy that maximizes the number of scheduled users that meet their QoS requirements.

Next, we develop user-level perceptual quality optimization techniques for non-scalable video streams. For these streams, we estimate packet loss visibility with a generalized linear model and use it for prioritized packet delivery. We solve the problem of mapping video packets to MIMO subchannels and adapting per-stream rates to maximize the total perceptual value of successfully delivered packets per unit time, and show that the solution jointly yields gains in video quality and latency.
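The per-user optimality condition stated above (rate-distortion slope equal to the inverse of the source spectral efficiency) can be illustrated with a minimal numeric sketch. The logarithmic rate-quality model and all parameter values below are invented for illustration; they are not taken from the dissertation.

```python
# Hypothetical logarithmic rate-quality model: Q(R) = a * log(1 + R / b),
# so dQ/dR = a / (b + R). The abstract's optimality condition is
# dQ/dR = 1 / eta, where eta is the source spectral efficiency
# (supported source rate per unit bandwidth). a, b, eta are assumptions.

def optimal_rate(a, b, eta):
    """Solve dQ/dR = a / (b + R) = 1 / eta for R (closed form)."""
    return max(0.0, a * eta - b)

a, b, eta = 10.0, 2.0, 1.5
r_star = optimal_rate(a, b, eta)
slope = a / (b + r_star)
print(r_star, slope, 1.0 / eta)  # slope equals 1/eta at the optimum
```

For this toy model the condition has a closed form; the dissertation derives the analogous condition for the actual resource allocation problem.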
Optimized packet-stream mapping transmits the more important packets over the more reliable streams, while unequal modulation opportunistically increases the transmission rate on the stronger streams to deliver high-priority packets with low latency. Finally, we develop user-level perceptual quality optimization techniques for scalable video streams. We propose online learning of the mapping between packet losses and quality degradation using nonparametric regression. This quality-loss mapping is then used to provide unequal error protection for the different video layers with perceptual quality guarantees. Channel-aware scalable codec adaptation and buffer management policies together ensure continuous high-quality playback. Across these contributions, analytical results and video transmission simulations demonstrate the value of perceptual optimization in improving video quality and capacity.

Item: Perceptual quality prediction of social pictures, social videos, and telepresence videos (2022-07-01)
Author: Ying, Zhenqiang
Advisors: Bovik, Alan C. (Alan Conrad), 1958-; Ghadiyaram, Deepti; De Veciana, Gustavo; Wang, Atlas; Geisler, Wilson S.

The unprecedented growth of online social media venues and rapid advances by camera and mobile device manufacturers have led to the creation and consumption of a limitless supply of images and videos. Given the tremendous prevalence of Internet images and videos, monitoring their perceptual quality has become a high-stakes problem. This dissertation focuses on perceptual quality prediction for social pictures, social videos, and telepresence videos: constructing datasets of images and videos with perceptual quality labels, and designing algorithms that accurately predict perceptual quality.
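As an aside on the nonparametric quality-loss mapping mentioned above: one simple nonparametric regressor is a Nadaraya-Watson kernel estimate. The sketch below illustrates that idea only; the training pairs, kernel choice, and bandwidth are all invented, and the dissertation's actual online-learning formulation may differ.

```python
import math

# Minimal kernel-regression sketch of a quality-loss mapping:
# given (packet loss rate, observed quality degradation) pairs,
# estimate the degradation at an unseen loss rate.
# All data below are hypothetical.

def kernel_regress(x_query, xs, ys, bandwidth=0.05):
    """Nadaraya-Watson estimate of quality drop at a given loss rate."""
    weights = [math.exp(-((x_query - x) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

losses = [0.00, 0.02, 0.05, 0.10, 0.20]   # packet loss rates (assumed)
drops  = [0.0,  0.5,  1.4,  3.0,  6.5]    # quality degradation (assumed)

print(kernel_regress(0.07, losses, drops))  # interpolated degradation
```

Because the estimator needs no parametric form, new (loss, degradation) observations can simply be appended to the training set, which is what makes a nonparametric approach attractive for online learning.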
While considerable effort has been devoted to predicting the perceptual quality of synthetically distorted images and videos, real-world images and videos contain complex, composite mixtures of multiple distortions that are non-uniformly distributed across space and time. The primary goal of my research is to design automatic image and video quality predictors that can effectively handle these widely diverse authentic distortions. To develop effective quality predictors, we trained deep neural networks on large-scale databases of authentically distorted images and videos. To exploit the non-uniformity of distortions when predicting quality, we collected quality labels both for whole images and videos and for patches and clips cropped from them.

For social images, we built the LIVE-FB Large-Scale Social Picture Quality Database, containing about 40K real-world distorted pictures and 120K patches, on which we collected about 4M human judgments of picture quality. Using these picture and patch quality labels, we built deep region-based models that produce state-of-the-art global picture quality predictions as well as useful local picture quality maps. Our innovations include picture quality prediction architectures that produce global-to-local inferences as well as local-to-global inferences (via feedback).

For social videos, we built the Large-Scale Social Video Quality Database, containing 39K real-world distorted videos, 117K space-time localized video patches, and 5.5M human perceptual quality annotations. Using this database, we created two unique blind video quality assessment (VQA) models: (a) a local-to-global region-based blind VQA architecture (called PVQ) that learns to predict global video quality and achieves state-of-the-art performance on three video quality datasets, and (b) a first-of-its-kind space-time video quality mapping engine (called PVQ Mapper).
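The local-to-global idea above can be caricatured as pooling per-patch quality scores into one global score. The weighted-average pooling rule, scores, and weights below are illustrative assumptions; the region-based models in the dissertation learn this aggregation with deep networks rather than using a fixed rule.

```python
# Hedged sketch of local-to-global quality pooling: a region-based model
# yields per-patch quality scores plus per-patch importance weights
# (e.g., learned relevance); the global prediction aggregates them.
# Scores, weights, and the pooling rule are all hypothetical.

def pool_global_quality(patch_scores, patch_weights):
    """Weighted aggregation of local patch scores into one global score."""
    total = sum(patch_weights)
    return sum(s * w for s, w in zip(patch_scores, patch_weights)) / total

scores  = [72.0, 55.0, 80.0, 64.0]   # per-patch quality (hypothetical)
weights = [1.0, 2.0, 0.5, 1.5]       # per-patch importance (hypothetical)
print(pool_global_quality(scores, weights))
```

The same structure runs in reverse for global-to-local inference: a global prediction constrains, via feedback, what the local quality map can plausibly look like.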
For telepresence videos, we mitigated the dearth of subjectively labeled telepresence data by collecting 2K telepresence videos from different countries, on which we crowdsourced 80K subjective quality labels. Using this new resource, we created a first-of-its-kind online video quality prediction framework for live streaming, based on a multi-modal learning framework with separate pathways that compute visual and audio quality predictions. Our all-in-one model provides accurate quality predictions at the patch, frame, clip, and audiovisual levels, and achieves state-of-the-art performance on both existing quality databases and our new database at considerably lower computational expense, making it an attractive solution for mobile and embedded systems.
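The separate-pathway idea can be sketched as late fusion of independently computed visual and audio quality scores. The convex-combination fusion rule and its weight are assumptions for illustration only; the dissertation's model learns the pathways and their fusion with deep networks.

```python
# Hedged sketch of audiovisual late fusion: visual and audio quality are
# predicted by separate pathways, then combined into one score.
# The fusion rule and weight alpha are invented for illustration.

def fuse_av_quality(visual_q, audio_q, alpha=0.7):
    """Convex combination of visual and audio quality predictions."""
    return alpha * visual_q + (1.0 - alpha) * audio_q

print(fuse_av_quality(68.0, 80.0))  # audiovisual quality estimate
```

Keeping the pathways separate lets each modality be computed (or skipped) independently, which matters for the low computational budgets of mobile and embedded deployments.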