Quality-of-Experience of Adaptive Video Streaming:
Exploring the Space of Adaptations


With the remarkable growth of adaptive streaming media applications, especially the wide usage of dynamic adaptive streaming schemes over HTTP (DASH), it becomes ever more important to understand the perceptual quality-of-experience (QoE) of end users, who may constantly experience adaptations (switchings) of video bitrate, spatial resolution, and frame rate from one time segment to another on a scale of a few seconds. This is a sophisticated and challenging problem, for which existing visual studies provide very limited guidance. Here we build a new adaptive streaming video database and carry out a series of subjective experiments to understand human QoE behaviors in this multi-dimensional adaptation space. Our study leads to several useful findings. First, our path-analytic results show that the quality deviation introduced by quality adaptation is asymmetric with respect to the adaptation direction (positive or negative), and is further influenced by the intensity of quality change (intensity), the dimension of adaptation (type), the intrinsic video quality (level), the content, and the interactions between them. Second, we find that for the same intensity of quality adaptation, a positive adaptation occurring in the low-quality range has more impact on QoE, suggesting an interesting Weber's law effect, whereas this phenomenon is reversed for a negative adaptation. Third, existing objective video quality assessment models are very limited in predicting time-varying video quality. The video database together with the subjective data will be made available to the public.

    author    = {Duanmu, Zhengfang and Ma, Kede and Wang, Zhou}, 
    title     = {Quality-of-Experience of Adaptive Video Streaming: Exploring the Space of Adaptations}, 
    booktitle = {ACM Multimedia},
    year      = {2017}

Dataset Description

The Waterloo Streaming Quality-of-Experience Database-III consists of 12 RAW HD reference videos covering diverse content. An 8-second video segment is extracted from each source video and further partitioned into two non-overlapping 4-second segments, referred to as short segments (SS). We encode each SS into 7 representations with an H.264 encoder using three compression levels, spatial resolutions, and frame rates. To simulate quality adaptation events in adaptive streaming, we concatenate two consecutive 4-second segments with different representations from the same content into an 8-second long segment (LS). In total, we obtain 168 SS and 588 LS. The study involves 36 subjects, who score the quality of each video sequence on the eleven-grade (0-10) numerical quality scale suggested in ITU-T Recommendation P.910.
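
The reported totals can be checked with a short enumeration. The sketch below is illustrative only: the index tuples and function names are hypothetical, not part of the released database. It also rests on an inference the text does not state: the figure of 588 LS matches every ordered pair of the 7 representations per content, including pairs where both halves use the same representation (12 x 7 x 7 = 588), whereas strictly different pairs would give 504.

```python
from itertools import product

# Figures taken from the dataset description.
NUM_CONTENTS = 12          # reference videos
SEGMENTS_PER_CONTENT = 2   # two 4-second short segments per 8-second clip
NUM_REPRESENTATIONS = 7    # encoded representations per short segment


def enumerate_short_segments():
    """Return one (content, segment, representation) tuple per SS."""
    return [
        (content, segment, rep)
        for content in range(NUM_CONTENTS)
        for segment in range(SEGMENTS_PER_CONTENT)
        for rep in range(NUM_REPRESENTATIONS)
    ]


def enumerate_long_segments(include_identical=True):
    """Return one (content, rep_first, rep_second) tuple per LS.

    Assumption: an LS pairs the first SS of a content in one representation
    with the second SS of the same content in another representation.
    With include_identical=True, pairs with no adaptation are also counted.
    """
    pairs = product(range(NUM_REPRESENTATIONS), repeat=2)
    if not include_identical:
        pairs = (p for p in pairs if p[0] != p[1])
    pairs = list(pairs)
    return [
        (content, first, second)
        for content in range(NUM_CONTENTS)
        for first, second in pairs
    ]


if __name__ == "__main__":
    print(len(enumerate_short_segments()))
    print(len(enumerate_long_segments()))
    print(len(enumerate_long_segments(include_identical=False)))
```

Only the count that includes identical pairs reproduces the reported 588, which suggests the LS set also contains sequences without a representation switch.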

Experiment Methodology

We carry out three subjective experiments, as illustrated in the figure above. In Experiment I, subjects rate the quality of each SS; the subjective rating of each SS is defined as its intrinsic quality. Experiment II is performed on LS: subjects give two opinion scores (post-hoc quality), one for the first and one for the second 4-second video segment (referred to as SS-I and SS-II, respectively). An audio stimulus is played in the middle of each LS to mark the end of SS-I and the start of SS-II. In Experiment III, subjects watch the LS and provide a single score that reflects their overall QoE.
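
To make the relation between Experiments I and II concrete, the following minimal sketch computes a quality deviation for one segment. It rests on assumptions not given on this page: ratings are aggregated by a plain mean over subjects with no outlier screening, and the deviation is taken as the post-hoc score minus the intrinsic score of the same SS in the same representation. The function names and the sample ratings are hypothetical.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 0, 10  # eleven-grade scale used in the study


def mean_opinion_score(ratings):
    """Average the per-subject ratings after checking they lie on the scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if not SCALE_MIN <= rating <= SCALE_MAX:
            raise ValueError(f"rating {rating} is outside the 0-10 scale")
    return mean(ratings)


def quality_deviation(intrinsic_ratings, post_hoc_ratings):
    """Assumed definition: post-hoc quality minus intrinsic quality.

    intrinsic_ratings: Experiment I scores for a short segment shown alone.
    post_hoc_ratings:  Experiment II scores for the same segment when it
                       appears inside a long segment.
    A negative value means the segment was rated lower in context.
    """
    return mean_opinion_score(post_hoc_ratings) - mean_opinion_score(intrinsic_ratings)


if __name__ == "__main__":
    # Made-up ratings, for illustration only.
    intrinsic = [6, 7, 8]
    post_hoc = [4, 5, 6]
    print(quality_deviation(intrinsic, post_hoc))
```

Under this assumed definition, comparing the deviation of SS-II across positive and negative adaptations is one simple way to inspect the asymmetry described in the abstract.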


- Intrinsic Quality vs. Post-hoc Quality
- Overall Performance