Large-Scale Peer-Assisted Online Hosting, Distribution and Video Streaming Systems: Design, Modeling and Practice

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Large-Scale Peer-Assisted Online Hosting, Distribution and
Video Streaming Systems: Design, Modeling and Practice"

By

Mr. Fangming Liu


Abstract

Large-scale peer-assisted content distribution systems within the "cloud" 
of the Internet have provided valuable services to a large population of 
end users, ranging from file sharing and live video streaming to
video-on-demand (VoD). Both academia and industry have devoted great
attention to this area. With users not only downloading data but also
uploading data to one another, such peer-assisted systems are easy to 
deploy and have good scalability. However, due to the highly dynamic 
nature of distributed peers with heterogeneous capacities and diverse 
behaviors, there still remain several fundamental challenges in 
large-scale peer-assisted content distribution and video streaming 
systems, with respect to the cost-performance trade-off in peer-assisted
online hosting and distribution, the flash crowd problem in P2P live
streaming, and the service quality of peer-assisted VoD. This
thesis seeks to address these challenges through not only mathematical 
modeling and analysis, but also practical system design and measurement, 
in order to bridge theory and practice.

First, to guarantee adequate levels of service quality while avoiding
prohibitive server costs, we explore the design space of online
hosting and distribution systems that integrate peer bandwidth 
contributions with strategic server resource provisioning in a 
complementary and transparent manner. Specifically, we first model and 
analyze new strategies to allocate scarce server resources --- including 
both storage space and bandwidth --- in peer-assisted online hosting 
systems. The objective is to maximize the use of limited server storage 
and bandwidth resources to guarantee adequate levels of file availability 
and downloading performance, while taking full advantage of peer 
assistance. We identify a number of unique challenges involved in such 
systems, and propose our design of resource allocation protocols to 
address these challenges. Based on the guidelines derived from our 
analysis, we design and measure FS2You, a large-scale and real-world 
online hosting system with peer bandwidth assistance and semi-persistent 
file availability. FS2You is designed to dramatically mitigate server 
bandwidth costs. We present our architectural and protocol design, as well 
as an extensive measurement study at a large scale to demonstrate the 
effectiveness of our design, using real-world traces that we have 
collected. To our knowledge, our work represents the first attempt to 
design, implement, and evaluate a new peer-assisted semi-persistent online 
hosting system at a realistic scale. Since its launch, FS2You has
quickly become one of the most popular online hosting systems in mainland
China, and a favorite in many online forums across the country.

Second, it is evident from our experience with real-world P2P live
streaming systems that it is not uncommon to have hundreds of thousands
of users trying to join a program in the first few minutes of a live
broadcast. This phenomenon, unique to live streaming systems and referred
to as the flash crowd, poses significant challenges to system scalability
and user experience. We develop a mathematical model to capture and
understand the inherent relationship between time and scale in P2P 
streaming systems under the flash crowd. Specifically, we show that there 
is a fundamental upper bound on the system scale with respect to a time 
constraint. In addition, our analysis yields an in-depth
understanding of the effects of the gossip protocol and peer churn. To our
knowledge, our work represents the first attempt to provide an analytical 
characterization and understanding of the inherent scale-time relationship 
in P2P streaming systems, with a particular focus on the flash crowd and 
various critical factors.

Third, due to the lack of a theoretical foundation and of new storage and
transmission mechanisms, the service qualities --- including the video
streaming bit rates and the startup and seek latencies --- provided by
current peer-assisted VoD systems are still far from optimal. In practice,
we design, implement, fine-tune and measure Novasky, a real-world VoD
system capable of delivering cinematic-quality video streams to end users. 
The foundation of the Novasky design is a P2P storage cloud, storing and 
refreshing media streams in a decentralized fashion using local storage 
spaces of end users. Unlike existing peer-assisted VoD systems, it
features a new peer storage and replacement algorithm using Reed-Solomon 
codes and an adaptive server push-to-peer strategy. Novasky has been 
deployed in the Tsinghua University campus network, operational since 
September 2009, attracting 10,000 users to date, and providing over 1,000 
DVD or 720p video streams with bit rates of 1-2 Mbps. Based on
real-world traces collected over 6 months, we show that Novasky can
achieve rapid startups within 4-9 seconds, and extremely short seek
latencies within 3 seconds.

Furthermore, we develop a theoretical framework based on queueing models, 
in order to (1) justify the superiority of service prioritization based on 
a taxonomy of requests, and (2) understand the fundamental principles 
behind optimal caching and prefetching designs in peer-assisted VoD 
systems. The focus is on how limited upload bandwidth
resources and peer caching capacities can be utilized most efficiently to
achieve better system performance. Specifically, we first use priority
queueing analysis to prove how service quality and user experience can be
statistically guaranteed by prioritizing requests in order of
significance, including urgent playback (e.g., random seeks or initial 
startup), normal playback, and prefetching. We then proceed to construct a 
fine-grained stochastic supply-demand model to investigate peer caching 
and prefetching as a global optimization problem. This not only
provides insight into the fundamental characteristics of
demand, but also offers guidelines towards optimal caching and prefetching
strategies in peer-assisted VoD systems.


Date:			Tuesday, 25 January 2011

Time:			2:00pm – 4:00pm

Venue:			Room 3501
 			Lifts 25/26

Chairman:		Prof. Jingshen Wu (MECH)

Committee Members:	Prof. Bo Li (Supervisor)
 			Prof. Lin Gu
 			Prof. Qian Zhang
 			Prof. Danny Tsang (ECE)
 			Prof. Jiannong Cao (Computing, PolyU.)


**** ALL are Welcome ****