The Hong Kong University of Science and Technology
Department of Computer Science

PhD Thesis Defence

"Characterizing Web Linking and Usage with Hierarchical Models"

By

Mr. Wenwu Lou

Abstract

After a decade of rapid growth, the World Wide Web, or the Web for short, has become a centerpiece of information exchange for humankind. The rise of the Web has encouraged a surge of research that seeks to measure and understand the Web in terms of its structural properties, evolutionary dynamics and navigational patterns. The primary goal of this work is to advance existing studies of the Web using a graph model that incorporates the inherent hierarchical organization of the Web, a characteristic that has often been overlooked in previous studies.

We start by proposing a colored graph model for the Web, which captures both the explicit hyperlink structure and the latent hierarchical organization of the Web at the same time. We then present empirical findings that give evidence of the influence of the latent structure on the formation of the explicit hyperlink structure of the Web. We further provide theoretical explanations for these findings, using a new class of random graph models in which the evolution of the Web is related to the latent structures intrinsic to the Web itself. Finally, in the context of Web proxy mining, we show that the latent hierarchical structure of the Web also imposes regularities on Web users' navigational behavior, e.g., locality in Web reference, and creates new opportunities for improving the effectiveness and scalability of Web usage-mining algorithms and applications.

The Web is just one example of a wide variety of real-world systems that are web-like in nature and have latent classes or hierarchical structures embedded in them.
In this larger context, this research establishes a clean framework for incorporating latent structures in measuring, understanding, and therefore simulating the structural properties and evolutionary dynamics of such systems, including the Internet, citation networks and social networks.

Date:     Friday, 3 June 2005
Time:     2:30 p.m. - 4:30 p.m.
Venue:    Room 2302, Lifts 17-18

Chairman: Prof. Xudong Xiao (PHYS)

Committee Members:
    Prof. Qiang Yang (Supervisor)
    Prof. Dik-Lun Lee
    Prof. Frederick Lochovsky
    Prof. Hong Xue (BICH)
    Prof. David Wai-Lok Cheung (Comp. Sci., HKU)
    Prof. Jeffrey Xu Yu (Sys. Engg. & Engg. Mgmt., CUHK)

**** ALL are Welcome ****