A Survey on Approaches for Large-Scale Dataset Analysis
PhD Qualifying Examination

Title: "A Survey on Approaches for Large-Scale Dataset Analysis"

by

Miss Wuman LUO

Abstract:

Industrial and scientific datasets have grown enormously in size and complexity in recent years. Many scientific and industrial users already manage, or will soon manage, petabytes of data. An important topic addressed by both communities over the last several years is large-scale dataset analysis in a shared-nothing architecture on large clusters of commodity hardware. The foremost requirements of large-scale dataset analysis are scalability, sustained performance, flexibility, and high availability.

This paper surveys the main approaches to large-scale dataset analysis: parallel RDBMSs, MapReduce, and specialized scientific databases. We first present a comparative study of parallel RDBMSs, MapReduce, and their hybrid approaches. We then discuss the approaches used in scientific data analysis, highlighting the special requirements of this field and the corresponding solutions. We also point out open research problems and challenges as future work.

Date: Thursday, 25 February 2010

Time: 3:00pm - 5:00pm

Venue: Room 4480 (lifts 25/26)

Committee Members:
Prof. Lionel Ni (Supervisor)
Dr. Qiong Luo
Dr. Lei Chen
Dr. Qian Zhang

**** ALL are Welcome ****