The New HDFS Features in Apache Hadoop v2
Speaker: Dr. Tsz-Wo Nicholas Sze, Technical Staff at Hortonworks
Title: "The New HDFS Features in Apache Hadoop v2"
Date: Monday, 10 Feb 2014
Time: 4:00pm - 5:00pm
Venue: Lecture Theater F (near lifts 25/26), HKUST

Abstract:
Apache Hadoop v2.2.0, the GA release of Hadoop v2, offers several significant HDFS improvements, including a new append pipeline, federation, NameNode HA, snapshots, wire compatibility, an NFS interface, and further performance improvements. In this talk, we first give a brief introduction to Hadoop and then discuss some of these new features in detail. The append feature has been added to HDFS, and the write pipeline has been improved dramatically to provide better durability, visibility, and consistency guarantees. Federation uses multiple independent NameNodes and namespaces to scale the name service horizontally. NameNode HA addresses the problem of the NameNode being a single point of failure in an HDFS cluster. Snapshots are read-only point-in-time copies of the file system that support "time travel" in big data. We also describe some of the development that is underway for the next release, as well as some future work.

Biography:
Dr. Tsz-Wo Nicholas Sze is a Member of Technical Staff at Hortonworks and a Member of the Project Management Committee of Apache Hadoop. His interests include distributed computing, algorithms, and mathematical analysis. He started contributing to Hadoop in 2007. Two of his recent Hadoop contributions are HDFS Snapshots and WebHDFS. In 2010, he set a new world record in the computation of Pi using Hadoop on Yahoo's clusters. He received his Ph.D. degree in Computer Science from the University of Maryland, College Park in 2007, and his M.Phil. and B.Eng. degrees from the Hong Kong University of Science and Technology in 2001 and 1999, respectively.