Making sense of software documentation with natural language processing
======================================================================
                            Joint Seminar
======================================================================

The Hong Kong University of Science & Technology

Dept. of Computer Science and Engineering
Human Language Technology Center
-----------------------------------------------------------------------

Speaker:        Dr. Christoph Treude
                University of Adelaide

Title:          "Making sense of software documentation with natural
                language processing"

Date:           Monday, 18 April 2016

Time:           4:00pm - 5:00pm

Venue:          Lecture Theater F (near lifts 25 & 26), HKUST

Abstract:

Knowledge management plays a central role in many software development
organizations. While much of the important technical knowledge can be
captured in documentation, there often exists a gap between the
information needs of software developers and the documentation structure.
To help developers access documentation more effectively, we are
developing approaches based on natural language processing to
automatically analyze and repackage software documentation into formats
that are more amenable to the readers of documentation. This talk will
focus on two such approaches:

First, I will present TaskNav, a user interface for search queries that
suggests tasks automatically extracted from documentation in an
auto-complete list along with concepts, code elements, and section
headers. In a field study, we found search results identified through
extracted tasks to be more helpful to developers than those found through
concepts, code elements, and section headers.

Second, I will present SISE, a machine learning based approach to
automatically augment API documentation with "insight sentences" from
Stack Overflow -- sentences that are related to a particular API type and
that provide insight not contained in the API documentation of that type.
In a comparative study, we found that SISE resulted in the highest number
of sentences that were considered to add useful information not found in
the API documentation compared to several baseline approaches.

These results indicate that natural language processing can be used to
analyze and repackage software documentation automatically, and that it
can help bridge the gap between documentation structure and the
information needs of software developers.

********************
Biography:

Christoph Treude received his Diploma degree in computer
science/management information systems from the University of Siegen,
Germany, and his PhD degree in computer science from the University of
Victoria, Canada. After postdocs in Canada and Brazil, he is now working
as a faculty member in the School of Computer Science, University of
Adelaide, Australia. His research interests include empirical software
engineering, natural language processing, and social media.