Intelligent Sampling over Wireless Sensor Networks
PhD Thesis Proposal Defence

Title: "Intelligent Sampling over Wireless Sensor Networks"

by

Miss Yongzhen Zhuang

Abstract:

Nowadays, wireless sensor networks are widely used in monitoring applications, such as monitoring farm belts, large-scale habitats, and environmental disasters. The tiny distributed sensors supply applications with samples of environmental parameters, for example temperature, humidity, gas pressure, and decibel levels. Since sensors are always constrained by their limited battery power, it is of great importance to use a large number of sensors and their samples efficiently according to the application's needs. More specifically, we discuss two types of sampling techniques, temporal and spatial.

Temporal sampling decides how many samples a sensor should obtain to, for example, process a query, clean the data, or detect an event. In other words, temporal sampling adjusts the sampling rates to achieve sufficient accuracy for the application. Tiny, imprecise sensors are known to produce low-quality data and are easily affected by random noise, so on their own they cannot provide bounded accuracy for the applications built upon them. One way to increase the accuracy is multiple sampling, which is the most effective and straightforward technique for improving data quality. The basic idea is to obtain multiple samples from either one or several sensors and to use their average in the application. Enforcing multiple sampling in sensor network applications is not trivial, because the sensed data are distributed time series and the sensors have to collaborate to achieve the overall accuracy requirement of the application. Our goal is to use an appropriate number of samples at each sensor, depending on the application requirements. We mainly focus on three applications: query processing, data cleaning, and event detection.
In the query application, we build a query plan that minimizes the sampling cost while satisfying the accuracy constraint. In the data cleaning application, we use a weighted moving average, with weights derived from a sampling process, to efficiently reconstruct the true data at low sampling cost. In the event detection application, we use range prediction and a hypothesis test to locate events in real time with high confidence.

Spatial sampling, on the other hand, focuses on selecting a subset of sensors to be used in a monitoring application. In many applications, spatial sampling can dramatically reduce energy consumption because the sensors that are not selected can switch themselves to a low-cost sleep mode. Spatial sampling has many variants, depending on the application requirements. For example, clustering-based sampling, such as snapshot sampling, uses a representative sensor to substitute for similar sensors nearby. Our ongoing work on spatial sampling mainly focuses on the following directions:

(1) We model the event detection application as regional pattern queries and propose a spatial sampling approach based on random walks to find the pattern regions. We show that it is more efficient to walk along a random path to locate the pattern region than to invoke many irrelevant sensors unnecessarily, especially when the monitoring field is huge and the pattern regions are rare.

(2) In many applications, the placement of the sensors can be quite irregular owing to the terrain of the monitoring area (steep and rocky, moderately inclined, or flat and silty) and to various natural effects (such as flood or wind). We propose an unbiased sampling technique that selects and weights sensors so as to provide the application with an unbiased representation, as if the sensors were deployed uniformly.

(3) We present a section sampling technique that groups sensors into random sections. Each section can represent the monitoring field on its own. We can achieve greater energy efficiency by using fewer sections, or higher query accuracy by using more sections.

(4) We propose a new model for the event detection application that defines events as the top-k regions with the highest regional statistics (e.g., regional sum). We try to adopt the sampling techniques to provide an accurate and energy-efficient approximation of the top-k regions.

Date: Thursday, 10 January 2008

Time: 10:00 a.m. - 12:00 noon

Venue: Room 3501, Lifts 25-26

Committee Members: Dr. Lei Chen (Supervisor)
                   Prof. Frederick Lochovsky (Chairperson)
                   Dr. Shing-Chi Cheung
                   Prof. Vincent Shen

**** ALL are Welcome ****