Efficient Transactional Database Storage Management on Flash Solid State Drives
PhD Thesis Proposal Defence
Title: "Efficient Transactional Database Storage Management on Flash Solid State
Drives"
by
Mr. Jun YANG
ABSTRACT:
Flash solid state drives (SSDs), or flash disks, are a type of persistent
storage device with the potential to replace magnetic disks. They outperform
magnetic disks in access speed, bandwidth, shock resistance, and power
efficiency. As their capacity increases and prices decrease, flash disks are
being considered for database storage. Because flash SSDs differ from
magnetic disks, traditional data management techniques designed for
magnetic disks need to be re-examined for flash disks. In particular, the flash
memory used in flash disks has an asymmetry between read and write speeds:
reads, whether random or sequential, are much faster than writes.
This thesis studies the performance of transactional workloads on flash disks
and designs efficient storage schemes for them. Specifically, we study the
performance of the TPC-C workload on flash SSDs. Overall, the flash SSDs
outperform the magnetic disk by up to an order of magnitude. Moreover, the I/O
performance of the SSDs is dominated by random writes, whereas that of the
magnetic disk is dominated by random reads. Additionally, both minimising
logging and adopting MVCC (Multi-Version Concurrency Control) rather than 2PL
(Two-Phase Locking) help improve the performance on flash SSDs.
Observing the dominance of random writes in flash SSDs under TPC-C workloads,
we propose a new database storage layout, called Partitioned Logging (PTL). In
PTL, we replace data writes with logging to eliminate random page writes, and
put data and logs into separate blocks. Moreover, we group data blocks into
partitions so that updates on each partition are appended as log entries to one
log block. This way, we can tune the partition size to balance the read and
write performance based on the hardware and workload characteristics. The
results show a considerable improvement over both the traditional storage and a
leading flash-based database storage scheme.
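The PTL idea described above can be illustrated with a toy in-memory model.
This is only a sketch under stated assumptions, not the design evaluated in
the thesis: the class name PartitionedLogStore, the parameter
pages_per_partition, and the use of Python dicts and lists in place of flash
blocks are all hypothetical. It shows updates being appended to one log per
partition instead of rewriting data pages, and reads merging a page with the
log entries of its partition.

```python
class PartitionedLogStore:
    """Toy model: pages are dicts; each partition owns one append-only log."""

    def __init__(self, pages_per_partition):
        # Hypothetical tuning knob: how many data pages share one log.
        self.pages_per_partition = pages_per_partition
        self.base_pages = {}      # page_id -> {key: value}, never rewritten
        self.partition_logs = {}  # partition_id -> [(page_id, key, value)]

    def partition_of(self, page_id):
        return page_id // self.pages_per_partition

    def update(self, page_id, key, value):
        # An update is appended to the partition's log; the data page is untouched.
        log = self.partition_logs.setdefault(self.partition_of(page_id), [])
        log.append((page_id, key, value))

    def read_page(self, page_id):
        # A read merges the base page with this page's log entries, oldest first.
        page = dict(self.base_pages.get(page_id, {}))
        for logged_page, key, value in self.partition_logs.get(
            self.partition_of(page_id), []
        ):
            if logged_page == page_id:
                page[key] = value
        return page

    def log_entries_scanned(self, page_id):
        # Number of log entries a read of this page has to look at.
        return len(self.partition_logs.get(self.partition_of(page_id), []))
```

In this model, a larger pages_per_partition funnels more updates into each
log, so a read scans more entries, while a smaller value spreads writes over
more logs. That is one plausible reading of the read/write balance the
abstract mentions; the actual trade-off depends on the hardware and workload.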
Finally, for transactional workloads on key-value stores, we propose FlashTKV,
which adopts a purely sequential storage format where all the data and
transactional information are log records. Furthermore, we support MVCC on this
sequential storage efficiently. Our initial results show that FlashTKV improves
the transaction throughput by 70% over two well-known KV-stores under TPC-C
workloads on flash SSDs.
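A purely sequential store with multi-version reads can be sketched as follows.
This is a minimal illustration, not FlashTKV itself: the names
AppendOnlyMVCCStore and Transaction, the record layout, and the integer
commit timestamps are assumptions made for the example, and write-write
conflict detection is deliberately left out. Data records and commit records
are both appended to a single log, and a reader sees the newest version
committed at or before its snapshot.

```python
class AppendOnlyMVCCStore:
    """Toy store: every write and every commit marker is a log record."""

    def __init__(self):
        # Records are ("put", key, value, commit_ts) or ("commit", commit_ts).
        self.log = []
        self.versions = {}       # key -> log offsets of its versions, oldest first
        self.last_commit_ts = 0  # timestamp of the most recent commit

    def begin(self):
        # A transaction reads from the state as of the latest commit.
        return Transaction(self, snapshot_ts=self.last_commit_ts)

    def _commit(self, writes):
        commit_ts = self.last_commit_ts + 1
        for key, value in writes.items():
            self.versions.setdefault(key, []).append(len(self.log))
            self.log.append(("put", key, value, commit_ts))
        self.log.append(("commit", commit_ts))  # transactional info is a record too
        self.last_commit_ts = commit_ts
        return commit_ts

    def _read(self, key, snapshot_ts):
        # Walk versions newest first; return the first one visible to the snapshot.
        for offset in reversed(self.versions.get(key, [])):
            _, _, value, commit_ts = self.log[offset]
            if commit_ts <= snapshot_ts:
                return value
        return None


class Transaction:
    def __init__(self, store, snapshot_ts):
        self.store = store
        self.snapshot_ts = snapshot_ts
        self.writes = {}  # buffered until commit

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store._read(key, self.snapshot_ts)

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        return self.store._commit(self.writes)
```

Because old versions are never overwritten, a reader that began before a
later commit keeps seeing its own snapshot, and nothing in the log is updated
in place.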
Keywords: Flash SSDs, asymmetric I/O, online transaction processing (OLTP),
TPC-C, database storage, log-structured, Key-Value store.
Date: Wednesday, 30 January 2013
Time: 2:00pm - 4:00pm
Venue: Room 3501 (lifts 25/26)
Committee Members: Dr. Qiong Luo (Supervisor)
                   Prof. Dik-Lun Lee (Chairperson)
                   Prof. Frederick Lochovsky
                   Dr. Ke Yi
**** ALL are Welcome ****