DEBII Open Seminar by Dr Mukesh Mohania on "Cloud based Active Archiving Solution for Databases"
Dr Mukesh Mohania
Date: Friday, 30 April 2010
Time: 9:30 AM-11:30 AM (GMT+08:00) Perth.
Location: DEBII Board Room, Enterprise Unit 4, Technology Park
In this talk, we present an active archival solution for data warehouses that uses the Hadoop Distributed File System (HDFS) to store data in an always-available and cost-effective manner. Our system integrates seamlessly with the existing data warehouse system, making the user oblivious to the location and state of the data, i.e., the user can query archived data in the same manner as live data. Since most documented uses of MapReduce-based platforms have dealt with unstructured data, we started by performing evaluations to compare ways of retrieving structured data stored in Apache Hadoop, an open-source implementation of MapReduce. We discuss various issues concerning the active archive system, including schema modification, query federation, query optimization, access control and data provenance. Using TPC-DS benchmark data, we present evaluation results that show the ability of our system to seamlessly query archived data along with data stored in the warehouse in the order of minutes, compared to the hours required to move the data into the warehouse from traditional archival systems.
Biography of the speaker
Dr Mohania received an award from IBM Tivoli Software in 2004 for his research contribution to the Policy Management for Autonomic Computing product. He was also a recipient of the "Excellence in People Management" award at IBM India in 2007. He received the "Outstanding Innovation Award" from IBM Corporation in 2008 for his Context-Oriented Information Integration work, and a Technical Accomplishment Award in 2009 for his Policy work. He is an IEEE and ACM Distinguished Speaker.