DEBII Open Seminar by Dr Mukesh Mohania on "Cloud based Active Archiving Solution for Databases"

The Digital Ecosystems and Business Intelligence Institute (DEBII) is pleased to invite you to a presentation by Dr Mukesh Mohania on "Cloud based Active Archiving Solution for Databases".


Dr Mukesh Mohania
Date: Friday, 30 April 2010
Time: 9:30 AM-11:30 AM (GMT+08:00) Perth.
Location: DEBII Board Room, Enterprise Unit 4, Technology Park

Presentation Outline

The need to analyze structured data for business intelligence applications such as customer churn analysis and social network analysis is well known. However, the size to which such data is expected to grow will make solutions built around data warehouses hard to scale. In particular, data will need to be moved into archives more frequently as data stores grow larger. Current file-based archive models, however, make the archived data unusable for any type of insight extraction.
In this talk, we present an active archival solution for data warehouses that uses the Hadoop Distributed File System (HDFS) to store data in an always-available and cost-effective manner. Our system integrates seamlessly with the existing data warehouse system, making the user oblivious of the location and state of the data, i.e., the user can query archived data in the same manner as live data. Since most documented uses of MapReduce-based platforms have dealt with unstructured data, we started by performing evaluations to compare ways of retrieving structured data stored in Apache Hadoop, an open source implementation of MapReduce. We discuss various issues concerning the active archive system, including schema modification, query federation, query optimization, access control and data provenance. Using TPC-DS benchmark data, we present evaluation results that show the ability of our system to seamlessly query archived data along with data stored in the warehouse on the order of minutes, compared to the hours required to move the data into the warehouse from traditional archival systems.

Biography of the speaker

Mukesh Mohania received his Ph.D. in Computer Science & Engineering from the Indian Institute of Technology, Bombay, India in 1995. He was a faculty member at the University of South Australia from 1996 to 2001. Currently, he is an STSM and senior manager at IBM Research - India, where he leads the Information Management research group. He has worked extensively in the areas of distributed databases, data warehousing, data integration, and autonomic computing. He has published more than 100 papers and filed more than 30 patents in these or related areas. He received best paper awards for his XML and data integration work at CIKM 2004 and CIKM 2005, respectively.
He received an award from IBM Tivoli Software in 2004 for his research contribution to the Policy Management for Autonomic Computing product. He was also a recipient of the "Excellence in People Management" award at IBM India in 2007. He received the "Outstanding Innovation Award" from IBM Corporation in 2008 for his Context-Oriented Information Integration work, and a Technical Accomplishment Award in 2009 for his Policy work. He is an IEEE and ACM Distinguished Speaker.

