TrapperKeeper: Using Virtualization to Add Type-Awareness to File Systems
Daniel Peek and Jason Flinn
Abstract
TrapperKeeper is a system that enables the development of type-aware
file system functionality. In contrast to existing plug-in-based
architectures that require a software developer to write and maintain
separate code modules for each new file type, TrapperKeeper requires
no type-specific code. Instead, TrapperKeeper runs existing software
applications that parse the desired file type, executing them inside
virtual machines. It then uses accessibility APIs to control the application
and extract desired information from the application's graphical user
interface. We have implemented metadata extraction and document
preview features that use TrapperKeeper, and we have used
TrapperKeeper to capture the type-specific knowledge of over 20
applications that collectively parse more than 100 distinct file
types. Our experimental results show that TrapperKeeper can execute
these two features on hundreds of files per hour, a pace that far
exceeds the rate at which files are modified or created on the average
desktop.