THE INFINITE MEMORY MULTIFUNCTION MACHINE (IM³)

JONATHAN J. HULL AND PETER HART

Ricoh California Research Center, Menlo Park, CA
E-mail: hull@crc.ricoh.com

A complete, working document storage and retrieval system is described. Designed to help users solve the problem of lost documents, this system illustrates the concepts of automatic document capture and easy retrieval. Every document a user copies or prints is automatically indexed and saved for later retrieval. A prototype implementation of such a system was constructed and used daily by approximately 20 people for over two years. The design of this system is presented and experimental results are discussed.

1 Introduction

Lost documents are a significant problem for office workers. A recent study estimates that at any time 3% to 5% of documents are lost and that the average cost of lost documents to a Fortune 500 company is in the range of $3 million to $5 million [10]. An obvious solution to this problem is for users to scan all their documents. However, for a number of reasons, not the least of which is users' natural reluctance to alter their work practices, such an approach has not been widely adopted [3]. In fact, recent results indicate that for new retrieval techniques to gain wide acceptance they should be easy to use, be familiar, and require as little user effort as possible [6].

This paper proposes a system design called the Infinite Memory Multifunction Machine (IM³) in which every document a user copies, prints, or faxes is automatically captured and indexed for later retrieval with a web browser. The automatic capture process is performed as a natural side-effect of copying, printing, or faxing and is almost completely transparent to the user. This removes any need for the user to decide, at the time a document is processed, whether it should be saved. As a result, users are almost guaranteed that when they need to find a document, the system will contain a copy of it.

Economic considerations are always important when users decide whether to adopt new technologies. An important consideration in the design of the IM³ system was the relative cost of printing a document on paper vs. storing an image of the same document on magnetic disk. Of course, there is a wide variation in the price of the paper and toner needed to print the range of documents encountered in the typical office. For the purposes of this analysis, it was assumed that, on average, the cost of an 8.5x11 inch sheet of paper is one cent. It was also assumed that a 400 dpi binary image of the same document would require 100 KB on average. At the time the IM³ project was started (late 1993), 100 KB of magnetic disk storage cost about 3 cents. This figure covered only the disk space and did not take into account the cost of the computer, etc. However, we projected that over time this cost would decrease significantly and eventually become less than the cost of a sheet of paper. That time arrived sometime in 1996. Today, 100 KB of disk space costs about 0.27 cents, significantly less than a sheet of paper. This difference in cost, roughly 4:1, now favors the adoption of a document storage and retrieval system like the IM³.
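
The arithmetic behind this comparison can be reproduced with a short calculation. The following is a minimal sketch in Python that uses only the average figures assumed above; the constant and function names are hypothetical and chosen for illustration, and nothing here is part of the IM³ implementation.

```python
# Cost comparison of paper vs. magnetic disk, using the averages
# assumed in the text. All costs are in US cents.
# Names are hypothetical; they are not taken from the IM3 system.

PAPER_CENTS_PER_PAGE = 1.0        # assumed average cost of one 8.5x11 inch sheet
PAGE_IMAGE_KB = 100.0             # assumed average size of a 400 dpi binary page image
DISK_CENTS_PER_100KB_1993 = 3.0   # disk space only, late 1993
DISK_CENTS_PER_100KB_NOW = 0.27   # disk space only, at the time of writing


def disk_cents_per_page(cents_per_100kb, image_kb=PAGE_IMAGE_KB):
    """Cost in cents of the disk space needed to store one page image."""
    return cents_per_100kb * image_kb / 100.0


def paper_to_disk_ratio(cents_per_100kb):
    """How many times more a printed page costs than its stored image."""
    return PAPER_CENTS_PER_PAGE / disk_cents_per_page(cents_per_100kb)


ratio_1993 = paper_to_disk_ratio(DISK_CENTS_PER_100KB_1993)
ratio_now = paper_to_disk_ratio(DISK_CENTS_PER_100KB_NOW)

print(f"late 1993: paper/disk cost ratio = {ratio_1993:.2f}")
print(f"today:     paper/disk cost ratio = {ratio_now:.2f}")
```

A ratio below 1 means that disk storage is the more expensive medium, as it was in late 1993; a ratio above 1 means that paper is. With the figures above, the present-day ratio is about 3.7, which is the source of the approximate 4:1 figure.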