Figure 3 shows an example of how stored documents appear to users. The source of each document (printed or copied) is given, along with the date and time when it was captured. Ten keywords, automatically extracted from the document, are also shown. They give users a quick way to determine whether a given document is relevant to their interests. Each 8 dpi thumbnail shown in Figure 3 is hotlinked to a 72 dpi thumbnail of the corresponding page.
Several formats are stored for each document. For printed documents, these include the original PostScript file. Copied documents are also stored in PostScript form, as a 400 dpi binary image compressed with CCITT Group 4 and wrapped in the appropriate PostScript header. The PostScript file is compressed with gzip. Thumbnails are generated in GIF format for every page at 4 dpi, 8 dpi, and 72 dpi. OCR results are also saved for copied documents, and ASCII text is extracted from printed documents.
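To give a feel for the three thumbnail resolutions and the gzip step described above, the following is a minimal Python sketch rather than the system's actual code. The US Letter page size (8.5 x 11 inches) and the function names are assumptions introduced for illustration; the text specifies only the resolutions and the use of gzip.

```python
import gzip

# Thumbnail resolutions named in the text.
THUMBNAIL_DPIS = (4, 8, 72)


def thumbnail_pixel_size(width_in, height_in, dpi):
    """Return (width, height) in pixels for a page rendered at `dpi`."""
    return (round(width_in * dpi), round(height_in * dpi))


def thumbnail_sizes(width_in=8.5, height_in=11.0, dpis=THUMBNAIL_DPIS):
    """Map each resolution to its pixel size.

    The default page size is an assumed US Letter page; the source
    does not state the page dimensions.
    """
    return {dpi: thumbnail_pixel_size(width_in, height_in, dpi) for dpi in dpis}


def compress_postscript(ps_bytes):
    """Compress PostScript data with gzip (hypothetical helper)."""
    return gzip.compress(ps_bytes)


if __name__ == "__main__":
    for dpi, (w, h) in thumbnail_sizes().items():
        print(f"{dpi:>2} dpi: {w} x {h} pixels")

    # A stand-in PostScript payload, for illustration only.
    sample = b"%!PS-Adobe-3.0\n" + b"showpage\n" * 200
    packed = compress_postscript(sample)
    print(len(sample), "->", len(packed), "bytes")
```

Under the Letter-size assumption, an 8 dpi thumbnail is only 68 x 88 pixels, small enough to show many pages at once, while the 72 dpi version is 612 x 792 pixels, roughly screen resolution for a full page.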






