7. Conclusions And Future Work

In this paper we introduced a new multimedia representation for documents, called Multimedia Thumbnails, and presented a method for generating it automatically. The method introduces the idea of associating time attributes with different document parts and finding an optimal navigation path through the document under given display and time constraints.
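As a rough illustration of selecting document parts under a time constraint, the sketch below poses the problem as a 0/1 knapsack and solves it by dynamic programming. This formulation is an assumption made for illustration; it ignores display constraints and the ordering of parts along the navigation path, and the function name `select_parts`, the part names, and all durations and information values are hypothetical.

```python
def select_parts(parts, time_budget):
    """Pick document parts that maximize total information within a time budget.

    parts: list of (name, seconds, info) with whole, non-negative seconds.
    time_budget: total presentation time in whole seconds.
    Returns (best_total_info, chosen_names).
    """
    # best[t] holds the best (total information, chosen names) using at most t seconds.
    best = [(0.0, [])] * (time_budget + 1)
    for name, seconds, info in parts:
        # Walk budgets downward so that each part is selected at most once.
        for t in range(time_budget, seconds - 1, -1):
            candidate = best[t - seconds][0] + info
            if candidate > best[t][0]:
                best[t] = (candidate, best[t - seconds][1] + [name])
    return best[time_budget]


# Hypothetical parts: (name, time attribute in seconds, information attribute).
example_parts = [
    ("title", 2, 5.0),
    ("abstract", 6, 8.0),
    ("figure 1", 4, 6.0),
    ("body", 10, 7.0),
]
total_info, chosen = select_parts(example_parts, time_budget=10)
```

With a 10-second budget, the abstract and the figure together fill the budget exactly and carry more information than any other feasible combination in this toy data.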

Representing static document images in a multimedia format that uses both audio and visual channels opens up many interesting research questions. For example, we use the entropy of a figure as a measure of the user's comprehension time, but better complexity measures could be developed that distinguish between photos, tables, and graphs, since these require different levels of attention from users. We also assign constant information attributes to the different parts of a document, based on our user study; more sophisticated assignment methods are possible. For example, a figure could be assigned an information value based on its size, the number of times it is referenced in the document, or the presence of particular objects in it, such as faces and buildings. Finally, more user studies are needed to better understand users' document browsing behavior for different tasks (e.g., browsing, search, overview) and how Multimedia Thumbnails can be made more useful for those tasks.
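The two figure attributes discussed above can be sketched as follows. The first function computes the Shannon entropy of a figure's gray-level histogram, which is one plausible reading of "entropy of figures"; the exact definition used in our system may differ. The second function is a purely hypothetical scoring rule that combines figure size and reference count. The names `pixel_entropy` and `figure_information` and the weighting are assumptions made for illustration.

```python
import math
from collections import Counter


def pixel_entropy(pixels):
    """Shannon entropy, in bits per pixel, of a flat sequence of gray levels."""
    if not pixels:
        return 0.0
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def figure_information(area_fraction, reference_count):
    """Hypothetical information value: page area fraction scaled by citations.

    area_fraction: fraction of the page covered by the figure (0.0 to 1.0).
    reference_count: number of times the text refers to the figure.
    """
    return area_fraction * (1 + reference_count)


# A flat region has zero entropy; two equally likely gray levels give one bit.
flat = pixel_entropy([128] * 8)
two_level = pixel_entropy([0, 255] * 4)
```

A measure that separates photos, tables, and graphs would need features beyond the histogram, since images of very different types can share the same gray-level distribution.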
