A Paper-Based Interface for Video Browsing and Retrieval

Abstract

A paper-based interface for browsing video is proposed. A paper document shows key frames selected from a video, a transcript for the parallel audio track, and bar codes that, when scanned, invoke a multimedia player. The paper document provides a stand-alone representation for a video recording that lets a user both understand the content of the file and replay only selected parts of the multimedia that are necessary to gain a better understanding. This approach applies the two-dimensional display characteristics of a newspaper to multimedia retrieval. By so doing, the user's browsing and search efficiency is greatly improved. This poster describes an implementation of the Video Paper system using a Pocket PC with a bar code reader as the remote control device and an archive of TV programs on the Pocket PC or an external server.

Introduction

Video is difficult to browse and search because it is essentially a one-dimensional medium. When we watch a video, at any given time we can only see a small portion of the available information. It's a challenge to visualize the surrounding context. While there are many elegant online solutions for this problem (e.g., [2, 3, 8]), they can be difficult for new users to understand.

Newspapers, on the other hand, are familiar to everyone and are designed for easy browsing. Paper is a low-cost, high-resolution display medium with many advantages, including portability, low power requirements, and usability in almost any environment. Layout rules, developed over hundreds of years, provide a two-dimensional representation that allows a user to perceive massive amounts of information in a single glance. The structure of the text, including carefully chosen titles and well-engineered paragraphs, combined with photos that attract a reader's attention, helps people decide which stories they might be interested in and how much time they should devote to reading them.

We set out to apply the principles of newspaper design to the problem of video browsing and understanding. We wanted to provide a paper representation for a video recording that was "stand-alone" in the sense that a reader could understand as much as possible about a video by merely glancing over the document. However, that document should also include an easy-to-use means for replaying portions of the video so that the user could see and listen to the multimedia recording whenever a more in-depth explanation was required. Ideally, the need for selective replaying would be reduced as much as possible, since every time users have to play the video to search for information, they are back in the mode of one-dimensional search and passive uptake that we are trying to minimize.

Our solution for interacting with multimedia content such as a TV broadcast or a recorded video is called Video Paper. This system includes paper versions of the multimedia content that contain a text representation of the audio (we use the closed captions when available), formatted according to guidelines designed for the newspaper industry. We use fonts chosen for their readability at a small point size, multiple columns, and short lines of text. The line spacing, bolding, and capitalization techniques also help the user skim the document.
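As a rough illustration of the "multiple columns and short lines" idea, the following Python sketch wraps a transcript into short lines and arranges them in side-by-side columns. It is not the layout engine used by Video Paper; the function names, the 32-character line width, and the plain-text output are assumptions chosen only for illustration.

```python
import textwrap


def wrap_caption(text, line_width=32):
    """Break transcript text into short lines (width is an assumed value)."""
    return textwrap.wrap(text, width=line_width)


def to_columns(lines, rows_per_column, gutter=3):
    """Arrange wrapped lines into side-by-side columns, read top to bottom."""
    if rows_per_column <= 0:
        raise ValueError("rows_per_column must be positive")
    # Split the flat list of lines into consecutive columns.
    columns = [lines[i:i + rows_per_column]
               for i in range(0, len(lines), rows_per_column)]
    # Pad every cell to the longest line so the columns stay aligned.
    width = max((len(line) for line in lines), default=0)
    rows = []
    for r in range(rows_per_column):
        cells = [col[r].ljust(width) if r < len(col) else " " * width
                 for col in columns]
        rows.append((" " * gutter).join(cells).rstrip())
    return rows


if __name__ == "__main__":
    transcript = ("Good evening. Tonight we look at how paper documents "
                  "can act as an index into recorded television programs.")
    for row in to_columns(wrap_caption(transcript, 24), rows_per_column=3):
        print(row)
```

A real page layout would also handle fonts, bolding, and key frame placement, which this text-only sketch ignores.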

We also display key frames extracted from the video at various locations. Figure 1 depicts a sample of the proposed Video Paper interface. Bar codes refer to corresponding points in the recorded video. Swiping a bar code causes the video to begin playing at that time. This allows users to read the paper document and view only those parts of the video that are relevant to their needs. (Glyphs [5] could also be embedded in the key frames for similar functionality.) Given a multi-page document of this type representing, say, an hour-long TV program or recorded meeting, a reader can quickly skim the contents of the program to see if anything relevant might be present in the text.
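The link between a printed bar code and a point in the recording can be sketched as a small encode/decode step. The payload format below (a video identifier, a colon, and a zero-padded offset in seconds) is purely hypothetical, as are the names `PlaybackTarget`, `encode_payload`, `decode_payload`, and `format_timestamp`; the text does not specify how Video Paper actually encodes its bar codes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaybackTarget:
    """Where the player should start: which recording and at what offset."""
    video_id: str
    start_seconds: int


def encode_payload(video_id, start_seconds):
    """Build the string to print as a bar code (assumed format: id:seconds)."""
    if not video_id or ":" in video_id:
        raise ValueError("video_id must be non-empty and contain no ':'")
    if start_seconds < 0:
        raise ValueError("start_seconds must be non-negative")
    return f"{video_id}:{start_seconds:06d}"


def decode_payload(payload):
    """Turn a scanned bar code string back into a playback target."""
    video_id, _, offset = payload.rpartition(":")
    if not video_id or not offset.isdigit():
        raise ValueError(f"malformed payload: {payload!r}")
    return PlaybackTarget(video_id, int(offset))


def format_timestamp(seconds):
    """Render an offset as HH:MM:SS, e.g. for a label next to the bar code."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    payload = encode_payload("news42", 754)
    target = decode_payload(payload)
    print(payload, target, format_timestamp(target.start_seconds))
```

In a complete system, the decoded target would be handed to the multimedia player on the Pocket PC or server, which would seek to `start_seconds` before playing; that step depends on the player and is omitted here.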