Publications

By Lynn Wilcox

2006
Publication Details
  • UIST 2006 Companion
  • Oct 16, 2006

Abstract

Video surveillance requires keeping the human in the loop. Software can aid security personnel in monitoring and using video. We have developed a set of interface components designed to locate and follow important activity within security video. By recognizing and visualizing localized activity, presenting overviews of activity over time, and temporally and geographically contextualizing video playback, we aim to support security personnel in making use of the growing quantity of security video.
Publication Details
  • UIST 2006 Companion
  • Oct 16, 2006

Abstract

With the growing quantity of security video, it becomes vital that video surveillance software be able to support security personnel in monitoring and tracking activities. We have developed a multi-stream video player that plays recorded and live videos while drawing the users' attention to activity in the video. We will demonstrate the features of the video player and in particular, how it focuses on keeping the human in the loop and drawing their attention to activities in the video.
Publication Details
  • Interactive Video: Algorithms and Technologies, Hammoud, Riad (Ed.), 2006, XVI, 250 p., 109 illus., Hardcover.
  • Jun 7, 2006

Abstract

This chapter describes tools for browsing and searching through video to enable users to quickly locate video passages of interest. Digital video databases containing large numbers of video programs ranging from several minutes to several hours in length are becoming increasingly common. In many cases, it is not sufficient to search for relevant videos, but rather to identify relevant clips, typically less than one minute in length, within the videos. We offer two approaches for finding information in videos. The first approach provides an automatically generated interactive multi-level summary in the form of a hypervideo. When viewing a sequence of short video clips, the user can obtain more detail on the clip being watched. For situations where browsing is impractical, we present a video search system with a flexible user interface that incorporates dynamic visualizations of the underlying multimedia objects. The system employs automatic story segmentation, and displays the results of text and image-based queries in ranked sets of story summaries. Both approaches help users to quickly drill down to potentially relevant video clips and to determine the relevance by visually inspecting the material.
2005
Publication Details
  • INTERACT 2005, LNCS 3585, pp. 781-794
  • Sep 12, 2005

Abstract

A video database can contain a large number of videos ranging from several minutes to several hours in length. Typically, it is not sufficient to search just for relevant videos, because the task still remains to find the relevant clip, typically less than one minute in length, within the video. This makes it important to direct the user's attention to the most promising material and to indicate what material they have already investigated. Based on this premise, we created a video search system with a powerful and flexible user interface that incorporates dynamic visualizations of the underlying multimedia objects. The system employs automatic story segmentation, combines text and visual search, and displays search results in ranked sets of story keyframe collages. By adapting the keyframe collages based on query relevance and indicating which portions of the video have already been explored, we enable users to quickly find relevant sections. We tested our system as part of the NIST TRECVID interactive search evaluation, and found that our user interface enabled users to find more relevant results within the allotted time than other systems employing more sophisticated analysis techniques but less helpful user interfaces.
Publication Details
  • Sixteenth ACM Conference on Hypertext and Hypermedia
  • Sep 6, 2005

Abstract

Hyper-Hitchcock is a hypervideo editor enabling the direct manipulation authoring of a particular form of hypervideo called "detail-on-demand video." This form of hypervideo allows a single link out of the currently playing video to provide more details on the content currently being presented. The editor includes a workspace to select, group, and arrange video clips into several linear sequences. Navigational links placed between the video elements are assigned labels and return behaviors appropriate to the goals of the hypervideo and the role of the destination video. Hyper-Hitchcock was used by students in a Computers and New Media class to author hypervideos on a variety of topics. The produced hypervideos provide examples of hypervideo structures and the link properties and behaviors needed to support them. Feedback from students identified additional link behaviors and features required to support new hypervideo genres. This feedback is valuable for the redesign of Hyper-Hitchcock and the design of hypervideo editors in general.
Publication Details
  • ACM Transactions on Multimedia Computing, Communications, and Applications
  • Aug 8, 2005

Abstract

Organizing digital photograph collections according to events such as holiday gatherings or vacations is a common practice among photographers. To support photographers in this task, we present similarity-based methods to cluster digital photos by time and image content. The approach is general, unsupervised, and makes minimal assumptions regarding the structure or statistics of the photo collection. We present several variants of an automatic unsupervised algorithm to partition a collection of digital photographs based either on temporal similarity alone, or on temporal and content-based similarity. First, inter-photo similarity is quantified at multiple temporal scales to identify likely event clusters. Second, the final clusters are determined according to one of three clustering goodness criteria. The clustering criteria trade off computational complexity and performance. We also describe a supervised clustering method based on learning vector quantization. Finally, we review the results of an experimental evaluation of the proposed algorithms and existing approaches on two test collections.
Publication Details
  • International Conference on Image and Video Retrieval 2005
  • Jul 21, 2005

Abstract

Large video collections present a unique set of challenges to the search system designer. Text transcripts do not always provide an accurate index to the visual content, and the performance of visually based semantic extraction techniques is often inadequate for search tasks. The searcher must be relied upon to provide detailed judgment of the relevance of specific video segments. We describe a video search system that facilitates this user task by efficiently presenting search results in semantically meaningful units to simplify exploration of query results and query reformulation. We employ a story segmentation system and supporting user interface elements to effectively present query results at the story level. The system was tested in the 2004 TRECVID interactive search evaluations with very positive results.
Publication Details
  • CHI 2005 Extended Abstracts, ACM Press, pp. 1395-1398
  • Apr 1, 2005

Abstract

We present a search interface for large video collections with time-aligned text transcripts. The system is designed for users such as intelligence analysts who need to quickly find video clips relevant to a topic expressed in text and images. A key component of the system is a powerful and flexible user interface that incorporates dynamic visualizations of the underlying multimedia objects. The interface displays search results in ranked sets of story keyframe collages, and lets users explore the shots in a story. By adapting the keyframe collages based on query relevance and indicating which portions of the video have already been explored, we enable users to quickly find relevant sections. We tested our system as part of the NIST TRECVID interactive search evaluation, and found that our user interface enabled users to find more relevant results within the allotted time than those of many systems employing more sophisticated analysis techniques.
2004
Publication Details
  • UIST 2004 Companion, pp. 37-38
  • Oct 24, 2004

Abstract

As the size of the typical personal digital photo collection reaches well into the thousands of photos, advanced tools to manage these large collections are more and more necessary. In this demonstration, we present a semi-automatic approach that opportunistically takes advantage of the current state-of-the-art technology in face detection and recognition and combines it with user interface techniques to facilitate the task of labeling people in photos. We show how we use an accurate face detector to automatically extract faces from photos. Instead of having a less accurate face recognizer classify faces, we use it to sort faces by their similarity to a face model. We demonstrate our photo application that uses the extracted faces as UI proxies for actions on the underlying photos along with the sorting strategy to identify candidate faces for quick and easy face labeling.
Publication Details
  • Proceedings of the International Workshop on Multimedia Information Retrieval, ACM Press, pp. 99-106
  • Oct 10, 2004

Abstract

With digital still cameras, users can easily collect thousands of photos. We have created a photo management application with the goal of making photo organization and browsing simple and quick, even for very large collections. A particular concern is the management of photos depicting people. We present a semi-automatic approach designed to facilitate the task of labeling photos with people that opportunistically takes advantage of the strengths of current state-of-the-art technology in face detection and recognition. In particular, an accurate face detector is used to automatically extract faces from photos while the less accurate face recognizer is used not to classify the detected faces, but to sort faces by their similarity to a chosen model. This sorting is used to present candidate faces within a user interface designed for quick and easy face labeling. We present results of a simulation of the usage model that demonstrate the improved ease that is achieved by our method.
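
The sorting step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how faces are represented or compared, so the feature vectors, the use of cosine similarity, and the names `cosine_similarity` and `sort_faces_by_similarity` are all assumptions made for illustration, not the published method.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sort_faces_by_similarity(faces, model):
    """Order face ids from most to least similar to the chosen model.

    faces: dict mapping a face id to its feature vector (hypothetical format).
    model: feature vector of the chosen face model (hypothetical format).
    The recognizer is used only to rank candidates, not to classify them.
    """
    return sorted(
        faces,
        key=lambda face_id: cosine_similarity(faces[face_id], model),
        reverse=True,
    )


if __name__ == "__main__":
    detected = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(sort_faces_by_similarity(detected, [1.0, 0.0]))
```

Ranking rather than classifying leaves the final labeling decision to the user, which is the point of the semi-automatic approach: the most likely candidates simply appear first in the interface.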
Publication Details
  • Proceedings of the Working Conference on Advanced Visual Interfaces, AVI 2004, pp. 290-297
  • May 25, 2004

Abstract

We introduced detail-on-demand video as a simple type of hypervideo that allows users to watch short video segments and to follow hyperlinks to see additional detail. Such video lets users quickly access desired information without having to view the entire contents linearly. A challenge for presenting this type of video is to provide users with the appropriate affordances to understand the hypervideo structure and to navigate it effectively. Another challenge is to give authors tools that allow them to create good detail-on-demand video. Guided by user feedback, we iterated designs for a detail-on-demand video player. We also conducted two user studies to gain insight into people's understanding of hypervideo and to improve the user interface. We found that the interface design was tightly coupled to understanding hypervideo structure and that different designs greatly affected what parts of the video people accessed. The studies also suggested new guidelines for hypervideo authoring.

MiniMedia Surfer: Browsing Video Segments on Small Displays

Publication Details
  • CHI 2004 short paper
  • Apr 27, 2004

Abstract

It is challenging to browse multimedia on mobile devices with small displays. We present MiniMedia Surfer, a prototype application for interactively searching a multimedia collection for video segments of interest. Transparent layers are used to support browsing subtasks: keyword query, exploration of results through keyframes, and playback of video. This layered interface smoothly blends the key tasks of the browsing process and deals with the small screen size. During exploration, the user can adjust the transparency levels of the layers using pen gestures. Details of the video segments are displayed in an expandable timeline that supports gestural interaction.
2003
Publication Details
  • Proc. ACM Multimedia 2003, pp. 364-373
  • Nov 1, 2003

Abstract

We present similarity-based methods to cluster digital photos by time and image content. The approach is general, unsupervised, and makes minimal assumptions regarding the structure or statistics of the photo collection. We present results for the algorithm based solely on temporal similarity, and jointly on temporal and content-based similarity. We also describe a supervised algorithm based on learning vector quantization. Finally, we include experimental results for the proposed algorithms and several competing approaches on two test collections.
Publication Details
  • Proc. ACM Multimedia 2003, pp. 546-554
  • Nov 1, 2003

Abstract

We present a system that allows remote and local participants to control devices in a meeting environment using mouse or pen based gestures "through" video windows. Unlike state-of-the-art device control interfaces that require interaction with text commands, buttons, or other artificial symbols, our approach allows users to interact with devices through live video of the environment. This naturally extends our video supported pan/tilt/zoom (PTZ) camera control system, by allowing gestures in video windows to control not only PTZ cameras, but also other devices visible in video images. For example, an authorized meeting participant can show a presentation on a screen by dragging the file on a personal laptop and dropping it on the video image of the presentation screen. This paper presents the system architecture, implementation tradeoffs, and various meeting control scenarios.
Publication Details
  • Proc. ACM Multimedia 2003, pp. 92-93
  • Nov 1, 2003

Abstract

To simplify the process of editing interactive video, we developed the concept of "detail-on-demand" video as a subset of general hypervideo. Detail-on-demand video keeps the authoring and viewing interfaces relatively simple while supporting a wide range of interactive video applications. Our editor, Hyper-Hitchcock, provides a direct manipulation environment in which authors can combine video clips and place hyperlinks between them. To summarize a video, Hyper-Hitchcock can also automatically generate a hypervideo composed of multiple video summary levels and navigational links between these summaries and the original video. Viewers may interactively select the amount of detail they see, access more detailed summaries, and navigate to the source video through the summary.
Publication Details
  • Proc. ACM Multimedia 2003, pp. 392-401
  • Nov 1, 2003

Abstract

In this paper, we describe how a detail-on-demand representation for interactive video is used in video summarization. Our approach automatically generates a hypervideo composed of multiple video summary levels and navigational links between these summaries and the original video. Viewers may interactively select the amount of detail they see, access more detailed summaries, and navigate to the source video through the summary. We created a representation for interactive video that supports a wide range of interactive video applications and Hyper-Hitchcock, an editor and player for this type of interactive video. Hyper-Hitchcock employs methods to determine (1) the number and length of levels in the hypervideo summary, (2) the video clips for each level in the hypervideo, (3) the grouping of clips into composites, and (4) the links between elements in the summary. These decisions are based on an inferred quality of video segments and temporal relations between those segments.

Detail-on-Demand Hypervideo

Publication Details
  • Proc. ACM Multimedia 2003, pp. 600-601
  • Nov 1, 2003

Abstract

We demonstrate the use of detail-on-demand hypervideo in interactive training and video summarization. Detail-on-demand video allows viewers to watch short video segments and to follow hyperlinks to see additional detail. The player for detail-on-demand video displays keyframes indicating what links are available at each point in the video. The Hyper-Hitchcock authoring tool helps users create hypervideo by automatically dividing video into clips that can be combined in a direct manipulation interface. Clips can be grouped into composites and hyperlinks can be placed between clips and composites. A summarization algorithm creates multi-level hypervideo summaries from linear video by automatically selecting clips and placing links between them.
Publication Details
  • SPIE Information Technologies and Communications
  • Sep 9, 2003

Abstract

Hypervideo is a form of interactive video that allows users to follow links to other video. A simple form of hypervideo, called "detail-on-demand video," provides at most one link from one segment of video to another, supporting a single-button interaction. Detail-on-demand video is well suited for interactive video summaries, because the user can request a more detailed summary while watching the video. Users interact with the video through a special hypervideo player that displays keyframes with labels indicating when a link is available. While detail-on-demand summaries can be manually authored, doing so is time-consuming. To address this issue, we developed an algorithm to automatically generate multi-level hypervideo summaries. The highest level of the summary consists of the most important clip from each take or scene in the video. At each subsequent level, more clips from each take or scene are added in order of their importance. We give one example in which a hypervideo summary is created for a linear training video. We also show how the algorithm can be modified to produce a hypervideo summary for home video.
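
The level construction described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the published algorithm: the clip tuple format, the rule of adding exactly one more clip per scene at each level, and the name `build_summary_levels` are hypothetical, and importance scores are taken as given.

```python
from collections import defaultdict


def build_summary_levels(clips, num_levels):
    """Build summary levels from scored clips.

    clips: list of (scene_id, start_time, importance) tuples (assumed format).
    Level 1 holds the most important clip of each scene. Each later level
    adds the next most important clip per scene (an assumed growth rule).
    Clips within a level are returned in temporal order.
    """
    by_scene = defaultdict(list)
    for clip in clips:
        by_scene[clip[0]].append(clip)

    # Rank the clips of each scene, most important first.
    for scene_clips in by_scene.values():
        scene_clips.sort(key=lambda c: c[2], reverse=True)

    levels = []
    for depth in range(1, num_levels + 1):
        chosen = [c for scene_clips in by_scene.values() for c in scene_clips[:depth]]
        chosen.sort(key=lambda c: c[1])  # restore playback order
        levels.append(chosen)
    return levels


if __name__ == "__main__":
    example = [("s1", 0, 0.2), ("s1", 10, 0.9), ("s2", 20, 0.5), ("s2", 30, 0.4)]
    for number, level in enumerate(build_summary_levels(example, 2), start=1):
        print(number, level)
```

Because each level is a superset of the one above it, a link from a clip in one level can lead to the corresponding, more detailed portion of the next level.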
Publication Details
  • Human-Computer Interaction INTERACT '03, IOS Press, pp. 33-40
  • Sep 1, 2003

Abstract

To simplify the process of editing interactive video, we developed the concept of "detail-on-demand" video as a subset of general hypervideo where a single button press reveals additional information about the current video sequence. Detail-on-demand video keeps the authoring and viewing interfaces relatively simple while supporting a wide range of interactive video applications. Our editor, Hyper-Hitchcock, builds on prior work on automatic analysis to find the best quality video clips. It introduces video composites as an abstraction for grouping and manipulating sets of video clips. Navigational links can be created between any two video clips or composites. Such links offer a variety of return behaviors for when the linked video is completed that can be tailored to different materials. Initial impressions from a pilot study indicate that Hyper-Hitchcock is easy to learn although the behavior of links is not immediately intuitive for all users.
Publication Details
  • Human-Computer Interaction INTERACT '03, IOS Press, pp. 196-203
  • Sep 1, 2003

Abstract

With digital still cameras, users can easily collect thousands of photos. Our goal is to make organizing and browsing photos simple and quick, while retaining scalability to large collections. To that end, we created a photo management application concentrating on areas that improve the overall experience without neglecting the mundane components of such an application. Our application automatically divides photos into meaningful events such as birthdays or trips. Several user interaction mechanisms enhance the user experience when organizing photos. Our application combines a light table for showing thumbnails of the entire photo collection with a tree view that supports navigating, sorting, and filtering photos by categories such as dates, events, people, and locations. A calendar view visualizes photos over time and allows for the quick assignment of dates to scanned photos. We fine-tuned our application by using it with large personal photo collections provided by several users.
Publication Details
  • Proceedings of Hypertext '03, pp. 124-125
  • Aug 26, 2003

Abstract

Existing hypertext systems have emphasized either the navigational or spatial expression of relationships between objects. We are exploring the combination of these modes of expression in Hyper-Hitchcock, a hypervideo editor. Hyper-Hitchcock supports a form of hypervideo called "detail-on-demand video" due to its applicability to situations where viewers need to take a link to view more details on the content currently being presented. Authors of detail-on-demand video select, group, and spatially arrange video clips into linear sequences in a two-dimensional workspace. Hyper-Hitchcock uses a simple spatial parser to determine the temporal order of selected video clips. Authors add navigational links between the elements in those sequences. This combination of navigational and spatial hypertext modes of expression separates the clip sequence from the navigational structure of the hypervideo. Such a combination can be useful in cases where multiple forms of inter-object relationships must be expressed on the same content.
Publication Details
  • IEEE International Conference on Multimedia and Expo, v. II, pp. 753-756
  • Jul 7, 2003

Abstract

We created an alternative approach to existing video summaries that gives viewers control over the summaries by selecting hyperlinks to other video with additional information. We structure such summaries as "detail-on-demand" video, a subset of general hypervideo in which at most one link to another video sequence is available at any given time. Our editor for such video, Hyper-Hitchcock, provides a workspace in which an author can select and arrange video clips, generate composites from clips and from other composites, and place links between composites. To simplify dealing with a large number of clips, Hyper-Hitchcock generates iconic representations for composites that can be used to manipulate the composite as a whole. In addition to providing an authoring environment, Hyper-Hitchcock can automatically generate multi-level hypervideo summaries for immediate use or as the starting point for author modification.