Asynchronous collaborators often use freeform ink annotations to point to visually salient perceptual features of line charts such as peaks or humps, valleys, rising slopes and declining slopes. We present a set of techniques for interpreting such annotations to algorithmically identify the corresponding perceptual parts. Our approach is to first apply a parts-based segmentation algorithm that identifies the visually salient perceptual parts in the chart. Our system then analyzes the freeform annotations to infer the corresponding peaks, valleys or sloping segments. Once the system has identified the perceptual parts it can highlight them to draw further attention and reduce ambiguity of interpretation in asynchronous collaborative discussions.
Panning and zooming interfaces for exploring very large images containing billions of pixels (gigapixel images) have recently appeared on the internet. This paper addresses issues that arise when creating and rendering auditory and textual annotations for such images. In particular, we define a distance metric between each annotation and any view resulting from panning and zooming on the image. The distance then informs the rendering of audio annotations and text labels. We demonstrate the annotation system on a number of panoramic images.
We explore the use of tracked 2D object motion to enable novel approaches to interacting with video. These include moving annotations, video navigation by direct manipulation of objects, and creating an image composite from multiple video frames. Features in the video are automatically tracked and grouped in an off-line preprocess that enables later interactive manipulation. Examples of annotations include speech and thought balloons, video graffiti, path arrows, video hyperlinks, and schematic storyboards. We also demonstrate a direct-manipulation interface for random frame access using spatial constraints, and a drag-and-drop interface for assembling still images from videos. Taken together, our tools can be employed in a variety of applications including film and video editing, visual tagging, and authoring rich media such as hyperlinked video.