

We present Sikuli, a visual approach to search and automation of graphical user interfaces using screenshots. Sikuli allows users to take a screenshot of a GUI element (such as a toolbar button, icon, or dialog box) and query a help system using the screenshot instead of the element's name. Sikuli also provides a visual scripting API for automating GUI interactions, using screenshot patterns to direct mouse and keyboard events. We report a web-based user study showing that searching by screenshot is easy to learn and faster to specify than keywords. We also demonstrate several automation tasks suitable for visual scripting, such as map navigation and bus tracking, and show how visual scripting can improve interactive help systems previously proposed in the literature.

We explore the use of abstracted screenshots as part of a new help interface. Graphstract, an implementation of a graphical help system, extends the ideas of textually oriented Minimal Manuals to the use of screenshots, allowing multiple small graphical elements to be shown in a limited space. This allows a user to get an overview of a complex sequential task as a whole. The ideas have been developed by three iterations of prototyping and evaluation. A user study shows that Graphstract helps users perform tasks faster on some but not all tasks. Due to their graphical nature, it is possible to construct Graphstracts automatically from pre-recorded interactions. A second study shows that automated capture and replay is a low-cost method for authoring Graphstracts, and the resultant help is as understandable as manually constructed help.