Sunday, October 07, 08:00 - 17:00
Doctoral Symposium (by invitation only)
Interactions Speak Louder Than Words: Shared User Models and Adaptive Interfaces - Doctoral Symposium
Abstract » Touch-screens are becoming increasingly ubiquitous. They have great appeal due to their capabilities to support new forms of human interaction, including their abilities to interpret rich gestural inputs, render flexible user interfaces and enable multi-user interactions. However, the technology creates new challenges and barriers for users with limited levels of vision and motor abilities. The PhD work described in this paper proposes a technique combining Shared User Models (SUM) and adaptive interfaces to improve the accessibility of touch-screen devices for people with low levels of vision and motor ability. SUM, built from an individual’s interaction data across multiple applications and devices, is used to infer new knowledge of their abilities and characteristics, without the need for continuous calibration exercises or user configurations. This approach has been realized through the development of an open source software framework to support the creation of applications that make use of SUM to adapt interfaces that match the needs of individual users.
An Interface Agent for Non-Visual, Accessible Web Automation - Doctoral Symposium
Abstract » The Web is far less usable and accessible for users with visual impairments than it is for sighted people. Web automation has the potential to bridge the divide between the ways visually impaired and sighted people access the Web, and to enable visually impaired users to breeze through Web browsing tasks that were previously slow, hard, or even impossible to achieve. Typical automation interfaces require that the user record a macro, a useful sequence of browsing steps, so that these steps can be replayed in the future. In this paper, I present a high-level overview of an approach that enables users to quickly find relevant information on a webpage and to automate browsing without recording macros. This approach is potentially useful for both visually impaired and sighted users.
Medical Operating Documents: Dynamic Checklists Improve Crisis Attention - Doctoral Symposium
Abstract » The attentional aspects of crisis computing—supporting highly trained teams as they respond to real-life emergencies—have been underexplored in the user interface community. My research investigates the development of interactive software systems that support crisis teams, with an eye towards intelligently managing attention. In this paper, I briefly describe MDOCS, a Medical operating DOCuments System built for time-critical interaction. MDOCS is a multi-user, multi-surface software system that implements dynamic checklists and interactive cognitive aids written to support medical crisis teams. I present the results of a study that evaluates the deployment of MDOCS in a realistic, mannequin-based medical simulator used by anesthesiologists. I propose controlled laboratory experiments that evaluate the feasibility and effectiveness of our design principles and attentional interaction techniques.
Spatial augmented reality to enhance physical artistic creation - Doctoral Symposium
Abstract » Spatial augmented reality (SAR) promises the integration of digital information into the real (physical) world through projection. In this doctoral symposium paper, I propose different tools that improve the speed or ease of drawing by projecting photos, virtual construction lines and interactive 3D scenes. After describing the tools, I explain some future challenges to explore, such as the creation of tools that help create drawings that are “difficult” for a human being to achieve but easy for a computer. Furthermore, I propose some insights for the creation of digital games and programs that can take full advantage of physical drawings.
Machine Learning Models for Uncertain Interaction - Doctoral Symposium
Abstract » As interaction methods beyond the static mouse and keyboard setup of the desktop era - such as touch, gesture sensing, and visual tracking - become more common, existing interaction paradigms are no longer good enough. These new modalities have high uncertainty, and conventional interfaces are not designed to reflect this. Research has shown that modelling uncertainty can improve the quality of interaction with these systems. Machine learning offers a rich set of tools to make probabilistic inferences in uncertain systems - this is the focus of my thesis work. In particular, I'm interested in making inferences at the sensor level and propagating uncertainty forward appropriately to applications. In this paper I describe a probabilistic model for touch interaction, and discuss how I intend to use the uncertainty in this model to improve typing accuracy on a soft keyboard. The model described here lays the groundwork for a rich framework for interaction in the presence of uncertainty, incorporating data from multiple sensors to make more accurate inferences about the goals of users, and allowing systems to adapt smoothly and appropriately to their context of use.
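The abstract's core idea, propagating touch uncertainty forward instead of committing to a hard key press, can be illustrated with a toy calculation. The sketch below is not the thesis model; it assumes a hypothetical one-row key layout, an isotropic Gaussian touch likelihood, and a stand-in language prior, and simply returns a posterior over keys.

```python
# Illustrative sketch (not the thesis model): combine a Gaussian likelihood of
# the sensed touch location for each key with a simple language-model prior,
# and keep the full posterior instead of committing to a single key.
import math

# Hypothetical one-row keyboard layout: key -> (x, y) centre in millimetres.
KEY_CENTRES = {"q": (5, 5), "w": (15, 5), "e": (25, 5), "r": (35, 5)}
TOUCH_SIGMA_MM = 4.0          # assumed sensor/finger uncertainty
PRIOR = {"q": 0.1, "w": 0.2, "e": 0.5, "r": 0.2}  # stand-in language prior


def key_posterior(touch_xy):
    """Return P(key | touch) under an isotropic Gaussian touch likelihood."""
    tx, ty = touch_xy
    scores = {}
    for key, (kx, ky) in KEY_CENTRES.items():
        d2 = (tx - kx) ** 2 + (ty - ky) ** 2
        likelihood = math.exp(-d2 / (2 * TOUCH_SIGMA_MM ** 2))
        scores[key] = likelihood * PRIOR[key]
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}


if __name__ == "__main__":
    # A touch landing between 'w' and 'e' yields a soft decision that a
    # downstream text-entry decoder can exploit rather than a hard key press.
    print(key_posterior((21.0, 6.0)))
```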
Towards Document Engineering on Pen and Touch-Operated Interactive Tabletops - Doctoral Symposium
Abstract » Touch interfaces have now become mainstream thanks to modern smartphones and tablets. However, there are still very few “productivity” applications, i.e. tools that support mundane but essential work, especially for large interactive surfaces such as digital tabletops. This work aims to partly fill the relative void in the area of document engineering by investigating what kind of intuitive and efficient tools can be provided to support the manipulation of documents on a digital workdesk, in particular the creation and editing of documents. The fundamental interaction model relies on bimanual pen and multitouch input, which was recently introduced to tabletops and enables richer interaction possibilities. The goal is ultimately to provide useful and highly accessible UIs for document-centric applications, whose design principles will hopefully pave the way from DTP towards DTTP (Digital Tabletop Publishing).
Closing the Loop Between Intentions and Actions - Doctoral Symposium
Abstract » In this document, I propose systems that aim to minimize the gap between intentions and the corresponding actions under different scenarios. The gap exists for many reasons, such as the subjective mapping between the two, a lack of resources to implement the action, or inherent noise in the physical processes. The proposed system observes the action and infers the intention behind it. The system then generates a refined action using the inference. The inferred intention and the refined action are then provided as feedback to the user, who can then perform corrective actions or accept the refined action as the desired result. I demonstrate the design and implementation of such systems through five projects: Image Deblurring, Tracking Block Model Assembly, Animating with Physical Proxies, What Affects Handwriting, and Spying on the Writer.
Data-Driven Interactions for Web Design - Doctoral Symposium
Abstract » This thesis describes how data-driven approaches to Web design problems can enable useful interactions for designers. It presents three machine learning applications which enable new interaction mechanisms for Web design: rapid retargeting between page designs, scalable design search, and generative probabilistic model induction to support design interactions cast as probabilistic inference. It also presents a scalable architecture for efficient data-mining on Web designs, which supports these three applications.
Sunday, October 07, 17:00 - 21:00
Registration & Welcome Reception
Monday, October 08, 09:00 - 10:15
Keynote
Chair: Rob Miller, MIT CSAIL, USA
Abstract »
Artists have been doing experiments on vision longer than neurobiologists. Some major works of art have provided insights as to how we see; some of these insights are so fundamental that they can be understood in terms of the underlying neurobiology. For example, artists have long realized that color and luminance can play independent roles in visual perception. Picasso said, "Colors are only symbols. Reality is to be found in luminance alone." This observation has a parallel in the functional subdivision of our visual systems, where color and luminance are processed by the newer, primate-specific What system, and the older, colorblind, Where (or How) system. Many techniques developed over the centuries by artists can be understood in terms of the parallel organization of our visual systems. I will explore how the segregation of color and luminance processing is the basis for why some Impressionist paintings seem to shimmer, why some op art paintings seem to move, some principles of Matisse's use of color, and how the Impressionists painted "air". Central and peripheral vision are distinct, and I will show how the differences in resolution across our visual field make the Mona Lisa's smile elusive, and produce a dynamic illusion in Pointillist paintings, Chuck Close paintings, and photomosaics. I will explore how artists have intuited important features about how our brains extract relevant information about faces and objects, and I will discuss why learning disabilities may be associated with artistic talent.
Bio »
Margaret Livingstone is Professor of Neurobiology at Harvard Medical School. She has done research on hormones and behavior, learning, dyslexia, and vision. Livingstone has explored the ways in which vision science can understand and inform the world of visual art. She has written a popular lay book, Vision and Art, which has brought her acclaim in the art world as a scientist who can communicate with artists and art historians, with mutual benefit. She generated some important insights into the field, including a simple explanation for the elusive quality of the Mona Lisa's smile (it is more visible to peripheral vision than to central vision) and the fact that Rembrandt, like a surprisingly large number of famous artists, was likely to have been stereoblind.
Monday, October 08, 10:15 - 11:00
Break
Monday, October 08, 11:00 - 12:40
Groups & Crowds
Chair: Meredith Ringel Morris, Microsoft Research, USA
Abstract » Smartphones are convenient, but their small screens make searching, clicking, and reading awkward. Thus, perusing product reviews on a smartphone is difficult. In response, we introduce RevMiner, a novel smartphone interface that utilizes Natural Language Processing techniques to analyze and navigate reviews. RevMiner was run over 300K Yelp restaurant reviews, extracting attribute-value pairs, where attributes represent restaurant properties such as sushi and service, and values represent opinions about those attributes, such as fresh or fast. These pairs were aggregated and used to: 1) answer queries such as "cheap Indian food", 2) concisely present information about each restaurant, and 3) identify similar restaurants. Our user studies demonstrate that on a smartphone, participants preferred RevMiner's interface to tag clouds and color bars, and that they preferred RevMiner's results to Yelp's, particularly for conjunctive queries (e.g., "great food and huge portions"). Demonstrations of RevMiner are available at revminer.com.
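As a rough illustration of the attribute-value idea (not RevMiner's actual NLP pipeline), the sketch below aggregates hypothetical (restaurant, attribute, opinion) triples and scores a conjunctive query such as "great food and huge portions".

```python
# Illustrative sketch only: aggregate (attribute, opinion) pairs per restaurant
# and score a conjunctive query. Names and data are hypothetical stand-ins;
# RevMiner's actual extraction runs NLP over the review text.
from collections import Counter, defaultdict

extracted = [  # (restaurant, attribute, opinion) pairs, e.g. mined from reviews
    ("Saffron", "food", "great"), ("Saffron", "portions", "huge"),
    ("Saffron", "food", "great"), ("Bistro X", "food", "great"),
    ("Bistro X", "service", "slow"),
]

profiles = defaultdict(Counter)
for restaurant, attribute, opinion in extracted:
    profiles[restaurant][(attribute, opinion)] += 1


def score(restaurant, query_pairs):
    """Score a conjunctive query such as 'great food and huge portions'."""
    counts = profiles[restaurant]
    # Require every requested attribute-value pair to be mentioned at least once.
    if any(counts[pair] == 0 for pair in query_pairs):
        return 0
    return sum(counts[pair] for pair in query_pairs)


query = [("food", "great"), ("portions", "huge")]
ranked = sorted(profiles, key=lambda r: score(r, query), reverse=True)
print(ranked)  # Saffron ranks first: it matches both attribute-value pairs
```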
Abstract » GroupTogether is a system that explores cross-device interaction using two sociological constructs. First, F-formations concern the distance and relative body orientation among multiple users, which indicate when and how people position themselves as a group. Second, micro-mobility describes how people orient and tilt devices towards one another to promote fine-grained sharing during co-present collaboration. We sense these constructs using: (a) a pair of overhead Kinect depth cameras to sense small groups of people, (b) low-power 8GHz band radio modules to establish the identity, presence, and coarse-grained relative locations of devices, and (c) accelerometers to detect tilting of slate devices. The resulting system supports fluid, minimally disruptive techniques for co-located collaboration by leveraging the proxemics of people as well as the proxemics of devices.
Abstract » Real-time captioning provides deaf and hard of hearing people immediate access to spoken language and enables participation in dialogue with others. Low latency is critical because it allows speech to be paired with relevant visual cues. Currently, the only reliable source of real-time captions is expensive stenographers who must be recruited in advance and who are trained to use specialized keyboards. Automatic speech recognition (ASR) is less expensive and available on-demand, but its low accuracy, high noise sensitivity, and need for training beforehand render it unusable in real-world situations. In this paper, we introduce a new approach in which groups of non-expert captionists (people who can hear and type) collectively caption speech in real-time on-demand.
We present Legion:Scribe, an end-to-end system that allows deaf people to request captions at any time. We introduce an algorithm for merging partial captions into a single output stream in real-time, and a captioning interface designed to encourage coverage of the entire audio stream. Evaluation with 20 local participants and 18 crowd workers shows that non-experts can provide an effective solution for captioning, accurately covering an average of 93.2% of an audio stream with only 10 workers and an average per-word latency of 2.9 seconds. More generally, our model in which multiple workers contribute partial inputs that are automatically merged in real-time may be extended to allow dynamic groups to surpass constituent individuals (even experts) on a variety of human performance tasks.
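To illustrate how partial inputs from several workers can combine into one caption stream, here is a deliberately simplified sketch: it bins hypothetical (timestamp, word) pairs and takes a per-bin majority vote. Legion:Scribe's actual merging algorithm aligns overlapping word sequences and is considerably more sophisticated.

```python
# A much simplified sketch of the merging idea: each worker submits
# (timestamp, word) pairs covering only part of the audio; words are grouped
# into small time bins and a per-bin majority vote forms the output stream.
from collections import Counter, defaultdict

BIN_SECONDS = 1.0

worker_streams = {  # hypothetical partial captions from three workers
    "w1": [(0.2, "low"), (1.1, "latency"), (2.3, "is")],
    "w2": [(1.2, "latency"), (2.2, "is"), (3.1, "critical")],
    "w3": [(0.3, "low"), (3.2, "critical")],
}


def merge(streams, bin_seconds=BIN_SECONDS):
    bins = defaultdict(Counter)
    for words in streams.values():
        for t, word in words:
            bins[int(t / bin_seconds)][word] += 1
    # Emit the most agreed-upon word for each time bin, in temporal order.
    return [bins[b].most_common(1)[0][0] for b in sorted(bins)]


print(" ".join(merge(worker_streams)))  # -> "low latency is critical"
```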
Abstract » Interactive surfaces have great potential for co-located collaboration because of their ability to track multiple inputs simultaneously. However, the multi-user experience on these devices could be enriched significantly if touch points could be associated with a particular user. Existing approaches to user identification are intrusive, require users to stay in a fixed position, or suffer from poor accuracy. We present a non-intrusive, high-accuracy technique for mapping touches to their corresponding user in a collaborative environment. By mounting a high-resolution camera above the interactive surface, we are able to identify touches reliably without any extra instrumentation, and users are able to move around the surface at will. Our technique, which leverages the back of users' hands as identifiers, supports walk-up-and-use situations in which multiple people interact on a shared surface.
Abstract » Electronic response systems known as “clickers” have demonstrated educational benefits in well-resourced classrooms, but remain out-of-reach for most schools due to their prohibitive cost. We propose a new, low-cost technique that utilizes computer vision for real-time polling of a classroom. Our approach allows teachers to ask a multiple-choice question. Students respond by holding up a qCard: a sheet of paper that contains a printed code, similar to a QR code, encoding their student IDs. Students indicate their answers (A, B, C or D) by holding the card in one of four orientations. Using a laptop and an off-the-shelf webcam, our software automatically recognizes and aggregates the students’ responses and displays them to the teacher. We built this system and performed initial trials in secondary schools in Bangalore, India. In a 25-student classroom, our system offers 99.8% recognition accuracy, captures 97% of responses within 10 seconds, and costs 15 times less than existing electronic solutions.
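The orientation-to-answer step can be sketched with simple geometry. The example below is illustrative only: it assumes the card's code has already been detected and its top-edge corners located, and it quantizes the edge's angle into one of the four answer choices.

```python
# Illustrative sketch: once the printed code's corners are detected, its
# rotation angle can be quantized into one of four answer orientations.
# The detection itself (finding the code and decoding the student ID) is the
# hard computer-vision part and is not shown here.
import math

ANSWERS = ["A", "B", "C", "D"]  # one answer per 90-degree orientation


def answer_from_corners(top_left, top_right):
    """Map the card's top-edge direction to an answer choice."""
    dx = top_right[0] - top_left[0]
    dy = top_right[1] - top_left[1]
    angle = math.degrees(math.atan2(dy, dx)) % 360
    # Quantize to the nearest multiple of 90 degrees.
    return ANSWERS[int(((angle + 45) % 360) // 90)]


# A card held upright reads "A"; rotated a quarter turn it reads "B".
print(answer_from_corners((0, 0), (10, 0)))   # A (0 degrees)
print(answer_from_corners((0, 0), (0, 10)))   # B (90 degrees)
```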
Abstract » Crowdsourcing has become a powerful paradigm for accomplishing work quickly and at scale, but involves significant challenges in quality control. Researchers have developed algorithmic quality control approaches based on either worker outputs (such as gold standards or worker agreement) or worker behavior (such as task fingerprinting), but each approach has serious limitations, especially for complex or creative work. Human evaluation addresses these limitations but does not scale well with increasing numbers of workers. We present CrowdScape, a system that supports the human evaluation of complex crowd work through interactive visualization and mixed initiative machine learning. The system combines information about worker behavior with worker outputs, helping users to better understand and harness the crowd. We describe the system and discuss its utility through grounded case studies. We explore other contexts where CrowdScape’s visualizations might be useful, such as in user studies.
Monday, October 08, 12:40 - 14:20
Lunch
Monday, October 08, 14:20 - 16:00
Tutorials & Learning
Chair: Joel Brandt, Adobe Research, USA
Abstract » Design patterns have proven useful in many creative fields, providing content creators with archetypal, reusable guidelines to leverage in projects. Creating such patterns, however, is a time-consuming, manual process, typically relegated to a few experts in any given domain. In this paper, we describe an algorithmic method for learning design patterns directly from data using techniques from natural language processing and structured concept learning. Given a set of labeled, hierarchical designs as input, we induce a probabilistic formal grammar over these exemplars. Once learned, this grammar encodes a set of generative rules for the class of designs, which can be sampled to synthesize novel artifacts. We demonstrate the method on geometric models and Web pages, and discuss how the learned patterns can drive new interaction mechanisms for content creators.
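Once such a grammar has been induced, synthesis amounts to sampling expansions top-down. The sketch below uses a small hypothetical grammar over page layouts; the paper induces its rules from labeled, hierarchical exemplars rather than writing them by hand.

```python
# Illustrative sketch: sample novel artifacts from a probabilistic grammar by
# expanding nonterminals top-down. The grammar below is hypothetical.
import random

# nonterminal -> list of (production, probability)
GRAMMAR = {
    "Page":    [(["Header", "Body", "Footer"], 1.0)],
    "Body":    [(["NavColumn", "Content"], 0.6), (["Content"], 0.4)],
    "Content": [(["Article"], 0.7), (["Article", "Sidebar"], 0.3)],
}


def sample(symbol, rng=random):
    """Expand a symbol recursively; symbols without rules are terminals."""
    if symbol not in GRAMMAR:
        return [symbol]
    productions, weights = zip(*GRAMMAR[symbol])
    chosen = rng.choices(productions, weights=weights, k=1)[0]
    result = []
    for child in chosen:
        result.extend(sample(child, rng))
    return result


print(sample("Page"))  # e.g. ['Header', 'NavColumn', 'Article', 'Footer']
```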
Abstract » Web chat is becoming the primary customer contact channel in customer relationship management (CRM), and Question/Answer/Lookup (QAL) is the dominant communication pattern in CRM agent-to-customer chat. Text-based web chat for QAL has two main usability problems. Chat transcripts between agents and customers are not tightly integrated into agent-side applications, requiring customer service agents to re-enter customer typed data. Also, sensitive information posted in chat sessions in plain text raises security concerns. The addition of web form widgets to web chat not only solves both of these problems but also adds new usability benefits to QAL. Forms can be defined beforehand or, more flexibly, dynamically composed. Two preliminary user studies were conducted. An agent-side study showed that adding inline forms to web chat decreased overall QAL completion time by 47 percent and increased QAL accuracy by removing all potential human errors. A customer-side study showed that web chat with inline forms is intuitive to customers.
Waken: Reverse Engineering Usage Information and Interface Structure from Software Videos - Paper - ACM
Abstract » We present Waken, an application-independent system that recognizes UI components and activities from screen captured videos, without any prior knowledge of that application. Waken can identify the cursors, icons, menus, and tooltips that an application contains, and when those items are used. Waken uses frame differencing to identify occurrences of behaviors that are common across graphical user interfaces. Candidate templates are built, and then other occurrences of those templates are identified using a multi-phase algorithm. An evaluation demonstrates that the system can successfully reconstruct many aspects of a UI without any prior application-dependent knowledge. To showcase the design opportunities that are introduced by having this additional meta-data, we present the Waken Video Player, which allows users to directly interact with UI components that are displayed in the video.
Abstract » Users of complex software applications often learn concepts and skills through step-by-step tutorials. Today, these tutorials are published in two dominant forms: static tutorials composed of images and text that are easy to scan, but cannot effectively describe dynamic interactions; and video tutorials that show all manipulations in detail, but are hard to navigate. We hypothesize that a mixed tutorial with static instructions and per-step videos can combine the benefits of both formats. We describe a comparative study of static, video, and mixed image manipulation tutorials with 12 participants and distill design guidelines for mixed tutorials. We present MixT, a system that automatically generates step-by-step mixed media tutorials from user demonstrations. MixT segments screencapture video into steps using logs of application commands and input events, applies video compositing techniques to focus on salient information, and highlights interactions through mouse trails. Informal evaluation suggests that automatically generated mixed media tutorials were as effective in helping users complete tasks as tutorials that were created manually.
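The segmentation step can be sketched as cutting the screencast at command-log timestamps. The example below uses hypothetical commands and times, and omits MixT's video compositing and mouse-trail highlighting.

```python
# Illustrative sketch: given the command log recorded during a demonstration,
# cut the screencast into one clip per tutorial step. Timestamps and command
# names are hypothetical stand-ins.

def segment_steps(command_log, video_duration):
    """command_log: list of (timestamp_seconds, command_name), sorted by time."""
    steps = []
    for i, (start, command) in enumerate(command_log):
        # Each step's clip runs until the next command, or the end of the video.
        end = command_log[i + 1][0] if i + 1 < len(command_log) else video_duration
        steps.append({"command": command, "clip": (start, end)})
    return steps


log = [(0.0, "Open Image"), (12.4, "Select > Color Range"), (31.0, "Adjust Levels")]
for step in segment_steps(log, video_duration=55.0):
    print(step)   # one mixed-media step: static instructions plus a short clip
```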
Abstract » We present GamiCAD, a gamified in-product, interactive tutorial system for first-time AutoCAD users. We introduce a software-event-driven finite state machine to model a user’s progress through a tutorial, which allows the system to provide real-time feedback and recognize successes and failures. GamiCAD provides extensive real-time visual and audio feedback that has not been explored before in the context of software tutorials. We perform an empirical evaluation of GamiCAD, comparing it to an equivalent in-product tutorial system without the gamified components. In the evaluation, users of the gamified system reported higher subjective engagement levels and performed a set of testing tasks faster with a higher completion ratio.
Abstract » Powerful image editing software like Adobe Photoshop and GIMP has complex interfaces that can be hard to master. To help users perform image editing tasks, we introduce tutorial-based applications (tapps) that retain the step-by-step structure and descriptive text of tutorials but can also automatically apply tutorial steps to new images. Thus, tapps can be used to batch process many images automatically, similar to traditional macros. Tapps also support interactive exploration of parameters, automatic variations, and direct manipulation (e.g., selection, brushing). Another key feature of tapps is that they execute on remote instances of Photoshop, which allows users to edit their images on any Web-enabled device. We demonstrate a working prototype system called TappCloud for creating, managing and using tapps. Initial user feedback indicates support for both the interactive features of tapps and their ability to automate image editing. We conclude with a discussion of approaches and challenges of pushing monolithic direct-manipulation GUIs to the cloud.
Monday, October 08, 16:00 - 16:40
Break
Monday, October 08, 16:40 - 18:20
Hands & Fingers
Chair: Chris Harrison, Carnegie Mellon University, USA
Abstract » We present Facet, a multi-display wrist-worn system consisting of multiple independent touch-sensitive segments joined into a bracelet. Facet automatically determines the pose of the system as a whole and of each segment individually. It further supports multi-segment touch, yielding a rich set of touch input techniques. Our work builds on these two primitives to allow the user to control how applications use segments alone and in coordination. Applications can expand to use more segments, collapse to encompass fewer, and be swapped with other segments. We also explore how the concepts from Facet could apply to other devices in this design space.
Abstract » We present the iRing, an intelligent input ring device developed for measuring finger gestures and external input. iRing recognizes rotation, finger bending, and external force via an infrared (IR) reflection sensor that leverages skin characteristics such as reflectance and softness. Furthermore, iRing allows using a push and stroke input method, which is popular in touch displays. The ring design has potential to be used as a wearable controller because its accessory shape is socially acceptable, easy to install, and safe, and iRing does not require extra devices. We present examples of iRing applications and discuss its validity as an inexpensive wearable interface and as a human sensing device.
Abstract » Gesture keyboards represent an increasingly popular way to input text on mobile devices today. However, current gesture keyboards are exclusively unimanual. To take advantage of the capability of modern multi-touch screens, we created a novel bimanual gesture text entry system, extending the gesture keyboard paradigm from one finger to multiple fingers. To address the complexity of recognizing bimanual gesture, we designed and implemented two related interaction methods, finger-release and space-required, both based on a new multi-stroke gesture recognition algorithm. A formal experiment showed that bimanual gesture behaviors were easy to learn. They improved comfort and reduced the physical demand relative to unimanual gestures on tablets. The results indicated that these new gesture keyboards were valuable complements to unimanual gesture and regular typing keyboards.
Abstract » We present Magic Finger, a small device worn on the fingertip, which supports always-available input. Magic Finger inverts the typical relationship between the finger and an interactive surface: with Magic Finger, we instrument the user’s finger itself, rather than the surface it is touching. Magic Finger senses touch through an optical mouse sensor, enabling any surface to act as a touch screen. Magic Finger also senses texture through a micro RGB camera, allowing contextual actions to be carried out based on the particular surface being touched. A technical evaluation shows that Magic Finger can distinguish 22 textures with an accuracy of 98.9%. We explore the interaction design space enabled by Magic Finger, and implement a number of novel interaction techniques that leverage its unique capabilities.
Abstract » There is a growing body of work in HCI on the design of communication technologies to help support lovers in long distance relationships. We build upon this work by presenting an exploratory study of hand-holding prototypes. Our work distinguishes itself by basing distance communication metaphors on elements of familiar, simple co-located behaviours. We argue that the combined evocative power of unique co-created physical representations of the absent other can be used by separated lovers to generate powerful and positive experiences, in turn sustaining romantic connections at a distance.
Abstract » Digits is a wrist-worn sensor that recovers the full 3D pose of the user's hand. This enables a variety of freehand interactions on the move. The system targets mobile settings, and is specifically designed to be low-power and easily reproducible using only off-the-shelf hardware. The electronics are self-contained on the user's wrist, but optically image the entirety of the user's hand. This data is processed using a new pipeline that robustly samples key parts of the hand, such as the tips and lower regions of each finger. These sparse samples are fed into a new kinematics model that leverages the biomechanical constraints of the hand to recover the 3D pose of the user's hand. The proposed system works without the need for full instrumentation of the hand (for example using data gloves), additional sensors in the environment, or depth cameras, which are currently prohibitive for mobile scenarios due to power and form factor considerations. We demonstrate the utility of Digits for a variety of application scenarios, including 3D spatial interaction on mobile phones, eyes-free interaction on-the-move, and gaming. We conclude with a quantitative and qualitative evaluation of our system, and discussion of strengths, limitations and future work.
Monday, October 08, 18:20 - 19:30
Break
Monday, October 08, 19:30 - 22:00
Demonstrations & Student Contest
iRotate Grasp: Automatic Screen Rotation based on Grasp of Mobile Devices - Refereed Demo
Abstract » Automatic screen rotation improves the viewing experience and usability of mobile devices, but current gravity-based approaches do not support postures such as lying on one side, and manual rotation switches require explicit user input. iRotate Grasp automatically rotates screens of mobile devices to match users’ viewing orientations based on how users are grasping the devices. Our insight is that users’ grasps are consistent for each orientation, but significantly differ between different orientations. Our prototype embeds a total of 32 light sensors along the four sides and the back of an iPod Touch, and uses a support vector machine (SVM) to recognize grasps at 25 Hz. We collected 6 users’ usage data under 54 different conditions: 1) grasping the device using left, right, and both hands, 2) scrolling, zooming and typing, 3) in portrait, landscape-left, and landscape-right orientations, and while 4) sitting and lying down on one side. Results show that our grasp-based approach is promising, and our iRotate Grasp prototype could correctly rotate the screen 90.5% of the time when training and testing on different users.
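The grasp recognition step reduces to supervised classification of a 32-value sensor frame. The sketch below, using scikit-learn's SVC on random stand-in data, only illustrates the shape of that pipeline; the real system trains on recorded grasps and classifies live frames at 25 Hz.

```python
# Illustrative sketch of the classification step: a support vector machine
# maps a 32-value light-sensor frame to a screen orientation. The sensor
# values below are random stand-ins, not recorded grasp data.
import numpy as np
from sklearn.svm import SVC

ORIENTATIONS = ["portrait", "landscape-left", "landscape-right"]
rng = np.random.default_rng(0)

# Fake training data: 60 frames of 32 sensor readings plus their labels.
X_train = rng.random((60, 32))
y_train = rng.choice(ORIENTATIONS, size=60)

classifier = SVC(kernel="rbf")        # the abstract reports using an SVM
classifier.fit(X_train, y_train)

live_frame = rng.random((1, 32))      # one 32-sensor reading from the device
print(classifier.predict(live_frame)) # -> orientation to rotate the screen to
```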
Point and Share: From Paper to Whiteboard - Refereed Demo
Abstract » Traditional writing instruments have the potential to enable new forms of interaction and collaboration through digital enhancement. This work specifically enables the user to utilize pen and paper as input mechanisms for content to be displayed on a shared interactive whiteboard. We introduce a pen cap with an infrared LED, an actuator and a switch. Pointing the pen cap at the whiteboard allows users to select and position a “canvas” on the whiteboard to display handwritten text, while the actuator enables resizing the canvas and the text. It is conceivable that anything one can write on paper, anywhere, could be displayed on an interactive whiteboard.
Pebbles: An Interactive Configuration Tool for Indoor Robot Navigation - Refereed Demo
Abstract » This study presents an interactive configuration tool that assists non-expert users in designing a specific navigation route for a mobile robot in an indoor environment. The user places small active markers, called pebbles, on the floor along the desired route in order to guide the robot to the destination. The active markers establish a navigation network by communicating with each other via IR beacons, and the robot follows the markers to reach the designated goal. During installation, the user receives effective feedback from LED indicators and voice prompts, so that the user can immediately understand whether the navigation route is configured as expected. With this tool a novice user may easily customize a mobile robot for various indoor tasks.
Speaking With the Crowd - Refereed Demo
Abstract » Automated systems are not yet able to engage in a robust dialogue with users due to the complexity and ambiguity of natural language. However, humans can easily converse with one another and maintain a shared history of past interactions. In this paper, we introduce Chorus, a system that enables real-time, two-way natural language conversation between an end user and a crowd acting as a single agent. Chorus is capable of maintaining a consistent, on-topic conversation with end users across multiple sessions, despite constituent individuals perpetually joining and leaving the crowd. This is enabled by using a curated shared dialogue history.
Even though crowd members are constantly providing input, we present users with a stream of dialogue that appears to be from a single conversational partner. Experiments demonstrate that dialogue with Chorus displays elements of conversational memory and interaction consistency. Workers were able to answer 84.6% of user queries correctly, demonstrating that crowd-powered communication interfaces can serve as a robust means of interacting with software systems.
TouchCast : An On-line Platform for Creation and Sharing of Tactile Content Based on Tactile Copy & Paste - Refereed Demo
Abstract » We propose TouchCast, an on-line platform for creating and sharing tactile content based on Tactile Copy & Paste. User-Generated Tactile Content refers to tactile content that is created, shared and appreciated by general Internet users. TouchCast enables users to create tactile content by applying tactile textures to existing on-line content (e.g., illustrations) and to share the created content over the network. Applied textures are scanned from real objects as audio signals, a technique we call Tactile Copy & Paste. In this study, we implement the system as a web browser add-on and use it to create User-Generated Tactile Content.
Spatial augmented reality for physical drawing - Refereed Demo
Abstract » Spatial augmented reality (SAR) makes possible the projection of virtual environments into the real world. In this demo, we propose to demonstrate our SAR tools dedicated to the creation of physical drawings. These range from the simplest tools, such as the projection of virtual guidelines for tracing lines and curves, to more advanced techniques that enable stereoscopic drawing through the projection of a 3D scene. This demo presents how computer graphics tools can ease drawing, and how they will enable new kinds of physical drawings.
SlickFeel: Sliding and Clicking Haptic Feedback on a Touchscreen - Refereed Demo
Abstract » We present SlickFeel, a single haptic display setup that can deliver two distinct types of feedback to a finger on a touchscreen during typical operations of sliding and clicking. Sliding feedback enables the sliding finger to feel interactive objects on a touchscreen through variations in friction. Clicking feedback provides a key-click sensation for confirming a key or button click. Two scenarios have been developed to demonstrate the utility of the two haptic effects. In the first, simple button-click scenario, a user feels the positions of four buttons on a touchscreen by sliding a finger over them and feels a simulated key-click signal by pressing on any of the buttons. In the second scenario, the advantage of haptic feedback is demonstrated in a haptically-enhanced thumb-typing scenario. A user enters text on a touchscreen with two thumbs without having to monitor the thumbs’ locations on the screen. By integrating SlickFeel with a Kindle Fire tablet, we show that it can be used with existing mobile touchscreen devices.
Touch Sensing by Partial Shadowing of PV Module - Refereed Demo
Abstract » A novel touch sensing technique is proposed. By utilizing partial shadowing of a photovoltaic (PV) module, touch events are accurately detected. Since the PV module also works as a power source, a battery-less touch sensing device is easily realized. We develop a wireless touch commander consisting of 6 PV modules so that the user can provide input using both touch and swipe actions.
The FreeD – A Handheld Digital Milling Device for Craft and Fabrication - Refereed Demo
Abstract » We present an approach to combine digital fabrication and craft that is focused on a new fabrication experience. The FreeD is a hand-held, digitally controlled milling device. It is guided and monitored by a computer while still preserving gestural freedom. The computer intervenes only when the milling bit approaches the 3D model, which was designed beforehand, either by slowing down the spindle’s speed or by drawing back the shaft. The rest of the time it allows complete freedom, allowing the user to manipulate and shape the work in any creative way. We believe The FreeD will enable a designer to move in between the straight boundaries of established CAD systems and the free expression of handcraft.
Elastic Scroll for Multi-focus Interactions - Refereed Demo
Abstract » This paper proposes a novel and efficient multi-focus scroll interface that consists of a two-step operation using a contents distortion technique. The displayed content can be handled just like an elastic material that can be shrunk and stretched by a user’s fingers. In the first operation, the user’s dragging temporarily shows the results of the viewport transition of the scroll by elastically distorting the content. This operation allows the user to see both the newly obtained and the original focus on the viewport. Then, three types of simple gestures can be used to perform the second operation, such as scrolling, restoring and zooming out, to obtain the desired focus (or foci).
Needle User Interface: A Sewing Interface Using Layered Conductive Fabrics - Refereed Demo
Abstract » Embroidery is a creative manual activity practiced by many people for a living. Such a craft demands skill and knowledge, and as it is sometimes complicated and delicate, it can be difficult for beginners to learn. We propose a system, named the Needle User Interface, which enables sewers to record and share their needlework, and receive feedback. In particular, this system can detect the position and orientation of a needle being inserted into and removed from a textile. Moreover, this system can give visual, auditory, and haptic feedback to users in real time for directing their actions appropriately. In this paper, we describe the system design, the input system, and the feedback delivery mechanism.
Sawtooth planar waves for haptic feedback - Refereed Demo
Abstract » Current touchscreen technology does not provide adequate haptic feedback to the user. Most haptic feedback solutions for touchscreens involve either a) deforming the surface layers of the screen itself or b) placing actuators under the screen to vibrate it. This means that we have only limited control over where on the screen the feedback feels like it is coming from, and that we are limited to feedback that feels like movement up and down, orthogonal to the screen. In this work I demonstrate a novel technique for haptic feedback: sawtooth planar waves. In a series of papers, Canny & Reznick showed that sawtooth planar waves could be used for object manipulation. Here that technique is applied to haptic feedback. By varying the input waves, from one to four actuators, it is possible to provide feelings of motion in any planar direction to a finger at one point on the screen while providing a different sensation, or none at all, to fingers placed at several other points on the screen.
Enjoying Virtual Handcrafting with ToolDevice - Refereed Demo
Abstract » ToolDevice is a set of devices developed to help users in spatial work such as layout design and three-dimensional (3D) modeling. It consists of three components: TweezersDevice, Knife/HammerDevice, and BrushDevice, which use hand tool metaphors to help users recognize each device’s unique functions. We have developed a mixed reality (MR) 3D modeling system that imitates real-life woodworking using the TweezersDevice and the Knife/HammerDevice. In the system, users can pick up and move virtual objects with the TweezersDevice. Users can also cut and join virtual objects using the Knife/HammerDevice. By repeating these operations, users can build virtual wood models.
Tuesday, October 09, 09:00 - 10:55
Toolkits
Chair: Krzysztof Gajos, Harvard University, USA
Abstract » On the desktop, users are accustomed to having visible handles to objects that they want to organize, share, or manipulate. Web applications today feature many classes of such objects, like flight itineraries, products for sale, people, recipes, and businesses, but there are no interoperable handles for high-level semantic objects that users can grab. This paper proposes Clui, a platform for exploring a new data type, called a Webit, that provides uniform handles to rich objects. Clui uses plugins to 1) create Webits on existing pages by extracting semantic data from those pages, and 2) augment existing sites with drag and drop targets that accept and interpret Webits. Users drag and drop Webits between sites to transfer data, auto-fill search forms, map associated locations, or share Webits with others. Clui enables experimentation with handles to semantic objects and the standards that underlie them.
Abstract » The increasing popularity of interactive camera-based programs highlights the inadequacies of conventional IDEs in developing these programs given their distinctive attributes and workflows. We present DejaVu, an IDE enhancement that eases the development of these programs by enabling programmers to visually and continuously monitor program data in consistency with the frame-based pipeline of computer-vision programs; and to easily record, review, and reprocess temporal data to iteratively improve the processing of non-reproducible camera input. DejaVu was positively received by three experienced programmers of interactive camera-based programs in our preliminary user trial.
Abstract » This paper presents a user driven redesign of the domestic network infrastructure that draws upon a series of ethnographic studies of home networks. We present an infrastructure based around a purpose built access point that has modified the handling of protocols and services to reflect the interactive needs of the home. The developed infrastructure offers a novel measurement framework that allows a broad range of infrastructure information to be easily captured and made available to interactive applications. This is complemented by a diverse set of novel interactive control mechanisms and interfaces for the underlying infrastructure. We also briefly reflect on the technical and user issues arising from deployments.
DataPlay: Interactive Tweaking and Example-driven Correction of Graphical Database Queries - Paper - ACM
Abstract » Writing complex queries in SQL is a challenge for users. Prior work has developed several techniques to ease query specification but none of these techniques are applicable to a particularly difficult class of queries: quantified queries. Our hypothesis is that users prefer to specify quantified queries interactively by trial-and-error. We identify two impediments to this form of interactive trial-and-error query specification in SQL: (i) changing quantifiers often requires global syntactical query restructuring, and (ii) the absence of non-answers from SQL’s results makes verifying query correctness difficult. We remedy these issues with DataPlay, a query tool with an underlying graphical query language, a unique data model and a graphical interface. DataPlay provides two interaction features that support trial-and-error query specification. First, DataPlay allows users to directly manipulate a graphical query by changing quantifiers and modifying dependencies between constraints. Users receive real-time feedback in the form of updated answers and non-answers. Second, DataPlay can auto-correct a user’s query, based on user feedback about which tuples to keep or drop from the answers and non-answers. We evaluated the effectiveness of each interaction feature with a user study and we found that direct query manipulation is more effective than auto-correction for simple queries but auto-correction is more effective than direct query manipulation for more complex queries.
SnipMatch: Using Source Code Context to Enhance Snippet Retrieval and Parameterization - Paper - ACM
Abstract » Programmers routinely use source code snippets to increase their productivity. However, locating and adapting code snippets to the current context still takes time: for example, variables must be renamed, and dependencies included. We believe that when programmers decide to invest time in creating a new code snippet from scratch, they would also be willing to spend additional effort to make that code snippet configurable and easy to integrate. To explore this insight, we built SnipMatch, a plug-in for the Eclipse IDE. SnipMatch introduces a simple markup that allows snippet authors to specify search patterns and integration instructions. SnipMatch leverages this information, in conjunction with current code context, to improve snippet search and parameterization. For example, when a search query includes local variables, SnipMatch suggests compatible snippets, and automatically adapts them by substituting in these variables. In the lab, we observed that participants integrated snippets faster when using SnipMatch than when using standard Eclipse. Findings from a public deployment to 93 programmers suggest that SnipMatch has become integrated into the work practices of real users.
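The parameterization idea can be sketched as template substitution driven by the surrounding code context. The example below is conceptual only; the snippet format, parameter kinds, and matching rules are hypothetical stand-ins for SnipMatch's actual markup and Eclipse integration.

```python
# Conceptual sketch only (not SnipMatch's actual markup or Eclipse API):
# a snippet template declares parameters, and insertion substitutes
# compatible variables drawn from the code context around the cursor.

SNIPPET = {
    "pattern": "iterate over $list",
    "params": {"$list": "List", "$item": "fresh"},   # "fresh" = new local name
    "body": "for ($item : $list) {\n    // TODO\n}",
}


def instantiate(snippet, context_vars):
    """context_vars: variable name -> declared type, from the current scope."""
    substitutions = {}
    for param, kind in snippet["params"].items():
        if kind == "fresh":
            substitutions[param] = param.lstrip("$")          # e.g. 'item'
        else:
            # Pick a compatible in-scope variable, if any exists.
            matches = [n for n, t in context_vars.items() if t == kind]
            substitutions[param] = matches[0] if matches else param
    body = snippet["body"]
    for param, value in substitutions.items():
        body = body.replace(param, value)
    return body


# A query mentioning the local variable 'names' (a List) adapts the snippet.
print(instantiate(SNIPPET, {"names": "List", "count": "int"}))
```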
ConstraintJS: Programming Interactive Behaviors for the Web by Integrating Constraints and States - Paper - ACM
Abstract » Interactive behaviors in GUIs are often described in terms of states, transitions, and constraints, where the constraints only hold in certain states. These constraints maintain relationships among objects, control the graphical layout, and link the user interface to an underlying data model. However, no existing Web implementation technology provides direct support for all of these, so the code for maintaining constraints and tracking state may end up spread across multiple languages and libraries. In this paper we describe ConstraintJS, a system that integrates constraints and finite-state machines (FSMs) with Web languages. A key role for the FSMs is to enable and disable constraints based on the interface’s current mode, making it possible to write constraints that sometimes hold. We illustrate that constraints combined with FSMs can be a clearer way of defining many interactive behaviors with a series of examples.
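The notion of constraints that only hold in certain states can be sketched independently of the Web stack. The Python example below is a conceptual stand-in, not ConstraintJS's JavaScript API: an FSM state selects which constraint currently computes a property.

```python
# Conceptual sketch in Python (not ConstraintJS's JavaScript API): a constraint
# is a function that recomputes a value on demand, and an FSM decides which
# constraint currently holds, so a property can be bound differently per state.

class Interface:
    def __init__(self):
        self.state = "idle"                 # current FSM state
        self.pointer_x = 0
        # Constraints that only hold in particular states.
        self.width_constraints = {
            "idle":     lambda: 100,                  # fixed width
            "dragging": lambda: self.pointer_x - 10,  # follows the pointer
        }

    def transition(self, new_state):
        self.state = new_state              # enables/disables constraints

    @property
    def width(self):
        # Evaluate whichever constraint is active in the current state.
        return self.width_constraints[self.state]()


ui = Interface()
print(ui.width)            # 100: the 'idle' constraint holds
ui.transition("dragging")
ui.pointer_x = 250
print(ui.width)            # 240: the 'dragging' constraint now holds
```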
Abstract » User interface toolkit research has traditionally assumed that developers have full control of an interface. This assumption is challenged by the mashup nature of many modern interfaces, in which different portions of a single interface are implemented by multiple, potentially mutually distrusting developers (e.g., an Android application embedding a third-party advertisement). We propose considering security as a primary goal for user interface toolkits. We motivate the need for security at this level by examining today’s mashup scenarios, in which security and interface flexibility are not simultaneously achieved. We describe a security-aware user interface toolkit architecture that secures interface elements while providing developers with the flexibility and expressivity traditionally desired in a user interface toolkit. By challenging trust assumptions inherent in existing approaches, this architecture effectively addresses important interface-level security concerns.
Tuesday, October 09, 10:55 - 11:35
Break / Poster session
Interactions Speak Louder Than Words: Shared User Models and Adaptive Interfaces - Doctoral Symposium
Abstract » Touch-screens are becoming increasingly ubiquitous. They have great appeal due to their capabilities to support new forms of human interaction, including their abilities to interpret rich gestural inputs, render flexible user interfaces and enable multi-user interactions. However, the technology creates new challenges and barriers for users with limited levels of vision and motor abilities. The PhD work described in this paper proposes a technique combining Shared User Models (SUM) and adaptive interfaces to improve the accessibility of touch-screen devices for people with low levels of vision and motor ability. SUM, built from an individual’s interaction data across multiple applications and devices, is used to infer new knowledge of their abilities and characteristics, without the need for continuous calibration exercises or user configurations. This approach has been realized through the development of an open source software framework to support the creation of applications that make use of SUM to adapt interfaces that match the needs of individual users.
An Interface Agent for Non-Visual, Accessible Web Automation - Doctoral Symposium
Abstract » The Web is far less usable and accessible for users with visual impairments than it is for sighted people. Web automation has the potential to bridge the divide between the ways visually impaired and sighted people access the Web, and to enable visually impaired users to breeze through Web browsing tasks that were previously slow, hard, or even impossible to achieve. Typical automation interfaces require that the user record a macro, a useful sequence of browsing steps, so that these steps can be replayed in the future. In this paper, I present a high-level overview of an approach that enables users to quickly find relevant information on a webpage and to automate browsing without recording macros. This approach is potentially useful for both visually impaired and sighted users.
Medical Operating Documents: Dynamic Checklists Improve Crisis Attention - Doctoral Symposium
Abstract » The attentional aspects of crisis computing—supporting highly trained teams as they respond to real-life emergencies—have been underexplored in the user interface community. My research investigates the development of interactive software systems that support crisis teams, with an eye towards intelligently managing attention. In this paper, I briefly describe MDOCS, a Medical operating DOCuments System built for time-critical interaction. MDOCS is a multi-user, multi-surface software system that implements dynamic checklists and interactive cognitive aids written to support medical crisis teams. I present the results of a study that evaluates the deployment of MDOCS in a realistic, mannequin-based medical simulator used by anesthesiologists. I propose controlled laboratory experiments that evaluate the feasibility and effectiveness of our design principles and attentional interaction techniques.
Spatial augmented reality to enhance physical artistic creation - Doctoral Symposium
Abstract » Spatial augmented reality (SAR) promises the integration of digital information into the real (physical) world through projection. In this doctoral symposium paper, I propose different tools that improve the speed or ease of drawing by projecting photos, virtual construction lines and interactive 3D scenes. After describing the tools, I explain some future challenges to explore, such as the creation of tools that help create drawings that are “difficult” for a human being to achieve but easy for a computer. Furthermore, I propose some insights for the creation of digital games and programs that can take full advantage of physical drawings.
Machine Learning Models for Uncertain Interaction - Doctoral Symposium
Abstract » As interaction methods beyond the static mouse and keyboard setup of the desktop era - such as touch, gesture sensing, and visual tracking - become more common, existing interaction paradigms are no longer good enough. These new modalities have high uncertainty, and conventional interfaces are not designed to reflect this. Research has shown that modelling uncertainty can improve the quality of interaction with these systems. Machine learning offers a rich set of tools to make probabilistic inferences in uncertain systems - this is the focus of my thesis work. In particular, I'm interested in making inferences at the sensor level and propagating uncertainty forward appropriately to applications. In this paper I describe a probabilistic model for touch interaction, and discuss how I intend to use the uncertainty in this model to improve typing accuracy on a soft keyboard. The model described here lays the groundwork for a rich framework for interaction in the presence of uncertainty, incorporating data from multiple sensors to make more accurate inferences about the goals of users, and allowing systems to adapt smoothly and appropriately to their context of use.
Towards Document Engineering on Pen and Touch-Operated Interactive Tabletops - Doctoral Symposium
Abstract » Touch interfaces have now become mainstream thanks to modern smartphones and tablets. However, there are still very few “productivity” applications, i.e. tools that support mundane but essential work, especially for large interactive surfaces such as digital tabletops. This work aims to partly fill the relative void in the area of document engineering by investigating what kind of intuitive and efficient tools can be provided to support the manipulation of documents on a digital workdesk, in particular the creation and editing of documents. The fundamental interaction model relies on bimanual pen and multitouch input, which was recently introduced to tabletops and enables richer interaction possibilities. The goal is ultimately to provide useful and highly accessible UIs for document-centric applications, whose design principles will hopefully pave the way from DTP towards DTTP (Digital Tabletop Publishing).
Closing the Loop Between Intentions and Actions - Doctoral Symposium
Abstract » In this document, I propose systems that aim to minimize the gap between intentions and the corresponding actions under different scenarios. This gap exists for many reasons, such as the subjective mapping between the two, a lack of resources to implement the action, or inherent noise in the physical processes.
The proposed system observes the action and infers the intention behind it. The system then generates a refined action using the inference. The inferred intention and the refined action are then provided as feedback to the user, who can then perform corrective actions or accept the refined action as-is as the desired result. I demonstrate the design and implementation of such systems through five projects: Image Deblurring, Tracking Block Model Assembly, Animating with Physical Proxies, What Affects Handwriting, and Spying on the Writer.
Data-Driven Interactions for Web Design - Doctoral Symposium
Abstract » This thesis describes how data-driven approaches to Web design problems can
enable useful interactions for designers. It presents three machine learning
applications which enable new interaction mechanisms for Web design: rapid
retargeting between page designs, scalable design search, and generative
probabilistic model induction to support design interactions cast as
probabilistic inference. It also presents a scalable architecture for efficient
data-mining on Web designs, which supports these three applications.
A Guidance Technique for Motion Tracking with a Handheld Camera using Auditory Feedback - Poster
Abstract » We introduce a novel guidance technique based on auditory feedback for a handheld video camera. Tracking a moving object with a handheld camera is a difficult task, especially when the camera operator follows the target, because it is difficult to see through the viewfinder at the same time as following the target. The proposed technique provides auditory feedback via a headphone, which assists the operator in keeping the target in sight. Two feedback sounds are introduced: three-dimensional (3D) audio and amplitude modulation (AM)-based sonification.
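The abstract does not specify the exact sonification mapping, so the following is only a plausible sketch: the tracked target's normalised offset from the frame centre controls the depth of amplitude modulation on a carrier tone, so the operator hears a stronger "wobble" the further the target drifts out of frame. All constants are assumptions.

    # Hypothetical AM-based sonification of target offset (constants invented).
    import numpy as np

    SAMPLE_RATE = 44100
    CARRIER_HZ = 880.0    # carrier tone heard in the headphone
    MOD_HZ = 8.0          # modulation rate

    def am_feedback(offset_norm, duration=0.1):
        """Return audio samples; offset_norm in [-1, 1] is the target's
        horizontal offset from the frame centre (its sign could drive panning)."""
        t = np.arange(int(SAMPLE_RATE * duration)) / SAMPLE_RATE
        depth = min(1.0, abs(offset_norm))              # more offset -> deeper AM
        envelope = 1.0 - depth * 0.5 * (1 + np.sin(2 * np.pi * MOD_HZ * t))
        return envelope * np.sin(2 * np.pi * CARRIER_HZ * t)

    samples = am_feedback(offset_norm=0.6)  # target drifting right of centre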
A Proposal for a MMG-based Hand Gesture Recognition Method - Poster
Abstract » We propose a novel hand-gesture recognition method based on mechanomyograms (MMGs). Skeletal muscles generate sounds specific to their activity. By recording and analyzing these sounds, MMGs provide a means of evaluating that activity. Previous research revealed that specific motions produce specific sounds, enabling human motion to be classified based on MMGs. In that research, microphones and accelerometers were often used to record muscle sounds. However, environmental conditions such as noise and human motion itself easily overwhelm such sensors. In this paper, we propose piezoelectric-based sensing of MMGs to improve robustness against environmental conditions. A preliminary evaluation shows that this method is capable of classifying several hand gestures with high accuracy in certain situations.
Highly Deformable Interactive 3D Surface Display - Poster
Abstract » In this research, we focused on the flexibility limitation of a display material as one of the main causes of height constraints in deformable surfaces. We propose a method that not only utilizes the material's flexibility but also allows for increased variations of shapes and their corresponding interaction possibilities. Using this method, our proposed display design can support additional expansion via protrusion of an air-pressure-controlled moldable display surface, using a residual cloth-excess method and a fixed airbag mount.
Development of a Non-contact Tongue-motion Acquisition System - Poster
You Can't Force Calm: Designing and Evaluating Respiratory Regulating Interfaces for Calming Technology - Poster
Abstract » Interactive systems are increasingly being used to explicitly support change in the user's psychophysiological state and behavior. One trend in this vein is systems that support calm breathing habits. We designed and evaluated techniques to support respiratory regulation to reduce stress and increase parasympathetic tone. Our study revealed that auditory guidance was more effective than visual at creating self-reported calm. We attribute this to the users' ability to effectively map sound to respiration, thereby reducing cognitive load and mental exertion. Interestingly, we found that visual guidance led to more respiratory change but less subjective calm.
Thus, motivating users to exert physical or mental efforts may counter the calming effects of slow breathing. Designers of calming technologies must acknowledge the discrepancy between mechanical slow breathing and experiential calm in designing future systems.
E-Block: A Tangible Programming Tool for Children - Poster
Abstract » E-Block is a tangible programming tool for children aged 5 to 9 which gives children a preliminary understanding of programming. Children can write programs to play a maze game by placing the programming blocks in E-Block. The two stages of a general programming process, programming and running, are both embodied in E-Block. We realized E-Block using wireless and infrared technology and provided feedback on both the screen and the programming blocks. The results of a preliminary user study showed that E-Block is attractive to children and easy to learn and use.
MISO: A Context-Sensitive Multimodal Interface for Smart Objects Based on Hand Gestures and Finger Snaps - Poster
Abstract » We present an unobtrusive multimodal interface for smart objects (MISO) in
an everyday indoor environment.
MISO uses pointing for object selection and context-sensitive
arm gestures for object control. Finger snaps are used to confirm
object selections and to aid with gesture segmentation. Audio feedback is
provided during the interaction. The use of a Kinect depth camera
allows for a compact system and robustness in
varying environments and lighting conditions at low cost.
BallCam! Dynamic View Synthesis from Spinning Cameras - Poster
Abstract » We are interested in generating novel video sequences from a ball’s point of view for sports domains. Despite the challenge of extreme camera motion, we show that we can leverage the periodicity of spinning cameras to generate a stabilized ball point-of-view video. We present preliminary results of image stabilization and view synthesis from a single camera being hurled in the air at 600 RPM.
VideoInk: A Pen-based Approach for Video Editing - Poster
Abstract » Due to the growth of video sharing, video manipulation has become important, yet it remains a hard task. To improve it, this work proposes a pen-based approach called VideoInk. The concept exploits the painting metaphor, replacing digital ink with video frames. The method allows the user to paint video content onto a canvas, which works as a two-dimensional timeline. The approach includes transition effects and zoom features based on pen pressure. A Tablet PC prototype implementing the concept was also developed.
Transparent Display Interaction without Binocular Parallax - Poster
Abstract » Binocular parallax is a problem for any interaction system that has a transparent display and objects behind it. A proposed quantitative measure called the Binocular Selectability Discriminant (BSD) allows UI designers to predict the ability of the user to perform selection tasks in their transparent display systems, in spite of binocular parallax. A proposed technique called Single-Distance Pseudo Transparency (SDPT) aims to eliminate binocular parallax for on-screen interactions that require precision. A mock-up study shows the potential and directions for future investigation.
Programming With Everybody: Tightening the Copy-Modify-Publish Feedback Loop - Poster
Abstract » People write more code than they ever share online. They also copy and tweak code more often than they contribute their modifications back to the public. These situations can lead to widespread duplication of effort. However, the copy-modify-publish feedback loop which could solve the problem is inhibited by the effort required to publish code online. In this paper we present our preliminary, ongoing effort to create Ditty, a programming environment that attacks the problem by sharing changes immediately, making all code public by default. Ditty tracks the changes users make to code they find and exposes the modified versions alongside the original so that commonly-used derivatives can eventually become canonical. Our work will examine mechanical and social methods to consolidate global effort on common code snippets, and the effects of designing a programming interface that inspires a feeling of the whole world programming together.
sleepyWhispers: Sharing Goodnights within Distant Relationships - Poster
Abstract » There is a growing body of work in HCI on the design of communication technologies to help support lovers in long-distance relationships. We build upon this work by presenting an exploratory study of a prototype device intended to allow distant lovers to share goodnight messages. Our work distinguishes itself by basing distance communication metaphors on elements of familiar, simple co-located behaviours. We argue that voice remains an under-utilised medium when designing interactive technologies for long-distance couples. Through exploring the results of a 2-month case study we present some of the unique challenges that using voice entails.
Crowd-Based Recognition of Web Interaction Patterns - Poster
Abstract » Web automation often involves users describing complex tasks to a system, with directives generally limited to low-level constituent actions like "click the search button." This level of description is unnatural and makes it difficult to generalize the task across websites. In this paper, we propose a system for automatically recognizing higher-level interaction patterns from users' completion of tasks, such as "searching for cat videos" or "replying to a post". We present PatFinder, a system that identifies these patterns using the input of crowd workers. We validate the system by generating data for 10 tasks, having 62 crowd workers label them, and automatically extracting 14 interaction patterns. Our results show that the number of patterns grows sublinearly with the number of tasks, suggesting that a small finite set of patterns may suffice to describe the vast majority of tasks on the web.
Restorable Backspace - Poster
Abstract » This paper presents Restorable Backspace, an input helper for correcting mistyping. It stores characters deleted by backspace keystrokes and restores them in the retyping phase. We developed a restoration algorithm that compares the deleted characters with the retyped characters and makes a suggestion while the user retypes. In a pilot study we observed that the algorithm worked as expected in most cases. All participants in the pilot study expressed satisfaction with the concept of Restorable Backspace.
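The restoration algorithm itself is not given in the abstract; a minimal sketch of the general idea (under assumptions about how input events arrive) buffers the characters removed by backspace and, as the user retypes, offers the remainder of the deletion once the retyped prefix matches it.

    # Simplified sketch of a restorable backspace (hypothetical event interface).
    class RestorableBackspace:
        def __init__(self):
            self.deleted = ""   # most recently deleted run of characters
            self.retyped = ""   # characters typed since the last deletion

        def on_backspace(self, char):
            self.deleted = char + self.deleted   # backspacing deletes right-to-left
            self.retyped = ""

        def on_type(self, char):
            self.retyped += char
            # Suggest restoring the rest of the deletion if the prefix matches.
            if self.deleted.startswith(self.retyped):
                return self.deleted[len(self.retyped):]
            return None

    rb = RestorableBackspace()
    for c in reversed("hello"):   # the user backspaces over "hello"
        rb.on_backspace(c)
    print(rb.on_type("h"))        # -> "ello" offered for restoration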
Development of a Non-contact Tongue-motion Acquisition System - Poster
Abstract » We present a new tongue detection system called SITA, which comprises only a Kinect device and a conventional laptop computer. In contrast with other tongue-based devices, the SITA system does not require the subject to wear a device. This avoids the issue of oral hygiene and removes the risk of swallowing a device inserted in the mouth. In this paper, we introduce the SITA system and an application. To evaluate the system, a user test was conducted. The results indicate that the system can detect the tongue position in real time. Moreover, the system also offers possibilities for training the tongue.
Follow-Me!: Conducting a Virtual Concert - Poster
Abstract » In this paper, we present a real-time continuous gesture recognition system for conducting a virtual concert. Our system allows the user to control beat, by conducting four different beat-pattern gestures; tempo, by making faster or slower gestures; volume, by making larger or smaller gestures; and instrument emphasis, by directing the gestures towards specific areas of the orchestra on a large display. A recognition accuracy of up to 95% was achieved for the conducting gestures (beat, tempo, and volume).
Synchrum: A Tangible Interface for Rhythmic Collaboration - Poster
Abstract » Synchrum is a tangible interface, inspired by the Tibetan prayer wheel, for audience participation and collaboration during digital performance. It engages audience members in effortful interaction, where they have to rotate the device in accord with a given rotation speed. We used Synchrum in a video installation and report our observations.
Review Explorer: An Innovative Interface for Displaying and Collecting Categorized Review Information - Poster
Abstract » Review Explorer is an interface that utilizes categorized information to help users explore a large number of online reviews more easily. It allows users to sort entities (e.g. restaurants, products) based on their ratings of different aspects (e.g. food for restaurants) and highlights sentences that are related to the selected aspect. Existing interfaces that summarize the aspect information in reviews suffer from the erroneous predictions made by the systems. To solve this problem, Review Explorer performs real-time aspect sentiment analysis while a reviewer is composing a review and provides an interface for the reviewer to easily correct the errors. This novel design motivates reviewers to provide corrected aspect sentiment labels, which enables our system to provide more accurate information than existing interfaces.
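Review Explorer's actual aspect sentiment models are not described in the abstract; as a purely illustrative stand-in, a keyword-lexicon tagger can label each sentence with an aspect and a sentiment score and let the reviewer override the prediction. The keyword lists and lexicon below are invented.

    # Toy stand-in for per-sentence aspect sentiment tagging with user correction.
    ASPECT_KEYWORDS = {"food": {"pasta", "pizza", "taste"},
                       "service": {"waiter", "staff", "service"}}
    SENTIMENT = {"great": 1, "delicious": 1, "slow": -1, "rude": -1}

    def tag_sentence(sentence):
        words = set(sentence.lower().replace(".", "").split())
        aspect = next((a for a, kw in ASPECT_KEYWORDS.items() if words & kw), None)
        score = sum(SENTIMENT.get(w, 0) for w in words)
        return {"aspect": aspect, "sentiment": score, "corrected": False}

    def correct(label, aspect=None, sentiment=None):
        """Reviewer overrides the system's prediction while composing."""
        if aspect is not None:
            label["aspect"] = aspect
        if sentiment is not None:
            label["sentiment"] = sentiment
        label["corrected"] = True
        return label

    label = tag_sentence("The pasta was delicious but the waiter was slow")
    label = correct(label, aspect="service", sentiment=-1)  # reviewer fixes the label
    print(label)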
mashpoint: Browsing the Web along Structured Lines - Poster
Abstract » Large numbers of Web sites support rich data-centric features to explore and interact with data. In this paper we present mashpoint, a framework that allows distributed data-powered Web applications to be linked based on similarities of the entities in their data. By linking applications in this way, we allow users to browse with selections of data from one application to another. This sort of browsing allows complex queries and exploration of data to be carried out by average Web users across multiple applications. We additionally use this concept to surface structured information to users in Web pages. In this paper we present this concept and our initial prototype.
Lost in the Dark: Emotion Adaption - Poster
Abstract » Environments that can adjust to the user have been sought in recent years, particularly in the area of Human-Computer Interaction. Environments able to recognize the user's emotions and react in consequence have been of interest in the area of Affective Computing. This work presents a project, an adaptable 3D video game called Lost in the Dark: Emotion Adaption, which uses the user's emotions as input to alter and adjust the gaming environment. To achieve this, an interface capable of reading brain waves, facial expressions, and head motion was used: an Emotiv® EPOC headset. We read emotions such as meditation, excitement, and engagement into the game, altering the lighting, music, gates, colors, and other elements that appeal to the user's emotional state. With this, we close the loop of using emotions as inputs, adjusting a system accordingly, and eliciting emotions.
Collision Avoidance Interface for Safe Piloting of Unmanned Vehicles using a Mobile Device - Poster
Abstract » Autonomous robots and vehicles can perform tasks that are unsafe or undesirable for humans to do themselves, such as investigate safety in nuclear reactors or assess structural damage to a building or bridge after an earthquake. In addition, improvements in autonomous modes of such vehicles are making it easier for minimally-trained individuals to operate the vehicles. As the autonomous capabilities advance, the user’s role shifts from a direct teleoperator to a supervisory control role. Since the human operator is often better suited to make decisions in uncertain situations, it is important for the human operator to have awareness of the environment in which the vehicle is operating in order to prevent collisions and damage to the vehicle as well as the structures and people in the vicinity. In this paper, we present the Collision and Obstacle Detection and Alerting (CODA) display, a novel interface to enable safe piloting of a Micro Aerial Vehicle with a mobile device in real-world settings.
Directed Social Queries With Transparent User Models - Poster
Abstract » The friend list of many social network users can be very large. This creates challenges when users seek to direct their social interactions to friends that share a particular interest. We present a self-organizing online tool that, by incorporating ideas from user modeling and data visualization, allows a person to quickly identify which friends best match a social query, enabling precise and efficient directed social interactions. To cover the different modalities in which our tool might be used, we introduce two different interactive visualizations. One view enables a human-in-the-loop approach for result analysis and verification; in a second view, location, social affiliations and "personality" data are incorporated, allowing the user to quickly consider different social and spatial factors when directing social queries. We report on a qualitative analysis, which indicates that transparency leads to increased effectiveness of the system. This work contributes a novel method for exploring online friends.
For novices playing music together, adding structural constraints leads to better music and may improve user experience - Poster
Abstract » We investigate the effects of adding structure to musical interactions for novices. A simple instrument allows control of three musical parameters: pitch, timbre, and note density. Two users can play at once, and their actions are visible on a public display. We asked pairs of users to perform duets under two interaction conditions: unstructured, where users are free to play what they like, and structured, where users are directed to different areas of the musical parameter space by time-varying constraints indicated on the display. A control group played two duets without structure, while an experimental group played one duet with structure and a second without. By crowd-sourcing the ranking of recorded duets we find that structure leads to musically better results. A post-experiment survey showed that the experimental group had a better experience during the second unstructured duet than during the structured duet.
Tuesday, October 09, 11:35 - 12:40
Interactions I
Chair: Shahram Izadi, Microsoft Research, UK
Abstract » We explore creating “cliplets”, a form of visual media that juxtaposes still image and video segments, both spatially and temporally, to expressively abstract a moment. Much as in “cinemagraphs”, the tension between static and dynamic elements in a cliplet reinforces both aspects, strongly focusing the viewer’s attention. Creating this type of imagery is challenging without professional tools and training. We develop a set of idioms, essentially spatiotemporal mappings, that characterize cliplet elements, and use these idioms in an interactive system to quickly compose a cliplet from ordinary
handheld video. One difficulty is to avoid artifacts in the cliplet composition without resorting to extensive manual input. We address this with automatic alignment, looping optimization and feathering, simultaneous matting and compositing, and Laplacian blending. A key user-interface challenge is to
provide affordances to define the parameters of the mappings from input time to output time while maintaining a focus on the cliplet being created. We demonstrate the creation of a variety of cliplet types. We also report on informal feedback as well as a more structured survey of users.
Abstract » Focus+context lens-based techniques smoothly integrate two levels of detail using spatial distortion to connect the magnified region and the context. Distortion guarantees visual continuity, but causes problems of interpretation and focus targeting, partly due to the fact that most techniques are based on statically-defined, regular lens shapes, that result in far-from-optimal magnification and distortion. JellyLenses dynamically adapt to the shape of the objects of interest, providing detail-in-context visualizations of higher relevance by optimizing what regions fall into the focus, context and spatially-distorted transition regions. This both improves the visibility of content in the focus region and preserves a larger part of the context region. We describe the approach and its implementation, and report on a controlled experiment that evaluates the usability of JellyLenses compared to regular fisheye lenses, showing clear performance improvements with the new technique for a multi-scale visual search task.
Abstract » We present PiVOT, a tabletop system aimed at supporting mixed-focus collaborative tasks. Through two view-zones, PiVOT provides personalized views to individual users while presenting an unaffected and unobstructed shared view to all users. The system supports multiple personalized views which can be present at the same spatial location and yet be only visible to the users it belongs to. The system also allows the creation of personal views that can be either 2D or (auto-stereoscopic) 3D images. We first discuss the motivation and the different implementation principles required for realizing such a system, before exploring different designs able to address the seemingly opposing challenges of shared and personalized views. We then implement and evaluate a sample prototype to validate our design ideas and present a set of sample applications to demonstrate the utility of the system.
Abstract » We present Histomages, a new interaction model for image editing that considers color histograms as spatial rearrangements of image pixels. Users can select pixels on image histograms as they would select image regions and directly manipulate them to adjust their colors. Histomages are also affected by other image tools such as paintbrushes. We explore some possibilities offered by this interaction model, and discuss the four key principles behind it as well as their implications for the design of feature-rich software in general.
Tuesday, October 09, 12:40 - 14:20
Lunch (The Women of UIST Luncheon, sponsored by Microsoft Research, will be at EVOO Restaurant. RSVP at the UIST registration desk.)
Tuesday, October 09, 14:20 - 16:00
Pen
Chair: Daniel Vogel, University of Waterloo, Canada
Abstract » We study uncertainty in graphical-based interaction (with special attention to sketching). We argue that a comprehensive model for the problem must include the interaction participants (and their current beliefs), their possible actions and their past sketches. It is yet unclear how to frame and solve the former problem while considering all of the latter elements. We suggest framing the problem as a game and solving it with a game-theoretic solution, which leads to a framework for the design of new two-way, sketch-based user interfaces. In particular, we use the framework to design a game that can progressively learn visual models of objects from user sketches, and use the models in real-world interactions. Instead of an abstract visual criterion, players in this game learn models to optimize interaction (the game's duration). This two-way sketching game addresses problems essential in emerging interfaces (such as learning and how to deal with interpretation errors). We review possible applications in robotic sketch-to-command, hand gesture recognition, media authoring and visual search, and evaluate two of them. Evaluations demonstrate how players improve performance with repeated play, and the influence of interaction aspects on learning.
Abstract » Digital pen technology has allowed for the easy transfer of pen data from paper to the computer. However, linking handwritten content with the digital world remains a hard problem as it requires the translation of unstructured and highly personal vocabularies into structured ones that computers can easily understand and process. Automatic recognition can help in this direction, but as it is not always reliable, solutions require active cooperation between users and recognition algorithms. This work examines the use of portable touch-screen devices in connection with pen and paper to help users direct and refine the interpretation of their strokes on paper. We explore four techniques of bi-manual interaction that combine touch and pen-writing, where user attention is divided between the original strokes on paper and their interpretation by the electronic device. We demonstrate the techniques through a mobile interface for writing music that complements the automatic recognition with interactive user-driven interpretation. An experiment evaluates the four techniques and provides insights about their strengths and limitations.
High-Performance Pen + Touch Modality Interactions: A Real-Time Strategy Game eSports Context - Paper - ACM
Abstract » We used the situated context of real-time strategy (RTS) games to address the design and evaluation of new pen + touch interaction techniques. RTS play is a popular genre of Electronic Sports (eSports), games played and spectated at an extremely high level. Interaction techniques are critical for eSports players, because they so directly impact performance.
Through this process, new techniques and implications for pen + touch and bi-manual interaction emerged. We enhance non-dominant hand (NDH) interaction with edge-constrained affordances, anchored to physical features of interactive surfaces, effectively increasing target width. We develop bi-manual overloading, an approach to reduce the total number of occurrences of NDH retargeting. The novel isosceles lasso select technique facilitates selection of complex object subsets. Pen-in-hand interaction, dominant hand touch interaction performed with the pen stowed in the palm, also emerged as an efficient and expressive interaction paradigm.
Abstract » This work presents GaussSense, a back-of-device sensing technique for enabling stylus input on an arbitrary surface by exploiting magnetism. A 2mm-thick Hall sensor grid is developed to sense magnets that are embedded in the stylus. Our system can sense the magnetic field that is emitted from the stylus when it is within 2cm of any non-ferromagnetic surface. Attaching the sensor behind an arbitrary thin surface enables the stylus input to be recognized by analyzing the distribution of the applied magnetic field. Attaching the sensor grid to the back of a touchscreen device and incorporating magnets into the corresponding stylus enable the system 1) to distinguish touch events that are caused by a finger from those caused by the stylus, 2) to sense the tilt angle of the stylus and the pressure with which it is applied, and 3) to detect where the stylus hovers over the screen. A pilot study reveals that people were satisfied with the novel sketching experiences based on this system.
Abstract » The availability of flexible capacitive sensors that can be fitted around mice, smartphones, and pens carries great potential in leveraging grasp as a new interaction modality. Unfortunately, most capacitive sensors only track interaction directly on the surface, making it harder to differentiate among grips and constraining user movements. We present a new optical range sensor design based on high power infrared LEDs and photo-transistors, which can be fabricated on a flexible PCB and wrapped around a wide variety of graspable objects including pens, mice, smartphones, and slates. Our sensor offers a native resolution of 10 dpi with a sensing range of up to 30mm (1.2”) and sampling speed of 50Hz. Based on our prototype wrapped around the barrel of a pen, we present a summary of the characteristics of the sensor and describe the sensor output in several typical pen grips. Our design is versatile enough to apply not only to pens but to a wide variety of graspable objects including smartphones and slates.
PhantomPen: Virtualization of Pen Head for Digital Drawing Free from Pen Occlusion & Visual Parallax - Paper - ACM
Abstract » We present PhantomPen, a direct pen input device whose pen head is virtualized onto the tablet display surface and visually connected to a graspable pen barrel in order to achieve digital drawing free from pen occlusion and visual parallax. As the pen barrel approaches the display, the virtual pen head smoothly appears as if the rendered virtual pen head and the physical pen barrel are in unity. The virtual pen head provides visual feedback by changing its virtual form according to pen type, color, and thickness while the physical pen tip, hidden in the user's sight, provides tactile feedback. Three experiments were carefully designed based on an analysis of drawings by design professionals and observations of design drawing classes. With these experiments that simulate natural drawing we proved significant performance advantages of PhantomPen. PhantomPen was at least as usable as the normal stylus in basic line drawing, and was 17 % faster in focus region drawing (26 % faster in extreme focus region drawing). PhantomPen also reduced error rate by 40 % in a typical drawing setup where users have to manage a complex combination of pen and stroke properties.
Tuesday, October 09, 16:00 - 16:40
Break / Poster session
Tuesday, October 09, 16:40 - 18:05
Interactions II
Chair: Dan Olsen, Brigham Young University, USA
Abstract » Scrolling is controlled through many forms of input devices, such as mouse wheels, trackpad gestures, arrow keys, and joysticks. Performance with these devices can be adjusted by introducing variable transfer functions to alter the range of expressible speed, precision, and sensitivity. However, existing transfer functions are typically "black boxes" bundled into proprietary operating systems and drivers. This presents three problems for researchers: (1) a lack of knowledge about the current state of the field; (2) a difficulty in replicating research that uses scrolling devices; and (3) a potential experimental confound when evaluating scrolling devices and techniques. These three problems are caused by gaps in researchers' knowledge about what device and movement factors are important for scrolling transfer functions, and about how existing devices and drivers use these factors. We fill these knowledge gaps with a framework of transfer function factors for scrolling, and a method for analysing proprietary transfer functions---demonstrating how state of the art commercial devices accommodate some of the human control phenomena observed in prior studies.
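As a hedged illustration of what such a transfer function can look like (the paper analyses real, proprietary functions; the one below is invented), wheel-event rate can drive a gain that scales the nominal scroll displacement:

    # Illustrative velocity-dependent scrolling transfer function (parameters invented).
    def scroll_gain(events_per_second, base_gain=3.0, accel=0.08, max_gain=12.0):
        # Faster wheel motion yields a larger gain, clamped to a maximum.
        return min(max_gain, base_gain + accel * events_per_second)

    def displacement(lines_per_event, events_per_second):
        """Lines scrolled for one wheel event at a given event rate."""
        return lines_per_event * scroll_gain(events_per_second)

    for rate in (5, 20, 80, 200):   # slow flick .. fast spin
        print(rate, round(displacement(1, rate), 1))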
Abstract » We argue that the current practice of using integer positions for pointing events artificially constrains human precision capabilities. The high sensitivity of current input devices can be harnessed to enable precise direct manipulation “in between” pixels, called subpixel interaction. We provide detailed analysis of subpixel theory and implementation, including the critical component of revised control-display gain transfer functions. A prototype implementation is described with several illustrative examples. Guidelines for subpixel domain applicability are provided and an overview of required changes to operating systems and graphical user interface frameworks are discussed.
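A minimal sketch of the subpixel idea (not the paper's implementation; constants are assumptions): high-resolution device counts are scaled by a control-display gain and accumulated as a floating-point position, so manipulation can address locations between pixels while the display still rounds for drawing.

    # Sketch of subpixel pointer state with a toy control-display gain function.
    class SubpixelPointer:
        def __init__(self, counts_per_inch=1600.0, pixels_per_inch=96.0):
            self.x = 0.0   # float position in pixels (subpixel precision)
            self.counts_per_pixel = counts_per_inch / pixels_per_inch

        def cd_gain(self, delta_counts):
            # Low device speed -> gain below 1 for precision; the paper's revised
            # transfer functions are more sophisticated than this.
            return 0.25 if abs(delta_counts) < 20 else 1.0

        def move(self, delta_counts):
            delta_px = delta_counts / self.counts_per_pixel
            self.x += self.cd_gain(delta_counts) * delta_px
            return self.x, round(self.x)   # subpixel value vs. displayed pixel

    p = SubpixelPointer()
    print(p.move(5))    # tiny motion lands between pixels
    print(p.move(400))  # large motion behaves like ordinary pointing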
Abstract » Audio producers often use musical underlays to emphasize key moments in spoken content and give listeners time to reflect on what was said. Yet, creating such underlays is time-consuming as producers must carefully (1) mark an emphasis point in the speech, (2) select music with the appropriate style, (3) align the music with the emphasis point, and (4) adjust dynamics to produce a harmonious composition. We present UnderScore, a set of semi-automated tools designed to facilitate the creation of such underlays. The producer simply marks an emphasis point in the speech and selects a music track. UnderScore automatically refines, aligns and adjusts the speech and music to generate a high-quality underlay. UnderScore allows producers to focus on the high-level design of the underlay; they can quickly try out a variety of music and test different points of emphasis in the story. Amateur producers, who may lack the time or skills necessary to author underlays, can quickly add music to their stories. An informal evaluation of UnderScore suggests that it can produce high-quality underlays for a variety of examples while significantly reducing the time and effort required of radio producers.
Abstract » In our information-driven web-based society, we are all gradually falling “victims” to information overload [5]. However, while sighted people are finding ways to sift through information faster, Internet users who are blind are experiencing an even greater information overload. These people access computers and the Internet using screen-reader software, which reads the information on a computer screen sequentially using computer-generated speech. While sighted people can learn how to quickly glance over the headlines and news articles online to get the gist of information, people who are blind have to use keyboard shortcuts to listen through the content narrated by a serial audio interface. This interface does not give them an opportunity to know what content to skip and what to listen to. So, they either listen to all of the content or listen to the first part of each sentence or paragraph before they skip to the next one. In this paper, we propose an automated approach to facilitate non-visual skimming of web pages. We describe the underlying algorithm, outline a non-visual skimming interface, and report on the results of automated experiments, as well as on our user study with 23 screen-reader users. The results of the experiments suggest that we have been moderately successful in designing a viable algorithm for automatic summarization that could be used for non-visual skimming. In our user studies, we confirmed that people who are blind could read and search through online articles faster and were able to understand and remember most of what they read with our skimming system. Finally, all 23 participants expressed genuine interest in using non-visual skimming in the future.
Tuesday, October 09, 18:05 - 18:35
Town Hall
Tuesday, October 09, 18:35 - 19:30
Break
Tuesday, October 09, 19:30 - 22:00
Banquet
Wednesday, October 10, 09:00 - 10:40
Augmented Reality
Chair: Daniel Avrahami, Intel, USA
PICOntrol: Using a Handheld Projector for Direct Control of Physical Devices through Visible Light - Paper - ACM
Abstract » Today’s environments are populated with a growing number of electric devices which come in diverse form factors and provide a plethora of functions. However, rich interaction with these devices can become challenging if they need to be controlled from a distance, or are too small to accommodate user interfaces on their own. In this work, we explore PICOntrol, a new approach using an off-the-shelf handheld pico projector for direct control of physical devices through visible light. The projected image serves a dual purpose by simultaneously presenting a visible interface to the user, and transmitting embedded control information to inexpensive sensor units integrated with the devices. To use PICOntrol, the user points the handheld projector at a target device, overlays a projected user interface on its sensor unit, and performs various GUI-style or gestural interactions. PICOntrol enables direct, visible, and rich interactions with various physical devices without requiring central infrastructure. We present our prototype implementation as well as explorations of its interaction space through various application examples.
Abstract » We demonstrate a realtime system which infers and tracks the assembly process of a snap-together block model using a Kinect sensor. The inference enables us to build a virtual replica of the model at every step. Tracking enables us to provide context specific visual feedback on a screen by augmenting the rendered virtual model aligned with the physical model. The system allows users to author a new model and uses the inferred assembly process to guide its recreation by others. We propose a novel way of assembly guidance where the next block to be added is rendered in blinking mode with the tracked virtual model on screen. The system is also able to detect any mistakes made and helps correct them by providing appropriate feedback. We focus on assemblies of Duplo blocks.
We discuss the shortcomings of existing methods of guidance - static figures or recorded videos - and demonstrate how our method avoids those shortcomings. We also report on a user study to compare our system with standard figure-based guidance methods found in user manuals. The results of the user study suggest that our method is able to aid users' structural perception of the model better, leads to fewer assembly errors, and reduces model construction time.
Abstract » In this paper, we present a novel smartphone application designed to easily capture, visualize and reconstruct homes, offices and other indoor scenes. Our application leverages data from smartphone sensors such as the camera, accelerometer, gyroscope and magnetometer to help model the indoor scene. The output of the system is two-fold; first, an interactive visual tour of the scene is generated in real time that allows the user to explore each room and transition between connected rooms. Second, with some basic interactive photogrammetric modeling the system generates a 2D floor plan and accompanying 3D model of the scene, under a Manhattan-world assumption. The approach does not require any specialized equipment or training and is able to produce accurate floor plans.
Abstract » Steerable displays use a motorized platform to orient a projector to display graphics at any point in the room. Often a camera is included to recognize markers and other objects, as well as user gestures in the display volume. Such systems can be used to superimpose graphics onto the real world, and so are useful in a number of augmented reality and ubiquitous computing scenarios. We contribute the Beamatron, which advances steerable displays by drawing on recent progress in depth camera-based interactions. The Beamatron consists of a computer-controlled pan and tilt platform on which is mounted a projector and Microsoft Kinect sensor. While much previous work with steerable displays deals primarily with projecting corrected graphics onto a discrete set of static planes, we describe computational techniques that enable reasoning in 3D using live depth data. We show two example applications that are enabled by the unique capabilities of the Beamatron: an augmented reality game in which a player can drive a virtual toy car around a room, and a ubiquitous computing demo that uses speech and gesture to move projected graphics throughout the room.
Abstract » We present a system for producing 3D animations using physical objects (i.e., puppets) as input. Puppeteers can load 3D models of familiar rigid objects, including toys, into our system and use them as puppets for an animation. During a performance, the puppeteer physically manipulates these puppets in front of a Kinect depth sensor. Our system uses a combination of image-feature matching and 3D shape matching to identify and track the physical puppets. It then renders the corresponding 3D models into a virtual set. Our system operates in real time so that the puppeteer can immediately see the resulting animation and make adjustments on the fly. It also provides 6D virtual camera and lighting controls, which the puppeteer can adjust before, during, or after a performance. Finally,
our system supports layered animations to help puppeteers produce animations in which several characters move at the same time. We demonstrate the accessibility of our system with a variety of
animations created by puppeteers with no prior animation experience.
Abstract » KinÊtre allows novice users to scan arbitrary physical objects and bring them to life in seconds. The fully interactive system allows diverse static meshes to be animated using the entire human body. Traditionally, the process of mesh animation is laborious and requires domain expertise, with rigging specified manually by an artist when designing the character. KinÊtre makes creating animations a more playful activity, conducted by novice users interactively *at runtime*. This paper describes the KinÊtre system in full, highlighting key technical contributions and demonstrating many examples of users animating meshes of varying shapes and sizes. These include non-humanoid meshes and incomplete surfaces produced by 3D scanning -- two challenging scenarios for existing mesh animation systems. Rather than targeting professional CG animators, KinÊtre is intended to bring mesh animation to a new audience of novice users. We demonstrate potential uses of our system for interactive storytelling and new forms of physical gaming.
Wednesday, October 10, 10:40 - 11:10
Break
Wednesday, October 10, 11:10 - 12:50
Multi-touch
Chair: Patrick Baudisch, Hasso Plattner Institute, University of Potsdam, Germany
Abstract » Visual search in large real-world scenes is both time consuming and frustrating, because the search becomes serial when items are visually similar. Tactile guidance techniques can facilitate search by allowing visual attention to focus on a subregion of the scene. We present a technique for dynamic tactile cueing that couples hand position with a scene position and uses tactile feedback to guide the hand actively toward the target. We demonstrate substantial improvements in task performance over a baseline of visual search only, when the scene's complexity increases. Analyzing task performance, we demonstrate that the effect of visual complexity can be practically eliminated through improved spatial precision of the guidance.
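The exact coupling is described in the paper; as a rough sketch only, the offset between the hand and the cued scene position could be mapped onto four directional tactors whose intensity grows with distance (the layout and scaling are invented):

    # Hypothetical mapping from hand-to-target offset to four directional tactors.
    def tactile_cue(hand_xy, target_xy, max_dist=0.3):
        dx = target_xy[0] - hand_xy[0]
        dy = target_xy[1] - hand_xy[1]
        def level(component):
            # Intensity in [0, 1], saturating at max_dist metres of offset.
            return min(1.0, abs(component) / max_dist)
        return {
            "right": level(dx) if dx > 0 else 0.0,
            "left":  level(dx) if dx < 0 else 0.0,
            "up":    level(dy) if dy > 0 else 0.0,
            "down":  level(dy) if dy < 0 else 0.0,
        }

    print(tactile_cue(hand_xy=(0.10, 0.40), target_xy=(0.32, 0.35)))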
Abstract » Software designed for direct-touch interfaces often utilizes a metaphor of direct physical manipulation of pseudo “real-world” objects. However, current touch systems typically take 50-200ms to update the display in response to a physical touch action. Utilizing a high performance touch demonstrator, subjects were able to experience touch latencies ranging from current levels down to about 1ms. Our tests show that users greatly prefer lower latencies, and noticeable improvement continued well below 10ms. This level of performance is difficult to achieve in commercial computing systems using current technologies. As an alternative, we propose a hybrid system that provides low-fidelity visual feedback immediately, followed by high-fidelity visuals at standard levels of latency.
A User-Specific Machine Learning Approach for Improving Touch Accuracy on Mobile Devices - Paper - ACM
Abstract » We present a flexible Machine Learning approach for learning user-specific touch input models to increase touch accuracy on mobile devices. The model is based on flexible, non-parametric Gaussian Process regression and is learned using recorded touch inputs. We demonstrate that significant touch accuracy improvements can be obtained when either raw sensor data is used as an input or when the device's reported touch location is used as an input, with the latter marginally outperforming the former. We show that learned offset functions are highly nonlinear and user-specific and that user-specific models outperform models trained on data pooled from several users. Crucially, significant performance improvements can be obtained with a small (~200) number of training examples, easily obtained for a particular user through a calibration game or from keyboard entry data.
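A minimal sketch of the general approach with scikit-learn (not the authors' code; the synthetic offsets and kernel choices are assumptions): regress the 2D offset between reported and intended touch locations, then correct new touches with the predicted offset.

    # Sketch: user-specific touch-offset model via Gaussian Process regression.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(0)
    reported = rng.uniform(0, 1, size=(200, 2))                 # reported touches (x, y)
    offsets = 0.02 * np.sin(6 * reported) + 0.005 * rng.standard_normal((200, 2))
    intended = reported + offsets                               # calibration targets

    kernel = RBF(length_scale=0.2) + WhiteKernel(noise_level=1e-4)
    gp = GaussianProcessRegressor(kernel=kernel).fit(reported, intended - reported)

    def corrected_touch(xy):
        offset = gp.predict(np.asarray(xy).reshape(1, -1))[0]   # predicted 2D offset
        return np.asarray(xy) + offset

    print(corrected_touch([0.4, 0.7]))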
Abstract » Proton++ is a declarative multitouch framework that allows developers to describe multitouch gestures as regular expressions of touch event symbols. It builds on the Proton framework by allowing developers to incorporate custom touch attributes directly into the gesture description. These custom attributes increase the expressivity of the gestures, while preserving the benefits of Proton: automatic gesture matching, static analysis of conflict detection, and graphical gesture creation. We demonstrate Proton++’s flexibility with several examples: a direction attribute for describing trajectory, a pinch attribute for detecting when touches move towards one another, a touch area attribute for simulating pressure, an orientation attribute for selecting menu items, and a screen location attribute for simulating hand ID. We also use screen location to simulate user ID and enable simultaneous recognition of gestures by multiple users. In addition, we show how to incorporate timing into Proton++ gestures by reporting touch events at a regular time interval. Finally, we present a user study that suggests that users are roughly four times faster at interpreting gestures written using Proton++ than those written in procedural event-handling code commonly used today.
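Proton++'s gesture language and matcher are its own contribution; as a rough analogy only, a touch-event stream can be encoded as attributed symbols and matched with an ordinary regular expression, e.g. a one-finger westward swipe (the symbol names are invented):

    # Rough analogy to declarative gesture matching, not Proton++ itself.
    import re

    # D = touch-down, M = move, U = touch-up; ':W' marks a westward move direction.
    def encode(events):
        return "".join(f"{kind}{(':' + dir_) if dir_ else ''};"
                       for kind, dir_ in events)

    SWIPE_WEST = re.compile(r"D;(M:W;)+U;")   # down, one or more westward moves, up

    stream = [("D", None), ("M", "W"), ("M", "W"), ("M", "W"), ("U", None)]
    print(bool(SWIPE_WEST.fullmatch(encode(stream))))   # True: westward swipe recognised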
Extended Multitouch: Recovering Touch Posture and Differentiating Users using a Depth Camera - Paper - ACM
Abstract » Multitouch surfaces are becoming prevalent, but most existing technologies are only capable of detecting the user’s actual points of contact on the surface and not the identity, posture, and handedness of the user. In this paper, we define the concept of extended multitouch interaction as a richer input modality that includes all of this information. We further present a practical solution to achieve this on tabletop displays based on mounting a single commodity depth camera above a horizontal surface. This will enable us to not only detect when the surface is being touched, but also recover the user’s exact finger and hand posture, as well as distinguish between different users and their handedness. We validate our approach using two user studies, and deploy the technique in a scratchpad tool and in a pen + touch sketch tool.
Abstract » Multi-touch technology lends itself to collaborative crowd interaction (CI). However, common tap-operated widgets are impractical for CI, since they are susceptible to accidental touches and interference from other users. We present a novel multi-touch interface called FlowBlocks in which every UI action is invoked through a small sequence of user actions: dragging parametric UI-Blocks, and dropping them over operational UI-Docks. The FlowBlocks approach is advantageous for CI because it a) makes accidental touches inconsequential; and b) introduces design parameters for mutual awareness, concurrent input, and conflict management. FlowBlocks was successfully used on the floor of a busy natural history museum. We present the complete design space and describe a year-long iterative design and evaluation process which employed the Rapid Iterative Test and Evaluation (RITE) method in a museum setting.
Wednesday, October 10, 12:50 - 14:20
Lunch
Wednesday, October 10, 14:20 - 16:00
Tactile & Grip
Chair: Andy Wilson, Microsoft Research, USA
Abstract » Ferroelectric material supports both pyro- and piezoelectric effects that can be used for sensing pressure on large, bent surfaces. We present PyzoFlex, a pressure-sensing input device that is based on a ferroelectric material. It is constructed with a sandwich structure of four layers that can be printed easily on any material. We use this material in combination with a high-resolution Anoto-sensing foil to support both hand and pen input tracking. The foil is bendable, energy-efficient, and can be produced in a printing process. Even a hovering mode is feasible due to the pyroelectric effect. In this paper, we introduce this novel input technology and discuss its benefits and limitations.
Jamming User Interfaces: Programmable Particle Stiffness and Sensing for Malleable and Shape-Changing Devices - Paper - ACM
Abstract » Malleable and organic user interfaces have the potential to enable radically new forms of interaction and expressiveness through flexible, free-form and computationally controlled shapes and displays. This work specifically focuses on particle jamming as a simple, effective method for flexible, shape-changing user interfaces, where programmatic control of material stiffness enables haptic feedback, deformation, tunable affordances and control gain. We introduce a compact, low-power pneumatic jamming system suitable for mobile devices, and a new hydraulic-based technique with fast, silent actuation and optical shape sensing. We enable jamming structures to sense input and function as interaction devices through two contributed methods for high-resolution shape sensing using: 1) index-matched particles and fluids, and 2) capacitive and electric field sensing. We explore the design space of malleable and organic user interfaces enabled by jamming through four motivational prototypes that highlight jamming's potential in HCI, including applications for tabletops, tablets and portable shape-changing mobile devices.
Abstract » We have developed a simple skin-like user interface that can be easily attached to curved as well as flat surfaces and used to measure the tangential force generated by pinching and dragging interactions. The interface consists of several photoreflectors, each comprising an IR LED and a phototransistor, and an elastic fabric such as a stocking or a rubber membrane. The sensing method is based on our observation that photoreflectors can be used to measure the ratio of expansion and contraction of a stocking using the changes in transmissivity of IR light passing through the stocking. Since a stocking is thin, stretchable, and nearly transparent, it can be easily attached to various types of objects such as mobile devices, robots, and different parts of the body, as well as to various types of conventional pressure sensors, without altering the original shape of the object. It can also present natural haptic feedback in accordance with the amount of force exerted. A system using several such sensors can determine the direction of a two-dimensional force. A variety of example applications illustrate the utility of this sensing system.
Capacitive Fingerprinting: Exploring User Differentiation by Sensing Electrical Properties of the Human Body - Paper - ACM
Abstract » At present, touchscreens can differentiate multiple points of contact, but not who is touching the device. In this work, we consider how the electrical properties of humans and their attire can be used to support user differentiation on touchscreens. We propose a novel sensing approach based on Swept Frequency Capacitive Sensing, which measures the impedance of a user to the environment (i.e., ground) across a range of AC frequencies. Different people have different bone densities and muscle mass, wear different footwear, and so on. This, in turn, yields different impedance profiles, which allows for touch events, including multitouch gestures, to be attributed to a particular user. This has many interesting implications for interactive design. We describe and evaluate our sensing approach, demonstrating that the technique has considerable promise. We also discuss limitations, how these might be overcome, and next steps.
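Setting the sensing hardware aside, the user-differentiation step can be pictured as an ordinary classifier over per-user impedance profiles; the sketch below trains an SVM on synthetic sweep vectors purely for illustration (the paper's actual features and classifier may differ).

    # Illustrative user classification from swept-frequency impedance profiles.
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(1)
    n_freqs = 50                                # impedance sampled at 50 AC frequencies

    def fake_profile(user_id):
        # Synthetic stand-in for a measured impedance sweep.
        base = np.linspace(1.0, 2.0, n_freqs) * (1.0 + 0.1 * user_id)
        return base + 0.02 * rng.standard_normal(n_freqs)

    X = np.array([fake_profile(u) for u in (0, 1) for _ in range(30)])
    y = np.array([u for u in (0, 1) for _ in range(30)])

    clf = SVC(kernel="rbf").fit(X, y)
    touch_profile = fake_profile(1)             # sweep measured at touch time
    print("touch attributed to user", clf.predict(touch_profile.reshape(1, -1))[0])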
GripSense: Using Built-In Sensors to Detect Hand Posture and Pressure on Commodity Mobile Phones - Paper - ACM
Abstract » We introduce GripSense, a system that leverages mobile device touchscreens and their built-in inertial sensors and vibration motor to infer hand postures including one- or two-handed interaction, use of thumb or index finger, or use on a table. GripSense also senses the amount of pressure a user exerts on the touchscreen despite a lack of direct pressure sensors by inferring from gyroscope readings when the vibration motor is “pulsed.” In a controlled study with 10 participants, GripSense accurately differentiated device usage on a table vs. in hand with 99.67% accuracy and when in hand, it inferred hand postures with 84.26% accuracy. In addition, GripSense distinguished three levels of pressure with 95.1% accuracy. A usability analysis of GripSense was conducted in three custom applications and showed that pressure input and hand-posture sensing can be useful in a number of scenarios.
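A heavily simplified sketch in the spirit of the pressure inference described above (the thresholds and energy measure are assumptions, not from the paper): while the vibration motor is pulsed, a firmer grip damps the device's rotation, so lower gyroscope energy during the pulse is read as higher pressure.

    # Hedged sketch: classify press pressure from gyroscope energy during a motor pulse.
    import numpy as np

    def gyro_energy(gyro_samples):
        return float(np.mean(np.square(gyro_samples)))

    def pressure_level(gyro_samples, thresholds=(0.5, 0.2)):
        e = gyro_energy(gyro_samples)           # less motion -> firmer press
        if e > thresholds[0]:
            return "light"
        if e > thresholds[1]:
            return "medium"
        return "hard"

    pulse = 0.3 * np.random.default_rng(2).standard_normal(256)  # damped pulse samples
    print(pressure_level(pulse))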
Abstract » ForcePhone is a mobile synchronous haptic communication system. During phone calls, users can squeeze the side of the device and the pressure level is mapped to vibrations on the recipient’s device. The pressure/vibrotactile messages supported by ForcePhone are called pressages. Using a lab-based study and a small field study, this paper addresses the following questions: how can haptic interpersonal communication be integrated into a standard mobile device? What is the most appropriate feedback design for pressages? What types of non-verbal cues can be represented by pressages? Do users make use of pressages during their conversations? The results of this research indicate that such a system has value as a communication channel in real-world settings with users expressing greetings, presence and emotions through pressages.
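A minimal sketch of the mapping the abstract describes, with an invented message format and quantization: squeeze pressure on the sender's device is quantized to one of three "pressage" levels and rendered as a vibration burst on the recipient's device.

import json

# Hypothetical quantization: pressure (0..1) -> three "pressage" intensities.
def encode_pressage(pressure):
    if pressure < 0.33:
        level = "soft"
    elif pressure < 0.66:
        level = "medium"
    else:
        level = "strong"
    # Vibration burst lengths in milliseconds, chosen arbitrarily here.
    duration_ms = {"soft": 80, "medium": 160, "strong": 320}[level]
    return json.dumps({"type": "pressage", "level": level, "duration_ms": duration_ms})

def play_pressage(message, vibrate):
    """Decode a received pressage and render it with the device's vibration call."""
    msg = json.loads(message)
    vibrate(msg["duration_ms"])

# Example round trip with a stand-in vibration callback.
play_pressage(encode_pressage(0.8), vibrate=lambda ms: print(f"vibrate {ms} ms"))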
Wednesday, October 10, 16:00 - 16:40
Break
Wednesday, October 10, 16:40 - 18:05
Fabrication & Hardware
Chair: Xiang Cao, Microsoft Research Asia
Abstract » We present acoustic barcodes, structured patterns of physical notches that, when swiped with, e.g., a fingernail, produce a complex sound that can be resolved to a binary ID. A single, inexpensive contact microphone attached to a surface or object is used to capture the waveform. We present our method for decoding sounds into IDs, which handles variations in swipe velocity and other factors. Acoustic barcodes could be used for information retrieval or to trigger interactive functions. They are passive, durable and inexpensive to produce. Further, they can be applied to a wide range of materials and objects, including plastic, wood, glass and stone. We conclude with several example applications that highlight the utility of our approach, and a user study that explores its feasibility.
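The decoding step invites a small illustration. The sketch below is a naive stand-in for the authors' method: it detects impulse onsets in the captured waveform, normalizes the gaps between consecutive notches against their median gap to cancel out swipe velocity, and reads wide gaps as 1 and narrow gaps as 0. The thresholds and encoding scheme are invented.

def detect_onsets(samples, threshold=0.5, refractory=5):
    """Return indices where the normalized waveform crosses a threshold,
    ignoring samples within a short refractory window after each onset."""
    onsets, last = [], -refractory
    for i, s in enumerate(samples):
        if abs(s) >= threshold and i - last >= refractory:
            onsets.append(i)
            last = i
    return onsets

def decode_bits(onsets):
    """Gaps between consecutive notch impulses encode bits.  Dividing each
    gap by the median gap cancels the overall swipe velocity; gaps wider
    than the median are read as 1, narrower ones as 0 (an invented scheme)."""
    gaps = [b - a for a, b in zip(onsets, onsets[1:])]
    median = sorted(gaps)[len(gaps) // 2]
    return [1 if g > median else 0 for g in gaps]

# Toy waveform: impulses at indices 0, 10, 30, 40, 60, 70 -> gaps 10, 20, 10, 20, 10.
wave = [0.0] * 75
for i in (0, 10, 30, 40, 60, 70):
    wave[i] = 1.0
print(decode_bits(detect_onsets(wave)))  # -> [0, 1, 0, 1, 0]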
Abstract » This paper introduces the PICL, the portable in-circuit learner. The PICL explores the possibility of providing standalone, low-cost, programming-by-demonstration machine learning capabilities to circuit prototyping. To train the PICL, users attach a sensor to the PICL, demonstrate example input, then specify the desired output (expressed as a voltage) for the given input. The current version of the PICL provides two learning modes, binary classification and linear regression. To streamline training and also make it possible to train on highly transient signals (such as those produced by a camera flash or a hand clap), the PICL includes a number of input inferencing techniques. These techniques make it possible for the PICL to learn with as few as one example. The PICL's behavioural repertoire can be expanded by means of various output adapters, which serve to transform the output in useful ways when prototyping. Collectively, the PICL's capabilities allow users of systems such as the Arduino or littleBits electronics kit to quickly add basic sensor-based behaviour, with little or no programming required.
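As an illustration of one-example binary classification of the kind the abstract mentions (not necessarily the PICL's actual rule), the sketch below records a baseline reading and one demonstrated reading, places a decision threshold midway between them, and emits one of two output voltages for new readings.

class OneExampleClassifier:
    """Binary classifier trained from a single demonstrated example.

    A baseline voltage (sensor at rest) and one demonstrated voltage are
    recorded; new readings are classified by a threshold placed midway
    between them.  This rule is illustrative, not the PICL's."""

    def __init__(self, off_volts=0.0, on_volts=5.0):
        self.off_volts = off_volts  # output when classified as "baseline"
        self.on_volts = on_volts    # output when classified as "example"
        self.threshold = None
        self.example_above = True

    def train(self, baseline_reading, example_reading):
        self.threshold = (baseline_reading + example_reading) / 2.0
        self.example_above = example_reading > baseline_reading

    def output(self, reading):
        above = reading > self.threshold
        return self.on_volts if above == self.example_above else self.off_volts

# Example: a light sensor reads ~0.3 V at rest and ~2.4 V under a camera flash.
clf = OneExampleClassifier()
clf.train(baseline_reading=0.3, example_reading=2.4)
print(clf.output(2.1))  # -> 5.0 (flash detected)
print(clf.output(0.4))  # -> 0.0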
Abstract » An increasing number of consumer products include user interfaces that rely on touch input. While digital fabrication techniques such as 3D printing make it easier to prototype the shape of custom devices, adding interactivity to such prototypes remains a challenge for most designers.
We introduce Midas, a software and hardware toolkit to support the design, fabrication, and programming of flexible capacitive touch sensors for interactive objects. With Midas, designers first define the desired shape, layout, and type of touch sensitive areas, as well as routing obstacles, in a sensor editor interface. From this high-level specification, Midas automatically generates layout files with appropriate sensor pads and routed connections. These files are then used to fabricate sensors using digital fabrication processes, e.g. vinyl cutters and circuit board printers. Using step-by-step assembly instructions generated by Midas, designers connect these sensors to our microcontroller setup, which detects touch events. Once the prototype is thus assembled, designers can define interactivity for their sensors: Midas supports both record-and-replay actions for controlling existing local applications and WebSocket-based event output for controlling novel or remote applications. In a first-use study with three participants, users successfully prototyped media players. We also demonstrate how Midas can be used to create a number of touch-sensitive interfaces.
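The WebSocket-based event output suggests a small consumer-side sketch. The endpoint, port, and message schema below are assumptions rather than Midas's documented protocol, and the example relies on the third-party Python websockets package.

import asyncio
import json

import websockets  # third-party package, assumed available: pip install websockets

# Hypothetical endpoint and message schema; the toolkit's actual output may differ.
EVENT_STREAM_URL = "ws://localhost:8765"

async def handle_touch_events():
    """Connect to the sensor toolkit's event stream and react to touch events,
    e.g. to drive a remote or custom media-player application."""
    async with websockets.connect(EVENT_STREAM_URL) as socket:
        async for message in socket:
            event = json.loads(message)          # assumed shape: {"pad": ..., "state": ...}
            if event.get("state") == "down":
                print(f"touch on pad {event.get('pad')}")

if __name__ == "__main__":
    asyncio.run(handle_touch_events())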
Abstract » We present an approach to 3D printing custom optical elements for interactive devices, called Printed Optics. Printed Optics enable sensing, display, and illumination elements to be directly embedded in the casing or mechanical structure of an interactive device. Using these elements, unique display surfaces, novel illumination techniques, custom optical sensors, and embedded optoelectronic components can be digitally fabricated for rapid, high fidelity, highly customized interactive devices. Printed Optics is part of our long term vision for interactive devices that are 3D printed in their entirety. In this paper we explore the possibilities for this vision afforded by fabrication of custom optical elements using today's 3D printing technology.
Abstract » Personal fabrication tools, such as laser cutters and 3D printers allow users to create precise objects quickly. However, working through a CAD system removes users from the workpiece. Recent interactive fabrication tools reintroduce this directness, but at the expense of precision.
In this paper, we introduce constructable, an interactive drafting table that produces precise physical output in every step. Users interact by drafting directly on the workpiece using a hand-held laser pointer. The system tracks the pointer, beautifies its path, and implements its effect by cutting the workpiece using a fast high-powered laser cutter.
Constructable achieves precision through tool-specific constraints, user-defined sketch lines, and by using the laser cutter itself for all visual feedback, rather than using a screen or projection. We demonstrate how constructable allows creating simple but functional devices, including a simple gearbox, that cannot be created with traditional interactive fabrication tools.
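The path-beautification step invites a small illustration. The sketch below shows one common beautification idea, snapping a shaky tracked stroke to a least-squares straight segment; it is an assumption about what such a step might look like, not constructable's actual algorithm.

def snap_to_line(points):
    """Fit a least-squares line through a shaky tracked stroke and return a
    clean two-point segment spanning the stroke, as a simple stand-in for
    sketch beautification."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:  # vertical stroke: keep x fixed, span the y range
        ys = [y for _, y in points]
        return (mean_x, min(ys)), (mean_x, max(ys))
    slope = sxy / sxx

    def line(x):
        return mean_y + slope * (x - mean_x)

    xs = [x for x, _ in points]
    x0, x1 = min(xs), max(xs)
    return (x0, line(x0)), (x1, line(x1))

# A wobbly, roughly horizontal stroke snaps to a straight segment.
stroke = [(0, 0.1), (1, -0.05), (2, 0.08), (3, -0.02), (4, 0.03)]
print(snap_to_line(stroke))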
Wednesday, October 10, 18:05
Closing