Friday, November 28, 2008

SHADY: A Shape Description Debugger for Use in Sketch Recognition

Authors: Tracy Hammond and Randall Davis

Comments:

Summary:
This paper describes SHADY, a debugger that identifies over-constrained shape descriptions by having the user draw a few samples of the shape. SHADY uses the drawn samples to identify the constraints that are wrong and displays them. The tool tries to find the smallest set of constraints that still matches what the user needs, and displays both that set and the set of wrong constraints.
A constraint solver is built for the linear constraints of the shapes. Missing constraints are identified from the user-drawn sample and left out of the description. Near-miss shapes are shown to the user based on the missing constraints. The user then chooses the shapes that are close to what he needs.
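
To make the idea of "constraints that are wrong for a drawn sample" concrete, here is a minimal sketch of my own, not SHADY's code. The line representation, the three constraint checks and the tolerances are all assumptions; it simply lists which constraints of a description a drawn sample fails.

```python
import math

# Assumed representation: a line is ((x1, y1), (x2, y2)).
def length(line):
    (x1, y1), (x2, y2) = line
    return math.hypot(x2 - x1, y2 - y1)

def angle_deg(line):
    (x1, y1), (x2, y2) = line
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def undirected_diff(a, b):
    # Smallest angle between two undirected lines, in [0, 90].
    diff = abs(angle_deg(a) - angle_deg(b))
    return min(diff, 180.0 - diff)

# Tolerances are guesses; hand-drawn input is never exact.
def equal_length(a, b, tol=0.15):
    return abs(length(a) - length(b)) <= tol * max(length(a), length(b))

def parallel(a, b, tol_deg=10.0):
    return undirected_diff(a, b) <= tol_deg

def perpendicular(a, b, tol_deg=10.0):
    return abs(undirected_diff(a, b) - 90.0) <= tol_deg

CHECKS = {"equalLength": equal_length,
          "parallel": parallel,
          "perpendicular": perpendicular}

def failed_constraints(description, sample):
    """Return the constraints of the description that the sample violates."""
    return [(name, a, b) for name, a, b in description
            if not CHECKS[name](sample[a], sample[b])]

# A square description checked against a hand-drawn rectangle.
square = [("parallel", "top", "bottom"),
          ("parallel", "left", "right"),
          ("perpendicular", "top", "left"),
          ("equalLength", "top", "left")]
rectangle_sample = {"top": ((0, 0), (100, 2)),
                    "bottom": ((0, 50), (100, 49)),
                    "left": ((0, 0), (1, 50)),
                    "right": ((100, 2), (100, 49))}
print(failed_constraints(square, rectangle_sample))
```

Here only the equalLength constraint fails, which is the kind of constraint a debugger would flag when the user actually wanted a rectangle.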

Discussion:
Capturing user intentions is important here. It would be interesting to see if speed and pressure data corresponding to strokes would help identify the constraints. Users tend to slow down where a constraint matters (Sloppy selection).

Multimodal Collaborative Handwriting Training for Visually-Impaired People

Authors: Beryl Plimmer, Andrew Crossan, Stephen A. Brewster, Rachel Blagojevic

Comments:

Summary:
This paper presents MCSig, a system that uses multimodal feedback to help visually-impaired people learn to write. Unlike sighted people, visually-impaired people lack the main form of feedback while writing, the visual feedback. This makes learning harder for them. The paper proposes a tool that combines haptics and sound feedback to help them write characters. A specially designed device guides the user in tracing through the character, and sound feedback tells the user about the quality of the trace.

Discussion:
This paper raises an interesting aspect of giving feedback to people. Using ridges on the paper to provide feedback on shapes is an interesting method.

This paper also shows the way we learn to write.
- First we start learning by tracing over the character. This means moving our whole hand and the pen together to trace the whole character. As the pen traces the character, the hand also traces the character. This is also the way we start writing in cursive.
- As our confidence grows, we fix our palm and trace the characters by moving only the pen and the two fingers holding it.

Using a Geometric-Based Sketch Recognition Approach to Sketch Chinese Radicals

Author: Paul Taele, Tracy Hammond

Comments:

Summary:
This paper describes a Chinese radical recognizer built using LADDER and the Sezgin primitive recognizer. Image recognition and neural network methods have several drawbacks, such as the need for large training data and the inability to store the order of the strokes. LADDER achieves lower recognition accuracy but takes less time to build and use.

Discussion:
The interesting thing in this paper is the significance of geometric features in identifying Chinese radicals.

Friday, November 21, 2008

Fluid Sketches: Continuous Recognition and Morphing of Simple Hand-Drawn Shapes

Author: James Arvo, Kevin Novins

Comment:

Summary:

This paper discusses a method for giving feedback while sketching. It uses instantaneous morphing and recognition feedback to give the user more flexibility while drawing.
The paper provides mathematical models for morphing and recognizing shapes.
As the user sketches, the ink is transformed into the clean shape instantaneously. The clean shape then replaces the original ink as the user continues sketching. The authors claim this reduces the effort in sketching.
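
To get a feel for what morphing ink toward a clean shape means, here is a minimal sketch of my own. It is not the paper's mathematical model: the circle fit (centroid plus mean radius) and the linear blend controlled by `t` are assumptions I picked for simplicity.

```python
import math

def fit_circle(points):
    """Crude circle fit (an assumption): centroid as center, mean distance as radius."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    r = sum(math.hypot(x - cx, y - cy) for x, y in points) / len(points)
    return cx, cy, r

def morph_toward_circle(points, t):
    """Move each ink point a fraction t (0 = raw ink, 1 = clean circle)
    toward the nearest point on the fitted circle."""
    cx, cy, r = fit_circle(points)
    morphed = []
    for x, y in points:
        d = math.hypot(x - cx, y - cy)
        if d == 0:
            # A point exactly at the center has no nearest circle point; keep it.
            morphed.append((x, y))
            continue
        tx = cx + (x - cx) * r / d
        ty = cy + (y - cy) * r / d
        morphed.append((x + t * (tx - x), y + t * (ty - y)))
    return morphed

wobbly = [(10, 0), (0, 12), (-9, 0), (0, -11)]
halfway = morph_toward_circle(wobbly, 0.5)
clean = morph_toward_circle(wobbly, 1.0)
```

Increasing `t` a little on every new pen sample would give the effect of the ink sliding into the clean shape while the user is still drawing.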

Discussion:

The idea of instantaneous feedback on a user-drawn sketch is interesting. But I think this can also confuse users while they are drawing a complex shape.

Tuesday, November 18, 2008

Sketch Recognition User Interfaces: Guidelines for Design and Development

Author: Christine Alvarado

Comment:
1. Daniel's blog

Summary:

This paper discusses Sketch Recognition User Interfaces (SkRUIs) and an application developed using this concept for MS PowerPoint.
Online Edit mode - the user holds the pen down for some time and the system switches to Edit mode automatically. The user can then select sketches. A pen up switches back to sketch mode while the selected items stay highlighted.
Switching between the unrecognized and recognized sketch is done with a checkbox at the top of the window.
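
The hold-to-edit behaviour can be pictured as a small state machine. This is only my sketch of the idea, not the paper's implementation; the class name, the 0.5 second hold time and the 3 pixel movement tolerance are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

HOLD_SECONDS = 0.5    # assumed hold time before Edit mode starts
MOVE_TOLERANCE = 3.0  # assumed jitter allowed while "holding", in pixels

@dataclass
class PenModeTracker:
    mode: str = "sketch"
    highlighted: List[str] = field(default_factory=list)
    _down_at: Optional[float] = None
    _down_pos: Optional[Tuple[float, float]] = None
    _drawing: bool = False

    def pen_down(self, t, pos):
        self._down_at, self._down_pos, self._drawing = t, pos, False

    def pen_move(self, t, pos):
        # Once a stroke has started, or Edit mode is active, nothing changes here.
        if self._down_at is None or self.mode == "edit" or self._drawing:
            return
        dx = pos[0] - self._down_pos[0]
        dy = pos[1] - self._down_pos[1]
        if (dx * dx + dy * dy) ** 0.5 > MOVE_TOLERANCE:
            self._drawing = True      # the pen moved away: this is a normal stroke
        elif t - self._down_at >= HOLD_SECONDS:
            self.mode = "edit"        # the pen stayed put long enough

    def select(self, item):
        if self.mode == "edit":
            self.highlighted.append(item)

    def pen_up(self):
        # Back to sketch mode; the selection stays highlighted.
        self.mode = "sketch"
        self._down_at = None

tracker = PenModeTracker()
tracker.pen_down(0.0, (10, 10))
tracker.pen_move(0.6, (11, 10))   # held almost still for 0.6 s
mode_after_hold = tracker.mode
tracker.select("arrow1")
tracker.pen_up()
```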

Evaluation of the system was done to understand how users perceived the tool and what they wanted from it.
3 scenarios:
- Creating new diagrams/slides with the tool
- Labeling the diagrams with the keyboard, since the system did not support handwriting recognition
- Editing and sketching using the pen

Design guidelines - this paper provides some guidelines for designing SkRUI systems.
- Display recognition results only when the user is done sketching
- Provide obvious indications to distinguish free sketching from recognition
- Restrict recognition to a single domain until automatic domain detection becomes feasible
- Incorporate pen-based editing.
- Sketching and editing should use distinct pen motions
- SkRUIs require large buttons
- The pen must always respond in real time

Discussion:
These design guidelines may not be ideal for all cases. There can be a lot of variability in designing SkRUIs. For instance, sketch recognition can also be done while the stroke is being drawn, so showing results only after the user is done cannot be a rule.

This throws up a lot of open questions:
- When to recognise ?
- When to show the results?
- How to switch modes in an unambiguous way?
- How to show recognition results?

Saturday, November 8, 2008

Magic Paper: Sketch-Understanding Research

Author: Randall Davis

Summary:
Why sketch? It is more intuitive. Difficulties in sketch recognition - order of strokes, noise, segmentation, overtracing; the more degrees of freedom, the more difficult it gets.

Recognizing a sketch - identify the primitives, then identify the shape based on the order of strokes, a set of geometric constraints, or a template. LADDER gives us a way to define geometric constraints for shapes. A variation of HMM (dynamic Bayes net) is used to capture the order of strokes. These help us build a sketch interface (e.g. UML). Sketch recognition is refined by generating near-miss samples.

Discussion:
It's a good summary of LADDER.

Interactive Learning of Structural Shape Descriptions from

Author: Tracy Hammond, Randall Davis

Comments:

Summary:
This paper discusses defining structural shape descriptions in LADDER by generating near-miss examples. It deals with the conceptual errors (under-constrained and over-constrained errors) that occur while describing a shape. The other type of error is the syntactic error (out of scope of this paper).

Under-constrained - missing the equalLength condition for the 4 sides when describing a square.
Over-constrained - adding an equalLength condition for the (top, left) sides of a rectangle. Redundant constraints.

After the shape is described with the GUI, the first step is to find a good match between the typed description and the drawn shape, that is, a set of bindings that associates the variables in the description with the geometric shapes. The system chooses the variable assignment with the fewest failed constraints. The system then displays the subcomponents of the failed constraints so that the developer can remove those constraints, if necessary. The developer should remove enough constraints for the system to recognise the initial hand-drawn shape.
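
The binding step can be sketched as a brute-force search. This is my own illustration, not the paper's algorithm: it tries every assignment of drawn lines to description variables and keeps the one with the fewest failed constraints. The constraint checks, tolerances and names are assumptions.

```python
import math
from itertools import permutations

# Assumed representation: a line is ((x1, y1), (x2, y2)).
def length(line):
    (x1, y1), (x2, y2) = line
    return math.hypot(x2 - x1, y2 - y1)

def undirected_diff(a, b):
    # Smallest angle between two undirected lines, in [0, 90] degrees.
    def ang(line):
        (x1, y1), (x2, y2) = line
        return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
    diff = abs(ang(a) - ang(b))
    return min(diff, 180.0 - diff)

# Tolerances below are guesses for hand-drawn input.
CHECKS = {
    "parallel": lambda a, b: undirected_diff(a, b) <= 10.0,
    "perpendicular": lambda a, b: abs(undirected_diff(a, b) - 90.0) <= 10.0,
    "equalLength": lambda a, b: abs(length(a) - length(b))
                                <= 0.15 * max(length(a), length(b)),
}

def best_binding(description, variables, drawn_lines):
    """Try every assignment of drawn lines to variables and return
    (binding, failed_constraints) for the one with the fewest failures."""
    best = None
    for perm in permutations(drawn_lines, len(variables)):
        binding = dict(zip(variables, perm))
        failed = [(n, a, b) for n, a, b in description
                  if not CHECKS[n](binding[a], binding[b])]
        if best is None or len(failed) < len(best[1]):
            best = (binding, failed)
    return best

# Over-constrained rectangle description, tested on a drawn rectangle
# whose lines arrive in arbitrary order.
description = [("parallel", "top", "bottom"),
               ("parallel", "left", "right"),
               ("perpendicular", "top", "left"),
               ("equalLength", "top", "left")]
drawn = [((100, 0), (100, 50)), ((0, 0), (100, 0)),
         ((0, 0), (0, 50)), ((0, 50), (100, 50))]
binding, failed = best_binding(description, ["top", "bottom", "left", "right"], drawn)
print(failed)
```

The only failed constraint in the best binding is the equalLength one, which is exactly the constraint the developer would be asked to consider removing. Trying all permutations is fine for four lines but would need pruning for bigger shapes.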

The system then reduces the constraint list by testing with negative samples and positive samples.

Over-constrained testing - shapes at different scales and rotations are displayed to the developer, who is asked whether all the samples are positive. The other constraints are tested by negating each constraint and applying it to generate shapes. Similar testing is done to check for under-constrained descriptions.
The paper also describes the process behind generating shapes for different constraints.
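
The negate-and-ask loop can be sketched as follows. This is only my reading of the idea: the paper generates actual shapes from the negated descriptions, while this sketch stops at the descriptions themselves and uses a callback (`developer_accepts`, a made-up name) to stand in for the developer's yes/no answer.

```python
def negate(constraint):
    """Flip a constraint, e.g. equalLength -> not equalLength."""
    name, args = constraint
    if name.startswith("not "):
        return (name[4:], args)
    return ("not " + name, args)

def near_miss_descriptions(description):
    """One variant per constraint: that constraint negated, all others kept."""
    return [description[:i] + [negate(c)] + description[i + 1:]
            for i, c in enumerate(description)]

def prune_over_constraints(description, developer_accepts):
    """Drop every constraint whose negated variant the developer still
    accepts as a positive example, since the constraint was not needed."""
    kept = []
    for constraint, variant in zip(description, near_miss_descriptions(description)):
        if not developer_accepts(variant):
            kept.append(constraint)
    return kept

rectangle = [("parallel", ("top", "bottom")),
             ("parallel", ("left", "right")),
             ("perpendicular", ("top", "left")),
             ("equalLength", ("top", "left"))]

# Stand-in for the developer: a rectangle with unequal top and left sides is still fine.
def developer_accepts(variant):
    return ("not equalLength", ("top", "left")) in variant

print(prune_over_constraints(rectangle, developer_accepts))
```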

Discussion:
It's a nice method for generating near-miss samples, as users cannot always come up with all the samples themselves.

Sunday, November 2, 2008

Grouping Text Lines in Freeform Handwritten Notes

Author: Ming Ye, Herry Sutanto, Sashi Raghupathy, Chengyang Li and Michael Shilman

Comments: Yuxiang's blog

Summary:
This paper describes an effective way of grouping ink strokes into text/shape.

Likelihood of Line:
Three features are used to measure the likelihood of a line.
Linear regression error (eLR) - a measure of the deviation of the stroke points from the fitting line. Reflects the linearity.
Maximum inter-stroke distance of the strokes' projections along X (dxmax) and Y (dymax) - reflects the compactness of the stroke set.
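
Here is a minimal sketch of how these three features could be computed. The exact definitions are my assumptions, not the paper's: eLR is taken as the RMS perpendicular distance to an orthogonal least-squares line, and dxmax / dymax as the largest gap between the strokes' projected extents on each axis.

```python
import math

def linear_regression_error(strokes):
    """RMS perpendicular distance of all points to the best-fit line
    (orthogonal least squares, an assumed definition of eLR)."""
    pts = [p for stroke in strokes for p in stroke]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts) / n
    syy = sum((y - my) ** 2 for _, y in pts) / n
    sxy = sum((x - mx) * (y - my) for x, y in pts) / n
    # The smaller eigenvalue of the covariance matrix is the variance across the line.
    smallest = (sxx + syy) / 2 - math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return math.sqrt(max(smallest, 0.0))

def max_gap(strokes, axis):
    """Largest empty gap between stroke extents projected on one axis
    (axis 0 gives dxmax, axis 1 gives dymax); overlapping strokes give 0."""
    intervals = sorted((min(p[axis] for p in s), max(p[axis] for p in s))
                       for s in strokes)
    gap, reach = 0.0, intervals[0][1]
    for lo, hi in intervals[1:]:
        gap = max(gap, lo - reach)
        reach = max(reach, hi)
    return gap

# Three strokes lying on one horizontal line, with a wide gap before the last.
strokes = [[(0, 0), (10, 0)], [(15, 0), (25, 0)], [(40, 0), (50, 0)]]
e_lr = linear_regression_error(strokes)
dx_max = max_gap(strokes, 0)
dy_max = max_gap(strokes, 1)
```

For this example the points are perfectly collinear, so eLR is zero, while the large horizontal gap before the third stroke shows up in dxmax.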

Configuration Consistency - the paper identifies 3 configurations and the consistency corresponding to each configuration. It also provides a method to measure the consistency using a neighborhood graph, where each vertex corresponds to a line and each edge corresponds to the distance between two lines. If the distance is below a certain threshold and there are no drawings between them, the 2 vertices connected by the edge are considered neighbors. Configuration consistency is computed using the neighbor-length-weighted sum of the orientation angle differences.
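
One way to read "neighbor-length-weighted sum of the orientation angle differences" is sketched below. The normalization by total neighbor length, the dictionary layout and the function names are my assumptions, and the neighbor lists are taken as given rather than derived from the distance threshold.

```python
import math

def angle_diff(a, b):
    """Difference between two undirected line orientations, in [0, pi/2]."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)

def orientation_mismatch(index, lines, neighbors):
    """Length-weighted average angle difference between one line and its
    neighbors. A larger value means a less consistent configuration."""
    nbrs = neighbors.get(index, [])
    total_length = sum(lines[j]["length"] for j in nbrs)
    if total_length == 0:
        return 0.0
    weighted = sum(lines[j]["length"]
                   * angle_diff(lines[index]["angle"], lines[j]["angle"])
                   for j in nbrs)
    return weighted / total_length

lines = [{"angle": 0.0, "length": 100.0},          # the line being scored
         {"angle": 0.0, "length": 80.0},           # long parallel neighbor
         {"angle": math.pi / 2, "length": 20.0}]   # short perpendicular neighbor
neighbors = {0: [1, 2]}
score = orientation_mismatch(0, lines, neighbors)
```

Because the perpendicular neighbor is short, it contributes little, and the score stays small.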

A cost function is defined as the weighted sum of the 3 features discussed for the likelihood of the line and the configuration consistency. Groups are chosen so as to reduce the cost function.
For optimization, the cost function is redefined as the sum of 'eLR' and 'dxmax'.
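
Written out, the two forms of the cost look roughly like this. The weights are placeholders, since my notes do not record the values used in the paper, and the class and function names are made up.

```python
from dataclasses import dataclass

@dataclass
class LineFeatures:
    e_lr: float     # linear regression error
    dx_max: float   # max inter-stroke gap along X
    dy_max: float   # max inter-stroke gap along Y
    config: float   # configuration (in)consistency term

def full_cost(f, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the three likelihood features and the configuration
    term. The default weights are placeholders, not the paper's values."""
    w_lr, w_dx, w_dy, w_cfg = weights
    return w_lr * f.e_lr + w_dx * f.dx_max + w_dy * f.dy_max + w_cfg * f.config

def simplified_cost(f):
    """The reduced form used for optimization: eLR + dxmax."""
    return f.e_lr + f.dx_max

features = LineFeatures(e_lr=2.0, dx_max=5.0, dy_max=1.0, config=0.5)
```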

Undergrouping errors - typically 'i / t' and high configuration energy errors are caused by temporally adjacent strokes being recognised as belonging to different lines. This is solved by finding the neighbors that are approximately parallel.

Iterations: The algorithm starts with the results of the temporal grouping and then uses gradient descent to reduce the cost function and fixes the under/over grouping errors during the iteration. Incremental parsing turns out to be much more efficient than batch mode.
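
The iteration can be pictured as "start from the temporal groups and keep changing them while the cost goes down". The sketch below swaps in plain greedy merging of adjacent groups for the paper's gradient descent, and uses a toy cost (a fixed charge per line plus vertical spread); both are assumptions for illustration only.

```python
def greedy_regroup(groups, cost):
    """Repeatedly merge two temporally adjacent groups whenever the merged
    group costs less than the two separate groups."""
    groups = [list(g) for g in groups]
    improved = True
    while improved:
        improved = False
        for i in range(len(groups) - 1):
            merged = groups[i] + groups[i + 1]
            if cost(merged) < cost(groups[i]) + cost(groups[i + 1]):
                groups[i:i + 2] = [merged]
                improved = True
                break
    return groups

# Toy cost: each stroke is reduced to its y-center; every line pays a fixed
# charge of 1 plus its vertical spread, so strokes at a similar height merge.
def toy_cost(group):
    return 1.0 + (max(group) - min(group))

temporal_groups = [[0.0], [0.5], [10.0], [10.2]]
print(greedy_regroup(temporal_groups, toy_cost))
```

The four single-stroke groups end up as two lines, one near y = 0 and one near y = 10.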

Produces results of 0.93 and 0.87 for perfect word/draw (W/D) and crude W/D, respectively.

Discussion:
The way they solve the starting point problem by assigning groups based on temporal information is interesting. This method will be effective in areas where text strokes are continuous, i.e. the input is a set of paragraphs and drawings, or otherwise contains continuous text strokes. I do not think it would work well for finite state machine diagrams, where text strokes are very sparse and broken.