Author: Michael Oltmans (thesis)
Comments:
1. Andrew's blog
Summary:
This is a thesis on freehand shape recognition. The author discusses a method that recognizes shapes based on the recognition results of the parts that constitute them.
The method uses bullseye features. A bullseye is computed by centering concentric circles on an interest point and cutting them into wedges; the feature is a vector counting the points that fall into each wedge. The circles around the interest point are spaced equally in log space, and the radii are chosen so that the outermost circle covers most of the shape. The inner bins capture fine detail while the outer bins capture coarse data.
The trajectory of the stroke is used to achieve rotational invariance. Shapes drawn in the reverse order cannot be recognized, so the bullseye feature is computed for both the original and the reversed stroke. Errors can also occur in the bullseye histogram when strokes lie on a bin boundary, since points may fall on either side of it. To mitigate this, the bullseye axes are oriented with respect to the interest point and rotated an extra half-bin turn.
The direction at a point is found from the tangent to a window of 19 points. Preprocessing includes resampling the stroke to a constant spatial frequency (> 1 pixel). The author also proposes a way to compute the distance between bullseye histograms. Bins are normalized by their size, so outer bins carry less weight than inner bins. Finally, the histogram is normalized to sum to 1, removing the effect of differences in the total number of points in a region.
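The bullseye histogram described above can be sketched roughly as follows; the parameter values (ring/wedge counts, radius) and the function layout are illustrative, not the thesis's actual settings:

```python
import math

def bullseye_histogram(points, center, n_rings=4, n_wedges=8, r_max=100.0):
    """Histogram over log-spaced rings cut into wedges around an interest point.

    Counts the points falling into each (ring, wedge) bin and normalizes
    the result to sum to 1, so the total number of points does not matter.
    """
    cx, cy = center
    # Ring boundaries spaced equally in log space: r_max/8, r_max/4, r_max/2, r_max.
    ring_edges = [r_max * 2.0 ** (i - n_rings + 1) for i in range(n_rings)]
    hist = [0.0] * (n_rings * n_wedges)
    for x, y in points:
        dx, dy = x - cx, y - cy
        r = math.hypot(dx, dy)
        if r == 0 or r > r_max:
            continue  # skip the interest point itself and far-away points
        ring = next(i for i, edge in enumerate(ring_edges) if r <= edge)
        angle = math.atan2(dy, dx) % (2 * math.pi)
        wedge = min(int(angle / (2 * math.pi) * n_wedges), n_wedges - 1)
        hist[ring * n_wedges + wedge] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

The per-bin size normalization, the down-weighting of outer bins, and the half-bin rotation trick would layer on top of this basic histogram.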
The algorithm first clusters the bullseye features of the training data and adds the cluster centers to a codebook. A match vector is then formed against the codebook, containing just enough data to classify shapes.
The algorithm uses support vector machines to differentiate the classes. A one-vs-one strategy combines the binary classifiers, so for n shapes, n*(n-1)/2 binary classifiers are used.
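The one-vs-one classifier count is easy to verify: each unordered pair of classes gets its own binary SVM, giving n*(n-1)/2 classifiers for n shapes. A minimal sketch:

```python
from itertools import combinations

def one_vs_one_pairs(classes):
    """All unordered class pairs; each pair trains one binary classifier."""
    return list(combinations(classes, 2))
```

For example, four circuit shape classes would need 4*3/2 = 6 binary classifiers.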
For finding shapes in a complete sketch, the isolated shape recognizer is run over candidate regions, assigning class labels to them. Isolated shape recognizers are trained for wire recognition and for variations in shapes, such as changes in alignment, interference from the wire region, etc. To produce a final set of predictions, the algorithm first lists the initial candidates and splits candidate clusters that are too large.
Results show 92.3% accuracy in recognizing circuit shapes and 94% recognition on the HHreco dataset. The algorithm fails on symbols that are similar, e.g. (capacitor, battery, ground) and (BJT, diode).
Discussion:
I have read two completely different approaches to sketch recognition today: the constellation method and the bullseye window proposed by this author. Both are very different from the traditional approaches.
The results are quite good; the algorithm rarely fails when classifying similar shapes, and it worked well in distinguishing BJT and JFET. I am interested to know whether this algorithm could differentiate a resistor (zigzag line) from an inductor (helix).
I think this algorithm will be quite slow, given the number of steps in each stage.
Friday, September 26, 2008
Constellation Models for Sketch Recognition
Authors: D. Sharon and M. van de Panne
Comments:
Summary:
This paper uses a constellation model to recognize shapes. An object is defined with two kinds of features: individual and pairwise. Individual features give the shape and global location of a part; pairwise features give the relative positions of parts. The individual feature vector contains X, Y (the center of the line, relative to the bounding box), d (the normalized length of the bounding-box diagonal), and the cosine of the angle between the diagonal and the X axis. The pairwise feature vector contains the difference in X, the difference in Y, the minimum distance between the endpoints of stroke a and any point on stroke b, and the minimum distance between the endpoints of stroke b and any point on stroke a.
The recognition process has two phases: the first searches the space of possible mandatory label assignments, and the second searches for optional labels for the remaining unlabelled strokes. The mandatory labels provide the contextual location information needed to assign appropriate labels to the potentially large number of optional parts. The mean and variance of the individual and pairwise features are computed separately, and the likelihood of a labeling L is the product of the likelihood of the individual features under L and the likelihood of the pairwise features under L.
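Under a Gaussian assumption, that likelihood product is typically computed as a sum of log-likelihoods to avoid underflow. A minimal sketch (the feature names and the `model` layout are mine, not the paper's):

```python
import math

def gaussian_loglik(x, mean, var):
    """Log-likelihood of one feature value under a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def labeling_loglik(individual, pairwise, model):
    """Score a candidate labeling: the paper's product of individual and
    pairwise feature likelihoods becomes a sum in log space.

    `model` maps a feature name to its learned (mean, variance)."""
    score = 0.0
    for feats in (individual, pairwise):
        for name, value in feats.items():
            mean, var = model[name]
            score += gaussian_loglik(value, mean, var)
    return score
```

A labeling whose features sit near the learned means scores higher than one whose features are far away, which is what the search in the next step maximizes.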
A maximum-likelihood search procedure labels the strokes. The number of possible assignments is exponential, and some strokes might remain unlabelled. To reduce the computation cost, the search is done in two phases: labeling the mandatory strokes, then a linear search through the optional labels for the remaining unlabelled strokes. The search is carried out with a branch-and-bound tree, and multi-pass thresholding speeds up the search when there are many strokes.
Recognition can go wrong in several ways: inability to find suitable mandatory strokes because of the hard constraints, mislabeling of a mandatory stroke, and mislabeling of optional strokes. Errors are rare and imply a lack of training data; they can occur when unusual strokes affect the overall bounding box or when there are fewer mandatory strokes.
Discussion:
The highlight of the paper is the constellation model. Using pairwise features to capture spatial arrangement is interesting. The number of training samples the algorithm needs is quite large (20-60). The algorithm depends heavily on the bounding box and on the mandatory strokes for recognition, which is contrary to the flexibility it claims to give the user.
Thursday, September 25, 2008
A Domain-Independent System for Sketch Recognition
Authors: Bo Yu, Shijie Cai
Comments:
Summary:
This paper has an interesting way to find corners; it is actually based on primitive shape recognition. The algorithm has two steps: stroke approximation and post-processing. In stroke approximation, the algorithm tries to fit a stroke to some primitive; if it fails, it splits the stroke into two substrokes at the point of highest curvature and feeds them back to the algorithm. The paper proposes different recognition criteria for the primitives (lines, arcs, curves, circles, ellipses, helixes, and spirals), similar to those discussed in PaleoSketch. The second step, post-processing, involves merging corners and cleaning up false positives, much like MergeCF: removing short lines and merging overlapping curves/arcs.
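The fit-or-split recursion can be sketched in a few lines; `fits_primitive` and `curvature` are assumed callables standing in for the paper's actual primitive tests and curvature measure:

```python
def fit_or_split(stroke, fits_primitive, curvature):
    """Recursive stroke approximation (a sketch of the paper's idea).

    Try to fit the whole stroke to a primitive; on failure, split at the
    interior point of highest curvature and recurse on both halves.
    """
    if len(stroke) < 3 or fits_primitive(stroke):
        return [stroke]
    # Split at the interior point of highest curvature.
    k = max(range(1, len(stroke) - 1), key=lambda i: curvature(stroke, i))
    return (fit_or_split(stroke[:k + 1], fits_primitive, curvature)
            + fit_or_split(stroke[k:], fits_primitive, curvature))
```

The two halves share the split point, so adjacent primitives stay connected end to end.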
Discussion:
This paper foresees the creation of a system that can be configured for any shape; the LADDER framework achieves this. Finding corners based on the primitives is an interesting approach. Accuracies of 98% for polylines and 94% for arcs are impressive.
GLADDER: Combining Gesture and Geometric Sketch Recognition
Authors: Paul Corey and Tracy Hammond
Summary:
This paper proposes a method to integrate a feature-based classifier (Rubine) with a geometric recognition system based on LADDER. The Rubine classifier computes the minimum Mahalanobis distance of the sample from the classes; the sample is rejected from Rubine classification if this distance exceeds a threshold, set to 35, which is claimed to be optimal.
Samples that fall above the threshold are classified using the geometric recognition system.
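The hand-off can be sketched as follows; `rubine`, `geometric`, and the return shape of `rubine` are illustrative interfaces, not the paper's actual API:

```python
def classify_hybrid(sample, rubine, geometric, threshold=35.0):
    """Sketch of the GLADDER hand-off.

    `rubine(sample)` is assumed to return (label, mahalanobis_distance);
    if the distance exceeds the rejection threshold, the sample falls
    through to the geometric recognizer instead.
    """
    label, dist = rubine(sample)
    if dist <= threshold:
        return label          # Rubine is confident enough
    return geometric(sample)  # rejected: defer to geometric recognition
```

The single threshold is the whole integration point, which is why its value (35) matters so much to the combined accuracy.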
Discussion:
It is a clever way to improve accuracy: using the geometric recognition system to classify samples that would be rejected by Rubine. I expected a more significant increase in classification accuracy. As the paper states, shapes that fall in the overlap of the two systems may be the cause of the errors (reduced accuracy).
Monday, September 22, 2008
A curvature estimation for pen input segmentation in sketch-based modeling
Authors: Dae Hyun Kim, Myoung-Jun Kim
Comments:
1. Daniel's blog
Summary:
This paper proposes an algorithm to find corners based on curvature and direction measures. Preprocessing includes resampling strokes so that the inter-point distance is constant, which means the curvature can be estimated from the direction values of consecutive points alone, since the constant distance term drops out.
Local convexity: the curvature at a point is defined as the sum of the curvature values over a window of k points before and after the current point (the support window). This sum is larger if the point is a corner. The approach fails when corners are close together.
Local monotonicity: when the local convexity calculation fails, i.e., where corners are close together, monotonicity is used to find them. The algorithm checks whether the curvature around a candidate corner is monotonically increasing or decreasing and finds the minimum in the support window; this minimum is then declared the corner.
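The window-sum measure behind local convexity can be sketched in a few lines; the window size k is illustrative:

```python
def local_convexity(curvature, k=5):
    """Sum of curvature over a +/- k support window at each point.

    A corner shows up as a large window sum; flat stretches sum to
    roughly zero.  This is a sketch of the paper's local-convexity
    measure, not its exact formulation.
    """
    n = len(curvature)
    sums = []
    for i in range(n):
        lo, hi = max(0, i - k), min(n, i + k + 1)  # clamp window at the ends
        sums.append(sum(curvature[lo:hi]))
    return sums
```

Note that two corners closer than the window width blur into one broad peak, which is exactly the failure case the monotonicity step is meant to handle.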
Discussion:
This method works well for polylines and smooth curves. Like any other corner-finding algorithm, it fails for small curves and curves with irregularly changing curvature.
Eliminating False Positives During Corner Finding by Merging Similar Segments
Authors: Aaron Wolin, Brandon Paulson and Tracy Hammond
Summary:
Chord length between two points: the sum of the Euclidean distances between all consecutive points between them.
Speed: chord length / time difference.
In the first pass, corners are found based on curvature and speed thresholds. If two corners are close together, the one with lower curvature is removed.
The algorithm then removes false positives by merging two segments and checking whether the merged segment fits some primitive shape (line/arc).
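The two quantities above can be written directly; a sketch, assuming `points` and `times` are parallel lists of coordinates and timestamps:

```python
import math

def chord_length(points, i, j):
    """Sum of Euclidean distances between consecutive points from i to j."""
    return sum(math.dist(points[k], points[k + 1]) for k in range(i, j))

def speed(points, times, i, j):
    """Chord length divided by the elapsed time between points i and j."""
    return chord_length(points, i, j) / (times[j] - times[i])
```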
Discussion:
A simple and powerful algorithm. I would like to see whether it works well with helix- and resistor-type strokes.
PaleoSketch: Accurate Primitive Sketch Recognition and Beautification
Authors: Brandon Paulson and Tracy Hammond
Comments:
1. Yuxiang's blog
Summary:
This paper is about primitive shape recognition. The shapes recognized by the system include line, polyline, arc, curve, circle, ellipse, helix, and spiral. The first step is to compute some pre-recognition values for a stroke and pass them to low-level shape recognizers. Each recognizer returns a boolean recognition flag and a beautified shape object. The recognition results are then fed into a hierarchy function that sorts them in order of best fit.
Pre-recognition:
- Remove duplicate points (points with the same X, Y coordinates / time).
- Compute the direction graph, speed graph, curvature graph, and corners.
- Normalized distance between direction extremes (NDDE): the stroke length between the points with the highest and lowest direction values, divided by the total stroke length. Curves have a high NDDE; polylines have a lower NDDE.
- Direction change ratio (DCR): detects spikes in the direction graph. It is the maximum change in direction divided by the average change in direction. The first and last 20% of the stroke are analysed for high curvature, and the stroke up to that point is considered a tail; this is not performed on short strokes.
- Overtracing test: count the number of revolutions (2*pi) the stroke has made. If it is more than one revolution and above a certain threshold, mark the stroke as overtraced.
- Closed figure test: the ratio of the distance between the first and last points to the stroke length gives a measure of this feature.
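Three of these pre-recognition features can be sketched roughly as follows; the tail-trimming step for DCR is omitted, and the function layout is mine, not the paper's:

```python
import math

def stroke_features(points):
    """Sketch of NDDE, DCR, and the closed-figure ratio for one stroke."""
    seg = [math.dist(points[i], points[i + 1]) for i in range(len(points) - 1)]
    length = sum(seg)
    # Direction of each segment, and cumulative arc length at each point.
    dirs = [math.atan2(points[i + 1][1] - points[i][1],
                       points[i + 1][0] - points[i][0])
            for i in range(len(points) - 1)]
    arc = [0.0]
    for d in seg:
        arc.append(arc[-1] + d)
    # NDDE: arc length between the direction extremes over total length.
    hi, lo = dirs.index(max(dirs)), dirs.index(min(dirs))
    ndde = abs(arc[hi] - arc[lo]) / length
    # DCR: max direction change over average direction change.
    changes = [abs(dirs[i + 1] - dirs[i]) for i in range(len(dirs) - 1)]
    dcr = max(changes) / (sum(changes) / len(changes)) if changes and sum(changes) else 0.0
    # Closed-figure ratio: endpoint gap over stroke length.
    closed = math.dist(points[0], points[-1]) / length
    return ndde, dcr, closed
```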
Line test: find the orthogonal distance between the best-fit line and the real stroke. If the error (orthogonal distance / stroke length) is within a threshold, mark the stroke as a line. Overtracing and the number of corners in the stroke are also verified.
Polyline test: segment the stroke into substrokes at the corners. Verify the line test on each substroke; the stroke must also have a high DCR.
Ellipse test: the farthest points form the major axis, and the perpendicular bisector of the major axis gives the minor axis. The stroke should pass the closed-shape test, the NDDE must be high, and the ratio of the feature area to the area of the ideal ellipse must be within a threshold.
Circle test: the ellipse test, plus the ratio of the major to minor axis should be close to 1.
Arc test: the perpendicular bisectors of chords on the arc determine the center of the arc, and the average distance of all points from the center is taken as the ideal radius. The feature area error is taken into consideration; the NDDE should be high and the DCR low.
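Intersecting the perpendicular bisectors of two chords reduces to computing the circumcenter of three stroke points, which can be sketched as:

```python
def arc_center(p1, p2, p3):
    """Center of the circle through three points (circumcenter).

    Assumes the three points are not collinear (d would be zero).
    This stands in for the paper's chord-bisector intersection.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    ux = ((x1**2 + y1**2) * (y2 - y3) + (x2**2 + y2**2) * (y3 - y1)
          + (x3**2 + y3**2) * (y1 - y2)) / d
    uy = ((x1**2 + y1**2) * (x3 - x2) + (x2**2 + y2**2) * (x1 - x3)
          + (x3**2 + y3**2) * (x2 - x1)) / d
    return ux, uy
```

Averaging the distances of all stroke points from this center then gives the ideal radius used in the arc test.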
Curve test: fit a Bezier curve to the points and compute the least-squares error against the Bezier parametric values. A curve should have a low DCR value.
Spiral test: the stroke must pass the overtrace test; the NDDE should be high; each substroke should be a circle with ascending/descending radius; the centers of the substrokes must be close to each other; the distance between the farthest centers should not exceed the diameter of the spiral; and the distance between the first and last points divided by the stroke length should be below a threshold.
Helix test: a spiral with moving centers; tested on the number of revolutions.
Complex test: more than one possible interpretation; the results for complex shapes are ranked.
Discussion:
This seems to be a very powerful solution for identifying primitives. I would like to know how computationally intensive the algorithm is. I liked the way it finds the center for arcs. I did not understand the recognition of curves, since I do not know much about Bezier curves. Ranking complex shapes based on the primitives they are composed of is interesting.
Thursday, September 18, 2008
Sketch Based Interfaces: Early Processing for Sketch
Authors: Tevfik Metin Sezgin, Thomas Stahovich, Randall Davis
Summary:
- Corner detection uses the direction change, curvature, and speed of the stroke.
- Average-based filtering: the thresholds for curvature/speed are based on their averages.
- Speed data is used to remove errors generated by the curvature data, since speed is low at corners. A problem with speed data: the maximum speed of short lines may be below the threshold, so they can be falsely recognized as corners.
- Hybrid fits are generated from the speed and curvature results based on vertex certainties, and the best fit is then selected. The certainty measure for curvature is the magnitude of curvature around the corner; the certainty for speed is a measure of the pen slowdown at the point, 1 - v_i/v_max.
- The list of hybrid fits is sorted by the certainty measure. The intersection of the curvature and speed candidate lists is chosen first. Iteratively, the corners with the best certainty measure on either list are added to create two more lists, H' and H''. The best of the three lists is chosen by the mean squared distance of the fit from the stroke; the one with the lowest distance is carried to the next iteration, and this continues until the candidate corner lists are fully checked.
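The speed certainty measure quoted above is simple enough to write out directly; a sketch:

```python
def speed_certainty(speeds, i):
    """Speed certainty for a candidate corner at index i: 1 - v_i / v_max.

    A deep pen slowdown (v_i near zero) gives a certainty near 1,
    while a point drawn at full speed gives a certainty near 0.
    """
    v_max = max(speeds)
    return 1.0 - speeds[i] / v_max
```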
Discussion:
It does not work well for strokes with many sharp curves, like resistors in circuit diagrams, and it detects corners on curves.
Monday, September 15, 2008
Algorithms for the reduction of the number of points required to represent a digitized line or its caricature
Authors: David H Douglas, Thomas K Peucker
Comments:
Summary:
This paper suggests resampling of sketch data.
* Resampling by removing duplicate points: remove points that lie within a threshold distance of each other. This might remove important points.
* Resampling by selecting points, e.g. every 6th point of a line. This might also remove important points.
* Resampling by calculating perpendicular distances:
- Take two points, an anchor and a floating point, and find the perpendicular distance between each point lying between them and the line they form.
- Find the maximum of these perpendicular distances. If it lies within a threshold, the span between the anchor and floating point can be deemed a line.
- This anchor/floating-point window is moved across the stroke.
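The classic recursive form of this reduction can be sketched as follows (a minimal illustration of the idea, not the paper's exact iterative procedure):

```python
import math

def douglas_peucker(points, epsilon):
    """Keep the point farthest from the anchor-to-float chord if its
    perpendicular distance exceeds epsilon, and recurse on both halves;
    otherwise the chord stands in for the whole span."""
    if len(points) < 3:
        return points
    (x1, y1), (x2, y2) = points[0], points[-1]
    chord = math.hypot(x2 - x1, y2 - y1) or 1.0  # guard against coincident ends
    def perp(p):
        return abs((x2 - x1) * (y1 - p[1]) - (x1 - p[0]) * (y2 - y1)) / chord
    k = max(range(1, len(points) - 1), key=lambda i: perp(points[i]))
    if perp(points[k]) <= epsilon:
        return [points[0], points[-1]]
    # Recurse on both halves; drop the duplicated split point once.
    return douglas_peucker(points[:k + 1], epsilon)[:-1] + douglas_peucker(points[k:], epsilon)
```

A larger epsilon removes more points; an epsilon near zero keeps every significant corner.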
Discussion:
An effective way of resampling strokes without losing significant points. The success of the algorithm depends greatly on the threshold chosen to limit the perpendicular distances.
The algorithm also gives a method to perform a straight-line test over a set of points; a similar one is used in Wolin's algorithm too.
ShortStraw: A Simple and Effective Corner Finder for Polylines
Authors: Aaron Wolin, Tracy Hammond
Comments:
1. Daniel's blog
Summary:
Resampling: the interspacing distance = the bounding box diagonal divided by a constant (40).
Corner Finding -
* For each point p in the stroke, find the distance between the points at p-w and p+w (the "straw"). This distance is short when p is a corner.
* Find the median of all the straw lengths; the corner threshold is fixed at t = median * 0.95.
* If straw(p) is below this threshold, p is a corner.
* Run a line test between each pair of consecutive corners (0.95 threshold). If the segment fails the line test, there are additional corners between them, so the threshold t is relaxed to include them. Repeat until the segment between the corners passes the line test.
* Run a collinearity test on every 3 consecutive corners.
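The bottom-up pass can be sketched as follows; the line-test refinement is omitted, and within each run of below-threshold straws the local minimum is taken as the corner:

```python
import math
import statistics

def shortstraw_corners(points, w=3):
    """Sketch of ShortStraw's bottom-up corner pass.

    The straw at point i is the distance between points i-w and i+w;
    straws below 0.95 * median mark candidate corner runs.
    """
    straws = [math.dist(points[i - w], points[i + w])
              for i in range(w, len(points) - w)]
    t = statistics.median(straws) * 0.95
    corners, i = [], 0
    while i < len(straws):
        if straws[i] < t:
            j = i
            while j < len(straws) and straws[j] < t:
                j += 1
            # Take the local minimum of the below-threshold run as the corner.
            k = min(range(i, j), key=lambda m: straws[m])
            corners.append(k + w)  # shift back to point indices
            i = j
        else:
            i += 1
    return corners
```

This assumes the stroke has already been resampled to roughly even spacing, which is what makes the median straw a meaningful baseline.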
Discussion:
A clever way to find corners in a stroke. I don't know how effective relaxing the threshold for outlying corners will be; it might raise false positives.
Tuesday, September 9, 2008
User Sketches: A Quick, Inexpensive, and Effective way to Elicit More Reflective User Feedback
Authors: Maryam Tohidi, William Buxton, Ronald Baecker, Abigail Sellen
Comments:
1. Nabeel's blog
Summary:
This paper compares various traditional methods of usability testing with one that involves sketching. A usability test was done with 48 participants on a house climate-control system.
Three prototypes were used. The participants were divided into groups of 12: the first three groups were each given one prototype, and the last group was given all three.
Verbal feedback suggested the first two prototypes were better than the third, and the result was consistent for the fourth group. The hypothesis that the fourth group, which had all the prototypes, would give more feedback was proven wrong. Many of the participants had comments but not suggestions; they could not express their suggestions.
Feedback through sketching: the participants came up with more ideas. The first two prototypes were mostly replicated exactly, with minor modifications. The fourth group came up with ideas that pulled features from all three prototypes.
Discussion:
Excited to see a new application for sketching.
These sketches plus a domain-specific sketch recognition system would be even more valuable.
Prototype Pruning by Feature Extraction for Handwritten Mathematical Symbol Recognition
Authors: Stephen M Watt, Xiaofang Xie
Summary:
A recognizer for large symbol sets: prototypes are pruned to restrict the search to a small group of samples. The approach pre-classifies the stroke based on certain features.
Preprocessing: chopping the head and tail of the strokes (removing hooks), resampling, (average/Gaussian) smoothing, and size normalization.
Geometric features: number of loops, intersections, and cusps.
* Loops: found using the minimum-distance pair. Thresholds are fixed for the time separation, the distance between the pair, and the area of the loop.
* Intersections: a modified Bentley-Ottmann sweepline algorithm (no detail given).
* Cusps: sharp turning points. Cusps are detected by finding 3 points that form a small angle; the algorithm also checks whether the 2 points before the cusp lie on a straight line, and likewise for the 2 points after it.
Ink-related features: number of strokes, point density (dividing the bounding box into 3 regions: upper, middle, and lower).
Directional features: initial direction, end direction, and initial-to-end direction.
Global features: initial and end position, width-to-height ratio.
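The three-point angle test for cusps can be sketched as follows; the straightness checks on the neighbouring points are omitted, and the angle threshold is illustrative:

```python
import math

def find_cusps(points, max_angle_deg=45.0):
    """Sketch of the cusp test: three consecutive points forming a sharp angle.

    The angle is measured at the middle point between the two vectors
    pointing back to the previous point and forward to the next one.
    """
    cusps = []
    for i in range(1, len(points) - 1):
        ax, ay = points[i - 1][0] - points[i][0], points[i - 1][1] - points[i][1]
        bx, by = points[i + 1][0] - points[i][0], points[i + 1][1] - points[i][1]
        dot = ax * bx + ay * by
        norm = math.hypot(ax, ay) * math.hypot(bx, by)
        if norm and math.degrees(math.acos(max(-1.0, min(1.0, dot / norm)))) < max_angle_deg:
            cusps.append(i)
    return cusps
```

A smooth stretch of stroke gives an angle near 180 degrees at every interior point, so only genuinely sharp turnbacks fall under the threshold.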
Discussion:
The author's thesis suggests some interesting stroke preprocessing: removing hooks (head/tail) from the start and end of strokes based on sharp edges, and Gaussian smoothing.
The calculation of point density is interesting: the bounding box is divided into 3 regions, upper (40%), middle (20%), and bottom (40%).
I like the concept of finding cusps and loops. In her thesis, she discusses scenarios where cusps can become loops while drawing, and vice versa.
I am not able to understand how the author calculates the number of loops.
Sunday, September 7, 2008
GRAPHICAL INPUT THROUGH MACHINE RECOGNITION OF SKETCHES
Author: Christopher F. Herot
Summary:
HUNCH - multi level sketch interpreting system. Sketch interpreted in one level acts as the input for the next level and so on. Constant sample rate input device is used for sketching. Based on STRAIT which assumed that speed of stroke can be linked to intent with which it was drawn. Fast strokes were less intended than the slow ones. Speed of the stroke was less around the corners.Model worked better for some people. STRAIN , modification of the old system uses speed and bentness to find the corners.
Latching - STRAIT used latching to join endpoints lying within a radius. This simple technique produced a lot of errors. Features that can be considered to determine latching: change in size/angle, closure, number of lines, speed and line length, and user profiles (training to a specific user).
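Naive radius-based latching of the kind attributed to STRAIT might look like the sketch below; the radius value and the snap-to-midpoint rule are my assumptions:

```python
import math

def latch_endpoints(strokes, radius=10.0):
    """Naive latching: snap pairs of endpoints from different strokes
    that lie within `radius` of each other to their midpoint. This is
    exactly the error-prone simple scheme the paper criticizes - it has
    no notion of closure, angle, or user profile."""
    # every stroke contributes two endpoints: index 0 and index -1
    ends = [(si, ei) for si in range(len(strokes)) for ei in (0, -1)]
    for a in range(len(ends)):
        for b in range(a + 1, len(ends)):
            (sa, ea), (sb, eb) = ends[a], ends[b]
            if sa == sb:
                continue  # do not latch a stroke to itself
            pa, pb = strokes[sa][ea], strokes[sb][eb]
            if math.dist(pa, pb) <= radius:
                mid = ((pa[0] + pb[0]) / 2, (pa[1] + pb[1]) / 2)
                strokes[sa][ea] = mid
                strokes[sb][eb] = mid
    return strokes
```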
Overtracing - reducing the amount of data given by the user. The amount and style of overtracing may serve as a measure of a particular user's attitude toward the design, with heavy overtracing indicating emphasis or reinforcement of selected areas.
Other levels of inference - the program uses simple rules of projective geometry to map two-dimensional networks of lines into three-dimensional data structures. This has also helped in finding the interdependencies of latching and in finding the third dimension.
The program maps the sketches to two-dimensional grids (similar to a bitmap): cells that lines pass through are '1' and all others are '0'. This is used in a room-finding algorithm, which works for simple sketches.
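That grid mapping could be sketched as below, assuming stroke coordinates normalized to [0, 1]; for brevity only sampled points are marked, not every cell a line segment crosses:

```python
def stroke_grid(strokes, cols, rows):
    """Map strokes onto a binary grid: cells that the ink passes through
    get 1, all others 0. This coarse version only marks the cells of the
    sampled points; a full version would rasterize each segment."""
    grid = [[0] * cols for _ in range(rows)]
    for stroke in strokes:
        for x, y in stroke:
            c = min(int(x * cols), cols - 1)  # clamp x == 1.0 into last column
            r = min(int(y * rows), rows - 1)  # clamp y == 1.0 into last row
            grid[r][c] = 1
    return grid
```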
Recognizing a sketch is done based on context. The HUNCH system generates a context-free data structure that follows a top-down approach, with general nodes at the top and specific details at the bottom. Matching is done in a top-down fashion: first a match is found for the topmost node, then the process moves down.
Discussion:
This system does not use domain-based information; instead it allows users to manipulate the results and trains based on them. Some interesting mappings are given in this paper - speed of stroke to intent of the user, speed/bentness to corners, and overtracing to the attitude of the user. It also threw light on the problems of latching and overtracing, which I never knew about before reading this paper.
Wednesday, September 3, 2008
Gestures without Libraries, Toolkits or Training
Authors: Jacob O. Wobbrock, Andrew D. Wilson, Yang Li
Summary:
This paper proposes a simple four-step algorithm to recognize gestures that is computationally less expensive than Rubine and DTW, flexible, and requires less training.
1. Resample the point path - slow gestures give more sample points and fast gestures give fewer. So gestures are resampled to N = 64 points (adequate; 32 <= N <= 256).
2. Rotate the gesture based on the indicative angle (the angle between the centroid of the gesture and the first point) - find the indicative angle and rotate the gesture so that this angle is zero. The paper proposes a Golden Section Search, using the golden ratio and bounded within +45 to -45 degrees, to fine-tune the rotation; it is better than brute force and the hill-climbing algorithm.
3. Scale and translate - the gesture is scaled to a reference square and then translated so that its centroid lies at (0,0).
4. Find the Euclidean distance between the templates and the candidate. The one that gives the minimum distance is the match.
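The four steps can be put together as a simplified recognizer sketch; the reference-square size here is an arbitrary assumption, and the golden-section rotation search is omitted for brevity:

```python
import math

N = 64          # resampled path length (the paper suggests 32-256)
SQUARE = 250.0  # reference square size (an arbitrary assumption here)

def centroid(pts):
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def resample(pts, n=N):
    """Step 1: resample to n points equidistant along the path."""
    pts = list(pts)
    interval = sum(math.dist(a, b) for a, b in zip(pts, pts[1:])) / (n - 1)
    out, acc, i = [pts[0]], 0.0, 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if acc + d >= interval:
            t = (interval - acc) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)  # continue measuring from the new point
            acc = 0.0
        else:
            acc += d
        i += 1
    out = out[:n]
    while len(out) < n:       # guard against floating-point shortfall
        out.append(pts[-1])
    return out

def rotate_to_zero(pts):
    """Step 2: rotate about the centroid so the indicative angle is zero."""
    cx, cy = centroid(pts)
    theta = math.atan2(pts[0][1] - cy, pts[0][0] - cx)
    c, s = math.cos(-theta), math.sin(-theta)
    return [((x - cx) * c - (y - cy) * s + cx,
             (x - cx) * s + (y - cy) * c + cy) for x, y in pts]

def scale_and_translate(pts, size=SQUARE):
    """Step 3: non-uniformly scale to a reference square, then move the
    centroid to the origin. The epsilon guards degenerate 1-D strokes."""
    xs, ys = [x for x, _ in pts], [y for _, y in pts]
    w = max(max(xs) - min(xs), 1e-6)
    h = max(max(ys) - min(ys), 1e-6)
    pts = [(x * size / w, y * size / h) for x, y in pts]
    cx, cy = centroid(pts)
    return [(x - cx, y - cy) for x, y in pts]

def normalize(pts):
    return scale_and_translate(rotate_to_zero(resample(pts)))

def recognize(candidate, templates):
    """Step 4: nearest template by average point-to-point distance."""
    cand = normalize(candidate)
    best, best_d = None, float("inf")
    for name, tmpl in templates.items():
        d = sum(math.dist(a, b) for a, b in zip(cand, normalize(tmpl))) / N
        if d < best_d:
            best, best_d = name, d
    return best
```

With a straight-line template and a V-shaped template, a near-horizontal stroke matches the line: both normalize to nearly identical point sequences, while the V stays far away in average point distance.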
Limitations:
Cannot distinguish gestures that differ only in orientation, aspect ratio, or location. For the recognizer to be flexible to gesture variations, the authors suggest adding an unrecognized gesture as a new template of the class the user intended.
Results show this algorithm performs about the same as DTW in recognition accuracy and better than Rubine. Recognition improved with the number of templates (significantly so for Rubine).
Discussion:
This paper does not discuss the rationale behind the size of the reference square. Is it the bounding box of the template, or some arbitrary square (larger than every sample)? A reference square that depends on the size of the template could also increase the error rate (when the template is small).
The algorithm does not take the direction of the stroke into account.
Lots of limitations - limited support for variations, inability to distinguish gestures based on orientation differences, etc.
The algorithm will be useful for applications with fairly simple and few gestures.
