Segmenting the Motion for Virtual Valipilla

In this series of posts I cover the project background and the techniques and tools that can be used for the air-finger-writing process. This post covers the motion segmentation techniques that can be used in the implementation.

Motion Segmentation

We are using a vision input device that constantly streams the location of the fingers within its field of view. The data stream therefore contains both intended control motions and extraneous motions that do not correspond to any control motion. Pen-up/pen-down gestures cannot be identified here as easily as in normal handwriting. Essentially, the user input has no known beginning or end: no predefined gesture indicates where character writing starts and stops. That is why a motion segmentation mechanism is needed to accurately identify the starts and ends of the letter strokes as well as finger pauses.

There are two paradigms for motion segmentation:

1. Explicit Delimitation
2. Automatic Detection



Explicit Delimitation

Explicit delimitation can be easily accomplished with a push-to-write scheme, where the user holds a button to start and releases it to stop. A common alternative is to replace the button with a specific posture or gesture that signals the end points of the intended control motion. Another approach is to define an input zone for gesture delimitation.[8] When the user reaches out to write in the air, the explicit delimitation is done by thresholding the depth information. Other scenarios have been proposed, including one where an LED pen is tracked in the air. This allows 3D data to be interpreted and ensures that the beginning and end of the input are clearly defined. Our focus, however, is to develop a low-cost, easy-to-use tool that needs no gadget other than the sensor device.
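As a rough illustration of the input-zone idea, the sketch below splits a fingertip stream into strokes by thresholding the depth value. It is only a minimal sketch under assumptions of my own: the function name segment_by_depth, the (x, y, z) sample layout, the convention that a smaller z means the finger is further forward, and the threshold of 0.5 are all hypothetical and do not come from [8] or from any sensor SDK.

```python
from typing import Iterable, List, Tuple

# Assumed sample layout: (x, y, z) fingertip position, z being the depth axis.
Point = Tuple[float, float, float]


def segment_by_depth(samples: Iterable[Point], zone_depth: float) -> List[List[Point]]:
    """Split a fingertip stream into strokes using a virtual input zone.

    A sample counts as "pen down" while its z value is below zone_depth,
    i.e. while the finger has reached past a virtual plane. Both the axis
    convention and the threshold are assumptions made for this sketch.
    """
    strokes: List[List[Point]] = []
    current: List[Point] = []
    for point in samples:
        if point[2] < zone_depth:
            current.append(point)    # finger is inside the input zone
        elif current:
            strokes.append(current)  # finger left the zone: close the stroke
            current = []
    if current:                      # stream ended while still writing
        strokes.append(current)
    return strokes


stream = [
    (0.0, 0.0, 0.9),  # outside the zone
    (0.1, 0.0, 0.4),  # inside: first stroke starts
    (0.2, 0.1, 0.3),
    (0.3, 0.1, 0.8),  # outside: first stroke ends
    (0.4, 0.2, 0.2),  # inside: second stroke
]
strokes = segment_by_depth(stream, zone_depth=0.5)
print([len(s) for s in strokes])  # prints [2, 1]
```

A real system would probably add some hysteresis around the threshold so that sensor jitter near the plane does not chop one stroke into many.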

Automatic Detection

This uses a spotting technique that requires no intentional delimitation. The system automatically detects the intended motion and segments the writing correspondingly. If it functions perfectly, automatic detection can make the air-writing experience more convenient, especially for controller-free systems. Detection of motion gestures can be very difficult when the defined gesture has a trajectory similar to other control motions. In such a case, the motion itself may not contain enough information for automatic detection, and push-to-write is more robust and accurate for motion segmentation. However, because writing motion differs considerably from other general control motions, robust detection of air-writing is possible. Unsupervised learning and data-driven approaches are typically used in this kind of method.

Amma et al.[9] proposed a spotting algorithm for air-writing based on the acceleration and angular speed from inertial sensors attached to a glove. It is a two-stage approach: a spotting stage followed by a recognition stage.


Automatic Detection Employed Process

In the spotting stage, they use a binary Support Vector Machine (SVM) classifier to discriminate motion that potentially contains handwriting from motion that does not. The illustration shows two ways the letter ‘N’ could be written, as well as the letter ‘H’. The red portions are the segments that the spotting stage marks as motion to be ignored. Only the grey segments should be passed on to the next stage.
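To make the hand-off between the two stages concrete, the sketch below takes one writing/non-writing decision per window and merges consecutive "writing" windows into segments that could be passed to a recognizer. It is only a sketch under my own assumptions: the classifier itself is left out, the decisions list is made up, and the helper name writing_segments is hypothetical. It is not the procedure described in [9].

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def writing_segments(is_writing: Sequence[bool]) -> List[Tuple[int, int]]:
    """Return (start, end) window-index ranges, end exclusive, for each run of True."""
    segments: List[Tuple[int, int]] = []
    index = 0
    for flag, run in groupby(is_writing):
        length = len(list(run))
        if flag:
            segments.append((index, index + length))
        index += length
    return segments


# Hypothetical per-window output of a binary spotting classifier.
decisions = [False, True, True, True, False, False, True, True, False]
print(writing_segments(decisions))  # prints [(1, 4), (6, 8)]
```

Windows labelled False correspond to the ignored (red) portions, and each returned range corresponds to one candidate writing (grey) segment.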





The time-window-based approach used in Chen’s work[2] is similar to this method: potential non-writing motions are allowed in at the capturing stage, and a filler model is used later to handle them. The window-based detector is only responsible for determining whether a writing event occurs in the window, which is slid through the continuous motion data. A writing event usually involves sharp turns, frequent changes in direction, and complicated shapes rather than a drift or swipe motion. The sliding window has to be long enough to capture these writing characteristics in order to distinguish a writing event. It is straightforward to classify a window that contains only tiny motion as a non-writing event, and such a “silent” window needs to be skipped from further processing.
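The window-based idea can be sketched with two simple hand-made features: the path length inside the window (to spot “silent” windows) and the number of sharp turns (to tell writing from a drift or swipe). This is a minimal sketch, not the detector from [2]. The function names, the 2D point layout, and every threshold (min_length, min_turns, min_angle_deg) are assumptions chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # assumed 2D fingertip position


def path_length(window: Sequence[Point]) -> float:
    """Total distance travelled by the fingertip inside the window."""
    return sum(math.hypot(b[0] - a[0], b[1] - a[1]) for a, b in zip(window, window[1:]))


def sharp_turns(window: Sequence[Point], min_angle_deg: float = 60.0) -> int:
    """Count direction changes of at least min_angle_deg between consecutive steps."""
    turns = 0
    for a, b, c in zip(window, window[1:], window[2:]):
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - b[0], c[1] - b[1])
        n1, n2 = math.hypot(*v1), math.hypot(*v2)
        if n1 == 0 or n2 == 0:
            continue  # no movement in one of the steps, so no direction to compare
        cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos))))
        if angle >= min_angle_deg:
            turns += 1
    return turns


def label_windows(points: Sequence[Point], size: int, step: int,
                  min_length: float = 0.05, min_turns: int = 2) -> List[str]:
    """Slide a window over the stream and label each position."""
    labels = []
    for start in range(0, len(points) - size + 1, step):
        window = points[start:start + size]
        if path_length(window) < min_length:
            labels.append("silent")   # tiny motion: skip further processing
        elif sharp_turns(window) >= min_turns:
            labels.append("writing")  # several sharp turns: likely writing
        else:
            labels.append("other")    # smooth drift or swipe
    return labels


still = [(0.0, 0.0)] * 6
swipe = [(0.1 * i, 0.0) for i in range(1, 7)]
zigzag = [(0.6, 0.0), (0.7, 0.2), (0.8, 0.0), (0.9, 0.2), (1.0, 0.0), (1.1, 0.2)]
labels = label_windows(still + swipe + zigzag, size=6, step=6)
print(labels)  # prints ['silent', 'other', 'writing']
```

The example uses non-overlapping windows to keep the output easy to read. A smaller step would give overlapping windows, which is closer to sliding the window through continuous motion data.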

Almost all automatic detection techniques rely on a time variable to distinguish between writing motion and non-writing motion. So, if automatic detection techniques are used, careful consideration should be given to whether the user base can depend on this time variable.


References

[2]
Mingyu Chen, "Universal Motion Based Control," Ph.D. dissertation, School of Electrical and Computer Engineering, Georgia Institute of Technology, 2013.
[8]
Per Ola Kristensson, Thomas F.W. Nicholson, and Aaron Quigley, "Continuous recognition of one-handed and two-handed gestures using 3D full-body motion tracking sensors," in Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces, Lisbon, Portugal, 2012, pp. 89-92.
[9]
Christoph Amma, Marcus Georgi, and Tanja Schultz, "Airwriting: Hands-free Mobile Text Input by Spotting and Continuous," in Proceedings of the 2012 16th Annual International Symposium on Wearable Computers (ISWC), Washington, DC, USA, 2012, pp. 52-59.


