Based on the literature, the technologies and processes needed to build the gesture-based learn-to-write tool were identified as solutions to the problems encountered. This post summarizes the gathered knowledge about the selected processes and technologies, and how they should be applied.
Writing Phase and UI
This is the phase where the user directly interacts with the system. The human-computer interaction (HCI) is driven entirely by finger gestures. Apart from the primary writing interactions, there can be other interactions such as button selection and navigation. An effective motion-based control should support the common functions of a 2D graphical user interface, such as pointing, selection, and dragging. The natural user interface (NUI) metaphor generally used for motion-based remote pointing is the "virtual mouse in the air": finger motion on a virtual plane parallel to the display controls the cursor correspondingly. The 2D cursor position on the screen is calculated by projecting, scaling, shifting, and clipping the 3D coordinates of the controller. The scaling factor is determined by the screen resolution and the precision of the optical tracking. This can be handled easily with the Leap Motion Application Programming Interface (API).
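As a rough sketch of that calculation (the tracking ranges, screen size, and function name below are illustrative assumptions, not values taken from the Leap Motion API), the mapping could look like this:

```python
# Minimal sketch: map a 3D fingertip position (mm, tracker space) to 2D screen pixels.
# The tracking-range bounds and screen size are illustrative assumptions.

SCREEN_W, SCREEN_H = 1920, 1080          # target display resolution
X_MIN, X_MAX = -150.0, 150.0             # assumed horizontal tracking range (mm)
Y_MIN, Y_MAX = 100.0, 350.0              # assumed vertical tracking range (mm)

def tip_to_cursor(x, y, z):
    """Project, scale, shift, and clip a fingertip position to screen coordinates."""
    # Project: ignore depth (z) when pointing on a plane parallel to the display.
    # Scale + shift: normalize the tracked range to [0, 1], then stretch to pixels.
    nx = (x - X_MIN) / (X_MAX - X_MIN)
    ny = (y - Y_MIN) / (Y_MAX - Y_MIN)
    px = nx * SCREEN_W
    py = (1.0 - ny) * SCREEN_H           # screen y grows downward, tracker y grows upward
    # Clip: keep the cursor inside the display.
    px = min(max(px, 0), SCREEN_W - 1)
    py = min(max(py, 0), SCREEN_H - 1)
    return int(px), int(py)

# Example: a fingertip slightly right of center and at mid height.
print(tip_to_cursor(40.0, 220.0, 10.0))
```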
The UI plays a crucial role here. Its main component is the drawing area. To avoid confusion, the user must receive real-time visual feedback of the finger movement while writing a letter. As mentioned in the previous post, explicit delimitation was chosen for motion segmentation because our user base varies in writing habits; automatic detection could give unpredictable results at the spotting stage. Since explicit delimitation is used, pen-up motion should not be shown in the UI.
The focus should also be on giving the user dynamic feedback for every gesture action. The more feedback users have, the more precisely they can interact with the system. For example, if a user wants to push a button with a gesture, he will want to know whether he is "pushing" the button in real time. It is more effective if he can see when he is hovering over a button, or how far he has pressed it. Another way of amplifying the button experience is a proximity-based highlighting scheme: the item closest to the user's cursor is highlighted, and a tap gesture can activate it without the cursor having to be exactly over it. Anticipating what the user might want in these contexts saves time and eliminates frustration.
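A minimal sketch of such a proximity-based highlighting scheme, assuming buttons are simple points with a hand-picked activation radius (the Button class and the values are illustrative, not part of any UI toolkit):

```python
import math

# Minimal sketch of proximity-based highlighting: the button nearest to the
# cursor is highlighted, and a tap activates it even if the cursor is not
# exactly over it. Button geometry and the activation radius are assumptions.

class Button:
    def __init__(self, label, cx, cy):
        self.label = label
        self.cx, self.cy = cx, cy        # center of the button in pixels

    def distance_to(self, px, py):
        return math.hypot(px - self.cx, py - self.cy)

ACTIVATION_RADIUS = 120                  # pixels; illustrative value

def closest_button(buttons, px, py):
    """Return the button nearest the cursor, or None if none is close enough."""
    best = min(buttons, key=lambda b: b.distance_to(px, py))
    return best if best.distance_to(px, py) <= ACTIVATION_RADIUS else None

buttons = [Button("Write", 300, 900), Button("Clear", 600, 900), Button("Next", 900, 900)]
target = closest_button(buttons, 640, 860)   # cursor near, but not over, "Clear"
if target is not None:
    print("highlight:", target.label)        # a tap gesture would activate this one
```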
The UI components should also be larger and well organized, so that they can be reached easily with a finger without much ergonomic difficulty.
Capturing Phase Alternate Solutions
The biggest problem lies in capturing the gesture writing. Because of the varying user base, our only option is explicit delimitation for capturing motion segments, for the reasons mentioned previously. Within explicit delimitation, there are several alternative methods, described below.
- Key Press on the Keyboard
This is the easiest method to implement, but pressing a key while writing with a gesture does not fit well together. Our focus is a controller-free system with no buttons, which needs a different, easy-to-use form of push-to-write delimiter. So this technique is not the best fit for our requirement, even though it is simple to put into operation.
- Breaking Gestures
Here a stroke starts with a defined gesture (e.g. an angled-down finger) and ends with a defined opposite gesture (e.g. an angled-up finger). Using specific postures or gestures to break letter strokes may require a little more user training than the plane-penetration method described next.
- Virtual Input Zone
The concept here is to define a drawing plane that the finger or tool (the drawing component) must cross to start a stroke and pull back from to end it. The test can be as simple as checking whether the z-coordinate of the tip position is less than a certain value, which gives a vertical writing surface. A plane tilted away from the user at a more comfortable angle could also be defined with slightly more complex geometry. Ending a letter stroke could be based on withdrawal from the drawing area, and a pause in motion on the plane could also trigger the start and end of strokes. (A short sketch of this plane-penetration check follows the comparison below.)
[Figure: Virtual Touch Emulation Plane]
Compared with the key-press and breaking-gesture approaches, defining a virtual input zone was concluded to be the most suitable approach to move forward with.
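As a rough illustration of this choice, here is a sketch of the plane-penetration check under assumed values (the plane depth, sample format, and function name are illustrative, not the final implementation):

```python
# Minimal sketch of explicit delimitation with a virtual input zone:
# the fingertip is "pen down" while its z-coordinate is in front of an
# assumed vertical plane, and pulling back ends the stroke.
# The plane depth and the sample stream are illustrative assumptions.

PLANE_Z = 0.0          # tracker-space depth of the virtual writing plane (mm)

def segment_strokes(tip_samples):
    """Split a stream of (x, y, z) fingertip samples into pen-down strokes."""
    strokes, current = [], []
    for x, y, z in tip_samples:
        if z < PLANE_Z:                 # finger has crossed the plane: pen down
            current.append((x, y))      # only 2D points go to the drawing area
        elif current:                   # finger pulled back: stroke ended
            strokes.append(current)
            current = []
    if current:                         # close a stroke still in progress
        strokes.append(current)
    return strokes

# Example: one stroke drawn in front of the plane, followed by a pen-up move.
samples = [(0, 0, 10), (1, 1, -5), (2, 2, -6), (3, 3, -4), (4, 4, 12), (5, 5, 15)]
print(segment_strokes(samples))         # -> [[(1, 1), (2, 2), (3, 3)]]
```

Hiding pen-up motion in the UI then comes for free: only the points inside a stroke are ever drawn.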
Recognition Phase
The capturing phase and the recognition phase are mostly seen as a combined process.
[Figure: Gesture Character Recognition Process]
The ‘$P Point Cloud’ recognizer, the latest of the ‘$-family’, was selected as the recognition technique for the reasons mentioned in the Introduction post. The $P recognizer treats a gesture as a cloud of points. Unlike other recognition techniques, this method discards the gesture execution timeline, so the number of strokes, stroke ordering, and stroke direction become irrelevant. The task of the recognizer is to match the point cloud of the candidate gesture against the point cloud of each template in the training set and compute a matching distance.
[Figure: Point Cloud]
Following the nearest-neighbor approach, the template located at the smallest distance from the candidate gesture delivers the classification result.
[Figure: $P Point Cloud Process]
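The sketch below is a much-simplified illustration of that idea: clouds are normalized and then classified by a greedy nearest-neighbor matching distance. It omits the resampling step and the weighting used in the actual $P algorithm, and the function names are our own:

```python
import math

# Simplified sketch of point-cloud matching in the spirit of $P:
# normalize candidate and template clouds, then classify by the smallest
# matching distance. Clouds are assumed to already have equal point counts;
# this is an illustration of the matching step, not the full $P algorithm.

def normalize(cloud):
    """Scale the cloud into a unit box and translate its centroid to the origin."""
    xs, ys = [p[0] for p in cloud], [p[1] for p in cloud]
    size = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    return [((x - cx) / size, (y - cy) / size) for x, y in cloud]

def greedy_cloud_distance(a, b):
    """Greedily match each point of cloud a to its closest unmatched point of cloud b."""
    b_left = list(b)
    total = 0.0
    for ax, ay in a:
        j = min(range(len(b_left)),
                key=lambda k: math.hypot(ax - b_left[k][0], ay - b_left[k][1]))
        bx, by = b_left.pop(j)
        total += math.hypot(ax - bx, ay - by)
    return total

def classify(candidate, templates):
    """templates: dict of label -> point cloud. Returns the label of the nearest template."""
    cand = normalize(candidate)
    return min(templates,
               key=lambda label: greedy_cloud_distance(cand, normalize(templates[label])))
```

In practice both clouds would first be resampled to the same fixed number of points before normalization, so the distances stay comparable across gestures.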
Feedback Phase
This is the phase where the user receives feedback on their letter writing. The response message should tell the user whether the letter was written well or not. If the system can verify, via the recognition process, that the intended letter was indeed written, that information should be conveyed to the user.
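A minimal sketch of that check, assuming the recognizer returns a letter label and the prompted letter is known (the function name and message strings are illustrative):

```python
# Minimal sketch of the feedback step: compare the recognized label with the
# letter the learner was asked to write and report the result.

def feedback_message(expected_letter, recognized_letter):
    if recognized_letter == expected_letter:
        return "Well done! You wrote '%s' correctly." % expected_letter
    return "Not quite. That looked like '%s'; try writing '%s' again." % (
        recognized_letter, expected_letter)

print(feedback_message("A", "A"))
print(feedback_message("B", "D"))
```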