Based on the literature, the technologies and processes needed to build the gesture-based learn-to-write tool were identified as solutions to the problems encountered. This post summarizes the gathered knowledge about the selected processes and technologies, and how they should be applied.
Writing Phase and UI
This is the phase where the user directly interacts with the system. The human-computer interaction (HCI) is driven entirely by finger gestures. Apart from the primary writing interactions, there can be other interactions such as button selection and navigation. An effective motion-based control should support the common functions of a 2D graphical user interface, such as pointing, selection, and dragging. The natural user interface (NUI) metaphor generally used for motion-based remote pointing is the "virtual mouse in the air": finger motion on a virtual plane parallel to the display controls the cursor correspondingly. The 2D cursor position on the screen is calculated by projecting, scaling, shifting, and clipping the 3D coordinates of the controller. The scaling factor is determined by the screen resolution and the precision of the optical tracking. This can be handled easily with the Leap Motion Application Programming Interface (API).
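As a rough sketch of that calculation (the tracking ranges, screen size, and function name below are illustrative assumptions, not values taken from the Leap Motion API), the mapping could look like this:

```python
# Minimal sketch: map a 3D fingertip position (mm, tracker space) to 2D screen pixels.
# The tracking-range bounds and screen size are illustrative assumptions.

SCREEN_W, SCREEN_H = 1920, 1080          # target display resolution
X_MIN, X_MAX = -150.0, 150.0             # assumed horizontal tracking range (mm)
Y_MIN, Y_MAX = 100.0, 350.0              # assumed vertical tracking range (mm)

def tip_to_cursor(x, y, z):
    """Project, scale, shift, and clip a fingertip position to screen coordinates."""
    # Project: ignore depth (z) when pointing on a plane parallel to the display.
    # Scale + shift: normalize the tracked range to [0, 1], then stretch to pixels.
    nx = (x - X_MIN) / (X_MAX - X_MIN)
    ny = (y - Y_MIN) / (Y_MAX - Y_MIN)
    px = nx * SCREEN_W
    py = (1.0 - ny) * SCREEN_H           # screen y grows downward, tracker y grows upward
    # Clip: keep the cursor inside the display.
    px = min(max(px, 0), SCREEN_W - 1)
    py = min(max(py, 0), SCREEN_H - 1)
    return int(px), int(py)

# Example: a fingertip slightly right of center and at mid height.
print(tip_to_cursor(40.0, 220.0, 10.0))
```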
The UI plays a crucial role here. Its main component is the drawing area. To avoid confusion, the user must receive real-time visual feedback of the finger movement while writing a letter. As mentioned in the previous post, explicit delimitation was chosen for motion segmentation because our user base varies in writing habits; automatic detection could give unpredictable results at the spotting stage. Since explicit delimitation is used, pen-up motion should not be shown in the UI.
The focus should also be on giving the user dynamic feedback for every gesture action. The more feedback users have, the more precisely they can interact with the system. For example, if a user wants to push a button with a gesture, he will want to know whether he is "pushing" the button in real time. It is more effective if he can see when he is hovering over a button, or how far he has pressed it. Another way of amplifying the button experience is a proximity-based highlighting scheme: the item closest to the user's cursor is highlighted, and a tap gesture can activate it without the cursor having to be exactly over it. Anticipating what the user might want in these contexts saves time and eliminates frustration.
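A minimal sketch of such a proximity-based highlighting scheme, assuming buttons are simple points with a hand-picked activation radius (the Button class and the values are illustrative, not part of any UI toolkit):

```python
import math

# Minimal sketch of proximity-based highlighting: the button nearest to the
# cursor is highlighted, and a tap activates it even if the cursor is not
# exactly over it. Button geometry and the activation radius are assumptions.

class Button:
    def __init__(self, label, cx, cy):
        self.label = label
        self.cx, self.cy = cx, cy        # center of the button in pixels

    def distance_to(self, px, py):
        return math.hypot(px - self.cx, py - self.cy)

ACTIVATION_RADIUS = 120                  # pixels; illustrative value

def closest_button(buttons, px, py):
    """Return the button nearest the cursor, or None if none is close enough."""
    best = min(buttons, key=lambda b: b.distance_to(px, py))
    return best if best.distance_to(px, py) <= ACTIVATION_RADIUS else None

buttons = [Button("Write", 300, 900), Button("Clear", 600, 900), Button("Next", 900, 900)]
target = closest_button(buttons, 640, 860)   # cursor near, but not over, "Clear"
if target is not None:
    print("highlight:", target.label)        # a tap gesture would activate this one
```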
The UI components should also be larger and well organized, so that they can be reached easily with a finger without much ergonomic difficulty.
Capturing Phase Alternate Solutions
The biggest problem lies in capturing the gesture writing. Because of the varying user base, our only option is explicit delimitation for capturing motion segments, for the reasons mentioned previously. Within explicit delimitation, there are several alternative methods, described below.
- Key Press on the Keyboard
This is the easiest method to implement, but pressing a key while writing with a gesture does not fit well together. Our focus is a controller-free system with no buttons, which needs a different, easy-to-use form of push-to-write delimiter. So this technique is not the best fit for our requirement, even though it is simple to put into operation.
- Breaking Gestures
Here a stroke starts with a defined gesture (e.g. an angled-down finger) and ends with a defined opposite gesture (e.g. an angled-up finger). Using specific postures or gestures to break letter strokes may require a little more user training than the plane-penetration method described next.
- Virtual Input Zone
The concept here is to define a drawing plane that the finger or tool (the drawing component) must cross to start a stroke and pull back from to end it. The test can be as simple as checking whether the z-coordinate of the tip position is less than a certain value, which gives a vertical writing surface. A plane tilted away from the user at a more comfortable angle could also be defined with slightly more complex geometry. Ending a letter stroke could be based on withdrawal from the drawing area, and a pause in motion on the plane could also trigger the start and end of strokes. (A short sketch of this plane-penetration check follows the comparison below.)
[Figure: Virtual Touch Emulation Plane]
Compared with the key-press and breaking-gesture approaches, defining a virtual input zone was concluded to be the most suitable approach to move forward with.
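As a rough illustration of this choice, here is a sketch of the plane-penetration check under assumed values (the plane depth, sample format, and function name are illustrative, not the final implementation):

```python
# Minimal sketch of explicit delimitation with a virtual input zone:
# the fingertip is "pen down" while its z-coordinate is in front of an
# assumed vertical plane, and pulling back ends the stroke.
# The plane depth and the sample stream are illustrative assumptions.

PLANE_Z = 0.0          # tracker-space depth of the virtual writing plane (mm)

def segment_strokes(tip_samples):
    """Split a stream of (x, y, z) fingertip samples into pen-down strokes."""
    strokes, current = [], []
    for x, y, z in tip_samples:
        if z < PLANE_Z:                 # finger has crossed the plane: pen down
            current.append((x, y))      # only 2D points go to the drawing area
        elif current:                   # finger pulled back: stroke ended
            strokes.append(current)
            current = []
    if current:                         # close a stroke still in progress
        strokes.append(current)
    return strokes

# Example: one stroke drawn in front of the plane, followed by a pen-up move.
samples = [(0, 0, 10), (1, 1, -5), (2, 2, -6), (3, 3, -4), (4, 4, 12), (5, 5, 15)]
print(segment_strokes(samples))         # -> [[(1, 1), (2, 2), (3, 3)]]
```

Hiding pen-up motion in the UI then comes for free: only the points inside a stroke are ever drawn.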
Recognition Phase
The capturing phase and the recognition phase are mostly seen as a combined process.
[Figure: Gesture Character Recognition Process]
The ‘$P Point Cloud’ recognizer, the latest of the ‘$-family’, was selected as the recognition technique for the reasons mentioned in the Introduction post. The $P recognizer treats a gesture as a cloud of points. Unlike other recognition techniques, this method discards the gesture execution timeline, so the number of strokes, stroke ordering, and stroke direction become irrelevant. The task of the recognizer is to match the point cloud of the candidate gesture against the point cloud of each template in the training set and compute a matching distance.
[Figure: Point Cloud]
Following the nearest-neighbor approach, the template located at the smallest distance from the candidate gesture delivers the classification result.
[Figure: $P Point Cloud Process]
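The sketch below is a much-simplified illustration of that idea: clouds are normalized and then classified by a greedy nearest-neighbor matching distance. It omits the resampling step and the weighting used in the actual $P algorithm, and the function names are our own:

```python
import math

# Simplified sketch of point-cloud matching in the spirit of $P:
# normalize candidate and template clouds, then classify by the smallest
# matching distance. Clouds are assumed to already have equal point counts;
# this is an illustration of the matching step, not the full $P algorithm.

def normalize(cloud):
    """Scale the cloud into a unit box and translate its centroid to the origin."""
    xs, ys = [p[0] for p in cloud], [p[1] for p in cloud]
    size = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    return [((x - cx) / size, (y - cy) / size) for x, y in cloud]

def greedy_cloud_distance(a, b):
    """Greedily match each point of cloud a to its closest unmatched point of cloud b."""
    b_left = list(b)
    total = 0.0
    for ax, ay in a:
        j = min(range(len(b_left)),
                key=lambda k: math.hypot(ax - b_left[k][0], ay - b_left[k][1]))
        bx, by = b_left.pop(j)
        total += math.hypot(ax - bx, ay - by)
    return total

def classify(candidate, templates):
    """templates: dict of label -> point cloud. Returns the label of the nearest template."""
    cand = normalize(candidate)
    return min(templates,
               key=lambda label: greedy_cloud_distance(cand, normalize(templates[label])))
```

In practice both clouds would first be resampled to the same fixed number of points before normalization, so the distances stay comparable across gestures.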
Feedback Phase
This is the phase where the user receives feedback on their letter writing. The response message should tell the user whether the letter was written well or not. If the system can verify, via the recognition process, that the intended letter was indeed written, that information should be conveyed to the user.
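A minimal sketch of that check, assuming the recognizer returns a letter label and the prompted letter is known (the function name and message strings are illustrative):

```python
# Minimal sketch of the feedback step: compare the recognized label with the
# letter the learner was asked to write and report the result.

def feedback_message(expected_letter, recognized_letter):
    if recognized_letter == expected_letter:
        return "Well done! You wrote '%s' correctly." % expected_letter
    return "Not quite. That looked like '%s'; try writing '%s' again." % (
        recognized_letter, expected_letter)

print(feedback_message("A", "A"))
print(feedback_message("B", "D"))
```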