Thursday, December 31, 2015

Thesis Draft Submitted


Submitted the thesis draft of Virtual Valipilla. Getting ready for presentations. Ahmm...

Friday, November 27, 2015

Middle Path | මධ්‍යම ප්‍රතිපදාව | Majjhimāpaṭipadā

Nopenena Manaya (The Unseen Dimension) | Derana | 2015-11-25 | a thought heard on the Il full-moon Poya day | Ven. Medagoda Abhayatissa Thera

Does the Middle Path really mean the midpoint between the two extremes?

Logic 1: "When it comes to stealing, the two extremes would be not stealing at all and stealing everything; so is the 'middle path' of that story, stealing just a little, a good thing?"

Logic 2: "Is the 'central government' a government that sits midway between two extremes? ...Is the 'central bus stand' a bus halt built, measured out, at the exact midpoint between two places?"

Conclusion: "The Middle Path (madhyama pratipadāwa) mostly carries the meaning of the word 'fundamental' (mūlika)."


Saturday, August 29, 2015

September Dedications

Book Month Incoming.....

This September's dedication goes to TB, after watching "Thilaka ha Thilakaa" (තිලක හා තිලකා) for the second time. 2015 is a good year to watch it again and get a better understanding of its politics. The movie was created from a mix of TB's three books "විලම්බීත", "තිලක", "තිලක හා තිලකා" ("Vilambeetha", "Thilaka", "Thilaka ha Thilakaa"), and the story of "Thilaka ha Thilakaa" depicts the real-life story of TB (Tikiri Bandara) and his wife, Thamara. Just as the "Apu" trilogy came out of Bengali culture, the "Thilaka" trilogy is the brand for Sri Lanka.

There were a lot of popular books from great authors in the past carrying the flavor of the cultural, social and political situation of their time, such as Gunadasa Amarasekara's stories. I wonder whether there is anything like that now.

කරුණාරත්න හඟවත්ත සහ වජිරා නිර්මලී, තිලක සහ  තිලකා ලෙසින් (Karnarathne Hangawatte [aka Prof. Karu Hangawatte] and Vajira Nirmali as Thilaka and Thilakaa)

Hoping to grab some second-hand books of T. B. Ilangaratne (ටී බී ඉලංගරත්න) this September, as Yamu.lk suggested, rather than new books, because they carry the real flavor and the old smell of bygone stories.

Sunday, August 9, 2015

Open the pod bay doors, Hal!


Watched the (boring by today's standards) science fiction film 2001: A Space Odyssey, co-written by Arthur C. Clarke in 1968. And I just found this funny part on the wiki about how Arthur got involved in writing the film.


Searching for a collaborator in the science fiction community, Kubrick(director) was advised by a mutual acquaintance, Columbia Pictures staffer Roger Caras, to talk to writer Arthur C. Clarke. Although convinced that Clarke was "a recluse, a nut who lives in a tree", Kubrick allowed Caras to cable the film proposal to Clarke, who lived in Ceylon.



Seriously wonder why he chose Ceylon...!

HAL 9000, the machine intelligence highlighted in the film, demonstrates its ability to lip-read using visual speech recognition. And that's where the famous quote comes from: "Open the pod bay doors, HAL!"






Anyway, what we're interested in here is not the villain HAL or the genius Clarke. What we're interested in is the song "Daisy Bell", sung by HAL, our machine intelligence, which is what led me to watch the film even though it made me sleepy.


Daisy, Daisy, give me your answer do
I'm half crazy all for the love of you
It won't be a stylish marriage
I can't afford a carriage
But you'll look sweet upon the seat
Of a bicycle built for two.

Originally it was not sung by HAL. It was actually sung by a real IBM machine in 1961 to demonstrate speech synthesis, and it quite inspired Arthur, who witnessed the singing. So let's hear it sing.




And more... search for how Samsung used Stanley Kubrick's 1968 science fiction film as evidence against Apple's intellectual property infringement lawsuit. Anyway, Samsung lost, I believe.




Sunday, August 2, 2015

Beer & Diapers for Data Mining


The Beer & Diapers story is a well-known example used, in the international context, to explain the data mining concepts of frequent itemsets and association rules.




This anecdote about the unexpected Beer & Diapers association rule is heard everywhere when learning data mining, even in places where there is no beer culture.
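For anyone meeting the idea for the first time, the rule is usually described with two simple measures, support and confidence. Below is a minimal Java sketch over a handful of made-up transactions; the items and numbers are purely illustrative, not from any real dataset.

import java.util.Arrays;
import java.util.List;

public class BeerDiapers {
    public static void main(String[] args) {
        // Hypothetical supermarket transactions, purely for illustration.
        List<List<String>> transactions = Arrays.asList(
                Arrays.asList("diapers", "beer", "bread"),
                Arrays.asList("diapers", "beer"),
                Arrays.asList("diapers", "milk"),
                Arrays.asList("bread", "milk"),
                Arrays.asList("diapers", "beer", "milk"));

        long both = transactions.stream()
                .filter(t -> t.contains("diapers") && t.contains("beer")).count();
        long withDiapers = transactions.stream()
                .filter(t -> t.contains("diapers")).count();

        // Support: fraction of all transactions containing both items.
        double support = (double) both / transactions.size();
        // Confidence of the rule {diapers} -> {beer}: of the baskets with diapers,
        // how many also contain beer.
        double confidence = (double) both / withDiapers;

        System.out.printf("support({diapers, beer}) = %.2f%n", support);        // 0.60
        System.out.printf("confidence(diapers -> beer) = %.2f%n", confidence);  // 0.75
    }
}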

Does this mean there is no proper analysis being done in the local context that could give us an anecdote of our own?


Friday, July 31, 2015

JME3 and Collision Detection

jMonkeyEngine (JMonkey) is a Java-based 3D game development engine, recognized in the developer community as JME3. For more information you can visit their website http://jmonkeyengine.org/




Their IDE is built on, and looks just the same as, the NetBeans environment.






I was just experimenting with JME3 to learn collision detection. For this experiment we're building a simple shooting game application. In the gameplay, the player needs to move his spaceship and shoot at the moving enemy objects before they collide with the spaceship. The main focus was to look at how collision detection techniques are used in a game environment.

This post is not about building the app itself, because you can follow this tutorial to create that state-of-the-art game app.
It explains how to set up sounds, manage scene objects and their behaviors, and all the other things a beginner needs, step by step, in an easily understandable way.

In this post we are focusing on detecting collisions. In a shooting game, the typical collisions are between the enemy and the player, the enemy and the bullets, or the bullets and the player. Since they decide the gameplay, it is important to handle them.

So in our experiment there are two types of collisions that are important to us: enemies with bullets fired by the player, and enemies with the player's spaceship.

In order for enemies to kill the player, we need to know whether an enemy and the spaceship are colliding. Likewise, to kill enemies, we check whether enemies and bullets are colliding.

Radius Checking


To detect whether there is a collision, the radius checking technique is one solution. First we calculate the distance between the center points of the two objects. Next, we need to know how close the two objects must be in order to be considered as having collided, so we take the radius of each spatial and add them together. If the actual distance is shorter than or equal to this maximum distance, they have collided.
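A minimal sketch of that check in JME3, assuming both spatials have a BoundingSphere set as their model bound; the class and method names here are my own, not taken from the tutorial:

import com.jme3.bounding.BoundingSphere;
import com.jme3.scene.Spatial;

public final class RadiusCheck {

    /** Returns true when the two spatials are close enough to be treated as colliding. */
    public static boolean collides(Spatial a, Spatial b) {
        // Distance between the two objects' centre points in world space.
        float distance = a.getWorldTranslation().distance(b.getWorldTranslation());

        // Maximum distance at which they still count as colliding:
        // the sum of the two bounding-sphere radii (assumes BoundingSphere model bounds).
        float maxDistance = ((BoundingSphere) a.getWorldBound()).getRadius()
                          + ((BoundingSphere) b.getWorldBound()).getRadius();

        return distance <= maxDistance;
    }

    private RadiusCheck() { }
}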



So if an enemy and the spaceship collide, it's game over and the game restarts.






Tuesday, July 28, 2015

Process Designing of Virtual Valipilla

Based on the literature, the technologies and processes necessary to build the gesture-based learn-to-write tool were identified as solutions to the problems that came up. This post summarizes the gathered knowledge about the selected processes and technologies, and how they should be applied.

The system can be identified as having four phases.


Writing Phase and UI

This is the phase where the user directly interacts with the system. The human-computer interaction (HCI) goes entirely through finger gestures. Apart from the writing interactions, which are the main focus, there can be other interactions such as button selection and navigation. An effective motion-based control should support the common functions of a 2D graphical user interface, such as pointing, selection and dragging. The natural user interface (NUI) metaphor generally used for motion-based remote pointing is the "virtual mouse in the air". The finger motion on a virtual plane parallel to the display should control the cursor correspondingly. The 2D cursor position on the screen is calculated by projecting, scaling, shifting, and clipping the 3D coordinates of the controller. The scaling factor is determined based on the screen resolution and the precision of the optical tracking. This can be easily handled with the Leap Motion Application Programming Interface (API).
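As a rough illustration of that projecting/scaling/shifting/clipping step, here is a small sketch that maps a 3D fingertip position (in sensor millimetres, as the tracking device would report it) to a clipped 2D cursor position. The writing-region bounds below are assumptions made up for the example, not values from the project:

public final class AirCursor {
    // Hypothetical bounds of the tracked writing region, in sensor millimetres.
    private static final float MIN_X = -150f, MAX_X = 150f;
    private static final float MIN_Y = 100f,  MAX_Y = 400f;

    /** Maps a 3D tip position to a clipped 2D cursor on a screenWidth x screenHeight display. */
    public static int[] toScreen(float x, float y, int screenWidth, int screenHeight) {
        float nx = (x - MIN_X) / (MAX_X - MIN_X);            // shift and scale to [0, 1]
        float ny = (y - MIN_Y) / (MAX_Y - MIN_Y);
        nx = Math.min(1f, Math.max(0f, nx));                 // clip to the writing region
        ny = Math.min(1f, Math.max(0f, ny));
        int sx = Math.round(nx * (screenWidth - 1));
        int sy = Math.round((1f - ny) * (screenHeight - 1)); // screen y grows downward
        return new int[] { sx, sy };
    }

    public static void main(String[] args) {
        int[] cursor = toScreen(0f, 250f, 1920, 1080);
        System.out.println(cursor[0] + ", " + cursor[1]);    // roughly the middle of the screen
    }

    private AirCursor() { }
}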

The UI plays a crucial role here. The main component of the UI is the drawing area. It is necessary to give the user real-time visual feedback of his finger movement while writing the letter, to avoid confusion. As mentioned in the previous post, we chose explicit delimitation for motion segmentation because our user base varies in writing habits; if automatic detection were used, it could give unpredictable results at the spotting stage. Since explicit delimitation is used for motion segmentation, the pen-up kind of motion should be kept from showing in the UI.
Attention should also be given to providing the user with dynamic feedback for every gesture action. The more feedback they have, the more precisely they can interact with the system. For example, if a user wants to push a button with a gesture, he would like to know in real time whether he is "pushing" the button. It is more effective if he can see when he is hovering over a button, or how far he has pressed it. Another way of amplifying the button experience for users is to provide a proximity-based highlighting scheme. This would highlight the item closest to the user's cursor, and a tap gesture could then activate it without the cursor having to actually be over it. Anticipating what the user might want in these contexts can save time and eliminate frustration.
The UI components should also be larger in size and well organized, so that they can be reached easily with a finger without much ergonomic difficulty.


Capturing Phase Alternate Solutions

The biggest problem comes with the gesture-writing capturing process. Due to the varying user base, our only option is to use explicit delimitation when capturing motion segments, for the reasons mentioned previously. Within explicit delimitation, there are several alternative solution methods, as listed below.

  • Key Press on the Keyboard

This is the easiest method to implement, but pressing a button/key and writing with a gesture do not go together. Our focus should be on implementing a controller-free system which has no buttons and requires a different, easy-to-use form of delimiter for push-to-write. So this technique is not the best fit for our requirement, even though it is very easy to put into operation.

  • Breaking Gestures

This is like starting the stroke with a defined gesture (e.g. an angled-down finger) and ending the stroke with a defined opposite gesture (e.g. an angled-up finger).
Using specific postures or gestures to break letter strokes may require a little more user training than the plane-penetration method described next.

  • Virtual Input Zone

The concept here is to define a drawing plane that the finger or tool (the drawing component) must cross to start the stroke and pull back from to stop the stroke. This can be as simple as checking whether the z-coordinate of the tip position is less than a certain value, which gives a vertical writing surface. With a bit more complex geometry, a plane tilted away from the user at a more comfortable angle could also be defined. The end of a letter stroke could be based on withdrawal from the drawing area, and a pause in motion on the plane could also trigger the start and end of letter strokes.

Virtual Touch Emulation Plane
Compared to the key-press approach and the breaking-gesture approach, defining a virtual input zone was concluded to be the most suitable approach to move forward with.
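A minimal sketch of the simplest version of this scheme, a vertical plane at a fixed z value; the class name and the sample values are illustrative only, not taken from the project code:

public final class PlaneDelimiter {
    private final float planeZ;       // z value (mm) of the virtual writing plane
    private boolean penDown = false;

    public PlaneDelimiter(float planeZ) { this.planeZ = planeZ; }

    /** Returns true while the fingertip has crossed the plane (a stroke is in progress). */
    public boolean update(float tipZ) {
        if (!penDown && tipZ < planeZ) penDown = true;        // finger pushed through: stroke starts
        else if (penDown && tipZ >= planeZ) penDown = false;  // finger withdrawn: stroke ends
        return penDown;
    }

    public static void main(String[] args) {
        PlaneDelimiter delimiter = new PlaneDelimiter(0f);           // plane at z = 0 mm
        float[] zSamples = { 40f, 10f, -5f, -20f, -8f, 15f, 60f };   // fingertip approaching, writing, withdrawing
        for (float z : zSamples) {
            System.out.println("z=" + z + " writing=" + delimiter.update(z));
        }
    }
}

update() would be called once per tracking frame with the fingertip's z value; while it returns true, the sampled points belong to the current stroke, and the change from true back to false marks the end of the letter stroke.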


 Recognition Phase

The capturing phase and the recognition phase are mostly seen as a combined process.

Gesture Character Recognition Process


The '$P Point Cloud' recognizer, the latest of the '$-family', was selected as the recognition technique for the reasons mentioned in the Introduction post.

The $P recognizer treats a gesture as a cloud of points.
Point Cloud

Unlike other recognition techniques, in this method the gesture execution timeline is discarded, so the number of strokes, the stroke ordering, and the stroke direction become irrelevant. The task of the recognizer is to match the point cloud of the candidate gesture to the point cloud of each template in the training set and compute a matching distance.

Following the nearest-neighbor approach, the template located at the smallest distance from the candidate gesture delivers the classification result.

$P Point Cloud Process
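To make the matching-distance idea concrete, here is a much-simplified sketch of cloud matching plus a nearest-neighbour template search. It is not the published $P pseudo code (which resamples both clouds to a fixed number of points, normalizes them, and runs a weighted greedy alignment from several start points in both directions); it only shows how a timeline-free cloud distance and nearest-neighbour classification fit together, assuming both clouds were already resampled to the same size:

import java.util.List;

public class CloudMatchSketch {

    public static class Point {
        final double x, y;
        public Point(double x, double y) { this.x = x; this.y = y; }
        double dist(Point p) { return Math.hypot(x - p.x, y - p.y); }
    }

    public static class Template {
        final String name;
        final List<Point> cloud;
        public Template(String name, List<Point> cloud) { this.name = name; this.cloud = cloud; }
    }

    /** Greedy one-directional cloud distance: each candidate point is matched to its
     *  closest still-unmatched template point. Both clouds are assumed to have been
     *  resampled to the same number of points. */
    static double cloudDistance(List<Point> candidate, List<Point> template) {
        boolean[] matched = new boolean[template.size()];
        double total = 0;
        for (Point c : candidate) {
            double best = Double.MAX_VALUE;
            int bestIndex = -1;
            for (int i = 0; i < template.size(); i++) {
                double d = c.dist(template.get(i));
                if (!matched[i] && d < best) { best = d; bestIndex = i; }
            }
            matched[bestIndex] = true;
            total += best;
        }
        return total;
    }

    /** Nearest-neighbour classification: the template at the smallest distance wins. */
    static String classify(List<Point> candidate, List<Template> templates) {
        String bestName = null;
        double bestDistance = Double.MAX_VALUE;
        for (Template t : templates) {
            double d = cloudDistance(candidate, t.cloud);
            if (d < bestDistance) { bestDistance = d; bestName = t.name; }
        }
        return bestName;
    }
}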

Feedback Phase

This is the phase where the user receives feedback on his letter writing. The response message should tell the user whether he has performed the writing well or not. If the system can verify, through the recognition process, whether the intended letter was actually written by the user, then that information should be given to the user.

Segmenting the Motion for Virtual Valipilla

In this series of posts I'm going to cover the project background: the techniques and tools that can be used for the air-finger-writing process. This post covers the motion segmentation techniques that can be used for the implementation.

Motion Segmentation

We are using a vision input device that constantly streams the locations of the fingers within its field of view. So the data stream contains both the intended control motion and other extraneous motions that do not correspond to any control motion. Pen-up/pen-down kinds of gestures cannot easily be identified here as in normal handwriting; essentially, the user input has no known beginning and end, and there is no predefined gesture that indicates where character writing starts and stops. That is why there should be a motion segmentation mechanism in place to accurately identify the starts and ends of the letter strokes as well as finger pauses.

There are two paradigms for motion segmentation:


1.      Explicit Delimitation
2.      Automatic Detection.



Explicit Delimitation

Explicit delimitation can be easily accomplished with a push-to-write scheme, where the user holds a button to start and releases it to stop. A common alternative is to replace the button with a specific posture or gesture to signal the end points of the intended control motion. Another approach is to define an input zone for gesture delimitation [8]: when the user reaches out to write in the air, the explicit delimitation is done by thresholding the depth information. Other scenarios have been proposed, including one where an LED pen is tracked in the air. This allows 3D data to be interpreted and makes sure that the beginning and end of the input are clearly defined. But our focus is to develop a low-cost, easy-to-use tool, without using any other gadget except the sensor device.

Automatic Detection

This uses a spotting technique that requires no intentional delimitation. The system automatically detects the intended motion and segments the writing correspondingly. If it functions perfectly, automatic detection can make the air-writing experience more convenient, especially for controller-free systems. Detection of motion gestures can be very difficult when the defined gesture has a trajectory similar to other control motions. In such a case, the motion itself may not contain enough information for automatic detection, and push-to-write is more robust and accurate for motion segmentation. However, since writing motion is quite different from other general control motions, robust detection of air-writing is possible. Typically, unsupervised learning and data-driven approaches are used in this kind of method.

Amma et al. [9] proposed a spotting algorithm for air-writing based on acceleration and angular speed from inertial sensors attached to a glove. It is a two-stage approach to the spotting and recognition task.


Automatic Detection Employed Process

In the spotting stage, they use a binary Support Vector Machine (SVM) classifier to discriminate motion that potentially contains handwriting from motion that does not. As illustrated, two ways of writing the letter 'N' and the letter 'H' are shown. The red portions are the segments recognized by the spotting stage as motion to be ignored; only the grey segments should be input to the next stage.





The time-window-based approach used in Chen's work [2] is similar to this method, where potential non-writing motions are admitted in the capturing stage and a filler model is later used to handle them. The window-based detector is only responsible for determining whether a writing event occurs in the window, which is slid through the continuous motion data. A writing event usually involves sharp turns, frequent changes in direction, and complicated shapes rather than a drift or swipe motion. The sliding window has to be long enough to capture these writing characteristics to distinguish a writing event. It is straightforward to mark a window that contains only tiny motion as a non-writing event, and such "silent" windows need to be skipped from further processing.
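As a toy illustration of this windowing idea (not Chen's actual detector), the sketch below slides a fixed-length window over a sequence of 2D fingertip positions, skips "silent" windows with almost no motion, and flags a window as a potential writing event when it contains enough sharp direction changes. The window size and thresholds are invented for the example:

import java.util.ArrayList;
import java.util.List;

public class WindowSpotter {

    /** Returns the start indices of windows that look like writing events.
     *  points[i] = {x, y} for the i-th tracked fingertip sample. */
    static List<Integer> spot(double[][] points, int windowSize, int step,
                              double minPathLength, int minSharpTurns) {
        List<Integer> writingWindows = new ArrayList<>();
        for (int start = 0; start + windowSize <= points.length; start += step) {
            double pathLength = 0;
            int sharpTurns = 0;
            for (int i = start + 1; i < start + windowSize; i++) {
                double dx = points[i][0] - points[i - 1][0];
                double dy = points[i][1] - points[i - 1][1];
                pathLength += Math.hypot(dx, dy);
                if (i > start + 1) {
                    double pdx = points[i - 1][0] - points[i - 2][0];
                    double pdy = points[i - 1][1] - points[i - 2][1];
                    double angle = Math.abs(angleBetween(pdx, pdy, dx, dy));
                    if (angle > Math.PI / 2) sharpTurns++;   // direction changed by more than 90 degrees
                }
            }
            if (pathLength < minPathLength) continue;        // "silent" window: skip it
            if (sharpTurns >= minSharpTurns) writingWindows.add(start);
        }
        return writingWindows;
    }

    /** Signed angle between two 2D direction vectors. */
    static double angleBetween(double ax, double ay, double bx, double by) {
        return Math.atan2(ax * by - ay * bx, ax * bx + ay * by);
    }
}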

Almost all automatic detection techniques involve a time variable to distinguish writing motion from non-writing motion. So, if automatic detection techniques are used, careful consideration should be given to whether the user base can depend on this time variable.


References

[2]
Mingyu Chen, "Universal Motion Based Control," Ph.D. dissertation, School of Electrical and Computer Engineering, Georgia Institute of Technology, 2013.
[8]
Per Ola Kristensson, Thomas F.W. Nicholson, and Aaron Quigley, "Continuous recognition of one-handed and two-handed gestures using 3D full-body motion tracking sensors," in Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces, Lisbon, Portugal, 2012, pp. 89-92.
[9]
Christoph Amma, Marcus Georgi, and Tanja Schultz, "Airwriting: Hands-free Mobile Text Input by Spotting and Continuous Recognition of 3D-Space Handwriting with Inertial Sensors," in Proceedings of the 2012 16th Annual International Symposium on Wearable Computers (ISWC), Washington, DC, USA, 2012, pp. 52-59.



Vision Sensor Devices for Virtual Valipilla

In this series of posts I'm going to cover the project background: the techniques and tools that can be used for the air-finger-writing process. This post covers the vision sensor devices that can be used for the implementation.


Vision Sensor Devices

Human-computer interaction (HCI) can go far beyond typing on the keyboard, moving the mouse, and touching the screen. Recent advances in computer vision technologies enable hand gesture recognition and finger movement tracking. Optical tracking is the most competitive among all tracking technologies for use in motion-based user interfaces. For example, vision-based optical tracking uses computer vision techniques to recognize the tracking target from the scene, which can be the user's head, hand, or body. The most successful vision-based optical tracking system, the Microsoft Kinect sensor, is very popular in the development community. Combining a simple RGB camera and a depth camera, the Kinect is able to track the human body in 3D. But it is an expensive approach for our purpose because its main focus is full-body motion [6].

As our requirement is just the tracking of finger movements, the 'Leap Motion Controller' device is the ideal, less expensive solution. It is a small USB peripheral device specifically designed to track hands, fingers, and stick-like objects such as a pen or chopstick precisely in a desktop environment. The 3D positions of the fingers are captured by this computer vision device at over 100 frames per second (fps) over a USB 2.0 port, or about 150 fps over a USB 3.0 port. Therefore, it is possible to track the detailed position of each finger precisely and efficiently. Compared to the Kinect, it is much smaller in size (just 3" long) and lightweight, does not take up much space on the desk, and can be carried anywhere very easily, so portability is ensured.

Leap Motion Controller

To work with the Leap Motion sensor device, it only needs to be plugged into a computer via USB; no special adapters are required. It does not replace the keyboard, mouse, or any other input device, so it can work with gesture input and other kinds of input in parallel. The Leap Motion sensor can send information about hands, fingers, and even real cylindrical pencils it sees in its field of view. It determines the locations of fingers within the field of view, and the existence and position of any "tool", such as a cylindrical non-transparent pen or pencil [7].


The Leap Motion Controller tracks all 10 fingers to an accuracy of up to 1/100th of a millimeter. It is dramatically more sensitive than other motion control technologies, which is a strong benefit for gesture-based writing.
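As a minimal sketch of reading fingertip positions (assuming the Leap Motion Java SDK of that era, com.leapmotion.leap, is on the classpath), a listener that prints the frontmost fingertip on each frame might look like this:

import com.leapmotion.leap.Controller;
import com.leapmotion.leap.Finger;
import com.leapmotion.leap.Frame;
import com.leapmotion.leap.Listener;
import com.leapmotion.leap.Vector;

public class TipListener extends Listener {

    @Override
    public void onFrame(Controller controller) {
        Frame frame = controller.frame();
        if (!frame.fingers().isEmpty()) {
            // Take the finger closest to the screen as the "pen" tip.
            Finger finger = frame.fingers().frontmost();
            Vector tip = finger.tipPosition();   // millimetres, relative to the device
            System.out.printf("tip: x=%.1f y=%.1f z=%.1f%n",
                    tip.getX(), tip.getY(), tip.getZ());
        }
    }

    public static void main(String[] args) throws java.io.IOException {
        Controller controller = new Controller();
        TipListener listener = new TipListener();
        controller.addListener(listener);
        System.out.println("Press Enter to quit...");
        System.in.read();                        // keep the process alive while frames stream in
        controller.removeListener(listener);
    }
}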

References

[6]
(2015, July) Microsoft Kinect for Windows. [Online]. https://www.microsoft.com/en-us/kinectforwindows/develop/default.aspx
[7]
(2015, July) Leap Motion Developer Portal. [Online]. https://developer.leapmotion.com/documentation/java/index.html

Background of Virtual Valipilla

In this series of posts I'm going to cover the project background: the techniques and tools that can be used for the air-finger-writing process. This post covers the gesture-based character recognition techniques that can be used for the implementation.

BACKGROUND

Air-writing recognition enables a user to input text by writing in the air. Unlike conventional pen-based handwriting, air-finger-writing renders characters written with a fingertip or the tip of a stylus on a virtual plane, without haptic feedback. It involves no physical plane to write on and has no direct pen-up/pen-down information. In other words, air-writing is uni-stroke and different from ordinary handwriting, so conventional offline handwriting recognition techniques cannot be applied directly. This type of research deals with interpreting the data while it is being generated, which is very different from the process of a typical OCR system. Most of the research carried out in this area therefore aims at finding better techniques to recognize this uni-stroke writing on a virtual plane; in simple terms, it is like building a gesture-based writing tool, which is the most general use in this research domain. Our focus, however, is not on building a writing tool but on building a learn-to-write tool.

Gesture Based Character Recognition

In building this learn-to-write tool, our consideration is just a single character rather than a whole word. However, this single character should be verified: that it is written properly and with the correct component proportions. Also, we cannot expect the characters to be uni-stroke only; they can contain multiple strokes.

When designing a recognizer, a trade-off is usually made between personalization and generality, so there are two extreme cases: user-dependent and user-independent gesture recognition. Even with a predefined gesture vocabulary, robust user-independent gesture recognition can be very challenging due to the large variations among different users.

Considering these aspects, the following approaches were surveyed in order to find a suitable one for the online recognition of a character from a stream of 3D coordinate points produced by finger gestures.

  •  Dynamic Time Warping

It is an effective algorithm based on dynamic programming. By treating the input as a time series of 3D positions, Dynamic Time Warping (DTW) algorithms can be used to recognize characters. The task of identifying characters in a time series requires data to train and test on. Vikram et al. [1] proposed an approach in which their algorithm searches for similar character-writing sequences in the training database using DTW. Their first goal is to have a similarity search algorithm, as the pseudo code depicts. The similarity search sweeps across the data time series, checking every sub-sequence against the candidate and returning the best match. Both the candidate and all sub-sequences are z-normalized in the process.

DTW Process


The dynamic time warping algorithm is used as a similarity metric between vectors. It is a generalization of the Euclidean distance metric but chooses the closest point within a certain time window, rather than creating a one-to-one mapping of points. When the time window is 0, DTW reduces to Euclidean distance. 
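A small sketch of that metric for two equal-length sequences of 2D points, with a Sakoe-Chiba style window w; when w = 0 only the diagonal alignment is allowed, which is the one-to-one (Euclidean-style) matching mentioned above:

public class DtwSketch {

    /** DTW distance between two sequences of 2D points,
     *  restricted to a window of width w around the diagonal. */
    static double dtw(double[][] a, double[][] b, int w) {
        int n = a.length, m = b.length;
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) java.util.Arrays.fill(row, Double.POSITIVE_INFINITY);
        cost[0][0] = 0;

        for (int i = 1; i <= n; i++) {
            int jStart = Math.max(1, i - w);
            int jEnd = Math.min(m, i + w);
            for (int j = jStart; j <= jEnd; j++) {
                double d = Math.hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1]);
                // A point may align diagonally, repeat, or be skipped, whichever is cheapest so far.
                cost[i][j] = d + Math.min(cost[i - 1][j - 1],
                                 Math.min(cost[i - 1][j], cost[i][j - 1]));
            }
        }
        return cost[n][m];
    }

    public static void main(String[] args) {
        double[][] a = { {0, 0}, {1, 1}, {2, 0} };
        double[][] b = { {0, 0}, {1, 2}, {2, 0} };
        System.out.println(dtw(a, b, 0));  // window 0: pointwise (Euclidean-style) matching, 1.0
        System.out.println(dtw(a, b, 1));  // window 1 also considers warped alignments; the diagonal still wins here
    }
}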


Letter Stroke Distance Measuring Techniques


This kind of character recognition technique gives its best results when the handwriting belongs to a frequent writer, so DTW can be useful for personalized gesture recognition. But our requirement focuses on language learners at different proficiency levels and ages, so the time people take to write a letter may vary significantly.

  • Hidden Markov Model

The statistical approaches to this problem use Hidden Markov Models (HMMs), or a combination of HMMs and neural networks, to recognize characters. The HMM is efficient at modeling a time series with spatial and temporal variations, and has been successfully applied to gesture recognition.

Depending on the tracking technology in use, the features (observations) for the HMMs may vary, including position, moving direction, acceleration, etc. Chen [2] proposed an approach using an HMM-based recognizer which uses the two-dimensional (2D) position and velocity on the xy-plane as the feature vector for the HMMs. The raw sensor signals may need proper normalization to make the recognizer scale- and speed-invariant, or quantization to handle the variations of gestures, especially in the user-independent case.

  • Artificial Neural Network
Much of the research uses an Artificial Neural Network (ANN) as the classifier in the gesture recognition process. An ANN can be trained with known examples of a problem before it is tested for its inference capability on unknown instances of the problem. It possesses the capability to generalize the given set of data; as a result, it can predict new outcomes from past trends. The system proposed by Joshi et al. [3] feeds preprocessed data, such as smoothed, duplicate-removed, spatially normalized data, into the ANN model. They used a back-propagation neural network (NN) with one input layer, two hidden layers and one output layer for handwriting recognition, as shown.


Artificial Neural Network


The larger the number of training cycles, the better the resulting mean square error, which in general indicates a better test result; however, more training cycles also take more training time. A very important feature of these networks is their adaptive nature, where programming is replaced by learning in solving problems. They are robust, fault tolerant and can recall full patterns from partial, incomplete or noisy patterns.

  • Hilbert Warping Method

Ishida et al. [4] proposed an alignment method for gesture-based handwriting recognition called Hilbert Warping as a solution to the problems that occur with DTW. As they state, DTW has a drawback for the classification task because it always looks for the best alignment with the reference sequences of all categories, and they suggest misclassification can occur due to over-fitting to incorrect categories.

In their proposed method, the input sequence is aligned to the reference sequences by phase-synchronization of the analytic signals, and then classified by comparing the cumulative distances. A major benefit of this method is that over fitting to sequences of incorrect categories is restricted. The proposed method exhibited high recognition accuracy in finger-writing character recognition.

  • Data-driven Template Matching

The $1 recognizer and its variants $N and $P [5] are based on template matching. Unlike DTW, which relies on dynamic programming, these algorithms process the trajectory with re-sampling, rotation, and scaling, and then match the point-paths against the reference templates. These recognizers are simple to implement, computationally inexpensive, and require only a few training samples to function properly for personalized recognition. However, for user-independent recognition, a significant number of templates is needed to cover the range of variations. The research space still lacks work that uses the $-family (dollar family) for character recognition, as most research has aimed at building writing tools that focus on words and sentences rather than a single character. The $-family is best suited for recognizing a single character or symbol written with a finger gesture.

State-of-the-art gesture recognition techniques, such as HMMs, feature-based statistical classifiers or mixtures of classifiers, typically require significant technical knowledge to understand and adapt to new platforms, or else knowledge from other fields (e.g. graph theory). The $-family, in contrast, proposes low-cost, easy-to-understand and easy-to-implement, yet high-performing, gesture recognition approaches. It involves only simple geometric computations and straightforward internal representations. Furthermore, the algorithms are highly accessible through the published pseudo code, which developers can use on their own platforms.

References

[1]
Sharad Vikram, Lei Li, and Stuart Russell, "Writing and sketching in the air, recognizing and controlling on the fly," in ACM Conference on Human Factors in Computing Systems (CHI), 2013.
[2]
Mingyu Chen, "Universal Motion Based Control," Ph.D. dissertation, School of Electrical and Computer Engineering, Georgia Institute of Technology, 2013.
[3]
Aditya G. Joshi, Darshana V. Kolte, Ashish V. Bandgar, Aakash S. Kadalak, and N. S. Patil, "Touchless Writer: a hand gesture recognizer for English characters," in Proceedings of the 22nd IRF International Conference, Pune, India, January 2015.
[4]
Hiroyuki Ishida, Tomokazu Takahashi, Ichiro Ide, and Hiroshi Murase, "A Hilbert warping method for handwriting gesture recognition," Pattern Recognition, vol. 43, pp. 2799-2806, August 2010.
[5]
Radu-Daniel Vatavu, Lisa Anthony, and Jacob O. Wobbrock, "Gestures as Point Clouds: A $P Recognizer for User Interface Prototypes," in Proceedings of the 14th ACM International Conference on Multimodal Interaction, Santa Monica, California, USA, 2012, pp. 273-280.



Monday, July 27, 2015

Interim Report - "Done" and here goes the Introduction



Yay!!!!! Completed and submitted the Interim Report, which finally arrives with the Literature Review.


Virtual “Valipilla” - Finger Gesture Based Learn to Write English Alphabet Tool



INTRODUCTION

Recent vision-based sensor technologies are competent enough to capture finger positions and movements. This allows people to write something in the air, even with a finger, in a way that can be properly recognized by a computerized platform. So this research project was carried out considering these two aspects in the domain of learning a language; more specifically, writing the alphabet.

Motivation for the Project

Most educationists all over the world suggest that, for a kindergartner, the first tool used to teach writing, more specifically the letters of the alphabet, should not be a pencil, pen, marker or crayon. What they recommend is that the child should first use their fingers to move along the path of the letter. That is why, in Sri Lankan culture, people had a tool called the "valipilla" (වැලිපිල්ල) to teach how to write. It is an ideal multi-sensory approach to teaching the alphabet, because the muscle movement helps the child process what they are writing and makes it more likely to stick in their minds.

Traditional Sri Lankan ‘Valipilla’


With this traditional approach there always needs to be an instructor behind the child to provide guidance on how to write. The instructor has to show them how to write correctly, verify whether they are writing correctly, and show them the correct way if they are wrong. So his presence is always necessary to get an effective outcome.

Problem Statement

So, can we automate this verification process mentioned above? If so, the instructor would not need to always attend to the learning activity, and the child could carry on by himself. To answer that automation question we have to find solutions for two kinds of problems: how to accurately capture the finger movement, and how to verify whether it is the correct movement that we expected.

If a computer system could track the finger movement trajectory precisely with vision sensors, the first problem would be solved. Then we could use this trajectory input to recognize an alphabetical character when someone writes with a finger, and the second problem would be solved.

Application of the Research

So with the approach mentioned above, we could easily replace the physical instructor and provide a finger-gesture-based virtual platform for the child to improve his writing ability all on his own. In other words, it would add higher value to the child education system by having an automated application to check whether the child has written the letters of the alphabet in the correct way, and thereby provide guidance to write them correctly. This kind of tool can be useful not only for child education but also for an adult learning a new language. Apart from alphabet-learning applications, this research can be further extended into a finger-gesture-based writing tool.

Aim and Objective

The main objective of this project is to research a way to implement an accurate virtual platform for practicing the writing of English alphabetical letters using finger gestures, and to provide a developed tool as a proof of concept. The aim is also to provide a prototype tool that can act as a supportive platform in the education system to improve writing ability when learning a new language.

Project Scope

The scope of the project is bound to the following areas, mainly targeting the development of a finger-gesture-based learn-to-write tool for the English alphabet.
  • Capturing Finger Gesture
There should be a process to efficiently capture the index finger's gesture movements and correctly identify gesture pauses in letter strokes. A suitable technique should be employed to mark the starting and ending points of the strokes.
  • Recognition of Alphabetical Letters
There should be a process to match the finger gesture input trajectory data with the expected trajectories. A literature survey was conducted to find a suitable recognition technique to perform this task.
  • Gesture Writing and User Interface
There should be an interactive user interface (UI) with a drawing area in which to write the letter. When designing the UI it is important to keep in mind that the interaction is done with a finger in the air, not with a mouse on a surface. The activities should be designed using auditory and visual aids.
  • Display Feedback
There should be a method to output the results of the recognition process to the user in a presentable way. This should include whether the user has written the character correctly or not.

Novelty and Contribution of the Research

There are lots of approaches to recognizing handwritten characters. The challenging part is combining those approaches to recognize something written in the air with a finger gesture, and also capturing those finger gesture trajectories by accurately identifying the start and end of each letter stroke. The related work carried out in this research space still lacks accuracy. A huge share of the performance depends on the recognition technique and the letter-stroke capturing mechanism used. There are a few related research projects which have used different recognition techniques and capturing mechanisms, but the focus of almost all of them was to build an air-writing tool, not to make it a supportive tool for improving writing ability. So our expectation of this research is to contribute to the learning area.

Tuesday, July 21, 2015

Computer Vision and Chocolate Packaging

A video camera has been installed at a known height above a production line of chocolates.

There are only 3 types of chocolates;

  • Heart Shape
  • Round Shape
  • Square Shape


By determining the type of a chocolate, the robotic arm can pick it up and place it in the relevant position in the chocolate box.

So here we are going to implement a computer vision system to solve this packaging problem. But before moving on, let's see a real chocolate packaging environment.



Okay!! To recognize the shape of the chocolates on the production line we are using contours.
Here is what our camera has captured. After all the chocolates in the current tray are taken, the belt moves forward to capture the next set.



So our first step should be to prepare the image so that it is suitable for finding contours. To do that we follow the steps below (a minimal OpenCV sketch of these steps comes after the images).


  1. Load the image.
  2. Convert to Gray-scale
  3. Detect Edges (use Canny Operator)

Grey-Scale Image






Edge Detected Image
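A minimal sketch of those three preparation steps using the OpenCV Java wrapper (the same OpenCV 2.4 stack used in the button-hole post below); the file names and Canny thresholds are placeholders, not values from the actual setup:

import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.highgui.Highgui;
import org.opencv.imgproc.Imgproc;

public class ChocolatePreprocess {
    public static void main(String[] args) {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);

        // 1. Load the image (placeholder file name).
        Mat image = Highgui.imread("chocolate_tray.jpg");

        // 2. Convert to gray-scale.
        Mat gray = new Mat();
        Imgproc.cvtColor(image, gray, Imgproc.COLOR_BGR2GRAY);

        // 3. Detect edges with the Canny operator (thresholds chosen for the example).
        Mat edges = new Mat();
        Imgproc.Canny(gray, edges, 50, 150);

        Highgui.imwrite("chocolate_edges.jpg", edges);
    }
}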


Now our image is ready for finding the contours. Using these contour points, we can easily find the polygons covered by the contours. By identifying the center of each polygon, we can easily move our robotic arm to pick up the chocolate.



The next thing is to identify the type of the chocolate.
There are only three types, and each type has its specific attributes.

Polygons of the Shapes


For example, a square has only four sides, so the polygon covering the square shape has only four sides. If we reduce the smoothness of the circular polygon, it will have a specific number of sides; in our case we reduce it to 8 sides. Since the heart shape is not a convex object, the polygon obtained for it would not be the same every time. Anyway, that is not much of a problem, because we can treat every shape other than the circle and the square as a heart shape.
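A rough sketch of that classification, continuing from the edge image above: find the contours, approximate each one as a polygon, count its sides to decide the type, and take the contour's centroid as the pick-up point for the arm. The epsilon factor and the side-count cut-offs are assumptions made for the example, not the values actually used:

import java.util.ArrayList;
import java.util.List;

import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint;
import org.opencv.core.MatOfPoint2f;
import org.opencv.imgproc.Imgproc;
import org.opencv.imgproc.Moments;

public class ChocolateClassifier {

    /** Classifies each contour in the edge image as SQUARE, ROUND or HEART. */
    public static void classify(Mat edges) {
        List<MatOfPoint> contours = new ArrayList<>();
        Imgproc.findContours(edges.clone(), contours, new Mat(),
                Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);

        for (MatOfPoint contour : contours) {
            MatOfPoint2f curve = new MatOfPoint2f(contour.toArray());
            MatOfPoint2f approx = new MatOfPoint2f();
            double epsilon = 0.02 * Imgproc.arcLength(curve, true);  // smoothness factor (assumed)
            Imgproc.approxPolyDP(curve, approx, epsilon, true);

            long sides = approx.total();
            String type;
            if (sides == 4) type = "SQUARE";
            else if (sides >= 8) type = "ROUND";   // a near-circular polygon keeps many sides (rough cut-off)
            else type = "HEART";                   // everything else is treated as the heart shape

            // Centroid of the contour: where the robotic arm should pick the chocolate up.
            Moments m = Imgproc.moments(contour);
            double cx = m.get_m10() / m.get_m00();
            double cy = m.get_m01() / m.get_m00();

            System.out.printf("%s at (%.0f, %.0f)%n", type, cx, cy);
        }
    }
}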


Detected Shapes

At the end we can obtain our log report, like this.


Monday, July 20, 2015

Image Processing and Button Hole Problem

A large-scale manufacturing company produces buttons which are exactly identical in shape except for the number of holes.



What you have to do is to develop a Computer Vision system.

It should acquire an image of a single button.

Then it should determine whether the button has 6, 4, 2, or 0 holes. Buttons with no holes are defects.

After identifying the number of holes, the button should be directed to a separate box according to that count.

Before moving on to the implementation, let's watch the process at a button manufacturing company.




As you can see, in a normal button process, each type of button is produced separately. Actually there is no need to classify buttons by their holes, because creating 2 holes, 4 holes and 6 holes are separate processes, and the vision system is only used to identify defects. Anyway, let's assume they are all mixed in the tray.

Okay!! Now let's begin our implementation.

Here is our button tray as captured from the camera.



This is the system we propose.



The robotic arm can move across our button tray. It can also move down and pick a button up. It releases the button when it moves over the correct basket the button belongs to. When the button tray is empty, the tray moves forward to identify the next set of buttons.

Let's begin our button type recognition process. To do this we're using OpenCV 2.4 with Java wrapper.

First we thought of using the Hough Circle Transform to identify the holes in the buttons. But the results are not always accurate with that approach; we can say it gives around 85% at best, whereas what we expect is 99% accuracy. Therefore we went for another approach.

Step 1

Our first step is to load the image and convert it to gray scale.



Step 2

Then we blur the image to reduce the noise. Here we're using the Gaussian blur technique.



Step 3


Then we detect the edges. Here we're using the Canny edge detector.


 Step 4


Now, with our edge-detected image, we are ready to segment the button regions. For that we're using contours.



The robotic arm moves along each of the button regions. As the boundary is taken as both a square and a circular region, the robotic arm can easily pick up the button.

Step 5


The next step is recognition of the holes. For that we're again using contours within each of the identified segments.
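A sketch of this hole-counting step for one segmented button, using the contour hierarchy rather than the Canny edge image: with RETR_CCOMP, every contour that has a parent is a hole inside an outer boundary. It assumes a binary image of the cropped button region (button in white, background and holes in black), e.g. produced by thresholding; the small area cut-off is an arbitrary value used only to ignore noise specks:

import java.util.ArrayList;
import java.util.List;

import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint;
import org.opencv.imgproc.Imgproc;

public class HoleCounter {

    /** Counts the holes of a single button, given a binary image where the
     *  button is white and the background (and holes) are black. */
    public static int countHoles(Mat binaryButton) {
        List<MatOfPoint> contours = new ArrayList<>();
        Mat hierarchy = new Mat();
        // RETR_CCOMP gives a two-level hierarchy: outer boundaries and the holes inside them.
        Imgproc.findContours(binaryButton.clone(), contours, hierarchy,
                Imgproc.RETR_CCOMP, Imgproc.CHAIN_APPROX_SIMPLE);

        int holes = 0;
        for (int i = 0; i < contours.size(); i++) {
            double[] info = hierarchy.get(0, i);   // {next, previous, firstChild, parent}
            boolean hasParent = info[3] != -1;     // a child contour sits inside the button: a hole
            if (hasParent && Imgproc.contourArea(contours.get(i)) > 10) {  // ignore noise specks
                holes++;
            }
        }
        return holes;  // expected to be 6, 4, 2 or 0; 0 marks a defective button
    }
}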


By identifying which basket the button belongs to, the robotic arm can easily release the attached button into that basket. When a basket is filled, we can easily swap in another basket.

Here is the log of the current button tray. As this button tray has been fully checked, the tray can be moved forward to check the next set.