Zinnia
Saturday, September 20th, 2008
In my last post, I wrote about an impressive Chinese character recognition demo that uses AJAX on the client side and Support Vector Machines (SVM) on the server side for the recognition process. I don’t know if it’s just a coincidence (the demo is two years old), but last week Taku Kudo released the backend he uses as free software. Needless to say, this was awesome news for me! I know the basic principle of SVM, but I guess it’s time to learn more about it…
His project, called Zinnia, has been rewritten from scratch to be more flexible and reusable. Models for Japanese and Chinese are included but models for other languages can be built easily provided that you have training data. I’m pretty sure that this package could also be useful for Gesture Recognition because it’s so close to Handwriting Recognition…
For the sake of comparison, I wanted to evaluate how Zinnia performs against both Tomoe and my own HMM experiment. I used the same evaluation corpus that I described in earlier posts, that is, two sets of 50 kanji written by a Japanese friend of mine and me. The characters have the correct stroke order and were drawn carefully. The results below therefore show how the recognizers perform in ideal conditions and say nothing about how robust they would be in more difficult conditions.
Tomoe - Zinnia
| | Tomoe | Zinnia |
|---|---|---|
| 1st match accuracy | 61% | 77% |
| 5 matches accuracy | 74% | 92% |
| 10 matches accuracy | 74% | 93% |
| Recognition time per character | 21 s / 100 = 0.21 s | 3 s / 100 = 0.03 s |
| Total number of kanji | 3000 | 3000 |
1st match accuracy is the percentage of characters for which the correct answer was the first match.
5 matches accuracy is the percentage of characters for which the correct answer was among the first 5 matches.
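To make these definitions concrete, here is a minimal sketch of how such an n-matches accuracy can be computed. It is not my actual evaluation script: the function name and the sample data are made up for illustration, and each recognizer is assumed to return a list of candidates ranked from best to worst.

```python
def top_n_accuracy(results, n):
    """Percentage of characters whose expected answer is among the first n candidates.

    results: list of (expected_character, ranked_candidates) pairs.
    """
    if not results:
        raise ValueError("no results to evaluate")
    hits = sum(1 for expected, candidates in results
               if expected in candidates[:n])
    return 100.0 * hits / len(results)


# Hypothetical recognizer output for three characters (illustration only).
results = [
    ("日", ["日", "目", "白"]),  # correct answer is the 1st match
    ("木", ["本", "木", "未"]),  # correct answer is the 2nd match
    ("水", ["氷", "永", "小"]),  # correct answer is not in the list
]

for n in (1, 5, 10):
    print(n, round(top_n_accuracy(results, n), 1))
```

By construction the accuracy can only stay the same or go up as n grows, which is why the 5 and 10 matches rows are always at least as high as the 1st match row.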
You can download my evaluation script for Zinnia here. Tomoe’s evaluation script is in Tomoe’s SVN repository, in the benchmark/ folder.
A few remarks:
- Zinnia is notably more accurate than Tomoe (77% against 61% for the 1st match)
- Zinnia is about 7 times faster than Tomoe, making it a good candidate for an embedded platform
- In both cases, 5 matches and 10 matches accuracy are about the same, meaning that it would be enough for the user interface to display the first 5 matches only.
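The "about 7 times faster" figure comes straight from the timings in the table. As a quick check, here is the arithmetic as a small sketch (the function names are mine and are not part of the evaluation scripts):

```python
def per_character_time(total_seconds, n_characters):
    """Average recognition time for one character, in seconds."""
    return total_seconds / n_characters


def speedup(slow_total_seconds, fast_total_seconds):
    """How many times faster the second recognizer is on the same corpus."""
    return slow_total_seconds / fast_total_seconds


N_CHARACTERS = 100  # two sets of 50 kanji

print(per_character_time(21, N_CHARACTERS))  # Tomoe
print(per_character_time(3, N_CHARACTERS))   # Zinnia
print(speedup(21, 3))                        # Tomoe total / Zinnia total
```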
Project Tegaki - Zinnia
Due to a lack of training data, my personal HMM experiment (project Tegaki) was only conducted over a set of 50 characters, whereas Zinnia supports over 3000 characters. For a fair comparison, I therefore created new models for Zinnia using the same training data as I used for my experiment.
Zinnia was trained with only one sample per character, using the same data as Tomoe, which is template-based. While SVM seems to be able to cope with only one sample per character, it’s a little bit more complicated to do that with HMM because of the need to find the parameters of the Observation Probability Density Function (e.g. mean and variance for a Gaussian).
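To illustrate why a single sample is a problem for a Gaussian observation density, here is a minimal sketch. It is not the code of my HMM experiment: the function name and the feature values are invented, and a real system would work with feature vectors rather than a single number.

```python
import statistics


def fit_gaussian(samples):
    """Estimate (mean, variance) of a 1-D Gaussian from observed feature values.

    Returns None for the variance when it cannot be estimated,
    i.e. when fewer than two samples are available.
    """
    mean = statistics.mean(samples)
    if len(samples) < 2:
        return mean, None
    return mean, statistics.variance(samples)  # unbiased sample variance


# Invented feature values, for illustration only.
print(fit_gaussian([0.4]))            # one sample: a mean, but no variance
print(fit_gaussian([0.2, 0.4, 0.6]))  # several samples: both parameters
```

With one sample per character, the mean is known but the variance has to come from somewhere else, for example a fixed value or a value shared between models.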
| | Project Tegaki | Zinnia |
|---|---|---|
| 1st match accuracy | 92% | 100% |
| 5 matches accuracy | 100% | 100% |
| 10 matches accuracy | 100% | 100% |
| Recognition time per character | 14 s / 100 = 0.14 s | 1.50 s / 100 = 0.015 s |
| Total number of kanji | 50 | 50 |
A few remarks:
- My experiment is slow, which is probably due to the fact that I’m using Character-level models. Stroke-level models are known to scale much better.
- My experiment has slightly worse accuracy, which is probably because I’m only using two features per observation.
Handwriting database
If you follow my adventures in the world of handwritten Chinese character recognition, you probably know that I’m planning to create a handwriting database website. This database will aim to 1) make it easy and attractive for people to contribute their handwriting samples and 2) make it easy for the database staff to manage and organize what is supposed to become a large collection of handwriting samples.
The database will use a client/server architecture. So far I’m thinking of four important clients:
- A client that people will be able to use directly in their web browser, using my web canvas
- A client for the Maemo platform
- A client for the iPhone
- A multi-platform client for the desktop
A client of slightly lesser priority would be a Facebook application.
The handwriting samples collected will be distributed under a free software license. For projects like Zinnia or Project Tegaki, this will mean more training data and more ways to evaluate performance. I consider this database one of the priorities among my free software projects, but it’s going to be quite hard for me to find time for it before December…
Contribute
As always, more people are welcome to contribute.
To download the source code of my work:
$ git clone http://www.mblondel.org/code/hwr.git




