First HMM experiment
Today I’m publishing the initial results of my experiments on online handwriting recognition of Chinese characters, using Hidden Markov Models (HMMs). See my post on Tomoe Evaluation for some background.
Download
$ git clone http://www.mblondel.org/code/hwr.git
The code can be browsed online using gitweb.
See my memo on git if you don’t know it yet. I published my work under the GPL license.
Requirements
- Python (2.4)
- GHMM (SVN)
- Tomoe (SVN)
- Tomoe-GTK (SVN)
The Python bindings for the last three are also needed.
Folder structure
- data/ contains the raw training and evaluation data
- lib/ contains reusable components
- models/ contains model experiments
- tests/ contains test cases
- character-editor is the graphical interface to edit character data
- model-manager controls the training workflow, evaluation and the test pad
Each model must have an intelligible name and must define a file called model.py containing a class called Model. This class defines the behavior of the model. A model can inherit from other models in order to reuse common components. My first model is called “basic”, so its file is models/basic/model.py.
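To make the naming convention concrete, here is a minimal sketch of how a Model class could be looked up from its folder name. This is not the actual code of model-manager: the function name load_model_class is hypothetical, and the sketch uses the importlib module of recent Python versions rather than the Python 2.4 listed in the requirements.

```python
import importlib.util
import os


def load_model_class(models_dir, model_name):
    """Return the class named Model from <models_dir>/<model_name>/model.py.

    Hypothetical helper, only meant to illustrate the folder convention.
    """
    path = os.path.join(models_dir, model_name, "model.py")
    # The module name is arbitrary; it only has to be a unique string.
    spec = importlib.util.spec_from_file_location("models_" + model_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # raises an error if model.py is missing
    return module.Model
```

With this helper, load_model_class("models", "basic") would return the Model class defined in models/basic/model.py.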
First model
Here is some information regarding my first model.
- HMM unit: whole character
- Feature vectors: (deltax, deltay) with deltax = abs(x_t - x_{t-1}) and deltay = abs(y_t - y_{t-1})
- Number of states: 3 * number of strokes
- Initial state transitions: 0.5 to stay in the same state, 0.5 to jump to the next state
- Initial state alignment: feature vectors are segmented uniformly and segments are associated with their corresponding state
- Training: Baum-Welch
If you don’t understand any of the above, you should read more about HMMs ;) I may write an introduction on this journal if I have some time.
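The bullet points above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code from the repository: the function names are made up, a character is assumed to be a flat list of (x, y) pen positions, and the last state is assumed to loop on itself since it has no next state to jump to.

```python
def extract_features(points):
    """Turn a list of (x, y) pen positions into (deltax, deltay) vectors."""
    features = []
    for (x_prev, y_prev), (x, y) in zip(points, points[1:]):
        features.append((abs(x - x_prev), abs(y - y_prev)))
    return features


def initial_transitions(n_strokes, states_per_stroke=3):
    """Left-to-right transition matrix: 0.5 to stay, 0.5 to move on."""
    n_states = states_per_stroke * n_strokes
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i + 1 < n_states:
            matrix[i][i] = 0.5
            matrix[i][i + 1] = 0.5
        else:
            # Assumption: the final state can only stay where it is.
            matrix[i][i] = 1.0
    return matrix


def uniform_alignment(features, n_states):
    """Split the feature vectors into n_states contiguous segments
    of (nearly) equal length, one segment per state."""
    n = len(features)
    segments = []
    for state in range(n_states):
        start = state * n // n_states
        end = (state + 1) * n // n_states
        segments.append(features[start:end])
    return segments
```

For a one-stroke character, initial_transitions(1) gives a 3x3 matrix, and uniform_alignment(features, 3) gives the three segments that would be used to initialize the states before Baum-Welch training.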
Training workflow
model-manager’s usage is as follows:
./model-manager model-name command
My first model is named “basic” so you may replace model-name by “basic”. Possible commands include:
- fextract, for the feature vectors extraction
- init, for the model initialization
- train, for the training
- eval, for the evaluation
- pad, to test the model with your own handwriting
“all” is a command equivalent to fextract, init, train and eval.
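As a rough illustration of how “all” relates to the other commands, the expansion could look like the sketch below. The names PIPELINE, COMMANDS and expand_command are hypothetical, and the command list is limited to the ones mentioned above; the real model-manager may accept more.

```python
# Steps run by "all", in the order given above.
PIPELINE = ["fextract", "init", "train", "eval"]
# Commands mentioned in this post (the real tool may know others).
COMMANDS = PIPELINE + ["pad"]


def expand_command(command):
    """Return the list of steps to run for a given command name."""
    if command == "all":
        return list(PIPELINE)
    if command in COMMANDS:
        return [command]
    raise ValueError("unknown command: %s" % command)
```

So expand_command("all") yields the four steps in order, while expand_command("pad") yields only the test pad.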
Testing with your own handwriting
First of all you should generate the HMMs with the following command:
./model-manager basic all
The process takes less than one minute on my computer. You may see a few warnings because of some issues in ghmm and tomoe. If all goes well, you should see the accuracy of the model.
From this point, normally, you could test the HMMs with your own handwriting with the following command:
./model-manager basic pad
However, for reasons I don’t understand yet, ghmm behaves incorrectly when the pygtk module is loaded. The above command runs, but the recognition results are wrong. I need to contact the pygtk or ghmm mailing list about this obscure issue. For now, you can use the following command:
./model-manager pad | ./model-manager basic eval -s
The results are displayed on the console. The system supports only the following 50 kanji.
一 二 三 泣 漢 温 使 便 旅 族 水 氷 撃 女 安 北 化 忘 妄 近 集 育 坊 訪 防 妨 駅 福 副 神 版 坂 板 金 全 錬 練 業 習 央 決 代 反 想 歯 象 始 初 発 感
Pick a few of them and try them with your own handwriting ;-)! By the way, all training and evaluation data were written with a mouse.
Evaluation
match1: 80.0%
match5: 96.0%
match10: 98.0%
始 1 始, 福, 駅, 錬, 漢
旅 1 旅, 族, 駅, 練, 副
妨 1 妨, 練, 錬, 板, 発
防 1 防, 訪, 旅, 族, 板
泣 1 泣, 温, 福, 練, 駅
副 1 副, 訪, 福, 撃, 初
福 1 福, 練, 錬, 副, 駅
坂 3 板, 駅, 坂, 族, 錬
代 1 代, 板, 漢, 使, 駅
反 1 反, 福, 副, 忘, 妄
撃 3 駅, 錬, 撃, 漢, 副
業 1 業, 練, 錬, 集, 駅
氷 2 駅, 氷, 水, 妨, 版
温 1 温, 福, 駅, 錬, 想
育 1 育, 練, 副, 駅, 福
神 2 練, 神, 福, 錬, 撃
近 1 近, 駅, 練, 漢, 福
化 1 化, 練, 駅, 便, 習
一 X
央 1 央, 決, 業, 駅, 発
族 1 族, 練, 旅, 錬, 副
安 4 妄, 駅, 福, 安, 族
象 1 象, 駅, 錬, 練, 集
歯 1 歯, 練, 錬, 駅, 副
錬 1 錬, 練, 集, 駅, 福
習 1 習, 錬, 福, 駅, 漢
使 1 使, 便, 漢, 錬, 練
訪 1 訪, 駅, 錬, 副, 板
漢 1 漢, 錬, 駅, 練, 業
全 1 全, 金, 集, 錬, 福
集 1 集, 練, 業, 錬, 福
版 1 版, 板, 錬, 駅, 集
水 2 氷, 水, 旅, 駅, 便
板 1 板, 族, 坂, 福, 駅
妄 1 妄, 駅, 福, 忘, 練
初 1 初, 駅, 旅, 練, 坂
想 1 想, 駅, 副, 錬, 集
発 1 発, 練, 福, 駅, 漢
練 1 練, 錬, 福, 駅, 板
北 1 北, 坂, 副, 駅, 板
決 1 決, 漢, 便, 練, 坂
坊 X
駅 1 駅, 錬, 練, 族, 福
金 1 金, 発, 練, 錬, 駅
女 5 駅, 妨, 妄, 板, 女
忘 1 忘, 族, 副, 福, 駅
二 1 二, 三, 忘, 歯, 習
感 1 感, 福, 駅, 族, 練
便 3 練, 駅, 便, 錬, 福
三 1 三, 忘, 副, 版, 訪
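The match1 and match5 figures can be recomputed from the ranks listed above. The sketch below is not the evaluation code from the repository. It assumes that the number after each character is the rank of the correct answer and that “X” means the correct answer is not among the five candidates shown. Under that reading the listing has forty ranks of 1, three of 2, three of 3, one of 4, one of 5 and two X. match10 cannot be recomputed this way, because ranks beyond 5 are not shown.

```python
def match_rate(ranks, n):
    """Percentage of characters whose correct answer is in the top n."""
    hits = sum(1 for rank in ranks if rank is not None and rank <= n)
    return 100.0 * hits / len(ranks)


# Ranks read from the listing; None stands for "X" (assumed: not in the top 5).
ranks = [1] * 40 + [2] * 3 + [3] * 3 + [4, 5] + [None] * 2

print("match1: %.1f%%" % match_rate(ranks, 1))  # match1: 80.0%
print("match5: %.1f%%" % match_rate(ranks, 5))  # match5: 96.0%
```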
The results are very promising and outperform Tomoe’s current recognizer. Incidentally, I used the same evaluation corpus for Tomoe and for my experiment. However, a few things must be emphasized:
- My experiment only supports 50 kanji while Tomoe supports thousands of them.
- The evaluation of my experiment is performed using kanji from the same people who wrote the kanji used for training. However, the kanji instances for training and evaluation are not the same.
- Whole-character HMMs will almost certainly not scale well in terms of computation time to thousands of kanji. Usually, stroke or sub-stroke models are preferred.
Interestingly, my recognizer doesn’t do a good job at recognizing the simplest characters: 一 二 三.
Both Tomoe and my recognizer are sensitive to stroke order. My recognizer, however, seems less sensitive to the number of strokes. For example, く in 女 is one stroke but it’s acceptable to write it in two strokes. If you write く after 一 and ノ, though, it doesn’t work.
Call for an online handwriting database
If you’re a researcher in handwriting recognition and read this, I’m looking for a handwriting database of Chinese characters (kanji or hanzi). Please contact me if you can help me.
What’s next?
- Try more sophisticated feature vectors
- Try more sophisticated initial state alignment
- Try stroke and sub-stroke HMMs
- Collect more data
- Try techniques other than HMMs
June 24th, 2008 at 11:16 pm
Very interesting results! Free Software is really missing a good kanji recognition software. Is there any literature about handwriting recognition software?
July 4th, 2008 at 4:52 pm
If you search for “handwriting recognition”, there are a few results but I’ve never bought any of those books.