Some progress and identified goals

The Google Summer of Code has not officially started yet, but since school will make it hard to work on the project in May and June, I thought I would make good use of my one-week holiday and spend a little of my time making progress on the project :-).

First and foremost, as I had never developed anything for Maemo, I had to go through the process of setting up my working environment. This can be summed up as the following three steps:

- gain root privileges on the device
- install SSH in order to control the device from a normal computer and move files to and from the device
- install scratchbox in order to develop and test programs on a normal computer.

All those steps were straightforward and well documented. Incidentally, I am very impressed by scratchbox, which is a great piece of work and integration!

Tomoe under Maemo

After a few hours of work, I got tomoe, the handwriting recognition engine, to work under scratchbox. I just had to remove the gucharmap dependency. Then I made Debian packages for tomoe and tomoe-gtk, because that is the easiest way to test a program on the device. On the device itself, tomoe runs well and it is very convenient to draw characters with the stylus. The only problem is that it is currently very slow: after each stroke, it takes about four seconds to display the updated list of character candidates… I see three ways of solving this problem:

- Since tomoe supports multiple backends (xml, mysql…), create a new backend with a particular focus on performance.
- Identify the bottlenecks of the main backend and improve its performance.
- Disable the update of the character candidates on each stroke. This would be the last resort in case the previous two solutions do not work. Updating on each stroke is very useful, however, because you can tell as soon as you have drawn a stroke whether you drew it correctly.
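To make the trade-off in the third option concrete, here is a minimal Python sketch. It is not tomoe's API: the `recognize` callable, the `CandidateUpdater` class and the 0.5 second budget are all assumptions made up for illustration. The idea is to keep refreshing the candidates on each stroke for as long as a lookup stays within a time budget, and to fall back to a single lookup at the end of the character once a lookup proves too slow. The recorded timings are also the kind of raw data that the second option (finding the bottlenecks) would start from.

```python
import time
from typing import Callable, List, Sequence, Tuple

Stroke = List[Tuple[int, int]]  # one stroke = a list of (x, y) points
# Hypothetical recognizer signature, NOT tomoe's real API.
Recognizer = Callable[[Sequence[Stroke]], List[str]]


class CandidateUpdater:
    """Refresh candidates on each stroke while lookups stay within a budget."""

    def __init__(self, recognize: Recognizer, budget_seconds: float = 0.5,
                 clock: Callable[[], float] = time.perf_counter) -> None:
        self.recognize = recognize
        self.budget_seconds = budget_seconds  # arbitrary value, an assumption
        self.clock = clock
        self.strokes: List[Stroke] = []
        self.candidates: List[str] = []
        self.per_stroke = True          # becomes False after a slow lookup
        self.timings: List[float] = []  # duration of every lookup, in seconds

    def _lookup(self) -> List[str]:
        start = self.clock()
        self.candidates = self.recognize(self.strokes)
        elapsed = self.clock() - start
        self.timings.append(elapsed)
        if elapsed > self.budget_seconds:
            # Too slow for interactive feedback: stop updating on each stroke.
            self.per_stroke = False
        return self.candidates

    def add_stroke(self, stroke: Stroke) -> List[str]:
        """Record a stroke; return fresh candidates, or [] in deferred mode."""
        self.strokes.append(stroke)
        if self.per_stroke:
            return self._lookup()
        self.candidates = []  # do not show a stale list
        return self.candidates

    def finish(self) -> List[str]:
        """Run one lookup over all strokes once the character is complete."""
        return self._lookup()
```

With a recognizer that needs four seconds per lookup, the first stroke still triggers a lookup, after which the updater switches to deferred mode and only `finish()` queries the engine again. The cost is exactly the one described above: the user no longer sees right away that a stroke was drawn incorrectly.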

As expected, the job for this summer will mainly have to do with performance and smooth integration.

Apart from that, the tomoe team seems to be quite active! Hu Zheng, from Red Hat Beijing (?) and author of stardict, has contributed a stroke editor to help add support for more characters. He is in the process of adding 6000 Chinese characters. There are some plans on the list to add support for the missing Japanese characters as well. Good news!

Stroke editor

5 Responses to “Some progress and identified goals”

  1. Peter Maydell Says:

    Hi; do you mind if I ask your opinion on how far away Tomoe is from being a practical input method for Japanese on the N800? I ask because my Sharp Zaurus died recently and I’m wondering whether an N800 would make an acceptable replacement; since I use it mostly as a dictionary the Japanese input is pretty critical. Obviously as you say four seconds for lookup is very slow and would require improvement even if it only happened when you had completely entered a character. How is its accuracy with kanji and kana recognition? Does the UI work as a means for entering a string of kanji/kana as well as for looking up a single character?

    (I must confess that I haven’t yet downloaded Tomoe to try it on the desktop, which would probably help me answer these questions.)

    Thanks in advance…

  2. Mathieu Says:

    Hi Peter.

    I intend to use my N800 as a Japanese dictionary as well. My ultimate goal is to have Fantasdic (http://www.gnome.org/projects/fantasdic) working on it but it will probably take a few months before I can work on this. There are dictionary applications for the N800 already but I have not tested them yet. I like Fantasdic because I know that I will be able to use all my dictionaries with it, including some commercial ones that I purchased.

    As for Tomoe, I got it to work on the N800 after only a few hours of work. I have not gone very deeply into things. As you may have seen on my journal, I got sponsored by Google to work this summer on porting Tomoe to Maemo, which is used by the N800. As I said, I will focus on improving the performance of Tomoe. Hopefully, at the end of the summer, Tomoe should run much better on the N800. Let’s keep our fingers crossed :-) As far as I know, Tomoe is not meant to take a full sentence as input but only one character at a time.

  3. Peter Maydell Says:

    Thanks for the reply.

    My remark about entering strings of text was really meant to try to distinguish between the pure character recognition engine parts and the “input method” bits (ie the UI which makes it actually usable as a means of entering Japanese text into a random application). For example, the Zaurus IM has a row of three boxes. You write a character into a box, and as soon as you stop writing or start writing a different character into another box the recognition engine puts its best guess into the string you’re building up. You can then correct it if it’s wrong, but in the common case that it’s right you can just press the button to dump the string into the application. In contrast it looks from the screenshot as if with Tomoe you would have to do more manual selection of characters from the list; I think this might be a less fluid UI.

    Anyway, I’ll try to find time to download Tomoe tomorrow and have a play with it.

  4. Paul "TBBle" Hampson Says:

    Just a quick query, are you planning on pushing the Tomoe Debian packages into Debian proper at all? I’ve just come across Tomoe and as a current uim user, would love to see it in Debian, but I don’t want to step on anyone’s toes by posting an ITP out of the blue.

    As far as the guessing thing Peter Maydell mentions on the Zaurus, a combination of Tomoe and Prime would prolly manage that quite well.

    (Prime’s http://taiyaki.org/prime/ if you don’t know it, both uim and scim support it)

  5. Mathieu Says:

    Yes, I am planning to create packages, but as I am not a Debian Developer, I will need someone to put the packages on the Debian servers. Are you a DD?
