Archive for the ‘Wikipedia’ Category

Multiple dictionary sources in Fantasdic

Thursday, November 1st, 2007

Over the past few weeks, I have slowly but surely been adding support for multiple dictionary sources to Fantasdic. Until recently, Fantasdic was a DICT client only: it connected to DICT servers (as configured by the user in the settings) to retrieve definitions. I thought it would always be like that, and I had even objected to changing that in gnome-dictionary, but I’ve finally changed my mind. As I said some time ago, a great deal of Fantasdic’s source code is user interface code. If making a dictionary application means spending so much time on the user interface, it’s best to make it general-purpose…

Currently, Fantasdic includes two new kinds of source, in addition to DICT servers:

- Google Translate
- EDICT files

Basically, it works like a plugin system. Source plugins can either be distributed and installed with Fantasdic or installed manually in $HOME/.fantasdic/sources/ for third-party plugins. Writing a new source plugin is merely a matter of extending a base class and implementing a few required methods. Plugins are written in Ruby.
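
To give an idea of what "extending a base class and implementing a few required methods" can look like, here is a minimal sketch. The class and method names (SourceBase, available_databases, define) are assumptions made up for this illustration, not Fantasdic's actual API, and the base class is a stand-in so that the example runs on its own.

```ruby
# Hypothetical stand-in for the base class a source plugin would extend.
# None of these names are taken from Fantasdic's real code.
class SourceBase
  def initialize(config = {})
    @config = config
  end

  # Required methods: a subclass is expected to override both.
  def available_databases
    raise NotImplementedError, "#{self.class} must implement available_databases"
  end

  def define(database, word)
    raise NotImplementedError, "#{self.class} must implement define"
  end
end

# A toy source that answers from an in-memory hash instead of a server or file.
class HashSource < SourceBase
  ENTRIES = {
    "toy" => { "ruby" => "a dynamic programming language" }
  }.freeze

  # Names of the dictionaries this source can search.
  def available_databases
    ENTRIES.keys
  end

  # Returns an array of definitions (empty when the word is unknown).
  def define(database, word)
    definition = ENTRIES.fetch(database, {})[word]
    definition ? [definition] : []
  end
end

source = HashSource.new
puts source.available_databases.inspect
puts source.define("toy", "ruby").inspect
```

A real plugin would replace the hash lookup with whatever its backend needs (a network request, a file search, and so on); the rest of the application only needs to know about the methods declared in the base class.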

Hopefully, the user interface has remained as simple as it was.

Fantasdic screenshot
Fantasdic searching in an EDICT file. EDICT is a dictionary format well known to anyone learning Japanese.

Some sources may require additional fields to be configured by the user. For example, the DICT server source requires a server host and port. The EDICT file source requires a file path to be specified. The user interface for those additional fields is defined directly in the source plugins.
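
One way to picture per-source settings is to let each source describe its own extra fields. The sketch below is only an assumption about how this could be modelled: it uses plain data (a CONFIG_FIELDS constant) rather than real widgets, and every name in it is hypothetical.

```ruby
# Hypothetical helper shared by the sketch classes below: it reports which
# required fields are still missing from a user-supplied configuration hash.
module ConfigFields
  def missing_fields(config)
    self::CONFIG_FIELDS
      .select { |field| field[:required] && config[field[:name]].to_s.empty? }
      .map { |field| field[:name] }
  end
end

# Sketch of a DICT server source: needs a host and a port.
class DictServerSketch
  extend ConfigFields

  CONFIG_FIELDS = [
    { name: :host, label: "Server", required: true },
    { name: :port, label: "Port", required: true }
  ].freeze
end

# Sketch of an EDICT file source: needs a file path.
class EdictFileSketch
  extend ConfigFields

  CONFIG_FIELDS = [
    { name: :filename, label: "File", required: true }
  ].freeze
end

puts DictServerSketch.missing_fields({ host: "dict.org" }).inspect
puts EdictFileSketch.missing_fields({ filename: "/path/to/edict" }).inspect
```

Keeping the field list next to the source means the settings dialog does not need to know anything about a particular source in advance.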

Fantasdic screenshot

Fantasdic screenshot
For this source, a file must be selected…

Fantasdic screenshot
With the Google Translate source, you need to select your languages for the translations.

Fantasdic screenshot
Fantasdic, using Google Translate.

I hope more and more sources can be added :) Ideally, all source plugins should be cross-platform. Here are a few suggestions (of course, I’m counting on you to implement them ;-)):

- dictd file: search directly in files intended for the dictd server. See “man dictd” for a description of the format and tools/ in Fantasdic’s source code for a starting point.

- Stardict file. There’s a file describing the format in Stardict’s source code. Likewise, tools/ has a script to convert Stardict files, which may be a good starting point.

- Stardict server. The Stardict authors have created their own protocol and they’re running a server with quite a few dictionaries. Read Stardict’s source code directly or use a packet sniffer.

- Epwing dictionaries. You’ll need to use rubyeb, the Ruby bindings to the excellent libeb.

- Wikipedia/Wiktionary. This source plugin would simply perform an HTTP request to the appropriate site. Greg Hewgill kindly agreed to share his code to clean up mediawiki syntax and make it more readable. I’m quoting an email he sent to me:

The current state of my code can be found at:
http://hewgill.com/viewvc/wiktiondict/trunk/

Feel free to use any of my code (or the algorithms therein) to format
mediawiki data. I imagine you already know this, but you can fetch the
raw output for individual pages using a url like this:
http://en.wiktionary.org/w/index.php?title=test&action=raw

In fact, you can also add &templates=expand to that url and mediawiki
does all the hard template work! I found the docs at:
http://www.mediawiki.org/wiki/Manual:Parameters_to_index.php
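
As a rough sketch of the request described in the email, the URL can be assembled like this. The helper name raw_page_uri and its expand_templates option are made up for this example; only the index.php path and the title, action=raw and templates=expand parameters come from the email above. The sketch only builds the URL; actually fetching it is left to whatever HTTP client the plugin uses.

```ruby
require "uri"

# Hypothetical helper: builds the URL of the raw wikitext for a page.
# site is a host name such as "en.wiktionary.org".
def raw_page_uri(site, title, expand_templates: false)
  params = { "title" => title, "action" => "raw" }
  # Ask mediawiki to expand templates on the server side, as the email suggests.
  params["templates"] = "expand" if expand_templates
  # encode_www_form takes care of escaping spaces and non-ASCII titles.
  URI("http://#{site}/w/index.php?#{URI.encode_www_form(params)}")
end

puts raw_page_uri("en.wiktionary.org", "test")
puts raw_page_uri("en.wiktionary.org", "test", expand_templates: true)
```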

Waiting for your comments and your source plugins!

WikipediaFS 0.3 released

Sunday, May 27th, 2007

I probably should not have spent so much time on this with all the things I have to do lately but here we are: I have released WikipediaFS 0.3. With WikipediaFS, you can view and edit Wikipedia articles as if they were real files.

WFS screenshot

Highlights for this release:

- Compatible with new fuse-python API
- Rewritten from scratch
- Significant reliability and performance improvements
- HTTPS and HTTP authentication support (in addition to HTTP and proxy support)
- Subpages support
- Directories are created automatically for Wikimedia Foundation sites
- Files now need the “.mw” extension!

Overall, I am very happy with this release!

I would like to thank Csaba Henk (fuse-python’s maintainer and author of fuse’s port to FreeBSD) for his prompt answers to my questions.

Sébastien Delafond uploaded this version to the Debian servers yesterday. By the way, yesterday I found out that WikipediaFS 0.2 had been downloaded almost 2000 times on SourceForge. Not bad! :-)

Translating Wikipedia articles more easily

Wednesday, April 11th, 2007

It is not easy to explain the following with plain sentences so let’s take an example. Say I want to translate the following paragraph from English to French:

Tokyo is known for its many museums. Located in [[Ueno Park]] are the [[Tokyo National Museum]], the country’s largest museum and specializing in traditional [[Japanese art]]; the National Museum of Western Art; and the Tokyo Metropolitan Art Museum, which contains collections of Japanese [[modern art]] as well as over 10,000 Japanese and foreign films.

In order to complete the translation, I will need the French article names for [[Ueno Park]], [[Tokyo National Museum]], [[Japanese art]], etc. Looking up all those names is quite tedious and time-consuming, isn’t it? So I have written a little tool in Ruby that does that for us. For this very example, the program would output:


----------
Ueno_Park: interwiki link to fr found (Parc de Ueno)
Tokyo_National_Museum: interwiki link to fr found (Musée national de Tōkyō)
Japanese_art: interwiki link to fr found (Art japonais)
modern_art: interwiki link to fr found (Art moderne)
----------
Tokyo is known for its many museums. Located in [[Parc de Ueno]] are the [[Musée national de Tōkyō]], the country’s largest museum and specializing in traditional [[Art japonais]]; the National Museum of Western Art; and the Tokyo Metropolitan Art Museum, which contains collections of Japanese [[Art moderne]] as well as over 10,000 Japanese and foreign films.
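
The substitution step can be sketched in a few lines of Ruby. This is not the tool's actual code: the lookup table below is hard-coded for the illustration, whereas the tool described above finds the French titles through Wikipedia's interwiki links, and the method name translate_links is an assumption.

```ruby
# Hypothetical lookup table standing in for the interwiki lookups.
FRENCH_TITLES = {
  "Ueno Park" => "Parc de Ueno",
  "Japanese art" => "Art japonais",
  "modern art" => "Art moderne"
}.freeze

# Replaces each simple [[link]] with its translated title when one is known.
# Links without a known translation, and piped links such as [[a|b]],
# are left untouched.
def translate_links(text, titles)
  text.gsub(/\[\[([^\[\]|]+)\]\]/) do
    original = Regexp.last_match(1)
    "[[#{titles.fetch(original, original)}]]"
  end
end

sample = "Located in [[Ueno Park]] are collections of Japanese [[modern art]]."
puts translate_links(sample, FRENCH_TITLES)
```

Leaving unknown links as they are makes it easy to spot the articles that still have no French counterpart.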

More explanations and download here.