Multiple dictionary sources in Fantasdic
Over the past few weeks, I have slowly but surely been adding support for multiple dictionary sources to Fantasdic. Until recently, Fantasdic had been a DICT client only; that is, it connected to DICT servers (as configured by the user in the settings) to retrieve definitions. I thought it would always be like that, and I had even objected to changing that in gnome-dictionary, but I’ve finally changed my mind. As I said some time ago, a great deal of Fantasdic’s source code is user interface code. If making a dictionary application means spending so much time on the user interface, it’s best to make it general-purpose…
Currently, Fantasdic includes two new kinds of source, in addition to DICT servers:
- Google Translate
- EDICT files
Basically, it works like a plugin system. Source plugins can either be distributed and installed with Fantasdic or installed manually in $HOME/.fantasdic/sources/ for third-party plugins. Writing a new source plugin is merely a matter of extending a base class and implementing a few required methods. Plugins are written in Ruby.
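To give an idea of what "extending a base class and implementing a few required methods" can look like, here is a minimal, self-contained sketch. The class and method names (SourceBase, available_databases, define) are assumptions made up for illustration; they are not Fantasdic's actual plugin API, so check the bundled sources for the real one.

```ruby
# Hypothetical sketch: SourceBase and its method names are illustrative
# assumptions, not Fantasdic's actual plugin API.
class SourceBase
  # Registry of every subclass, filled in automatically on inheritance.
  def self.registered
    @registered ||= []
  end

  def self.inherited(subclass)
    super
    SourceBase.registered << subclass
  end

  # Methods a plugin is required to implement.
  def available_databases
    raise NotImplementedError
  end

  def define(database, word)
    raise NotImplementedError
  end
end

# A toy plugin backed by an in-memory hash instead of a server or a file.
class HashSource < SourceBase
  ENTRIES = { "cat" => "a small domesticated feline" }.freeze

  def available_databases
    ["toy"]
  end

  # Returns an array of definitions, empty when the word is unknown.
  def define(_database, word)
    definition = ENTRIES[word]
    definition ? [definition] : []
  end
end
```

In this sketch, the application would only have to walk SourceBase.registered to discover every plugin that has been loaded, which is one simple way of getting plugin-like behaviour in Ruby.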
Hopefully, the user interface has remained as simple as it was.

Fantasdic searching in an EDICT file. EDICT is a dictionary format well known to anyone learning Japanese.
Some sources may require additional fields to be configured by the user. For example, the DICT server source requires a server host and port. The EDICT file source requires a file path to be specified. The user interface for those additional fields is defined directly in the source plugins.
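As a rough illustration of sources carrying their own extra fields, the sketch below describes them as plain data. This is a simplification and an assumption on my part: the Field struct, the class names and the missing_fields helper are hypothetical, and the default host and port are only example values. In Fantasdic itself the plugins define the user interface for these fields directly.

```ruby
# Hypothetical sketch: the Field struct and the class names below are
# illustrative assumptions, not Fantasdic's actual API.
Field = Struct.new(:name, :label, :type, :default)

class DictServerSource
  # A DICT server source needs a host and a port (example defaults).
  FIELDS = [
    Field.new(:host, "Server", :string, "dict.org"),
    Field.new(:port, "Port", :integer, 2628)
  ].freeze
end

class EdictFileSource
  # An EDICT file source needs the path of the dictionary file.
  FIELDS = [
    Field.new(:filename, "File", :path, nil)
  ].freeze
end

# Names of the fields that have neither a user-supplied value nor a default.
def missing_fields(source_class, config)
  source_class::FIELDS
    .select { |f| config[f.name].nil? && f.default.nil? }
    .map(&:name)
end
```

With this layout, missing_fields(EdictFileSource, {}) reports that a file still has to be selected, while the DICT server source can fall back on its defaults.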


For this source, a file must be selected…

With the Google Translate source, you need to select your languages for the translations.

Fantasdic, using Google Translate.
I hope more and more sources can be added :) Ideally, all source plugins should be multi-platform. Here are a few suggestions (of course, I’m counting on you to implement them ;-)):
- dictd file: search directly in files intended for the dictd server. See “man dictd” for a description of the format and tools/ in Fantasdic’s source code for a starting point.
- Stardict file. There’s a file describing the format in Stardict’s source code. Likewise, tools/ has a script to convert stardict files, which may be a good starting point.
- Stardict server. Stardict authors have created their own protocol and they’re running a server with quite a few dictionaries. See Stardict’s source code directly or use a packet sniffer.
- Epwing dictionaries. You’ll need to use rubyeb, the Ruby bindings to the excellent libeb.
- Wikipedia/Wiktionary. This source plugin would simply perform an HTTP request to the appropriate site. Greg Hewgill kindly agreed to share his code to clean mediawiki syntax and make it more readable. I’m quoting an email he sent to me:
The current state of my code can be found at:
http://hewgill.com/viewvc/wiktiondict/trunk/
Feel free to use any of my code (or the algorithms therein) to format
mediawiki data. I imagine you already know this, but you can fetch the
raw output for individual pages using a url like this:
http://en.wiktionary.org/w/index.php?title=test&action=raw
In fact, you can also add &templates=expand to that url and mediawiki
does all the hard template work! I found the docs at:
http://www.mediawiki.org/wiki/Manual:Parameters_to_index.php
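The URL pattern from the email above could be assembled along these lines. This is only a sketch: raw_page_url is a hypothetical helper name, and the host default simply mirrors the example URL in the email.

```ruby
require "uri"

# Build the URL of the raw wikitext for a page, following the pattern
# quoted in the email. raw_page_url is a hypothetical helper, not part
# of Fantasdic or of Greg's code.
def raw_page_url(title, host: "en.wiktionary.org", expand_templates: false)
  params = { "title" => title, "action" => "raw" }
  # Ask mediawiki to expand templates on the server side if requested.
  params["templates"] = "expand" if expand_templates
  URI::HTTP.build(host: host, path: "/w/index.php",
                  query: URI.encode_www_form(params))
end
```

The resulting URI can then be passed to Net::HTTP.get (after require "net/http") to fetch the page text, which is what a Wiktionary source plugin would feed to the cleaning code.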
Waiting for your comments and your source plugins!
July 12th, 2009 at 6:22 am
Hello Mathieu,
I discovered Fantasdic just a few days ago, and I really-really like the simple user interface.
At home and at work I use Mac OS X and Apple’s Dictionary (I translate quite large quantities of text). What I like about Apple’s Dictionary is that it has a very simple interface (similar to Fantasdic), it is very fast, and since version 2.0 Apple has let everyone create and add their own dictionaries.
I have created and compiled a few dictionaries that would also greatly help my colleagues in their work, but none of them uses a Mac. Ever since, I’ve been searching for a dictionary back-end that works on Windows XP and reads some nice dictionary format. I’ve found only a few programs that have a simple yet powerful interface; Fantasdic is one of them (I like Gnome Dictionary, too).
But something is always missing. The dictionary formats used in all of these applications are too simple. I think Apple resolved the question in a very elegant way. The source format of the dictionaries is a Unicode-encoded XML file. The definition of this XML is basically XHTML extended with a few dictionary-related tags, typically <d:entry> (with attributes id and d:title; the latter represents the entry) and <d:index> (with two attributes, d:value and d:title: one is used to build the index, the other is what shows up in the result list). You can use multiple d:index tags for one entry, so for example you can find the entry “do” by searching “done”. (This is the open-source part of the dictionary format, as later one uses Apple’s tools to build indexes and to convert the source into a format that Dictionary understands.)
But the point is that they used XHTML as a basis, permitting all the formatting (including images) you can do with XHTML, and the whole dictionary can use a single CSS file. I’ve found no other base dictionary formats that would permit formatting. My question is if you have seen any format like that (I mean with formatting), what applications can read it and if you plan to include a similar format in a future release of Fantasdic?
Thank you for your work!
July 12th, 2009 at 12:50 pm
Fantasdic currently supports Pango markup (a small subset of HTML). http://library.gnome.org/devel/pango/stable/PangoMarkupFormat.html
It wouldn’t be too hard to add support for XHTML and CSS, but it would require Fantasdic to depend on a rendering engine like Gecko or WebKit. Unfortunately, I have had no time to devote to Fantasdic lately, so I can’t promise anything.