About FOKS

FOKS stands for Forgiving Online Kanji Search.

FOKS is an intelligent dictionary interface that gives users more freedom in searching for dictionary entries than conventional dictionary interfaces do. Besides the usual methods of searching by entering the reading or pasting the kanji string, you can specify further options that make it easier to find the word you are looking for.

The most notable difference from conventional dictionaries is the intelligent search, which allows dictionary entries to be looked up even with a (predictably) incorrect reading. See the explanation under Search Type below for more details.

All the dictionaries used in this interface are freely available from the home of the EDICT dictionary project at Monash University.



SEARCH OPTIONS

Reading

Required option.

The search will return an error if nothing is entered. Either a kana string (the reading of the string you are looking for) or a kanji string (pasted from a different file or entered using an IME) can be entered; かな and 仮名 are both acceptable as entries. If the first is entered, then the specified dictionaries are searched for entries whose reading, as given in the dictionary, contains this string. If the second is entered, then the specified dictionaries are searched for entries that contain the string in the entry part of the dictionary record.

All the dictionary entries are written in the following form:
Line No | Entry | Reading | Translation | Char No | Dictionary
21632 | 仮字 | かな | (n) (uk) Japanese syllabary (alphabets)/kana | 2 | edict

We support two types of wildcard characters (characters that will match any character during a search): * and ?

* matches any number of characters. For example, かな*わ will match any words that start with かな and end with わ:

Line No | Entry | Reading | Translation | Char No | Dictionary
31751 | 金轡 | かなぐつわ | (n) a (metal) bit | 2 | edict
31832 | 金沢 | かなざわ | (city in Ishikawa Prefecture) | 2 | edict
31902 | 金輪 | かなわ | (n) metal rings or hoops or bands | 2 | edict

? matches exactly one two-byte character. For example, かな?わ will match all strings that have exactly one character between かな and わ:

Line No | Entry | Reading | Translation | Char No | Dictionary
31832 | 金沢 | かなざわ | (city in Ishikawa Prefecture) | 2 | edict
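
The wildcard behavior described above can be sketched in a few lines of Python. This is only an illustration, not the FOKS implementation: the function names, the small READINGS list and the choice to match the whole reading (rather than a substring) are assumptions made for the example.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a pattern using * and ? into a regular expression (hypothetical helper)."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any number of characters, including none
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # every other character matches itself
    return re.compile("".join(parts))

# Toy list of readings taken from the examples in this section, plus one non-match.
READINGS = ["かなぐつわ", "かなざわ", "かなわ", "かなえる"]

def search(pattern, readings=READINGS):
    """Return the readings that match the wildcard pattern as a whole."""
    regex = wildcard_to_regex(pattern)
    return [r for r in readings if regex.fullmatch(r)]

print(search("かな*わ"))  # ['かなぐつわ', 'かなざわ', 'かなわ']
print(search("かな?わ"))  # ['かなざわ']
```

Note that * also matches zero characters here, which is why かなわ is returned for かな*わ, in line with the example results.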

Char Count

Optional.

The search will return all the matching strings whose character count is less than, greater than, or equal to the number entered. The Sign for Char Count option specifies which relational operator will be used. If "ANY" is the value specified, then both of the fields are ignored. This is the default behavior.

For example, if you enter:
Reading: かな; Char Count: 2; Sign for Char Count: >; the search will match all the entries that have more than 2 characters in their entry.
So, results like
Line No | Entry | Reading | Translation | Char No | Dictionary
22350 | 家内工業 | かないこうぎょう | household or cottage industry | 4 | edict
70458 | 鼎の軽重を問う | かなえのけいちょうを | (exp) to call one`s ability into question/to weigh one`s ability | 7 | edict
25510 | 叶える | かなえる | (v1) to grant (request, wish)/(P) | 3 | edict
21721 | 仮名漢字変換 | かなかんじへんかん | kana-kanji conversion | 6 | edict
would be returned, but no entries with a character count of 2 or less would be returned.
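
As a rough sketch of how such a filter might work, the Python below maps a sign to a comparison and applies it to the length of each entry. The set of signs in SIGNS, the function name and the sample ENTRIES are assumptions for illustration only; the actual options offered by FOKS may differ.

```python
import operator

# Hypothetical mapping from the "Sign for Char Count" option to a comparison.
SIGNS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}

# Toy (entry, reading) pairs taken from the examples in this document.
ENTRIES = [
    ("家内工業", "かないこうぎょう"),
    ("叶える", "かなえる"),
    ("金沢", "かなざわ"),
    ("仮名漢字変換", "かなかんじへんかん"),
]

def filter_by_char_count(entries, sign="ANY", count=None):
    """Keep the entries whose character count satisfies `sign count`."""
    if sign == "ANY" or count is None:
        return list(entries)  # default: both fields are ignored
    compare = SIGNS[sign]
    return [(entry, reading) for entry, reading in entries
            if compare(len(entry), count)]

for entry, reading in filter_by_char_count(ENTRIES, ">", 2):
    print(entry, reading)
```

With the sign > and a count of 2, the two-character entry 金沢 is dropped and the other three entries are kept.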

Dictionaries to search

Optional; if not entered, defaults to edict.

FOKS is an interface to several dictionaries, and by specifying this option you can concentrate the search on one of the dictionaries or search all of them. All dictionaries are part of the EDICT dictionary project and are the result of a long-running project to produce freely available Japanese/English dictionaries in machine-readable form. See the full documentation for details. Currently included dictionaries are:

Search Type

Optional; if not entered, defaults to simple.

Simple search decides whether to search the reading part or the entry part of each dictionary entry, based on whether there are any kanji characters in the Reading field. If there are, then the entry part of the dictionary entry is searched. Otherwise, the reading part of the dictionary entry is searched.
The remaining parameters specified (Char Count, Sign for Char Count, Dictionaries to search and Search string location) are used to create a database query that retrieves the entries that match these parameters.
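
The choice between the two fields can be illustrated with a minimal Python sketch. It assumes that "kanji character" can be approximated by the CJK Unified Ideographs block (U+4E00 to U+9FFF) and uses an in-memory list in place of the database query; the names contains_kanji, simple_search and ENTRIES are hypothetical.

```python
def contains_kanji(text):
    """Rough kanji test: any character in the CJK Unified Ideographs block.

    This is a simplification; it ignores extension blocks and marks such as 々.
    """
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)

# Toy dictionary records with an entry part and a reading part.
ENTRIES = [
    {"entry": "仮名", "reading": "かな"},
    {"entry": "金沢", "reading": "かなざわ"},
    {"entry": "母音", "reading": "ぼいん"},
]

def simple_search(query, entries=ENTRIES):
    """Search the entry part if the query contains kanji, else the reading part."""
    field = "entry" if contains_kanji(query) else "reading"
    return [e for e in entries if query in e[field]]

print([e["entry"] for e in simple_search("かな")])  # ['仮名', '金沢']
print([e["entry"] for e in simple_search("金沢")])  # ['金沢']
```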

The intelligent search facility is what makes this dictionary interface special. It allows you to look up words as you expect them to be read. When you input a search string, we perform a number of calculations to figure out which string you were most likely looking for. We presume that you have entered correct or almost correct readings, that is, readings that the kanji characters in the string you are looking for take in certain contexts.
For example, suppose you are trying to look up 母音 but do not know the correct reading ぼいん. Most other dictionaries would be of little help in this situation, but with our intelligent search this is not the case.
You can search on the strings ははおと, ぼおん, ははおん or similar, since 母 takes on the readings はは and ぼ in certain contexts and 音 takes on the readings おと, おん and いん in certain contexts.
In other words, we try to figure out which parts of the reading represent which characters, and from that determine the most likely candidates for your query.
For example, an intelligent search on the string ぼおん will produce results like this:

Entry | Reading | Grade | Dictionary | Link to correct Translation
母音 | ぼいん | 5.76793 | edict | Translation
保温 | ほおん | 0.299767 | edict | Translation
母乳 | ぼにゅう | 0.078679 | edict | Translation
哺乳 | ほにゅう | 0.000635359 | edict | Translation

The intelligent search is a two-step process in which you first get a list of candidates that can plausibly be considered to take on the reading you searched on. If the list contains the string you were looking for, click on its link to get the correct translation.
The calculations we perform are only applicable to readings, so if you try to use intelligent search on kanji characters you will be defaulted to simple search. Wildcard search and specific location search are also not applicable, so you will be defaulted to simple search if you try to use them.
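
The idea of working out which parts of a reading belong to which character can be sketched as follows. This is a deliberately simplified illustration and not the FOKS algorithm: it only checks whether a query can be split into one known reading per kanji, it does not compute the grade values, and it does not produce near-miss candidates such as 保温. The tables KANJI_READINGS and ENTRIES and both function names are assumptions for the example.

```python
# Hypothetical table of readings each kanji can take in some context.
KANJI_READINGS = {
    "母": ["はは", "ぼ"],
    "音": ["おと", "おん", "いん"],
    "保": ["ほ"],
    "温": ["おん"],
}

# Toy dictionary: entry -> correct reading.
ENTRIES = {"母音": "ぼいん", "保温": "ほおん"}

def can_read_as(entry, reading):
    """True if `reading` splits into one known reading per kanji of `entry`."""
    if not entry:
        return reading == ""  # all kanji consumed: the reading must be used up too
    head, rest = entry[0], entry[1:]
    return any(
        reading.startswith(r) and can_read_as(rest, reading[len(r):])
        for r in KANJI_READINGS.get(head, [])
    )

def candidates(query):
    """Return (entry, correct reading) pairs that the query could plausibly mean."""
    return [(entry, correct) for entry, correct in ENTRIES.items()
            if can_read_as(entry, query)]

print(candidates("ぼおん"))    # [('母音', 'ぼいん')]
print(candidates("ははおと"))  # [('母音', 'ぼいん')]
```

A fuller version would also need a way to rank the candidates, since in the real system several entries can match a single query with different grades.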


Keitai (Cellular) Phone Interface

Recently we have implemented an interface to FOKS that is accessible from keitai phones. As of 08 Nov. 2002 these interfaces can also handle wildcard searches (see above). Only phones able to display and input Japanese characters can be used.
Currently we should be able to handle phones using one of the EZWEB (WAP), I-mode or J-Sky access methods. However, the interfaces have been tested only in a very limited fashion, so there might be some problems depending on the make of your phone. If you find an error, please contact sbilac@cl.cs.titech.ac.jp with a description of your problem.

The following URLs can be used to access the respective interfaces:




RELEVANT PUBLICATIONS



Slaven Bilac. 2002. Intelligent Dictionary Interface for Learners of Japanese. Master's thesis, Tokyo Institute of Technology.

Slaven Bilac, Timothy Baldwin, and Hozumi Tanaka. 2002. "Bringing the Dictionary to the User: The FOKS System". In Proc. of the 19th International Conference on Computational Linguistics (COLING2002), pages 89-95.

Timothy Baldwin, Slaven Bilac, Ryo Okumura, Takenobu Tokunaga, and Hozumi Tanaka. 2002. "Enhanced Japanese electronic dictionary look-up". In Proc. of LREC2002.

Ryo Okumura. 2001. Basic research on an intelligent dictionary interface for learners of the Japanese language. Bachelor's thesis, Tokyo Institute of Technology. (In Japanese)

Timothy Baldwin and Hozumi Tanaka. 2000. "A comparative study of unsupervised grapheme-phoneme alignment methods". In Proc. of the 22nd Annual Meeting of the Cognitive Science Society (CogSci 2000), pages 597-602, Philadelphia.

Timothy Baldwin and Hozumi Tanaka. 1999. "The applications of unsupervised learning to Japanese grapheme-phoneme alignment". In Proc. of ACL Workshop on Unsupervised Learning in Natural Language Processing, pages 9-16, University of Maryland.