in reply to Looking for suggestions for writing a module to look up translations in a 9 MB XML dictionary

Hmmm, I would get rid of all the parsing overhead XML introduces. A simple-and-stupid way to implement this could be a Makefile (or similar logic) that regenerates a fast (non-SQL) database file from the XML master whenever the master's timestamp shows that it has changed. Ideally, reading from the database file would not change its modification time as an unwanted side effect.
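For the "similar logic" variant, the timestamp check is small enough to live in the module itself. What follows is only a rough sketch using core Perl: the names is_stale and rebuild_if_stale are made up for illustration, and $builder is a hypothetical callback standing in for whatever code parses the XML and writes the database file.

```perl
use strict;
use warnings;

# Return true when $target is missing or older than $source,
# which is the same test make applies to a "target: source" rule.
sub is_stale {
    my ($source, $target) = @_;
    my $src_mtime = (stat $source)[9];
    die "cannot stat $source: $!" unless defined $src_mtime;
    my $tgt_mtime = (stat $target)[9];
    return 1 unless defined $tgt_mtime;    # no database file yet
    return $src_mtime > $tgt_mtime ? 1 : 0;
}

# Rebuild $target from $source only when needed.
# $builder is a hypothetical callback: it gets the XML path and a
# temporary output path, and is expected to write the database there.
# The temporary file is renamed into place afterwards, so readers
# never see a half-written database.
sub rebuild_if_stale {
    my ($source, $target, $builder) = @_;
    return 0 unless is_stale($source, $target);
    my $tmp = "$target.tmp.$$";
    $builder->($source, $tmp);
    rename $tmp, $target or die "cannot rename $tmp to $target: $!";
    return 1;
}
```

Calling rebuild_if_stale('dict.xml', 'dict.db', \&my_builder) once at startup (file names and my_builder are placeholders) would then cost two stat calls in the common case where nothing has changed. Note that a master saved within the same second as the last rebuild would not be picked up, since only whole-second mtimes are compared.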

My very first idea was to use SQLite, but I think djb's CDB should be way faster. Of course, there is a CDB_File on CPAN, and for extra bonus points, it is capable of generating new CDB files.

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

Replies are listed 'Best First'.
Re^2: Looking for suggestions for writing a module to look up translations in a 9 MB XML dictionary
by rastoboy (Monk) on Nov 05, 2009 at 09:42 UTC
    I'd have to agree with SuicideJunkie on the KISS principle. 9MB is nothing these days, and really all you want is simple word lookups. I can identify because, though I'm a native English speaker, when I tried to read Patrick O'Brian's Master and Commander books (which are truly awesome, btw), I ended up bringing up my giant unabridged Webster's dictionary every page or so. Sometimes it wasn't words I didn't know, but words I knew that I suspected he was making archaic use of. That ended up being a Killer App for me to buy a Palm Pilot, as that was the only platform any unabridged English dictionary was available for (Webster's, again). That way, instead of physically manhandling this paper monstrosity, I could just keep my Palm Pilot nearby and look up words. So I say screw it, keep it simple in memory, and utilize Term::ReadKey or some such to monitor keystrokes and start showing suggestions, a la YouTube's search box JavaScript.
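To give an idea of how little code the keep-it-in-memory route needs, here is a rough sketch of the lookup and suggestion part only. The keystroke handling is left out, suggest is a made-up helper name, and the three entries are invented stand-ins for the real dictionary data.

```perl
use strict;
use warnings;

# Return up to $limit headwords starting with $prefix (case-insensitive).
# $words is an array ref of headwords, already sorted.
sub suggest {
    my ($words, $prefix, $limit) = @_;
    $limit = 10 unless defined $limit;
    my $want = lc $prefix;
    my @hits;
    for my $word (@$words) {
        next unless lc(substr $word, 0, length $want) eq $want;
        push @hits, $word;
        last if @hits >= $limit;
    }
    return @hits;
}

# Tiny stand-in for the real dictionary (entries invented for illustration).
my %dict = (
    larboard => 'left-hand side of a ship',
    lanyard  => 'short rope used to fasten something',
    lee      => 'side sheltered from the wind',
);
my @headwords = sort keys %dict;

# Suggestions for what has been typed so far, then an exact lookup.
print join(', ', suggest(\@headwords, 'la', 5)), "\n";
print $dict{lee}, "\n";
```

A linear scan like this touches every headword per keystroke, which is probably fine at this size. If it ever feels sluggish, a binary search over the sorted list would be the next simple step.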
Re^2: Looking for suggestions for writing a module to look up translations in a 9 MB XML dictionary
by skangas (Novice) on Nov 05, 2009 at 21:06 UTC

    Your suggestions will be very valuable; thanks. I wasn't even aware of cdb, which seems like a pretty good fit in this case. The fact that it has been written by djb only adds to its attractiveness.