Apache Solr vs Apache Lucy

on Sep 16, 2015

I have a web application written in Perl. My search requirement is to index documents from a file system, or generated on the fly, in formats such as HTML, MS Office and PDF, and then perform a full-text search over them. I have already investigated Apache Solr, which works fine with sample data. I have now learned about Apache Lucy and am wondering whether it is the right candidate for my Perl-based application. My concern with Apache Lucy is that there is no update on CPAN after Dec 2014. I am not sure whether it is actively maintained, and in particular what the progress is on integration with Lucene 5.3. I need suggestions on the points below:
- Is Apache Lucy similar to Apache Lucene, APIs and everything?
- Is Apache Lucy production ready?
- Is there any tentative plan for a new release of Apache Lucy?

    I've used Apache Lucy and its predecessor, KinoSearch, for over 10 years and at three different @jobs with great success. I recommend it with all my heart!

    • Fast
    • Stable
    • Flexible
    • Authors reply to emails for help

    Use Apache Lucy and be happy!

      Thanks dmitri for your valuable response. I just learned that Lucy provides only a subset of the features Lucene provides. Do you know of any critical features Lucy is lacking? I think my requirement is not complex: I need to index a file system periodically and run full-text search across HTML, DOC, XLS, PDF and similar file types. Thanks again.
        I've never used Lucene, so I cannot compare the two. I use Lucy to index PDF, HTML, DOC, and several other document types. Converting them into text indexable by Lucy has to be done separately.
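
The separate conversion step mentioned above could look roughly like this for HTML. This is a minimal sketch in Python, chosen only to stay dependency-free, not because the thread uses it. `html_to_text` and `_TextExtractor` are hypothetical names, and PDF or Office files would need their own converters, which are not shown. The resulting plain text is what would then be handed to the indexer.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and ignores <script> and <style> bodies."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if not self._skip_depth:
            self._chunks.append(data)


def html_to_text(markup):
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return " ".join(" ".join(parser._chunks).split())
```

Each file type gets its own converter of this shape (bytes or markup in, plain text out), so the indexing code never has to know about the original format.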

        I've graduated from reindexing once every few hours using a cron job to using Linux::Inotify2 to provide practically instant updates to the index. Surely impressed my $boss...
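
For comparison, the periodic (cron-style) approach boils down to finding which files changed since the last run. Below is a minimal sketch in Python, assuming modification times are a good enough change signal. `scan_changes` and the `seen` dictionary are hypothetical names, and this polls the whole tree rather than reacting to inotify events.

```python
import os


def scan_changes(root, seen):
    """Return (changed, removed) paths under root and update `seen` in place.

    `seen` maps path -> modification time (ns) recorded on the previous scan.
    """
    current = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                current[path] = os.stat(path).st_mtime_ns
            except OSError:
                # File vanished between listing and stat; skip it this round.
                continue

    # New files and files whose mtime differs need (re)indexing.
    changed = sorted(p for p, mtime in current.items() if seen.get(p) != mtime)
    # Files we knew about that are gone should be dropped from the index.
    removed = sorted(p for p in seen if p not in current)

    seen.clear()
    seen.update(current)
    return changed, removed
```

An event-driven watcher such as Linux::Inotify2 removes the need for the full tree walk on every run, which is where the near-instant updates come from.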

    My concern with Apache Lucy is that there is no update on CPAN after Dec 2014.

    Not necessarily a bad thing - indicates stability and/or caution on new releases. It also probably puts it in the top 20% or so of most recently released dists (guessing).

    Anyway, the repository shows updates within the last week, so it is clearly being worked on.

      Thanks. Indeed, the GitHub source repo has recent activity.
    What dmitri said. I've used KinoSearch/Lucy for almost as long and can testify that the devs are the best and I know that if you find a real bug, it will be addressed quickly because it happened to me. I've never used Search::Elasticsearch but it looks like a good thing to try as well and possibly easier to work with than Lucy (its API seems a bit higher level) but I can't imagine it's as fast.

      Thanks for your feedback. I have done some trials with Search::Elasticsearch too. The module works fine and looks good, but I had difficulty building the JSON for the attachment mapper (for HTML, PDF and DOC files stored in a file system) through the module, and could not find enough documentation. Probably I am missing something? I am using Elasticsearch 1.7, but the upcoming version 2 brings a lot of changes, mainly in how a file system is indexed (river is deprecated in 2.0), so the Perl module and its documentation will probably get some updates as well.
    Since this got revived I'll chime in. I'm still using KinoSearch, Lucy's dad, on 5.8 no less, because I'm stuck in upgrade Hell and don't have a cc/gcc new enough to compile Lucy yet. It's still working great, in production at hundreds of customer sites in front of tens of thousands of users, as it has for 6 or 7 years.

    I've used (and still am using) Lucy for many years since the KinoSearch days and can testify that it's one of the gems on CPAN. At least since its rename to Lucy it's stable and perfectly production ready. Getting it to work is a bit tedious, as you have to fine-tune / design every bit of your application, but once it's working, it's fast and consistently performant. I never did benchmarks, but I had a multi-million docs project once where many components had difficulties, except Lucy, which plowed through like a champ.
    Regarding release plans, it's best to ask the devs, in their preferred mailing list/irc/...
      Thanks. Will do so.

