Jaap has asked for the wisdom of the Perl Monks concerning the following question:

Wise Monks,

I am currently using Text::WikiFormat, which is too slow for my needs.
I want to write my own wiki formatter, using some Perl module for the tokenizing / lexical analysis.
Searching through CPAN, I am a bit lost among all the modules that come up.
I have text like this:
* Unordered List item 1
  * Item 1.1
    * 1.1.1
etc...
1. Ordered lists too, nested as well
|table|with|
|2 rows|and 2 columns|
*strong text*
/emphasized text/
horizontal ruler:
----
The main focus is currently on speed/performance. Does anybody have any tips on which parser / tokenizer to use?

Re: Tokenizer / Lexical Analyzer for parsing Wiki
by BrowserUk (Patriarch) on Sep 02, 2004 at 13:41 UTC

    How much too slow is it?

    Have you considered profiling the code and seeing whether you can improve the performance of the existing code sufficiently for your needs?
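One rough way to put a number on "too slow" before reaching for a profiler is to time the formatter call directly. The sketch below is only an illustration: `format_page` is a hypothetical stand-in for whatever call is actually slow, and `time_calls` is a helper invented for this example. Only `Time::HiRes` (a core module) is a real API here. A real profiler such as Devel::DProf (or, in later years, Devel::NYTProf) would give a per-subroutine breakdown that this cannot.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical stand-in for the real formatter call; swap in your own.
# It only converts *strong text* so that there is something to measure.
sub format_page {
    my ($text) = @_;
    (my $html = $text) =~ s{\*(\S[^*\n]*?)\*}{<strong>$1</strong>}g;
    return $html;
}

# Run $code $iterations times and return the average wall-clock
# seconds per call (helper name is an assumption of this sketch).
sub time_calls {
    my ($code, $iterations) = @_;
    my $start = time;
    $code->() for 1 .. $iterations;
    return (time - $start) / $iterations;
}

my $sample   = "some *strong text* here\n" x 200;
my $per_call = time_calls(sub { format_page($sample) }, 100);
printf "%.6f seconds per call\n", $per_call;
```

Running the same harness against the existing formatter and against any replacement gives a like-for-like comparison on your own pages.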


Re: Tokenizer / Lexical Analyzer for parsing Wiki
by Crian (Curate) on Sep 02, 2004 at 12:24 UTC

    Perhaps you could take a look at Kwiki. It's a Perl module too, but I don't know whether it is pure Perl. If it is, you could perhaps take a look at its wiki formatter.

      The Kwiki parser is rather poor too, unfortunately. I went from Kwiki to Text::WikiFormat.
Re: Tokenizer / Lexical Analyzer for parsing Wiki
by PodMaster (Abbot) on Sep 02, 2004 at 11:04 UTC
    I want to write my own wiki formatter using some Perl module for the tokenizing / lexical analyzing.
    If speed/performance is your main focus, and Text::WikiFormat is too slow for you, I'd say you don't want to use Perl at all.


      That is a rather good point. But I know that Text::WikiFormat is not very speed-oriented, so I can do a lot better in Perl. My entire wiki is in Perl too.

      Currently looking at Parse::Lex.
        That is a rather good point. But I know that Text::WikiFormat is not very speed-oriented, so I can do a lot better in Perl

        Is Text::WikiFormat a lost cause? Or do you think it could be sped up? If so, why not see if the author is interested in you attempting to do so?

        However, if you really feel Text::WikiFormat is a lost cause, then I would recommend writing your own lexer and tokenizer. That way you can be sure to squeeze out all the speed you need.

        -stvn
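For anyone wondering what a hand-written lexer for this kind of markup might look like, here is a minimal sketch in plain Perl with no CPAN modules. It is not taken from Text::WikiFormat or Parse::Lex. The token names and both subroutines are invented for illustration, and the syntax rules are guessed from the sample in the question (in particular, that nesting is expressed by leading whitespace). Block-level constructs are classified one line at a time, and inline markup is scanned with `\G` and `/gc` so each regex resumes where the previous one stopped.

```perl
use strict;
use warnings;

# Classify one line into a block-level token (hypothetical token names).
# List tokens carry the indent width as an assumed nesting depth.
sub tokenize_line {
    my ($line) = @_;
    return [ 'HR' ]                      if $line =~ /^----+\s*$/;
    return [ 'UL', length($1), $2 ]      if $line =~ /^(\s*)\*\s+(.*)$/;
    return [ 'OL', length($1), $2 ]      if $line =~ /^(\s*)\d+\.\s+(.*)$/;
    return [ 'ROW', [ split /\|/, $1 ] ] if $line =~ /^\|(.*)\|\s*$/;
    return [ 'TEXT', $line ];
}

# Split inline text into STRONG, EM and TEXT tokens.
# /gc keeps pos() after a failed match, so the alternatives can be
# tried in turn at the same position.
sub tokenize_inline {
    my ($text) = @_;
    my @tokens;
    for ($text) {
        while (1) {
            if    (/\G\*(\S(?:[^*]*\S)?)\*/gc)  { push @tokens, [ STRONG => $1 ] }
            elsif (/\G\/(\S(?:[^\/]*\S)?)\//gc) { push @tokens, [ EM     => $1 ] }
            elsif (/\G([^*\/]+)/gc)             { push @tokens, [ TEXT   => $1 ] }
            elsif (/\G(.)/gcs)                  { push @tokens, [ TEXT   => $1 ] } # stray * or /
            else                                { last }
        }
    }
    return @tokens;
}

# Small demonstration on a few lines in the style of the question.
for my $line ('* Unordered List item 1', '  * Item 1.1', '|table|with|', '----') {
    my $tok = tokenize_line($line);
    print $tok->[0], "\n";
}
print join(' ', map { $_->[0] } tokenize_inline('plain *strong text* and /emphasized text/')), "\n";
```

A formatter built on top of this would still need to turn the token stream into nested HTML lists and tables; the sketch covers only the tokenizing step the question asks about.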