murugu has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks

I'm currently converting a plain text file to XML. The file has been pre-edited to make the conversion easier.

The text file contains four types of lists.

They are: numbered lists, pre-edited with <nl>; unnumbered lists, with <un>; titled numbered lists, with <tl>; and unnumbered titled lists, with <utl>.

A part of my input file is like this:

<nl> slkjslkjslkjslks slkjslkjslkjslkjs slkslkjslkjsl <utl>lksjlkjslkjs lskjslkjslkjs slkjslkjslkjslkj </utl> <un>lkslskjlkjs <nl>lksjlksj lkjlkjl </nl> </un> </nl>

This is just a single instance of the nesting in the file. There are many such occurrences: any list tag may contain any number of list tags, in any order, including another list of its own type. The above input has to be converted to:

<list type="numbered">
<listitem>slkjslkjslkjslks</listitem>
<listitem>slkjslkjslkjslkjs</listitem>
<listitem>slkslkjslkjsl</listitem>
<listitem><list type="unnumberedtitle"><listitem>lksjlkjslkjs</listitem>
<listitem>lskjslkjslkjs</listitem>
<listitem>slkjslkjslkjslkj</listitem>
</list></listitem><listitem><list type="unnumbered"><listitem>lkslskjlkjs</listitem>
<listitem><list type="numbered"><listitem>lksjlksj</listitem>
<listitem>lkjlkjl</listitem>
</list></listitem>
</list></listitem>
</list>

How should I start converting this text file into a valid XML file?

My English is not that good.

Please give your suggestions.

Thanks in advance.

--Murugesan--

•Re: Recursive lists
by merlyn (Sage) on Mar 23, 2004 at 14:56 UTC
    Looks like a perfect job for Parse::RecDescent or its heir apparent, Perl6::Rules. Using today's technology, the PRD grammar would be something like:
    stuff: nl_list | un_list | tl_list | utl_list | items
    nl_list: m{<nl>} stuff m{</nl>}
      { "<list type='numbered'>$item{stuff}</list>" }
    ... ditto for the other three lists
    items: item(s) { join "", @{$item[1]} }
    item: /\w+/ { "<listitem>$item[1]</listitem>" }
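
For readers who want to see the recursion spelled out without a parser module, here is a minimal hand-written recursive-descent sketch in plain Perl. It is not the Parse::RecDescent grammar above and not part of the original reply. The sub names (tokenize, parse_list, convert) are made up for illustration, and the type names used for <tl> and <utl> are assumptions. Following the desired output in the question, a nested list is wrapped in its own <listitem>.

```perl
use strict;
use warnings;

# Map each pre-edit tag to a value for the type attribute.
# 'numbered' and 'unnumbered' come from the question; the names
# chosen for <tl> and <utl> are assumptions.
my %TYPE = (
    nl  => 'numbered',
    un  => 'unnumbered',
    tl  => 'numberedtitle',
    utl => 'unnumberedtitle',
);

# Split the input into opening tags, closing tags, and plain words.
sub tokenize {
    my ($text) = @_;
    return $text =~ m{(</?(?:nl|un|tl|utl)>|[^\s<>]+)}g;
}

# Parse one list. The first token must be an opening list tag.
# Tokens are consumed from the front of the array reference.
sub parse_list {
    my ($tokens) = @_;
    my $open = shift @$tokens;
    my ($tag) = $open =~ m{^<(\w+)>\z}
        or die "Expected an opening list tag, got '$open'\n";
    my $xml = qq{<list type="$TYPE{$tag}">};
    while (@$tokens) {
        my $tok = $tokens->[0];
        if ($tok eq "</$tag>") {           # this list is finished
            shift @$tokens;
            return "$xml</list>";
        }
        elsif ($tok =~ m{^</}) {           # some other list's closing tag
            die "Mismatched closing tag '$tok' inside <$tag>\n";
        }
        elsif ($tok =~ m{^<}) {            # nested list: recurse
            $xml .= '<listitem>' . parse_list($tokens) . '</listitem>';
        }
        else {                             # plain word: one list item
            shift @$tokens;
            (my $word = $tok) =~ s/&/&amp;/g;
            $xml .= "<listitem>$word</listitem>";
        }
    }
    die "Missing </$tag>\n";
}

# Convert a whole string that consists of one or more top-level lists.
sub convert {
    my ($text) = @_;
    my @tokens = tokenize($text);
    my $xml = '';
    $xml .= parse_list(\@tokens) while @tokens;
    return $xml;
}

print convert('<nl> one two <un>three</un> </nl>'), "\n";
```

Each call to parse_list handles exactly one list and calls itself whenever it meets another opening tag, which is the same idea the grammar expresses by having the list rules refer back to stuff. Badly nested or unclosed tags make the sketch die with a message rather than emit broken XML.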

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      I haven't seen hide nor heir of Perl6::Rules.

          -- Chip Salzenberg, Free-Floating Agent of Chaos

      Thank you very much for your kind reply.

      I am a beginner in Perl. Frankly, I don't understand what you have written in the code here. Can you please explain it to me? If I write code without understanding what it is meant to do, I won't improve my Perl knowledge.

      So please kindly help me understand what the code actually means.

      Thanks in advance

      --Murugesan--

        Parse::RecDescent is an example of a grammar-based parser. You will need to understand something about grammars before you can understand it. I would suggest going to CPAN (www.cpan.org) and reading the extensive documentation provided with Parse::RecDescent before going much further.

        ------
        We are the carpenters and bricklayers of the Information Age.

        Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

        merlyn has given you an example grammar, and specifically stated that he was doing so. To understand the code, you are going to need to study the code and Perl6::Rules, as he stated.