in reply to Regex for simple parsing job

Would have been nice to see your code so we could show you where you are going wrong. But this seems to do what you want.

#!/usr/bin/perl

use strict;
use warnings;

my $data = do { local $/; <DATA> };

my @data = $data =~ /STARTP(.*?)ENDP/sg;

foreach (@data) {
  my @titles = /TITLE(.*?)ENDTITLE/sg;
  $_ = \@titles;
}

foreach my $i (0 .. $#data) {
  print "Block $i\n";
  foreach my $j (0 .. $#{$data[$i]}) {
    print "Title $j:\n$data[$i][$j]\n";
  }
  print "\n";
}

__DATA__
STARTP
TITLE
some gibberish
some more gibberish
ENDTITLE
TITLE
some gibberish
some more gibberish
ENDTITLE
TITLE
some gibberish
some more gibberish
ENDTITLE
ENDP
STARTP
TITLE
some gibberish
some more gibberish
ENDTITLE
TITLE
some gibberish
some more gibberish
ENDTITLE
TITLE
some gibberish
some more gibberish
ENDTITLE
ENDP

--
<http://www.dave.org.uk>

"The first rule of Perl club is you do not talk about Perl club."
-- Chip Salzenberg

Re^2: Regex for simple parsing job
by toadi (Chaplain) on Jul 27, 2004 at 09:39 UTC
    Actually, I was still looking for the /sg switches for the regex. That's why my code didn't match anything.

    But thanks for helping.



    --
    My opinions may have changed,
    but not the fact that I am right
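
For anyone wondering why the missing switches matter, here is a minimal sketch of what each one changes. The sample string is made up for illustration and only reuses the STARTP/ENDP marker names from the thread.

```perl
use strict;
use warnings;

# Hypothetical two-block sample; only the marker names come from the thread.
my $text = "STARTP\nfirst block\nENDP\nSTARTP\nsecond block\nENDP\n";

# No modifiers: '.' does not match "\n", so the pattern cannot span lines.
my @none = $text =~ /STARTP(.*?)ENDP/;

# /s only: '.' also matches "\n", but the match stops after the first block.
my @s_only = $text =~ /STARTP(.*?)ENDP/s;

# /sg: in list context every block is captured.
my @both = $text =~ /STARTP(.*?)ENDP/sg;

printf "none=%d s=%d sg=%d\n", scalar @none, scalar @s_only, scalar @both;
```

With multi-line data like this, only the /sg version returns both blocks, which fits the "didn't match anything" symptom described above.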

Re^2: Regex for simple parsing job
by toadi (Chaplain) on Jul 27, 2004 at 09:53 UTC
    In the file I don't have ENDTITLE as an ending, just TITLE until the next TITLE until the next TITLE. How do I match that?

    Update: Seems split does the job :)



      Won't "split" give you an extra empty title?

      I fixed the second regex like this:

      my @titles = /TITLE(.*?)(?=TITLE|$)/sg;

      Update: regex re-fixed

      --
      <http://www.dave.org.uk>

      "The first rule of Perl club is you do not talk about Perl club."
      -- Chip Salzenberg
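
To make the difference concrete, here is a rough comparison of split against the lookahead regex. The block contents are invented, and the block is assumed to begin directly with TITLE so that the leading field from split is exactly empty.

```perl
use strict;
use warnings;

# Hypothetical block body: TITLE markers only, no ENDTITLE markers.
my $block = "TITLE\nfirst\nTITLE\nsecond\nTITLE\nthird\n";

# split keeps the (empty) text in front of the first TITLE as a leading field.
my @by_split = split /TITLE/, $block;

# The lookahead ends each capture just before the next TITLE, or at the end
# of the string, without consuming the next TITLE marker.
my @by_regex = $block =~ /TITLE(.*?)(?=TITLE|$)/sg;

printf "split=%d regex=%d\n", scalar @by_split, scalar @by_regex;
```

One detail to be aware of: without /m, `$` can also match just before a final newline, so the last capture here comes back without its trailing "\n".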

        Yes, it did. But I just used next when the title was empty. Not very clean, but it does the job :)

        Like I said, I'm no wizard with regexes



        --
        My opinions may have changed,
        but not the fact that I am right
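
If the split approach is kept, one possible way to avoid the explicit next is to filter the fields as they come out of split. This is only a sketch with made-up data; it assumes a title containing nothing but whitespace should be dropped as well.

```perl
use strict;
use warnings;

# Hypothetical block body with leading whitespace before the first TITLE.
my $block = "\nTITLE\nfirst\nTITLE\nsecond\nTITLE\nthird\n";

# Keep only the fields that contain at least one non-whitespace character.
my @titles = grep { /\S/ } split /TITLE/, $block;

printf "titles=%d\n", scalar @titles;
```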