ultranerds has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm trying to do a regex like this:

if ($summary =~ /\Q[[$article_name]]\E(.*)([^[]+)/s) {

Basically, I have a string:

my $summary_test = q|[sommaire France] [[France]] [[Cities of France]] == Paris blabla == Bordeau blabla === Hotels in Bordeaux blabla [[Things to see in France]] [[Things to do in France]] [/sommaire]|;


As a test (example), what I want to do - is match:

[[Cities of France]]
...all content, up to the next [[

The above regex kinda works, apart from the fact that it grabs ALL the content to the end of the string (rather than stopping at the next [[).

I'm a little new to the [^[] format, so I expect it's something stupid I'm doing wrong.

Any pointers would be much appreciated :)

TIA

Andy

Re: Regex to catch UP TO a particular string?
by Ratazong (Monsignor) on Mar 24, 2010 at 14:23 UTC
    Try a non-greedy match (use .*? instead of .*):
    if ($summary_test =~ /\Q[[$article_name]]\E(.*?)\[\[/s) { print $1; }
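A minimal sketch of the difference, run against the sample string from the question. The value of `$article_name` is an assumption for illustration, since the question does not show how it is set. With the greedy `(.*)`, the first group runs to the end of the string and gives back only the one character that `([^[]+)` needs; with `(.*?)\[\[`, the capture stops at the first `[[` after the tag.

```perl
use strict;
use warnings;

# Sample string copied from the question.
my $summary_test = q|[sommaire France] [[France]] [[Cities of France]] == Paris blabla == Bordeau blabla === Hotels in Bordeaux blabla [[Things to see in France]] [[Things to do in France]] [/sommaire]|;

# Assumed for illustration; the question does not show how this is set.
my $article_name = 'Cities of France';

# Greedy: (.*) runs to the end of the string, then backtracks just far
# enough for ([^[]+) to match at least one non-"[" character.
my ($greedy) = $summary_test =~ /\Q[[$article_name]]\E(.*)([^[]+)/s;

# Non-greedy: (.*?) grows one character at a time and stops at the
# first literal "[[" that follows the tag.
my ($lazy) = $summary_test =~ /\Q[[$article_name]]\E(.*?)\[\[/s;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

The non-greedy version only matches when another `[[` follows the tag, so the last tag in a string would need separate handling.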
      Thanks, that doesn't seem to get any results though :/
      my @tags;
      my $i = 0;
      my @split = split /\n/, $summary;
      foreach (@split) {
          my $contents;
          if ($_ !~ /^\s*\t*\[\[(.*)\]\]/) { next; }
          print qq|Tag is: $1 \n|;
          my $article_name = $1;
          if (/\Q[[$article_name]]\E(.*?)\[\[/s) {
              print "got content: $1 \n";
              $contents = $1;
          }
          push @tags, { name => $article_name, contents => $contents };
      }
      use Data::Dumper;
      print Dumper(@tags);

      ..gives:
      $VAR1 = { 'contents' => undef, 'name' => 'France' };
      $VAR2 = { 'contents' => undef, 'name' => 'Cities of France' };
      $VAR3 = { 'contents' => undef, 'name' => 'Things to see in France' };
      $VAR4 = { 'contents' => undef, 'name' => 'Things to do in France' };
      ..any ideas?

      TIA!

      Andy

        In your second post, you split your summary into lines

        my @split = split /\n/, $summary;

        However, your expected result spans multiple lines. Inside the loop, the regex only sees the single line that holds the tag; the content and the next [[ are on later lines, so the match fails and $contents stays undefined.

        Try working on your original data rather than the line-by-line version, e.g. by replacing

        if (/\Q[[$article_name]]\E(.*?)\[\[/s) {

        with

        if ($summary =~ /\Q[[$article_name]]\E(.*?)\[\[/s) {
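As a sketch of the same idea taken one step further, the whole string can be scanned in a single global match instead of splitting it into lines first. The multi-line `$summary` below is hypothetical (the real line breaks are not shown in the thread), and the `(?=\[\[|\z)` lookahead is an addition that is not in the posts above: it lets the last tag match too, at the cost of including any trailing text such as `[/sommaire]` in its contents.

```perl
use strict;
use warnings;

# Hypothetical multi-line version of the sample; the real line breaks
# in $summary are not shown in the thread.
my $summary = join "\n",
    '[sommaire France]',
    '[[France]]',
    '[[Cities of France]]',
    '== Paris',
    'blabla',
    '== Bordeau',
    'blabla',
    '[[Things to see in France]]',
    '[/sommaire]';

my @tags;

# Scan the whole string rather than one line at a time. The /s flag lets
# "." cross newlines; the lookahead stops each capture before the next
# "[[" (or at end of string) without consuming it.
while ($summary =~ /\[\[(.*?)\]\](.*?)(?=\[\[|\z)/gs) {
    push @tags, { name => $1, contents => $2 };
}

for my $tag (@tags) {
    print "Tag is: $tag->{name}\n";
    print "got content: $tag->{contents}\n";
}
```

Because the lookahead does not consume the next `[[`, the following iteration starts right at that tag, so no tag is skipped.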
        Never mind - was me being stupid ;) (loooong day)

        if ($summary =~ /\Q[[$article_name]]\E(.*?)\[\[/s) {

        Thanks :)

        Cheers