
Dear Perl-Monks

I will have a look at the links provided on that page in time, but in the meantime I created a counter-example to verify that my code does indeed yield the results I expect.

Consider the following file:

blabla:(123):falleriefallera dingdong moep blubb 4711 dingdong blob))hop((gob))sob((0815))ding knickknack boing 44
nothing here
blabla:(123):falleriefallera dingdong moep blubb 471 dingdong blob))hop((gob))sob((0815))ding knickknack boing 45
nothing here too
blabla:(1344):falleriefallera dingdong moep blubb 4711 dingdong blob))hop((gob))sob((0815))ding knickknack boing 46
nothing again
blabla:(123):falleriefallera dingdong moep blubb 4711 dingdong blob))hop((gob))sob((0825))ding knickknack boing 47

Accessing it with the following Perl script:

use strict;
use warnings;

# 1. get file and stuff it into an array
#    (that is what it will be in the target code)
open my $fh, '<', 'target.txt' or die "nope dude: $!";
my @stuff;
while (my $line = <$fh>) {
    chomp $line;
    push @stuff, $line;
}
close $fh;
print "reading done\n";

# 2. make a long line out of it
#    because I still have problems using an array for this :(
my $longline = '';
foreach my $x (@stuff) {
    $longline .= $x;
}

# 3. get all matches and place them in an array of arrays
my @super;
while ($longline =~ /\D+(\d+)\D+(\d+)\D+(\d+)\D+(\d+)/g) {
    my @sub = ($1, $2, $3, $4);
    push @super, \@sub;
}

# 4. we should have four entries in @super
print scalar @super, "\n";

will yield this (at least the debugger thinks so):

0  ARRAY(0x1f08820)
   0  123
   1  4711
   2  0815
   3  44
1  ARRAY(0x2199678)
   0  123
   1  471
   2  0815
   3  45
2  ARRAY(0x21994e0)
   0  1344
   1  4711
   2  0815
   3  46
3  ARRAY(0x219f128)
   0  123
   1  4711
   2  0825
   3  47

So it will work the way I hoped, IF I can ever create a valid regex for this. But for now I'm busy looking into these walkthroughs.

By the way: using .+? didn't make the regex work, and I don't understand how [^>] should be used to help me in my case :( Because... I do find the correct piece of plain text in my file, so how should I include "no >" and "no <" inside?

Greetings, a random visitor

Re^3: Regex keep matching the last possible match (but should get all)
by Corion (Patriarch) on May 18, 2015 at 11:05 UTC

    Your example is far more restricted, because a character matched by \D (a non-digit) can never be matched by \d (a digit), and vice versa.
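    A minimal sketch of why that restriction matters, using a made-up string ($text and its contents are assumptions for illustration, not data from the original file). A greedy .+ may also swallow digits, so the capture ends up on the last number, whereas \D+ has to stop in front of the first one:

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the original post.
my $text = 'a 1 b 2 c 3';

# Greedy .+ can consume digits too: it runs to the end of the string and
# gives back only as much as needed, so the capture is the LAST number.
my ($greedy) = $text =~ /.+(\d+)/;

# \D+ cannot consume a digit, so it must stop before the FIRST number.
my ($strict) = $text =~ /\D+(\d+)/;

# With /g in list context, the restricted pattern finds every number in turn.
my @all = $text =~ /\D+(\d+)/g;

print "greedy: $greedy\n";
print "strict: $strict\n";
print "all:    @all\n";
```

    Because \D and \d exclude each other, there is only one way for the restricted pattern to split the string, which is why the counter-example behaves so well.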

    This is why I suggested that you could use [^<]+ for characters within tags or [^>]+ for characters outside of tags. Both match only ordinary characters and never the character that closes (or opens) a tag.
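    As a rough illustration of the negated character class, assuming markup like the invented $html string below (the real input from this thread is not shown here, so the tag name and contents are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical markup, invented for this sketch.
my $html = '<b>one</b><b>two</b><b>three</b>';

# Greedy .+ is allowed to cross tag boundaries, so a single match
# stretches from the first <b> to the last </b>.
my @greedy = $html =~ m{<b>(.+)</b>}g;

# [^<]+ can never match a '<', so each capture has to end where
# the next tag begins, and /g returns one capture per element.
my @texts = $html =~ m{<b>([^<]+)</b>}g;

print scalar(@greedy), " greedy match(es): @greedy\n";
print scalar(@texts),  " restricted match(es): @texts\n";
```

    The idea is the same as with \D and \d: the class is chosen so that it cannot match the character that starts the next piece of the pattern.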