Re: Regex keep matching the last possible match (but should get all)

Re^2: Regex keep matching the last possible match (but should get all) by Anonymous Monk on May 18, 2015 at 09:48 UTC
Speaking of which :) htmltreexpather.pl says `//td[@class='statusOdd']`

The links here are tips :) See Re: Retrieve select information from HTML; they're examples (for tree-xpath and others), walkthroughs, tutorials ...
Re^2: Regex keep matching the last possible match (but should get all) by Anonymous Monk on May 18, 2015 at 10:57 UTC
Dear Perl-Monks,

I will have a look at the links provided on that page in time, but in the meantime I created a counter-example to verify that my code does indeed yield results the way I expect it to.

Consider the following file:

    blabla:(123):falleriefallera
    dingdong moep blubb 4711 dingdong
    blob))hop((gob))sob((0815))ding
    knickknack boing 44
    nothing here
    blabla:(123):falleriefallera
    dingdong moep blubb 471 dingdong
    blob))hop((gob))sob((0815))ding
    knickknack boing 45
    nothing here too
    blabla:(1344):falleriefallera
    dingdong moep blubb 4711 dingdong
    blob))hop((gob))sob((0815))ding
    knickknack boing 46
    nothing again
    blabla:(123):falleriefallera
    dingdong moep blubb 4711 dingdong
    blob))hop((gob))sob((0825))ding
    knickknack boing 47

Access it using the following Perl script:

    use strict;
    use warnings;

    # 1. get the file and stuff it into an array
    #    (that is what it will be in the target code)
    open my $fh, '<', 'target.txt' or die "nope dude: $!";
    my @stuff;
    while (my $line = <$fh>) {
        chomp $line;
        push @stuff, $line;
    }
    close $fh;
    print "reading done\n";

    # 2. make one long line out of it
    #    (because I still have problems using an array for this :( )
    my $longline = '';
    foreach my $x (@stuff) {
        $longline .= $x;
    }

    # 3. get all matches and place them in an array of arrays
    my @super;
    while ($longline =~ /\D+(\d+)\D+(\d+)\D+(\d+)\D+(\d+)/g) {
        my @sub = ($1, $2, $3, $4);
        push @super, \@sub;
    }

    # 4. we should have four entries in @super
    print scalar @super, "\n";

This yields the following (at least the debugger thinks so):

    0  ARRAY(0x1f08820)
       0  123
       1  4711
       2  0815
       3  44
    1  ARRAY(0x2199678)
       0  123
       1  471
       2  0815
       3  45
    2  ARRAY(0x21994e0)
       0  1344
       1  4711
       2  0815
       3  46
    3  ARRAY(0x219f128)
       0  123
       1  4711
       2  0825
       3  47

So it will work the way I hoped for, IF I can ever create a valid regex for the real data. But for now I'm busy looking into these walkthroughs.

By the way: using `.+?` didn't make the regex work, but I don't understand how `[^>]` should be used to help me in my case :( I do find the correct piece of plain text in my file, so how should I include "no >" and "no <" inside?

Greetings, a random visitor
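For the "still have problems using an array" part of the post above, here is a minimal sketch of the same approach with `join` replacing the concatenation loop and an anonymous array per match. The helper name `extract_records` is hypothetical, and the two sample records are inlined with guessed line breaks so the snippet runs without `target.txt`.

```perl
use strict;
use warnings;

# Hypothetical helper: joins the lines into one string and returns
# one array reference (four numbers) per match.
sub extract_records {
    my @lines    = @_;
    my $longline = join '', @lines;    # same effect as the .= loop
    my @records;
    while ($longline =~ /\D+(\d+)\D+(\d+)\D+(\d+)\D+(\d+)/g) {
        push @records, [ $1, $2, $3, $4 ];    # fresh anonymous array per match
    }
    return @records;
}

# Two records from the sample file, inlined (line breaks are an assumption).
my @stuff = (
    'blabla:(123):falleriefallera',
    'dingdong moep blubb 4711 dingdong',
    'blob))hop((gob))sob((0815))ding',
    'knickknack boing 44',
    'nothing here',
    'blabla:(123):falleriefallera',
    'dingdong moep blubb 471 dingdong',
    'blob))hop((gob))sob((0815))ding',
    'knickknack boing 45',
);

my @super = extract_records(@stuff);
print scalar(@super), "\n";
print join(',', @$_), "\n" for @super;
```

This only works because `\D+` and `\d+` cannot overlap; it says nothing yet about the real HTML input, where the separators are not that clean.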
Re^3: Regex keep matching the last possible match (but should get all) by Corion (Patriarch) on May 18, 2015 at 11:05 UTC
Your example is far more restricted, because `\D` (a non-digit) and `\d` (a digit) are mutually exclusive: a `\D+` can never swallow a digit, and a `\d+` can never swallow a non-digit. This is why I suggested that you could use `[^<]+` for characters within tags or `[^>]+` for characters outside of tags. Both only match ordinary characters and can never run across the character that closes (or opens) a tag.
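To make the difference concrete, here is a small sketch comparing a greedy `.+` with a negated character class. The HTML fragment is hypothetical (the real page from the question is not shown in this thread); only the `statusOdd` class name is taken from the XPath mentioned earlier.

```perl
use strict;
use warnings;

# Hypothetical HTML fragment, made up for illustration.
my $html = "<tr><td class='statusOdd'>alpha</td>"
         . "<td class='statusOdd'>beta</td></tr>";

# Greedy .+ runs on to the LAST </td>, so both cells end up in one match.
my @greedy = $html =~ /<td[^>]*>(.+)<\/td>/g;

# [^<]+ cannot cross a '<', so each match stops at the next tag.
my @cells = $html =~ /<td[^>]*>([^<]+)<\/td>/g;

print scalar(@greedy), " greedy match: $greedy[0]\n";
print scalar(@cells), " cells: ", join('|', @cells), "\n";
```

Here `[^>]*` keeps the pattern inside the opening tag, and `[^<]+` keeps the capture inside the cell text. For real pages a parser, as suggested in the links above, is still the more robust route.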