PerlMonks  

Stumped Regular expressions

by Anonymous Monk
on Jun 10, 2008 at 09:57 UTC ( [id://691191]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks, I'm trying to match all instances of "http://hosting/image1.jpg http://hosting/image1.jpg http://hosting/image4.jpg" in a single string, but I can't seem to get this to work. Any advice? Thanks!
#!/usr/bin/perl
use strict;
use warnings;

open(F, "test.htm");
while (<F>) {
    my @a = ($_ =~ /(http:\/\/hosting.+\.rar)/g);
    foreach (@a) {
        chomp;
        print "$_\n";
    }
}
close(F);

Replies are listed 'Best First'.
Re: Stumped Regular expressions
by moritz (Cardinal) on Jun 10, 2008 at 10:11 UTC
    If the URLs have the structure you described above, they will not match \.rar, because they end in .jpg. If you have slashes in the regex, choose a different delimiter to avoid the backslashes: m{(http://hosting/image\d+\.jpg)}
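
To illustrate the delimiter point, here is a minimal sketch comparing the two spellings of the same pattern. The sample line is hypothetical, modelled on the URLs in the question; it is not taken from the poster's test.htm.

```perl
use strict;
use warnings;

# Hypothetical sample line, modelled on the URLs in the question.
my $line = 'http://hosting/image1.jpg http://hosting/image2.jpg http://hosting/image4.jpg';

# Slash delimiters: every / inside the pattern has to be escaped.
my @escaped = $line =~ /(http:\/\/hosting\/image\d+\.jpg)/g;

# Brace delimiters: the same pattern, with no escaping of / needed.
my @braced = $line =~ m{(http://hosting/image\d+\.jpg)}g;

print "$_\n" for @braced;
print "both forms found the same URLs\n" if "@escaped" eq "@braced";
```

Both forms compile to the same regex; the braces only change how the pattern is written, which tends to make URL patterns easier to read.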

    If you are looking for URLs in general, consider Regexp::Common::URI.

Re: Stumped Regular expressions
by psini (Deacon) on Jun 10, 2008 at 10:31 UTC

    Moreover, your .+ is greedy, so your regex will match at most one substring per string, starting at the first http://hosting and ending at the last .rar. If you want to catch each occurrence individually, use the non-greedy form (i.e. .+?).
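
A small sketch of the difference, assuming a made-up line with three .rar links (the file names a, b and c are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical input: three .rar links on a single line.
my $line = 'http://hosting/a.rar http://hosting/b.rar http://hosting/c.rar';

# Greedy: .+ runs on to the last ".rar", so the whole line is one match.
my @greedy = $line =~ m{(http://hosting.+\.rar)}g;

# Non-greedy: .+? stops at the first ".rar" it can reach, one match per URL.
my @lazy = $line =~ m{(http://hosting.+?\.rar)}g;

printf "greedy:     %d match(es)\n", scalar @greedy;
printf "non-greedy: %d match(es)\n", scalar @lazy;
print "  $_\n" for @lazy;
```

Note that .+? still matches any character, so on messier input a stricter class such as \S+? may be a safer choice; that is a suggestion on top of the advice above, not something the original post requires.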

    Careful with that hash Eugene.

Re: Stumped Regular expressions
by Grey Fox (Chaplain) on Jun 10, 2008 at 20:28 UTC
Re: Stumped Regular expressions
by kabeldag (Hermit) on Jun 11, 2008 at 13:48 UTC
    I stuck a 'global' modifier on the end:
    use strict;
    use warnings;

    my $line_count = 0;
    open(F, "test.htm");
    while (<F>) {
        # Count first, so that the first line is reported as line 1.
        $line_count++;
        my @a = $_ =~ m{(http://hosting/image\d+\.jpg)}g;
        for my $url (@a) {
            print "URL [ $url ] found on line ($line_count)\n";
        }
    }
    close(F);
      This could be further simplified by using the special variable $. (INPUT_LINE_NUMBER) instead of $line_count:
      open(F, "test.htm");
      while (<F>) {
          my @a = $_ =~ m{(http://hosting/image\d+\.jpg)}g;
          for my $url (@a) {
              print "URL [ $url ] found on line ($.)\n";
          }
      }
      close(F);
