awohld has asked for the wisdom of the Perl Monks concerning the following question:

I have a string in a large text file where I need to get out some data:
oCMenu.makeMenu('m41','m40','Periodic Reports','index.cfm?openaction=file_archive.view&CONTENT_ID=BF764E84-ED11-EC57-40672E520082E823','',125,20)
And I need to get the:
index.cfm?openaction=file_archive.view&CONTENT_ID=BF764E84-ED11-EC57-40672E520082E823
string out of there.

I identify the start of the match by Periodic Reports',' and grab whatever is in between that and the next single quote, i.e.
"index.cfm?openaction=file_archive.view&CONTENT_ID=BF764E84-ED11-EC57-40672E520082E823".
I built the look behind to start the match with Periodic Reports',' and end with a '

How do I get this look behind to work?

#!/usr/bin/perl -w
use strict;

my $string = qq{oCMenu.makeMenu('m40','','Files','','',100,20) oCMenu.makeMenu('m41','m40','Periodic Reports','index.cfm?openaction=file_archive.view&CONTENT_ID=BF764E84-ED11-EC57-40672E520082E823','',125,20) oCMenu.makeMenu('m53','m40','Reference Documents','index.cfm?openaction=file_archive.view&CONTENT_ID=BF76E870-E7CC-515F-D4D5C3D4A210BB9A','',125,20)};

$string =~ /Periodic Reports','(?<!.*'$)/;

# Want to see index.cfm?openaction=file_archive.view&CONTENT_ID=BF764E84-ED11-EC57-40672E520082E823 here
print $&;

Re: RegEx Help - Look Behind
by McDarren (Abbot) on Dec 03, 2005 at 06:03 UTC
    I understand this doesn't directly answer your question, but if every line of your data is formatted the same way, an alternative (and probably simpler) approach is to use split. For example:

    #!/usr/bin/perl -w
    use strict;

    while (<DATA>) {
        # Split on the ',' separators; the URL is the fourth field.
        my $wanted_string = (split /','/)[3];
        print "$wanted_string\n";
    }

    __DATA__
    oCMenu.makeMenu('m41','m40','Periodic Reports','index.cfm?openaction=file_archive.view&CONTENT_ID=BF764E84-ED11-EC57-40672E520082E823','',125,20)
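
    When the input holds several makeMenu calls, the split approach above can also check the label field before taking the URL. This is only a sketch: the helper name url_for_label and the two sample lines are assumptions made for illustration, and it assumes one call per line with no single quotes inside any field.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the URL field of the
# first makeMenu call whose label field equals $label, or nothing if no
# line matches.
sub url_for_label {
    my ($label, @lines) = @_;
    for my $line (@lines) {
        # After splitting on ',' the fields are:
        # 0 = "oCMenu.makeMenu('id", 1 = parent id, 2 = label, 3 = URL
        my @fields = split /','/, $line;
        next unless @fields > 3 && $fields[2] eq $label;
        return $fields[3];
    }
    return;
}

# Sample lines taken from the question, one call per line.
my @lines = (
    q{oCMenu.makeMenu('m40','','Files','','',100,20)},
    q{oCMenu.makeMenu('m41','m40','Periodic Reports','index.cfm?openaction=file_archive.view&CONTENT_ID=BF764E84-ED11-EC57-40672E520082E823','',125,20)},
);

my $url = url_for_label('Periodic Reports', @lines);
print defined $url ? "$url\n" : "no match\n";
```

    Comparing the label with eq avoids picking up the fourth field of unrelated lines, which the plain loop would print as well.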

    Just offering another way to do it....

    Cheers,
    Darren :)

Re: RegEx Help - Look Behind
by prasadbabu (Prior) on Dec 03, 2005 at 06:00 UTC

    First of all, a negative lookbehind cannot be variable length, and yours is, because it contains .*

    If I understood your question correctly, this will work:

    my $string = qq{oCMenu.makeMenu('m40','','Files','','',100,20) oCMenu.makeMenu('m41','m40','Periodic Reports','index.cfm?openaction=file_archive.view&CONTENT_ID=BF764E84-ED11-EC57-40672E520082E823','',125,20) oCMenu.makeMenu('m53','m40','Reference Documents','index.cfm?openaction=file_archive.view&CONTENT_ID=BF76E870-E7CC-515F-D4D5C3D4A210BB9A','',125,20)};

    my ($match) = $string =~ /Periodic Reports','([^']*)/;
    print "$match\n";

    #$string =~ /Periodic Reports','(?<!.*'$)/; #not matches
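
    The variable-length restriction can be checked by compiling both kinds of lookbehind at run time. This is a sketch only; the helper name compiles is made up for this example. Perl 5.30 and later experimentally accept some bounded variable-length lookbehinds, but as far as I know an unbounded .* is still rejected there too.

```perl
use strict;
use warnings;

# Hypothetical helper: try to compile a pattern at run time and report
# whether the regex engine accepted it (1) or died with an error (0).
sub compiles {
    my ($pattern) = @_;
    my $re = eval { qr/$pattern/ };
    return defined $re ? 1 : 0;
}

# Single-quoted so that nothing is interpolated before compilation.
my $unbounded = q{(?<!.*'$)};                 # .* has no fixed length
my $fixed     = q{(?<=Periodic Reports',')};  # a literal, so fixed length

printf "%-28s %s\n", $_, compiles($_) ? 'compiles' : 'rejected'
    for $unbounded, $fixed;
```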

    Also take a look at perlre

    Prasad

Re: RegEx Help - Look Behind
by serf (Chaplain) on Dec 03, 2005 at 08:21 UTC
    The gem from the perlre page (which prasadbabu pointed you to) is its warning to stay away from $& and its siblings $` and $'.

    As I understand it, this means that from the point you invoke one of those evil three, your program will cache (in your precious RAM) every text match it makes while the program is running. If you're parsing a large amount of data, that's *nasty*!

    There is no need to use $& in this case (and I don't recall ever finding a case that did need it). What you want is a specific part of the match, and for that you can use round brackets, like this:

    #!/usr/bin/perl -w
    use strict;

    my $string = qq{oCMenu.makeMenu('m40','','Files','','',100,20) oCMenu.makeMenu('m41','m40','Periodic Reports','index.cfm?openaction=file_archive.view&CONTENT_ID=BF764E84-ED11-EC57-40672E520082E823','',125,20) oCMenu.makeMenu('m53','m40','Reference Documents','index.cfm?openaction=file_archive.view&CONTENT_ID=BF76E870-E7CC-515F-D4D5C3D4A210BB9A','',125,20)};

    $string =~ /Periodic Reports','([^']+)'/;

    print "Want to see:\n",
          "index.cfm?openaction=file_archive.view&CONTENT_ID=BF764E84-ED11-EC57-40672E520082E823",
          "\n$1\n";

    Here $1 returns the portion matched by the first () in the pattern ($2 returns the second, and so on).
      I've had a different interpretation than serf on this. I believe that little of the "precious RAM" is used (unless the matches are large), since there are only three such variables: $&, $' and $`. I've always thought the cost was more the slight extra CPU time (hence the "can slow your program down"). I think the authors were just trying to be truthful and tell you what is going on, and that the part to pay attention to is the sentence that was cut off:

      So avoid $&, $', and $` if you can, but if you can't (and some algorithms really appreciate them), once you've used them once, use them at will, because you've already paid the price. As of 5.005, $& is not so costly as the other two.
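
      For anyone who wants the whole matched text without mentioning $& at all, here is a sketch of two alternatives. The helper name whole_match and the shortened sample string are assumptions for illustration. The first approach uses the offsets in @- and @+; the second uses the /p flag and ${^MATCH}, which need Perl 5.10 or later and so postdate this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the whole matched text without touching $&,
# using the start and end offsets Perl records in $-[0] and $+[0].
sub whole_match {
    my ($string, $regex) = @_;
    return unless $string =~ $regex;
    return substr($string, $-[0], $+[0] - $-[0]);
}

# Shortened, made-up sample in the same shape as the data in the question.
my $sample = q{'Periodic Reports','index.cfm?x=1','',125,20)};

my $whole = whole_match($sample, qr/Reports','[^']*/);
print "$whole\n";

# Perl 5.10+ only: /p makes the same text available as ${^MATCH}.
if ($sample =~ /Reports','[^']*/p) {
    print ${^MATCH}, "\n";
}
```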
Re: RegEx Help - Look Behind
by bart (Canon) on Dec 03, 2005 at 12:57 UTC
    You seem to misunderstand what lookbehind is for. What's more, you don't even need it. You wrote:
    $string =~ /Periodic Reports','(?<!.*'$)/;
    Just change this to
    $string =~ /Periodic Reports','(.*?)'/;
    and now you'll match what's in the single quotes after "Periodic Reports" (etc.).

    If you insist on using lookbehind, use it for what it's for: "Does what I match now follow something looking like this?"

    $string =~ /(?<=Periodic Reports',)'(.*?)'/;
    Read it as: "match what's between single quotes, but it has to come after the string Periodic Reports',".
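
    If the goal is for the overall match itself to be just the URL (which is what the original print $& was after), the lookbehind can be paired with a lookahead so that neither the label nor the quotes are consumed. A minimal sketch, assuming the URL contains no single quote; the helper name url_after_periodic_reports is made up for this example.

```perl
use strict;
use warnings;

my $string = q{oCMenu.makeMenu('m41','m40','Periodic Reports','index.cfm?openaction=file_archive.view&CONTENT_ID=BF764E84-ED11-EC57-40672E520082E823','',125,20)};

# Hypothetical helper: return only the text between the quotes that follow
# the Periodic Reports label, or nothing if the label is not found.
sub url_after_periodic_reports {
    my ($text) = @_;
    # (?<=...) requires the literal label and opening quote just before
    # the match; (?=') requires a closing quote just after it. Neither
    # assertion is part of the matched text.
    return unless $text =~ /(?<=Periodic Reports',')[^']*(?=')/;
    # $-[0] and $+[0] are the start and end offsets of the whole match.
    return substr($text, $-[0], $+[0] - $-[0]);
}

my $url = url_after_periodic_reports($string);
print defined $url ? "$url\n" : "no match\n";
```

    The lookbehind works here because its contents are a fixed-length literal, unlike the .* in the original attempt.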
Re: RegEx Help - Look Behind
by TedPride (Priest) on Dec 03, 2005 at 12:52 UTC
    Or even:
    use strict;
    use warnings;

    my $string = qq{oCMenu.makeMenu('m40','','Files','','',100,20) oCMenu.makeMenu('m41','m40','Periodic Reports','index.cfm?openaction=file_archive.view&CONTENT_ID=BF764E84-ED11-EC57-40672E520082E823','',125,20) oCMenu.makeMenu('m53','m40','Reference Documents','index.cfm?openaction=file_archive.view&CONTENT_ID=BF76E870-E7CC-515F-D4D5C3D4A210BB9A','',125,20)};

    $string =~ /Periodic Reports','(.*?)'/;
    print $1;