john.tm has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I have a script that pulls regex matches from a report. I then wish to reference these matches outside of the loop, but I do not know if it is possible to assign the matches to a variable.

As you can see in my example report, the book ref number and title are matched. Whichever report the script is executed on, there will only ever be one result for each pattern (book ref no. and title), as each report has the same format. I was hoping to assign each match to $book and $title for later reference.

#!/usr/bin/perl
use strict;
use warnings;

my $book= "^\\s\*book ref \#";
my $book_res = $book =~ "/^\\s\*owner \#/";
my $title = "^\\s\*title ";
my $title_res = $title =~ "/^\\s\*title /";

foreach (<DATA>) {
    next unless /$book|$title/ip;
    print ;
}

# here I would like to access the regex matches as scalars

__DATA__
Book ref #4346
Lent: Sun Jul 12 03:26:43 BST 2015
status Lent
Description: classic
title blah blah blah
last used: 2
color red
Pages 238
Publisher Bca
Type Hardback
Location: N/a
Author R jones

Resulting matches:

Book ref #4346
title blah blah blah

Replies are listed 'Best First'.
Re: Perl store as variables regex matches for use outside of loop.
by 1nickt (Canon) on Jul 12, 2015 at 15:12 UTC

    You just need to capture the output of the regexp match, which Perl makes available in $1.

    See Extracting matches.

    (Update: added link to docs)

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $book= "^\\s\*book ref \#";
    my $book_res = $book =~ "/^\\s\*owner \#/";
    my $title = "^\\s\*title ";
    my $title_res = $title =~ "/^\\s\*title /";

    my %keepers;

    foreach (<DATA>) {
        #next unless /$book|$title/ip;
        chomp;
        if ( /($book|$title)/ip ) {
            $keepers{ $_ } = $1;
        }
        #print ;
    }

    # here I would like to access the regex matches as scalars
    for (keys %keepers ) {
        print "Here's your scalar: <<$_>> has value <<$keepers{ $_ }>>\n";
    }

    __DATA__
    Book ref #4346
    Lent: Sun Jul 12 03:26:43 BST 2015
    status Lent
    Description: classic
    title blah blah blah
    last used: 2
    color red
    Pages 238
    Publisher Bca
    Type Hardback
    Location: N/a
    Author R jones

    Remember: Ne dederis in spiritu molere illegitimi!
Re: Perl store as variables regex matches for use outside of loop.
by stevieb (Canon) on Jul 12, 2015 at 15:15 UTC

    Commonly, we use hashes for this sort of work, where you want a key (the book ref) that holds information about the thing (the book's title). In this specific case, I've used a hash of hashes (HoH) and regex capture groups (parentheses, ()) to grab the relevant parts of each match for storage. Here's a basic example. It assumes that the book's ref always comes first, i.e. above the line containing the title and any other items you want to store.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my %books;
    my $book_ref;

    while (<DATA>) {
        chomp;
        if (/^\s*book ref #(\d+)/i and $1){
            $book_ref = $1;
            $books{$book_ref} = {};
        }
        if (/^title\s+(.*)$/i and $1){
            $books{$book_ref}{Title} = $1;
        }
    }

    # print the whole shebang
    for my $ref (keys %books){
        # dereference the inner hash to list its fields
        for my $book_element (keys %{ $books{$ref} }){
            print "Book ref: $ref, $book_element: $books{$ref}{$book_element}\n";
        }
    }

    # print one of the book's titles
    print "$books{9969}{Title}\n";

    __DATA__
    Book ref #4346
    Lent: Sun Jul 12 03:26:43 BST 2015
    status Lent
    Description: classic
    title blah blah blah
    last used: 2
    color red
    Pages 238
    Publisher Bca
    Type Hardback
    Location: N/a
    Author R jones
    Book ref #9969
    Lent: Sun Jul 12 03:26:43 BST 2015
    status Lent
    Description: classic
    title My Little Pony
    last used: 2
    color red
    Pages 238
    Publisher Bca
    Type Hardback
    Location: N/a
    Author R jones
    __END__

    Book ref: 4346, Title: blah blah blah
    Book ref: 9969, Title: My Little Pony
    My Little Pony
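
    One caveat when reading from a nested hash like this: interpolating $books{$ref}{Title} for a ref that was never seen quietly creates an empty entry for that ref (autovivification) and, under warnings, complains about an uninitialized value. A minimal sketch of a guarded lookup follows. The hand-built %books stands in for the parsed data, and title_for is a hypothetical helper name, not something from the post above.

    ```perl
    use strict;
    use warnings;

    # Hand-built stand-in for the hash of hashes produced by the parsing loop.
    my %books = (
        4346 => { Title => 'blah blah blah' },
        9969 => { Title => 'My Little Pony' },
    );

    # title_for is a hypothetical helper: it returns the title, or undef when
    # the ref is unknown, without creating a new entry in the hash.
    sub title_for {
        my ($books, $ref) = @_;
        return unless exists $books->{$ref};
        return $books->{$ref}{Title};
    }

    for my $ref (9969, 1234) {
        my $title = title_for(\%books, $ref);
        print defined($title) ? "$ref: $title\n" : "$ref: not in the report\n";
    }
    ```

    Checking with exists before descending keeps %books limited to the refs that were actually parsed.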

    -stevieb

Re: Perl store as variables regex matches for use outside of loop.
by Laurent_R (Canon) on Jul 12, 2015 at 15:55 UTC
    Hi,

    you've been given good answers (use of $1 for retrieving the capture, etc.); I would just add a couple of comments on your code.

    If I understand what you are trying to do, your regex matching attempts should be within the loop over the DATA section, not before.

    Then, for a regex to capture what it matches or part thereof, you need to use parentheses around the part of the match that you want to capture.

    Finally, and less importantly, you might want to consider using the qr// operator (see http://perldoc.perl.org/perlop.html#Quote-and-Quote-like-Operators) rather than simple quote marks for defining your regex patterns.

    Putting it together, you might end up with something like this:

    my $book_regex = qr/^Book ref #(\d+)/;   # (\d+) will capture the ref into $1
    my ($ref, $title);                       # plus any other fields you need
    while (<DATA>) {
        # you may need to chomp the lines
        $ref = $1 if /$book_regex/;
        $title = $1 if /^title\s+(.*)/;
        # other regexes for title, etc.
    }
    # now you can use $ref and $title

    This will work if you are looking at only one book at a time. As mentioned previously by stevieb, you'll probably want to use a hash of hashes if you need to look at several books and store the results for further use.
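
    As a self-contained check of the single-book approach, here is a minimal sketch that reads the report from an in-memory string rather than DATA. The sample lines are copied from the original post; the $report and $fh names, the [0-9] class, and the added /i flags are assumptions made for illustration.

    ```perl
    use strict;
    use warnings;

    # Sample report lines from the original post, held in a string so the
    # sketch does not depend on a __DATA__ section (an illustrative choice).
    my $report = join "\n",
        'Book ref #4346',
        'Lent: Sun Jul 12 03:26:43 BST 2015',
        'status Lent',
        'Description: classic',
        'title blah blah blah',
        'Pages 238',
        '';

    my $book_regex = qr/^\s*book ref #([0-9]+)/i;   # group 1: the ref number
    my ($ref, $title);                              # declared outside the loop

    # Open the string as a read-only in-memory file.
    open my $fh, '<', \$report or die "Cannot open in-memory report: $!";
    while (my $line = <$fh>) {
        chomp $line;
        $ref   = $1 if $line =~ $book_regex;
        $title = $1 if $line =~ /^\s*title\s+(.*)/i;
    }
    close $fh;

    # Both scalars are still set here, after the loop has finished.
    print "ref=$ref title=$title\n";
    ```

    With this sample the last line prints ref=4346 title=blah blah blah. If a report lacked one of the two lines, the corresponding scalar would stay undefined, so a defined check before use would be sensible.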

      Regarding matching numbers ... I have recently gone back to using the old-school

      /[0-9]+/

      because of (in Using character classes)

      Since the introduction of Unicode, unless the //a modifier is in effect, these character classes match more than just a few characters in the ASCII range.
      \d matches a digit, not just [0-9] but also digits from non-roman scripts

      There are a lot of characters flying around out there. But some stuff still needs to be limited to ASCII -- for example book ref numbers, even if your book titles may have non-Roman characters; or a primary key from a database table in a multilingual CMS ...

      So if you want to make sure you are matching only Roman script digits, and you need to work on perls older than 5.14, you should probably use the old [0-9]. If you have a new perl, you can do:

      /\d+/a

      which will limit the match to ASCII characters ...
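
      A minimal sketch of the difference, assuming perl 5.14 or newer so that /a is available. The sample strings are made up for illustration; the second one spells 4346 with Arabic-Indic digits (U+0664, U+0663, U+0664, U+0666).

      ```perl
      use strict;
      use warnings;

      my %samples = (
          ascii  => '4346',
          # The same number in Arabic-Indic digits, written with \x{...}
          # escapes so the source file itself stays pure ASCII.
          arabic => "\x{0664}\x{0663}\x{0664}\x{0666}",
      );

      my %matched;
      for my $name (sort keys %samples) {
          my $s = $samples{$name};
          $matched{$name} = {
              d       => ($s =~ /^\d+$/    ? 1 : 0),   # Unicode-aware \d
              d_ascii => ($s =~ /^\d+$/a   ? 1 : 0),   # \d restricted by /a
              range   => ($s =~ /^[0-9]+$/ ? 1 : 0),   # explicit ASCII range
          };
          printf "%-6s \\d=%d  \\d with /a=%d  [0-9]=%d\n",
              $name, @{ $matched{$name} }{qw(d d_ascii range)};
      }
      ```

      The ASCII sample passes all three tests, while the Arabic-Indic sample is accepted only by the plain \d.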

      Remember: Ne dederis in spiritu molere illegitimi!
        Yes, 1nickt, you're absolutely right, we have to be quite careful about these things nowadays. I am not dealing too often with Unicode or UTF-8 data, and mostly with pure ASCII or sometimes with an extended ASCII Latin character set, so that I don't always think about the possibility of character classes matching more than I usually expect. Thank you for the reminder.

        Having said that, I was not presenting production code, not even a complete working program, but just trying to identify some of the errors in the OP code and show some ways to correct them.

Re: Perl store as variables regex matches for use outside of loop.
by davido (Cardinal) on Jul 12, 2015 at 16:52 UTC

    my @matches;   # declared before the loop so it is still in scope afterwards
    foreach (<DATA>) {
        my $re = qr/($book|$title)/;
        next unless /$re/;
        push @matches, [$1, $_, $re];
        print;
    }

    This saves the portion of the string that was captured, the string that matched, and the regexp that succeeded.
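
    A self-contained sketch of how the saved matches can be read back after the loop. The @lines sample and the two qr// patterns are assumptions adapted from the original post, and the combined pattern is compiled once before the loop here rather than on every iteration.

    ```perl
    use strict;
    use warnings;

    # Patterns adapted from the original post (assumed here for illustration).
    my $book  = qr/^\s*book ref #/i;
    my $title = qr/^\s*title /i;
    my $re    = qr/($book|$title)/;   # compiled once, outside the loop

    my @lines = (
        'Book ref #4346',
        'status Lent',
        'title blah blah blah',
    );

    my @matches;
    for my $line (@lines) {
        next unless $line =~ $re;
        push @matches, [ $1, $line, $re ];   # captured text, whole line, pattern
    }

    # After the loop, each saved match is an array reference.
    for my $match (@matches) {
        my ($captured, $line) = @$match;
        print "captured <<$captured>> from line <<$line>>\n";
    }
    ```

    Storing an array reference per match keeps the captured text and the line it came from together, which is convenient when more than one pattern can succeed.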


    Dave