Re: Perl store as variables regex matches for use outside of loop.

Hi,

You've already been given good answers (use of $1 for retrieving the capture, etc.); I would just add a couple of comments on your code.

If I understand what you are trying to do, your regex matching attempts should be within the loop over the DATA section, not before.

Then, for a regex to capture what it matches or part thereof, you need to use parentheses around the part of the match that you want to capture.
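
As a small illustration of the point about parentheses, here is a minimal sketch. The sample line is made up for the example and is not taken from your data:

```perl
use strict;
use warnings;

my $line = 'Book ref #1234';   # hypothetical sample line

# Without parentheses the match succeeds, but nothing is captured.
my $matched = $line =~ /^Book ref #\d+/ ? 1 : 0;

# With parentheses, the digits land in $1 (only meaningful after a successful match).
my $ref;
$ref = $1 if $line =~ /^Book ref #(\d+)/;

print "matched: $matched, ref: $ref\n";
```

Note that $1 is only tested and copied after a successful match, which is why the assignment is guarded by the `if`.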

Finally, and less importantly, you might want to consider using the qr// operator (see http://perldoc.perl.org/perlop.html#Quote-and-Quote-like-Operators) rather than simple quote marks for defining your regex patterns.
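
To show what qr// buys you over a plain string, here is a rough sketch. The sample line and the variable names other than $book_regex are assumptions made for the example:

```perl
use strict;
use warnings;

# qr// compiles the pattern and returns a regex object.
my $book_regex = qr/^Book ref #(\d+)/;

# With a plain string, backslashes need care: in single quotes '\d' survives,
# but in double quotes "\d" would lose its backslash.
my $book_string = '^Book ref #(\d+)';

my $line = 'Book ref #42 (second edition)';   # hypothetical sample line

my ($from_qr)     = $line =~ $book_regex;      # list context returns the captures
my ($from_string) = $line =~ /$book_string/;

# A qr// object can also be embedded in a larger pattern.
my ($ref, $rest) = $line =~ /$book_regex\s+(.*)/;

print "qr: $from_qr, string: $from_string, ref: $ref, rest: $rest\n";
```

Both forms match here; the qr// version simply avoids the quoting pitfalls and can be reused or combined with other patterns.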

Putting it together, you might end up with something like this:

my $book_regex = qr/^Book ref #(\d+)/; # (\d+) will capture the ref into $1
my ($ref, $title);                     # declare any other fields here too
while (<DATA>) {
    # you may need to chomp the lines
    $ref = $1 if /$book_regex/;
    $title = $1 if /^title\s+(.*)/;
    # other regexes for the remaining fields
}
# now you can use $ref and $title

This will work if you are looking at only one book at a time. As mentioned previously by stevieb, you'll probably want to use a hash of hashes if you need to look at several books and store the results for further use.
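
For the several-books case, a hash of hashes might look roughly like the sketch below. The input format (a "Book ref #N" line followed by a "title ..." line) and the names %books and $current_ref are assumptions on my part, since I don't know exactly what your data looks like; the sketch also reads from an in-memory filehandle instead of DATA so that it is self-contained:

```perl
use strict;
use warnings;

# Hypothetical input; the real DATA section may look different.
my $input = <<'END';
Book ref #101
title Programming Perl
Book ref #102
title Learning Perl
END

my $book_regex = qr/^Book ref #(\d+)/;
my (%books, $current_ref);

open my $fh, '<', \$input or die "Cannot open in-memory file: $!";
while (my $line = <$fh>) {
    chomp $line;
    if ($line =~ $book_regex) {
        $current_ref = $1;                 # start of a new book record
    }
    elsif (defined $current_ref and $line =~ /^title\s+(.*)/) {
        $books{$current_ref}{title} = $1;  # stored under the current book
    }
}
close $fh;

# All books remain available after the loop.
for my $ref (sort { $a <=> $b } keys %books) {
    print "$ref: $books{$ref}{title}\n";
}
```

Keying the outer hash on the book ref means each later line is attached to whichever book was seen most recently, and nothing is overwritten when the next book starts.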

Re^2: Perl store as variables regex matches for use outside of loop. by 1nickt (Canon) on Jul 12, 2015 at 16:26 UTC
Regarding matching numbers ... I have recently gone back to using the old-school `/[0-9]+/` because of this (in Using character classes):

"Since the introduction of Unicode, unless the //a modifier is in effect, these character classes match more than just a few characters in the ASCII range. \d matches a digit, not just [0-9] but also digits from non-roman scripts"

There are a lot of characters flying around out there. But some stuff still needs to be limited to ASCII -- for example book ref numbers, even if your book titles may have non-Roman characters; or a primary key from a database table in a multilingual CMS ...

So if you want to make sure you are matching only Roman script digits, and you need to work on perls older than 5.14, you should probably use the old `[0-9]`. If you have a new perl, you can do `/\d+/a`, which will limit the match to ASCII characters ...

Remember: Ne dederis in spiritu molere illegitimi!
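
A quick way to see the difference is a sketch like the following. The test string of Arabic-Indic digits is just an example chosen here, and the variable names are made up:

```perl
use 5.014;   # the /a modifier needs perl 5.14 or later
use strict;
use warnings;

my $arabic = "\x{0661}\x{0662}\x{0663}";   # ARABIC-INDIC DIGITS ONE, TWO, THREE

my $d_matches     = $arabic =~ /^\d+$/    ? 1 : 0;  # \d follows Unicode rules here
my $class_matches = $arabic =~ /^[0-9]+$/ ? 1 : 0;  # explicit ASCII range
my $a_matches     = $arabic =~ /^\d+$/a   ? 1 : 0;  # /a restricts \d to ASCII

printf "\\d: %d, [0-9]: %d, \\d with /a: %d\n",
    $d_matches, $class_matches, $a_matches;
```

Only the plain `\d` accepts the non-ASCII digits; `[0-9]` and `\d` under `/a` both reject them.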
Re^3: Perl store as variables regex matches for use outside of loop. by Laurent_R (Canon) on Jul 12, 2015 at 17:50 UTC
Yes, 1nickt, you're absolutely right: we have to be quite careful about these things nowadays. I don't deal with Unicode or UTF-8 data very often; I mostly work with pure ASCII, or sometimes with an extended-ASCII Latin character set, so I don't always think about the possibility of character classes matching more than I expect. Thank you for the reminder. Having said that, I was not presenting production code, nor even a complete working program; I was just trying to identify some of the errors in the OP's code and show some ways to correct them.