That would involve a lot of post-processing to match up the numbers with the categories and filter the categories to just the valid ones ('crew', 'wounded' and 'crit'). And it can't be inserted into a larger regex match.

Of course I'm only seeing one part of the overall picture, but with a very minor modification to the code, I can generate a hash table with the noun as the key and the number as the value. Aside from the printing, this is just a few lines of code. I would expect this to be a sub that you call and re-use many times. To check whether enough items are present, the number of keys gives you that. Checking whether any of these nouns is invalid takes just 2 lines of code (see below).

Basically, I would advocate a data-table-driven approach, with rules applied by subs to that tabular description of the data. If you have a validate sub that takes a table of valid nouns, you can call that same sub with other tables of valid nouns as the situation requires.
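
To make the table-driven idea concrete, here is a minimal sketch. The sub names (parse_counts, invalid_nouns) and the second table (%cargo_nouns) are made up for illustration and are not from the original thread; the regex is the same one used in this post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull "<number> <noun>" pairs out of a line and
# return a hash ref of noun => number.
sub parse_counts {
    my ($line) = @_;
    my %count = reverse($line =~ m/(\d+)\s+(\w+)/g);
    return \%count;
}

# Hypothetical helper: return (sorted) the nouns that are not keys
# in the supplied table of valid nouns.
sub invalid_nouns {
    my ($count, $valid) = @_;
    my @bad = sort grep { !$valid->{$_} } keys %$count;
    return @bad;
}

# One table per situation; the same subs work with either.
my %crew_nouns  = map { $_ => 1 } qw(crew critical wounded killed);
my %cargo_nouns = map { $_ => 1 } qw(crates barrels);   # invented example table

my $count = parse_counts('20 crew and 6 killed and 14 MIA');
my @bad   = invalid_nouns($count, \%crew_nouns);

print "nouns found: ", scalar(keys %$count), "\n";   # prints "nouns found: 3"
print "invalid nouns: @bad\n" if @bad;               # prints "invalid nouns: MIA"
```

Swapping in \%cargo_nouns as the second argument validates a different kind of command without touching either sub, which is the whole point of keeping the rules in a table.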

Validating user input is often harder than it first appears, and I wouldn't be overly concerned about 25 lines versus a whole page of code IF that page is clear. Clarity should be a higher priority than the number of lines, because clear code tends to be less buggy and easier to maintain.

```perl
#!/usr/bin/perl -w
use strict;

my %valid = qw (crew 1 critical 1 wounded 1 killed 1);

while (<DATA>)
{
    print "testing: $_";
    chomp;
    my %hash = reverse(m/(\d+)\s+(\w+)/g);
    foreach my $key (keys %hash)
    {
        print "$key $hash{$key}\n";
    }
    my @invalid = grep {!$valid{$_}} keys %hash;
    print "invalid nouns: @invalid\n" if @invalid;
    print "\n";
}

# testing: beam 15 crew 5 wounded 2 critical to S.S.Kevorkian
# crew 15
# critical 2
# wounded 5
#
# testing: what a day:5 wounded 2 critical 20 crew
# critical 2
# crew 20
# wounded 5
#
# testing: 20 crew and 6 killed and 14 MIA
# crew 20
# killed 6
# MIA 14
# invalid nouns: MIA

__DATA__
beam 15 crew 5 wounded 2 critical to S.S.Kevorkian
what a day:5 wounded 2 critical 20 crew
20 crew and 6 killed and 14 MIA
```

In reply to Re^5: Leaking Regex Captures by Marshall
in thread Leaking Regex Captures by SuicideJunkie
