in reply to Re^2: search pattern with digits
in thread search pattern with digits

Regular expressions can have a /x qualifier. It allows embedded comments and lets the regular expression be laid out for easy reading, because white space in the pattern is ignored. To get a literal space you have to either escape it with a backslash or put it in a character class.

For example, to match "one two" you have /one\ two/x or /one[ ]two/x. I use the latter, as the space is easier to see with a mark on both sides.

I used /x to make it easier for you to see the alternations. To go back to a plain pattern, remove the /x along with the comments (# to end of line) and any white space that is not in a character class, then change "[ ]" to " ".
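
A minimal sketch of how /x treats white space, assuming a made-up input string 'one two' (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $text = 'one two';    # sample input for illustration

# Under /x, unescaped white space and "#" comments in the pattern are ignored.
my $with_class = qr{
    one    # first word
    [ ]    # literal space inside a character class
    two    # second word
}x;

# Same match with a backslash-escaped space instead.
my $with_escape = qr{ one \  two }x;

# Here the space is ignored, so the pattern is effectively /onetwo/.
my $space_ignored = qr{ one two }x;

for my $case ( [ class => $with_class ], [ escape => $with_escape ], [ ignored => $space_ignored ] ) {
    my ( $name, $re ) = @$case;
    printf "%-8s %s\n", $name, $text =~ $re ? 'matches' : 'does not match';
}
```

The first two patterns match 'one two'; the third does not, because its only space is dropped by /x.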

I am now a little confused about what you are matching. You say you are given the string to match as an argument, but you have two different strings: "total rows rejected: number" and "number rows rejected". If you do put the argument into a regular expression, then you are correct to use \Q \E.
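
As a rough illustration of why \Q \E matters, here is a sketch with a hypothetical argument that contains regex metacharacters (the phrase and the line are invented for the example, not taken from your log):

```perl
use strict;
use warnings;

# Hypothetical argument and log line, chosen because of the parentheses.
my $phrase = 'rows rejected (total):';
my $line   = 'rows rejected (total): 42';

# Unquoted: "(total)" is treated as a capture group, so the literal
# parentheses in the line are never matched.
my $unquoted_matches = $line =~ /$phrase\s*(\d+)/ ? 1 : 0;

# Quoted: \Q...\E escapes every metacharacter in the argument.
my ($number) = $line =~ /\Q$phrase\E\s*(\d+)/;

print "unquoted matches: $unquoted_matches\n";    # prints 0
print "quoted captures:  $number\n";              # prints 42
```

\Q also escapes spaces, so a quoted argument keeps its spaces even inside a /x pattern.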

Re^4: search pattern with digits
by mercuryshipz (Acolyte) on Feb 14, 2008 at 21:07 UTC
    thanks for the responses...

    Let me make it clear: the argument that is given is just the phrase, for example
    total rejected rows

    or
    rejected number of rows

    The log file that I am going to parse with the script contains:

    log.txt
    total rejected rows: 1000
    total rejected rows: 1254
    total rejected rows: 1000
    total rejected rows: 1254
    total rejected rows: 1000
    3000 rejected number of rows
    8700 rejected number of rows
    65000 rejected number of rows
    1200 rejected number of rows
    4300 rejected number of rows
    total rejected rows: 1254
    total rejected rows: 1000
    total rejected rows: 1254
    total rejected rows: 1000
    total rejected rows: 1254
    54000 rejected number of rows
    4000 rejected number of rows


    The program works only for "total rejected rows:", i.e., when the number (the desired result) comes at the end of the search phrase. When it comes at the beginning, as in "rejected number of rows", the number is not returned. The digits are always either at the beginning or at the end of the search phrase. If you have more questions, please let me know.

    thanks.

      1. How is the search string given?
      2. What search string is given?
      3. How does the input data vary?
      4. How will what you search for depend on the input data?
      5. Will you always be capturing the same data?

      For the example you have given, where you always want to capture a number after or before "rows rejected",

      qr{(?| \Q$string\E [^\d]* (\d+) | (\d+) [^\d]* \Q$string\E ) }x
      will capture the last number before a given search string or, if there isn't one, the first number after the string. The (?| ... ) branch reset (Perl 5.10 or later) makes both alternatives capture into $1. Note that given
      records processed 23456, total rows rejected 567
      it will capture 23456.
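
A sketch of that pattern in use, assuming one log record per line. The helper name number_near is made up for this example, and the sample lines are taken from the posts above:

```perl
use strict;
use warnings;
use 5.010;    # the branch-reset group (?| ... ) needs Perl 5.10 or later

# Hypothetical helper: return the number next to $string in $line, or undef.
sub number_near {
    my ( $string, $line ) = @_;
    my $re = qr{(?| \Q$string\E [^\d]* (\d+) | (\d+) [^\d]* \Q$string\E ) }x;
    return $line =~ $re ? $1 : undef;
}

# Number after the phrase.
say number_near( 'total rejected rows', 'total rejected rows: 1000' );        # 1000

# Number before the phrase.
say number_near( 'rejected number of rows', '3000 rejected number of rows' ); # 3000

# A number on each side: the one before the phrase wins.
say number_near( 'total rows rejected',
    'records processed 23456, total rows rejected 567' );                     # 23456
```

The last call is the caveat in practice: if a record can carry another number in front of the phrase, the pattern needs to be tightened for that data.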

      What is the input when one row is rejected? Is it

      1 rows rejected
      or
      1 row rejected
      Know your data;)