nanban has asked for the wisdom of the Perl Monks concerning the following question:

I am looking to extract a number from each of the lines below.

Sample lines:

Apples Orances:5433 55552246:777449 Country 457852/total

Apples Red:9987 green:777449 Public 74585/total

So I need the numbers 457852 and 74585 as output.

Simply put, I need to extract the value before "/total".

Please help me.


Replies are listed 'Best First'.
Re: extract the value before a particualr string
by space_monk (Chaplain) on Sep 05, 2013 at 10:12 UTC
    If you don't like split, try regexes instead. Something like ...
    if (/(\d+)\/total$/) { print "$1\n"; }
    If you spot any bugs in my solutions, it's because I've deliberately left them in as an exercise for the reader! :-)
      Thanks for the reply... Can we implement it using split? I am pretty new to Perl...
      Will it be possible to use split?
        nanban,
        will it be possible to use split???

        The first argument to split is a regular expression, so the real difference is between using capturing parens in a regular expression and using a regular expression to break the string into pieces. Both are regular expression solutions.

        Cheers - L~R
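To make the comparison above concrete, here is a minimal sketch of both routes on the lines from the question. The helper names by_capture and by_split are hypothetical, chosen only for this illustration, and the split variant is just one of several ways to do it.

```perl
use strict;
use warnings;

# Hypothetical helper: capture the digits directly with parentheses.
sub by_capture {
    my ($line) = @_;
    return $line =~ m{(\d+)/total} ? $1 : undef;
}

# Hypothetical helper: cut the line at "/total" with split (whose first
# argument is itself a regex), then take the last whitespace-separated word.
sub by_split {
    my ($line) = @_;
    my ($before) = split m{/total}, $line;
    my @words = split ' ', $before;
    return $words[-1];
}

for my $line (
    'Apples Orances:5433 55552246:777449 Country 457852/total',
    'Apples Red:9987 green:777449 Public 74585/total',
) {
    print by_capture($line), ' ', by_split($line), "\n";
}
```

Both helpers return the same value for each sample line, so the choice is mostly a matter of taste.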

Re: extract the value before a particualr string
by hdb (Monsignor) on Sep 05, 2013 at 10:07 UTC

    I suggest to split on the regex /\s|\// and then to pick the second from last element of the resulting array.
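A minimal sketch of this suggestion, assuming every line ends in "/total" as in the sample data. The helper name second_from_last is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: split on whitespace or "/" and return the
# second-to-last field, which is the number just before "total".
sub second_from_last {
    my ($line) = @_;
    my @fields = split /\s|\//, $line;
    return $fields[-2];
}

for my $line (
    'Apples Orances:5433 55552246:777449 Country 457852/total',
    'Apples Red:9987 green:777449 Public 74585/total',
) {
    print second_from_last($line), "\n";
}
```

The negative index counts from the end of the array, so the position of the number within the line does not matter as long as "total" is the last field.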

Re: extract the value before a particualr string
by 2teez (Vicar) on Sep 05, 2013 at 10:26 UTC

    Hi nanban,
    I am looking for extracting a number between in the below line
    How would you do that using Perl?
    The best way to get help is to try something first and show what you tried; the help you get will then bring understanding.
    I would advise that you check How do I post a question effectively?.
    Seeing that you are new here, I can give you a head start that solves the dataset you posted, like so:

    use warnings;
    use strict;

    while (<DATA>) {
        chomp;
        print $1, $/ if /\s+?(\d+?)\/total$/;
    }
    __DATA__
    Apples Orances:5433 55552246:777449 Country 457852/total
    Apples Red:9987 green:777449 Public 74585/total

    produces ..

    457852
    74585
    But of what help is that if your dataset changes or you have different input files?
    Hope this helps.

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me
      thanks.

      I did try something like using split with multiple parameters, but it extracts the whole line.

      Can you help me understand this line?

      print $1,$/ if/\s+?(\d+?)\/total$/;

      I understand what we are trying to do with if/\s+?(\d+?)\/total, but I don't get the parts before and after it.

        $1 holds the result of the pattern match captured inside (parentheses). If you had multiple parens then there would also be a $2, $3, etc. The $/ variable is the input record separator, which defaults to a newline, so print $1,$/ is equivalent to print "$1\n". The $ after total tells the regex engine that total will be the last thing on the line.
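As a small illustration of the points above, the sketch below uses two pairs of parentheses so that both $1 and $2 are set, and prints with $/ as the line ending. The helper name describe and the extra (\w+) group are assumptions added for this example, not part of the original one-liner.

```perl
use strict;
use warnings;

# Hypothetical helper: the first pair of parens fills $1 (the word before
# the number), the second pair fills $2 (the number before "/total").
sub describe {
    my ($line) = @_;
    if ($line =~ /(\w+)\s+(\d+)\/total$/) {
        return "label=$1 value=$2" . $/;   # $/ defaults to "\n"
    }
    return;
}

print describe('Apples Red:9987 green:777449 Public 74585/total');
```

For the sample line this prints label=Public value=74585 followed by a newline.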

        Cheers,
        R.

        Pereant, qui ante nos nostra dixerunt!

        Random_Walk has given a good explanation.
        However, maybe the following line could help clarify that line a bit.
        print $1,$/ if m{\s+?(\d+?)\/total$};
        The above line shows clearly that what follows "total" is "$", which indicates the end of the string, and not "$/". The / that follows "$" is the second slash of the match operator, i.e.

        m//
          ^ This one
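A quick way to convince yourself of this is to run the same pattern with both delimiter styles. This is only a sketch; the helper names with_slashes and with_braces are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: slash delimiters, so the literal "/" must be escaped.
sub with_slashes {
    my ($line) = @_;
    my ($value) = $line =~ /\s+?(\d+?)\/total$/;
    return $value;
}

# Hypothetical helper: brace delimiters, so "/" needs no escape and the
# trailing "$" is visibly the end-of-string anchor.
sub with_braces {
    my ($line) = @_;
    my ($value) = $line =~ m{\s+?(\d+?)/total$};
    return $value;
}

my $line = 'Apples Orances:5433 55552246:777449 Country 457852/total';
print with_slashes($line), ' ', with_braces($line), "\n";
```

Both calls return 457852, since only the delimiters differ and the pattern is the same.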
        In addition to all that, using the module "YAPE::Regex::Explain", the regular expression could be explained as follows:
        The regular expression:

        (?-imsx:\s+?(\d+?)/total$)

        matches as follows:

        NODE                     EXPLANATION
        ----------------------------------------------------------------------
        (?-imsx:                 group, but do not capture (case-sensitive)
                                 (with ^ and $ matching normally) (with . not
                                 matching \n) (matching whitespace and #
                                 normally):
        ----------------------------------------------------------------------
          \s+?                     whitespace (\n, \r, \t, \f, and " ") (1 or
                                   more times (matching the least amount
                                   possible))
        ----------------------------------------------------------------------
          (                        group and capture to \1:
        ----------------------------------------------------------------------
            \d+?                     digits (0-9) (1 or more times (matching
                                     the least amount possible))
        ----------------------------------------------------------------------
          )                        end of \1
        ----------------------------------------------------------------------
          /total                   '/total'
        ----------------------------------------------------------------------
          $                        before an optional \n, and the end of the
                                   string
        ----------------------------------------------------------------------
        )                        end of grouping
        ----------------------------------------------------------------------

        If you tell me, I'll forget.
        If you show me, I'll remember.
        if you involve me, I'll understand.
        --- Author unknown to me
        I did try something like using split with multiple parameters, but it extracts the whole line.

        I'm intrigued. Could you post the code that extracted the whole line?

Re: extract the value before a particualr string
by BillKSmith (Monsignor) on Sep 05, 2013 at 11:57 UTC
    The regex in the following script does exactly what you specify. It extracts the 5-digit value before each "/total" in a single line.
    use strict;
    use warnings;
    my $in_string =
          "Apples Orances:5433 55552246:777449 Country 457852/total"
        . " "
        . "Apples Red:9987 green:777449 Public 74585/total"
        . "\n"
        ;
    my @totals = $in_string =~ m{ (\d{5}) (?= /total ) }gx;
    print "@totals\n";
    Bill

      Hi,
      with ...(\d{5})... in your regexp, you will only get 57852 instead of 457852 for the first value, since it has more than 5 digits.
      Maybe ...(\d{5,})... could help in this case, or some other variation in case the target has fewer than 5 digits.
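The difference between the quantifiers can be seen with a short sketch. The helper totals is hypothetical, and the (\d+) variant is one assumed reading of "other variation" rather than something stated in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return every capture of the given pattern in the text.
sub totals {
    my ($re, $text) = @_;
    my @found = $text =~ /$re/g;
    return @found;
}

my $in_string =
      'Apples Orances:5433 55552246:777449 Country 457852/total'
    . ' '
    . 'Apples Red:9987 green:777449 Public 74585/total';

# Exactly five digits: the leading 4 of 457852 is lost.
print join(' ', totals(qr{ (\d{5})  (?= /total ) }x, $in_string)), "\n";
# Five or more digits: both values come out whole.
print join(' ', totals(qr{ (\d{5,}) (?= /total ) }x, $in_string)), "\n";
# One or more digits: also works when a total has fewer than five digits.
print join(' ', totals(qr{ (\d+)    (?= /total ) }x, $in_string)), "\n";
```

The first line of output is 57852 74585, while the second and third are both 457852 74585.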
