premal has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am a newbie in Perl development; I have only recently started using Perl. I need to extract a number from a text file using a regular expression. The number sits somewhere in a big text file. Please help me write a regex for this, or suggest another convenient way to do it. I have pasted one sentence from my text file below. The number I need to find is 12 in this sentence, but it will keep changing because my script regenerates the file every time. Content of my file ====> Users of ClearQuest: (Total of 18 licenses issued; Total of 12 licenses in use). This sentence is somewhere in the text file, and I want to find the number 12 and store it in a variable. Please suggest how I can do this.

Replies are listed 'Best First'.
Re: Regex to extract number from text file
by moritz (Cardinal) on Feb 18, 2009 at 10:11 UTC
    If it's always an integer, it is matched by the regex \d+. To give it some context, you can say m/Total of (\d+) licenses in use/.

    The study of perlretut should tell you how to then extract the value into a variable.
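    As a minimal sketch of that last step (the variable names $line and $in_use are illustrative and not from the original post), a match in list context returns the captured group, so it can be assigned directly:

```perl
use strict;
use warnings;

# Sample line from the question; in a real script it would be read from the file.
my $line = 'Users of ClearQuest: (Total of 18 licenses issued; Total of 12 licenses in use)';

my $in_use;
# A match in list context returns the captured groups, so the number
# is stored in $in_use only when the match succeeds.
if ( ($in_use) = $line =~ m/Total of (\d+) licenses in use/ ) {
    print "Licenses in use: $in_use\n";
}
else {
    print "No usage figure found\n";
}
```

    Testing the result of the match before using the value avoids picking up a stale capture from an earlier match.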

      It appears you are trying to parse the output of the FlexLM 'lmstat' command. I've tackled that problem before.

      A few details worth noting that I can think of off hand:

    • The 's' after the word licenses should be marked optional with a ? in the regexp.
    • Licenses can be "Reserved" and "Queued" as well; if memory serves, those are the strings one sees.
    • The lmstat output itself can vary in format on a per-feature basis. The lmstat tool is merely passing on data from the "vendor daemons" and the format varies quite a bit.

      The approach that worked best for me, which I first found back in 1998-1999, was not to solve the problem with a single regexp, but to use several regexps, each looking for one piece of the puzzle.

      To be more specific:

    • Parse the license/feature name out of the string, and strip out the part that follows with the 'usage' data into a new string.
    • Parse the 'usage' string and look for the pieces you need. Handle the variations if they occur in your data. Code example:

      #!perl
      # sample string. The code below could be looped as well.
      # i.e. within while($string = <LMSTAT>){ }
      my $string = 'Users of ClearQuest: (Total of 18 licenses issued; Total of 12 licenses in use)';

      if($string =~ m/Users? of (\w+):\s+(\(.+\))/){
          # must test for successful pattern match
          # or value may mistakenly be set to previous successful match!
          my $feature = $1;
          my $usage = $2;
          my $num_used=0;
          my $num_total=0;
          my $num_queued=0;

          if($usage =~ m/(\d+)\slicenses? (in use|Used)/ # handles variations in Used Licenses
            ){
              $num_used=$1;
          }
          if($usage =~ m/(\d+)\slicenses? (issued|available)/ # handles variations in Total Licenses
            ){
              $num_total=$1;
          }
          if($usage =~ m/(\d+)\slicenses? (queued)/ # handles Queued licenses
            ){
              $num_queued=$1;
          }
          # more IF statements to test for queued, reserved, etc.

          print "lmstat output: $string\n";
          print "feature name: $feature\n";
          print "usage string: $usage\n\n";
          print "Total $feature Licenses: $num_total\n";
          print "Total $feature Used Licenses: $num_used\n";
          print "Total $feature Queued Licenses: $num_queued\n";

          my $num_free=$num_total-$num_used-$num_queued;
          print "Approximate Free $feature Licenses: $num_free\n";
      } else {
          # not a "Usage" line. Perhaps parse out
          # the USERNAMES of each used license here...
      }

      This should directly solve your problem, and prepare you for any hangups that may await you. Like Perl, with FlexLM there is 'more than one way to do it' so keep this in mind.

      If you need further help, you can message me privately. I've used Perl to populate a data warehouse with this sort of data. Getting control over license usage is a great way to reduce software costs.

      spectre#9 -- "Strictly speaking, there are no enlightened people, there is only enlightened activity." -- Shunryu Suzuki
        Hi, thanks a lot for your help. I will try your solution; you have understood my problem perfectly.
      Thanks for your reply. I tried writing the following code, but it does not print anything on screen.

      my @temp;
      open(FILE,"C:\\test.txt");
      @temp = <FILE>;
      close(FILE);
      foreach (@temp)
      {
      my $search =~ m/ClearQuest\: \(Total of 18 licenses issued\; Total of (\d+)/;
      my $wd = $1;
      print "\n$wd\n";
      exit;
        That's a bit of a surprise to me, because for me it prints

        Missing right curly or square bracket at /home/moritz/foo.pl line 10, at end of line
        syntax error at /home/moritz/foo.pl line 10, at EOF
        Execution of /home/moritz/foo.pl aborted due to compilation errors.

        Please use <code>...</code> tags around code examples that you post here, and copy and paste them from your actual script so that you don't introduce bugs by retyping them.

        Also please start your scripts with

        use strict; use warnings;

        And you'll get at least a warning about what you're doing wrong.

        You need to change the code:
        my $search =~ m/ClearQuest\: \(Total of 18 licenses issued\; Total of (\d+)/;
        to:
        (my $search)=$_=~ m/ClearQuest\: \(Total of 18 licenses issued\; Total of (\d+)/;
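        Putting the replies above together, a corrected version of the script might look like the sketch below. It makes a few assumptions beyond the original post: the file contents are simulated with an in-memory filehandle so the example runs on its own, and the hard-coded 18 is replaced by \d+ because the issued count may change too.

```perl
use strict;
use warnings;

# Stand-in for the contents of C:\test.txt, so the sketch runs on its own.
my $content = <<'END';
Some other line
Users of ClearQuest: (Total of 18 licenses issued; Total of 12 licenses in use)
Yet another line
END

# In-memory filehandle; for the real file use something like:
#   open(my $fh, '<', 'C:\\test.txt') or die "Cannot open file: $!";
open(my $fh, '<', \$content) or die "Cannot open in-memory file: $!";

my $wd;
while (my $line = <$fh>) {
    # Match against $line explicitly, and read $1 only after a successful match.
    if ($line =~ m/ClearQuest: \(Total of \d+ licenses issued; Total of (\d+)/) {
        $wd = $1;
        last;    # stop at the first matching line
    }
}
close($fh);

print defined $wd ? "$wd\n" : "No match found\n";
```

        The loop is closed with its brace, the match is bound to the line being read rather than to an empty new variable, and the script exits the loop only once a match has been found.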
Re: Regex to extract number from text file
by velusamy (Monk) on Feb 18, 2009 at 10:39 UTC
    Hi,

    '\d' is used to match the digits in regular expressions. You can write the regular expression like

    while (<FH>) { print "Matched Number:",$1 if(/(\d+)/g); }


    you will get the matched number in $1.

      except that you don't want the /g (global) switch on the regex! Consider:

      use strict;
      use warnings;

      my $str = 'wibble 10';

      print "Matched $1\n" if $str =~ /(\d+)/g;
      print "Matched $1\n" if $str =~ /(\w+)/g;

      Prints:

      Matched 10

      Because of the /g switch the second match fails: it starts searching from where the first match left off.
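      For comparison, here is a small sketch of the same test without /g, plus a look at the position that a scalar-context /g match leaves behind (the variables $pos_after and @found are illustrative names):

```perl
use strict;
use warnings;

my $str = 'wibble 10';

# Scalar-context /g match: Perl remembers where the match ended.
my $pos_after;
if ($str =~ /(\d+)/g) {
    $pos_after = pos($str);    # offset just past "10"
}

# Without /g, each match starts at the beginning of the string.
my @found;
push @found, $1 if $str =~ /(\d+)/;
push @found, $1 if $str =~ /(\w+)/;

print "pos after /g match: $pos_after\n";
print "Matched $_\n" for @found;
```

      Here both plain matches succeed, so the output lists 10 and then wibble, while the saved position shows that the /g match had already reached the end of the string.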


      True laziness is hard work
Re: Regex to extract number from text file
by hda (Chaplain) on Feb 18, 2009 at 12:09 UTC
    Depending on the eventual design of your program and the format of your input file (for example, if the number you are looking for varies in format between integer, floating point, etc.), you might find the Scalar::Util module useful.

    Have a look at the following node: Checking whether a value is numeric or not
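    A minimal sketch of how that module can be used, assuming the candidate values have already been pulled out of the text (the list @candidates is made up for illustration):

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical candidate values, e.g. tokens pulled out of a line of text.
my @candidates = ('12', '3.5', '1e3', 'abc', '12 licenses');

# Keep only the values that Perl itself would treat as numbers.
my @numeric = grep { looks_like_number($_) } @candidates;

print "Numeric: @numeric\n";
```

    Note that looks_like_number checks a whole value; it does not search inside a longer string, so a regex is still needed to isolate the candidate first.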

Re: Regex to extract number from text file
by irah (Pilgrim) on Feb 18, 2009 at 10:49 UTC

    If the number on every line of your file always has exactly two digits (here, 12 has two digits), another way is:

    [0-9]{2}

    If the number of digits is not fixed, use,

    [0-9]+

    You can learn more from the man pages (see perlre). To store the value, wrap the pattern in parentheses and read it from the $1 variable.
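    One caveat, shown in the sketch below on the sample line from the question: a bare digit pattern captures the first number on the line, so some surrounding text is needed to pick out the right one (the variable names are illustrative):

```perl
use strict;
use warnings;

my $line = 'Users of ClearQuest: (Total of 18 licenses issued; Total of 12 licenses in use)';

# Unanchored: captures the first run of digits on the line.
my ($first) = $line =~ /([0-9]+)/;

# With trailing context: captures the digits right before "licenses in use".
my ($in_use) = $line =~ /([0-9]+) licenses in use/;

print "First number: $first\n";
print "In use: $in_use\n";
```

    On this line the first pattern yields 18 and the second yields 12.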