jaggu_bg has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I am getting only one line of output per match from the input file; the data below that line is not printed. The input file and the program are given below.
D Fri Nov 21 14:09:41 2008 TME_BILLING_COE cm:10570 cm_child.c(107):4385 1:blrdxp-santbs:CustomerCenter
XXX CMAP: op_custom() past op_decode, opcode: PCM_OP_CUST

D Fri Nov 21 14:09:41 2008 TME_BILLING_COE cm:10570 cm_child.c :92 1:blrdxp-santbs:CustomerCenter:0:AWT-EventQueue-0:83:1227256900:0
op_cust_validate_customer input flist
# number of field entries allocated 20, used 3
0 PIN_FLD_POID POID [0] 0.0.0.1 /plan -1 0
0 PIN_FLD_ACCOUNT_OBJ POID [0] 0.0.0.1 /account -1 0
0 PIN_FLD_PAYINFO ARRAY [1] allocated 20, used 5
1 PIN_FLD_NAME STR [0] "Invoice1"

D Fri Nov 21 14:09:41 2008 TME_BILLING_COE cm:10570 fm_cust_pol_prep_payinfo.c:113 1:blrdxp-santbs:CustomerCenter:0:AWT-EventQueue-0:83:1227256900:0
op_cust_pol_prep_payinfo input flist
# number of field entries allocated 20, used 5
0 PIN_FLD_POID POID [0] 0.0.0.1 /payinfo/invoice -1 0
0 PIN_FLD_ACCOUNT_OBJ POID [0] 0.0.0.1 /account -1 0
0 PIN_FLD_FLAGS INT [0] 1

use strict;
use warnings;

my $input_file   = $ARGV[0];
my $input_pid    = $ARGV[1];
my $input_string = $ARGV[2];

print "Usage: test.pl <input file name> <input process id> <input file string>" if (@ARGV != 3);

open (my $hfile, '<', "$input_file") || die("Unable to open the file $input_file\n");
open (my $ofile, '>', "output.txt") || die("Unable to open the file output.txt\n");

while (<$hfile>){
    my $line = $_;
    if ($line =~ /[DWEM]\s+.*?\s+(?:c|cm|M)\:($input_pid)\s+([^\.]+\.c)/s) {
        my $c_file = $2;
        my $pid = $1;
        if($input_string eq $c_file){ #check with input string
            print $ofile "$line\n"; #write the output in a output file
        }
    }
}
My input is: perl test 10570 'cm_child.c'. The output I am getting is:
D Fri Nov 21 14:09:43 2008 TME_BILLING_COE cm:10570 cm_child.c(107):4385 1:blrdxp-santbs:CustomerCenter:0:AWT-EventQueue-0:63:1227256901
D Fri Nov 21 14:09:46 2008 TME_BILLING_COE cm:10570 cm_child.c(107):4385 1:blrdxp-santbs:CustomerCenter:0:AWT-EventQueue-0:3:1227256904:
The rest of the data is not printed. The expected output is:
D Fri Nov 21 14:09:41 2008 TME_BILLING_COE cm:10570 cm_child.c(107):4385 1:blrdxp-santbs:CustomerCenter
XXX CMAP: op_custom() past op_decode, opcode: PCM_OP_CUST

D Fri Nov 21 14:09:41 2008 TME_BILLING_COE cm:10570 cm_child.c :92 1:blrdxp-santbs:CustomerCenter:0:AWT-EventQueue-0:83:1227256900:0
op_cust_validate_customer input flist
# number of field entries allocated 20, used 3
0 PIN_FLD_POID POID [0] 0.0.0.1 /plan -1 0
0 PIN_FLD_ACCOUNT_OBJ POID [0] 0.0.0.1 /account -1 0
0 PIN_FLD_PAYINFO ARRAY [1] allocated 20, used 5
1 PIN_FLD_NAME STR [0] "Invoice1"

Replies are listed 'Best First'.
Re: output issue ????
by brsaravan (Scribe) on Nov 28, 2008 at 11:25 UTC
    ....the data below that is not printing. the log file is
    Please explain the expected output.
    Your regex matches only three lines of your log file, and only two of those lines contain your input string (cm_child.c).
    Because you then also check ($input_string eq $c_file), output.txt contains only those two lines.
    If you remove this condition, you will get all of the lines matched by the regex in output.txt.
Re: output issue....
by Andrew Coolman (Hermit) on Nov 28, 2008 at 14:43 UTC
    The reason is that you are reading the input file line by line (so the /s modifier on the regex is useless) and printing only inside the if statement that tests the regex. So even when a line matches, the lines that follow it are not printed, because the regex does not match them.

    I would suggest something like this (not tested):
    my $c_file = "NO MATCH";
    while (<$hfile>) {
        my $line = $_;
        if ($line =~ /[DWEM]\s+.*?\s+(?:c|cm|M)\:($input_pid)\s+([^\.]+\.c)/s) {
            $c_file = $2;
            my $pid = $1;
        }
        if($input_string eq $c_file) { #check with input string
            print $ofile "$line\n"; #write the output in a output file
        }
    }

    Regards,

      "Not tested" is, perhaps, a very kind way of putting it. You may have intended to match both the filename and the PID, but you failed to do the latter anywhere in the code. In addition, you're comparing '$c_file' against a non-existent variable called '$input_string'. Last of all, even if you had checked both the filename and the PID against reasonable values, you'd continue to print lines until you had another successful match - which may not be the right thing to do.

      Update: Modified phrasing.
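
A minimal sketch of the line-by-line "keep printing" approach that avoids the problems listed above, under stated assumptions: the helper name filter_records and the sample log lines are hypothetical, and it assumes every record begins with a header line of the form "D Fri Nov 21 ...". The flag is re-evaluated at every header, so printing stops as soon as a record with a different file name or PID begins.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the lines of every record whose header
# carries both the wanted PID and the wanted source file name.
sub filter_records {
    my ($pid, $file, @lines) = @_;
    my $printing = 0;
    my @kept;
    for my $line (@lines) {
        # Assumption: a record header looks like "D Fri Nov 21 ...".
        if ($line =~ /^[DWEM]\s+\w{3}\s+\w{3}\s+\d+\s/) {
            # Re-decide at every header; \Q...\E keeps '.' literal.
            $printing = $line =~ /\s(?:cm?|M):\Q$pid\E\s+\Q$file\E/ ? 1 : 0;
        }
        push @kept, $line if $printing;
    }
    return @kept;
}

# Shortened sample data, for illustration only.
my @log = (
    'D Fri Nov 21 14:09:41 2008 TME_BILLING_COE cm:10570 cm_child.c :92 1:blrdxp-santbs:CustomerCenter',
    'op_cust_validate_customer input flist',
    '0 PIN_FLD_POID POID [0] 0.0.0.1 /plan -1 0',
    'D Fri Nov 21 14:09:41 2008 TME_BILLING_COE cm:10570 fm_cust_pol_prep_payinfo.c:113 1:blrdxp-santbs:CustomerCenter',
    'op_cust_pol_prep_payinfo input flist',
    '0 PIN_FLD_POID POID [0] 0.0.0.1 /payinfo/invoice -1 0',
);

print "$_\n" for filter_records('10570', 'cm_child.c', @log);
```

With the sample data above, only the three lines of the first record are printed. Whether the header pattern is strict enough depends on the real log format, which is not fully known here.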


      --
      "Language shapes the way we think, and determines what we can think about."
      -- B. L. Whorf
Re: output issue....
by oko1 (Deacon) on Nov 28, 2008 at 19:49 UTC

    You're reading your input file one line at a time, and only printing a line when your regex matches - while what you say you want to print is a paragraph ("\n\n"-delimited areas of your input.) There are ways to make the above work - i.e., you could set a "keep printing" flag as soon as your regex matches and reset it when you hit whatever your end condition is - but since your input is already split into paragraphs, it's much easier to use Perl's input record separator variable, $/, to tell your script to read it that way.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $pid = "10570";
    my $str = 'cm_child\.c';

    {
        local $/ = "\n\n";
        while (<DATA>){
            print if /^[DWEM]\s+.*?\s+(?:cm?|M):$pid\s+$str/sm;
        }
    }

    __END__
    D Fri Nov 21 14:09:41 2008 TME_BILLING_COE cm:10570 cm_child.c(107):4385 1:blrdxp-santbs:CustomerCenter
    XXX CMAP: op_custom() past op_decode, opcode: PCM_OP_CUST

    D Fri Nov 21 14:09:41 2008 TME_BILLING_COE cm:10570 cm_child.c :92 1:blrdxp-santbs:CustomerCenter:0:AWT-EventQueue-0:83:1227256900:0
    op_cust_validate_customer input flist
    # number of field entries allocated 20, used 3
    0 PIN_FLD_POID POID [0] 0.0.0.1 /plan -1 0
    0 PIN_FLD_ACCOUNT_OBJ POID [0] 0.0.0.1 /account -1 0
    0 PIN_FLD_PAYINFO ARRAY [1] allocated 20, used 5
    1 PIN_FLD_NAME STR [0] "Invoice1"

    D Fri Nov 21 14:09:41 2008 TME_BILLING_COE cm:10570 fm_cust_pol_prep_payinfo.c:113 1:blrdxp-santbs:CustomerCenter:0:AWT-EventQueue-0:83:1227256900:0
    op_cust_pol_prep_payinfo input flist
    # number of field entries allocated 20, used 5
    0 PIN_FLD_POID POID [0] 0.0.0.1 /payinfo/invoice -1 0
    0 PIN_FLD_ACCOUNT_OBJ POID [0] 0.0.0.1 /account -1 0
    0 PIN_FLD_FLAGS INT [0] 1

    You also need to consider how you're going to deal with the metacharacters (such as '.') in the filename that you're using, since "cm_child.c" will happily match "cm_childXc" or "cm_child-c" when used as a part of the regex - but that's a different problem (one that can't be addressed since we don't know what the range of possible filenames is in your case.)
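
If the file name is meant to be taken literally, one common option is to quote it with \Q...\E (the regex form of quotemeta) when interpolating it. The sketch below is illustrative only: the helper name header_matches and the two sample header lines are assumptions, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if the header line carries the given PID
# followed by the given file name, with both treated as literal text.
sub header_matches {
    my ($header, $pid, $file) = @_;
    # \Q...\E escapes regex metacharacters in the interpolated values,
    # so the '.' in "cm_child.c" only matches a real dot.
    return $header =~ /^[DWEM]\s+.*?\s+(?:cm?|M):\Q$pid\E\s+\Q$file\E/ ? 1 : 0;
}

# Sample headers for illustration; the second one is made up to show
# what an unescaped '.' would wrongly accept.
my @headers = (
    'D Fri Nov 21 14:09:41 2008 TME_BILLING_COE cm:10570 cm_child.c(107):4385',
    'D Fri Nov 21 14:09:41 2008 TME_BILLING_COE cm:10570 cm_childXc(107):4385',
);

for my $header (@headers) {
    my $verdict = header_matches($header, '10570', 'cm_child.c') ? 'match' : 'no match';
    print "$verdict: $header\n";
}
```

This only helps when the input is a literal name; if the file name argument is supposed to be a pattern, quoting it would defeat that purpose.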

