rajkrishna89 has asked for the wisdom of the Perl Monks concerning the following question:

I have a huge doc file from which I have to extract some data. The scenario is:

Module: 1
ID:001
Customer : yes
Module: 2
ID:002
Customer : no
Module: 3
ID:003
Customer : yes
Module: 4
ID:004
Customer : no

I have to extract the ID number and Module number of every entry whose Customer tag is "Yes".

I used an array to extract the ID when the tag is "Yes", but it is not picking up the module name.

my $i;
for ($i = 0; $i < @array; $i++) {
    if ($array[$i] =~ /^Customer\s : \sYes/) {
        for ($count = $i; $count >= 1; $count--) {
            if ($array[$count] =~ /ID\s*:\s*(.+)/) {
                $ID = $1;    # PLACE WHERE THE ID GET EXTRACTED
                print "$ID \n";
                for ($count = $i; $count >= 1; $count--) {
                    if ($array[$count] =~ /Module:\s*(.+)/) {
                        $MODULE_NAME = $1;
                        print "$MODULE_NAME \n";
                        my $Mycell1 = $Sheet->Range($Sheet->Cells($row, $col), $Sheet->Cells($row, $col+2));
                        $Mycell1->{Value} = ["$ID", "$MODULE_NAME", "$File"];
                        $row++;
                        goto breakingfunction;
                    }
                }
            }
        }
        breakingfunction:
    }
}

I'm able to extract the ID but not the Module number. Please help, monks.

Replies are listed 'Best First'.
Re: Need Help in Array
by Utilitarian (Vicar) on Jan 03, 2012 at 14:36 UTC
    Take a look at the following:
    #!/usr/bin/perl
    use strict;
    use warnings;

    my %modules;
    my $current_module;
    while (<DATA>) {
        if (/Module:\s*(\d+)/) {
            $current_module = $1;
        }
        else {
            $modules{$current_module}{$1} = $2 if (/^(\w+)\s*:\s*(\w+)$/);
        }
    }
    use Data::Dumper;
    print Dumper(\%modules);
    __DATA__
    Module: 1
    ID:001
    Customer : yes
    Module: 2
    ID:002
    Customer : no
    Module: 3
    ID:003
    Customer : yes
    Module: 4
    ID:004
    Customer : no
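Once %modules is built, picking out the customer modules is a filter over the hash. A minimal sketch, assuming the field names from the sample data (ID, Customer) and a case-insensitive comparison with "yes"; the sub name customer_modules is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: given the hash of hashes built above, return
# [ module, ID ] pairs for every module whose Customer field is "yes".
sub customer_modules {
    my ($modules) = @_;
    return map  { [ $_, $modules->{$_}{ID} ] }
           grep { lc( $modules->{$_}{Customer} // '' ) eq 'yes' }
           sort { $a <=> $b } keys %$modules;
}

# Hand-written stand-in for the structure the parsing loop produces
# from the sample data.
my %modules = (
    1 => { ID => '001', Customer => 'yes' },
    2 => { ID => '002', Customer => 'no'  },
    3 => { ID => '003', Customer => 'yes' },
    4 => { ID => '004', Customer => 'no'  },
);

printf "Module %s, ID %s\n", @$_ for customer_modules( \%modules );
```

The numeric sort only fixes the output order; it relies on module keys being digits, which the \d+ capture guarantees.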

    print "Good ",qw(night morning afternoon evening)[(localtime)[2]/6]," fellow monks."
Re: Need Help in Array
by davido (Cardinal) on Jan 03, 2012 at 14:37 UTC

    This snippet makes a few assumptions and glosses over a few implementation details:

    • It assumes the input is well formed: "yes" or "no" always comes at the end of a complete module and ID definition sequence.
    • It ignores details of the file's "doc" encoding (which I assume you have already figured out).
    • It ignores details of your $Sheet object (again assuming you know what to do once the data has been properly extracted).

    Given those assumptions the following snippet will parse the file and place each "module/id" pair into an anonymous array, with all of them pushed onto an array poorly named @results.

    use strict;
    use warnings;
    use autodie;

    my $filename = 'something.doc';
    my @results;

    open my $in_fh, '<', $filename;

    my( $module, $id );
    while( my $line = <$in_fh> ) {
        chomp $line;
        if( $line =~ /^Module:\s+(\d+)/ ) {
            $module = $1;
            next;
        }
        if( $line =~ /^ID:(\d+)/ ) {
            $id = $1;
            next;
        }
        if( $line =~ /yes/ ) {
            push @results, [ $id, $module ];
        }
    }

    It may feel unclean allowing $module and $id to retain values from one iteration to the next, even after a 'no' in the 'Customer:' field. But as long as the input is well formed, $module and $id will always contain valid information at the moment a 'yes' is encountered.
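To show what ends up in @results and how to walk it afterwards, here is a minimal self-contained sketch. It assumes the sample data from the question with one field per line, reads it from an in-memory filehandle rather than a real file, and wraps the loop in a sub named parse_yes_pairs, which is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical wrapper around the loop above: takes the text to parse
# and returns a list of [ id, module ] pairs for the "yes" records.
sub parse_yes_pairs {
    my ($text) = @_;
    open my $in_fh, '<', \$text or die "Cannot open in-memory handle: $!";

    my ( $module, $id, @results );
    while ( my $line = <$in_fh> ) {
        chomp $line;
        if ( $line =~ /^Module:\s+(\d+)/ ) { $module = $1; next; }
        if ( $line =~ /^ID:(\d+)/ )        { $id     = $1; next; }
        # Anchored to the Customer field, unlike the bare /yes/ above.
        push @results, [ $id, $module ] if $line =~ /^Customer\s*:\s*yes/i;
    }
    return @results;
}

my $sample = join "\n",
    'Module: 1', 'ID:001', 'Customer : yes',
    'Module: 2', 'ID:002', 'Customer : no',
    'Module: 3', 'ID:003', 'Customer : yes',
    '';

# Each element is an array reference, so unpack it before use.
for my $pair ( parse_yes_pairs($sample) ) {
    my ( $id, $module ) = @$pair;
    print "ID $id belongs to module $module\n";
}
```

The unpacked ( $id, $module ) pair is what you would hand to your spreadsheet code, one row per pair.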


    Dave

Re: Need Help in Array
by ansh batra (Friar) on Jan 03, 2012 at 19:03 UTC
    #! /usr/bin/perl
    open(FILE,"<file.txt");
    my @lines=<FILE>;
    close(FILE);
    my $module;
    my $id;
    foreach my $line(@lines)
    {
        chomp($line);
        if($line=~ /Module.*/)
        {
            $line=~ /:\s*/;$module=$';
        }
        if($line=~ /ID.*/)
        {
            $line=~ /:\s*/;$id=$';
        }
        if($line=~ /Customer : yes/i)
        {
            print "$module and $id\n";
        }
    }
    outputs (with your data)
    1 and 001
    3 and 003

      Your sample would be better written:

      #!/usr/bin/perl
      use warnings;
      use strict;

      my $dataStr = <<DATA;
      Module: 1
      ID:001
      Customer : yes
      Module: 2
      ID:002
      Customer : no
      Module: 3
      ID:003
      Customer : yes
      Module: 4
      ID:004
      Customer : no
      DATA

      my $filename = 'file.txt';
      #open my $inFile, '<', $filename or die "Unable to open $filename: $!\n";
      open my $inFile, '<', \$dataStr;
      my $module;
      my $id;

      while (defined (my $line = <$inFile>)) {
          chomp ($line);

          if ($line =~ /Module:\s*(\S+)/) {
              $module = $1;
              next;
          }

          if ($line =~ /ID:\s*(\S+)/) {
              $id = $1;    # use the capture rather than a second match and $'
              next;
          }

          if ($line =~ /Customer : (\w+)/i) {
              print "$module and $id\n" if $1 eq 'yes';
              # reset so a record with a missing field is flagged, not filled from the previous one
              $module = '-- missing module --';
              $id = '-- missing ID --';
          }
      }

      This version uses three-argument open, lexical filehandles, sample data included in the code, a checked open result (in the commented-out "real" open), explicit strictures, a while loop rather than a file slurp and for loop, and improved handling of badly formed data.
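To see what the placeholder reset buys, consider a record whose ID line is missing. A minimal sketch, assuming the same one-field-per-line layout. The sub name report_customers and the broken sample are made up for illustration. Unlike the code above, it initialises both variables to the placeholders up front and lower-cases the captured value before comparing:

```perl
use strict;
use warnings;

# Hypothetical helper: same loop as above, but returning the report
# lines instead of printing them, so the effect is easy to inspect.
sub report_customers {
    my ($text) = @_;
    open my $in, '<', \$text or die "Cannot open in-memory handle: $!";

    my $module = '-- missing module --';
    my $id     = '-- missing ID --';
    my @report;
    while ( defined( my $line = <$in> ) ) {
        chomp $line;
        if ( $line =~ /Module:\s*(\S+)/ ) { $module = $1; next; }
        if ( $line =~ /ID:\s*(\S+)/ )     { $id     = $1; next; }
        if ( $line =~ /Customer : (\w+)/i ) {
            push @report, "$module and $id" if lc($1) eq 'yes';
            # Reset so the next record cannot inherit stale values.
            $module = '-- missing module --';
            $id     = '-- missing ID --';
        }
    }
    return @report;
}

# Module 2 has lost its ID line.
my $broken = "Module: 1\nID:001\nCustomer : no\nModule: 2\nCustomer : yes\n";
print "$_\n" for report_customers($broken);
```

With the reset in place, module 2 is reported with the placeholder ID. Without it, module 2 would silently be paired with ID 001 carried over from module 1.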

      True laziness is hard work