august3 has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,
I am trying to extract a single line of text from multiple files in a folder, and I am trying to do this without opening the files for reading. I don't know if this is a good idea. The code that I wrote using grep is not working. I would appreciate it if you could correct me. The code snippet is provided below. Each file has multiple lines but ONLY ONE line that starts with CQA_STATUS, and I am trying to extract ONLY this line. This line looks like CQA_STATUS,TEST_NAME,PASS,FAIL,PASS,FAIL,FAIL,FAIL,PASS
Thanks
august
my $usage = "usage: get_cqa_status.pl <cqa_file_path>";
my $cqapath = "";
my $status_line = "";
############################################
if ($#ARGV <0) {
    die "$usage";
}
$cqapath = $ARGV[0];
my @cqafiles = glob("$cqapath/cqa_*");
open (FILE, "> $cqapath/CQA_STATUS")||die "can't open file $cqapath/CQA_STATUS for writing.\n";
foreach my $item (@cqafiles) {
    my @stats = grep ('CQA_STATUS', $item);
    $status_line = $stats[0];
    #Remove everything before the first comma
    #$status_line = substr($status_line, (index $status_line, ',')+1);
    print FILE "$status_line\n";
}
close(FILE) || die "can't close$cqapath/CQA_STATUS after writing\n";

Replies are listed 'Best First'.
Re: extract a single line from multiple files in a folder.
by codeacrobat (Chaplain) on Mar 22, 2006 at 19:42 UTC
    What is the matter with egrep?

    The following works fine for me.
    egrep "^CQA_STATUS" $CQAPATH/cqa_*

    If the line needs further processing, you can always pipe it into perl, sed or other.
Re: extract a single line from multiple files in a folder.
by davido (Cardinal) on Mar 22, 2006 at 19:37 UTC

    Why do you not want to open the files for reading? How do you expect to read that one line without opening the file? Imagine the concept of finding one particular sentence in a book, without opening the book. There is no way to see what is not available to be seen. An un-opened file's contents are not available to be seen.

    I would do it like this:

    my @cqafiles = glob( "$cqapath/cqa_*" );
    open( STATUS, '>', "$cqapath/CQA_STATUS" )
        or die "Can't open file $cqapath/CQA_STATUS for writing." . "\n$!";
    foreach my $file ( @cqafiles ) {
        open my $handle, '<', $file or die "Can't open $file.\n$!";
        FINDLINE: while( my $line = <$handle> ) {
            next unless $line =~ m/^CQA_STATUS/;
            $line =~ s/^[^,]+//;    # drop the leading CQA_STATUS field (the comma stays)
            print STATUS $line;
            last FINDLINE;          # only one such line per file
        }
    }
    close STATUS or die "Can't close status file\n$!";

    This is an untested snippet. It will still need your first few lines to set things up. ...give it a try. ;)


    Dave

Re: extract a single line from multiple files in a folder.
by CountOrlok (Friar) on Mar 22, 2006 at 19:19 UTC
    The grep function in Perl is not the same as the Unix utility grep. Your best bet is to open each file in @cqafiles for reading and do a pattern match on each line:
    foreach my $item (@cqafiles) {
        open CQAFILE, "<$item" or die "can't open $item: $!";
        while (<CQAFILE>) {
            next unless /^CQA_STATUS,/;
            print FILE $';    # $' holds the text after the match
        }
        close CQAFILE;
    }
    -imran
Re: extract a single line from multiple files in a folder.
by GrandFather (Saint) on Mar 22, 2006 at 19:32 UTC

    Where to start? Maybe you want index rather than grep:

    my $found = -1 != index $item, 'CQA_STATUS';

    or you might want grep rather than foreach:

    my @stats = grep { /CQA_STATUS/ } @cqafiles;

    except that (although, being a Windows user, I don't use glob much) I don't think that is the right thing at all. What are you expecting to match? The best this code could do is find a file whose name includes CQA_STATUS - very likely not what you want.
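To make the point about file names concrete, here is a minimal sketch of what Perl's grep sees. The file names are made up for illustration and stand in for whatever glob would return; no files are read at any point.

```perl
use strict;
use warnings;

# Hypothetical file names, standing in for what glob("$cqapath/cqa_*") returns.
my @cqafiles = ('logs/cqa_001', 'logs/cqa_002', 'logs/CQA_STATUS_old');

# Perl's grep tests each list element (here, the name itself), never file contents.
my @by_name = grep { /CQA_STATUS/ } @cqafiles;
print "matched by name: @by_name\n";

# The form used in the question, grep('CQA_STATUS', ...), evaluates the constant
# string 'CQA_STATUS' as the test. A non-empty string is always true, so every
# element passes through unchanged.
my @always = grep('CQA_STATUS', @cqafiles);
print "constant test keeps: ", scalar(@always), " of ", scalar(@cqafiles), "\n";
```

So in the original loop, $stats[0] would simply be the file name that was passed in, which is what ends up in the output file.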

    If you need to search files for some text, you've got to search the files! There just ain't no way around it. The following may serve as a starting point:

    use strict;
    use warnings;

    my $usage = "usage: get_cqa_status.pl <cqa_file_path>";
    my $cqapath = "";
    my $status_line = "";
    ############################################
    if ($#ARGV <0) {
        die "$usage";
    }
    $cqapath = $ARGV[0];
    # Point <> at the cqa_* files; left as-is, @ARGV would hand it the directory itself.
    @ARGV = glob("$cqapath/cqa_*");
    open (FILE, "> $cqapath/CQA_STATUS")||die "can't open file $cqapath/CQA_STATUS for writing.\n";
    while (<>) {
        next if ! /^CQA_STATUS/;
        print FILE;
    }
    close(FILE) || die "can't close$cqapath/CQA_STATUS after writing\n";

    DWIM is Perl's answer to Gödel
Re: extract a single line from multiple files in a folder.
by bfdi533 (Friar) on Mar 22, 2006 at 21:21 UTC
    You could always swap Perl's grep for the system grep by using the backticks operator as follows (beginning of your code omitted):
    ...
    foreach my $item (@cqafiles) {
        my $status_line = `grep '^CQA_STATUS' $item`;
        chomp $status_line;    # grep's output already ends in a newline
        print FILE "$status_line\n";
    }
    close(FILE) || die "can't close$cqapath/CQA_STATUS after writing\n";
Re: extract a single line from multiple files in a folder.
by graff (Chancellor) on Mar 23, 2006 at 05:13 UTC
    Are you writing a perl script to do this because you don't have unix tools like "grep"? (There's no good reason why you should lack those tools.)
    # shell command, assuming there are not tons of files in file_path:
    grep -h ^CQA_STATUS file_path/cqa_* > file_path/CQA_STATUS

    # but if there are tons of files, do it like this:
    find file_path -name 'cqa_*' | xargs grep -h ^CQA_STATUS > file_path/CQA_STATUS

    # (update: added carets where needed)
    Now, if these files had two or more lines starting with CQA_STATUS, and you only wanted to extract one of those lines, then you'd probably want a perl script -- and yes, you would want to go ahead and open each file in turn, and use perl's grep function (and/or whatever else is necessary to pick the particular line you want) in order to extract the target line from each file. Something like:
    open( O, ">", "file_path/CQA_STATUS" ) or die "CQA_STATUS: $!";
    for my $file ( <file_path/cqa_*> ) {
        open( I, "<", $file ) or do { warn "$file: $!"; next };
        while (<I>) {
            # suppose we only want the first occurrence from each file
            if ( /^CQA_STATUS/ ) {
                print O;
                last;
            }
        }
        close I;
    }
    close O;
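For comparison, the variant that uses perl's grep function on an opened file might look roughly like the sketch below. The helper name first_status_line and the temporary demo file are assumptions made for illustration, not part of the original thread.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the first line of $file that starts with
# CQA_STATUS, or undef if the file has no such line.
sub first_status_line {
    my ($file) = @_;
    open my $in, '<', $file or die "$file: $!";
    # In list context <$in> yields every line; grep keeps the matching ones,
    # and the list assignment picks the first of them.
    my ($first) = grep { /^CQA_STATUS/ } <$in>;
    close $in;
    return $first;
}

# Demo on a throwaway file with two candidate lines.
my ($fh, $name) = tempfile(UNLINK => 1);
print {$fh} "HEADER,x\nCQA_STATUS,TEST_NAME,PASS,FAIL\nCQA_STATUS,OTHER,FAIL\n";
close $fh;

print first_status_line($name);
```

Note that this reads the whole file into memory before filtering, whereas the while/last loop stops at the first hit, so the loop is likely the better choice for large files.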