reciter has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks

I have lakhs of files in a folder. I need to read each one in Perl, locate the variables x2, z2 and some_t2, and compare their values against thresholds (x2 >= 0.6, z2 >= 0.5 and some_t2 >= 0.4). If a file satisfies all three conditions, I want either to print its filename or to write the whole file into a new file. (My input files contain other variables as well, which should not be considered for filtering.)

Example of a file:

$somecontent
1somecontent
somecontent
$
0 1 1 1 0 00
somecontent
somecontent
x2 = 3.6235
z2 = 0.1036
F_eroie = 0.6156
someothervali = 0.9976
somecontent
Some_t2 = 0.8456
se_x2 = -0.6545
x2_se = 0.9647
z2_se = 0.6954
---------------------------------
somecontent
somecontent
$
$
$somecontent

This is my pathetic attempt at the code. I started with the aim of printing the filename of a single file if all the conditions are true, but even that is not working.

#usr/bin/perl
use strict;
use warnings;
for my $i= 6000000
{
open (FH, "try_$i.txt") or die"Can't Open file";
my @readline=<FH>;
my $filename = <FH>;
close(FH);
my $pat='/^x2=\s';
my $pat1='/^z2=\s';
my $pat2='/^some_t2=\s';
foreach my $line (@readline)
{
$pat=~'/^x2=\s';
my $x2=$line;
$pat1=~'/^z2=\s';
my $z2=$line;
$pat2=~'/^some_t2=\s';
my $some_t2=$line;
$x2 >= 0.6 and $z2 >= 0.5 and $some_t2 >= 0.4;
}
print $filename;
close(FH);
}

And I got completely puzzled when I tried opening multiple files. All my files are named like out_1, out_2, out_3 and so on, so I tried using a loop to open the files as well.

But that doesn't work either.

Please help.

(I have textual content as well as numerical values in the file)

Re: filter the files in a folder on the basis of some variable present in them
by Discipulus (Canon) on Jun 10, 2015 at 08:34 UTC
    Mmh, I'm not sure I have understood your whole question. Anyway, I have some hints for you.

    First, your method of getting the file list ($i = 6000000) is, mmh, how to say, bizarre? Perl has a glob function; use it.
    Second, never use bareword filehandles (FH); use the lexical form.

    Third, if you really have so many files, copy a small bunch of them (with some positive cases included) into a development directory and write your script against them, so you'll get fast feedback.

    A basic approach can be similar to this pseudo-code:
    #pseudo-code
    $|++; #flush output to stdout as soon as possible
    my @files = glob '*.txt';
    foreach my $file (@files){
        my ($var_one, $var_two ...); #your var names to be checked
        open my $fh, '<', $file or die "...";
        while (<$fh>) {
            #use regex to put something inside $var_one, $var_two
        }
        # if (all vars needed are defined and pass your check){ print "$file IS VALID\n"; system "mv $file /new/path" }
    }


    HtH
    L*
    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

      Hi Discipulus, I am a novice at Perl, which is why I make so many blunders. I tried another way to filter the files: instead of using x2, z2 and so on as variables, I tried to search for them as patterns. But that is also having problems. Can you please check it and tell me what I can do to get it working? I have updated it once.

      #usr/bin/perl
      use strict;
      use warnings;
      $|++; #flush output to stdout as soon as possible
      my @files = glob '*.txt';
      my $pat='/^x2=0.[6-9][0-9][0-9][0-9][0-9]\n';
      my $pat1='/^z2=0.[6-9][0-9][0-9][0-9][0-9]\n';
      my $pat3='/^some_t2=0.[6-9][0-9][0-9][0-9][0-9]\n';
      foreach my $file (@files)
      {
      foreach my $file1(@files)
      {
      foreach my $files2(@files)
      {
      if (($file=~/^$pat/) && ($file1=~/^$pat1/) && ($$file2=~/^$pat3/))
      {
      print "$file IS VALID\n";
      system 'mv $file E:\test\some'
      }
      }
      else
      print "not relevant\n"
      }

        Instead using x2 z2 and all as variable I tried to search them as pattern ??
        Also, the triple foreach makes no sense to me. First, a regex can be precompiled using the qr operator; second, you are applying the regex to the filename, not the content! You need a lot of practice with Perl basics: opening files, regexes, loops...

        Anyway, following the basic structure mentioned by me above, and given the following folder content:
        ls -l
        -rw-rw-rw- 1 user group   26 Jun 10 13:27 invalid.txt
        -rw-rw-rw- 1 user group 1049 Jun 10 13:44 reciter.pl
        -rw-rw-rw- 1 user group   36 Jun 10 13:28 valid.txt

        cat invalid.txt
        dfd
        wdfq
        qwef
        z2=0.7

        cat valid.txt
        adf
        df
        x2=0.7
        z2=0.7
        some_t2=0.7
        you must have something like (tested working code):
        #!/usr/bin/perl
        use strict;
        use warnings;
        $|++; #flush output to stdout as soon as possible
        my @files = glob '*.txt';
        my $pat = qr/^x2=(0.[6-9])$/;
        my $pat1= qr/^z2=(0.[6-9])$/;
        my $pat3= qr/^some_t2=(0.[6-9])$/;

        foreach my $file (@files){
            my ($var_one, $var_two, $var_three); #your var names to be checked
            print "checking '$file'\n";
            open my $fh, '<', $file or die "...";
            while (<$fh>) {
                #use regex to put something inside $var_one, $var_two..
                chomp $_;
                if ($_ =~ $pat)  {$var_one   = $1; print "\tfound:'$_'\n"}
                if ($_ =~ $pat1) {$var_two   = $1; print "\tfound:'$_'\n"}
                if ($_ =~ $pat3) {$var_three = $1; print "\tfound:'$_'\n"}
            }
            #
            #if (all vars needed are defined and pass your check){ print "$file IS VALID\n"; system 'mv $file /new/path' }
            if (defined $var_one && defined $var_two && defined $var_three ) {
                print "FILE $file has a valid content ( x2=$var_one, z2=$var_two, some_t2=$var_three)\n";
                # system "mv $file x:/valid_files"
            }
        }
        and the output will be:
        perl reciter.pl
        checking 'invalid.txt'
                found:'z2=0.7'
        checking 'valid.txt'
                found:'x2=0.7'
                found:'z2=0.7'
                found:'some_t2=0.7'
        FILE valid.txt has a valid content ( x2=0.7, z2=0.7, some_t2=0.7)


        HtH
        L*
        There are no rules, there are no thumbs..
        Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

        A few problems with your code:

        • the regular expressions (regexes or patterns) are inadequate
        • you're not searching each file for the patterns (you're searching the file names)

        I also recommend precompiling the regexes before use via the qr// operator. Try to use those clues (and the help from Discipulus) to modify your code.
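To make those clues concrete, here is a minimal sketch of one way to apply precompiled patterns to file content rather than to file names, and to compare the captured numbers against the thresholds from the question. It is not code from the thread: the helper name passes_filter and the %min hash are hypothetical, and it assumes one "name = value" assignment per line with optional spaces around the "=", as in the sample file. Names are matched case-insensitively because the sample spells one of them "Some_t2".

```perl
use strict;
use warnings;

# Thresholds from the original question (assumed to be inclusive minimums).
my %min = (x2 => 0.6, z2 => 0.5, some_t2 => 0.4);

# One precompiled pattern: optional leading space, a wanted name, "=", a number.
# Anchoring at ^ keeps names such as se_x2 or x2_se from matching.
my $names = join '|', map { quotemeta } sort keys %min;
my $pat   = qr/^\s*($names)\s*=\s*(-?\d+(?:\.\d+)?)\s*$/i;

# Hypothetical helper: true if every wanted variable appears in the lines
# read from $fh and its value meets the minimum.
sub passes_filter {
    my ($fh) = @_;
    my %seen;
    while (my $line = <$fh>) {
        next unless $line =~ $pat;
        $seen{ lc $1 } = $2;    # if a name repeats, the last value wins
    }
    for my $name (keys %min) {
        return 0 unless defined $seen{$name} && $seen{$name} >= $min{$name};
    }
    return 1;
}

# Demo on an in-memory "file" so the sketch runs without touching the disk.
my $sample = "junk\nx2 = 3.6235\nz2 = 0.1036\nSome_t2 = 0.8456\nse_x2 = -0.6545\n";
open my $demo_fh, '<', \$sample or die "Cannot open in-memory file: $!";
print passes_filter($demo_fh) ? "VALID\n" : "not relevant\n";
close $demo_fh;
```

With real files, the same helper could be called on a lexical filehandle opened for each name returned by glob, as in the code from Discipulus.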

Re: filter the files in a folder on the basis of some variable present in them
by Ratazong (Monsignor) on Jun 10, 2015 at 08:16 UTC

    Hi reciter!

    The code below shows how I handle checks to be applied on more than one file:

    • I use File::Find to loop through all files
    • With the sub checkFile I do some basic checking on each of the files (e.g. if it is empty)
    • With sub analyze I open a relevant file and process it; if you use the same approach, you'll move most of your code into that function
    HTH, Rata

    use File::Find;
    use File::stat;

    find(\&checkFile, $path);   # loop through all files in my directory

    #--------------------------------- check to be applied on each file -------------
    sub checkFile
    {
        my $fullfilename = $File::Find::name;
        my $filesize     = stat($File::Find::name)->size;
        my $filename     = $_;

        # --- check for relevance based on expected filenames
        if ((! ($filename =~ /_obj_pp_comp.txt/ )) && (! ($filename =~ /obj.doc/ ))) { return; }
        if ($filesize == 0) { print ("<!-- \t\tERROR: file $fullfilename is empty. -->\n"); return; }
        if (! -f $filename) { print ("<!-- \t\tERROR: file $fullfilename is a directory. -->\n"); return; }

        analyze($fullfilename);
    }

    #--------------------------------- work to be done on each relevant file -------------
    sub analyze
    {
        ...
    }

      Hi Ratazong, I am a novice at Perl, which is why I make so many blunders. I tried another way to filter the files: instead of using x2, z2 and so on as variables, I tried to search for them as patterns. But that is also having problems. Can you please check it and tell me what I can do to get it working?

      #usr/bin/perl
      use strict;
      use warnings;
      use File::Find;
      use File::stat;
      my $path='e:\testfolder';
      find(\&checkFile, $path); # loop through all files in my directory
      #--------------------------------- check to be applied on each file -------------
      sub checkFile
      {
      my $fullfilename = $File::Find::name;
      my $filesize = stat($File::Find::name)->size;
      my $filename = $_;
      # --- check for relevance based on expected filenames
      if ((! ($filename =~ /_obj_pp_comp.txt/ )) && (! ($filename =~ /obj.doc/ ))) { return; }
      if ($filesize == 0) { print ("<!-- \t\tERROR: file $fullfilename is empty. -->\n"); return; }
      if (! -f $filename) { print ("<!-- \t\tERROR: file $fullfilename is a directory. -->\n"); return; }
      analyze($fullfilename);
      }
      #--------------------------------- work to be done on each relevant file -------------
      sub analyze
      {
      my @files = $fullfilename;
      my $pat='/^x2=0.[6-9][0-9][0-9][0-9][0-9]\n';
      my $pat1='/^z2=0.[6-9][0-9][0-9][0-9][0-9]\n';
      my $pat3='/^some_t2=0.[6-9][0-9][0-9][0-9][0-9]\n';
      foreach my $file (@files)
      {
      foreach my $file1(@files)
      {
      foreach my $files2(@files)
      {
      if (($file=~/^$pat/) && ($file1=~/^$pat1/) && ($file2=~/^$pat3/))
      {
      print "$file IS VALID\n";
      system 'mv $file E:\test\some'
      }
      else
      print "not relevant\n"
      }
      }
      }
      }
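For comparison, the following is a rough sketch of what an analyze sub that looks at the file content (not the file name) might look like within the File::Find approach. It is an illustration under assumptions, not Ratazong's code: the %min thresholds are taken from the question, the directory is read from the command line, and the file is assumed to hold one "name = value" pair per line. Unlike a line-by-line loop, it reads the whole file at once and uses the /m modifier so that ^ matches at the start of every line.

```perl
use strict;
use warnings;
use File::Find;

# Thresholds from the original question (assumed to be inclusive minimums).
my %min = (x2 => 0.6, z2 => 0.5, some_t2 => 0.4);

# Hypothetical analyze(): returns true if the file at $path contains every
# wanted variable with a value at or above its minimum.
sub analyze {
    my ($path) = @_;
    open my $fh, '<', $path
        or do { warn "Cannot open '$path': $!\n"; return 0 };
    my $text = do { local $/; <$fh> };    # slurp the whole file
    close $fh;
    $text = '' unless defined $text;      # empty file

    for my $name (keys %min) {
        # /m: ^ matches at each line start; /i: the sample spells "Some_t2"
        my ($value) = $text =~ /^\s*\Q$name\E\s*=\s*(-?\d+(?:\.\d+)?)/mi;
        return 0 unless defined $value && $value >= $min{$name};
    }
    return 1;
}

# Directory to scan is taken from the command line (an assumption).
my $dir = shift @ARGV;
if (defined $dir && -d $dir) {
    my @valid;
    # By default find() changes into each directory, so $_ is the bare file name.
    find(sub { push @valid, $File::Find::name if -f $_ && analyze($_) }, $dir);
    print "$_ IS VALID\n" for @valid;
}
```

Collecting the matching names first and moving or copying the files afterwards avoids changing the directory while find is still walking it.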

Re: filter the files in a folder on the basis of some variable present in them
by Laurent_R (Canon) on Jun 10, 2015 at 08:24 UTC
    Hi,

    first you should indent your code, this is really a must.

    Next, what do you intend with this:

    for my $i= 6000000
    It does not make any sense to me.

    No time now to go through the rest of your code but, basically, you need to apply your regexes to each line of the file and set the right variable when the regex is successful.

      I was actually trying to open all the files present in the folder. The only difference between them is the last character, i.e. each file name ends in 1, 2, 3 (in increasing order).
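If the file names really differ only in a trailing number, a loop over a range is one way to visit them. The sketch below is only an illustration: the helper name_for, the "out_" prefix (taken from the names mentioned in the question) and the upper bound are assumptions to adjust to the real files.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the file name for a given number.
# Append ".txt" here if the real files carry an extension.
sub name_for {
    my ($i) = @_;
    return "out_$i";
}

my $last = 3;    # assumed upper bound; the question hints at a far larger one

# "for my $i (1 .. $last)" is the valid form of the loop that
# "for my $i= 6000000" was presumably meant to be.
for my $i (1 .. $last) {
    my $name = name_for($i);
    next unless -f $name;          # skip numbers for which no file exists
    print "would process $name\n";
}
```

The glob function suggested by Discipulus is the alternative when the exact numbers are not known in advance.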

Re: filter the files in a folder on the basis of some variable present in them
by ww (Archbishop) on Jun 10, 2015 at 13:53 UTC
    "it doesn't work"

    How doesn't it work? Show us the error messages (verbatim, please) and the warnings, if any. Provide a narrative description of how the output differs from what you want.

    Help us to help you.

    Questions containing the words "doesn't work" (or their moral equivalent) will usually get a downvote from me unless accompanied by:
    1. code
    2. verbatim error and/or warning messages
    3. a coherent explanation of what "doesn't work" actually means.

    How do I post a question effectively?