in reply to Re: Making commond for large number of files
in thread Making commond for large number of files

I am a chemistry student and have just started working on computational chemistry. I am a new Perl user and have only just started learning it. I want to complete the following code. This is what I have done so far.

#!/usr/bin/perl
# Write a single output file with
# Bond_length Delocalization_range EDR
# from each calculation here
use strict;

my $ELowSoFar = 0.0;    # The lowest energy found so far
my $FileLowSoFar = '';  # The file containing the lowest energy so far

foreach my $files (<*log>){   # Loop over all of the files
    # Find the energy in this file
    my $E=`grep "SCF Done" $files|awk "{print \\\$5}"`; chomp($E);

    # Check if the energy in this file is LOWER than the lowest energy so far
    # If it is, then it is the NEW lowest energy so far
    # and the file containing is the new FileLowSoFar

    print "File $files has energy $E and the lowest energy so far is $ELowSoFar\n";
}
print "The lowest total energy was $ELowSoFar\n";
print "This was in file $FileLowSoFar\n";

Re^3: Making commond for large number of files
by FreeBeerReekingMonk (Deacon) on Apr 18, 2015 at 12:41 UTC

    Ok, great, that looks like a great start. I was checking http://nbo6.chem.wisc.edu/tut_del.htm and I assume now that EDu is the calculated energy of the Delocalization (some seem to call it deletion) in atomic units. And what you need to calculate is the Energy Delocalization Range. How about this:

    #!/usr/bin/perl
    # Write a single output file with
    # Bond_length Delocalization_range EDR
    # from each calculation here
    use strict;

    my $ELowSoFar = undef;  # The lowest energy found so far
    my $FileLowSoFar = '';  # The file containing the lowest energy so far
    my %FILE2SCF;           # $FILE2SCF{"FILENAME"} = Energy

    foreach my $file (<*log>){   # Loop over all of the files
        # Find the energy in this file
        my $E=`grep "SCF Done" $file|awk "{print \\\$5}"`; chomp($E);

        # Check if the energy in this file is LOWER than the lowest energy so far
        if (!defined $ELowSoFar || $ELowSoFar>$E){
            # If it is, then it is the NEW lowest energy so far
            $ELowSoFar = $E;
            # and the file containing is the new FileLowSoFar
            $FileLowSoFar = $file;
        };
        # store the energy of the file for later use
        $FILE2SCF{$file} = $E;

        print "File $file has energy $E and the lowest energy so far is $ELowSoFar\n";
    }
    print "The lowest total energy was $ELowSoFar\n";
    print "This was in file $FileLowSoFar\n";

    # Now calculate Delocalization_range for each file
    my $minimalvalue = $FILE2SCF{$FileLowSoFar};
    for my $file (sort keys %FILE2SCF){
        my $currentvalue = $FILE2SCF{$file};
        print "$file: Energy Delocalization Range=". ($currentvalue-$minimalvalue)."\n";
    }
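    The backtick grep | awk call above can also be done in Perl itself, which avoids starting two external programs per file. A minimal sketch, assuming the energy is the fifth whitespace-separated field of the "SCF Done" line (as the awk command implies) and that the last such line in a file is the one you want. The sub name scf_energy and the sample log text are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the energy from the last "SCF Done" line in a block of log text,
# or undef if there is no such line. Assumes the energy is the 5th
# whitespace-separated field, like awk's $5 in the script above.
sub scf_energy {
    my ($text) = @_;
    my $energy;
    for my $line (split /\n/, $text) {
        next unless $line =~ /SCF Done/;
        my @fields = split ' ', $line;   # split ' ' skips leading whitespace
        $energy = $fields[4];
    }
    return $energy;
}

# Hypothetical log excerpts standing in for the contents of real *.log files.
my %log_text = (
    'mol_1.0.log' => " SCF Done:  E(RHF) =  -1.10000000     A.U. after    5 cycles\n",
    'mol_1.5.log' => " SCF Done:  E(RHF) =  -1.25000000     A.U. after    6 cycles\n",
);

# Track the lowest energy and the file it came from.
my ($lowest_energy, $lowest_file);
for my $file (sort keys %log_text) {
    my $e = scf_energy($log_text{$file});
    next unless defined $e;              # skip files without an SCF energy
    if (!defined $lowest_energy || $e < $lowest_energy) {
        ($lowest_energy, $lowest_file) = ($e, $file);
    }
}
print "The lowest total energy was $lowest_energy in $lowest_file\n";
```

    With real files you would read each log into a string (or loop over its lines) instead of using the %log_text hash.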

      Thank you so much for your help and patience. So nice of you. I have learnt something and this is what I want.

      This code is now working for me and it has solved my big problem. I shall learn slowly.

      But I still have a small problem. Each file contains the Electron Delocalization Range table at the end of the file. Here is a sample:

      Index  Exponent        U               <EDRA>          <EDRB>
       1     0.50000000E+02  0.14142136E+00  0.18121396E-01  0.18121396E-01
       2     0.35714286E+02  0.16733201E+00  0.23255352E-01  0.23255352E-01
       3     0.25510204E+02  0.19798990E+00  0.29810613E-01  0.29810613E-01
       4     0.18221574E+02  0.23426481E+00  0.38155765E-01  0.38155765E-01
       5     0.13015410E+02  0.27718586E+00  0.48736939E-01  0.48736939E-01
       6     0.92967216E+01  0.32797073E+00  0.62081416E-01  0.62081416E-01
       7     0.66405154E+01  0.38806020E+00  0.78791719E-01  0.78791719E-01
       8     0.47432253E+01  0.45915902E+00  0.99523282E-01  0.99523282E-01
       9     0.33880181E+01  0.54328428E+00  0.12493692E+00  0.12493692E+00
      10     0.24200129E+01  0.64282263E+00  0.15561685E+00  0.15561685E+00
      11     0.17285807E+01  0.76059799E+00  0.19194780E+00  0.19194780E+00
      12     0.12347005E+01  0.89995168E+00  0.23395297E+00  0.23395297E+00
      13     0.88192890E+00  0.10648372E+01  0.28110962E+00  0.28110962E+00
      14     0.62994922E+00  0.12599324E+01  0.33218196E+00  0.33218196E+00
      15     0.44996373E+00  0.14907721E+01  0.38513805E+00  0.38513805E+00
      16     0.32140266E+00  0.17639053E+01  0.43723655E+00  0.43723655E+00
      17     0.22957333E+00  0.20870809E+01  0.48535182E+00  0.48535182E+00
      18     0.16398095E+00  0.24694674E+01  0.52651679E+00  0.52651679E+00
      19     0.11712925E+00  0.29219133E+01  0.55846889E+00  0.55846889E+00
      20     0.83663750E-01  0.34572544E+01  0.57979031E+00  0.57979031E+00
      21     0.59759821E-01  0.40906786E+01  0.58943084E+00  0.58943084E+00
      22     0.42685587E-01  0.48401561E+01  0.58613183E+00  0.58613183E+00
      23     0.30489705E-01  0.57269500E+01  0.56862373E+00  0.56862373E+00
      24     0.21778361E-01  0.67762186E+01  0.53666404E+00  0.53666404E+00
      25     0.15555972E-01  0.80177300E+01  0.49197001E+00  0.49197001E+00
      26     0.11111408E-01  0.94867060E+01  0.43819183E+00  0.43819183E+00
      27     0.79367203E-02  0.11224822E+02  0.37998198E+00  0.37998198E+00
      28     0.56690859E-02  0.13281388E+02  0.32182467E+00  0.32182467E+00
      29     0.40493471E-02  0.15714751E+02  0.26720453E+00  0.26720453E+00
      30     0.28923908E-02  0.18593944E+02  0.21830006E+00  0.21830006E+00
      31     0.20659934E-02  0.22000651E+02  0.17608839E+00  0.17608839E+00
      32     0.14757096E-02  0.26031521E+02  0.14065210E+00  0.14065210E+00
      33     0.10540783E-02  0.30800911E+02  0.11151704E+00  0.11151704E+00
      34     0.75291305E-03  0.36444130E+02  0.87930132E-01  0.87930132E-01
      35     0.53779504E-03  0.43121276E+02  0.69050633E-01  0.69050633E-01

      I have been using U and the sum of <EDRA> and <EDRB>, and writing the results to a .txt file. Here is the code I was using.

      use strict;
      # Find the lowest-energy geometry
      # Prepare array EDRvars0 containing the EDR at each u from that geometry
      open(F,">results.txt") or die "Cannot write results.txt: $!";
      print F "# Bond_length Delocalization_length EDR \n";
      # Loop over all log files
      foreach my $f (<*log>){
          my $c=`grep -c "Normal term" $f`; chomp($c);  # Avoid files that don't converge
          if($c>0){
              # Find the bond length. We assume this is built into the file name
              my $R = $f; $R=~s/\.log$//; $R=~s/.*_//;
              # Find the U values
              my $Ustr = `grep -A37 "EDR alpha" $f | tail -n35|awk "{print \\\$3}"`;
              my @Uvars = split(/\n/,$Ustr);  # Convert them into an array
              my $NU = scalar(@Uvars);        # That array has $NU elements
              # Find the <EDR(u)> and sum alpha and beta
              my $EDRstr = `grep -A37 "EDR alpha" $f | tail -n35|awk "{print \\\$4+\\\$5}"`;
              my @EDRvars = split(/\n/,$EDRstr);
              # Print the outputs
              foreach my $i(0..$NU-1){
                  print F sprintf("%8.3E %12.6E %12.6E\n",$R,$Uvars[$i],$EDRvars[$i]);
              }
          }
      }
      close(F);

      Now I want to get the difference in EDRs, i.e. Delta<EDR>, of each file from the minimum-energy file: {(<EDRA>+<EDRB>) of each file} - {(<EDRA>+<EDRB>) of the file with minimum energy}. The file with minimum energy is the one found by the code you helped me write. I want the differences written to the same text file.

        I note that the data you are interested in are in the last part of the file. Using reverse <$fh> is a pretty clunky way to read the file backwards, but it works. If you can install File::ReadBackwards, it will be nicer, I think.

        A slightly nicer version of your program:

        foreach my $f (<*log>){
            open my $fh,'<',$f or die "Cannot open $f: $!";
            my @lines = reverse <$fh>;
            close $fh;
            next if ((shift @lines) !~ /Normal termination/);
            # Find the bond length. We assume this is built into the file name
            my $R = $f; $R=~s/\.log$//; $R=~s/.*_//;
            my ($u,$edra,$edrb,@EDRvars,@Uvars);
            foreach my $line (@lines) {
                # Reading backwards, the header marks the top of the table, so stop there.
                # This test must come first, or the header line is skipped and never seen.
                last if ($line =~ m/^ Index/);
                next unless ($line =~ m/^ \d/);
                (undef,undef,$u,$edra,$edrb) = unpack('A7A17A17A17A17',$line);
                push @Uvars, $u;
                push @EDRvars, $edra + $edrb;
            }
            # Print the outputs (rows come out last-to-first because the file was reversed)
            foreach my $i(0..$#Uvars){
                print sprintf("%8.3E %12.6E %12.6E\n",$R,$Uvars[$i],$EDRvars[$i]);
            }
        }
        Dum Spiro Spero

        One way to do it is to pass the minimum-energy filename to the script. Here is some untested code, assuming you do not store the output of the minimum-energy script:

        ./leg.pl `mineng.pl |grep "This was in file"|awk '{print $5}'`

        And then in leg.pl:

        my $minfilename = shift; # = $ARGV[0];
        :
        :
        # somewhere in the loop:
        if($f eq $minfilename){
            # store values to subtract them later
        }

        However, if you DO save that output and have a known filename, let's call it min.txt, then this also works:

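        # \$5 keeps Perl from interpolating its own $5 inside the backticks
        my $minfilename = `grep "This was in file" min.txt|awk '{print \$5}'`;
        chomp($minfilename);  # drop the trailing newline so that eq matches $f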

        And go from there, testing against $f to get its values.
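        To have values to subtract, you first need each file's table in memory. A rough sketch of one way to do that: parse every table into a list of [U, EDRA+EDRB] pairs, keep them in a hash keyed by filename, and pick out the reference afterwards. The sub name parse_edr_table, the hash names and the file names are invented for this example, and the parser assumes the five whitespace-separated columns shown in the sample table:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse the first EDR table found in a block of log text. Returns a reference
# to a list of [U, EDRA + EDRB] pairs, one per table row, in file order.
# Assumes whitespace-separated columns: Index Exponent U <EDRA> <EDRB>.
sub parse_edr_table {
    my ($text) = @_;
    my (@rows, $in_table);
    for my $line (split /\n/, $text) {
        if (!$in_table) {
            $in_table = 1 if $line =~ /^\s*Index\s+Exponent/;
            next;
        }
        my @f = split ' ', $line;
        # Stop at the first line that is not a 5-column row starting with an integer.
        last unless @f == 5 && $f[0] =~ /^\d+$/;
        push @rows, [ $f[2], $f[3] + $f[4] ];
    }
    return \@rows;
}

# Hypothetical file contents; real code would read these from the *.log files.
my $sample = join "\n",
    ' Index Exponent U <EDRA> <EDRB>',
    ' 1 0.50000000E+02 0.14142136E+00 0.18121396E-01 0.18121396E-01',
    ' 2 0.35714286E+02 0.16733201E+00 0.23255352E-01 0.23255352E-01',
    ' Normal termination',
    '';
my %log_text = ('mol_1.0.log' => $sample, 'mol_1.5.log' => $sample);

# Hypothetical: in practice this name comes from the lowest-energy script.
my $minfilename = 'mol_1.5.log';

# Store every file's table, then look up the reference one by name.
my %edr_of;
$edr_of{$_} = parse_edr_table($log_text{$_}) for sort keys %log_text;
my $reference = $edr_of{$minfilename};

printf "%s has %d rows; first sum = %.8f\n",
    $minfilename, scalar @$reference, $reference->[0][1];
```

        Storing all files before subtracting means the loop does not need to reach the minimum-energy file first.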

        There are 35 EDRA and EDRB values in each file, it appears. For each line of the 199 other files, do you need to subtract the EDRA+EDRB value of the corresponding line of the minimum-energy file?

        File with minimum energy
        1 10 20 30 40
        2 20 30 45 50

        Another file
        1 10 11 50 50
        2 20 21 60 60

        Do you want:
        R 11 30
        R 21 25

        30 = (50 + 50) - (30 + 40)
        25 = (60 + 60) - (45 + 50)

        I just put 'R' since I don't know what that value should be.
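        If that is what you want, the subtraction itself is short. An untested-in-your-setup sketch using the toy numbers from this example, assuming each row holds Index, Exponent, U, EDRA, EDRB and that both files have the same number of rows. The sub name delta_edr is made up, and 'R' is still a placeholder:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each row is [Index, Exponent, U, EDRA, EDRB], mirroring the toy example.
my @min_rows   = ([1, 10, 20, 30, 40], [2, 20, 30, 45, 50]);
my @other_rows = ([1, 10, 11, 50, 50], [2, 20, 21, 60, 60]);

# For each row, return [U of the other file, (EDRA+EDRB) of the other file
# minus (EDRA+EDRB) of the minimum-energy file].
sub delta_edr {
    my ($min, $other) = @_;
    die "row counts differ\n" unless @$min == @$other;
    my @out;
    for my $i (0 .. $#$other) {
        my $delta = ($other->[$i][3] + $other->[$i][4])
                  - ($min->[$i][3]   + $min->[$i][4]);
        push @out, [ $other->[$i][2], $delta ];
    }
    return @out;
}

for my $row (delta_edr(\@min_rows, \@other_rows)) {
    print "R $row->[0] $row->[1]\n";
}
```

        For the toy rows this prints "R 11 30" and "R 21 25", since (50 + 50) - (30 + 40) = 30 and (60 + 60) - (45 + 50) = 25.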
        Dum Spiro Spero