http://qs1969.pair.com?node_id=665258


in reply to Re: Parsing Guassian '03 Log Files
in thread Parsing Guassian '03 Log Files

Thank you very much, but unfortunately I need it to work in such a way that I call the script/program, feed it an input filename and an output filename, and have it work its magic. Each file is of undetermined length, and the sections that need to be parsed are also of undetermined length, as each file represents a different chain length of the molecule.

Sorry that I didn't specify that earlier.
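
One way the "feed it an input filename and an output filename" requirement could look is sketched below. This is only a sketch, assuming the two names arrive as command-line arguments; the helper name parse_args and the file names in the comment are hypothetical, not anything from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: accept exactly two file names, otherwise die with a usage line.
sub parse_args {
    my @args = @_;
    die "Usage: $0 <input-log> <output-file>\n" unless @args == 2;
    return @args;
}

# Only act when arguments were given, e.g.:
#   perl parse_log.pl chain5.log chain5.txt   (hypothetical names)
if (@ARGV) {
    my ($infile, $outfile) = parse_args(@ARGV);
    print "Reading $infile, writing $outfile\n";
}
```

The script posted later in this thread reads the two names from standard input with <> instead; either approach satisfies the requirement as stated.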
C(qw/74 97 104 112/);sub C{while(@_){$c**=$C;print (map{chr($C!=$c?shift:pop)}$_),$C+=@_%2!=1?1:0}}

Re^3: Parsing Guassian '03 Log Files
by GrandFather (Saint) on Jan 31, 2008 at 02:34 UTC

    Well, I couldn't do it all for you - what would you do with all the time you saved?

    You may find help in dealing with the file issues in replies to the thread File read and strip ;).


    Perl is environmentally friendly - it saves trees
      Thanks, GF.
      You've been a great help time and again. :)
      I ended up hacking this together when I got into the lab this afternoon. It's ugly, and probably inefficient, but it is entirely within my skillset and it works beautifully for what I need it to do (save for one file, which for some reason gets printed in triplicate, but that is just one case).
      #!/usr/bin/perl
      use strict;
      use warnings;

      # Parse Gaussian '03 output files
      # for the lengths of the bonds
      # in Diaminopolymethine Dyes
      # strictly between Carbon and Nitrogen
      # or Carbon and Carbon

      my $infile, my $outfile;
      my @inlog, my @inlog_n, my @logged;

      chomp($infile = <>);
      chomp($outfile = <>);

      open FILE, "<$infile";
      while(<FILE>)
      {
          push @inlog, $_;
      }
      close FILE;

      for(@inlog)
      {
          push @inlog_n, grep( /^\s?!{1}?\s*(c|n)/, $_);
      }

      @inlog = grep( !/h|c{3}?|nc{2}?|(estimate)|c{2}?n{1}?/, @inlog_n );
      pop @inlog_n for @inlog_n;

      @inlog_n = map{ split( /\s/, $_) } @inlog;
      pop @inlog for @inlog;

      for(@inlog_n)
      {
          push @logged, $_ if $_=~/\d/ && $_!~/[a-zA-Z]/;
      }

      open LOG, ">$outfile";
      for(@logged)
      {
          print LOG "$_\n" if $_>=1;
      }
      close LOG;

      Thanks again for the help!

        You can clean that up a little:

        #!/usr/bin/perl
        use strict;
        use warnings;

        # Parse Gaussian '03 output files for the lengths of the bonds in
        # Diaminopolymethine Dyes strictly between Carbon and Nitrogen or Carbon and
        # Carbon

        chomp (my $infile = <>);
        chomp (my $outfile = <>);

        open FILE, '<', $infile or die "Unable to open $infile: $!";

        my @inlog =
            grep {/\d/ && ! /[a-zA-Z]/ && $_ >= 1}
            map {split (/\s/, $_)}
            grep {!/h|c{3}?|nc{2}?|(estimate)|c{2}?n{1}?/}
            grep {/^\s?!{1}?\s*(c|n)/} <FILE>;

        close FILE;

        open LOG, '>', $outfile or die "Unable to create $outfile: $!";
        print LOG join "\n", @inlog;
        close LOG;

        Your version had some rather odd constructs. The pop loops to clear out arrays were perhaps the strangest. Much better to:

        @array = ();

        The while loop to slurp the file is better as:

        @array = <FILE>;

        The construct my $infile, my $outfile; is odd. Either use my ($var1, $var2); or use two separate statements:

        my $var1; my $var2;

        However, you should declare your variables as close to their first use as possible, so you will not often need to declare a bunch of variables in one place like that anyway. See also the file name variable declarations in my version of your code above.

        You should always use the three parameter version of open and you should always check the result.
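
        As a rough illustration of that advice, here is a minimal sketch of checked, three-argument opens. The helper names read_lines and write_lines are made up for this example, and it uses lexical filehandles (my $in) rather than the bareword FILE and LOG handles used above, which is a swapped-in style choice and not something the code in this thread does.

```perl
use strict;
use warnings;

# Hypothetical helper: return all lines of a file, or die with the OS error.
sub read_lines {
    my ($path) = @_;
    # Three-argument open: the mode is separate from the file name.
    open my $in, '<', $path or die "Unable to open $path: $!";
    my @lines = <$in>;    # list context reads every remaining line
    close $in;
    return @lines;
}

# Hypothetical helper: write one value per line, checking open and close.
sub write_lines {
    my ($path, @values) = @_;
    open my $out, '>', $path or die "Unable to create $path: $!";
    print {$out} "$_\n" for @values;
    close $out or die "Unable to close $path: $!";
    return scalar @values;
}
```

        Keeping the mode as its own argument means a file name that happens to start with > or | cannot change how the file is opened, and the die message carries $! so a failure says why it happened.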

