Like most other people, I was going to skip over your post because of the length of the code and the appalling layout. But then my phone line was disconnect for 4 hours which stopped me doing what I wanted to do, and your code sat on the screen in front on me so I played whilst waiting for the phone to come back.

A few steps to improving your chances of fixing your code, and to getting some help if you can't do it yourself.

  1. 700+ lines reduced to 450

    The first thing I noticed about your code was the appalling layout.

    Admittedly, about half of that was contributed by the PM codewrap 'feature'. I've had to break lines in this reply in stupid places to stop PM from complete screwing the formatting.

    I fixed this by using perl to parse and reformat the whole thing.

    P:\test>perl -MO=Deparse 376142-dl.pl >376142.pl 376142-dl.pl syntax OK

    Try it. You will be pleasently surprised. Consistant indentation makes reading code so much easier.

  2. 450 reduced to 250

    Next was that this block of code

    if ($line =~ /$atomid.*\s*\w+\s+$resid\s+$atomidmatch.* \w+\s+$residmatch\s+\d\ +./ or $line =~ /$atomidmatch.*\s*\w+\s+$residmatch\s+$atomid.* \w+\s+$resid\s+\d +\./) { print OUTFILE "array2[$el]\n"; last; }

    which is repeated, exactly, 36 times. So, I moved that into a subroutine.

    sub outputIf { my( $line, $atomid, $atomidmatch, $residmatch, $resid, $array2_el, $outfile) = @_; if ($line =~ /$atomid.*\s*\w+\s+$resid\s+$atomidmatch.* \w+\s+$residmatch\ +s+\d\./ or $line =~ /$atomidmatch.*\s*\w+\s+$residmatch\s+$atomid.* \w+\s+$resid\ +s+\d\./ ) { print $outfile "array2_el\n"; return 1; } return 0; }

    and replaced each occurance in the main code with

    last if outputIf( $line, $atomid, $atomidmatch, $1, $2, $array2[$el], +\*OUTFILE );
  3. 250 reduces to 125

    That done, it becomes immediately apparent that the main bulk of the program now consists of six, nearly identical repetitions of

    if ($array2[$el + 10] =~ /resid (\d+) and name (H\w+\d*).?\)/ and $array2[$el ] =~ /resid (\d+) and name (H\w+\d*).?\)/) { $resid = $1; $atomid = $2; if ($array2[$el + 10] =~ /resid (\d+) and name (H\w+\d*).\n?\)/) { $residmatch = $1; $atomidmatch = $2; } die "Can't open $distfil\ne" unless open DISTFILE, "$distfile"; while (defined($_ = <DISTFILE>)) { $line = $_; last if outputIf( $line, $atomid, $atomidmatch, $residmatch, $resid, $array2[$el] ); if ($array2[$el + 11] =~ /resid (\d+) and name(H\w+\d*).?\)/) +{ last if outputIf( $line, $atomid, $atomidmatch, $1, $2, $array2[$el] ); } elsif ($array2[$el + 12] =~ /resid (\d+) and name(H\w+\d*).?\) +/) { last if outputIf( $line, $atomid, $atomidmatch, $1, $2, $array2[$el] ); } elsif ($array2[$el + 13] =~ /resid (\d+) and name(H\w+\d*).?\) +/) { last if outputIf( $line, $atomid, $atomidmatch, $1, $2, $array2[$el] ); } elsif ($array2[$el + 14] =~ /resid (\d+) and name(H\w+\d*).?\) +/) { last if outputIf( $line, $atomid, $atomidmatch, $1, $2, $array2[$el] ); } elsif ($array2[$el + 15] =~ /resid (\d+) and name(H\w+\d*).?\) +/) { last if outputIf( $line, $atomid, $atomidmatch, $1, $2, $array2[$el] ); } } }

    So, by sticking that into a subroutine your main for loop reduces to

    @array2 = <KATFILE>; chomp @array2; foreach $el ( 0 .. @array2 ) { next if whileif( \@array2, $el, 10, $distfile, $outfile ); next if whileIf( \@array2, $el, 11, $distfile, $outfile ); next if whileIf( \@array2, $el, 12, $distfile, $outfile ); next if whileIf( \@array2, $el, 13, $distfile, $outfile ); next if whileIf( \@array2, $el, 14, $distfile, $outfile ); next if whileIf( \@array2, $el, 15, $distfile, $outfile ); print OUTFILE "$array2[$el]\n"; }
  4. Besides the stupid name, that subroutine is repetitious. A little juggling renders

    sub whileIf { my( $arrayRef, $el, $offset, $distfile, $outfile ) = @_; if ($array2[$el + $offset] =~ /resid (\d+) and name (H\w+\d*).?\)/ + and $array2[$el ] =~ /resid (\d+) and name (H\w+\d*).?\)/) { $resid = $1; $atomid = $2; die "Can't open $distfil\ne" unless open DISTFILE, "$distfile" +; WHILELOOP: while( defined( $line = <DISTFILE> ) ) { for my $offset2 ( 0 .. 5 ) { last WHILELOOP if $array2[$el + $offset] =~ /resid (\d+) and name (H\w+\d*).\n?\)/ and outputIf( $line, $atomid, $atomidmatch, $1, $2, $array2[$el], $outfile ) + ; } } return 1; } return 0; }
  5. Down to under 100 lines.

    The modified program

    $katfile = '/remote/belgarath/tpukala/pegasus/xplor/signif/25jun2.kat2 +'; $outfile = 'temp.kat'; die "Can't open $outfile" unless open OUTFILE, ">$outfile"; die "Can't open $katfile" unless open KATFILE, "$katfile"; @array = <KATFILE>; chomp @array; $numelements = @array; @arraylist = 0 .. $numelements; foreach $el (@arraylist) { if ($array[$el] =~ /^\)$/ and $array[$el + 1] =~ /^\($/) { print OUTFILE ")\n*\n*\n*\n*\n*\n*\n*\n"; } elsif ($array[$el] =~ /assign/ and $array[$el + 1] =~ /^\($/) { print OUTFILE "assign\n*\n*\n*\n*\n*\n*\n*\n*\n*\n*\n"; } else { print OUTFILE "$array[$el]\n"; } } close KATFILE; close OUTFILE; $distfile = "/remote/belgarath/tpukala/pegasus/xplor/signif/25jun/rnk_ +2\n5jun_signif_.dst"; $katfile = '/remote/belgarath/tpukala/pegasus/xplor/signif/temp.kat'; $outfile = 'temp2.kat'; die "Can't open $outfile" unless open OUTFILE, ">$outfile"; die "Can't open $katfile" unless open KATFILE, "$katfile"; @array2 = <KATFILE>; chomp @array2; foreach $el ( 0 .. @array2 ) { whileif( \@array2, $el, $_, $distfile, \*OUTFILE ) and next for 10 + .. 15; print OUTFILE "$array2[$el]\n"; } close KATFILE; close OUTFILE; $katfile = '/remote/belgarath/tpukala/pegasus/xplor/signif/temp2.kat'; $outfile = 'half.kat'; die "Can't open $outfile" unless open OUTFILE, ">$outfile"; die "Can't open $katfile" unless open KATFILE, "$katfile"; @array3 = <KATFILE>; chomp @array3; $numelements3 = @array3; @arraylist3 = 0 .. $numelements3; foreach $el (@arraylist3) { if ($array3[$el] =~ /^\*$/) { (); } elsif ($array3[$el] eq $array3[$el + 1]) { (); } else { print OUTFILE "$array3[$el]\n"; } } ## Exit main program exit; sub outputIf { my( $line, $atomid, $atomidmatch, $residmatch, $resid, $array2_el, $outfile ) = @_; if ($line =~ /$atomid.*\s*\w+\s+$resid\s+$atomidmatch.* \w+\s+$residmatch\s ++\d\./ or $line =~ /$atomidmatch.*\s*\w+\s+$residmatch\s+$atomid.* \w+\s+$resid\ +s+\d\./ ) { print $outfile "array2_el\n"; return 1; } return 0; } sub whileIf { my( $arrayRef, $el, $offset, $distfile, $outfile ) = @_; if ($array2[$el + $offset] =~ /resid (\d+) and name (H\w+\d*).?\)/ + and $array2[$el ] =~ /resid (\d+) and name (H\w+\d*).?\)/) { $resid = $1; $atomid = $2; die "Can't open $distfil\ne" unless open DISTFILE, "$distfile" +; WHILELOOP: while( defined( $line = <DISTFILE> ) ) { for my $offset2 ( 0 .. 5 ) { last WHILELOOP if $array2[$el + $offset] =~ /resid (\d+) and name (H\w+\d*).\n?\)/ and outputIf( $line, $atomid, $atomidmatch, $1, $2, $array2[$el], $outfile ) + ; } } return 1; } return 0; }

    Still far from perfect. It almost certainly doesn't fix your problem--and given the absence of my ability to test it without the appropraite data, it may well not even do what your original does--but had you posted this code, you would probably have got a lot more interest from people in trying to help you with your problem. Whenever you find yourself copy&pasting code, do so once--into a subroutine.

    Note: I've stuck with teh stupid subroutine names, because I don;t know enough about what your program is doing to choose good ones.

Give me a break! By the way, until about a week ago i had never heard of perl and had no idea whatsoever about computer programming so please be nice!!

Programming is hard. If I walked in to your lab and told you that I had lifted random bits of technique from these papers: alopecia, Mutations cause extra limbs in amphibians, in my attempt to create a 4-legged bald chicken for the food industry, but it didn't work and asked for your help.

What would your reaction be?

Whomever expects you to pick up programming in one week and do something useful with it, whether that is your boss, your lecturer, lab mentor or yourself, is doing you no favours at all. Most programmer spend years learning their craft, trying do it in a week is just insulting.

I'm not trying to discourage you, but get some training.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algoritm, algorithm on the code side." - tachyon

In reply to Re: nmr comparison script by BrowserUk
in thread nmr comparison script by NMRsucks

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.