atmosphere has asked for the wisdom of the Perl Monks concerning the following question:

Hey guys, I am extremely new to perl, about a month into it, and have written an application that works without issue, however I would love to clean it up with some for or foreach loops, and would like to hear some suggestions... Here is a blurb from my code that I would like cleaned up:
if (scalar@row1 == "6") { @entry1 = (@required , @row1); $digest1 = sha1_hex(@entry1); $sth = $dbh->prepare($query); $sth->execute($digest1); $data1 = $sth->fetchrow_array(); if ($data1 ne $digest1) { push(@entry1, $digest1); $sth = $dbh->prepare($insert); $sth->execute(@entry1); &create_pdf(@entry1); $msg->attach( Type => 'image/gif', Path => "../pdfs/ir_$diges +t1.pdf", Filename => "ir_$digest1.pdf", Disposition => 'attachment') } else {$dberr +or .= "The following row you submitted: <br> @entry1 <br>already exsi +ts in the database and will note be re-submitted. <br>"} };
My dilemma is that I have repeated this code 5 times changing only the numbers at the end of my arrays and variables which is where I would like the for loop to come in so I would only need to have the one iteration. Thanks in advance for any and all suggestions/help!

Replies are listed 'Best First'.
Re: Cleaning up Code
by ikegami (Patriarch) on Dec 10, 2008 at 00:25 UTC
    Original code more sanely formatted:
    if (scalar @row1 == "6") { @entry1 = (@required , @row1); $digest1 = sha1_hex(@entry1); $sth = $dbh->prepare($query); $sth->execute($digest1); $data1 = $sth->fetchrow_array(); if ($data1 ne $digest1) { push(@entry1, $digest1); $sth = $dbh->prepare($insert); $sth->execute(@entry1); &create_pdf(@entry1); $msg->attach( Type => 'image/gif', Path => "../pdfs/ir_$digest1.pdf", Filename => "ir_$digest1.pdf", Disposition => 'attachment' ) } else { $dberror .= "The following row you submitted: " . "<br> @entry1 " . "<br>already exsits in the database " . "and will note be re-submitted. <br>" } };

    Let's start by identifying bugs.

    • What fetchrow_array returns in scalar context is undocumented.
      $data1 = $sth->fetchrow_array();
      is wrong. If you want to the first field, use
      ($data1) = $sth->fetchrow_array();

    • Is the contents of $dberror text of HTML? You put both into it. At worst, this could be an opening for a cross-site scripting attack.

    • All over, you are instructing Perl to ignore prototypes. Why do you use & on your subroutine calls?

    • I note a distinct lack of my. Use use strict;!

    • Something tells me you don't use warnings either. ("if ($data1 ne $digest1)" would warn if fetchrow_array returned zero rows.) Use use warnings;!

    • Why is six in quotes in "if (scalar @row1 == "6")"? You want to compare with the number 6, not the string "6" (which would be converted to a number by ==). The scalar is useless too. == creates a scalar context, so scalar is redundant.

    As for your actual question, replace

    if (@row1 == 6) { my @entry1 = (@required , @row1); ... }

    with

    for my $row ( 1 .. $#rows ) { if (@{$rows[$row]} == 6) { my @entry = (@required , @{$rows[$row]}); ... } }

    and remove all other mentions of "1".

      First I would like to thank you, Grandfather, and Mike Lieman for such quick replies.
      What fetchrow_array returns in scalar context is undocumented. $data1 = $sth->fetchrow_array(); is wrong. If you want to the first field, use ($data1) = $sth->fetchrow_array();
      Thank you I will take your advice here.
      Is the contents of $dberror text of HTML? You put both into it. At wor +st, this could be an opening for a cross-site scripting attack.
      I do ultimately call $dberror in my html code yes. I do not understand what you mean by stating I put both into it? Again please forgive me this is my first attempt at coding anything.
      All over, you are instructing Perl to ignore prototypes. Why do you us +e & on your subroutine calls?
      I was under the impression that this was proper usage... All the examples I discovered on the web used the & before the subroutine.
      I note a distinct lack of my. Use use strict;!
      Again please forgive my inexperience here... I did a fast google search to find the benefit in using "my()".. am I correct in stating the primary reason in using "my()" is so that the variables are completely hidden from the outside?
      Something tells me you don't use warnings either. ("if ($data1 ne $dig +est1)" would warn if fetchrow_array returned zero rows.) Use use warn +ings;!
      I actually want it to return with zero rows, I do plan on adding DBI error codes for my prepare and execute commands.
      Why is six in quotes in "if (scalar @row1 == "6")"? You want to compar +e with the number 6, not the string "6" (which would be converted to +a number by ==). The scalar is useless too. == creates a scalar conte +xt, so scalar is redundant.
      My goal with this statement is to see if my @row1 has 6 elements, and if it does proceed on. So if I read what you stated, I could accomplish the same thing by doing this
      if (@row1 == 6) { other code here ... };
      ??
      for my $row ( 1 .. $#rows ) { if (@{ $rows[$row] } == "6") { my @entry = (@required , @row); ... } }
      Thank you for this! In trying to understand how this statement works, could I trouble you once more to explain how this exactly works? Again Thank you for you help thus far!!

      Mike, Thank you for your input, and I have tested this to the best of my ability so I believe my script is at a good point for "cleaning up". Hopefully I am not being ignorant, are their other ways to test my script other than using my program, and getting the results I hope for? I do try to "break" it by putting in invalid data, etc.

      Grandfather, @row1 is an array that is put together earlier after parsing data from a html form. So the scalar line is to look to see if it has all the required elements before it proceeds on with the script. Ideally there should only be 6 or 0 elements in the array, but to be thorough, and sure that information is being entered correctly and completely I wanted to be sure there were 6 elements in the array. Hopefully that's clearer than mud...

        I do not understand what you mean by stating I put both into it?

        @entry1 contains text and not HTML, right? At the very least, it needs to visit HTML::Entities.

        I was under the impression that this was proper usage

        It disables prototypes and allows subroutines to be called without parens before they are declared. It was possibly the way to do things in Perl4, but this isn't Perl4.

        the primary reason in using "my()" is so that the variables are completely hidden from the outside?

        Yes. It limits the scope of the variables. This prevents accidental clobbering of the variables by distant code and it fosters modularity of code. In conjunction with use strict;, typos in variable names are usually caught at compile-time. use strict; helps identify a couple of other type of dangerous code.

        my @entry = (@required , @{$rows[$row]}); my $digest = sha1_hex(@entry); my $sth = $dbh->prepare($query); $sth->execute($digest); my ($data) = $sth->fetchrow_array(); ...

        I actually want it to return with zero rows

        Yes, but you don't want it to warn "Use of uninitialized value in string eq" when it does. defined would be one way of avoiding the warning.

        So if I read what you stated, I could accomplish the same thing by doing this

        Yes. @row1 == 6 checks if @row has 6 elements. An array in scalar context (as imposed by scalar but also by ==) evaluates to the number of elements in the array.

        In trying to understand how this statement works

        All variables other than @row1 can lose the "1" without problem. The rows are a little bit trickier since we want to switch to using a loop. Unfortunately, you haven't shown us everywhere @rowX is used — in particular, you haven't shown us where they're created — so our help there is limited.

        Basically, I suggested that you create an array @rows whose elements are references to arrays (formerly @row1, @row2, etc). This is commonly called an "array of arrays". Basically, Perl's version of a two-dimensional array. There's perllol on the subject, for starters.

        Note: I've updated my post to fix a bug in that code.

Re: Cleaning up Code
by GrandFather (Saint) on Dec 10, 2008 at 00:30 UTC

    Before you do anything else add strictures (use strict; use warnings;) to your code. On the face of it all variables you are worried about except $row1 are local to the if block so their names don't matter. It may be there is a whole lot more context that is pertinent, but you haven't shown it.

    Aside from that if (scalar@row1 == "6") is not right and doesn't make sense. It isn't right because == compares numbers, but "6" is a string. It doesn't make sense because there is not enough context to figure out where @row1 comes from and why it should have 6 elements. It needs to make sense before it can be refactored.


    Perl's payment curve coincides with its learning curve.
Re: Cleaning up Code
by mikelieman (Friar) on Dec 10, 2008 at 02:12 UTC
    Before even that -- document your test cases, if you don't have a pretty good set of .t files already. It's imperative for you to be able to definitively state, "The code functions exactly as before" once you're done cleaning it up. ( Or, more likely, "The code now functions exactly as it was supposed to all along" )
Re: Cleaning up Code
by jimX11 (Friar) on Dec 16, 2008 at 15:16 UTC

    A test suite really helps when refactoring code.

    You may be able to add tests to just this area of the code, depending on how much this code depends on the code around it.

    If this code depends on code in lots of places, then adding tests would be harder. The tests help because you add them, change the code (refactor) and then run the tests to see if any test fails.

    Knowing what to test and how much to test takes experience.