basheer has asked for the wisdom of the Perl Monks concerning the following question:

Hi Experts

Could you pls look into my code and provide feedbacks to improve my coding level, since i am new to this platform

use 5.012; use String::Diff; use strict; use warnings; my $aff_print; my $aff_print1; my $outpath; my $data; my $source; my $mismatch_aff; my $nomismatch_aff; my $outpath1; my $prooftxt; my $proftxt; my $jnl; my $artno; my $aidno; my $xmlpath; my $xml_miss; my $xmltxt; my $xmtxt; my $filename; my $diff; ###### OUT PATH DETAILS ######$jnl=$ARGV[0]; $artno=$ARGV[1]; $outpath=$ARGV[2]; $proftxt=$ARGV[3]; $outpath=~s#T:/PGN/#G:/FROM_3B2/#gi; $outpath=~s#(.+\/.+\)\/)(.+)#$1#gi; $outpath1= $outpath; mkdir "$outpath1"; $outpath= $outpath1; $outpath .= 'affprint.html'; if ($proftxt=~m#<!--aff_print: #gi) { $proftxt=~s#(<!--aff_print: +)(.+)(-->)#$2#g; $aff_print=$2; }else {if (!-e $outpath){$data="<html><head><title>ISSUE REPORT<\/titl +e><\/head><body bgcolor\=\"\#CCFFCC\"><h1 align\=\"center\">MISMATCH +REPORT</h1><table width\=\"1200px\" border\=\"10\" align\=\"center\"f +rame\=\"border\" rules\=\"all\"><thead><tr bgcolor\=\"\#FFCCFF\"><th> +ARTICLE ID<\/th><th>FILE MISSING IN PATH<\/th><th>S100 OUTPUT<\/th><t +h>F300 OUTPUT<\/th><th>DIFFERENCES<\/th><\/tr><\/thead><tbody><\/tbod +y><\/table><\/body><\/html>"; $xml_miss="<tr><td>$jnl$artno</td><td>NO AFF_PRINT IN 3D</td><td\/><td +\/><td\/></tr>"; $data=~s#(<\/tbody><\/table>)#$xml_miss$1#g; &print_outpath(); }else{&read_outpath(); $xml_miss="<tr><td>$jnl$artno</td><td>NO AFF_PRINT IN 3D</td><td\/><td +\/><td\/></tr>"; $data=~s#(<\/tbody><\/table>)#$xml_miss$1#g; &print_outpath(); }}if ($aff_print){$filename=$jnl . $artno; $aidno = sprintf("%08d", $artno); $xmlpath="\\\\tnqfs01\\COPYEDITORS\\TO_3B2\\ELSEVIER\\$jnl\\$aidno\\$f +ilename.xml"; if (!-e $xmlpath){if (!-e $outpath){$data="<html><head><title>ISSUE RE +PORT<\/title><\/head><body bgcolor\=\"\#CCFFCC\"><h1 align\=\"center\ +">MISMATCH REPORT</h1><table width\=\"1200px\" border\=\"10\" align\= +\"center\"frame\=\"border\" rules\=\"all\"><thead><tr bgcolor\=\"\#FF +CCFF\"><th>ARTICLE ID<\/th><th>FILE MISSING IN PATH<\/th><th>S100 OUT +PUT<\/th><th>F300 OUTPUT<\/th><th>DIFFERENCES<\/th><\/tr><\/thead><tb +ody><\/tbody><\/table><\/body><\/html>"; $xml_miss="<tr><td>$jnl$artno</td><td>XML FILE MISSING</td><td\/><td\/ +><td\/></tr>"; $data=~s#(<\/tbody><\/table>)#$xml_miss$1#g; &print_outpath(); }else{&read_outpath(); $xml_miss="<tr><td>$jnl$artno</td><td>XML FILES MISSING</td><td\/><td\ +/><td\/></tr>"; $data=~s#(<\/tbody><\/table>)#$xml_miss$1#g; &print_outpath(); }}else{open (IN,"$xmlpath"); $xmlpath=join("\n", <IN>); close IN; $xmltxt=$xmlpath; $xmtxt=quotemeta($xmltxt); if ($xmtxt=~m#<\\!\\-\\-aff_print\\:\\ #gi) { $xmltxt=~s#<!--aff_p +rint: (.*?)-->#$1#s; $aff_print1=$1; }else {}}if ($aff_print1){if ($aff_print eq $aff_print1){if(!-e "$outp +ath") { $data="<html><head><title>ISSUE REPORT<\/title><\ +/head><body bgcolor\=\"\#CCFFCC\"><h1 align\=\"center\">MISMATCH REPO +RT</h1><table width\=\"1200px\" border\=\"10\" align\=\"center\"frame +\=\"border\" rules\=\"all\"><thead><tr bgcolor\=\"\#FFCCFF\"><th>ARTI +CLE ID<\/th><th>FILE MISSING IN PATH<\/th><th>S100 OUTPUT<\/th><th>F3 +00 OUTPUT<\/th><th>DIFFERENCES<\/th><\/tr><\/thead><tbody><\/tbody><\ +/table><\/body><\/html>"; $nomismatch_aff="<tr><td>$jnl$artno</td><td\/><td\/><td\/><td>NO MISMA +TCH FOUND</td></tr>"; $data=~s#(<\/tbody><\/table>)#$nomismatch_aff$1#g; &print_outpath(); } else { $nomismatch_aff="<tr><td>$jnl$artno</td><td\/><td\ +/><td\/><td>NO MISMATCH FOUND</td></tr>"; &read_outpath(); $data=~s#(<\/tbody><\/table>)#$nomismatch_aff$1#g; &print_outpath(); }}else {if(!-e "$outpath") { $data="<html><head><title>ISS +UE REPORT<\/title><\/head><body bgcolor\=\"\#CCFFCC\"><h1 align\=\"ce +nter\">MISMATCH REPORT</h1><table width\=\"1200px\" border\=\"10\" al +ign\=\"center\"frame\=\"border\" rules\=\"all\"><thead><tr bgcolor\=\ +"\#FFCCFF\"><th>ARTICLE ID<\/th><th>FILE MISSING IN PATH<\/th><th>S10 +0 OUTPUT<\/th><th>F300 OUTPUT<\/th><th>DIFFERENCES<\/th><\/tr><\/thea +d><tbody><\/tbody><\/table><\/body><\/html>"; $mismatch_aff="<tr><td>$jnl$artno</td><td\/><td>$aff_print1<\/td><td>$ +aff_print<\/td><td><\/td><\/tr>"; my $diff = String::Diff::diff_merge("$aff_print1", "$aff_print", + remove_open => '<FONT style="BACKGROUND-COLOR: RED">', remov +e_close => '</FONT>', append_open => '<FONT style="BACKGROUND- +COLOR: FFFF33">', append_close => '</FONT>', ); $mismatch_aff=~s#(<td><\/td>)#<td>$diff<\/td>#g; $data=~s#(<\/tbody><\/table>)#$mismatch_aff$1#g; &print_outpath(); } else { $mismatch_aff="<tr><td>$jnl$artno</td><td\/><td>$a +ff_print1<\/td><td>$aff_print<\/td><td><\/td><\/tr>"; my $diff = String::Diff::diff_merge("$aff_print1", "$aff_print", + remove_open => '<FONT style="BACKGROUND-COLOR: RED">', remov +e_close => '</FONT>', append_open => '<FONT style="BACKGROUND- +COLOR: FFFF33">', append_close => '</FONT>', ); $mismatch_aff=~s#(<td><\/td>)#<td>$diff<\/td>#g; &read_outpath(); $data=~s#(<\/tbody><\/table>)#$mismatch_aff$1#g; &print_outpath(); }}}else {}}else {}sub read_outpath{open FILE, "<$outpath"; read FILE, $data, 100000000; close FILE; }sub print_outpath{open FILE, ">$outpath"; print FILE $data; close FILE; }exit;

Thanks

Basheer

Replies are listed 'Best First'.
Re: code for review
by ww (Archbishop) on Sep 22, 2011 at 12:05 UTC
    Readability (which often is eq 'maintainability.'
    • Be generous with whitespace; Perl doesn't charge by the " " ( nor by the \t )
    • Indentation -- find a style YOU like and use it... for example:
      if ( !-e $xmlpath ) { if ( !-e $outpath ) { $data= } }
    • Global declaration of all vars (see Re: code for review ) above... and, style-wise,
      my ($aff_print, $aff_print1, $outpath,  $data ....);
    • else {}}else {} Huhn?

    Update: And, if -- as it appears from the comments above -- you have edited your post without notice, DON'T.

    Use strikeouts rather than removing or editing text or code; mark additions and insertions for what they are. Editing without acknowledgement destroys original context and can make posts by others appear senseless or inexplicable.

Re: code for review
by aaron_baugher (Curate) on Sep 22, 2011 at 10:23 UTC

    Here's a small bit. What's the point of this?

    $outpath1= $outpath; mkdir "$outpath1"; $outpath= $outpath1; $outpath .= 'affprint.html';

    You assign the value of $outpath to $outpath1, just to make the directory, then assign the value back to $outpath, which it already has since the value just came from there, then never use $outpath1 again. Why not just mkdir $outpath and forget $outpath1?

    Also, this isn't C; you don't need to pre-declare your variables at the start of the program before you use them. Just create them with my when you assign values to them. One other tiny thing: quotes around a single scalar like you have following mkdir are unnecessary, and may cause confusion.

Re: code for review
by pvaldes (Chaplain) on Sep 22, 2011 at 09:27 UTC

    An obvious mistake

    ###### OUT PATH DETAILS ######$jnl=$ARGV[0];

    should be

    ###### OUT PATH DETAILS ###### $jnl=$ARGV[0];

    But don't do this, better

    my ($jnl,$artno,$outpath,$proftxt)=@_;

    instead

    my $jnl; my $artno; my $outpath; my proftxt; $jnl=$ARGV[0]; $artno=$ARGV[1]; $outpath=$ARGV[2]; $proftxt=$ARGV[3];

    What's the purpose of your code?

Re: code for review
by perl_lover (Chaplain) on Sep 22, 2011 at 05:59 UTC
Re: code for review
by toolic (Bishop) on Sep 22, 2011 at 13:11 UTC
Re: code for review
by graff (Chancellor) on Sep 23, 2011 at 04:15 UTC
    There's a lot of repetitive content in your code (many identical copies of statements), and in your case, this is an especially bad thing, because (1) the code is formatted in a manner that is virtually illegible to humans, making it hard to tell whether repetitions are exact, and (2) the repeated lines are often long and complicated, so that if you need to make a change -- and the change will have to be repeated on every copy of the affected line -- it will drive you crazy and you will find it really hard to avoid mistakes.

    For example, if you modify your "read_outpath()" subroutine to handle the "file not found" condition, you'll need only one copy of your "default value" for $data, instead of four copies (and the code will be a lot easier for human readers to understand):

    sub read_outpath { if ( open FILE, '<', $outpath ) { read FILE, $data, 100000000; close FILE; return 1; } else { $data = <<EOT; <html> <head><title>ISSUE REPORT<\/title><\/head> <body bgcolor="#CCFFCC"><h1 align="center">MISMATCH REPORT</h1> <table width="1200px" border="10" align="center" frame="border" rules +="all"> <thead><tr bgcolor="#FFCCFF"> <th>ARTICLE ID</th><th>FILE MISSING IN PATH</th> <th>S100 OUTPUT</th><th>F300 OUTPUT</th> <th>DIFFERENCES</th></tr></thead> <tbody></tbody> </table> </body> </html> EOT } return 0; }
    That's one of many cases where a some more thought about the task and the conditions can lead the elimination of repeated code, which in turn leads to easier maintenance and legibility.

    One other very important problem here: lack of documentation, especially about what the command-line args are supposed to be. You're logic implies some very strange expectations about what goes into @ARGV, and you really should provide at least a SYNOPSIS of the usage. It would be even better if you provide a DESCRIPTION of what the program is supposed to do, maybe even explaining the various conditions that it handles.

    If you can describe the program's logic (i.e. document it) in a clear, concise way, it's likely you'll find it easier to write clear, concise code to implement that logic, getting rid of a lot of useless and redundant chunks of ugly stuff.

Re: code for review
by riverron (Sexton) on Sep 22, 2011 at 14:35 UTC
    The guidelines from Damian Conway's Perl Best Practices will definitely help you write better Perl code.
Re: code for review
by pvaldes (Chaplain) on Sep 22, 2011 at 10:04 UTC

    ... and you don't need to specify (again) use strict; use warnings; here