Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Could I get some feedback on my code please?

by PockMonk (Beadle)
on Jan 13, 2007 at 09:17 UTC ( [id://594509]=perlquestion: print w/replies, xml ) Need Help??

PockMonk has asked for the wisdom of the Perl Monks concerning the following question:

I realise that I am throwing myself to the lions here, but so be it. I don't have a specific problem with my code, I just want some feedback on it. I've got to the point that whatever I want to do in Perl I can get it to work in the meaning of "it seems to work to me", but I am aware that this doesn't necessarily mean "It is robust and will always work" and almost certainly doesn't mean "It works in the most efficient manner".

I'm a Perl hobbyist, in that this is not what I do for a job and I code in Perl purely for fun. I lack any official training in Perl and my only sources of knowledge is the O'Reilly "Programming Perl" book and any info I find online. I hope this explains if not excuses some of the coding obscenities that will doubtless be pointed out. Being self-taught and not in a coding environment I have no method of judging myself or finding directions for improvement. With those caveats out of the way... please point out any problems with or ways to improve my code (example below). Any and al comments welcome but if you point out something wrong could you please explain why it is wrong.

EDIT: thank you, some great suggestions, code updated

#!/usr/bin/perl -T #_____________________________________________________________________ +_________# #DeleteAsstBot.pl PRE-RELEASE VERSION + # # Author Dan Adams , (User:PocklingtonDan) + # #_____________________________________________________________________ +_________# #_____________________________________________________________________ +_________# # CHANGES STILL TO MAKE + # # + # # - Don't allow use by blocked IPs? + # # http://en.wikipedia.org/w/index.php?title=Special%3AIpblocklist&ac +tion=search&ip=PLACE_IP_HERE # - Change "ARTICLENAME (nomination N)" to "Articlename (nth nominatio +n)". # # - add question mark/help link next to each input field + # # + # # RECENT CHANGES + # # + # # 09.01.06 - Version 0.01 - Initial code written + # # 10.01.06 - Version 0.02 - Fixed bug where it wrote bot sig twice. + # # 11.01.06 - Version 0.03 - Added edit token system + # # + # #_____________________________________________________________________ +_________# #_____________________________________________________________________ +_________# # WHAT THE SCRIPT DOES + # # + # # This script is a wikipedia bot. It acts as a wizard interface for th +e process# # of nominating an article for deletion, a currently manual task. It t +akes # # user input in a wizard interface and makes the necessary wikipedia e +dits in # # the background. + # # + # # INTENDED USE + # # + # # It is intended that ultimately this script is placed on the toolserv +er and # # becomes the default method of nominating articles for deletion from +wikipedia# # + # #_____________________________________________________________________ +_________# #_____________________________________________________________________ +_________# # PACKAGES TO IMPORT (must be installed on server) + # #_____________________________________________________________________ +_________# # "Use Always" Packages use strict; use warnings; use CGI::Carp "fatalsToBrowser"; # during testing and debugging only # Required Packages for features. use CGI qw(:standard Vars); use URI::Escape; use LWP::Simple; use LWP::UserAgent; use HTML::Template; #_____________________________________________________________________ +_________# # SETTINGS + # #_____________________________________________________________________ +_________# #Variables my $editorName = "**********"; my $editorPassword = "**************"; my $language = "en"; # language code for wikipedia namespace my $enabled = "true"; # set to false to take offline/disable #Local paths my $editTokenFile = "/*********/editToken.txt"; my $inputFormFile = "/**********/inputForm.tmpl"; my $errorFile = "/*********/error.tmpl"; my $respondToUserFile = "/*************/respondToUser.tmpl"; # Web paths my $pathToScript = "http://*********.com/cgi-bin"; my $pathToImages = "http://www.**********.com"; #_____________________________________________________________________ +_________# # MAIN ROUTINE + # #_____________________________________________________________________ +_________# my $action = param('action') or 'getInput'; if ($enabled eq "true") { if ($action eq 'getInput') {getInput;} elsif ($action eq 'processInput') {processInput;} else {error("Unrecognised action request");} } else {error("Bot is currently offline for maintenance.");} exit; #_____________________________________________________________________ +_________# # SUBROUTINES + # #_____________________________________________________________________ +_________# sub getInput { my $remoteHost = $ENV{'REMOTE_ADDR'}; my $editToken = generateEditToken($remoteHost); printInputForm($editToken); return(); } #_____________________________________________________________________ +_________# sub processInput { my $editToken = param('editToken'); my $articleForDeletion = param('articleForDeletion'); my $categoryForDeletion = param('categoryForDeletion'); my $reasonForDeletion = param('reasonForDeletion'); validateFilePaths($editTokenFile); validateBasicFormats($editToken, $articleForDeletion, $categoryFor +Deletion, $reasonForDeletion); validatePageExists($articleForDeletion); validateCategoryCode($categoryForDeletion); my $remoteHost = validateEditToken($editToken); notifyArticleCreator($articleForDeletion, $remoteHost); my $afdUrl = setupDeletionDebatePage($articleForDeletion, $categor +yForDeletion, $reasonForDeletion, $remoteHost); tagArticleWithDeletionNotice($articleForDeletion, $remoteHost, $af +dUrl); logDeletionNomination($articleForDeletion, $remoteHost, $afdUrl); fileMaintenance($editToken); respondToUser($articleForDeletion, $afdUrl); return(); } #_____________________________________________________________________ +_________# sub checkForUnexpectedArgs { my @unexpectedArgs = @_; if (scalar(@unexpectedArgs) > 0) { error("Mismatch between expected and received number of argume +nts."); } return(); } #_____________________________________________________________________ +_________# sub checkFileCanBeAccessed { my($filePath,$requestType, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); if ($requestType eq "READ") { unless (-r $filePath) { error("The file $filePath cannot be read as requested."); } } elsif ($requestType eq "WRITE") { unless (-w $filePath) { error("The file $filePath cannot be written to as requeste +d."); } } elsif ($requestType eq "EXISTS") { unless (-e $filePath) { error("The file $filePath does not exist."); } } else{ error("Invalid file access method was chosen."); } return(); } #_____________________________________________________________________ +_________# sub fileMaintenance { my($editToken, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); open(TOKENFILE, '+>', "$editTokenFile") or error("Cannot open edit + token file."); # open for both read and write flock(TOKENFILE, 2) or error("Cannot lock edit token file."); + my $fileContents = <TOKENFILE>; chomp($fileContents); my @allLines = split(/\n/, $fileContents); my $currentLineNumber = 0; my $matchLineNumber = 0; foreach my $line (@allLines) { my ($possibleMatchToken, $possibleMatchRemoteHost) = split +(/\|/, $line); if ($possibleMatchToken eq $editToken) { $matchLineNumber = $currentLineNumber; } $currentLineNumber++; } splice(@allLines, $matchLineNumber, 1); foreach my $newLine (@allLines) { print TOKENFILE "$newLine\n"; } flock(TOKENFILE, 8); close (TOKENFILE) or error("Cannot close edit token file."); return (); } #_____________________________________________________________________ +_________# sub respondToUser { my($articleName, $afdUrl, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); my $articleURL = uri_escape($articleName, "\0-\377"); my $template = HTML::Template->new(filename => 'respondToUser.tmpl +'); $template->param(LANGUAGE => $language); $template->param(ARTICLEURL => $articleURL); $template->param(PATHTOIMAGES => $pathToImages); $template->param(ARTICLENAME => $articleName); $template->param(AFDURL => $afdUrl); print "Content-Type: text/html\n\n", $template->output; return(); } #_____________________________________________________________________ +_________# sub writeResultsToPage { my($replacement_page, $replacement_text, $replacement_summary, $ad +ditionalTextStyle, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); use LWP::UserAgent; my $agent=LWP::UserAgent->new; $agent->agent('Perlwikipedia/0.90'); $agent->cookie_jar({file=> '.perlwikipedia-cookies'}); my $login = HTTP::Request->new(POST => "http://$language.wikipedia +.org/w/index.php?title=Special:Userlogin&action=submitlogin&type=logi +n"); $login->content_type('application/x-www-form-urlencoded'); $login->content("wpName=$editorName&wpPassword=$editorPassword&wpR +emember=1&wpLoginattempt=Log+in"); my $logger_inner = $agent->request($login); my $do_redirect=HTTP::Request->new(GET =>'http://$language.wikiped +ia.org/w/index.php?title=Special:Userlogin&wpCookieCheck=login'); my $redirecter= $agent->request($do_redirect); my $is_success=$redirecter->content; use HTML::Form; my $ua = LWP::UserAgent->new; $ua->agent("Perlwikipedia/0.90"); $ua->cookie_jar($agent->cookie_jar()); my $response = $ua->get("$replacement_page"); my $form = HTML::Form->parse($response); my $existingText = $form->find_input('wpTextbox1')->value; my $summary = $form->find_input('wpSummary')->value; my $save = $form->find_input('wpSave')->value; my $edittoken = $form->find_input('wpEditToken')->value; my $starttime = $form->find_input('wpStarttime')->value; my $edittime = $form->find_input('wpEdittime')->value; my $textToWrite = ""; if ($additionalTextStyle eq "APPENDFIRST") {$textToWrite = $replacement_text . $existingText;} elsif($additionalTextStyle eq "APPENDLAST") {$textToWrite = $existingText . $replacement_text;} elsif ($additionalTextStyle eq "REPLACE") {$textToWrite = $replacement_text} else {error("Unrecognised wikipedia write request.");} $form->value('wpTextbox1', $textToWrite); $form->value('wpSummary', $replacement_summary ); $response = $ua->request($form->click); return(); } #_____________________________________________________________________ +_________# sub logDeletionNomination { my($articleForDeletion, $remoteHost, $afdUrl, @unexpectedArgs) = @ +_; checkForUnexpectedArgs(@unexpectedArgs); my $articleURL = uri_escape($articleForDeletion, "\0-\377"); my ($day, $month, $year) = getCurrentDate(); my $pageURL = "http://$language.wikipedia.org/w/index.php?title=Wi +kipedia:Articles_for_deletion/Log/" . $year . "_" . $month . "_" . $d +ay . "&action=edit"; my $contentToAdd = "{{subst:afd3 | pg=$articleForDeletion}}"; my $editSummary = "Adding [[Wikipedia:$afdUrl]]"; writeResultsToPage($pageURL, $contentToAdd, $editSummary, "APPENDL +AST"); return(); } #_____________________________________________________________________ +_________# sub getCurrentDate { my @months = qw(January February March April May June July August +September October November December); my @weekDays = qw(Sun Mon Tue Wed Thu Fri Sat Sun); my ($second, $minute, $hour, $dayOfMonth, $month, $yearOffset, $da +yOfWeek, $dayOfYear, $daylightSavings) = localtime(); my $currentYear = 1900 + $yearOffset; my $currentMonth = $months[$month]; return ($dayOfMonth, $currentMonth, $currentYear); } #_____________________________________________________________________ +_________# sub setupDeletionDebatePage { my($articleForDeletion, $categoryForDeletion, $reasonForDeletion, +$remoteHost, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); my $afdPage = "Articles_for_deletion/$articleForDeletion"; my $numberOfPreviousNoms = getNoOfPreviousNoms($afdPage); my $currentNominationNumber = $numberOfPreviousNoms +1; my $afdPageToCreate = $afdPage . " (nomination $currentNominationN +umber)"; my $afdUrlToCreate = uri_escape($afdPageToCreate, "\0-\377"); my $pageURL = "http://$language.wikipedia.org/w/index.php?title=Wi +kipedia:$afdUrlToCreate&action=edit"; my $contentToAdd = "{{subst:afd2 | pg=$articleForDeletion | cat=$c +ategoryForDeletion | text=$reasonForDeletion}} - [[User:$editorName|$ +editorName]] (on behalf of IP: $remoteHost) ~~~~~"; my $editSummary = "Creating deletion discussion page for [[$articl +eForDeletion]] because $reasonForDeletion"; writeResultsToPage($pageURL, $contentToAdd, $editSummary, "REPLACE +"); return($afdUrlToCreate); } #_____________________________________________________________________ +_________# sub tagArticleWithDeletionNotice { my($articleForDeletion, $remoteHost, $afdUrl, @unexpectedArgs) = @ +_; checkForUnexpectedArgs(@unexpectedArgs); my $articleURL = uri_escape($articleForDeletion, "\0-\377"); my $pageURL = "http://$language.wikipedia.org/w/index.php?title=$a +rticleURL&action=edit"; my $contentToAdd = "{{subst:afd}}"; my $editSummary = "nominated for deletion: see [[Wikipedia:$afdUrl +]]"; writeResultsToPage($pageURL, $contentToAdd, $editSummary, "APPENDF +IRST"); return(); } #_____________________________________________________________________ +_________# sub notifyArticleCreator { my($articleForDeletion, $remoteHost, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); my $articleCreator = fetchArticleCreator($articleForDeletion); my $pageURL = "http://$language.wikipedia.org/w/index.php?title=Us +er_talk:$articleCreator&action=edit&section=new"; my $contentToAdd = "{{subst:AFDWarning|$articleForDeletion}} - [[ +User:$editorName|$editorName]] (on behalf of IP: $remoteHost) ~~~~~"; my $editSummary = "Warning notification - $articleForDeletion prop +osed for deletion"; writeResultsToPage($pageURL, $contentToAdd, $editSummary, "REPLACE +"); return(); } #_____________________________________________________________________ +_________# sub fetchArticleCreator { my($articleForDeletion, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); my $articleURL = uri_escape($articleForDeletion, "\0-\377"); my $articleHistoryUrl = "http://en.wikipedia.org/w/index.php?title +=$articleURL&dir=prev&action=history"; my $browser = LWP::UserAgent->new(); $browser->timeout(60); my $request = HTTP::Request->new(GET => $articleHistoryUrl); my $response = $browser->request($request); my $articleHistoryContents = $response->content(); for ($articleHistoryContents) { s/[\s\S]*?<ul id="pagehistory">//; s/<\/ul>[\s\S]*/<br>/; s/<\/li>[\s\S]*?<li>/<br>/g; s/<li>//g; s/<\/li>//g; s/<input[\s\S]*?>//g; s/<a[\s\S]*?>//g; s/<\/a>//g; s/\(cur\)//g; s/\(last\)//g; s/\(Talk \| contribs\)//g; s/\(Talk\)//g; s/[\s\S]*?<span/<span/; s/<span class="minor">m<\/span>//g; s/<br>[\s\S]*?<span/<br><span/g; s/[\s]*?<\/span>/<\/span>/g; s/<\/span>[\s\S]*?<br>/<\/span><br>/g; s/<br>/\|/g; s/\|$//; s/^\|//; s/<span[\s\S]*?>//g; s/<\/span>//g; s/\n//g; } my @contributors = split(/\|/,$articleHistoryContents); reverse(@contributors); my $articleCreator = $contributors[0]; return ($articleCreator); } #_____________________________________________________________________ +_________# sub validateFilePaths { my($filePath, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); checkFileCanBeAccessed($editTokenFile, "EXISTS"); checkFileCanBeAccessed($inputFormFile, "EXISTS"); checkFileCanBeAccessed($errorFile, "EXISTS"); checkFileCanBeAccessed($respondToUserFile, "EXISTS"); return(); } #_____________________________________________________________________ +_________# sub validateCategoryCode { my($categoryForDeletion, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); unless (($categoryForDeletion eq "M") || ($categoryForDeletion eq +"O") || ($categoryForDeletion eq "B") || ($categoryForDeletion eq "S" +) || ($categoryForDeletion eq "W") || ($categoryForDeletion eq "G") | +| ($categoryForDeletion eq "T") || ($categoryForDeletion eq "F") || ( +$categoryForDeletion eq "P") || ($categoryForDeletion eq "I") || ($ca +tegoryForDeletion eq "?")) { error("The category '$categoryForDeletion' that you entered is + an invalid category code for the deletion debate."); } return() } #_____________________________________________________________________ +_________# sub validatePageExists { my($articleForDeletion, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); my $articleContents = fetchPageContents($articleForDeletion); if ($articleContents =~ m/Wikipedia does not have an article with +this exact name/) { error("The article '$articleForDeletion' that you nominated fo +r deletion does not exist."); } return(); } #_____________________________________________________________________ +_________# sub getNoOfPreviousNoms { my($afdPage, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); my $noOfNominations = -1; my $afdPageExists = "true"; until($afdPageExists eq "false") { $afdPageExists = "false"; my $afdPagePossibility = $afdPage . " (nomination $noOfNominat +ions)"; $afdPageExists = validateAfdPageExists($afdPagePossibility); $noOfNominations++; } return ($noOfNominations); } #_____________________________________________________________________ +_________# sub validateAfdPageExists { my($afdPage, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); my $articleContents = fetchPageContents($afdPage); my $afdPageExists = "false"; if ($articleContents =~ m/Category:AfD debates/) { $afdPageExists = "true"; } return($afdPageExists); } #_____________________________________________________________________ +_________# sub fetchPageContents { my($article, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); $article = uri_escape($article, "\0-\377"); my $article_url = "http://$language.wikipedia.org/wiki/$article"; my $browser = LWP::UserAgent->new(); $browser->timeout(60); my $request = HTTP::Request->new(GET => $article_url); my $response = $browser->request($request); if ($response->is_error()) {printf "%s\n", $response->status_line; +} my $contents = $response->content(); sleep(1); # don't hammer the server! One read request every 1 seco +nd. $article = uri_unescape $article; return($contents); } #_____________________________________________________________________ +_________# sub validateBasicFormats { my($editToken, $articleForDeletion, $categoryForDeletion, $reasonF +orDeletion, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); unless (($editToken =~ m/\S+/) && ($articleForDeletion =~ m/\S+/) + && ($categoryForDeletion =~ m/^\S{1}$/) && ($reasonForDeletion =~ m/ +\S+/)) { error("You returned one or more empty or invalid fields, this +is not permitted, all fields are mandatory."); } return(); } #_____________________________________________________________________ +_________# sub validateEditToken { my($offeredToken, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); my $fileContents = readEditTokens(); chomp($fileContents); my @allLines = split(/\n/, $fileContents); my $tokenFound = 0; my $actualHost = ""; foreach my $line (@allLines) { my ($editToken, $remoteHost) = split(/\|/, $line); if ($offeredToken eq $editToken) { $tokenFound++; $actualHost = $remoteHost; } } unless ($tokenFound > 0) { error("The edit token '$offeredToken' that you provded is an i +nvalid edit token."); } return($actualHost); } #_____________________________________________________________________ +_________# sub readEditTokens { open(TOKENFILE, '<', "$editTokenFile") or error("Cannot open edit +token file."); flock(TOKENFILE, 2) or error("Cannot lock edit token file."); + my $editTokens = <TOKENFILE>; flock(TOKENFILE, 8); close (TOKENFILE) or error("Cannot close edit token file."); return ($editTokens); } #_____________________________________________________________________ +_________# sub generateEditToken { my($remoteHost, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); my $timeNow = time(); my $stringForEncryption = $remoteHost . $timeNow; my $editToken = crypt($stringForEncryption,"a3"); writeTokenToFile($editToken, $remoteHost); return ($editToken); } #_____________________________________________________________________ +_________# sub writeTokenToFile { my($editToken, $remoteHost, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); my $stringToWrite = $editToken . "|" . $remoteHost; open(TOKENFILE, '>>', "$editTokenFile") or error("Cannot open edit + token file."); flock(TOKENFILE, 2) or error("Cannot lock edit token file."); + print TOKENFILE "$stringToWrite\n"; flock(TOKENFILE, 8); close (TOKENFILE) or error("Cannot close edit token file."); return(); } #_____________________________________________________________________ +_________# sub printInputForm { my($editToken, @unexpectedArgs) = @_; checkForUnexpectedArgs(@unexpectedArgs); my $template = HTML::Template->new(filename => 'inputForm.tmpl'); $template->param(PATHTOIMAGES => $pathToImages); $template->param(PATHTOSCRIPT => $pathToScript); $template->param(EDITTOKEN => $editToken); print "Content-Type: text/html\n\n", $template->output; return(); } #_____________________________________________________________________ +_________# sub error { my $template = HTML::Template->new(filename => 'error.tmpl'); $template->param(ERROR => $_[0]); print "Content-Type: text/html\n\n", $template->output; exit; }

Replies are listed 'Best First'.
Re: Could I get some feedback on my code please?
by f00li5h (Chaplain) on Jan 13, 2007 at 09:55 UTC
      Great tip, thanks! Implemented now (code updated above). Anything else?
Re: Could I get some feedback on my code please?
by nobull (Friar) on Jan 13, 2007 at 11:40 UTC
    OK, I'm not looking at the macro scale, just the micro scale. Here are a few random observations.
    use LWP::Simple; use LWP::UserAgent; use HTTP::Request; use HTTP::Request::Common qw(GET); use HTTP::Response;

    You never use HTTP::Request::Common::GET so why do you import it? LWP::UserAgent will load HTTP::Request and HTTP::Response for you. As far as I can see you never actually use LWP::Simple.


    $|=1; #Disable buffering to allow instant output of text.

    Is there a paricular reason you do this? If not, don't.


    our $editorPassword = "**********"; our $language = "en"; # language code for wikipedia namespace our $enabled = "true"; # set to false to take offline/disable

    Is there a paricular reason you use our? If not use my.


    if ($enabled eq "true") {

    It is my number one rule in programming: "use the most natural representation of things unless there is a reason to do otherwise".

    If you want $enabled to be a boolean then use Perl's natural concept of "true" or "false" and simply say:

    if ($enabled) {

    &getInput;

    Do not use the special &-syntax to call subroutines unless you understand and require its special semantics.

    getInput();

    my @gettheip = split(/\./,$ENV{'REMOTE_ADDR'}); my $remoteHost = "$gettheip[0].$gettheip[1].$gettheip[2].$gettheip[3]" +;

    Why do you split and rejoin the IP address?


    sub readEditTokens { &checkFileCanBeAccessed($editTokenFile, "READ"); open(TOKENFILE,"$editTokenFile") || &error("Cannot open edit token + file."); flock(TOKENFILE, 2) || &error("Cannot lock edit token file.") +; my $editTokens = <TOKENFILE>; flock(TOKENFILE, 8); close (TOKENFILE); return ($editTokens); }

    The checkFileCanBeAccessed is ill-concieved. It adds complexity, introduces a race condition and delivers no beniefit.

    You have useless quotes in "$editTokenFile".

    You use the special legacy 2-arg open() syntax in which perl parses the second argument into two parts: mode and filename. It is simpler and clearer to use the 3-arg format.

    You are using the global file handle TOKENFILE. This habit could cause you troble in future. You could local(*TOKENFILE) to make your changes to the global handle temporary but better to simply not use a global handle.

    You don't bother to check for errors on close(). This is not too bad but if you localised the filehandle you wouldn't need to close it explicitly (unless to check for errors).

    Your indentation is confusing.

    It is more conventinal to use or not || for flow control.

    Error messages should contain the error.

    You are taking an exclusive lock when you are just reading.

    There's no reason to unlock a handle you are about to close anyhow.

    sub readEditTokens { open(my $TOKENFILE, '<', $editTokenFile) or error("Cannot open edit token file: $!"); flock($TOKENFILE, 1) or error("Cannot lock edit token file: $!"); my $editTokens = <$TOKENFILE>; return ($editTokens); }

    However, consider instead File::Slurp.


    print STDOUT "Content-type: text/html\n\n";

    Why are you making the filehandle explict?

      Wow, lots of stuff to look at there, thanks. I have a few questions though. you say:

      "Do not use the special &-syntax to call subroutines unless you understand and require its special semantics. "

      My copy of the Programming Perl book makes no mention of this, should I never use the "&" in calling subroutines??? This is the first I've heard of this! Can you eplain please?

      "You are using the global file handle TOKENFILE. This habit could cause you troble in future. You could local(*TOKENFILE) to make your changes to the global handle temporary but better to simply not use a global handle. "

      Can you point me to a page on this or explain this further please?

        Calling a sub with &sub will do many things, but most interstingly, it will not set @_ inside the sub, so you will inherit the @_ from the outer context, the & also means that prototypes are ignored for that call.

        If you're not sure what those are, it's safer to not tamper with them, and just call subs with sub().

        Somewhere it says in Learning Perl, "you don't tell someone to 'do wash the dishes' when you can just say 'wash the dishes'" this is a terrible paraphrase, but i think i left the book at work ...

        @_=qw; ask f00li5h to appear and remain for a moment of pretend better than a lifetime;;s;;@_[map hex,split'',B204316D8C2A4516DE];;y/05/os/&print;
        Look at perlsub: &subname(); will disable prototype-checking for that call (now prototypes aren't used much in perl, for good reasons, but if they're there, you'll probably don't want to skip the checks).

        Also, if you forget the parenthesis and call the sub as &subname; you will forward your current @_ array to that call. In other words, you will pass on your current arguments - not call it without arguments.

        Instead of doing that, you should probably always write subname( args ) (i.e. with parenthesis and without the ampersand). It's the only way to call a subroutine that doesn't have any hidden snags. Generally, you'll only use the & sygil if you want to take references to named subroutines or for goto &subroutine which is only useful in very specific cases.

        using "&" is not needed in most parts of your code, but is still used for some things.
        From what I have read, tells to save the use of "&" for calling subroutines to equale other stuff like
        $SIG{__DIE__} = \&fatal_error;
        and other uses.
        Perl 5 still supportes "&" but the preferd format is like "a_sub();" the "&" is considered as clutter.

        A Link to Chapter 9 of Perl Best Practices page 176 explains this and I suggest to reading the hole Chapter and/or buy the book.

        Good Luck

        About the file handle thing: in Perl 5 a file handle can be stored as a reference in a scalar variable. As of Perl 5.6 (I think), the builtin open function can initialize this variable for you:
        open my $tokenfile, '<', $filename or die "Couldn't open file: $!";
        Now $tokenfile is your filehandle and you don't have to worry about it being global or whatever. Have a read through perlopentut - it talks more about this.
Re: Could I get some feedback on my code please?
by rodion (Chaplain) on Jan 13, 2007 at 11:20 UTC
    NOTE: A quick suggestion for other's looking at the code. Click the download button first thing, so you can see the code formatted nicely in a separate window.

    For someone who doesn't write code professionally, writing Perl for their own satisfaction, this is not bad code. You're clearly working at structuring the code for clarity. That's something that takes time to develop, but you've got a good start. Pick up a copy of Perl Best Practices. It's a very readable style guide that you'll probably enjoy reading.

    I only have two specific suggestions for the code, at this point in your learning

    • Use of 'false' as a false value, as in "until ($variable eq 'false')" is likely to get you in trouble in the long run, since 'false', 'False' and 'FALSE' all evaluate as true.
    • Preceeding function names with "&" is increasingly less common since Perl4, and its meaning is changed in Perl6 to mean a function reference. If the purpose is to distinguish your own fuctions from built-ins, consider starting function names with with an initial upper case letter.
    Good luck, and may you have many happy hours of Perl writing.
Re: Could I get some feedback on my code please?
by wfsp (Abbot) on Jan 13, 2007 at 11:36 UTC
    Hi PockMonk,

    I'm in a very similar boat to yourself. I'm not able to run your code but just from reading it I think it looks fine. And if it works that surely must be a bonus. :-)

    I would like to make a few points and ask a couple of questions.

    The list of 23 regexes in fetchArticleCreator() to extract data from HTML could be replaced by using an HTML parser.

    To add to f00li5h's very good point about taking the HTML 'guff' out of your script. You don't appear to be using CGI.pm for any output. I don't either so I moved over to using CGI::Simple. The docs discuss why you might consider doing the same.

    Minor nits.

    I find using all uppercase for globals helps make my scripts clearer. Why do you "our" instead of "my"?

    imo I think CGI scripts in particular benefit from the 3 argument open. I also always use a lexical file handle to avoid the bareword.

    For my own curiosity:

    Why do your subs checkForUnexpectedArgs()? All the subs are called from within the script so I'm not sure how useful the user will find that message.

    When would checkFileCanBeAccessed() return and a subsequent open die? Wouldn't or error("$msg $!") on the open be enough? I have some major rewriting to do if it isn't. :-)

    Hope that helps.

      Thanks for the response, lots to get my teeth into there:

      The list of 23 regexes in fetchArticleCreator() to extract data from HTML could be replaced by using an HTML parser.

      Could you explain this in more detail please?

      To add to f00li5h's very good point about taking the HTML 'guff' out of your script. You don't appear to be using CGI.pm for any output. I don't either so I moved over to using CGI::Simple. The docs discuss why you might consider doing the same

      I've just revised the script to take account of HTML::Template as suggested, I should look into CGI::Simple shortly as advised.

      I find using all uppercase for globals helps make my scripts clearer. Why do you "our" instead of "my"?

      I'm guessing probably because I don't truly understand globalsand aren't using them properly! :-P

      imo I think CGI scripts in particular benefit from the 3 argument open. I also always use a lexical file handle to avoid the bareword

      Thanks, I'll look into that on the link provided.

      Why do your subs checkForUnexpectedArgs()? All the subs are called from within the script so I'm not sure how useful the user will find that message

      The script isn't complete yet and I find it helpful while debugging etc to catch anything I've passed accidentally when I'm chopping and changing what gets passed to what.

      When would checkFileCanBeAccessed() return and a subsequent open die? Wouldn't or error("$msg $!") on the open be enough? I have some major rewriting to do if it isn't. :-)

      I'm sure that you're right, actually. I guess that bit is a bit redundant...

      Thanks!
Re: Could I get some feedback on my code please?
by shmem (Chancellor) on Jan 13, 2007 at 12:07 UTC
    If you have to do a bunch of substitutions as in fetchArticleCreator, a common trick is to use a for loop with just one element, in which the variable to be operated on is aliased to a localized $_:
    my $articleHistoryContents = $response->content(); for ($articleHistoryContents) { s/[\s\S]*?<ul id="pagehistory">//; s/<\/ul>[\s\S]*/<br>/; s/<\/li>[\s\S]*?<li>/<br>/g; s/<li>//g; ... }

    That way you need not repeat the variable on each line.

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: Could I get some feedback on my code please?
by stonecolddevin (Parson) on Jan 14, 2007 at 03:18 UTC

    Templates ++, and using modules to do stuff that's already been done ++.

    A lot of what I notice is you're re-inventing the wheel with how you've coded this (not your idea, just how it's written). If you did a couple pretty basic searches on CPAN, i bet you could get rid of a lot of your hand coded regexps and have something pretty solid, clean, fast, and virtually unbreakable.

    As far as the code, it looks incredibly organized and well documented, which is something i can't say for even most of my code. Takes a good deal of patience to do such a thing. ++ to you when i get my votes from the vote fairy.

    And one last note, throwing yourself to the wolves ain't always a bad thing, sometimes you need to just get a swift kick in the butt to get precious programming and other such values engrained into you.

    HTH.

    meh.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://594509]
Approved by wfsp
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others lurking in the Monastery: (3)
As of 2024-04-26 04:29 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found