nanohurtz has asked for the wisdom of the Perl Monks concerning the following question:

Kind monks, I've written a script that reads an 8-part string starting with the word -alias-. It extracts the 2nd, 6th and 8th parts and prints them, with a count, to an .htm file in table format.

example: http://www.graffece.com/dev/prevplanout.htm

I have 3 problems

One. I would like to further isolate the actual alias word by removing everything from the = (equals) character onward in the $varalias string.

Two. I'd also like to strip the " (quotation) characters from the $vartarget string. The end result would look something like the following example.

example: http://www.graffece.com/dev/prevplanmod.htm

This way I can use the $varsource and $vartarget variables in other file and directory subroutines.

Three. The trickiest: there are times when the input file may contain carriage returns between my wanted strings. Is there a way I can filter them out within the same perl routine?

example: alias tt_clms_copy="/usr/cmvc/cmtools/bin/cm_move.sh -e /cmtest/appsmore/27/hpux/text/cmexports -t /cmtest/appsmore/27/hpux -s /prod1/uat_stage/clmstux/ap_clms"

Attached is the code.

    open(IN, "< prevplanin.txt");
    open(OUT, "> prevplanout.htm");
    my ($varalias, $varsource, $vartarget);
    print OUT "<html><head><title>PLAN VERIFICATION</title></head><body bgcolor=\"\#FFFFFF\" text=\"\#000000\">";
    $i = 1;
    while(<IN>) {
        chomp;
        if (m/(alias) (.*) (.*) (.*) (.*) (.*) (.*) (.*)$/) {
            $varalias=$2, $varsource=$6, $vartarget=$8;
            print OUT "<table width=\"37%\" border=\"0\" cellspacing=\"1\" cellpadding=\"0\"><tr bgcolor=\"\#0099CC\"><td colspan=\"2\"><font face=\"Geneva, Arial, Helvetica, san-serif\" size=\"2\"><b>";
            print OUT "<font color=\"\#FFFFFF\">alias retrieved</font></b></font></td></tr><tr><td width=\"11%\" bgcolor=\"\#CCCCCC\"><font face=\"Geneva, Arial, Helvetica, san-serif\" size=\"2\"><b>";
            print OUT "alias</b></font></td><td width=\"89%\" bgcolor=\"\#FFCC99\"><font face=\"Geneva, Arial, Helvetica, san-serif\" size=\"2\">$varalias</font></td></tr><tr><td width=\"11%\" bgcolor=\"\#CCCCCC\">";
            print OUT "<font face=\"Geneva, Arial, Helvetica, san-serif\" size=\"2\"><b>source</b>";
            print OUT "</font></td><td width=\"89%\" bgcolor=\"\#FFCC99\"><font face=\"Geneva, Arial, Helvetica, san-serif\" size=\"2\">$varsource</font></td></tr><tr><td width=\"11%\" bgcolor=\"\#CCCCCC\">";
            print OUT "<font face=\"Geneva, Arial, Helvetica, san-serif\" size=\"2\"><b>target</b></font></td><td width=\"89%\" bgcolor=\"\#FFCC99\"><font face=\"Geneva, Arial, Helvetica, san-serif\" size=\"2\">$vartarget</font></td></tr><tr><td colspan=\"2\" bgcolor=\"\#0099CC\">";
            print OUT "<font face=\"Geneva, Arial, Helvetica, san-serif\" size=\"2\"><b><font color=\"\#FFFFFF\">$i</font></b></font></td></tr></table><p></p>";
            $i = $i + 1;
        }
        print OUT "</body></html>"
    }
    close(IN);
    close(OUT);

Any help from this established and world-renowned monastery is greatly appreciated.

Replies are listed 'Best First'.
Re: parsing out alias string better in perl to .htm
by tadman (Prior) on May 31, 2002 at 05:37 UTC
    Simple solutions are probably just a regular expression:
    $varalias =~ s/=.*//;    # Delete everything(.*) from the equals on
    $vartarget =~ s/"$//;    # Delete the quote at the end($)
    As a note, your regular expression scares the willies out of me; it is a strong case for Death to Dot Star! You probably mean to do something like this:
    # If able to delete the stuff(.*) between the beginning
    # of the line(^) and the word 'alias' and one-or-more spaces(\s+)
    # found after that ...
    if (s/^.*?alias\s+//) {
        # Extract the 1st, 5th and 7th "words" (indices 0, 4 and 6)
        my ($varalias, $varsource, $vartarget) = (split(' '))[0,4,6];
        # Then whatever...
    }
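
    Putting the pieces together, here is a minimal sketch run against the sample line from the question. The helper name parse_alias_line is made up for illustration, and the sketch assumes the words are counted from zero once the leading 'alias' has been stripped, so the alias, source and target sit at positions 0, 4 and 6.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): takes one alias line and
# returns the bare alias name, the source path and the target path.
sub parse_alias_line {
    my ($line) = @_;

    # Drop everything up to and including 'alias' and the spaces after it.
    return unless $line =~ s/^.*?alias\s+//;

    # Words are now numbered from 0: alias is word 0, source 4, target 6.
    my ($alias, $source, $target) = (split ' ', $line)[0, 4, 6];
    return unless defined $target;

    $alias  =~ s/=.*//;    # delete everything from the equals sign on
    $target =~ s/"$//;     # delete the quote at the end

    return ($alias, $source, $target);
}

# The sample line from the question.
my $sample = 'alias tt_clms_copy="/usr/cmvc/cmtools/bin/cm_move.sh -e '
           . '/cmtest/appsmore/27/hpux/text/cmexports -t /cmtest/appsmore/27/hpux '
           . '-s /prod1/uat_stage/clmstux/ap_clms"';

my ($alias, $source, $target) = parse_alias_line($sample);
print "alias:  $alias\nsource: $source\ntarget: $target\n";
```

    For the sample line this should print tt_clms_copy as the alias, /cmtest/appsmore/27/hpux as the source and /prod1/uat_stage/clmstux/ap_clms as the target. Lines that do not contain 'alias', or that have too few words, come back as an empty list.
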
    Here are a few general tips that can help you simplify your program.

    Your print to OUT just screams out for a "here document" style approach, where you can put a whole bunch of stuff right in your program and save yourself having to quote it properly. As a bonus, you don't have to "escape" your quote marks, like you have done:
    print OUT <<END_HTML;
    <FOO>
    <FOO>$varsource
    <FOO>
    <FOO>$vartarget
    <FOO>
    <FOO>
    END_HTML
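
    As a slightly fuller illustration, here is a small runnable sketch of a here-document that builds one table row. The helper name table_row and the trimmed-down markup are assumptions for the example, not the original table.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns one table row
# built with a here-document, so the HTML needs no escaped quotes.
sub table_row {
    my ($label, $value) = @_;

    # Everything up to the END_HTML marker is treated like a double-quoted
    # string, so $label and $value are interpolated.
    return <<END_HTML;
<tr>
  <td width="11%" bgcolor="#CCCCCC"><b>$label</b></td>
  <td width="89%" bgcolor="#FFCC99">$value</td>
</tr>
END_HTML
}

print table_row('source', '/cmtest/appsmore/27/hpux');
```

    Note that the closing END_HTML marker has to sit at the very start of its own line, with nothing after it, for this form of here-document to end.
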
    Also, you can increment a variable with the ++ operator, like so: $i++ and that is the same as $i = $i + 1 but is much shorter.

    Don't forget to indent, either. Not indenting is a major Faux Pas, kind of like arriving at work without a shirt on. You can do it, but people look at you funny. Whenever you open braces, kick it in a tab stop. Some editors can do this for you automatically, if you're feeling Lazy. Example:
    if ($something) {
        some_code();
        if ($something_else) {
            some_other_code();
        }
    }
Re: parsing out alias string better in perl to .htm
by particle (Vicar) on May 31, 2002 at 12:26 UTC
    the code below is untested. mostly, i'm using it for illustration of the type of code you should be writing. use strict, warnings, and CGI. localize your typeglobs. test the return status of open and close. use split when it makes sense. take advantage of select to make printing easier.

    #!/usr/bin/perl
    use strict;      ## keeps you honest, makes you a better programmer
    use warnings;    ## this too...
    use diagnostics; ## for explanations of error and warning messages
    require 5.006;

    ## use the html generation methods in the infamous CGI module
    use CGI qw/:standard/;

    my( $infile, $outfile ) = ( 'prevplanin.txt', 'prevplanout.htm' );
    my ($varalias, $varsource, $vartarget);

    ## indent your code properly
    {
        ## localize your typeglobs so you don't pollute the namespace.
        ## this is done by creating them with [local] inside a block
        local( *IN, *OUT );

        ## always test for failures on file open
        open( IN, "<", $infile ) or die "ERROR: can't open $infile: $!";
        open( OUT, ">", $outfile ) or die "ERROR: can't open $outfile: $!";

        ## select the filehandle, so you don't have to specify it when you print
        select OUT;

        ## specify autoflush mode, so output is flushed after every print
        $|++;

        # print your html header
        print start_html(
            -title   => 'PLAN VERIFICATION',
            -bgcolor => "#FFFFFF",
            -text    => "#000000",
        );

        ## set your input record separator. here i chose newline followed by "alias "
        ## this way you can support multi-line input strings
        local $/ = "\nalias ";

        ## for each record
        while( <IN> ) {
            chomp;            ## strip the trailing record separator
            s/^alias\s+//;    ## the first record still starts with "alias "

            ## split the space-separated fields of your record
            ## take a slice of the array [split] returns
            my( $alias, $source, $target ) = (split)[0,4,6];

            ## use CGI's html generation to create your table
            print table(
                { -width => "37%", -border => "0", -cellspacing => "1", -cellpadding => "0" },
                Tr( { -bgcolor => "#0099CC" },
                    td( { -colspan => "2" }, 'alias retrieved' )
                ),
                Tr(
                    td( { -width => "11%", -bgcolor => "#CCCCCC" }, b( 'alias' ) ),
                    td( { -width => "89%", -bgcolor => "#FFCC99" }, $alias )
                ),
                ## repeat for $source and $target
                ## ...
                ## you don't need $i, you can use $. instead (current record number)
            );
        }

        ## finish your page
        print end_html();

        ## remember to check return status of close, too
        close( IN ) or warn "Warning: can't close file: $!";
        close( OUT ) or warn "Warning: can't close file: $!";

        ## we're done with *OUT, select a valid filehandle
        select STDOUT;
    }
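
    To see the record separator trick on its own, here is a small self-contained sketch that reads from an in-memory string instead of prevplanin.txt. The helper name read_alias_records and the sample data are invented for the example, and it assumes every record begins with "alias " at the start of a line.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): reads alias records from
# an open filehandle and returns a list of [alias, source, target] triples.
sub read_alias_records {
    my ($fh) = @_;
    my @records;

    # Records end at a newline followed by "alias ", not at every newline.
    local $/ = "\nalias ";

    while ( my $record = <$fh> ) {
        chomp $record;               # removes a trailing "\nalias "
        $record =~ s/^alias\s+//;    # only the first record keeps its keyword

        # Splitting on whitespace also swallows newlines inside the record.
        my @words = split ' ', $record;
        next unless @words >= 7;
        push @records, [ @words[ 0, 4, 6 ] ];
    }
    return @records;
}

# In-memory "file" whose second alias is broken across lines.
my $input = <<'END_INPUT';
alias one="cmd -e /exp1 -t /src1 -s /dst1"
alias two="cmd -e /exp2
 -t /src2
 -s /dst2"
END_INPUT

open( my $fh, '<', \$input ) or die "can't open in-memory file: $!";
print join( ' | ', @$_ ), "\n" for read_alias_records($fh);
close($fh);
```

    This only copes with line breaks that fall between words. A break in the middle of a path would still split that path into two words, so such input would need extra cleanup first.
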

    ~Particle *accelerates*

Many Thanks
by nanohurtz (Initiate) on May 31, 2002 at 14:26 UTC
    I knew I came to the right place for wisdom and inspiration. A thousand blessings upon the homes of those who helped.