
I see your point regarding the test conditions, and the code is slimmer as a result. The DzSoft Perl Editor has an 'In Browser' facility where I can view the fruits of my code. It displays the HTML exactly as it would look if I'd been able to write the altered code back to the source file on my hard drive, i.e. the images and anchors have been removed.

Which is where I'm having (more) problems. This hacking business is certainly hard work, though fun (when I can get code to run)! I'm trying to write the changed code back to the file on the hard drive, by writing to a filehandle, so that I can re-open the HTML document. Do I use the print operator? I have to come clean and say that the file writing is confusing the Hell out of me. Are file tests the answer, or should I assign to a new list variable? Time to try.

Here's the code I've written so far. The file won't open for writing (yet). Confusion!!!

Rich
#!/usr/bin/perl
# write mods to HTML file.plx
# Program will read in an html file, remove the img tag and rewrite HTML on E-drive.
# 1. No need for file variable yet: open (INFILE, "<".$htmlFile) or die("Can't read source file!\n");
# 2. Alternative: m/<A\s+HREF=[^>]+>(.*?)<\/A>/ - Will not remove closing tag though - why?
# 3. Why is interpreter flipping-out over an 'undefined variable', when
#    original regexp, m/<A\s+HREF=[^>]+>(.*?)<\/A>/, is known to work. What am I missing?

use warnings;
use diagnostics;
use strict;

# Declare and initialise variables.
my $pattern1 = '<IMG\s+(.*)>';
my $pattern2 = '<A\s+HREF\s*=[^>]+>';
my $pattern3 = '</A>';
my @htmlLines;
my @htmlFile;

# Open HTML test file and read into array.
open INFILE, "E:/Documents and Settings/Richard Lamb/My Documents/HTML/test1InDocCSS.html" or die "Sod! Can't open this file.\n";
@htmlLines = <INFILE>;
close (INFILE);

scrapImageTag();
scrapAnchorTag();

# Removes image tag elements in array
sub scrapImageTag {
    foreach my $line (@htmlLines) {
        # replace <IMG ...> with nothing.
        $line =~ s/$pattern1//ig;    # case insensitivity and global search for pattern
    }
}

# Removes anchor tag elements in array
sub scrapAnchorTag {
    foreach my $line (@htmlLines) {
        # replace <A HREF ...> with nothing.
        $line =~ s/$pattern2//ig;    # case insensitivity and global search for pattern
        $line =~ s/$pattern3//ig;    # case insensitivity and global search for pattern
    }
}

# Am I deleting the contents of the list with this? Not sure...
open (OUTFILE, ">@htmlLines") or die("Can't rewrite the HTML file.\n");
print OUTFILE "@htmlLines\n";
close (OUTFILE);

Re: Strangeness with Web browsers...
by graff (Chancellor) on Aug 16, 2003 at 19:34 UTC
    Certainly, you don't really want to do this:
    open (OUTFILE, ">@htmlLines") or die("Can't rewrite the HTML file.\n");
    You're using all the contents of the array -- which would appear to be all the text contents of a file -- as the file name. (Try including the same "string" in the error message that reports failure to open the file, so you'll know when the failure is due to a bad file name.)

    You have a sensible, usable file name for opening "INFILE", and you should do something similar when opening "OUTFILE". Either change the name a bit, so that you don't obliterate the original data file, or rename the original file first (before you reuse its name for output) so that the input data is preserved. This is important when debugging this sort of script.
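
    As a rough sketch of the two suggestions above (a separate output file name, and an error message that reports the name that failed), something like the following might do. The file names, the scratch directory and the helper subs read_lines and write_lines are hypothetical, made up for illustration. The tag patterns are simplified stand-ins rather than the exact ones from the post; in particular the image pattern uses [^>]* instead of (.*), so that it stops at the first closing bracket.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Read every line of $path; on failure report the file name and the OS error ($!).
sub read_lines {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Can't read '$path': $!\n";
    my @lines = <$in>;
    close($in);
    return @lines;
}

# Write @lines to $path. The file NAME goes to open(); the array goes to print.
sub write_lines {
    my ($path, @lines) = @_;
    open(my $out, '>', $path) or die "Can't write '$path': $!\n";
    print {$out} @lines;    # unquoted: "@lines" would add a space between elements
    close($out) or die "Can't close '$path': $!\n";
    return;
}

# Work in a scratch directory so that no real document is touched (assumption:
# in the real script these would be paths under "My Documents/HTML").
my $dir      = tempdir(CLEANUP => 1);
my $in_file  = File::Spec->catfile($dir, 'test1.html');           # hypothetical name
my $out_file = File::Spec->catfile($dir, 'test1.stripped.html');  # hypothetical name

# Create a one-line sample input for the demonstration.
write_lines($in_file, qq{<p><IMG SRC="a.gif">Hello <A HREF="x.html">link</A></p>\n});

my @html = read_lines($in_file);
for my $line (@html) {
    $line =~ s/<IMG\s+[^>]*>//ig;         # drop image tags
    $line =~ s/<A\s+HREF\s*=[^>]+>//ig;   # drop opening anchor tags
    $line =~ s{</A>}{}ig;                 # drop closing anchor tags
}

# The input file is left alone; the result goes to a differently named file.
write_lines($out_file, @html);

my @result = read_lines($out_file);
print @result;
```

    Because the output goes to a different name, the original file survives a buggy run, and a failed open reports both the offending path and the reason.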