When I run my Perl script to open a text file created in Notepad, the interpreter throws the exception below, and I'm stumped as to how to remedy it. Here's my source first:
#!/usr/bin/perl
# nlexample.plx
# The script opens a text file, reads it, prints a number and
# then each line within the file.
use warnings;
use strict;
use diagnostics;
open FILE, 'C:\Perl\Perl practice\test.txt' or die $!;
my $lineno = 1;
while(<FILE>) {
    print $lineno++;
    print ": $_";
}
Here's the text file:
One day you're going to have to face
   A deep dark truthfull terror,
And it's gonna tell you things that I still
   Love you too much to say.
####### Elvis Costello, Spike, 1988 #######
And the expected response:
1:One day you're going to have to face
2:   A deep dark truthfull terror,
3:And it's gonna tell you things that I still
4:   Love you too much to say.
5:####### Elvis Costello, Spike, 1988 #######

Problem:

1. Why is this exception occurring? Perl seems to be looking for the text file at the same location every time: how do I change this? Why is Perl looking for the test.txt file at this location? Must be a default setting.

I'm using DzSoft Perl Editor to write my code, and running it from the DOS prompt provided by the IDE.

Uncaught exception from user code: No such file or directory at C:\DOCUMENTS AND SETTINGS\RICHARD LAMB\LOCAL SETTINGS\Temp\dir11.tmp\nlexample.plx line 8
Very odd place for Windows to look for the files, as there's no such file as 'dir11.tmp' when I look through the directory structure in Windows Explorer. Grrr.

Windows seems to be looking for a temp file containing the perl source: what the hell is going on?!

Solution: Locating the test.txt file in local settings?

edited: Thu Jul 24 00:37:03 2003 by jeffa - code tags formatting

Re: Re: Re: Intercharacter spacing
by graff (Chancellor) on Jul 24, 2003 at 04:41 UTC
    I'm just guessing (never heard of "DzSoft Perl Editor"), but if this happens to be "line 8" of your test script:
    open FILE, 'C:\Perl\Perl practice\test.txt' or die $!;
    then the error report would have something to do with the "open" statement and the file name string that you're giving it.

    If the perl interpreter (perl.exe) is in a directory that's covered by your PATH environment variable in a DOS shell, try stepping away from the DzSoft IDE for a bit, and use the shell. Go to the directory where your test perl script is kept, and do:

    perl name_of_test_script
    If it gives a similar error report, try using forward slashes "/" instead of backslashes "\" in the file name that you pass to "open()". (I did say I was guessing...) Then make sure that the "test.txt" file really does exist in that exact path.
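    One thing worth knowing here: the path that follows "at" in a die message is the location of the script itself, not the file that "open" failed on, so the Temp directory in the report is probably just where the IDE put a copy of the script. A minimal sketch of an open that names the data file in its own error message follows. The helper name read_lines is made up for illustration, and the forward-slash path is only an assumption about where test.txt lives.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): open a file for
# reading and return its lines, naming the path in any error message.
sub read_lines {
    my ($path) = @_;
    -e $path or die "File does not exist: $path\n";
    open(my $fh, '<', $path) or die "Can't open '$path': $!\n";
    my @lines = <$fh>;
    close($fh);
    return @lines;
}

# Path taken from the original post, written with forward slashes;
# adjust it to wherever test.txt really is.
my $file = 'C:/Perl/Perl practice/test.txt';

my $lineno = 1;
my @lines  = eval { read_lines($file) };
if ($@) {
    # Report the problem instead of dying, so the message is easy to read.
    print "Problem: $@";
}
else {
    print $lineno++, ": $_" for @lines;
}
```

    With the file name in the message, a wrong path shows up immediately instead of being confused with the script's own location.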

    Just out of curiosity, what do you get when you run this command in a DOS shell:

    perl -V
    For that matter, if you went to a directory that contains some longer file names (and names with spaces in them, etc), what would you get if you try this command:
    perl -e "opendir(D,'.'); print join($/,readdir(D)),$/"
    Do all the complete file names show up (long, with spaces,etc)? How about when you run that one-liner from within the IDE?
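    If the quoting in the one-liner gives the shell trouble, the same check can be written as a small script. This is only a sketch: the helper name list_dir is made up, and unlike the one-liner it skips the "." and ".." entries and sorts the names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the entries of a directory, sorted,
# without the "." and ".." entries.
sub list_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open directory '$dir': $!\n";
    my @names = sort grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    return @names;
}

# List the current directory, one name per line.
print "$_\n" for list_dir('.');
```

    Running it once from the shell and once from within the IDE should show whether both see the same directory and the same long file names.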

    One other point: don't even think about trying to do regex substitutions on HTML text data for the sake of "expanding" visible white-space. It'll give you a headache. Doing it without HTML::TokeParser would be utterly wrong. Doing it with HTML::TokeParser (and, say, adding &nbsp; in strategic spots) would just be misguided and unsatisfying (you'd see some results, but you'd rarely see results that look good).

      Graff, cheers for the pointers. I've opened my HTML test file, written regexps to remove image and anchor tags, and printed the result out. I still need to write these mods to the original file, then refresh the HTML page - Happy Days! My supervisor mentioned that regexps may have limitations, so I'm beginning to look into the HTML parse-tree approach (is that the same approach you recommended?). Here's the source code I've put together so far! Rich
      #!/usr/bin/perl
      # remove img & anchor tags.plx
      # Program will read in an html file, remove the img tag and print out entire doc.
      # 1. No need for file variable yet: open (INFILE, "<".$htmlFile) or die("Can't read source file!\n");
      # 2. Alternative: m/<A\s+HREF=[^>]+>(.*?)<\/A>/ - Will not remove closing tag though - why?
      # 3. Why is interpreter flipping-out over an 'undefined variable', when
      #    original regexp, m/<A\s+HREF=[^>]+>(.*?)<\/A>/, is known to work. What am I missing?

      use warnings;
      use diagnostics;
      use strict;
      use HTML::Parser;  # Include this module for future reference - may need to abandon
                         # regexps in favour of parse-trees.

      # Declare and initialise variables.
      my $pattern1 = '<IMG\s+(.*)>';
      my $pattern2 = '<A\s+HREF\s*=[^>]+>';
      my $pattern3 = '</A>';
      my @htmlLines;

      # Open HTML test file and read into array.
      open INFILE, "E:\\Documents and Settings\\Richard Lamb\\My Documents\\HTMLworkspace\\HTML practice\\My First Page!\\firsttest.html" or die "Sod! Can't open this file.\n";
      @htmlLines = <INFILE>;
      close (INFILE);

      # Test for presence of patterns in HTML file
      if($pattern1) {
          scrapImageTag(); # calls to remove image tags
      }
      else {
          print "No tags matching this pattern within the HTML document.\n";
      }
      if($pattern2 && $pattern3) {
          scrapAnchorTag();
      }
      else {
          print "No tags matching this pattern within the HTML document.\n";
      }

      # Removes image tag elements in array
      sub scrapImageTag {
          foreach my $line (@htmlLines) {
              # replace <IMG ...> with nothing.
              $line =~ s/$pattern1//ig; # case insensitivity and global search for pattern
          }
      }

      # Removes anchor tag elements in array
      sub scrapAnchorTag {
          foreach my $line (@htmlLines) {
              # replace <A HREF ...> with nothing.
              $line =~ s/$pattern2//ig; # case insensitivity and global search for pattern
              $line =~ s/$pattern3//ig; # case insensitivity and global search for pattern
          }
      }

      printHTML(); # prints the reformatted HTML doc
      sub printHTML {
          for my $i (0..@htmlLines-1) {
              print $htmlLines[$i];
          }
      }
      print "\n\n";
      sleep 2;
      print "Success?!\n";
        Okay -- that is very likely what you intend most of the time, in terms of getting rid of unwanted tags. But you should note that some of the conditionals are not doing what the comments and messages say they are doing:
        # Test for presence of patterns in HTML file
        if($pattern1) {
            scrapImageTag(); # calls to remove image tags
        }
        else {
            print "No tags matching this pattern within the HTML document.\n";
        }
        Well, the condition "if($pattern1)" does NOT test for the presence of image tags in the html data. It merely tests that some (non-empty, non-zero) value has been assigned to the scalar $pattern1, and since you have done so a few lines above this, the test will always be true -- it would be true even if no data were read in from the html file.

        To test for the presence of image tags in the html data, the condition would have to be:

        if ( grep /$pattern1/i, @htmlLines )
        but there's really no reason to do the test -- just go ahead and call the "scrap" functions. If those regex substitutions apply, fine. If not, no harm done (and not that much cpu work either).
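        To make the difference concrete, here is a small sketch that compares the two tests. The sample lines and the helper names pattern_is_set and has_image_tag are made up for illustration; only $pattern1 comes from the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $pattern1 = '<IMG\s+(.*)>';   # pattern from the script above

# Made-up sample data standing in for lines read from an HTML file.
my @with_img    = ("<p>Hello</p>\n", "<img src=\"a.png\">\n");
my @without_img = ("<p>Hello</p>\n");

# True whenever $pattern1 holds a non-empty, non-zero string,
# regardless of what is in the data.
sub pattern_is_set {
    return $pattern1 ? 1 : 0;
}

# True only if at least one of the given lines matches the pattern.
sub has_image_tag {
    my @lines = @_;
    return scalar(grep { /$pattern1/i } @lines) ? 1 : 0;
}

print "pattern is set:  ", pattern_is_set(), "\n";
print "with image tag:  ", has_image_tag(@with_img), "\n";
print "without one:     ", has_image_tag(@without_img), "\n";
print "no data at all:  ", has_image_tag(), "\n";
```

        The first test reports 1 no matter what was read, while the grep-based test reports 1 only for the sample that actually contains an image tag.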
      Thanks very much for the advice! I'm no longer an 'opening files for reading' virgin. I rechecked the file path, and I had gotten it wrong. Doh. It's solved now, thankfully. Next problem is to extract an image html tag from the file that I'm reading and print it out (sent that question today). After that, I'll need to write to the html file. Starting to find the hacking a lot of fun... Cheers, Richard