Larry has asked for the wisdom of the Perl Monks concerning the following question:

Well, I have redone my script (see below), but the computer just hangs and gives me a 0-byte file with nothing in it. The text file called nosect.txt contains a list of words on separate lines. The file Statutes.fff is a 400 MB file which contains text in paragraph form. Where am I going wrong? Help!
#!/usr/local/bin/perl -w
%Acts = ();
#use Getopt::Long;
use strict;
my $filetoopen1 = 1;
my $filetoopen = 1;
my @WordList = 1;
my %Groups = 1;
my $TheLine = 1;
#####################
&Actsin1;
&Group;
#####################
sub Actsin1 {
    $filetoopen1 = "c:/Update/nosect.txt";
    open (ACTLIST, $filetoopen1) || die "Could not open file $filetoopen1 \n";
    open (OUTPUT2,">c:/Update/Errors.txt") || die "Could not open output \n";
    dbmopen (%Groups, "c:/Update/Groups",0666) || die "Could not open Acts database\n";
    @WordList = (<ACTLIST>);
}
#####################
sub Group {
    $filetoopen = "c:/update/makeme/statutes\.fff";
    open (INPUT, $filetoopen) || die "Could not open file $filetoopen \n";
    open (OUTPUT,">c:/update/makeme/newstat.fff") || die "Could not open output \n";
    $TheLine = $_;
    while (my $TheLine = <INPUT>) {
        if ($TheLine =~ /<RD>[^\n]*Status Compendium<<.JL>/i) {
            for(my $i=0; $i > @WordList; $i++) {
                print OUTPUT $TheLine if($TheLine ne $WordList[$i])
            }
        }
    }
}
##################

Replies are listed 'Best First'.
Re: Remove multiple lines of text
by foogod (Friar) on Sep 21, 2001 at 08:38 UTC

    Larry,

    There are many different possible problems with your script.

    • 1) On Windows you should point to your files like:  $filetoopen = "c:\\update\\makeme\\statutes.fff";
    • 2) Why are you escaping the period in the $filetoopen string? It should be as above, with no \

    Fix these errors and you will be further along ... but I am not sure if I understand exactly what you are wanting to do.

    Follow up with some more details, and I will punch out a general outline for your script.

    HTH

    - f o o g o d

      Actually you don't have to use the \\ escaped backslashes. See this tutorial node by tachyon and commentary by various other people.

      As a quick test you can paste the code below into a file on your local system and run it w/o error

      use strict ;
      use warnings;
      use diagnostics ; # explain the warnings
      open(FH,">C:/tmp.txt") || die "blah blah $!";
      print FH "some text\n";
      close(FH);
      open(FH, "<C:/tmp.txt");
      my $test=<FH>;
      print $test;

      --mandog
Re: (ebm)Remove multiple lines of text
by earthboundmisfit (Chaplain) on Sep 21, 2001 at 08:55 UTC
    I'm a bit confused by what you're trying to accomplish here:
    $TheLine = $_;
    while (my $TheLine = <INPUT>) {
        # ....etc.
    }
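    You're assigning the default variable, "it" ($_), to $TheLine outside your loop, and then declaring a second $TheLine with my in the loop condition. That inner variable shadows the outer one for the duration of the loop, so the assignment before the loop has no effect on what the loop sees.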
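    To see the two variables side by side, here is a minimal sketch. The sub name shadow_demo and the sample strings are made up for illustration, and a plain list stands in for the INPUT filehandle.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): shows that a variable
# declared with my in a while condition is separate from an outer variable
# of the same name.
sub shadow_demo {
    my @input    = @_;
    my $the_line = 'outer';    # plays the role of the file-level $TheLine
    my @seen;

    # This my creates a new lexical that lives only for the loop.
    while (defined(my $the_line = shift @input)) {
        push @seen, $the_line;
    }

    # After the loop, $the_line refers to the outer variable again.
    return ($the_line, @seen);
}

my ($after_loop, @inside_loop) = shadow_demo('line 1', 'line 2');
print "after the loop: $after_loop\n";
print "inside the loop: @inside_loop\n";
```

    The outer variable comes back unchanged, which is why declaring $TheLine once, in the loop condition, is enough.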

    Here's one way of doing what I think you want. It's probably not the most advanced way, but I hope it makes sense and helps in your understanding of what's gone wrong.

    my @lines = <INPUT>;
    for (@lines) {
        if (/<RD>[^\n]*Status Compendium<<.JL>/i ) {    # notice that $_ is assumed here
            foreach my $word (@WordList) {
                print OUTPUT if ($_ ne $word);    # here we need $_ for comparison (?)
            }
        }
    }
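    Two caveats on the loop above: reading the whole of INPUT into an array holds the entire 400 MB file in memory, and the inner loop prints the line once for every word that differs from it. A rough alternative sketch follows, assuming the goal is to copy the file while dropping any line whose text appears in the word list. The sub name filter_lines and the sample data are hypothetical, and in-memory filehandles stand in for the real files.

```perl
use strict;
use warnings;

# Hypothetical helper: copy lines from $in to $out, skipping any line whose
# text (without its line ending) is a key of the %$skip hash.
sub filter_lines {
    my ($in, $out, $skip) = @_;
    my $kept = 0;
    while (my $line = <$in>) {    # only one line is held in memory at a time
        (my $text = $line) =~ s/\r?\n\z//;
        next if exists $skip->{$text};    # one hash lookup instead of a loop
        print {$out} $line;
        $kept++;
    }
    return $kept;
}

# Sample data in place of nosect.txt and the statutes file (assumed contents).
my %skip   = map { $_ => 1 } ('Act B');
my $source = "Act A\nAct B\nAct C\n";
my $result = '';

open(my $in,  '<', \$source) or die "Cannot open in-memory input: $!";
open(my $out, '>', \$result) or die "Cannot open in-memory output: $!";
my $kept = filter_lines($in, $out, \%skip);
close($in);
close($out);

print $result;
```

    With real files, the same sub would take handles opened on the statutes file and the output file, with the hash built from the chomped word list.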
Re: Remove multiple lines of text
by Anonymous Monk on Sep 21, 2001 at 08:46 UTC
    Here are just *a couple* tips and problems with your code.

    You really need to look at the errors and warnings that -w and strict give you for your code. -w and strict are your friends!!

    • my $filetoopen1 = 1; my $filetoopen = 1; my @WordList = 1;
      Why are you doing this? Are you doing it because of strict? Unless you have a practical reason for declaring your variables this way, you can just declare them like my $var;
    • Please excuse me if I'm wrong (as I don't use windows much), but in your path names, aren't you supposed to be using \ instead of the unix style / ?? I think you'll have to escape your backslashes (like \\) too.
    • The scope of most of the variables in your program is pretty screwed up, and, as mentioned in your previous post, the subroutines are not needed at all; removing them would solve your scoping problems. Otherwise, you'll need to declare these variables in your subs (my would be fine then). If you have no idea what I'm talking about, look around - I'm sure there's plenty of info on scope and my!!
    • You are missing semicolons on a couple lines.
    • I have no idea what the ultimate intent of this script is, but do be careful with your open calls - make sure that you're opening each file in the correct mode.
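    As a rough illustration of the declaration and open-mode points in the list above: my @WordList = 1; creates a one-element list holding 1, and my %Groups = 1; draws an "Odd number of elements in hash assignment" warning under -w, so plain declarations are simpler. The sketch below assumes a temporary file in place of the real paths, and the variable names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Declare without dummy values; both start out empty.
my @word_list;
my %groups;

# A temporary file stands in for the real paths (an assumption for the demo).
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
close($tmp_fh);

# '>' truncates the file and opens it for writing; the mode is its own argument.
open(my $out, '>', $tmp_name) or die "Could not open $tmp_name for writing: $!";
print {$out} "first\nsecond\n";
close($out) or die "Could not close $tmp_name: $!";

# '<' opens the file read-only.
open(my $in, '<', $tmp_name) or die "Could not open $tmp_name for reading: $!";
chomp(@word_list = <$in>);    # read every line and strip the newlines
close($in);

print scalar(@word_list), " words read\n";
```

    Putting $! in the die message reports why an open failed, which the messages in the original script leave out.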
    Please note that this list has no intention of being complete, and you really should try to understand the error messages as much as you can - (how'd you get it to run????). If you are copying the code to here by hand, I'm sure the paste functionality of your browser is fully functional :)
      Oops -- that was me ^^^^^^ . :)