james28909 has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks, I have come before you today with a couple of questions I am unsure about. If I need to make a separate thread for each question I will, but I think the first one can be summed up pretty quickly. It is about use warnings;

For example, the code I have written runs fine and dandy with strict and warnings on, and I even get the expected results perfectly. But when warnings are used, I get a message about $FILE being closed. Like I said, even though I get this warning, the script still completes and I get the expected results. I will provide example files and the script later in this post so you can compare :) . That's it for the first question.

The next question is about populating an array. The main problem at hand is that I will have a different number of files each time I run this batch script. The way I have coded this, it seems I will have a hard time storing my precious data in a variable. So I think it will be best to use an array with delimiters: for each file, write in a delimiter, and then split that array into multiple other arrays (or a multidimensional one). I am honestly not sure which would be the best way. Let me post the code that covers these two questions, in the hope that you understand it (I tried to comment everywhere to help you understand the code better):

use strict;
# use warnings;
my $file;
my $dirname = $ARGV[0];
#warn "$dirname";
my @array;

foreach my $file (<$dirname/*>) {
    next if -d $file;
    open( our $FILE, '<', $file );
    binmode($FILE);

    #get filesize of file in process
    my $filesize = -s $file;

    #seek to text entry reference in file
    seek $FILE, 6, 0;
    read $FILE, my $buf, 2;

    #convert data
    my $abc = unpack('H*', $buf);
    my $offset = hex($abc);

    #use text entry reference to seek to actual text and print file/process info
    seek $FILE, $offset, 0;
    print "\n\n$file - size of file: $filesize - Text is at offset: $abc\n\n";

    #loop thru file byte by byte doing processes depending on regex matches
    while (read($FILE, my $by, 1)){

        #convert contents of byte to hex string
        my $byte = unpack('H*', $by);

        #if regex match is 00 return position in file of match, and
        #subtract offset to get total bytes read from middle of the file
        if ($byte =~ /00/){
            my $pos = tell($FILE);
            #this subtraction will later be converted to a hex string "0xXX"
            my $decimal_value_pointer = $pos - $offset;
            my $pointer = sprintf("%X", $decimal_value_pointer);
            push (@array, qq($pointer\n));
            #just prints correct pointer value in hex
            #print "pointer is: 0x$pointer\n\n";
            next;
            # close $FILE;
        }

        #if $byte is ff then close file,which will jump back to the next file in the foreach loop
        if ($byte =~ /ff/){
            close $FILE;
        }
    }
}
unlink ('temp');
open my $temp, '>>', "temp";
foreach my $lines(@array){
    ++$_;
    print $temp "Pointer$_ - $lines";
}
In the code above, I am reading a directory, and for each file in this directory I am opening it, getting the filesize, then seeking and reading the particular data I need out of the file. This is VERY HARD to explain, sorry. After I open the file and get the filesize and an address reference from 0x07 - 0x08, I seek to the value stored in $offset. Once I seek to $offset, I read the file byte by byte with the while loop.

I am looking for "00" and "ff" in the hex string. When I find "00", I take my current position in the file and subtract $offset from it, which gives me a decimal number that I convert to hex. I need this number to stay in hex as it is very, very important (it is a pointer to each individual sentence in the text entry offset, formerly known as $offset). Once I convert $decimal_value_pointer to its hex equivalent, I push it to an array with a newline at the end. Later on in the file, after I have parsed through all the 00's, it will find "ff", which is in every file (so I don't need to read to EOF), and it will close $FILE and move to the next one perfectly. It will do this over and over till there are no more files :)

Now at the end, what I did was my failed attempt to label each one of these, and that is where I beg for your help. There will always be an unknown number of files, therefore there will be an unknown number of $pointers. These $pointers will have to be written back to the file at static addresses. These addresses are known. But I would like a way to either dynamically create arrays for each file, or, what would be even better, to figure out how I can do something like this... pseudo code of course lol:
foreach my $line (@array) {
    foreach my $file (0 .. 100 or however would be best) {
        open $file
        seek to static offset
        write $line from array to $file
    }
}
This would be optimal if I could keep the @array as a whole and just write each value where I need it, even while opening and closing files; hopefully you understand what I mean. Here are the files I promised to test with: Test Files

Just extract the script and folder anywhere (it is in another folder already) and look at the code. And please, if I can be any clearer on anything, or if you see something wrong with my code, don't hesitate to tell me :)

Replies are listed 'Best First'.
Re: Question about warnings and arrays
by SuicideJunkie (Vicar) on Aug 12, 2014 at 19:43 UTC

    Just as a note, using both $file and $FILE for variable names is going to cause pain. Why not name them $filename and $iFH/$oFH or something else unique?

    Also, a nicer way to close a filehandle is to allow the variable to go out of scope. Just my $filehandle it inside the outer loop, and use next FILE; with an appropriate "FILE" label on the outer loop instead of closing it manually. The filehandle will get closed for you when you leave, and $filehandle won't exist after the file is closed so you can't accidentally try to do operations on the closed filehandle.

    That's a bit nicer than closing the file in order to make your read throw an error in the while loop, thus breaking out to the outer loop.
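    A minimal sketch of that suggestion, not the OP's actual script: the outer loop carries a FILE label, the handle is a lexical my $fh declared inside it, and next FILE leaves the inner loop once the 0xff terminator shows up. The sub name scan_dir and the shape of its return value are made up for illustration, and the offset handling from the original post is left out to keep it short.

```perl
use strict;
use warnings;

# Hypothetical helper: for every plain file in $dirname, record the
# position (bytes read so far) of each 0x00 byte, stopping at 0xff.
sub scan_dir {
    my ($dirname) = @_;
    my %positions;

    FILE:
    foreach my $filename ( glob("$dirname/*") ) {
        next FILE if -d $filename;

        # Lexical handle: closed automatically when this iteration ends.
        open my $fh, '<', $filename or die "can't open $filename: $!";
        binmode $fh;
        $positions{$filename} = [];

        while ( read( $fh, my $by, 1 ) ) {
            push @{ $positions{$filename} }, tell($fh) if $by eq "\x00";
            next FILE if $by eq "\xff";    # done with this file
        }
    }
    return \%positions;
}
```

    Because $fh only lives for one pass of the outer loop, nothing after next FILE can touch a closed handle by accident.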

      So when I come out of a loop like that, it will close the $FILE for me? But I think in this particular case, don't I need to keep it open until $byte =~ /ff/ ? That way it closes the file and goes to the next one early. But I think I see what you're saying :) I will def keep that in mind. I just need to figure out how to make a list of variables that I can store $pointer into, or read each line from the file that's created into $_ and write that at the offsets I need, then close the file and open the next one without losing my spot in the pointers file. Edit: I kind of see what you mean, but a good example would be even better :P
      you know you want to give an example anyway :P
        Hi james28909 , yes, I'm that guy, you know what I'm going to say :) don't nest loops, write subroutines, small subroutines, easy to debug ... naturally this code is untested but looks easier to read doesn't it :) instead of bunches of comments, subroutine names
        #!/usr/bin/perl --
        ##
        ##
        ## perltidy -olq -csc -csci=3 -cscl="sub : BEGIN END " -otr -opr -ce -nibc -i=4 -pt=0 "-nsak=*"
        ## perltidy -olq -csc -csci=10 -cscl="sub : BEGIN END if " -otr -opr -ce -nibc -i=4 -pt=0 "-nsak=*"
        ## perltidy -olq -csc -csci=10 -cscl="sub : BEGIN END if while " -otr -opr -ce -nibc -i=4 -pt=0 "-nsak=*"
        use strict;
        use warnings;
        use Data::Dump qw/ dd /;
        use autodie qw/ open close /;
        use Path::Tiny qw/ path /;

        Main( @ARGV );
        exit( 0 );

        sub Main {
            my( $dirname ) = @_;
            my @files = getFiles( $dirname );
            my @pointers;
            for my $file ( @files ) {
                SolveThisProblem( $file, \@pointers );
            }
            SpewPointers( 'temp', \@pointers );
        } ## end sub Main

        ## getFiles was called but not shown in the post; this is an assumed
        ## stand-in that returns everything in $dirname that is not a directory
        sub getFiles {
            my( $dirname ) = @_;
            return grep { !$_->is_dir } path( $dirname )->children;
        } ## end sub getFiles

        sub SpewPointers {
            my( $outfile, $pointers ) = @_;
            my $tempfh = path( $outfile )->openw;    ## just like autodie dies on error
            my $ix = 0;
            for my $lines ( @$pointers ) {
                ++$ix;
                print $tempfh "Pointer$ix - $lines\n";
            }
            close $tempfh;
        } ## end sub SpewPointers

        sub SeekToAbcOffset {
            my( $infh, $filename ) = @_;
            my $infhsize = -s $infh;

            #seek to text entry reference in file
            seek $infh, 6, 0;
            read $infh, my $buf, 2;

            #convert data
            my $abc    = unpack( 'H*', $buf );
            my $offset = hex( $abc );

            #use text entry reference to seek to actual text and print file/process info
            seek $infh, $offset, 0;
            print "\n\n$filename - size of file: $infhsize - Text is at offset: $abc\n\n";
            return $offset;
        } ## end sub SeekToAbcOffset

        sub SolveThisProblem {
            my( $filename, $pointers ) = @_;
            use autodie qw/ open close /;
            open my( $infh ), '<', $filename;
            binmode $infh;
            my $offset = SeekToAbcOffset( $infh, $filename );
          READER:
            while( read( $infh, my $by, 1 ) ) {
                if( $by eq "\x00" ) {
                    my $pos                   = tell( $infh );
                    my $decimal_value_pointer = $pos - $offset;
                    push @$pointers, sprintf( "%X", $decimal_value_pointer );
                    next READER;
                } elsif( $by eq "\xff" ) {
                    last READER;
                }
            } ## end READER: while( read( $infh, my $by...))
        } ## end sub SolveThisProblem
        __END__

        See also perlquote and perlrebackslash because "\x00" is equal to chr(0), ie  $by eq chr(0)

        so when i come out of a loop like that, it will close the $FILE for me?

        Yes, but only if you use my $FILE instead of our $FILE. The automatic close is explained in open (look for the term "scope").

        so when i come out of a loop like that, it will close the $FILE for me

        Only insofar your filehandle is lexical (using my, not our, as already suggested), and is lexically scoped to that loop's block. So you'd better understand Perl scopes if you want to use this opportunity.
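        To see the scoping rule in action, here is a small sketch (the scratch file comes from File::Temp and is only there for the demo). The handle is declared with my inside a bare block and never closed explicitly, yet the buffered output is on disk as soon as the block ends, because Perl closes the handle when the last reference to it goes away.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demo; tempfile() returns an open handle and the name.
my ( $tmp_fh, $tmp_name ) = tempfile( UNLINK => 1 );
close $tmp_fh;

{
    # Lexical handle scoped to this block only.
    open my $fh, '>', $tmp_name or die "can't write $tmp_name: $!";
    print {$fh} "written inside the block\n";

    # No explicit close: $fh goes out of scope at the closing brace,
    # which flushes and closes the file.
}

# The data is already on disk here, and $fh no longer exists.
open my $in, '<', $tmp_name or die "can't read $tmp_name: $!";
my $line = <$in>;
close $in;
print $line;
```

        With our $FILE the variable is a package global, so it survives the block and the file stays open until it is closed or reopened.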

Re: Question about warnings and arrays
by 2teez (Vicar) on Aug 12, 2014 at 20:52 UTC

    Hi james28909

    If I may say so, why not use a subroutine for repeated work? It defines a clean and clear path of work.
    Since you are reading through several files, use a subroutine that reads a file and does all you want; then you can use the result it returns. In fact, you don't have to manually close the file, because the file is closed for you once the subroutine is done with each file it opened.
    Then you can deal with each part of your work separately.
    E.g. let's say I want to read line 10 of each file in a particular folder:

    use warnings;
    use strict;

    my $dirname = $ARGV[0] // '.';

    for ( glob("$dirname/*") ) {
        if ( !-d ) {
            read_file($_);
        }
    }

    sub read_file {
        my $filename = shift;
        open my $fh, '<', $filename or die "can't open file:$! ";
        while (<$fh>) {
            print $_ if $. == 10;
        }
    }

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me
Re: Question about warnings and arrays
by Anonymous Monk on Aug 12, 2014 at 23:37 UTC

    To begin with, your code:

    1. open( our $FILE, '<', $file ); is better with error handling and a lexically scoped filehandle: open my $fh, '<', $file or die $!;
    2. read is also better with error handling: read($FILE, my $buf, 2)==2 or die "failed to read 2 bytes: $!";
    3. You should probably look at the pack documentation: your first unpack could be just my $offset = unpack('n',$buf);, and your second unpack could be my $byte = unpack('C',$by);, and then instead of regexes you could just write if ($byte==0) ... elsif ($byte==0xff)
    4. Several of your variables could have better names, it'd make thinking and talking about the code easier: @array, $abc, $temp, $lines
    5. Why unlink ('temp'); open my $temp, '>>', "temp"; when you can just open my $temp, '>', "temp" or die $!;?
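    A quick sketch of points 2 and 3, using an in-memory filehandle so it runs without the OP's data files. The sample bytes are invented, and the 'n' template assumes the two-byte offset is stored big-endian, which is what hex(unpack('H*', $buf)) already implies.

```perl
use strict;
use warnings;

# Invented sample: 6 filler bytes, then the 16-bit value 0x0123 at offset 6,
# followed by three payload bytes.
my $data = ( "\x00" x 6 ) . "\x01\x23" . "\x41\x00\xff";

open my $fh, '<', \$data or die "can't open in-memory file: $!";
binmode $fh;

seek $fh, 6, 0 or die "seek failed: $!";
read( $fh, my $buf, 2 ) == 2 or die "failed to read 2 bytes: $!";

my $via_hex = hex( unpack( 'H*', $buf ) );    # the original two-step way
my $offset  = unpack( 'n', $buf );            # 16-bit big-endian, one step

# Single bytes compare as numbers with 'C' instead of regexes on hex strings.
my @kinds;
while ( read( $fh, my $by, 1 ) ) {
    my $byte = unpack( 'C', $by );
    if    ( $byte == 0 )    { push @kinds, 'zero' }
    elsif ( $byte == 0xff ) { push @kinds, 'end'; last }
    else                    { push @kinds, 'other' }
}
close $fh;

printf "offset: 0x%X, bytes seen: %s\n", $offset, join( ',', @kinds );
```

    Numeric comparison also avoids a subtle trap of the regex form: /00/ or /ff/ on a longer hex string could match across byte boundaries, while == only ever looks at one byte value.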

    Now your questions:

    You should be using warnings, don't comment it out! The warning you're getting is correct - you're trying to read a closed filehandle. You're essentially forcing the loop to end via an error by simply closing the file. Instead you should be using last to exit that loop (that's an alternative to SuicideJunkie's suggested approach above).
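    The difference can be demonstrated with a small sketch. The two subs below are hypothetical stand-ins for the OP's inner loop, run on an in-memory filehandle with invented data; a local __WARN__ handler collects whatever warnings each version triggers.

```perl
use strict;
use warnings;

my $data = "ab\x00cd\xffrest";    # invented sample bytes

# Version 1: end the loop by closing the handle (what the original code does).
sub scan_with_close {
    open my $fh, '<', \$data or die $!;
    my $count = 0;
    while ( read( $fh, my $by, 1 ) ) {
        $count++;
        close $fh if $by eq "\xff";    # the next read() hits a closed handle
    }
    return $count;
}

# Version 2: leave the loop with last; the handle closes at scope exit.
sub scan_with_last {
    open my $fh, '<', \$data or die $!;
    my $count = 0;
    while ( read( $fh, my $by, 1 ) ) {
        $count++;
        last if $by eq "\xff";
    }
    return $count;
}

# Run a sub and return the warnings it emitted.
sub warnings_from {
    my ($code) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    $code->();
    return @warnings;
}

my @from_close = warnings_from( \&scan_with_close );
my @from_last  = warnings_from( \&scan_with_last );

printf "close: %d warning(s), last: %d warning(s)\n",
    scalar @from_close, scalar @from_last;
```

    Both versions read the same number of bytes; only the first one relies on a failing read() to stop.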

    You mention "an array with delimiters" - that doesn't sound like a good idea, it sounds like what you're looking for is an array of arrays. I suggest you study that page well, along with perlreftut; references are further documented in perlref. Basically, for each file you could declare and populate an array my @pointers (declared inside the foreach), similar to what you're doing with @array now, and then at the end of the foreach you would add a reference to that array to @array, with push @array, \@pointers;. If you use that approach, Data::Dumper is very useful for visualizing the data structure you've created, for understanding what's going on, and for debugging: use Data::Dumper; print Dumper(\@array);.
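    A minimal sketch of that structure, with invented byte strings standing in for the OP's files. The scan itself is simplified to "position of every 0x00 up to the first 0xff"; the real offset arithmetic is left out.

```perl
use strict;
use warnings;
use Data::Dumper;

# Invented stand-ins for the contents of three files.
my @file_contents = ( "a\x00bc\x00\xff", "\x00\xff", "abc\xff" );

my @array;    # one entry per file, each entry a reference to an array
foreach my $content (@file_contents) {
    my @pointers;    # fresh array for every file
    my $pos = 0;
    foreach my $by ( split //, $content ) {
        $pos++;
        last if $by eq "\xff";
        push @pointers, sprintf( "%X", $pos ) if $by eq "\x00";
    }
    push @array, \@pointers;    # store a reference, not a flattened copy
}

# $array[0][1] is the second pointer of the first file.
$Data::Dumper::Indent = 1;
print Dumper( \@array );
```

    Since my @pointers is declared inside the loop, every pass creates a new array, so the references pushed onto @array do not alias each other.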

    Unfortunately your description of "These $pointers will have to be written back to the file at static addresses" is too unclear for me to make good recommendations. You want to take every "pointer"/"line" from your source files and write these to hundreds of other files? Or back to the source files? Do these files already contain data that you are overwriting, or are you generating these files yourself? I think what would really help is some examples of your desired output.

      thanks. you havent steered me wrong yet :)
      i guess it is time to study more lol
Re: Question about warnings and arrays
by tbone654 (Beadle) on Aug 12, 2014 at 20:18 UTC
    for opening all files in a directory consider perldoc -f readdir
    my $zz = param('team');
    my $z = param('squad');
    if (param) {
        $sd = "../data/data_mlb";
        opendir( DIR, $sd) || die;
        while( ($filename = readdir(DIR))){
            next if "$sd\/$filename" =~ /\/\./; ##skip . files
            push @dots, "$sd\/$filename";
        } ## end of while
        @dots = sort @dots;
        closedir(DIR);
        for(my $a=0;$a<@dots;$a++){
            open (FILE, $dots[$a]);
            push @foo, <FILE>;
            if ($a+1 eq @dots) {
                close FILE;
                open (FILE, $dots[$a]);
                push @foo2, <FILE>;
            } ## end of if
            close FILE;
        } ## end of for
    }

      Some suggestions and thoughts:

      1. Although I can't be certain, it looks like you're not using strict. Always use warnings; use strict;!!
      2. opendir( DIR, $sd) || die; would be better with a lexical handle and an error message: opendir(my $dh, $sd) || die "couldn't opendir $sd: $!";
      3. Your opens would be better and safer with the three-argument form, lexical filehandles, and error handling: open (my $fh, '<', $dots[$a]) or die "couldn't open $dots[$a]: $!";
      4. The condition "$sd\/$filename" =~ /\/\./ will not just match (i.e. skip) .dotfiles, it'll skip any filenames that contain /. anywhere. For example, if $sd happens to contain that string someday, all files will be rejected. The condition $filename=~/^\./ would skip only filenames that begin with a dot and so that's probably better.
      5. I find the name @dots a little strange since it contains only files that don't begin with a dot?
      6. The last file of @dots is opened and read twice. An alternative would be to read <FILE> into a temporary array and store that into @foo, and into @foo2 when appropriate. Or you could move the special treatment of $dots[-1] outside of the loop (that would also have the stylistic advantage that the loop variable $a could be eliminated, e.g. for my $dot (@dots) { open (my $fh, '<', $dot) ...).
      7. I think the condition $a+1 eq @dots is more clearly written as $a==$#dots.
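      Putting points 1, 2 and 4 together, a hedged sketch of the directory-listing part only. The sub name list_files is made up, skipping subdirectories is an extra assumption, and what happens with each file afterwards is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: return the sorted paths of all entries in $dir
# that are not directories and whose names do not start with a dot.
sub list_files {
    my ($dir) = @_;
    opendir( my $dh, $dir ) or die "couldn't opendir $dir: $!";
    my @files;
    while ( defined( my $filename = readdir $dh ) ) {
        next if $filename =~ /^\./;     # skips ., .. and dotfiles
        next if -d "$dir/$filename";    # skip subdirectories
        push @files, "$dir/$filename";
    }
    closedir $dh;
    @files = sort @files;
    return @files;
}
```

      The defined() around readdir matters for a file literally named 0, which would otherwise end the loop early.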

      While readdir might be an alternative to the OP's glob, what does that have to do with any of the OP's questions? Also that code is desperately crying for some code comments as to what it's doing. @foo and @foo2 seem to have nothing to do with OP's question, why add that confusing code?

        It's probably not as confusing as you think... Or I would not be getting any votes I'm guessing...
        Even if it doesn't help with this specific question, it could help someone else along the way... That's how I myself learn, so if it doesn't offend anyone who's "non-anonymous", I'll keep trying to help...
        What I'm trying to show is:
        read an entire directory of filenames into an array...
        skip files you know you don't want...
        Then read them back one at a time and do your thing (whatever that is)...
        That was valuable to me at one time...
Re: Question about warnings and arrays
by Anonymous Monk on Aug 13, 2014 at 10:56 UTC

    A small nit: Your use of $_ in the final loop is a potential source of bugs, since you don't initialize it to a known value, plus $_ would be the loop variable if you didn't use a lexical variable. Better to just add an extra counter variable.
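    For instance, a sketch of the final loop with an explicit counter. The sample values are invented, and the output is collected in a string here rather than written to the temp file.

```perl
use strict;
use warnings;

my @array = ( "2\n", "5\n", "1A\n" );    # invented pointer lines

my $output = '';
my $count  = 0;    # explicit counter, starts at a known value
foreach my $line (@array) {
    $count++;
    $output .= "Pointer$count - $line";
}
print $output;
```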

Re: Question about warnings and arrays
by james28909 (Deacon) on Aug 13, 2014 at 14:31 UTC
    I think my best bet would be to read the pointers into memory like $pointers (or what is already there); then I can seek and read each line and do a regex to remove newline characters. Then if the line is 2 characters, I can write it to the file; if it is three characters, I will have to rewind one byte address and then write. I'm sorry, I know in my head what I need, and it's very hard for me to help anyone else understand. I need to calculate the pointer values (my original code explains how, in the while loop), then write them back into the file at the appropriate offsets (starting at a static 0x44 per file, then seek 20 more bytes and write a pointer, seek 20 bytes and write a pointer, over and over per file).
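    One possible reading of that plan, as a sketch only. It assumes things the post does not confirm: that slot number i starts at 0x44 + 20 * i (the 20 bytes could also be meant as a gap after each pointer), and that each pointer is stored as a 16-bit big-endian value, which would make the "2 or 3 characters" distinction unnecessary. The sub name write_pointers and both constants are made up for illustration.

```perl
use strict;
use warnings;

# ASSUMPTIONS (not confirmed by the post): slot $i starts at
# 0x44 + 20 * $i, and each pointer is a 16-bit big-endian value.
my $FIRST_SLOT = 0x44;
my $STRIDE     = 20;

# Hypothetical helper: overwrite the pointer slots of one file in place.
# @pointers holds plain numbers, not hex strings.
sub write_pointers {
    my ( $filename, @pointers ) = @_;

    # '+<' opens for reading and writing without truncating the file.
    open my $fh, '+<', $filename or die "can't open $filename: $!";
    binmode $fh;
    my $size = -s $fh;

    for my $i ( 0 .. $#pointers ) {
        my $slot = $FIRST_SLOT + $STRIDE * $i;
        die "slot at $slot is outside $filename" if $slot + 2 > $size;
        seek $fh, $slot, 0 or die "seek failed: $!";
        print {$fh} pack( 'n', $pointers[$i] );
    }
    close $fh or die "close failed: $!";
    return;
}
```

    If the pointers are kept per file in an array of arrays, this would be called once per file with that file's list; hex strings from the earlier code would need to go through hex() first.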