bsummerer has asked for the wisdom of the Perl Monks concerning the following question:

I have a text file of unspecified length. This file consists of multiple smaller files, all built with the same structure. The file parts are delimited with a special character, and I can find this character's position in the input stream. What I need to do is grab that character (some ASCII character >128, something not readable in Notepad for instance) into a variable by using the original input line and the character's position. Then I need to locate the first occurrence of the characters "IEA" in that same input stream and count the number of characters from that "IEA" string to the very next delimiter as above. Again, I'm bumping into the "I know how to do this in language X, but not in Perl" problem. Help is MOST appreciated.

Re: Jumping in over my head again
by frankus (Priest) on Apr 27, 2001 at 20:34 UTC

    A file with files inside it, all of the same structure. Have you heard of XML, mmmkay?

    First I'd change Perl's internal input record separator variable ($/) to that mystery character and suck the file contents into an array... how big are the subfiles likely to grow? This is an issue if your machine is not a server (or it could be even if it is).
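
    A minimal sketch of that idea, not a definitive implementation. The delimiter byte (0xB6) and the sample data are made up for illustration, and an in-memory filehandle stands in for the real file:

```perl
use strict;
use warnings;

# Assumption: the delimiter is byte 0xB6; the real value has to be found first.
my $delim = chr(0xB6);

# Made-up stand-in for the real file: three sub-files joined by the delimiter.
my $data = join $delim, 'one IEA*1', 'two IEA*22', 'three IEA*333';

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
binmode $fh;

my @records;
{
    local $/ = $delim;               # read up to each delimiter, not each newline
    while (my $record = <$fh>) {
        chomp $record;               # chomp strips the current value of $/
        push @records, $record;
    }
}
close $fh;

print scalar(@records), " records read\n";
```

    Reading record by record in the while loop, rather than assigning <$fh> to an array in one go, keeps only one sub-file in memory at a time if the push is replaced with per-record processing.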

    If I understand you correctly:

    There is a really nasty file with records and sub-records.
    The records are delimited with a character you can't find (try using ord to find it, or a hex editor).
    The sub-record has a delimiter, IEA; after this delimiter you want to know how many characters there are:  $count = length $1 if m/IEA(.*)$/s does the trick, I think. (The regular expression captures everything from IEA to the end of the string in $_, and length counts it; m// on its own in scalar context only returns true or false, not a count.)
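
    The two list items above can be sketched roughly like this. The sample data and the 0xB6 delimiter are assumptions, and this version stops at the next delimiter rather than at the end of the string:

```perl
use strict;
use warnings;

# Made-up sample: two records separated by byte 0xB6 (an assumption).
my $data = "first IEA*1*0001" . chr(0xB6) . "second IEA*22";

# Find the first byte above 127 and report its code with ord.
my ($delim) = $data =~ /([\x80-\xFF])/;
die "No high-bit character found\n" unless defined $delim;
printf "Delimiter is byte 0x%02X\n", ord $delim;

# Count the characters between the first "IEA" and the next delimiter.
my $count;
if ($data =~ /IEA(.*?)\Q$delim\E/s) {   # non-greedy: stop at the first delimiter
    $count = length $1;
}
print "$count characters after IEA\n" if defined $count;
```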

    I suspect what you want is a lot simpler for Perl to do, but you've come from a less friendly language (but I'm far too polite to say that ;-)

    --
    
    Brother Frankus.
Re (tilly) 1: Jumping in over my head again
by tilly (Archbishop) on Apr 27, 2001 at 20:42 UTC
    I am unclear on how you plan to find this character's position. But if you can, then you can read lines and use substr to pull out the character, and you can use seek to reset your position in the file. In perlvar you can find $/, which can be handy for reading to specific positions. (Alternatively, you can read through the file and then find the stuff you want manually.) And if you want, you can have the filehandle tell you where it is in the file.

    Before playing with this stuff, if you are on Windows it is highly recommended that you first binmode.
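
    A rough sketch of how binmode, substr, tell and seek fit together. The data, the delimiter byte and the known position ($pos) are all hypothetical, and an in-memory filehandle stands in for the real file:

```perl
use strict;
use warnings;

# Made-up stand-in for the real file; byte 0xB6 as the delimiter is an assumption.
my $data = "abc" . chr(0xB6) . "def IEA*1" . chr(0xB6);

open my $fh, '<', \$data or die "Cannot open: $!";
binmode $fh;                          # no newline translation on Windows

my $pos = 3;                          # assumed: the delimiter's known position
my $buf;
my $got = read($fh, $buf, 16);        # read a chunk from the start of the file
die "Read failed: $!" unless defined $got;

my $delim = substr($buf, $pos, 1);    # pull out the single character at $pos

my $where = tell($fh);                # current offset in the file
seek($fh, 0, 0) or die "Cannot seek: $!";   # rewind to the beginning
my $after_seek = tell($fh);
close $fh;

printf "delimiter is byte 0x%02X, read up to offset %d\n", ord $delim, $where;
```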

    And by the time you are done, well you probably will understand file manipulation in more detail than you do now...

    Of course if the file is small, you can just slurp it into a string and make your life much, much easier...
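
    For the small-file case, slurping might look something like this. The delimiter byte and sample data are assumptions, and an in-memory filehandle stands in for the real file:

```perl
use strict;
use warnings;

my $delim = chr(0xB6);                # assumed delimiter byte
my $data  = join $delim, 'a IEA*1', 'b IEA*22', 'c IEA*333';   # made-up sample

open my $fh, '<', \$data or die "Cannot open: $!";
binmode $fh;
my $content = do { local $/; <$fh> }; # with $/ undefined, one read returns the whole file
close $fh;

# With everything in one string, split and the regex engine do the rest.
my @parts = split /\Q$delim\E/, $content;
printf "%d parts, %d bytes in total\n", scalar @parts, length $content;
```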

Re: Jumping in over my head again
by traveler (Parson) on Apr 27, 2001 at 20:42 UTC
    You could try something based on the following. Put the special character in $delim and start $buf from the next character. You'll have to modify this depending on whether the 'IEA' and delimiter are to be part of the count. Of course, you'll want to put this in a loop with the read and $delim finding code.
    $start  = index $buf, 'IEA';            # offset of the first 'IEA' (-1 if absent)
    $end    = index $buf, $delim, $start;   # next delimiter at or after that offset
    $length = $end - $start;                # counts 'IEA' itself, but not the delimiter
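
    One way the loop might look, as a sketch only. The buffer contents and the 0xB6 delimiter are made up, and this variant leaves both 'IEA' and the delimiter out of the count:

```perl
use strict;
use warnings;

my $delim = chr(0xB6);                # assumed delimiter byte
# Made-up buffer: three parts, each terminated by the delimiter.
my $buf = join $delim, 'x IEA*1', 'y IEA*22', 'z IEA*333', '';

my @lengths;
my $from = 0;
while ((my $start = index($buf, 'IEA', $from)) >= 0) {
    my $end = index($buf, $delim, $start);
    last if $end < 0;                 # no closing delimiter: stop
    push @lengths, $end - $start - length('IEA');   # exclude 'IEA' and the delimiter
    $from = $end + 1;                 # continue after this delimiter
}
print "@lengths\n";
```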
    HTH, traveler
Re: Jumping in over my head again
by THRAK (Monk) on Apr 27, 2001 at 20:41 UTC
    What do you have written so far? Most everyone here will help you if you at least make an attempt on your own and post your code, no matter how minimal it might be. Also, it might be useful if you could provide an example of the text file you are trying to read. I'm guessing that posting a chunk won't work (display problems), but maybe you have someplace where someone could FTP it from? In any case, it sounds like this might require you to read the file in binary mode and then parse as necessary.

    Also, if you are using Notepad as an editor, you may want to look into trying UltraEdit (my editor of choice on Windows) or, as others prefer, EditPlus.

    -THRAK
    www.polarlava.com