batcater98 has asked for the wisdom of the Perl Monks concerning the following question:

I have done this before, but I am having trouble with the different separators in this one. I basically have an autogenerated flat file that is an export of a directory structure. I want to parse through the flat file and pull values from specific lines that contain a file name with the extension .dat. From those lines I want to collect the Date-Time, Size, and a portion of the filename, and skip the rest.
.Example Flat File.
 Volume in drive E is NEW Data
 Volume Serial Number is 901D-960F

 Directory of E:\My_DATA\LooK_Here\Year_2008\02_21_08

02/20/2008 12:52 PM <DIR> .
02/20/2008 12:52 PM <DIR> ..
02/02/2008 11:35 AM 851,744 01ID0801.dat
02/05/2008 11:35 AM 61 01ID0801.sta
02/09/2008 11:36 AM 299,216 01ID0823.dat
02/09/2008 11:36 AM 61 01ID0823.sta
02/10/2008 11:36 AM 373,018 01ID0827.dat
02/10/2008 11:36 AM 61 01ID0827.sta
02/11/2008 11:37 AM 49,258 01ID0855.dat
02/11/2008 11:37 AM 61 01ID0855.sta
02/15/2008 11:37 AM 427,803 01ID0861.dat
02/15/2008 11:37 AM 61 01ID0861.sta
02/18/2008 11:37 AM 282,035 01ID0865.dat
02/18/2008 11:38 AM 61 01ID0865.sta
1604 File(s) 386,639,292 bytes
2 Dir(s) 78,292,127,744 bytes free
.End Flat File Example.

.Output Desired.
Record 1
Date-Time = 02/02/2008 11:35 AM
Size = 851,744
Name = 0801 (This is always 4 characters to left of the .)
Record 2
Date-Time = 02/09/2008 11:36 AM
Size = 299,216
Name = 0823 (This is always 4 characters to left of the .)
etc.
Thanks for any help on this.

Replies are listed 'Best First'.
Re: Parsing Values from a Flat File?
by graff (Chancellor) on Feb 22, 2008 at 02:58 UTC
    I'm guessing there's some process (apparently running on a Windows box) that "autogenerates" this flat file. If your perl script is going to run on that same box, you might save yourself some trouble by leaving out the initial separate process, and just using the perl script to scan the path(s) of interest, and collect the file name, date and size information for each.

    Depending on the "big picture" (whatever it is you are really trying to accomplish, how many directories there are, etc.), you could either use opendir and readdir, or some variant of File::Find.

    One of those things, combined with stat (or the -X functions) and maybe POSIX should make for a fairly stable and portable solution.
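A minimal sketch of that idea, assuming a single directory and no recursion. The helper name scan_dat_files and the hash keys are made up for illustration; opendir, readdir, stat and POSIX::strftime are the standard pieces mentioned above.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: list the .dat files in one directory with their
# modification time and size, without going through the flat file.
sub scan_dat_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open directory $dir: $!";
    my @records;
    for my $file (sort readdir $dh) {
        next unless $file =~ /(\w{4})\.dat$/i;    # 4 characters left of ".dat"
        my $name = $1;
        my @st = stat("$dir/$file") or next;      # skip entries we cannot stat
        push @records, {
            stamp => strftime('%m/%d/%Y %I:%M %p', localtime $st[9]),  # mtime
            size  => $st[7],                                           # bytes
            name  => $name,
        };
    }
    closedir $dh;
    return @records;
}

# Scan the directory given on the command line, or the current one.
my $dir = @ARGV ? $ARGV[0] : '.';
my $n = 0;
for my $r (scan_dat_files($dir)) {
    printf "Record %d\nDate-Time = %s\nSize = %s\nName = %s\n",
        ++$n, @{$r}{qw(stamp size name)};
}
```

For a whole tree, File::Find could replace the opendir/readdir loop while the stat and strftime calls stay the same.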

Re: Parsing Values from a Flat File?
by Joost (Canon) on Feb 21, 2008 at 19:34 UTC
Re: Parsing Values from a Flat File?
by NetWallah (Canon) on Feb 21, 2008 at 20:05 UTC
    This Regex should help you:
    m~([\d/]+\s+[\d\:]+\s+\w+)\s+([\d,]+).+?(\w{4})\.~;
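    In list context a successful match returns a 3-element list containing the timestamp, the file size, and the 4 characters before the dot. Note that it matches any extension, so add a test for .dat if you only want those lines.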
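A short sketch of how that pattern might be used in a loop. The sample data is copied from the question; the only change to the pattern is the trailing `dat\b`, added here so the .sta lines are skipped. The hash keys are just illustrative names.

```perl
use strict;
use warnings;

my $listing = <<'LISTING';
02/20/2008 12:52 PM <DIR> .
02/02/2008 11:35 AM 851,744 01ID0801.dat
02/05/2008 11:35 AM 61 01ID0801.sta
02/09/2008 11:36 AM 299,216 01ID0823.dat
LISTING

my @records;
for my $line (split /\n/, $listing) {
    # List assignment yields the number of captures, which is 0 when the
    # match fails, so non-matching lines fall through to "next".
    my ($stamp, $size, $name) =
        $line =~ m~([\d/]+\s+[\d\:]+\s+\w+)\s+([\d,]+).+?(\w{4})\.dat\b~
        or next;
    push @records, { stamp => $stamp, size => $size, name => $name };
}

my $n = 0;
for my $r (@records) {
    printf "Record %d\nDate-Time = %s\nSize = %s\nName = %s\n",
        ++$n, @{$r}{qw(stamp size name)};
}
```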

         "As you get older three things happen. The first is your memory goes, and I can't remember the other two... " - Sir Norman Wisdom

Re: Parsing Values from a Flat File?
by GrandFather (Saint) on Feb 21, 2008 at 19:42 UTC

    It's hard to see where your problem lies, my crystal ball is on the blink currently, but for this sort of job generally a combination of split and regular expressions does the trick.

    Note that for posting code snippets we like to see self contained examples so a framework for your snippet would look like:

    use strict;
    use warnings;

    # Single-quoted heredoc tag: keeps the backslashes in the path literal
    # (in an interpolating heredoc, \L would lowercase the rest of the data).
    my $fileData = <<'FILEDATA';
     Volume in drive E is NEW Data
     Volume Serial Number is 901D-960F

     Directory of E:\My_DATA\LooK_Here\Year_2008\02_21_08

    02/20/2008 12:52 PM <DIR> .
    02/20/2008 12:52 PM <DIR> ..
    02/02/2008 11:35 AM 851,744 01ID0801.dat
    02/05/2008 11:35 AM 61 01ID0801.sta
    02/09/2008 11:36 AM 299,216 01ID0823.dat
    02/09/2008 11:36 AM 61 01ID0823.sta
    02/10/2008 11:36 AM 373,018 01ID0827.dat
    02/10/2008 11:36 AM 61 01ID0827.sta
    02/11/2008 11:37 AM 49,258 01ID0855.dat
    02/11/2008 11:37 AM 61 01ID0855.sta
    02/15/2008 11:37 AM 427,803 01ID0861.dat
    02/15/2008 11:37 AM 61 01ID0861.sta
    02/18/2008 11:37 AM 282,035 01ID0865.dat
    02/18/2008 11:38 AM 61 01ID0865.sta
    1604 File(s) 386,639,292 bytes
    2 Dir(s) 78,292,127,744 bytes free
    FILEDATA

    # Open the string as an in-memory file so the loop reads it line by line.
    open my $inFile, '<', \$fileData or die "Cannot open in-memory file: $!";
    while (<$inFile>) {
        ...
    }
    close $inFile;
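One possible way to fill in the loop body with the split-plus-regex combination, as a sketch only. The helper name parse_line is hypothetical, and it assumes file names contain no spaces, which holds for the sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: split one listing line on whitespace, then use a
# regex only to pick out the 4 characters before ".dat".
sub parse_line {
    my ($line) = @_;
    my ($date, $time, $ampm, $size, $file) = split ' ', $line;
    # Header, <DIR>, .sta and summary lines all fail this check.
    return unless defined $file and $file =~ /(\w{4})\.dat$/i;
    return { stamp => "$date $time $ampm", size => $size, name => $1 };
}

my @sample = (
    ' Volume in drive E is NEW Data',
    '02/20/2008 12:52 PM <DIR> .',
    '02/02/2008 11:35 AM 851,744 01ID0801.dat',
    '02/05/2008 11:35 AM 61 01ID0801.sta',
    '1604 File(s) 386,639,292 bytes',
);

my $n = 0;
for my $line (@sample) {
    my $rec = parse_line($line) or next;   # undef for lines we skip
    printf "Record %d\nDate-Time = %s\nSize = %s\nName = %s\n",
        ++$n, @{$rec}{qw(stamp size name)};
}
```

Inside the framework above, the same call would go in the while loop as `my $rec = parse_line($_) or next;`.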

    Perl is environmentally friendly - it saves trees
Re: Parsing Values from a Flat File?
by roboticus (Chancellor) on Feb 21, 2008 at 19:35 UTC
    batcater98:

    I have done this before, but having troubles with the different separators in this one.
    Since your lines are fixed-format, you could use substr to break the line apart.
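A rough sketch of the substr approach. The column offsets are assumptions (a 20-character date/time field, an 18-character size column, the name starting at column 39) and need checking against the real export, since the spacing of the posted sample may not match the file. The helper name parse_fixed is made up, and the sample line is built with sprintf so that it matches the assumed layout.

```perl
use strict;
use warnings;

# Hypothetical helper: slice one fixed-width listing line by column.
# All offsets below are assumptions about the layout.
sub parse_fixed {
    my ($line) = @_;
    return unless $line =~ /\.dat\s*$/i;   # keep only .dat entries
    return if length($line) < 40;          # too short for the assumed layout
    my $stamp = substr $line, 0, 20;       # date, two spaces, time
    my $size  = substr $line, 20, 18;      # right-aligned size column
    my $file  = substr $line, 39;          # file name to end of line
    $size =~ s/^\s+//;                     # trim the column padding
    $file =~ s/\s+$//;
    my $name = substr $file, -8, 4;        # the 4 characters left of ".dat"
    return ($stamp, $size, $name);
}

# Sample line laid out to match the assumed columns.
my $line = sprintf '%-10s  %-8s %17s %s',
    '02/02/2008', '11:35 AM', '851,744', '01ID0801.dat';

my ($stamp, $size, $name) = parse_fixed($line);
print "Date-Time = $stamp\nSize = $size\nName = $name\n";
```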

    ...roboticus

      unpack too

      One of my favourites and often more readable.

      For example:

      my ( $flag, $value ) = unpack 'A A2', $flag_entry;

      This will split $flag_entry into a 1-char field followed by a 2-char field. If $flag_entry is C13, then $flag will get "C" and $value will get "13".
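Applied to the directory listing, an unpack template might look like the sketch below. The field widths are assumptions about the column layout (10-character date, 8-character time, 17-character right-aligned size, name from column 39) and the names are assumed to be 8.3 style like 01ID0801.dat; parse_unpack is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper; the template widths are assumptions and should be
# checked against the real export.
sub parse_unpack {
    my ($line) = @_;
    return unless $line =~ /\.dat\s*$/i;   # keep only .dat entries
    return if length($line) < 40;          # "x" dies if the line is too short
    my ($date, $time, $size, $file) = unpack 'A10 x2 A8 x A17 x A*', $line;
    $size =~ s/^\s+//;                     # "A" strips trailing spaces only
    my ($name) = unpack 'x4 A4', $file;    # skip 4 chars, take the next 4
    return ("$date $time", $size, $name);
}

# Sample line laid out to match the assumed columns.
my $line = sprintf '%-10s  %-8s %17s %s',
    '02/09/2008', '11:36 AM', '299,216', '01ID0823.dat';

my ($stamp, $size, $name) = parse_unpack($line);
print "Date-Time = $stamp\nSize = $size\nName = $name\n";
```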