zli034 has asked for the wisdom of the Perl Monks concerning the following question:

Hi guys:

This is for a work assignment, and I could really use some good advice right now.

I have an ASCII file that can be opened with Notepad, and it has 4500 columns of plain-text data. I also have another file, called the layout file, in CSV format, which gives me the layout information for the ASCII file.

Here is a piece of the layout file:

Respondent.Serial,Serial number,,,,

Type: Long,,,,,

,,,,Card: 0,Cols: 1-8

As you can see, this says that columns 1-8 of the ASCII file hold the data called Serial number.

I just need to separate the columns of the ASCII file with commas, using the correct column indexes from the layout file. The top of the new CSV data file also has to carry the proper name of each piece of data, in this case Serial number.

I don't want to spend a week aligning this ASCII file by hand, even though it only has 450-something rows. And of course, if I don't write some code this time, I will suffer again the next time this kind of data is given to me. There are about 1000 commas I have to add to each row of the ASCII file.

I appreciate all your knowledge and thoughts.

Zli034


Replies are listed 'Best First'.
Re: Please help me with this ASCII file parsing assignment
by toolic (Bishop) on Aug 06, 2008 at 01:30 UTC
    To parse your layout CSV file, you could use one of the CSV CPAN modules, such as Text::CSV_XS. This would place your starting column positions into an array. A simple example is:
    use strict;
    use warnings;
    use Data::Dumper;
    use Text::CSV_XS;

    my $csv = Text::CSV_XS->new();          # create a new object
    while (<DATA>) {
        my $status  = $csv->parse($_);      # parse a CSV string into fields
        my @columns = $csv->fields();       # get the parsed fields
        print Dumper(\@columns);
    }
    __DATA__
    aa,bb,cc
    0,2,3

    which prints:

    $VAR1 = [
              'aa',
              'bb',
              'cc'
            ];
    $VAR1 = [
              '0',
              '2',
              '3'
            ];

    Then, you could use the column positions to format your ASCII data file, by reading in the data line by line and adding commas as follows:

    use strict;
    use warnings;

    my @cols = qw (5 11);
    while (<DATA>) {
        chomp;
        my $row   = '';
        my $start = 0;
        for my $cs (@cols) {
            my $offset = $start;
            my $length = $cs-$start;
            $row .= substr($_, $offset, $length) . ',';
            $start = $cs;
        }
        $row .= substr($_, $start);
        print "$row\n";
    }
    __DATA__
    12345678901234567890
    abcdefghijklmnopqrst

    this prints:

    12345,678901,234567890
    abcde,fghijk,lmnopqrst

    Perhaps you could adapt these simple examples for your application.
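
    To tie the two pieces together, here is a rough sketch of how the layout records from the question might drive the slicing. It assumes every field record looks like the sample shown (a row whose second cell is the label, followed by a row with a "Cols: A-B" cell, 1-based and inclusive) and that layout cells contain no quoted commas, so a plain split is enough; with a messier layout file, Text::CSV_XS would be the safer parser. The names parse_layout, slice_line and csv_row are made up for this example, and the second layout record and the data line are invented test data.

```perl
use strict;
use warnings;

# Hypothetical helper: turn layout lines into a list of [name, first, last].
# Assumes the record shape shown in the question; blank lines are skipped.
sub parse_layout {
    my @lines = @_;
    my (@fields, $name);
    for my $line (@lines) {
        next unless $line =~ /\S/;
        my @cells = split /,/, $line, -1;
        if (my ($cols) = grep { /^\s*Cols:/ } @cells) {
            # "Cols: 1-8"; a single column such as "Cols: 5" is an assumption
            my ($first, $last) = $cols =~ /Cols:\s*(\d+)(?:\s*-\s*(\d+))?/
                or next;
            push @fields, [ $name, $first, $last // $first ] if defined $name;
            undef $name;
        }
        elsif (@cells > 1 && length($cells[0]) && length($cells[1])
               && $cells[0] !~ /^Type:/) {
            $name = $cells[1];              # label, e.g. "Serial number"
        }
    }
    return @fields;
}

# Cut one fixed-width data line into fields using the parsed ranges.
sub slice_line {
    my ($line, @fields) = @_;
    return map {
        my (undef, $first, $last) = @$_;
        # substr is 0-based, the layout columns are 1-based
        $first <= length($line)
            ? substr($line, $first - 1, $last - $first + 1)
            : '';
    } @fields;
}

# Join values into one CSV row, quoting any value that needs it.
sub csv_row {
    return join ',', map {
        my $v = $_;
        if ($v =~ /[",\r\n]/) { $v =~ s/"/""/g; $v = qq("$v"); }
        $v;
    } @_;
}

my @layout = (
    'Respondent.Serial,Serial number,,,,',
    'Type: Long,,,,,',
    ',,,,Card: 0,Cols: 1-8',
    'Q1,Made-up second field,,,,',          # invented record for the demo
    'Type: Text,,,,,',
    ',,,,Card: 0,Cols: 9-12',
);
my @fields = parse_layout(@layout);

print csv_row(map { $_->[0] } @fields), "\n";            # header row
print csv_row(slice_line($_, @fields)), "\n" for '00000042abcdXYZ';
```

    In real use the layout lines and data lines would come from the two files instead of the inline lists, and records spread over more than one "Card" would need extra handling that this sketch does not attempt.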

Re: Please help me with this ASCII file parsing assignment
by hangon (Deacon) on Aug 06, 2008 at 00:47 UTC

    One way is to make a list of the start column for each field, in reverse order, then iterate over your data using substr to insert the commas.
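
    Something along these lines, as a minimal sketch. The start columns (6 and 12, 1-based) are made-up values and insert_commas is just a name picked for the example. Going from right to left means an inserted comma never shifts the position of a column that still has to be processed.

```perl
use strict;
use warnings;

# Made-up 1-based start columns of every field after the first
my @starts = (6, 12);

sub insert_commas {
    my ($line, @starts) = @_;
    # Highest start column first, so earlier offsets stay valid
    for my $start (reverse sort { $a <=> $b } @starts) {
        next if $start - 1 > length($line);     # line too short for this field
        substr($line, $start - 1, 0, ',');      # 4-arg substr inserts in place
    }
    return $line;
}

print insert_commas($_, @starts), "\n"
    for '12345678901234567890', 'abcdefghijklmnopqrst';
```

    Note that this only inserts separators; if a data field can itself contain a comma or a double quote, the values would still need CSV quoting.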

Re: Please help me with this ASCII file parsing assignment
by Lawliet (Curate) on Aug 06, 2008 at 00:30 UTC

    Could we see what you have coded so far? I assume you have at least tried to solve this problem yourself seeing how it explicitly states in the rules that we will not do your homework for you.

    <(^.^-<) <(-^.^<) <(-^.^-)> (>^.^-)> (>-^.^)>