sroux has asked for the wisdom of the Perl Monks concerning the following question:

Dear gurus, it's all in the title: I wish to split columns following the header pattern (a strange pattern indeed). The + signs in the separator row mark the column boundaries, which then act as fixed-width boundaries for the rows that follow... Am I clear? Not sure...

Also, as a second step, is there a way of indexing the content by the column title above it (sounds like a hash, doesn't it?)

My example below is badly formatted, but that's the whole idea: the delimiter is where the + signs are. Many thanks for your wisdom.

app      db    comment    time      blabla
+--------+-----+----------+---------+------
 one      a     rge        00:00:00  DFDF
 two      b     fghfjjf    00:00:00  fgfg
 three    cc    fjfjfj     00:00:00  fh
 four     ddd   fjgkk      00:00:00  tjku

Replies are listed 'Best First'.
Re: Split columns according header delimiter
by choroba (Cardinal) on Jul 23, 2014 at 15:55 UTC
    First, I store the lengths of columns to the @lengths array. I then create a template for unpack to process the table.
    #!/usr/bin/perl
    use warnings;
    use strict;
    use Data::Dumper;

    my $header  = <DATA>;
    my $pattern = <DATA>;

    # Each column's width is one '+' plus the run of dashes that follows it.
    my @lengths;
    push @lengths, length $1 while $pattern =~ /(\+-*)/g;

    # Build an unpack template such as "A9A6A11A10A7".
    my $template = join '', map "A$_", @lengths;
    my @names    = unpack $template, $header;

    my @table;
    while (my $line = <DATA>) {
        substr $line, 0, 1, q(); # Remove the leading space.
        my %row;
        @row{@names} = unpack $template, $line;   # hash slice keyed by column title
        push @table, \%row;
    }
    print Dumper \@table;

    __DATA__
    app      db    comment    time      blabla
    +--------+-----+----------+---------+------
     one      a     rge        00:00:00  DFDF
     two      b     fghfjjf    00:00:00  fgfg
     three    cc    fjfjfj     00:00:00  fh
     four     ddd   fjgkk      00:00:00  tjku
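    For the second part of the question (indexing the content by column title), here is a minimal sketch built on the same unpack idea. The names parse_table and %by_app are only illustrative, the sample lines are my guess at the intended spacing, and it assumes the values in the app column are unique (a repeated value would overwrite the earlier row).

```perl
use strict;
use warnings;

# Sample lines; the exact spacing is an assumption based on the separator row.
my @lines = (
    'app      db    comment    time      blabla',
    '+--------+-----+----------+---------+------',
    ' one      a     rge        00:00:00  DFDF',
    ' two      b     fghfjjf    00:00:00  fgfg',
);

# Turn header, separator and data rows into a list of hashes keyed by column title.
sub parse_table {
    my ($header, $pattern, @rows) = @_;
    my @lengths;
    push @lengths, length $1 while $pattern =~ /(\+-*)/g;
    my $template = join '', map "A$_", @lengths;
    my @names    = unpack $template, $header;
    my @table;
    for my $line (@rows) {
        (my $copy = $line) =~ s/^ //;    # drop the leading space of the sample rows
        my %row;
        @row{@names} = unpack $template, $copy;
        push @table, \%row;
    }
    return @table;
}

my @table = parse_table(@lines);

# Index the rows by the value found in the "app" column.
my %by_app;
$by_app{ $_->{app} } = $_ for @table;

print $by_app{two}{comment}, "\n";    # prints "fghfjjf"
```

    Individual cells are also reachable by position and title, for example $table[0]{time}.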
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Split columns according header delimiter
by kennethk (Abbot) on Jul 23, 2014 at 15:20 UTC
    First, please read How do I post a question effectively?. In particular, you should wrap sample input in <code> tags to keep things properly formatted and to preserve white space. As an added bonus in this case, the results are rendered in monospace, which would make your sample clearer. That document also suggests that you demonstrate effort. What have you tried? What worked, or didn't? As well, providing your actual expected output makes things more obvious than trying to explain in words.

    In terms of the question you've asked, the most obvious approach from my perspective would be to use index to find the locations of the + symbol, and then use substr to grab the chunks out of the target row.
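    A rough sketch of the index/substr idea, under the assumption that every column starts at the offset of a + in the separator row and runs up to the next one. The sub names plus_offsets and cut_row are made up for illustration, and the sample row uses guessed spacing.

```perl
use strict;
use warnings;

# Collect the offset of every '+' in the separator row using index.
sub plus_offsets {
    my ($separator) = @_;
    my @offsets;
    my $pos = index $separator, '+';
    while ($pos != -1) {
        push @offsets, $pos;
        $pos = index $separator, '+', $pos + 1;
    }
    return @offsets;
}

# Cut one row into fields; each field runs from one '+' offset to the next.
sub cut_row {
    my ($row, @offsets) = @_;
    my @fields;
    for my $i (0 .. $#offsets) {
        my $start = $offsets[$i];
        my $field = '';
        if ($start < length $row) {
            $field = $i < $#offsets
                ? substr($row, $start, $offsets[$i + 1] - $start)
                : substr($row, $start);    # last column runs to the end of the row
        }
        $field =~ s/^\s+|\s+$//g;          # trim the padding
        push @fields, $field;
    }
    return @fields;
}

my @offsets = plus_offsets('+--------+-----+----------+---------+------');
my @fields  = cut_row(' one      a     rge        00:00:00  DFDF', @offsets);
print join('|', @fields), "\n";            # prints "one|a|rge|00:00:00|DFDF"
```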

    Alternatively, you could split the header on + (split /\+/, $string, since + is a regular expression metacharacter and must be escaped), and then use the four-argument

    substr EXPR,OFFSET,LENGTH,REPLACEMENT

    invocation of substr to destructively take chunks off the front of your target row, each as long as the corresponding run of -s in the header (plus one character for the + itself).
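    A sketch of that split plus four-argument substr variant, in case it helps. The sub names widths_from_separator and eat_row are hypothetical, and the sample row again uses guessed spacing.

```perl
use strict;
use warnings;

# Column widths: each dash run between '+' signs, plus one for the '+' itself.
sub widths_from_separator {
    my ($separator) = @_;
    my @chunks = split /\+/, $separator;
    # A separator that starts with '+' yields an empty first chunk; drop it.
    shift @chunks if @chunks && $chunks[0] eq '';
    return map { 1 + length($_) } @chunks;
}

# Destructively eat the row from the left, one column width at a time.
sub eat_row {
    my ($row, @widths) = @_;
    my @fields;
    for my $width (@widths) {
        my $take  = $width < length($row) ? $width : length($row);
        my $field = substr $row, 0, $take, '';    # returns the chunk and removes it from $row
        $field =~ s/^\s+|\s+$//g;                 # trim the padding
        push @fields, $field;
    }
    return @fields;
}

my @widths = widths_from_separator('+--------+-----+----------+---------+------');
my @fields = eat_row(' one      a     rge        00:00:00  DFDF', @widths);
print join('|', @fields), "\n";                   # prints "one|a|rge|00:00:00|DFDF"
```

    Because eat_row works on its own copy of the row, the caller's string is left untouched.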

    Update: Edit of original node makes much of the content above moot.


    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re: Split columns according header delimiter
by jellisii2 (Hermit) on Jul 24, 2014 at 11:58 UTC
    Text::CSV can use any delimiter you assign it, not just commas. It's defined in the constructor. Need more coffee before commenting. E_READING_COMPREHENSION_FAILURE