Split columns according header delimiter

sroux has asked for the wisdom of the Perl Monks concerning the following question:

Dear gurus, it is all in the title: I wish to split columns following the header pattern (a strange pattern indeed). The + signs in the delimiter line mark the column boundaries, which then act as fixed-width boundaries for the rows that follow... Am I clear? Not sure...

As a second step, is there also a way of indexing the content by the column titles above (sounds like a hash, doesn't it?)

My example below is badly formatted, but that's the whole idea: the delimiters are where the + signs are. Many thanks for your wisdom.

 app      db    comment    time      blabla
+--------+-----+----------+---------+------
 one      a     rge        00:00:00  DFDF  
 two      b     fghfjjf    00:00:00  fgfg  
 three    cc    fjfjfj     00:00:00  fh    
 four     ddd   fjgkk      00:00:00  tjku

Replies are listed 'Best First'.
Re: Split columns according header delimiter by choroba (Cardinal) on Jul 23, 2014 at 15:55 UTC
First, I store the lengths of the columns in the @lengths array. I then create a template for unpack to process the table.

#!/usr/bin/perl
use warnings;
use strict;

use Data::Dumper;

my $header  = <DATA>;
my $pattern = <DATA>;
substr $header, 0, 1, q(); # Remove the leading space, so names align like the rows.

# Each column is as wide as a '+' followed by its run of dashes.
my @lengths;
push @lengths, length $1 while $pattern =~ /(\+-*)/g;

# One fixed-width "A<n>" field per column; A strips trailing whitespace.
my $template = join '', map "A$_", @lengths;
my @names = unpack $template, $header;

my @table;
while (my $line = <DATA>) {
    substr $line, 0, 1, q(); # Remove the leading space.
    my %row;
    @row{@names} = unpack $template, $line;
    push @table, \%row;
}
print Dumper \@table;

__DATA__
 app      db    comment    time      blabla
+--------+-----+----------+---------+------
 one      a     rge        00:00:00  DFDF  
 two      b     fghfjjf    00:00:00  fgfg  
 three    cc    fjfjfj     00:00:00  fh    
 four     ddd   fjgkk      00:00:00  tjku

لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
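For the second part of the question (indexing by column title), an array of hashes keyed by column name, like the @table built above, can be queried directly. The following is only a minimal sketch: the rows are typed in by hand from the sample in the question rather than parsed, %by_app and @dbs are illustrative names, and it assumes the values in the app column are unique.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hand-copied rows from the question's sample, in the array-of-hashes
# shape that the unpack-based parse produces (not parsed here).
my @table = (
    { app => 'one',   db => 'a',   comment => 'rge',     time => '00:00:00', blabla => 'DFDF' },
    { app => 'two',   db => 'b',   comment => 'fghfjjf', time => '00:00:00', blabla => 'fgfg' },
    { app => 'three', db => 'cc',  comment => 'fjfjfj',  time => '00:00:00', blabla => 'fh'   },
    { app => 'four',  db => 'ddd', comment => 'fjgkk',   time => '00:00:00', blabla => 'tjku' },
);

# Pull one whole column out by its title.
my @dbs = map { $_->{db} } @table;

# Index the rows by the value of the 'app' column
# (assumption: those values are unique, otherwise later rows win).
my %by_app = map { $_->{app} => $_ } @table;

print "dbs: @dbs\n";
print "comment for 'three': $by_app{three}{comment}\n";
```

If the app values can repeat, storing an array reference of rows per key would avoid silently overwriting earlier rows.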
Re: Split columns according header delimiter by kennethk (Abbot) on Jul 23, 2014 at 15:20 UTC
First, please read How do I post a question effectively?. In particular, you should wrap sample input in `<code>` tags to keep things properly formatted and to preserve white space. As an added bonus in this case, the results are rendered in monospace, which would make your sample clearer. That document also suggests that you demonstrate effort: what have you tried? What worked, or didn't? As well, providing your actual expected output makes things more obvious than trying to explain in words.

In terms of the question you've asked, the most obvious approach from my perspective would be to use index to find the locations of the `+` symbols in the delimiter line, and then use substr to grab the chunks out of the target row. Alternatively, you could split the delimiter line on `+` (`split /\+/, $string`, since `+` is a regular expression metacharacter), and then use the `substr EXPR,OFFSET,LENGTH,REPLACEMENT` invocation of substr to destructively take chunks of your target row according to the length of each chunk of `-`s from the header.

Update: Edit of original node makes much of the content above moot.

#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.
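Both suggestions above can be sketched roughly as follows. This is an illustration, not kennethk's code: the sub names cut_by_index and cut_by_split are made up, the two sample lines are copied from the question, and it assumes that the character sitting under each `+` in a data row is only a separator and never part of a value.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Sample lines copied from the question.
my $rule = '+--------+-----+----------+---------+------';
my $row  = ' one      a     rge        00:00:00  DFDF  ';

# Approach 1: index() locates every '+', substr() cuts between neighbours.
sub cut_by_index {
    my ($rule, $line) = @_;
    my @starts;
    my $pos = -1;
    push @starts, $pos while ($pos = index($rule, '+', $pos + 1)) >= 0;

    my @fields;
    for my $i (0 .. $#starts) {
        my $from = $starts[$i] + 1;                 # skip the char under '+'
        my $len  = $i < $#starts
                 ? $starts[$i + 1] - $from          # up to the next '+'
                 : length $line;                    # last column: to end of line
        my $field = $from < length $line ? substr($line, $from, $len) : '';
        $field =~ s/\s+\z//;                        # drop the padding
        push @fields, $field;
    }
    return @fields;
}

# Approach 2: split the rule on '+', then eat the row chunk by chunk
# with the destructive 4-argument form of substr.
sub cut_by_split {
    my ($rule, $line) = @_;                         # $line is a copy, safe to consume
    my @dashes = grep { length } split /\+/, $rule; # drop the empty leading field

    my @fields;
    for my $chunk (@dashes) {
        my $width = 1 + length $chunk;              # the '+' plus its dashes
        $width = length $line if $width > length $line;
        my $field = substr $line, 0, $width, '';    # remove and return the chunk
        $field =~ s/^\s+|\s+\z//g;                  # trim both sides
        push @fields, $field;
    }
    return @fields;
}

print join('|', cut_by_index($rule, $row)), "\n";
print join('|', cut_by_split($rule, $row)), "\n";
```

The first version lets the last column run to the end of the line, while the second cuts it at the width of the last run of dashes, so the two can differ when a final value is wider than its header rule.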
Re: Split columns according header delimiter by jellisii2 (Hermit) on Jul 24, 2014 at 11:58 UTC
~~Text::CSV can use any delimiter you assign it, not just commas. It's defined in the constructor.~~ Need more coffee before commenting. `E_READING_COMPREHENSION_FAILURE`