finding colunm header which crossing multiple rows

cscu2007 has asked for the wisdom of the Perl Monks concerning the following question:

-------------------------------------------------------------------------------------------

Name Company Work Home Work

Name Address Phone Hours

Number

-------------------------------------------------------------------------------------------

John ABC MAIN_STREET 987.111.4567 8

Bob CNN 2nd_Street 875.444.3425 2

I can build a hash table for the contents, but trouble is with the header. I need to find out how to get the column number based on above header, to extract the corresponding values per column. The reason to do this, because the header is dynamic, including orders, but for the same column, the header is always formated as aligned (first letter across multiple rows, as above).

For example, I need to find out which column of "Work Address" from header, then I can grep the corresponding column from the table contents.

Thanks in advance for the help!

Comment on finding colunm header which crossing multiple rows

Replies are listed 'Best First'.
Re: finding colunm header which crossing multiple rows by tybalt89 (Monsignor) on Nov 16, 2018 at 20:07 UTC
`#!/usr/bin/perl # https://perlmonks.org/?node_id=1225917 use strict; use warnings; $_ = <DATA>; my @titles = /\w+/g; my @positions; push @positions, $-[0] while /\w+/g; while( <DATA> ) { /\S/ or last; for my $i (0 .. $#positions) { /^.{$positions[$i]}(\w+)/ and $titles[$i] .= " $1"; } } use Data::Dump 'dd'; dd \@titles; __DATA__ Name Company Work Home + Work Name Address Phone + Hours Number` [download] Outputs: `[ "Name", "Company Name", "Work Address", "Home Phone Number", "Work Hours", ]` [download]	[reply] [d/l] [select]
Re^2: finding colunm header which crossing multiple rows by ikegami (Patriarch) on Nov 16, 2018 at 21:12 UTC
Or `$_ = <DATA>; my @titles = split; my @lens = map length, /\S+\s/g; $lens[-1] = ''; my $pat = join "", map "A".$_, @lens; while (defined( $_ = <DATA> ) && /\S/) { my @more = unpack($pat, $_); for my $i (0..$#titles) { $titles[$i] .= ' '.$more[$i] if length($more[$i]); } }` [download] This can be reduced to the following: `$_ = <DATA>; my @titles = split; my @lens = map length, /\S+\s/g; $lens[-1] = ''; my $pat = join "", map "A".$_, @lens; while (defined( $_ = <DATA> ) && /\S/) { my @more = unpack(" ".$pat, $_); $titles[$_] .= $more[$_] for 0..$#titles; }` [download]	[reply] [d/l] [select]
Re^3: finding colunm header which crossing multiple rows by cscu2007 (Novice) on Nov 16, 2018 at 21:21 UTC
Thank you ikegami! I got the idea to match up string position and concatenate to the final column names	[reply]
Re^2: finding colunm header which crossing multiple rows by cscu2007 (Novice) on Nov 16, 2018 at 21:18 UTC
This is Awesome. Thanks tybalt89!	[reply]
Re: finding colunm header which crossing multiple rows by LanX (Saint) on Nov 16, 2018 at 19:54 UTC
In order to help you, you should post your input text inside code tags `<c>...</c>`. A general answer is to match a regex against the first headline and to count the length of the matches, like that you'll get the width of the columns. update `DB<17> $_='Name Company Work + Home Work' DB<18> push @col,$1 while /(\w+\s*)/g DB<19> x @col 0 'Name ' 1 'Company ' 2 'Work ' 3 'Home ' 4 'Work' DB<20> print map {":".length $_} @col :11:23:27:27:4` [download] Cheers Rolf _{(addicted to the Perl Programming Language :) Wikisyntax for the Monastery FootballPerl is like chess, only without the dice}	[reply] [d/l] [select]
Re^2: finding colunm header which crossing multiple rows by cscu2007 (Novice) on Nov 16, 2018 at 21:24 UTC
Thanks LanX.	[reply]
Re: finding colunm header which crossing multiple rows by BillKSmith (Monsignor) on Nov 17, 2018 at 16:49 UTC
Now that you have solved the header problem, you still have to write your 'flat file database' program. I recommend that you at least look at that section in the book "Higher Order Perl" (ISBN 1558607013). I believe that a free download is still available. Bill	[reply]
Re^2: finding colunm header which crossing multiple rows by kcott (Archbishop) on Nov 17, 2018 at 21:53 UTC
"I believe that a free download is still available." There's a variety of downloading options linked from: Higher-Order Perl. — Ken	[reply]

update