in reply to making a single column out of a two-column text file
sub multicolumn_to_single_column
{
    my $min_gutter_width = shift;
    my @lines = @_;

    # the first task is to make a regex pattern from the input data,
    # so that it knows where all the space and non-space columns are.
    my $mask = '';
    $mask |= $_ for @lines;
    # this is a hack: it only works because the ascii space character
    # has only one bit set. Note that this solution probably won't
    # handle input well that uses tabs for spacing.

    my @pattern = do {
        my $p;
        map { $p++ % 2 ? ".{$_}" : "(.{$_})" }
        map { length }
        split /( {$min_gutter_width,})/, $mask
    };
    # you could dump @pattern here to see what's really going on.

    my $ncols = 1+@pattern >> 1;
    my $pattern = join '', @pattern;

    # Now that we have the pattern, use it to parse all the input lines
    # into rows of columns of text. The separating spaces are ignored.
    my @rows = map {
        $_ .= ' ' x (length($mask) - length($_));
        [ /$pattern/ ] # oooooo!
    } @lines;

    # Finally, we're left with the simple matter of inverting the matrix
    # for output.
    map { my $c = $_;
        map { $rows[$_][$c] } 0 .. $#rows
    } 0 .. $#{$rows[0]}
}

# example.
# Note that given sample data requires a gutter width of 5.
# Any less, and the page number column on the right is seen
# as a separate column; any more, and the two columns won't
# be seen as distinct.
my @lines = <DATA>;
chomp @lines;
for ( multicolumn_to_single_column( 5, @lines ) )
{
    print "$_\n";
}

__DATA__
Indice
1. KETER                                   4. HESED                              131
1. Quando la luce dell'infinito
2. Abbiamo diversi e curiosi orologi       23. L'analogia dei contrari           133
                                           24. Sauvez la faible Aischa           136
2. HOKMAH                                  25. Questi misteriosi iniziati        139
                                           26. Tutte le tradizioni della terra   141
3. In hanc utilitatem clementes angeli

Output:
Indice
1. KETER
1. Quando la luce dell'infinito
2. Abbiamo diversi e curiosi orologi
2. HOKMAH
3. In hanc utilitatem clementes angeli
4. HESED                              131
23. L'analogia dei contrari           133
24. Sauvez la faible Aischa           136
25. Questi misteriosi iniziati        139
26. Tutte le tradizioni della terra   141
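If the mask step looks like magic, here is a small sketch of just that part on a toy input. The helper names `column_mask` and `column_pattern` and the three sample strings are made up for illustration; they are not part of the code above. OR-ing the lines together as strings leaves a space only where no line has a non-space character, so the runs of spaces in the mask are the gutters.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): OR all lines together as strings.
# Shorter strings behave as if padded with NUL bytes, and a space (0x20)
# OR'ed with any printable ASCII character never yields a space.
sub column_mask {
    my @lines = @_;
    my $mask  = '';
    $mask |= $_ for @lines;
    return $mask;
}

# Hypothetical helper: turn the mask into a regex source string that
# captures the text columns and skips the gutters between them.
sub column_pattern {
    my ( $min_gutter_width, $mask ) = @_;
    my $gutter = ' {' . $min_gutter_width . ',}';
    my $i      = 0;
    return join '',
        map { $i++ % 2 ? sprintf( '.{%d}', $_ ) : sprintf( '(.{%d})', $_ ) }
        map { length }
        split /($gutter)/, $mask;
}

# Toy input: two text columns of width 3 separated by a 4-space gutter.
my @sample = ( 'ab' . ' ' x 5 . 'cd', 'abc', 'a' . ' ' x 6 . 'xyz' );
my $mask   = column_mask(@sample);

# The OR mangles the letters, so show only space versus non-space.
( my $shape = $mask ) =~ tr/ /#/c;
print "$shape\n";                          # ###    ###
print column_pattern( 4, $mask ), "\n";    # (.{3}).{4}(.{3})
```

The characters in the mask itself are garbage (for example `c` OR `x` comes out as `{`), which is fine because only the positions of the spaces matter.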
jdporter
The 6th Rule of Perl Club is -- There is no Rule #6.
Re: Re: making a single column out of a two-column text file
by allolex (Curate) on Feb 26, 2003 at 13:32 UTC