G'day GrandFather,
[Sorry, a bit late to the party; I haven't logged in for a week and a half.]
I don't know what sort of variations may exist for your input data. Purely on what's shown, here's how I might get the data into a canonical form suitable for subsequent processing (via split, Text::CSV, or other).
parse_000.pl:
#!/usr/bin/env perl

use strict;
use warnings;
use autodie;

my $data_file = 'data_000.txt';
my $re = qr{^([^0-9-]+?)\s+([0-9-][ 0-9.-]+?)\s*$};

my @data;

open my $fh, '<', $data_file;

while (<$fh>) {
    next unless /$re/;
    my ($site, $info) = ($1, $2);
    $info =~ y/ /\t/s;
    push @data, join "\t", $site, $info;
}

# For demo only
use Data::Dump;
dd \@data;
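For anyone reading the pattern cold, here is the same regex spread out with the /x modifier and commented. This is only a readability sketch: $re_x and the two sample lines are names and data I've made up for illustration (the lines are copied from the input shown in this post), and the behaviour should be the same as $re.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Same pattern as $re, laid out with /x.
# Under /x, whitespace inside a bracketed character class is still
# significant, so "[ 0-9.-]" continues to match a literal space.
my $re_x = qr{
    ^
    ( [^0-9-]+? )           # site name: shortest run without digits or hyphens
    \s+                     # whitespace between the name and the numbers
    ( [0-9-] [ 0-9.-]+? )   # numeric columns: first char is a digit or minus
    \s* $                   # optional trailing whitespace, then end of line
}x;

# Hypothetical sample lines: one header line and one data line.
my @samples = (
    "Site Longitude Latitude Elvn U.T. PA Alt\n",
    "Great Barrier Is 175 25. -36 15. 0 4 34 15 312 13\n",
);

for my $line (@samples) {
    if ($line =~ $re_x) {
        print "data:   site=[$1] info=[$2]\n";
    }
    else {
        print "skipped: $line";
    }
}
```

The header line is skipped because it contains no digits at all; the title line is skipped because the first capture cannot cross the hyphen in "Annular-Total".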
Input:
$ cat data_000.txt
Annular-Total Eclipse of 2023 Apr 20 - multisite predictions
                                                1st Contact
Site             Longitude Latitude  Elvn     U.T.      PA  Alt
                    o   '     o   '     m    h  m  s     o    o
Auckland          174 45.   -36 55.     0    4 33 59   313   13
Blenheim          173 55.   -41 35.    30    4 40 34   326   11
Cape Palliser     175 25.   -41 35.     0    4 42 28   327    9
Cape Reinga       172 45.   -34 25.    50    4 30 11   307   17
Carterton         175 35.   -41  5.     0    4 40 35   324   10
Dannevirke        176  5.   -40 15.   200    4 39  9   321   10
East Cape         178 35.   -37 45.     0    4 37 58   315   10
Featherston       175 25.   -41  5.    40    4 40 36   325   10
Gisborne          178  5.   -38 45.     0    4 38 29   317   10
Great Barrier Is  175 25.   -36 15.     0    4 34 15   312   13
Output:
$ ./parse_000.pl
[
  "Auckland\t174\t45.\t-36\t55.\t0\t4\t33\t59\t313\t13",
  "Blenheim\t173\t55.\t-41\t35.\t30\t4\t40\t34\t326\t11",
  "Cape Palliser\t175\t25.\t-41\t35.\t0\t4\t42\t28\t327\t9",
  "Cape Reinga\t172\t45.\t-34\t25.\t50\t4\t30\t11\t307\t17",
  "Carterton\t175\t35.\t-41\t5.\t0\t4\t40\t35\t324\t10",
  "Dannevirke\t176\t5.\t-40\t15.\t200\t4\t39\t9\t321\t10",
  "East Cape\t178\t35.\t-37\t45.\t0\t4\t37\t58\t315\t10",
  "Featherston\t175\t25.\t-41\t5.\t40\t4\t40\t36\t325\t10",
  "Gisborne\t178\t5.\t-38\t45.\t0\t4\t38\t29\t317\t10",
  "Great Barrier Is\t175\t25.\t-36\t15.\t0\t4\t34\t15\t312\t13",
]
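As a rough idea of the "subsequent processing" step, here's a minimal sketch that uses split on one of those tab-separated records. The field names in @fields, and the parse_line and to_decimal subs, are my own guesses based on the header lines (degrees and minutes for longitude and latitude, elevation in metres, U.T. as h m s); adjust to suit the real data.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical field names, inferred from the header lines of data_000.txt.
my @fields = qw{lon_deg lon_min lat_deg lat_min elvn_m ut_h ut_m ut_s pa alt};

# Convert degrees and minutes to decimal degrees, keeping the sign of the
# degrees part (tested as a string so that "-0" would stay negative).
sub to_decimal {
    my ($deg, $min) = @_;
    my $sign = $deg =~ /^-/ ? -1 : 1;
    return $sign * (abs($deg) + $min / 60);
}

# Split one canonical record into a hashref keyed by field name.
sub parse_line {
    my ($line) = @_;
    my ($site, @values) = split /\t/, $line;
    my %rec = (site => $site);
    @rec{@fields} = @values;
    $rec{lon} = to_decimal(@rec{qw{lon_deg lon_min}});
    $rec{lat} = to_decimal(@rec{qw{lat_deg lat_min}});
    return \%rec;
}

my $rec = parse_line("Auckland\t174\t45.\t-36\t55.\t0\t4\t33\t59\t313\t13");

printf "%s: lon %.4f, lat %.4f, first contact %02d:%02d:%02d UT\n",
    @{$rec}{qw{site lon lat ut_h ut_m ut_s}};
```

For the Auckland record this prints "Auckland: lon 174.7500, lat -36.9167, first contact 04:33:59 UT". If the site names could ever contain tabs or quotes, Text::CSV would be the safer choice over a bare split.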
— Ken
In reply to Re: Module for parsing tables from plain text document
by kcott
in thread Module for parsing tables from plain text document
by GrandFather