My preference would be to first decompose the notes field into all of its parts, store them in some structure (such as a hash), and only then extract the information you are looking for.
Here, a split does exactly that: because the field names are captured in the regex, split returns them along with the values.
use strict;
use warnings;
use Data::Dumper;

while (<DATA>) {
    # Quick extraction of the quoted notes field (for convenience only)
    my ($notesField) = /"(.*)"/;

    # The capture group makes split return the field names as well as the values
    my @parts = split /\s?(First Name|Last Name|Address|City|State|ZIP Code|E-mail): /, $notesField;

    shift @parts;    # discard the empty element before the first field name
    my %parts = @parts;
    print Dumper \%parts;
}

__DATA__
,,,,,,,,,,,,,,,,,,,,,,,,,"First Name: Dobbin Last Name: David L. Address: david@adamsonanddobbin.com City: PO Box 1326407 Pido Road State: Peterborough ZIP Code: ON Country: K9J 7H5 First Name: Dobbin Last Name: David L. E-mail: david@adamsonanddobbin.com Address: PO Box 1326407 Pido Road City: Peterborough State: ON ZIP Code: K9J 7H5",,,,,,Home,743 7790,Other,742 4524,Work,745 5751,,,,,,,,,,,Adamson And Dobbin Ltd. Mechanical Contractors,,General Manager,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,"First Name: Chapleau Last Name: Kathy, Ken Address: 666 FrankFirst Name: Chapleau Last Name: Kathy, Ken City: 666 Frank",,,,,,Home,876-9863,,,,,,,,,,,,,,,Admiralty Hall,,Accountant,,,,,,,,,,,
The result would be a hash like this:
$VAR1 = {
          'First Name' => 'Dobbin',
          'ZIP Code' => 'K9J 7H5',
          'Address' => 'PO Box 1326407 Pido Road',
          'Last Name' => 'David L.',
          'City' => 'Peterborough',
          'E-mail' => 'david@adamsonanddobbin.com',
          'State' => 'ON'
        };
$VAR1 = {
          'First Name' => 'Chapleau',
          'Address' => '666 Frank',
          'Last Name' => 'Kathy, Ken',
          'City' => '666 Frank'
        };
Please note that my extraction of the notes field is only done using a regex for convenience. Your approach using Text::CSV is clearly the right way to do it.
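For completeness, here is a rough sketch of how the notes field could be pulled out with Text::CSV instead of the regex. The sub name notes_field and the sample line are made up for illustration, and I am assuming the notes field is the only column containing "First Name: "; adjust that test to your real data.

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical helper: return the first CSV field that looks like the notes field.
# Assumption: the notes field is the only one containing "First Name: ".
sub notes_field {
    my ($line) = @_;
    my $csv = Text::CSV->new({ binary => 1 })
        or die "Cannot use Text::CSV: " . Text::CSV->error_diag;
    $csv->parse($line)
        or die "Cannot parse line: " . $csv->error_diag;
    my ($notes) = grep { /First Name: / } $csv->fields;
    return $notes;
}

# Shortened, made-up sample line (not taken from the real file)
my $line = q{,,"First Name: Chapleau Last Name: Kathy, Ken City: 666 Frank",,Home,876-9863};
print notes_field($line), "\n";
```

Unlike the regex, this copes with commas and quotes inside other columns, since the module does the field splitting.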
Note that a hash loses duplicates: if a field such as First Name appears twice, the later value overwrites the earlier one and only the last survives.
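If the duplicates matter, one possible variation is to collect every value in an array per field name. This is only a sketch: the sub name parse_notes and the shortened sample string are my own, and the field list is the same one used in the split above.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: split a notes string into field => [values],
# so repeated fields are kept instead of overwritten.
sub parse_notes {
    my ($notes) = @_;
    my @parts = split /\s?(First Name|Last Name|Address|City|State|ZIP Code|E-mail): /, $notes;
    shift @parts;    # drop whatever precedes the first field name

    my %fields;
    # Consume the list two elements at a time: field name, then its value
    while ( my ($name, $value) = splice @parts, 0, 2 ) {
        push @{ $fields{$name} }, $value // '';
    }
    return \%fields;
}

# Shortened, made-up sample with repeated fields
my $notes = 'First Name: Dobbin Last Name: David L. City: PO Box First Name: Dobbin City: Peterborough';
print Dumper parse_notes($notes);
```

You can then decide per field which value to trust, for instance the last one or the one that passes a sanity check.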
In reply to "Re: Parsing a complex csv, cleaning it up, and exporting it" by hdb, in thread "Parsing a complex csv, cleaning it up, and exporting it" by scotttromley