Re: Removing multiple lines

From your example, what distinguishes a block appears to be the first field (an ID?), or the name of the person. Is that true?

However, your example doesn't match your description:

Before Snip ---
"SimpsonH","Homer","Simpson","NULL","648","0","218 555","Nuclear Control","Nuclear Operator","SimpsonM"
"SimpsonH","Homer","Simpson","NULL","647","0","218 555","Nuclear Control","Nuclear Operator","SimpsonM"
"SimpsonH","Homer","Simpson","NULL","648","0","218 555","Nuclear Control","Nuclear Operator","BurnsM"
After Snip ---
"SimpsonH","Homer","Simpson","NULL","648","0","218 555","Nuclear Control","Nuclear Operator","SimpsonM"

The line containing "SimpsonM" and "648" is the first in the block.
??

In any case, consider:

#!/usr/bin/perl
use strict;
use warnings;

# Keep one line per ID; a later line with the same ID overwrites the earlier one.
my %output_hash;
while (<DATA>) {
    my ($id_field) = split /,/, $_;    # the first CSV field identifies the block
    $output_hash{$id_field} = $_;
}

# Quickie Print (hash order is not guaranteed)
print values %output_hash;

# Or loop around it if there is more to be done:
# foreach my $id_key (sort keys %output_hash) {
#     # the more to be done stuff
#     print $output_hash{$id_key};
# }

__DATA__
"SimpsonH","Homer","Simpson","NULL","648","0","218 555","Nuclear Control","Nuclear Operator","SimpsonM"
"SimpsonH","Homer","Simpson","NULL","647","0","218 555","Nuclear Control","Nuclear Operator","SimpsonM"
"SimpsonH","Homer","Simpson","NULL","648","0","218 555","Nuclear Control","Nuclear Operator","BurnsM"
"SimpsonB","Bart","Simpson","NULL","748","0","218 555","Springfield Elementary","Student","SimpsonM"
"SimpsonB","Bart","Simpson","NULL","748","0","218 555","Springfield Elementary","Student","SimpsonH"
"SimpsonB","Bart","Simpson","NULL","748","1","218 555","Springfield Elementary","Student","SkinnerP"

Getting the output sorted to suit is left as an exercise for you.
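
If "sorted to suit" simply means keeping the IDs in the order they first show up in the input, one possible sketch follows. The sub name last_line_per_id, the @order array, and the inline @sample list are made up for illustration; your real data would come from a file instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the last line seen for each ID,
# with the IDs kept in the order they first appeared in the input.
sub last_line_per_id {
    my @lines = @_;
    my (%last_line, @order);
    for my $line (@lines) {
        my ($id_field) = split /,/, $line;
        # Remember each ID the first time we meet it
        push @order, $id_field unless exists $last_line{$id_field};
        # A later line with the same ID replaces the earlier one
        $last_line{$id_field} = $line;
    }
    return map { $last_line{$_} } @order;
}

# Sample lines taken from the example data above
my @sample = (
    qq{"SimpsonH","Homer","Simpson","NULL","648","0","218 555","Nuclear Control","Nuclear Operator","SimpsonM"\n},
    qq{"SimpsonH","Homer","Simpson","NULL","647","0","218 555","Nuclear Control","Nuclear Operator","SimpsonM"\n},
    qq{"SimpsonH","Homer","Simpson","NULL","648","0","218 555","Nuclear Control","Nuclear Operator","BurnsM"\n},
    qq{"SimpsonB","Bart","Simpson","NULL","748","0","218 555","Springfield Elementary","Student","SimpsonM"\n},
    qq{"SimpsonB","Bart","Simpson","NULL","748","0","218 555","Springfield Elementary","Student","SimpsonH"\n},
    qq{"SimpsonB","Bart","Simpson","NULL","748","1","218 555","Springfield Elementary","Student","SkinnerP"\n},
);

print last_line_per_id(@sample);
```

If you want the first line of each block rather than the last, assign to the hash only when the ID has not been seen yet.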

Also consider using Text::CSV to manipulate CSV data like the type you've presented as an example.
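
A minimal sketch of what that could look like, assuming Text::CSV is installed from CPAN (it is not a core module). The helper name parse_line is hypothetical. The main gain over split /,/ is that a comma inside a quoted field does not break the record apart, and the surrounding quotes are stripped for you.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;    # CPAN module, assumed to be installed

my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });

# Hypothetical helper: parse one CSV line and return its fields
# with the surrounding double quotes already removed.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    $csv->parse($line) or die "Cannot parse line: $line";
    return $csv->fields;
}

my @fields = parse_line(
    qq{"SimpsonH","Homer","Simpson","NULL","648","0","218 555","Nuclear Control","Nuclear Operator","SimpsonM"\n}
);
print "ID: $fields[0], last field: $fields[-1]\n";
```

With the ID extracted this way, the hash key becomes SimpsonH rather than "SimpsonH" with its quotes still attached.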

Be Appropriate && Follow Your Curiosity


Replies are listed 'Best First'.
Re^2: Removing multiple lines
by rycher (Acolyte) on May 04, 2009 at 00:58 UTC

I solved it by adding more data to the beginning, so basically I cheated by not using Perl. :-\ There is an audit_date stamp in the MySQL database that this information is extracted from. I simply added the audit_date field and removed everything that wasn't 'audited' in 2009. Perhaps not the ideal solution, since that particular database gets audited twice a year, but it will do for now.
