http://qs1969.pair.com?node_id=11148326

MoodyDreams999 has asked for the wisdom of the Perl Monks concerning the following question:

(a paragraph)

Hey, I'm basically trying to output a excel spreadsheet and filter 2 columns of data, in this code I'm trying to just start off with filtering one column of data which is phone numbers if the phone numbers aren't 10 I export it to a new sheet. I need some help, it works, doesn't filter properly filters some of the data and it seems like it splits my list in too many ways. It halves the list and then outputs some of the data but not all of the incorrect data that. (a paragraph)

use strict; use warnings; use Excel::Writer::XLSX; use DBI; use Time::Piece; use Math::Round; ##$dbhA = DBI->connect("DBI:mysql:$VARDB_database:$VARDB_server:$VARD +B_port", "$VARDB_user", "$VARDB_pass") #or die "Couldn't connect to database: " . DBI->errstr; ##$stmt="SELECT t1.postal_code, t1.state FROM asterisk.vicidial_list +t1 inner join asterisk.vicidial_state vu on vu.state=t1.state order b +y state desc;"; #$sth = $dbhA->prepare($stmt) or die "preparing: ",$dbhA->errstr; #$sth->execute or die "executing: $stmt ", $dbhA->errstr; #$sthrows=$sth->rows; my $rowCount = 0; my $filename = "File.xlsx"; my $workbook = Excel::Writer::XLSX->new( $filename ); open(FH, "<", "Source_173_DEBT_10_31_2022_1.csv" ) or die; my $worksheet = $workbook->add_worksheet('List'); my $worksheet2 = $workbook->add_worksheet('List2'); my $format = $workbook->add_format(); $format->set_center_across(); $worksheet->set_column( 0, 0, 20 ); $worksheet->write(0, 0, "source_id" ); $worksheet->write(0, 1, "first_name" ); $worksheet->write(0, 2, "middle" ); $worksheet->write(0, 3, "last_name" ); $worksheet->write(0, 4, "address1" ); $worksheet->write(0, 5, "city"); $worksheet->write(0, 6, "state"); $worksheet->write(0, 7, "postal_code"); $worksheet->write(0, 8, "phone_number"); $worksheet->write(0, 9, "address3"); $worksheet->write(0, 10,"province"); $worksheet->write(0, 11, "email"); my $rowCount1 = my $rowCount2 = 0; while (<FH>){ my @t= split(',',<FH>); #my @ary = $t->fetchrow_array; ### if(length($t[7]) == 10) FOR one singular if statement matc +hing 10 characters # Instead of 2 statements in one for less than or Greater than + 10 > 10 || < 10 if (length($t[8]) == 10) { $worksheet2->write($rowCount1+1, 0, $t[0]); $worksheet2->write($rowCount1+1, 1, $t[1]); $worksheet2->write($rowCount1+1, 2, $t[2]); $worksheet2->write($rowCount1+1, 3, $t[3]); $worksheet2->write($rowCount1+1, 4, $t[4]); $worksheet2->write($rowCount1+1, 5, $t[5]); $worksheet2->write($rowCount1+1, 6, $t[6]); $worksheet2->write($rowCount1+1, 7, $t[7]); $worksheet2->write($rowCount1+1, 8, $t[8]); $worksheet2->write($rowCount1+1, 9, $t[9]); $worksheet2->write($rowCount1+1, 10, $t[10]); $worksheet2->write($rowCount1+1, 11, $t[11]); $rowCount1++; } else { $worksheet->write($rowCount2+1, 0, $t[0]); $worksheet->write($rowCount2+1, 1, $t[1]); $worksheet->write($rowCount2+1, 2, $t[2]); $worksheet->write($rowCount2+1, 3, $t[3]); $worksheet->write($rowCount2+1, 4, $t[4]); $worksheet->write($rowCount2+1, 5, $t[5]); $worksheet->write($rowCount2+1, 6, $t[6]); $worksheet->write($rowCount2+1, 7, $t[7]); $worksheet->write($rowCount2+1, 8, $t[8]); $worksheet->write($rowCount2+1, 9, $t[9]); $worksheet->write($rowCount2+1, 10, $t[10]); $worksheet->write($rowCount2+1, 11, $t[11]); $rowCount2++; } $rowCount++; } $workbook->close(); print $rowCount1.'-'.$rowCount2.'-'.$rowCount close(FH);
I have a list of 10610 based on my rowCount it looks like it splits it, 5298-7-5305, I see 5298 and 7, 7 is the bad data output, but the rest is missing. I think its because I used split with file handler. I would think if the reading is the issue you could inner join the 2 data sheets

Replies are listed 'Best First'.
Re: Reading, filtering one collumns' data, writing
by Marshall (Canon) on Nov 23, 2022 at 18:00 UTC
    It looks like there was a failed attempt to establish a session with a mysql server? Then I guess you got a CSV file with the data you wanted?

    Some recommendations:

    1. Manually import this CSV into an Excel spreadsheet to see what Excel itself makes of it. Under File, Open, then Excel will see that this is a CSV file and will give options on how to import it.
    2. To parse the CSV file, you should use Text::CSV. This: my @t= split(',',<FH>); is not what you want. The CSV format looks simple but in fact is devilishly difficult to parse correctly in all boundary cases.
    3. Deciding if a line is of interest or not is probably a regex that you run vs phone number. This: length($t[7]) == 10 may not be a very robust test due to imbedded spaces and the like...I would probably look at what exactly is in the CSV file before I decided what to do about it.
    4. You could put just a few lines of the CSV file between code tags in your post so that we see what it is. Choose carefully lines that illustrate you point (one with 10 digit phone number, one with different phone number) etc...
    5. If you are comfortable with SQL, there is a DBI driver for CSV files. That may or may not work out well for your application. It has been many years since I used this driver, but before SQLite, it was a pretty good tool.

    Update: See DBD::CSV for some examples of how to open and use your CSV file like a DB.

    while (<FH>){ my @t= split(',',<FH>); ....
    This will throw away half the lines ... the while statement reads a line into $_. The split reads another line.
    This is closer to what you want, but inferior to using Text::CSV. If your CSV has a field with an embedded comma: "343 Anywhere, Apt 12", then split on comma will see that internal comma and you get an "extra" token. Text::CSV will leave that embedded comma "as is".
    while (<FH>){ chomp; my @t= split(',',$_);