fritz1968 has asked for the wisdom of the Perl Monks concerning the following question:

I am not sure what I need nor how to do it (even after reading through the past discussions). It's all very confusing to me. Let me explain my scenario:

I want to parse through my downloaded bank file (a CSV file and I already figured out how to parse it correctly). I have figured out how to read through each line and separate the debits and credits into various variables such as:

Payments (to me):
Shopify
Etsy
PayPal
Misc
And the same with debits:
Supplies
Shipping
Ads
Online

The issue I am running into is that I need to split all of this out on a per-month basis. My logical mind is telling me to use an @array where $array[1] is January, $array[2] is February, etc. for each month. (Yes, I am skipping $array[0], just to keep it simple.) I could add a hash to each array element to correlate the month with the key names above. I want to add to those values dynamically as I iterate through the CSV file. Hopefully, I explained that correctly.

My issue is that I do not know how to reference, for example, the PayPal payment to me for, say, March, and add to these totals on an ongoing basis. For example:

CSV LINES:

Debit,03/06/2024,Paypal paymemnt, 25.25,debit_card
Debit,03/09/2024,Paypal paymemnt, 10.13,debit_card
Debit,03/15/2024,Etsy paymemnt, 52.23,debit_card
Debit,03/22/2024,Paypal paymemnt, 16.52,debit_card

#/usr/bin/perl
#
use strict;
use warnings;
use Text::CSV;

my @chaseArray;
my $chaseCSV = Text::CSV->new ({ binary => 1 });
open my $inFile, '<', "chase.csv" or die $!;
my @chaseList;
my ($cCosts, $cShip, $cGrovery, $cSocial, $cMonthly, $cMisc,
    $cShopify, $cEtsy, $cPaypal, $cVenmo, $csMisc, $cTotal) = 0;

while (my $row = $chaseCSV->getline ($inFile)) {
    @chaseList = @$row;
    my $month = substr ($chaseList[1], 0, );
    if ( $chaseList[2] =~ /.*Shopify.*/ ) {
        my $temp = $chaseList[3] + %@chaseArray[$month]{Shopify};
        # @%chaseArray[$month]{Shopify} += $chaseList[3];
        %@chaseArray[$month]{Shopify} = $temp;
    }
}
print Dumper \@chaseArray;

I keep getting these errors and I am not sure how to fix them. Any help would be greatly appreciated.


Note:
Once I get the Shopify payments adding correctly, I'll put some elsif statements in there for the rest of the items on my list.

Replies are listed 'Best First'.
Re: Hash of Arrays? OR Array of Hashes
by GrandFather (Saint) on Dec 10, 2024 at 23:11 UTC

    The usual answer for a task such as this would be to use a database. To get a flavor of database use you might like my tutorial Databases made easy. Note that DBD::CSV lets you treat a CSV file as a database!

    Something to think about is how much information about the structure of your data is reflected by the code. Neither arrays nor hashes are a great fit for storing and manipulating your data, which may be why you are having trouble thinking about it and coding it up!

    Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Re: Hash of Arrays? OR Array of Hashes
by choroba (Cardinal) on Dec 10, 2024 at 23:28 UTC
    They say any character combination is valid Perl code. But it's not true. There's no sigil combination like %@. You are interested in a scalar value, so use $chaseArray[$month]{Shopify}.

    Also, to extract the month from the horribly MM/DD/YYYY formatted date, you need to tell substr to use 2 characters.

    Here's an example that actually does something:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Text::CSV_XS;

    my $chaseCSV = 'Text::CSV_XS'->new ({ binary => 1 });
    my $inFile = *DATA;
    my @chaseArray;

    while (my $row = $chaseCSV->getline($inFile)) {
        my @chaseList = @$row;
        my $month = substr $chaseList[1], 0, 2;
        if ($chaseList[2] =~ /(Paypal|Etsy)/) {
            my $company = $1;
            my $temp = $chaseList[3] + ($chaseArray[$month]{$company} // 0);
            $chaseArray[$month]{$company} = $temp;
        }
    }

    use Data::Dumper;
    print Dumper \@chaseArray;

    __DATA__
    Debit,01/06/2024,Paypal paymemnt, 25.25,debit_card
    Debit,01/09/2024,Paypal paymemnt, 10.13,debit_card
    Debit,03/15/2024,Etsy paymemnt, 52.23,debit_card
    Debit,03/22/2024,Paypal paymemnt, 16.52,debit_card

    The output:

    $VAR1 = [
              undef,
              {
                'Paypal' => '35.38'
              },
              undef,
              {
                'Paypal' => '16.52',
                'Etsy' => '52.23'
              }
            ];
    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: Hash of Arrays? OR Array of Hashes
by marinersk (Priest) on Dec 11, 2024 at 11:38 UTC

    Just another dimension of thinking -- absolutely not to detract in any way from the excellent posts thus far -- your line:

    @chaseList = @$row;

    You claim you don't fully understand referencing, but it seems like de-referencing is the actual issue. This line is succinct; $row is a scalar which holds a reference to an array. By preceding this with an array indicator (@$row), you have successfully dereferenced it. You then assign this to your array variable @chaseList, making it easier to deal with later in your code.

    (As an aside, this relationship can also be expressed as @{$row}, which, while largely superfluous, runs the risk of helping the newbie Perl programmer who comes after you to understand your code more intuitively. I personally find this technique very useful because when I deal with deeper structures, a light version of which might be arrays of hashes of arrays, using the extra braces helps me keep track of what is being referenced without having to dereference it as a separate step. But I digress.)
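
    To make the equivalent forms concrete, here is a minimal sketch. The sample $row and the @months structure are made up for illustration; they only mimic the shape of what getline() returns and of an "array of hashes of arrays".

```perl
use strict;
use warnings;

# Hypothetical sample row, shaped like the array reference getline() returns.
my $row = [ 'Debit', '03/06/2024', 'Paypal paymemnt', ' 25.25', 'debit_card' ];

my @copy      = @$row;        # dereference the whole array (makes a copy)
my @same      = @{$row};      # identical, with the explicit braces
my $date      = $row->[1];    # a single element via the arrow operator
my $also_date = ${$row}[1];   # the same element, brace form

# A deeper structure: array of hashes of arrays (names are illustrative).
my @months;
push @{ $months[3]{Paypal} }, 25.25, 10.13;    # braces mark what is dereferenced
my @march_paypal = @{ $months[3]{Paypal} };

print "date: $date\n";
print "March Paypal entries: @march_paypal\n";
```

    The braces become necessary as soon as the thing being dereferenced is an expression such as $months[3]{Paypal} rather than a plain scalar variable.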

    Completely off the original question, a thought about this line:

    my $month = substr ($chaseList[1], 0, );

    I would be more inclined to use split here, breaking the date into an array of three fields separated by a slash. Not only is this likely to be more forgiving of issues like a data source that suddenly drops the leading zero on the month, but it makes it easier to code defensively for a change in format, say, from U.S. to European, or slashes vs. dashes, without a lot of hassle (though handling date formats is a headier topic and I'm not here to fight those religious battles today LOL).

    my @dateParts = split /[\/\-]/, $chaseList[1];
    my $month = $dateParts[0]; # For purposes of example; see note below

    This starts to open the possibility of handling a variety of date format changes from the source of your CSV file, though to make this truly flexible you'd have to add some bulk to the code, and I would envision a subroutine called getMonth() in order to more flexibly extract the proper month field...but even as written it handles dashes or slashes and presence or absence of a leading zero on the month field.
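
    One possible shape for such a getMonth() subroutine is sketched below. This is only an illustration under stated assumptions: the second parameter ($monthFirst, selecting U.S. versus European field order) is invented here, and year-first formats such as 2024-03-06 are deliberately not handled.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the month number from a date string.
# Assumes three fields separated by slashes or dashes. $monthFirst is an
# invented flag: true (the default) means M/D/Y, false means D/M/Y.
# Returns nothing (undef in scalar context) if the date cannot be parsed.
sub getMonth {
    my ($date, $monthFirst) = @_;
    $monthFirst //= 1;
    return unless defined $date;

    my @parts = split /[\/\-]/, $date;
    return unless @parts == 3;

    my $month = $monthFirst ? $parts[0] : $parts[1];
    return unless $month =~ /^\s*(\d{1,2})\s*$/;

    $month = $1 + 0;    # numeric value, so "03" and "3" both become 3
    return unless $month >= 1 && $month <= 12;
    return $month;
}

print getMonth('03/06/2024'), "\n";       # U.S. order, leading zero
print getMonth('3-6-2024'), "\n";         # dashes, no leading zero
print getMonth('06/03/2024', 0), "\n";    # European order
```

    Returning a plain number also means the result can be used directly as an array index without relying on "03" being converted implicitly.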

    And then we have the crux of your question:

    my $temp = $chaseList[3] + %@chaseArray[$month]{Shopify};

    So you have @chaseArray, and you want the element of that array referencing the month.

    $chaseArray[$month]

    Now, without commenting on the rest of the code for the moment, this line suggests you want that to be an array of hashes, referencing that hash by the vendor.

    For readability's sake, I'd accept the performance hit of an extra line of code or two:

    my $vendor = $chaseList[2];
    my $amount = $chaseList[3];
    # Now we can directly reference the hash by vendor inside the array by month number:
    # $chaseArray[$month]{$vendor}

    You could also make the conversion from raw array to data in one line, very Perl... and -- making some potentially terrible assumptions about the third field in the CSV -- further snag the vendor as the first word of the transaction description, if that pattern holds:

    # @chaseList already holds the fields split out by Text::CSV,
    # so a plain list assignment is enough here.
    my ( $tranType, $tranDate, $tranDesc, $tranAmount, $tranMedium,
         @extraStuffJustInCase ) = @chaseList;
    my ($tranVendor, @restOfDesc) = split /\s/, $tranDesc;

    Now that you've done that, it's a cinch to get back to what your commented-out code suggests may have been your original intent:

    $chaseArray[$month]{$tranVendor} += $tranAmount;

    Based on the well-crafted snippet you provided (Thank you!), and if the assumption holds that the vendor is identifiable strictly by the first word in what I have called the Description field, it looks like this might help you eliminate a long and source-code-maintained list of vendors by if ... elsif ... elsif (or case statement) -- and, frankly, that code simplification would be the real benefit of using a hash.
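
    Putting the pieces together, a minimal sketch of the whole accumulation might look like the following. It assumes the vendor really is the first word of the description, and it uses the sample lines from the question as pre-split rows (the shape getline() would return) so that it runs without any CSV module.

```perl
use strict;
use warnings;

# Sample data from the question, already split into fields the way
# Text::CSV would deliver each row.
my @rows = (
    [ 'Debit', '03/06/2024', 'Paypal paymemnt', ' 25.25', 'debit_card' ],
    [ 'Debit', '03/09/2024', 'Paypal paymemnt', ' 10.13', 'debit_card' ],
    [ 'Debit', '03/15/2024', 'Etsy paymemnt',   ' 52.23', 'debit_card' ],
    [ 'Debit', '03/22/2024', 'Paypal paymemnt', ' 16.52', 'debit_card' ],
);

my @chaseArray;    # index = month number; element = hash of vendor totals

for my $row (@rows) {
    my (undef, $tranDate, $tranDesc, $tranAmount) = @$row;
    my ($month)      = split /[\/\-]/, $tranDate;    # first date field
    my ($tranVendor) = split ' ', $tranDesc;         # first word (assumption)

    # += creates the month's hash and the vendor key on first use
    # (autovivification), so no explicit initialisation is needed.
    $chaseArray[$month]{$tranVendor} += $tranAmount;
}

# Report: one line per month and vendor, skipping months with no data.
for my $month (1 .. $#chaseArray) {
    my $totals = $chaseArray[$month] or next;
    for my $vendor (sort keys %$totals) {
        printf "%02d %-8s %8.2f\n", $month, $vendor, $totals->{$vendor};
    }
}
```

    With this sample data the March totals come to 51.90 for Paypal and 52.23 for Etsy. A new vendor appearing in the file needs no code change, which is the simplification described above.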