in reply to Trying to print out only unique array values-

Hi Rickman1! Let me start by cleaning up a little bit and then explaining after.

#!/usr/bin/env perl
use strict;
use warnings;
use List::Util 1.45 qw(uniq);

my $char_at=10;
my $num_chars=50;

open my $fh, '<:encoding(UTF-8)', $ARGV[0]
    or die "Could not open '$ARGV[0]' $!";

my @trnAccounts;
while (my $line = <$fh>) {
    chomp $line;
    next unless $line && length($line) >= $char_at;
    push @trnAccounts, substr($line, $char_at, $num_chars);
}

my @unique_accounts = uniq(@trnAccounts);
for my $i (0..$#unique_accounts) {
    print $i, ":", $unique_accounts[$i], "\n";
}
print "total: ", scalar(@trnAccounts), "\n";
print "uniq : ", scalar(@unique_accounts), "\n";

Some of the ways you were using trim() and putting elements into an array were off. Once cleared up, the code above should hopefully be a little easier to follow. Also, later versions of List::Util contain a uniq function. Use that instead. Hope that helps.
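
If your installed List::Util is older than 1.45 and has no uniq, the usual idiom is a %seen hash inside grep. The following is only a sketch: uniq_fallback is a hypothetical name of my own (it is not part of List::Util), and the sample values are made up.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of List::Util: keep the first occurrence
# of each value and preserve the original order.
sub uniq_fallback {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @accounts = qw(111 222 111 333 222);   # made-up sample values
my @unique   = uniq_fallback(@accounts);

print "total: ", scalar(@accounts), "\n";
print "uniq : ", scalar(@unique), "\n";
print join(",", @unique), "\n";
```

The post-increment means each value passes the grep only the first time it is seen, so order is kept and duplicates are dropped.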

Re^2: Trying to print out only unique array values-
by rickman1 (Novice) on Aug 16, 2016 at 19:23 UTC

    Dude you rock! For the most part I modified your example so that it almost works as expected. I had to blow away the UTF-8 encoding part due to errors. Then I modified the '$i (1..$#unique_accounts)' so output would not be zero based. Then I included some added conditions so that it would not process header & trailer records. So output is:
    1:#########
    2:#########
    3:#########
    4:#########
    5:#########
    total: 370659
    uniq : 6
    Sorry but I cannot show actual values. Notice there are 5 unique values (which is correct) yet the 'uniq' counter reads 6. Also total lines processed equals 370659, it should only equal 370657. It is obviously processing header & trailer records. Here is your example with my changes:

    use strict;
    use warnings;
    use List::MoreUtils qw(uniq);

    my $char_at=10;
    my $num_chars=50;

    open my $fh, $ARGV[0]
    #open my $fh, '<:encoding(UTF-8)', $ARGV[0]
        or die "Could not open '$ARGV[0]' $!";

    my @trnAccounts;
    while (my $line = <$fh>) {
        my $val = substr($line, $char_at, $num_chars);
        if ($val eq 'H' or $val eq 'T') {
            my $output = $line;
        }else {
            chomp $line;
            next unless $line && length($line) >= $char_at;
            push @trnAccounts, substr($line, $char_at, $num_chars);
        }
    }

    my @unique_accounts = uniq(@trnAccounts);
    for my $i (1..$#unique_accounts) {
        print $i, ":", $unique_accounts[$i], "\n";
    }
    print "total: ", scalar(@trnAccounts), "\n";
    print "uniq : ", scalar(@unique_accounts), "\n";

      Hi Rickman1!
      In general a great Monk question has some data and actual code that the Monks can run. If you have private data that cannot be disclosed publicly, then "dummy up" something that is an accurate representation of the actual data, but is "fake". Use made-up names and account numbers, "Luke Skywalker" or whatever.

      This code has some issues that I see:

      while (my $line = <$fh>) {
          my $val = substr($line, $char_at, $num_chars);
          if ($val eq 'H' or $val eq 'T') {
              my $output = $line;
          }else {
              chomp $line;
              next unless $line && length($line) >= $char_at;
              push @trnAccounts, substr($line, $char_at, $num_chars);
          }
      }

      First, my $output = $line; will never be executed. And even if it is, it will do absolutely nothing. You cannot declare a "my" variable conditionally and use it elsewhere. Use it immediately or not at all. So this whole "if" clause is "nonsense".
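
To illustrate the scoping point, here is a minimal sketch. The record strings and the @headers array are made up for the example and are not from the original script. A "my" variable declared inside the if block disappears when the block ends, so anything you want to keep has to go into a variable declared outside it.

```perl
use strict;
use warnings;

my @headers;   # declared outside the loop so it survives each iteration

for my $line ("H header", "D data 1", "T trailer") {   # made-up records
    if ($line =~ /^[HT]/) {
        my $output = $line;      # visible only inside this block
        push @headers, $output;  # so use it here, before the block ends
    }
}

print scalar(@headers), " header/trailer lines kept\n";
```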

      Will the condition if ($val eq 'H' or $val eq 'T') ever be satisfied? I think not. $val looks like it is a string of 50 characters, starting at $char_at. This will never equal a single character comparison. Some regex might work, but single character, I think not. You have not chomped the line endings, and this line ending in $val will prevent the "match".
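
A small sketch of why that comparison fails, using a made-up header line (the real record layout is not shown in this thread, so the column positions are assumptions). A multi-character substr cannot be eq to 'H', while the first character of the line can.

```perl
use strict;
use warnings;

my $char_at   = 10;
my $num_chars = 50;

# Made-up header line: record type 'H' in column 0, then filler.
my $line = "H" . ("x" x 30) . "\n";

my $val   = substr($line, $char_at, $num_chars);  # many chars, newline included
my $first = substr($line, 0, 1);                  # just the record type

print "val matches H:   ", ($val   eq 'H' ? "yes" : "no"), "\n";
print "first matches H: ", ($first eq 'H' ? "yes" : "no"), "\n";
```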

      In the "else" clause, there appears to be some confusion. next unless $line will always work! Right up front, while (my $line = <$fh>) says that $line is true, otherwise the loop doesn't proceed. As for push @trnAccounts, substr($line, $char_at, $num_chars); well, that substr is just $val.

      I suggest that you have another "go" at this. Generate say 10 example lines, show your code to process those lines and how it fails. Keep simplifying the example until you cannot reproduce the problem any more. Make it as simple as possible. This process may help you discover your own problem.

      Close. There were a few issues with your updates. So, I think I covered those and commented the code enough to make it more understandable here:

      #!/usr/bin/env perl
      use strict;
      use warnings;
      use List::MoreUtils qw(uniq);

      my $char_at=10;   # Character to start grabbing data in the line
      my $num_chars=50; # Number of characters to grab in the line
      my @trnAccounts;  # our array to store results

      # open our file
      open my $fh, '<', $ARGV[0]
          or die "Could not open '$ARGV[0]' $!";

      # go through our file, line by line
      while (my $line = <$fh>) {
          chomp $line; # trim off trailing newline character first
          # OR, you could trim the line using regular expressions
          # $line =~ s/\A\s*//; # trim beginning
          # $line =~ s/\s*\z//; # trim end
          # OR, you could trim using String::Util's trim or something
          # $line = trim($line);

          # skip this line completely if it doesn't contain the info we want
          next unless $line && length($line) >= $char_at;

          # grab the uppercased version of the first character in the line
          my $first_char = uc(substr($line,0,1) || '');

          # skip this line if that first character's an H or T
          next if ($first_char eq 'H' or $first_char eq 'T');

          # otherwise, push a portion of the line onto our array
          push @trnAccounts, substr($line, $char_at, $num_chars);
      }

      # get a unique list of info we stored.
      my @unique_accounts = uniq(@trnAccounts);

      # arrays are zero-based.
      for my $i (0..$#unique_accounts) {
          # still zero-based, but display it as 1-based
          print $i+1, ":", $unique_accounts[$i], "\n";
      }

      # print out some totals
      print "total: ", scalar(@trnAccounts), "\n";
      print "uniq : ", scalar(@unique_accounts), "\n";

      I hope that's more clear. Please speak up if something doesn't make sense. I apologize for not documenting the first attempt a bit more clearly.
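
On the zero-based point, a quick sketch with made-up values. Starting the range at 1 does not renumber anything, it just skips element 0. That would probably also explain an output that lists five values while the counter says 6, though I can't confirm it without the real data.

```perl
use strict;
use warnings;

my @unique_accounts = qw(AAA BBB CCC);   # made-up values

# Starting the range at 1 silently skips element 0.
my @skipping = map { $_ . ":" . $unique_accounts[$_] } 1 .. $#unique_accounts;

# Keeping the range zero-based and adding 1 only for display shows every element.
my @complete = map { ($_ + 1) . ":" . $unique_accounts[$_] } 0 .. $#unique_accounts;

print "range from 1: @skipping\n";
print "range from 0: @complete\n";
print "count       : ", scalar(@unique_accounts), "\n";
```

The first line prints two entries and the second prints three, while the count is 3 either way.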

        Thank you very much genio!
        I didn't realize I had not initialized my $val variable properly. That if statement is no longer nonsense and does the trick. Now header and trailer records are ignored and not processed, so my count is accurate. I will live with the fact that the unique values it finds are displayed in a 0-based count format; no big deal, as it seems to be working like a charm. BTW, did you know that your user name (genio) means genius in Spanish? I am guessing you do... lol. Here is the output of a much simpler file:
        0:123456789
        1:987654321
        2:123654874
        3:785471236
        4:951234569
        5:753698741
        6:478478478
        ============
        Total Records Processed: 10
        Total Unique Values Found : 7

        And here is the modified code:

        use strict;
        use warnings;
        use List::MoreUtils qw(uniq);

        my $char_at=10;
        my $num_chars=50;
        my $val;
        my @trnAccounts;

        #open my $fh, $ARGV[0]
        #open my $fh, '<:encoding(UTF-8)', $ARGV[0]
        open my $fh, '<:encoding(cp1252)', $ARGV[0]
            or die "Could not open '$ARGV[0]' $!";

        while (my $line = <$fh>) {
            $val = substr($line, 0, 1);
            chomp $val;
            if ($val eq 'H' or $val eq 'T') {
                my $output = $line;
            }else {
                chomp $line;
                next unless $line && length($line) >= $char_at;
                push @trnAccounts, substr($line, $char_at, $num_chars);
            }
        }

        my @unique_accounts = uniq(@trnAccounts);
        print "\n";
        for my $i (0..$#unique_accounts) {
            print $i, ":", $unique_accounts[$i], "\n";
        }
        print "============\n";
        print "Total Records Processed: ", scalar(@trnAccounts), "\n";
        print "Total Unique Values Found : ", scalar(@unique_accounts), "\n";

        THANK YOU GENIO!