CAdood has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to parse a long CSV that has many components that are repeated, and I'm trying to collapse it down within particular fields

Headers are on the top line, and I'm trying to build a hash array based on the names found therein, so this is dynamic.

A B C D  E  F G (header line)
a b c d  E' f g (resulting new line)
a b c D' e  f g (resulting new line)
a b c d  E' f g (deleted, with E' embedded within last recent new line)
a b c d  E' f g (deleted, with E' embedded within last recent new line)
a b c D' e  f g (resulting new line)

The input above would collapse from 6 lines to 4

The output changes based on the X' notation within the file, but a b c stays the same for the most part.

The output would be collapsed (where a b c will be the same, and when D or E changes, the values are concatenated onto prior lines with the same A's, B's, and C's)

The question is how I can dynamically build an array_name{A}{B}{C}{D}{E}{F}{G} structure based on the header names, and then modify anything in that hash array, given that the number of fields could change from file to file. I may want to modify an element (to concatenate onto it) and toss out the extra lines, leaving the minimal unique combinations of A, B and C.

For example, these files would have the same fileserver name (A), Directory Path and file (B), Inheritance value (C), Permissions (D), username (E), etc. The username and permissions will change but the commercial software outputting this only knows to have 1 line per unique database record (each user has specific permissions on a file found on a fileserver, and I'm trying to collapse lines down as much so that everyone with the same permission level on a file are all seen on the same line)

I'm stumped on how to build a dynamic array with arbitrary depth and then be able to change items in that array of a table to reflect changes.

Am I over-thinking this by only seeing a hash array solution? It seems ideal. I just don't know how to build an array as mentioned above on the fly, and then access any component I need to concatenate on or compare with.

Replies are listed 'Best First'.
Re: Building a dynamic array or some other method?
by GrandFather (Saint) on Apr 23, 2024 at 03:47 UTC

    You really, really, really don't want to do that, unless you don't mean what I think you mean. As cavac suggests, some sample code showing what you are trying to do would help a lot. Maybe this is a useful starting point:

    use strict;
    use warnings;
    use Text::CSV;
    use Data::Dumper;

    my $csvFile = <<CSV;
    A,B,C,D,E,F,G
    a,b,c,d,E',f,g
    a,b,c,D',e,f,g
    a,b,c,d,E',f,g
    a,b,c,d,E',f,g
    a,b,c,D',e,f,g
    CSV

    open my $fIn, '<', \$csvFile;
    my $csv = Text::CSV->new();

    $csv->column_names($csv->getline($fIn));

    my $arrayref = $csv->getline_hr_all($fIn);

    print Dumper($arrayref);

    Prints:

    $VAR1 = [
              {
                'E' => 'E\'',
                'G' => 'g',
                'B' => 'b',
                'D' => 'd',
                'C' => 'c',
                'A' => 'a',
                'F' => 'f'
              },
              {
                'C' => 'c',
                'F' => 'f',
                'A' => 'a',
                'E' => 'e',
                'B' => 'b',
                'D' => 'D\'',
                'G' => 'g'
              },
              {
                'D' => 'd',
                'B' => 'b',
                'G' => 'g',
                'E' => 'E\'',
                'F' => 'f',
                'A' => 'a',
                'C' => 'c'
              },
              {
                'G' => 'g',
                'D' => 'd',
                'B' => 'b',
                'E' => 'E\'',
                'A' => 'a',
                'F' => 'f',
                'C' => 'c'
              },
              {
                'C' => 'c',
                'A' => 'a',
                'F' => 'f',
                'E' => 'e',
                'G' => 'g',
                'B' => 'b',
                'D' => 'D\''
              }
            ];
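For anyone new to this shape: the dump above is a reference to an array of hash references, one hash per CSV row, keyed by column name. Below is a minimal sketch of reading and changing fields in it. The two rows are typed in by hand here (an assumption, so the snippet runs without Text::CSV); in the real script `$arrayref` would come from `getline_hr_all`.

```perl
use strict;
use warnings;

# Two hand-written rows in the same shape getline_hr_all returns:
# a reference to an array of hash references keyed by column name.
my $arrayref = [
    { A => 'a', B => 'b', C => 'c', D => 'd',  E => "E'", F => 'f', G => 'g' },
    { A => 'a', B => 'b', C => 'c', D => "D'", E => 'e',  F => 'f', G => 'g' },
];

# One field of one row: row index first, then column name.
my $first_e = $arrayref->[0]{E};

# Walk every row; each $row is a hash reference.
my @d_values;
for my $row (@$arrayref) {
    push @d_values, $row->{D};
}

# Modify a field in place, e.g. append another value to column E of row 0.
$arrayref->[0]{E} .= ';e';

print "first E: $first_e\n";
print "D column: @d_values\n";
print "row 0 E is now: $arrayref->[0]{E}\n";
```

Because the column names are only ever used as hash keys, nothing here depends on how many columns a given file has.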
    Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
      Text::CSV is my first foray into modules, and data manipulation using them.

      Thank you for the example of building dynamic arrays! With the example seen in the code, you can tell I'm trying to learn how to work through them, but in seeing your code, it's obvious I am stuck in the "old ways".

      So I'll have to figure out how to parse through this, and take values from one array and append it to values seen in another array - maintaining comma separation properly, and then delete the arrays I pulled the info from, and having condensed files when I try to dump the condensed data back into CSV files for the customers. I'm going through an entire fileserver to extract these values, and the files are absolutely HUGE, so compacting them down is imperative. (The software dumping these CSV reports is only looking for files with sensitive data in it, and we're looking at who has access to those files.)

      I played a little with excel and pivot tables, but I can't get the concatenation of accounts into a single field, just a list of accounts under each filename and permissions group, and the length is still very long. Some of these files have 12-13+ lines apiece (permutations of access rights and usernames for each file), and I'm hoping to condense them to perhaps 3 lines each.

      The removal of use strict and use warnings is because I don't want it barking at me when not using "my" for variables, because "my variables" don't show up in the interactive perl debugger. (unless I don't know how to use it - I've already been bitten by "error notices" BECAUSE I'm in debug mode.)

      Thank you for your example there!

        Always use strictures (use strict; use warnings; - see The strictures, according to Seuss). If your debugger can't cope with lexical variables you really need a better debugger! You might consider using the Komodo IDE which provides an integrated debugger, although some setup is required.

        Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Re: Building a dynamic array or some other method?
by cavac (Prior) on Apr 22, 2024 at 22:58 UTC

    Can you post a small sample CSV as well as the code you've got so far? In code tags would be nice.

    You know, a typical Short, Self-Contained, Correct Example

    I'm willing to play around with this (having done many similar things in the past). But, at a minimum, I need a real-life sample of the data (a few lines), as well as what the correct result should be. You can replace any data we're not supposed to see with some decent dummy values, but seeing the actual file format makes understanding your task a bit easier. "a b c d" is a bit too abstract for me.

    PerlMonks XP is useless? Not anymore: XPD - Do more with your PerlMonks XP
      Sample CSV:
      File Server,Access Path,Current Permissions,Logon Name,Inherited From Folders,Flags,User/Group,Classification Results,Classification Results by Category (Including Nested),Total Hit Count
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,@FOO NOW Onsite Support,\Common,This folder only,Pathway12.My.Corp.com\@FOO NOW Onsite Support,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,Administrators,\Common,This folder only,10.15.106.71\Administrators,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,Creator Owner,\Common,This folder only,Abstract\Creator Owner,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,FP NOW BMG FSE NTFS Admins,\Common,This folder only,Pathway12.My.Corp.com\FP NOW BMG FSE NTFS Admins,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,ClusterSvcDIR,\Common,This folder only,Pathway12.My.Corp.com\ClusterSvcDIR,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,SYSTEM,\Common,This folder only,Abstract\SYSTEM,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,MiJim,<not inherited>,This folder only,"Pathway12.My.Corp.com\Michaels, Jim@My",IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,MRWX,@FP DIR BMG,\Common,This folder only,Pathway12.My.Corp.com\@FP DIR BMG,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,&CDAdmin,\Common,This folder only,Pathway12.My.Corp.com\&CDAdmin,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,@FOO DSMS Admins,\Common,This folder only,Pathway12.My.Corp.com\@FOO DSMS Admins,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,FOO BMG FS Support,\Common,This folder only,Pathway12.My.Corp.com\FOO BMG FS Support,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,DPeterso,\Common,This folder only,"Pathway12.My.Corp.com\Peterson, Dan@My",IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,FP BMG IMG Read Access,\Common,This folder only,Pathway12.My.Corp.com\FP BMG IMG Read Access,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf,FMRWX,@FOO NOW Onsite Support,\Common,This folder only,Pathway12.My.Corp.com\@FOO NOW Onsite Support,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf,FMRWX,Administrators,\Common,This folder only,10.15.106.71\Administrators,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf,FMRWX,Creator Owner,\Common,This folder only,Abstract\Creator Owner,IRS Data (1/1),PII (1),1
      Sample output (differences in permissions for each file, with users concatenated having the same permissions)
      (when someone shows "<Not Inherited>", I'll put a "(*)" in as an identifier)
      File Server,Access Path,Current Permissions,Logon Name,Inherited From Folders,Flags,User/Group,Classification Results,Classification Results by Category (Including Nested),Total Hit Count
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,"@FOO NOW Onsite Support,Administrators,Creator Owner,FP NOW BMG FSE NTFS Admins,ClusterSvcDIR,SYSTEM,MiJim(*)",\Common,This folder only,Pathway12.My.Corp.com\@FOO NOW Onsite Support,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,MRWX,@FP DIR BMG,\Common,This folder only,Pathway12.My.Corp.com\@FP DIR BMG,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,"&CDAdmin,@FOO DSMS Admins,FOO BMG FS Support,DPeterso,FP BMG IMG Read Access",\Common,This folder only,Pathway12.My.Corp.com\&CDAdmin,IRS Data (1/1),PII (1),1
      10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf,FMRWX,"@FOO NOW Onsite Support,Administrators,Creator Owner",\Common,This folder only,Pathway12.My.Corp.com\@FOO NOW Onsite Support,IRS Data (1/1),PII (1),1
      Code: it shows lots of iterative notes
Re: Building a dynamic array or some other method?
by tybalt89 (Monsignor) on Apr 23, 2024 at 16:52 UTC

    If what I think you are trying to do is correct, it can't be done because the fields following the username are also different, so there is no way to combine lines.

    However, here's code to build a nested structure (with common parents) from the sample data you provided, and also a way to print that nested structure in indented form.

    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11159049
    use warnings;
    use Text::CSV;

    $SIG{__WARN__} = sub { die @_ };

    my %database;
    my $csv = Text::CSV->new;
    my @headers = $csv->getline( *DATA )->@*;
    while( my $fields = $csv->getline( *DATA ) )
      {
      my $ref = \%database;
      $ref = $ref->{$_} //= {} for @$fields;
      }

    #use Data::Dump 'dd'; dd \%database;

    print "\n\n------ Indented Form ----------\n\n", show( \%database );

    sub show
      {
      my $db = shift;
      my $answer = '';
      for my $key ( sort keys %$db )
        {
        $answer .= join '', "$key\n", show($db->{$key}) =~ s/^(?=.)/ /gmr;
        }
      return $answer;
      }

    __DATA__
    File Server,Access Path,Current Permissions,Logon Name,Inherited From Folders,Flags,User/Group,Classification Results,Classification Results by Category (Including Nested),Total Hit Count
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,@FOO NOW Onsite Support,\Common,This folder only,Pathway12.My.Corp.com\@FOO NOW Onsite Support,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,Administrators,\Common,This folder only,10.15.106.71\Administrators,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,Creator Owner,\Common,This folder only,Abstract\Creator Owner,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,FP NOW BMG FSE NTFS Admins,\Common,This folder only,Pathway12.My.Corp.com\FP NOW BMG FSE NTFS Admins,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,ClusterSvcDIR,\Common,This folder only,Pathway12.My.Corp.com\ClusterSvcDIR,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,SYSTEM,\Common,This folder only,Abstract\SYSTEM,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,MiJim,<not inherited>,This folder only,"Pathway12.My.Corp.com\Michaels, Jim@My",IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,MRWX,@FP DIR BMG,\Common,This folder only,Pathway12.My.Corp.com\@FP DIR BMG,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,&CDAdmin,\Common,This folder only,Pathway12.My.Corp.com\&CDAdmin,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,@FOO DSMS Admins,\Common,This folder only,Pathway12.My.Corp.com\@FOO DSMS Admins,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,FOO BMG FS Support,\Common,This folder only,Pathway12.My.Corp.com\FOO BMG FS Support,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,DPeterso,\Common,This folder only,"Pathway12.My.Corp.com\Peterson, Dan@My",IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,FP BMG IMG Read Access,\Common,This folder only,Pathway12.My.Corp.com\FP BMG IMG Read Access,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf,FMRWX,@FOO NOW Onsite Support,\Common,This folder only,Pathway12.My.Corp.com\@FOO NOW Onsite Support,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf,FMRWX,Administrators,\Common,This folder only,10.15.106.71\Administrators,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf,FMRWX,Creator Owner,\Common,This folder only,Abstract\Creator Owner,IRS Data (1/1),PII (1),1

    Outputs:

    ------ Indented Form ----------

    10.15.106.71
     /Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf
      FMRWX
       @FOO NOW Onsite Support
        \Common
         This folder only
          Pathway12.My.Corp.com\@FOO NOW Onsite Support
           IRS Data (1/1)
            PII (1)
             1
       Administrators
        \Common
         This folder only
          10.15.106.71\Administrators
           IRS Data (1/1)
            PII (1)
             1
       ClusterSvcDIR
        \Common
         This folder only
          Pathway12.My.Corp.com\ClusterSvcDIR
           IRS Data (1/1)
            PII (1)
             1
       Creator Owner
        \Common
         This folder only
          Abstract\Creator Owner
           IRS Data (1/1)
            PII (1)
             1
       FP NOW BMG FSE NTFS Admins
        \Common
         This folder only
          Pathway12.My.Corp.com\FP NOW BMG FSE NTFS Admins
           IRS Data (1/1)
            PII (1)
             1
       MiJim
        <not inherited>
         This folder only
          Pathway12.My.Corp.com\Michaels, Jim@My
           IRS Data (1/1)
            PII (1)
             1
       SYSTEM
        \Common
         This folder only
          Abstract\SYSTEM
           IRS Data (1/1)
            PII (1)
             1
      MRWX
       @FP DIR BMG
        \Common
         This folder only
          Pathway12.My.Corp.com\@FP DIR BMG
           IRS Data (1/1)
            PII (1)
             1
      RX
       &CDAdmin
        \Common
         This folder only
          Pathway12.My.Corp.com\&CDAdmin
           IRS Data (1/1)
            PII (1)
             1
       @FOO DSMS Admins
        \Common
         This folder only
          Pathway12.My.Corp.com\@FOO DSMS Admins
           IRS Data (1/1)
            PII (1)
             1
       DPeterso
        \Common
         This folder only
          Pathway12.My.Corp.com\Peterson, Dan@My
           IRS Data (1/1)
            PII (1)
             1
       FOO BMG FS Support
        \Common
         This folder only
          Pathway12.My.Corp.com\FOO BMG FS Support
           IRS Data (1/1)
            PII (1)
             1
       FP BMG IMG Read Access
        \Common
         This folder only
          Pathway12.My.Corp.com\FP BMG IMG Read Access
           IRS Data (1/1)
            PII (1)
             1
     /Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf
      FMRWX
       @FOO NOW Onsite Support
        \Common
         This folder only
          Pathway12.My.Corp.com\@FOO NOW Onsite Support
           IRS Data (1/1)
            PII (1)
             1
       Administrators
        \Common
         This folder only
          10.15.106.71\Administrators
           IRS Data (1/1)
            PII (1)
             1
       Creator Owner
        \Common
         This folder only
          Abstract\Creator Owner
           IRS Data (1/1)
            PII (1)
             1
      You're right that there are some differences after the unique names. I think the most likely part to change is also the more descriptive name of the user mentioned earlier, and can be dismissed. The types of data found, and the hit counts stay the same for each file, and are important to keep.

      I have a question about notation or elements found within this line, which I'm unfamiliar with.
      I'm unsure what they'd be called to even look them up
      my @headers = $csv->getline( *DATA )->@*;

      What is this called? *DATA? In C (ancient class I took) it looks like a pointer, but what is DATA? It's not defined previously.

      And what does ->@* do after that, as I see the assignment going left into @headers? Is that an array wildcard?
        *DATA is a file handle, pointing to the part of the source file after the __DATA__ mark.

        The ->@* is a post-dereference, it's an alternative way of writing

        my @headers = @{ $csv->getline( *DATA ) };

        The getline method returns an array reference, by dereferencing it we get an array which we (shallow) copy into @headers.
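A small sketch of the two spellings side by side may help. The array contents are made up for illustration, and the postfix form needs Perl 5.24 or later to work without enabling a feature.

```perl
use strict;
use warnings;

# An array reference, like the one getline returns for a header row.
my $aref = [ 'File Server', 'Access Path', 'Logon Name' ];

my @circumfix = @{$aref};    # classic dereference
my @postfix   = $aref->@*;   # postfix dereference, same result

# Both are shallow copies: changing the copy leaves the original alone.
$postfix[0] = 'changed';

print "original: $aref->[0]\n";
print "copy:     $postfix[0]\n";
print "same length: ", ( @circumfix == @postfix ? 'yes' : 'no' ), "\n";
```

"Shallow" only matters once the elements are themselves references; for a row of plain strings the copy is fully independent.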

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: Building a dynamic array or some other method?
by The_Dj (Scribe) on Apr 23, 2024 at 06:54 UTC
    As far as I can tell, there are 3 questions here.

    Firstly:
    From a unix-like prompt, just do head -n 1 infile > outfile; tail -n +2 infile | sort -u >> outfile
    But if you must use Perl (yay):
    open IN,'<',$csvfile;
    @data=<IN>;
    close IN;
    open OUT,'>',$newfile;
    print OUT shift @data;
    foreach my $line (@data) {$uniq{$line}=1;}
    foreach my $line (keys %uniq) {print OUT $line;}
    close OUT;
    Please do the usual 'strict' and 'warning's stuff.
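One thing to be aware of: looping over keys of a hash returns the lines in no particular order. If the original order matters, a common variant is the %seen idiom with grep, sketched below on made-up lines (the sample data and variable names are assumptions, not taken from the real file).

```perl
use strict;
use warnings;

# Stand-in for the lines read from the CSV file, header first.
my @lines = ( "A,B,C\n", "a,b,c\n", "a,b,d\n", "a,b,c\n", "a,b,d\n", "a,b,e\n" );

my $header = shift @lines;

# Keep a line only the first time it is seen; input order is preserved.
my %seen;
my @unique = grep { !$seen{$_}++ } @lines;

print $header, @unique;
```

This still compares whole lines, so like the versions above it only removes exact duplicates.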


    Secondly, you talk about putting the data into some sort of structure in memory...
    Here I second cavac's request for a SSCCE


    Thirdly, to create deep structures, mostly Perl will 'Just do it for you'.
    E.G. I can do $data{A}{B}{C}=[0,1]; $data{A}{X}=[5]; and I'll end up with the weird monstrosity I asked for:
      DB<2> x \%data
    0  HASH(0x555e18bb11b8)
       'A' => HASH(0x555e184f6448)
          'B' => HASH(0x555e18bb1218)
             'C' => ARRAY(0x555e18bb1290)
                0  0
                1  1
          'X' => ARRAY(0x555e18bb1278)
             0  5
      DB<3>
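When the depth is not known until the header line has been read, the keys cannot be typed out as {A}{B}{C}. A rough sketch of one way around that is to walk a reference down the structure one key at a time. The helper names insert_path and fetch_path and the sample values are invented for this illustration.

```perl
use strict;
use warnings;

my %data;

# Hypothetical helper: create (or reuse) one nested hash level per key
# and return the innermost hash.
sub insert_path {
    my ( $root, @keys ) = @_;
    my $node = $root;
    $node = $node->{$_} //= {} for @keys;
    return $node;
}

# Hypothetical helper: follow the keys without creating anything;
# returns nothing (undef in scalar context) if a level is missing.
sub fetch_path {
    my ( $root, @keys ) = @_;
    my $node = $root;
    for my $key (@keys) {
        return unless ref $node eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};
    }
    return $node;
}

# Any number of keys works, so the depth can follow the number of columns.
insert_path( \%data, 'fs1', '/docs/quote.pdf', 'RX', 'alice' );
insert_path( \%data, 'fs1', '/docs/quote.pdf', 'RX', 'bob' );

my @users   = sort keys %{ fetch_path( \%data, 'fs1', '/docs/quote.pdf', 'RX' ) };
my $missing = fetch_path( \%data, 'fs1', 'no such path' );

print "users with RX: @users\n";
print "missing path is ", ( defined $missing ? 'defined' : 'undef' ), "\n";
```

fetch_path checks with exists at each level so that a lookup never autovivifies empty hashes as a side effect.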

    HTH!

      I'd recommend always putting [SSCCE] in brackets, to make a link: SSCCE.

        Thank you for the link to SSCCE. It's very helpful, and points out many areas of improvement for my future posts.
      Sadly, the file is already sorted, and making it uniq based on one field isn't enough. Thank you though!

      Your deep structure uh, scares me. I'm just learning how to access the structure in $aoh (array of hashes) above and it's already been painful enough! (I'm crying uncle here!) I like where cavac was going with it. De-referencing arrays has been very painful for me.

      I knew I was going to get in trouble for the lack of use strict and use warnings. (LOL)

      I don't know what HTH! means. (I'm new here)
        HTH = Hope This Helps

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
        Both solutions I offered filter the entire line.
        Not just one field.

        Unless I've totally misunderstood what you mean by CSV?

        (As an aside, if the source is already sorted, unix has a command uniq that removes duplicate lines from pre-sorted files, so in my first solution just change sort -u to uniq)
Re: Building a dynamic array or some other method?
by GrandFather (Saint) on Apr 23, 2024 at 22:20 UTC

    Maybe you want something like:

    use strict;
    use warnings;
    use Text::CSV;
    use Data::Dumper;

    my $csvFile = <<CSV;
    A,B,C,D,E,F,G
    a1,b,c,"D,1",E1,f,g
    a1,b,c,"D,2",E1,f,g
    a1,b,c,"D,1",E2,f,g
    a2,b,c,"D,1",E3,f,g
    a2,b,c,"D,2",E3,f,g
    CSV

    open my $fIn, '<', \$csvFile;
    my $csv = Text::CSV->new();

    $csv->column_names($csv->getline($fIn));

    my $arrayref = $csv->getline_all($fIn);
    my %dataHash;

    $dataHash{join "\t", @{$_}[0, 1, 2, 5, 6]}{$_->[3]}{$_->[4]} = 1 for @$arrayref;

    for my $key (sort keys %dataHash) {
        my @fields = split "\t", $key;
        my @dKeys = sort keys %{$dataHash{$key}};
        my %eKeys;

        $eKeys{$_} = 1 for map {keys %{$dataHash{$key}{$_}}} @dKeys;

        my $dField = join '; ', @dKeys;
        my $eField = join '; ', sort keys %eKeys;

        splice @fields, 3, 0, ($dField, $eField);
        $csv->say(*STDOUT, \@fields);
    }

    Prints:

    a1,b,c,"D,1; D,2","E1; E2",f,g
    a2,b,c,"D,1; D,2",E3,f,g

    Update: small bug to deal with repeated E column values

    Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Re: Building a dynamic array or some other method?
by Marshall (Canon) on Apr 27, 2024 at 12:48 UTC
    In your SOPW query, you seem open to "some other method?". So, below I will demo another possible technique for you using SQLite + some Perl code. The DB is good at figuring out unique combinations of columns. Below, I did that for 2 columns. Then for each of the combinations, I got the logons associated with them. I did not fiddle with the inherited column. But other than that, this produces the desired output that you listed.

    It sounds like you are working with a large dataset. Using a DB could be much more flexible than some complex Perl hash of hash code. I wanted to make you aware of this possibility.

    use strict;
    use warnings;
    use Text::CSV qw(csv);
    use Data::Dump qw(pp);
    use DBI;

    my $aoa = csv (in => \*DATA); # as array of array
    #pp $aoa;

    my $dbfile = "testing.sqlite";
    my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","",{RaiseError => 1})
       or die "Couldn't connect to database: " . DBI->errstr;

    $dbh->do("DROP TABLE IF EXISTS mytable");
    $dbh->do ("CREATE TABLE mytable (Server text, Path text, Permissions text, logon text, inheritedFrom text)");

    my $insert = $dbh->prepare ("INSERT INTO mytable (Server, Path, Permissions,logon, inheritedFrom) VALUES (?,?,?,?,?)");
    my $get_uniquePathPer = $dbh->prepare ("SELECT Path, Permissions FROM mytable GROUP BY Path, Permissions");
    my $get_logons = $dbh->prepare("SELECT logon FROM mytable WHERE Path=? and Permissions =?");

    # import CSV data into mytable
    #
    $dbh->begin_work;
    shift @$aoa; #throw away header
    foreach my $csvRef (@$aoa)
    {
       $insert->execute((@$csvRef)[0..4]);
    }
    $dbh->commit;

    # Get unique combinations of Path and Permissions
    #
    $get_uniquePathPer->execute();
    my $aoaPathPer = $get_uniquePathPer->fetchall_arrayref;

    # For each unique combination get all logons
    #
    foreach my $combo_ref (@$aoaPathPer)
    {
       print "$combo_ref->[0]\n";
       print "$combo_ref->[1]\n";
       $get_logons->execute($combo_ref->[0], $combo_ref->[1]);
       my $aoa_logons = $get_logons->fetchall_arrayref;
       my @logons = map{@$_}@$aoa_logons;
       print join (",", @logons),"\n\n";
    }

    =Outputs

    /Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf
    FMRWX
    @FOO NOW Onsite Support,Administrators,Creator Owner,FP NOW BMG FSE NTFS Admins,ClusterSvcDIR,SYSTEM,MiJim

    /Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf
    MRWX
    @FP DIR BMG

    /Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf
    RX
    &CDAdmin,@FOO DSMS Admins,FOO BMG FS Support,DPeterso,FP BMG IMG Read Access

    /Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf
    FMRWX
    @FOO NOW Onsite Support,Administrators,Creator Owner

    =cut

    __DATA__
    File Server,Access Path,Current Permissions,Logon Name,Inherited From Folders,Flags,User/Group,Classification Results,Classification Results by Category (Including Nested),Total Hit Count
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,@FOO NOW Onsite Support,\Common,This folder only,Pathway12.My.Corp.com\@FOO NOW Onsite Support,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,Administrators,\Common,This folder only,10.15.106.71\Administrators,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,Creator Owner,\Common,This folder only,Abstract\Creator Owner,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,FP NOW BMG FSE NTFS Admins,\Common,This folder only,Pathway12.My.Corp.com\FP NOW BMG FSE NTFS Admins,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,ClusterSvcDIR,\Common,This folder only,Pathway12.My.Corp.com\ClusterSvcDIR,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,SYSTEM,\Common,This folder only,Abstract\SYSTEM,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,MiJim,<not inherited>,This folder only,"Pathway12.My.Corp.com\Michaels, Jim@My",IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,MRWX,@FP DIR BMG,\Common,This folder only,Pathway12.My.Corp.com\@FP DIR BMG,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,&CDAdmin,\Common,This folder only,Pathway12.My.Corp.com\&CDAdmin,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,@FOO DSMS Admins,\Common,This folder only,Pathway12.My.Corp.com\@FOO DSMS Admins,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,FOO BMG FS Support,\Common,This folder only,Pathway12.My.Corp.com\FOO BMG FS Support,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,DPeterso,\Common,This folder only,"Pathway12.My.Corp.com\Peterson, Dan@My",IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,FP BMG IMG Read Access,\Common,This folder only,Pathway12.My.Corp.com\FP BMG IMG Read Access,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf,FMRWX,@FOO NOW Onsite Support,\Common,This folder only,Pathway12.My.Corp.com\@FOO NOW Onsite Support,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf,FMRWX,Administrators,\Common,This folder only,10.15.106.71\Administrators,IRS Data (1/1),PII (1),1
    10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf,FMRWX,Creator Owner,\Common,This folder only,Abstract\Creator Owner,IRS Data (1/1),PII (1),1
      Thank you Marshall. Apologies for the late response. Medical emergency in the family.

      I will look over your code! I don't have SQL experience, and so I'm interested in learning through what you've posted.

      Again, thank you!
Re: Building a dynamic array or some other method?
by The_Dj (Scribe) on Apr 24, 2024 at 02:05 UTC
    I wrote this, which almost exactly reproduces the output in your example:
    #!perl
    use strict;
    use warnings;
    use Text::CSV qw( csv );

    my $magic_column = 3;    #0-indexed

    my $aoa = csv( in => $ARGV[0] );

    my %uniq = ();
    my $head = shift @$aoa;

    foreach my $line (@$aoa) {
        my $key    = join( ',', @{$line}[ 0 .. $magic_column - 1 ] );
        my $suffix = join( ',', @{$line}[ $magic_column + 1 .. $#$line ] );
        $uniq{$key} //= [ [], $suffix ];
        push @{ $uniq{$key}[0] }, $line->[$magic_column];
    }

    print join( ',', @$head ), "\n";
    foreach my $line ( sort { lc $a cmp lc $b } keys %uniq ) {
        my $magic = join( ',', sort { lc $a cmp lc $b } @{ $uniq{$line}[0] } );
        if ( @{ $uniq{$line}[0] } > 1 ) {
            $magic = '"' . $magic . '"';
        }
        print $line, ',', $magic, ',', $uniq{$line}[1], "\n";
    }
    The two differences are:
    Your output includes the string MiJim(*), but the (*) doesn't appear anywhere in your source data, and nothing in your question explains where it would come from.

    Secondly, I don't know what order you use for the 'combined' column 3. It's neither native nor alphabetical order 🤷‍♂️.

      The (*) notation was going to be a quick and dirty reference to a non-inherited permission for a file. I didn't mention it in the initial posting, but mentioned it in the posting where I show it between the CSV in and CSV out samples.

      The columns show:
      File Server,Access Path,Current Permissions,Logon Name,Inherited From Folders,Flags,User/Group,Classification Results,Classification Results by Category (Including Nested),Total Hit Count

      In the sample input CSV, there are a few unique files shown (column 2), and column 3 shows what permissions each user or group has on that file. Those get minimized to show 1 line per file and permission level, and column 4 is where I start packing together all the users and groups that have the same permissions seen in column 3.

      There's no order to the combined list of users/groups. As I'm parsing the file (sorted by server, filename, and permissions), I intend to check whether those three are all the same as on the previous line, and if there's a user already associated with that permission for that file, I append the current line's user onto the list of users that already have the same permissions. So it's just "whatever comes next" for the list of users.
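A minimal sketch of the "whatever comes next" grouping described above, assuming the rows have already been parsed into an array of arrays (for example by `csv( in => $filename )` from Text::CSV, as in the replies). The helper name `collapse_rows` and the sample rows are hypothetical, and the sketch assumes a NUL character never occurs in the data so it can serve as a key separator.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): collapse rows that agree on
# every column before $col, joining the values of column $col in the
# order they are encountered. Columns after $col are taken from the
# first row of each group.
sub collapse_rows {
    my ( $rows, $col ) = @_;
    my ( %seen, @out );
    for my $row (@$rows) {
        # Assumption: "\0" never occurs in the data, so it is a safe separator.
        my $key = join "\0", @{$row}[ 0 .. $col - 1 ];
        if ( my $group = $seen{$key} ) {
            $group->[$col] .= ",$row->[$col]";    # append to the existing group
        }
        else {
            push @out, $seen{$key} = [@$row];     # a copy starts a new group
        }
    }
    return \@out;
}

# Made-up sample rows: server, path, permissions, logon name, rest
my $rows = [
    [ 'srv', '/a.pdf', 'RX',  'alice', 'rest' ],
    [ 'srv', '/a.pdf', 'RX',  'bob',   'rest' ],
    [ 'srv', '/a.pdf', 'RWX', 'carol', 'rest' ],
    [ 'srv', '/b.pdf', 'RX',  'alice', 'rest' ],
];
my $collapsed = collapse_rows( $rows, 3 );
print join( '|', @$_ ), "\n" for @$collapsed;
```

Because the groups are kept in an array as well as a hash, the output preserves input order rather than sorting it. The collapsed array of arrays could then be written with `csv( in => $collapsed, out => ... )`, which takes care of quoting fields that contain commas.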

      I'm sorry I didn't make that more clear.

      I'm going to pore over what people have submitted because I have a lot to learn from those techniques. I do want to thank everyone who has provided sample code! I had training all day today, and will have another day tomorrow, and I'll look more closely at it.

      I saw someone used map above, and I failed in using it, and couldn't figure out why it failed, so I want to look over how it's being used there. I could only use it on @ARGV for some reason, and not other arrays. (I had a comment in my code that mentioned the failure)

      Thank you again! I'll be sure to ask some questions if I can't figure out how some of the code functionality works.
      The_DJ, thank you for this. I've learned a lot by slowly poring through this to learn how it works, and realized that making unique keys out of SEVERAL fields together simplifies things greatly! I've run into a problem that I'm not sure how to solve. The entire file gets slurped in at once, but I discovered on my full data set that someone actually created filenames with commas in them, causing the output to come out wrong.

      Reading through the Text::CSV documentation (which apparently uses CSV_XS), I find something about a quote_char, but it defaults to a quote anyway.
      Setting it explicitly:

      my $aoa = csv( in => $filename, quote_char => "\""  );
      also proves ineffective, or rather makes no change. (I think that's the default anyway.)

      The input data shows it as a quoted field, but I'm not sure what I'm doing wrong.

      Input data looks like this:
      10.15.106.71,"/ifs/PH01/PH01SUB/ENTNASIS02/PH02/SMB/Share/Share6/Employee-Share/Contracts/Privacy and Disclosure/Disclosure Unit/Active Contracts/County/Sample County/E00526, E00595 Sample County DA/2017-2020 M4385566/Emails & Correspondence/Welcome Letter.doc",FMRWX,Creator Owner,\ifs\PH01\PH01SUB\ENTNASIS02\PH02\SMB\Share\Share6\Employee-Share\Contracts,This folder only,Abstract\Creator Owner,"US PII (1/1),Document Passwords - 2.0 (1/1),US Social Security Number (1/1),GLBA (Gramm-Leach Bliley Act) (1/1)","Credentials (1),Financial (1),PII (2)",4


      After focusing a bit on the first part of the data (before the magic field), I noticed that the quoted-comma issue had been there the entire time, after the magic field, in every other line containing more than one sensitive data type.

      So somehow I need to figure out how to read anything that is quoted and contains commas as a single field, while the line as a whole is still comma separated.

      Am I missing something too obvious in the docs? Text::CSV#quote_char

        I'm not sure what it is that you're doing which has resulted in abnormal operation but here it is as an SSCCE.

        #!/usr/bin/env perl
        use strict;
        use warnings;

        use Text::CSV_XS;

        my $csv = Text::CSV_XS->new;

        my $line = '10.15.106.71,"/ifs/PH01/PH01SUB/ENTNASIS02/PH02/SMB/Share/Share6/Employee-Share/Contracts/Privacy and Disclosure/Disclosure Unit/Active Contracts/County/Sample County/E00526, E00595 Sample County DA/2017-2020 M4385566/Emails & Correspondence/Welcome Letter.doc",FMRWX,Creator Owner,\ifs\PH01\PH01SUB\ENTNASIS02\PH02\SMB\Share\Share6\Employee-Share\Contracts,This folder only,Abstract\Creator Owner,"US PII (1/1),Document Passwords - 2.0 (1/1),US Social Security Number (1/1),GLBA (Gramm-Leach Bliley Act) (1/1)","Credentials (1),Financial (1),PII (2)",4';

        $csv->parse ($line);
        my @fields = $csv->fields;
        print "Input:\n$line\n\nOutput:\n" . join "\n\n", @fields;

        This outputs the parsed fields separated by empty lines so it should be trivial to see what is contained in each field. The quote characters are honoured as expected. HTH.


        🦛

        It looks like The_DJ's post was reading the data in correctly, but was not using csv to write a correctly quoted CSV file.

        Try this:

        #!/usr/bin/perl

        use strict; # https://perlmonks.org/?node_id=11159049
        use warnings;
        use Text::CSV qw( csv );

        my $data = <<'';
        File Server,Access Path,Current Permissions,Logon Name,Inherited From Folders,Flags,User/Group,Classification Results,Classification Results by Category (Including Nested),Total Hit Count
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,@FOO NOW Onsite Support,\Common,This folder only,Pathway12.My.Corp.com\@FOO NOW Onsite Support,IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,Administrators,\Common,This folder only,10.15.106.71\Administrators,IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,Creator Owner,\Common,This folder only,Abstract\Creator Owner,IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,FP NOW BMG FSE NTFS Admins,\Common,This folder only,Pathway12.My.Corp.com\FP NOW BMG FSE NTFS Admins,IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,ClusterSvcDIR,\Common,This folder only,Pathway12.My.Corp.com\ClusterSvcDIR,IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,SYSTEM,\Common,This folder only,Abstract\SYSTEM,IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,FMRWX,MiJim,<not inherited>,This folder only,"Pathway12.My.Corp.com\Michaels, Jim@My",IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,MRWX,@FP DIR BMG,\Common,This folder only,Pathway12.My.Corp.com\@FP DIR BMG,IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,&CDAdmin,\Common,This folder only,Pathway12.My.Corp.com\&CDAdmin,IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,@FOO DSMS Admins,\Common,This folder only,Pathway12.My.Corp.com\@FOO DSMS Admins,IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,FOO BMG FS Support,\Common,This folder only,Pathway12.My.Corp.com\FOO BMG FS Support,IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,DPeterso,\Common,This folder only,"Pathway12.My.Corp.com\Peterson, Dan@My",IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf,RX,FP BMG IMG Read Access,\Common,This folder only,Pathway12.My.Corp.com\FP BMG IMG Read Access,IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf,FMRWX,@FOO NOW Onsite Support,\Common,This folder only,Pathway12.My.Corp.com\@FOO NOW Onsite Support,IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf,FMRWX,Administrators,\Common,This folder only,10.15.106.71\Administrators,IRS Data (1/1),PII (1),1
        10.15.106.71,/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf,FMRWX,Creator Owner,\Common,This folder only,Abstract\Creator Owner,IRS Data (1/1),PII (1),1

        my %database;
        my $aoa = csv( in => \$data ); # FIXME change to filename
        my @output = shift @$aoa; # the header
        for my $fields ( @$aoa )
          {
          my $ref = \%database;
          $ref = $ref->{$_} //= {} for @$fields;
          }
        combine( \%database ); # combine lines with common beginning
        csv( in => \@output, out => *STDOUT ); # FIXME change to filename

        sub tail
          {
          my $ref = shift;
          my ($key) = sort keys %$ref;
          $key ? ( $key, tail( $ref->{$key} ) ) : ();
          }

        sub combine
          {
          my ($ref, @lines) = @_;
          my @keys = sort keys %$ref;
          if( @keys > 1 and @lines >= 3 )
            {
            my $group = join ',', @keys;
            push @output, [ @lines, $group, tail $ref->{$keys[0]} ];
            }
          else
            {
            combine( $ref->{$_}, @lines, $_ ) for @keys;
            @keys or push @output, \@lines;
            }
          }

        which outputs:

        "File Server","Access Path","Current Permissions","Logon Name","Inherited From Folders",Flags,User/Group,"Classification Results","Classification Results by Category (Including Nested)","Total Hit Count"
        10.15.106.71,"/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf",FMRWX,"@FOO NOW Onsite Support,Administrators,ClusterSvcDIR,Creator Owner,FP NOW BMG FSE NTFS Admins,MiJim,SYSTEM",\Common,"This folder only","Pathway12.My.Corp.com\@FOO NOW Onsite Support","IRS Data (1/1)","PII (1)",1
        10.15.106.71,"/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf",MRWX,"@FP DIR BMG",\Common,"This folder only","Pathway12.My.Corp.com\@FP DIR BMG","IRS Data (1/1)","PII (1)",1
        10.15.106.71,"/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/Axidome Quote My Corp 2020211 KnowBe4 2-yr Renewal FINAL.pdf",RX,"&CDAdmin,@FOO DSMS Admins,DPeterso,FOO BMG FS Support,FP BMG IMG Read Access",\Common,"This folder only",Pathway12.My.Corp.com\&CDAdmin,"IRS Data (1/1)","PII (1)",1
        10.15.106.71,"/Common/Awareness and Training/KnowBe4/2020- KnowBe4 Subscription Renewal Docs/My-B8245.pdf",FMRWX,"@FOO NOW Onsite Support,Administrators,Creator Owner",\Common,"This folder only","Pathway12.My.Corp.com\@FOO NOW Onsite Support","IRS Data (1/1)","PII (1)",1

        This looks like it correctly quotes unchanged fields that contain commas. If it doesn't for you, please post the failed lines and the code you ran to get the failed lines.