Perl Formatting Text

oopl1999 has asked for the wisdom of the Perl Monks concerning the following question:

Re: Perl Formatting Text by Marshall (Canon) on Jun 23, 2016 at 00:30 UTC
Try something like this:

```perl
#!/usr/bin/perl
use warnings;
use strict;

while (my $line = <DATA>)
{
    next if $line =~ /^\s*$/;   #skip blank lines
    my ($label, @rest) = split ' ', $line;
    @rest = sort {$a <=> $b} @rest;   #numeric sort
    foreach my $col (@rest)
    {
        print "$label $col\n";
    }
}

=prints
A 1
B 2
B 3
B 6
C 2
C 3
C 4
C 6
C 10
=cut

__DATA__
A 1
B 3 6 2
C 3 4 6 2 10
```
Re^2: Perl Formatting Text by oopl1999 (Novice) on Jun 23, 2016 at 19:50 UTC
Hi, I am a little confused by your code. Where did you input the data in the code, or put the name of the file that contains the data to sort?
Re^3: Perl Formatting Text by Marshall (Canon) on Jun 23, 2016 at 21:22 UTC
The post that I made is runnable code. If you download it, it will run "as is". I hope that you did that!

Instead of an actual data file, I used the predefined DATA file handle. The data that is being read is right after the `__DATA__` token. This allows me to make a single post that shows the program, the output, and the input data. What Perl does is open the .pl program for read and then "seek" to the beginning of the line right after `__DATA__`. The DATA file handle is initialized by Perl without me doing anything extra. Pretty cool! As trivia, it is possible to "seek" the DATA file handle to the beginning of the file. This would allow a Perl program to actually "read itself".

To make a "real program", you need to put in something like this: `open FILE, '<', "yourfilename" or die "unable to open read file: $!";` Now put FILE everywhere that I used DATA. Note that `die` has to sit outside the quotes; with it inside the string, a failed open would go unreported.

The other "trick" that I used was POD (Plain Old Documentation). Perl has a way of embedding documentation right into the program. There are utilities (such as perldoc and pod2html) that generate nicely formatted documentation and HTML pages from certain markup tags. The "=prints" says that what follows is documentation. The "=cut" says "end of documentation". So everything between and including the =prints and =cut tags is skipped by the compiler because it figures that this is program documentation.

Asking questions if you see something that you don't understand is fine. I can't predict in advance what you know or don't know.
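For readers who want to see the switch from DATA to a real file in one place, here is a minimal sketch, not Marshall's own code. It assumes the same "label value value ..." input as his demo. The sub name `expand_lines` and the placeholder `yourfilename` are made up for illustration, and it uses a lexical file handle (`my $fh`) instead of the bareword FILE. An in-memory file handle stands in for the file so the sketch runs without any input file.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Read "label value value ..." lines from any file handle and return
# one "label value" string per value, with values sorted numerically.
# (expand_lines is a hypothetical name, not from the thread.)
sub expand_lines {
    my ($fh) = @_;
    my @out;
    while (my $line = <$fh>) {
        next if $line =~ /^\s*$/;              # skip blank lines
        my ($label, @rest) = split ' ', $line;
        push @out, "$label $_" for sort { $a <=> $b } @rest;
    }
    return @out;
}

# With a real file it would look like this ('yourfilename' is a placeholder):
#   open my $fh, '<', 'yourfilename' or die "unable to open yourfilename: $!";
#   print "$_\n" for expand_lines($fh);
#   close $fh;

# Self-contained demo: an in-memory file handle stands in for the file.
my $sample = "A 1\n\nB 3 6 2\nC 3 4 6 2 10\n";
open my $fh, '<', \$sample or die "unable to open in-memory sample: $!";
my @lines = expand_lines($fh);
close $fh;

print "$_\n" for @lines;
```

Because the sub takes a file handle as its argument, the same loop works unchanged with DATA, a file on disk, or STDIN.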
Re^4: Perl Formatting Text by Marshall (Canon) on Jun 24, 2016 at 00:28 UTC
Re: Perl Formatting Text by AnomalousMonk (Archbishop) on Jun 23, 2016 at 01:11 UTC
If this is not homework, Text::CSV_XS is your friend. (If this is homework, `Text::CSV_XS` will be your friend once you're out in the Real World.) For your test input file:

```
c:\@Work\Perl\monks\oop11999>perl -e
"use warnings;
 use strict;
 ;;
 use Text::CSV_XS;
 ;;
 use Data::Dump qw(dd);
 ;;
 my $csv = Text::CSV_XS->new ({ sep_char => ' ', })
     or die qq{Cannot use CSV: }, Text::CSV_XS->error_diag;
 ;;
 open my $fh, '<', 'test.csv' or die qq{opening test.csv: $!};
 ;;
 my %letters;
 while (my $row = $csv->getline($fh)) {
     my ($letter, @numbers) = @$row;
     push @{ $letters{$letter} }, @numbers;
     }
 $csv->eof or $csv->error_diag;
 close $fh or die qq{closing test.csv: $!};
 ;;
 dd \%letters;
 ;;
 for my $letter (sort keys %letters) {
     print qq{$letter $_ \n} for @{ $letters{$letter} };
     }
 "
{ A => [1], B => [3, 6, 2], C => [3, 4, 6, 2, 10] }
A 1
B 3
B 6
B 2
C 3
C 4
C 6
C 2
C 10
```

(The `dd \%letters;` statement is just for debug and illustration.)

Give a man a fish: `<%-{-{-{-<`
Re: Perl Formatting Text by NetWallah (Canon) on Jun 23, 2016 at 04:40 UTC
Here is the obligatory one-liner: `perl -an -E '$c=shift@F; say qq|$c $_\n| for sort {$a<=>$b} @F' your-file.txt`

This is not an optical illusion, it just looks like one.
Re: Perl Formatting Text by BillKSmith (Monsignor) on Jun 23, 2016 at 02:32 UTC
I have a lot of questions about your requirements. Can a line have more than one letter? Can the same number appear more than once for the same letter? Can the same letter be repeated anywhere on a line? Or in the file? If the answer to any of these questions is 'yes', what must you do? Is your real data just numbers and letters? If not, how can we tell the difference?

Bill
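The answers to these questions change what the code has to do. As an illustration only, here is a sketch of one possible policy: a label that appears on several lines is merged into one group, a repeated number is kept once, and labels come out in order of first appearance. None of these choices comes from the original poster. They are assumptions, and `group_unique` is a hypothetical name.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Group values by label across all lines and drop repeated values.
# The policy (merge repeated labels, keep each number once, labels in
# first-seen order) is an assumption, not a requirement from the thread.
sub group_unique {
    my (@lines) = @_;
    my (%values, %seen, @order);
    for my $line (@lines) {
        next if $line =~ /^\s*$/;                  # skip blank lines
        my ($label, @rest) = split ' ', $line;
        push @order, $label unless exists $values{$label};
        $values{$label} ||= [];
        # keep a value only the first time it is seen for this label
        push @{ $values{$label} }, grep { !$seen{$label}{$_}++ } @rest;
    }
    return map {
        my $label = $_;
        map { "$label $_" } sort { $a <=> $b } @{ $values{$label} };
    } @order;
}

my @result = group_unique("B 3 6 2", "A 1", "B 6 7");
print "$_\n" for @result;
```

With a different policy (for example, treating a repeated label as an error) only the body of the loop would need to change.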
Re^2: Perl Formatting Text by oopl1999 (Novice) on Jun 23, 2016 at 19:34 UTC
`10GBE_ADDR1 R3629.2 (ANALOG:107) R3633.1 (ANALOG:107) U212.19 (INPUT:107)`

This is an example of one of the lines in the file. I now realize my example probably wasn't the best. I would want the end file to be:

```
10GBE_ADDR1 R3629.2 (ANALOG:107)
10GBE_ADDR1 R3633.1 (ANALOG:107)
10GBE_ADDR1 (ANALOG:107) U212.19
```

And so on for the next lines.
Re^3: Perl Formatting Text by Marshall (Canon) on Jun 23, 2016 at 21:48 UTC
So I see the plot thickens... I made a straightforward modification to the previous code to account for the fact that you have pairs of things instead of single space-separated things in the input data. I am confused by your last example output line, `10GBE_ADDR1 (ANALOG:107) U212.19`. I just assumed that this was a cut-n-paste error? If not, then you have a lot more explaining to do about "what the rules are". I am not sure if this is what you need, but we are incrementally closer...

```perl
#!/usr/bin/perl
use warnings;
use strict;

while (my $line = <DATA>)
{
    next if $line =~ /^\s*$/;   #skip blank lines
    my ($label, @rest) = split ' ', $line;
    my @pairs;
    while (@rest)
    {
        my $first_num_thing = shift @rest;
        my $paren_thing     = shift @rest;
        push @pairs, "$first_num_thing $paren_thing";
    }
    @pairs = sort @pairs;   #may need special sort??
    foreach my $col (@pairs)
    {
        print "$label $col\n";
    }
}

=prints
10GBE_ADDR1 R3629.2 (ANALOG:107)
10GBE_ADDR1 R3633.1 (ANALOG:107)
10GBE_ADDR1 U212.19 (INPUT:107)
=cut

__DATA__
10GBE_ADDR1 R3629.2 (ANALOG:107) R3633.1 (ANALOG:107) U212.19 (INPUT:107)
```

Now of course in your "real" code vs my "demo" code, use something more descriptive than "$paren_thing". I am sure in your actual context that thing has some name or description that makes a lot more sense than that!

I hope that you have read my previous answer to your questions and that this post makes more sense to you now. As with the previous code post, this is "runnable code" as is. What I expect you to do is use my code as a starting point. Play with it. Modify it. I am trying to provide enough info to get you "unstuck". You need to start writing some code yourself.

There are of course other ways to write this code. I attempted to be straightforward and not overly fancy.

Update: Ok, I will demo another technique. If you can understand how both of these programs work, then you are well on your way. Split and "match global" can solve an enormous percentage of file parsing problems.
```perl
#!/usr/bin/perl
use warnings;
use strict;

while (my $line = <DATA>)
{
    next if $line =~ /^\s*$/;   #skip blank lines
    my ($label, $rest) = split ' ', $line, 2;
    (my @pairs) = $rest =~ /(\S+\s+\S+)/g;   #called "match global"
    @pairs = sort @pairs;
    foreach my $col (@pairs)
    {
        print "$label $col\n";
    }
}

=prints
10GBE_ADDR1 R3629.2 (ANALOG:107)
10GBE_ADDR1 R3633.1 (ANALOG:107)
10GBE_ADDR1 U212.19 (INPUT:107)
=cut

__DATA__
10GBE_ADDR1 R3629.2 (ANALOG:107) R3633.1 (ANALOG:107) U212.19 (INPUT:107)
```
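On the "may need special sort??" comment: a plain `sort` compares strings, so R3629.10 would sort before R3629.2, and R3629.2 before R99.1. The following is a hedged sketch of one possible "special sort", assuming every first token looks like letters, then digits, then an optional `.digits` suffix (as in R3629.2). That layout is a guess from the single sample line, and `designator_key` and `sort_pairs` are hypothetical names.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Split the first token of a pair, such as "R3629.2", into
# (prefix, number, pin). Assumed layout: letters, digits, optional
# ".digits". Anything else falls back to (whole token, 0, 0), so it
# is still ordered by its text.
sub designator_key {
    my ($pair) = @_;
    my ($token) = split ' ', $pair;
    return ($1, $2, defined $3 ? $3 : 0)
        if $token =~ /^([A-Za-z]+)(\d+)(?:\.(\d+))?$/;
    return ($token, 0, 0);
}

# Sort pairs by prefix (as text), then number, then pin (both numeric).
# The key is computed once per pair and dropped again after sorting.
sub sort_pairs {
    my @keyed = map { [ $_, designator_key($_) ] } @_;
    return map  { $_->[0] }
           sort {    $a->[1] cmp $b->[1]
                  or $a->[2] <=> $b->[2]
                  or $a->[3] <=> $b->[3] }
           @keyed;
}

my @pairs = (
    'R3629.2 (ANALOG:107)',
    'U212.19 (INPUT:107)',
    'R99.1 (ANALOG:107)',
    'R3629.10 (ANALOG:107)',
);
my @sorted = sort_pairs(@pairs);
print "$_\n" for @sorted;
```

In either of the programs above, `@pairs = sort_pairs(@pairs);` could replace `@pairs = sort @pairs;` if this ordering is what the real data needs.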
Re^4: Perl Formatting Text by oopl1999 (Novice) on Jun 23, 2016 at 23:54 UTC
Re^4: Perl Formatting Text by oopl1999 (Novice) on Jun 24, 2016 at 01:56 UTC
Re^5: Perl Formatting Text by Marshall (Canon) on Jun 24, 2016 at 13:31 UTC
Re^3: Perl Formatting Text by BillKSmith (Monsignor) on Jun 23, 2016 at 20:22 UTC
oopl1999, you answered only one of my seven questions. Someone might guess the rest of the answers correctly and give you a good solution, but you will get more and better solutions if you post the answers to my previous questions. Remember that examples alone cannot tell about conditions that are impossible.

Bill
Re^4: Perl Formatting Text by oopl1999 (Novice) on Jun 23, 2016 at 21:30 UTC