in reply to Perl Formatting Text

I have a lot of questions about your requirements. Can a line have more than one letter? Can the same number appear more than once for the same letter? Can the same letter be repeated anywhere on a line? Or in the file? If the answer to any of these questions is 'yes', what must you do? Is your real data just numbers and letters? If not, how can we tell the difference?
Bill

Re^2: Perl Formatting Text
by oopl1999 (Novice) on Jun 23, 2016 at 19:34 UTC
    10GBE_ADDR1 R3629.2 (ANALOG:107) R3633.1 (ANALOG:107) U212.19 (INPUT:107)

    This is an example of one of the lines in the file. I now realize my example probably wasn't the best.

    I would want the end file to be:

    10GBE_ADDR1 R3629.2 (ANALOG:107)
    10GBE_ADDR1 R3633.1 (ANALOG:107)
    10GBE_ADDR1 (ANALOG:107) U212.19

    And so on for the next lines

      So I see the plot thickens...

      I made a straightforward modification to previous code to account for the fact that you have pairs of things instead of single space separated things in the input data. I am confused by your last example output line 10GBE_ADDR1 (ANALOG:107) U212.19. I just assumed that this was a cut-n-paste error? If not, then you have a lot more explaining to do about "what the rules are".

      I am not sure if this is what you need, but we are incrementally closer...

      #!/usr/bin/perl
      use warnings;
      use strict;

      while (my $line = <DATA>)
      {
          next if $line =~ /^\s*$/;   #skip blank lines
          my ($label, @rest) = split ' ', $line;
          my @pairs;
          while (@rest)
          {
              my $first_num_thing = shift @rest;
              my $paren_thing     = shift @rest;
              push @pairs, "$first_num_thing $paren_thing";
          }
          @pairs = sort @pairs;   #may need special sort??
          foreach my $col (@pairs)
          {
              print "$label $col\n";
          }
      }

      =prints
      10GBE_ADDR1 R3629.2 (ANALOG:107)
      10GBE_ADDR1 R3633.1 (ANALOG:107)
      10GBE_ADDR1 U212.19 (INPUT:107)
      =cut

      __DATA__
      10GBE_ADDR1 R3629.2 (ANALOG:107) R3633.1 (ANALOG:107) U212.19 (INPUT:107)

      Now of course in your "real" code vs my "demo" code, use something more descriptive than "$paren_thing". I am sure in your actual context that thing has some name or description that makes a lot more sense than that!

      I hope that you have read my previous answer to your questions and that this post makes more sense to you now. As with the previous code post, this is "runnable code" as is.

      What I expect you to do is use my code as a starting point. Play with it. Modify it. I am trying to provide enough info to get you "unstuck". You need to start writing some code yourself. There are of course other ways to write this code. I attempted to be straightforward and not overly fancy.

      Update:
      Ok, I will demo another technique. If you can understand how both of these programs work, then you are well on your way. Split and "match global" can solve an enormous percentage of file parsing problems.

      #!/usr/bin/perl
      use warnings;
      use strict;

      while (my $line = <DATA>)
      {
          next if $line =~ /^\s*$/;   #skip blank lines
          my ($label, $rest) = split ' ', $line, 2;
          (my @pairs) = $rest =~ /(\S+\s+\S+)/g;   #called "match global"
          @pairs = sort @pairs;
          foreach my $col (@pairs)
          {
              print "$label $col\n";
          }
      }

      =prints
      10GBE_ADDR1 R3629.2 (ANALOG:107)
      10GBE_ADDR1 R3633.1 (ANALOG:107)
      10GBE_ADDR1 U212.19 (INPUT:107)
      =cut

      __DATA__
      10GBE_ADDR1 R3629.2 (ANALOG:107) R3633.1 (ANALOG:107) U212.19 (INPUT:107)

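      On the "#may need special sort??" comment: a plain sort compares the pairs as strings, so a designator like R10.1 would land before R9.1. If that matters for the real data, a custom comparison is one option. The following is only a sketch: the helper names designator_key, by_designator and sort_pairs are made up here, the R9/R10 sample values are invented for illustration, and it assumes every designator has the shape letters, digits, dot, digits (as R3629.2 and U212.19 do).

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: split "R3629.2 (ANALOG:107)" into (prefix, number, pin).
# ASSUMPTION: a designator is letters, digits, a dot, then a numeric pin.
sub designator_key {
    my ($pair) = @_;
    my ($prefix, $number, $pin) = $pair =~ /^([A-Za-z]+)(\d+)\.(\d+)/
        or return ('', 0, 0);    # unexpected shape: falls back to the string tie-break
    return ($prefix, $number, $pin);
}

# Comparison for sort: letters as strings, number and pin numerically,
# and finally the whole pair as a string so the order is deterministic.
sub by_designator {
    my @ka = designator_key($a);
    my @kb = designator_key($b);
    return $ka[0] cmp $kb[0]
        || $ka[1] <=> $kb[1]
        || $ka[2] <=> $kb[2]
        || $a cmp $b;
}

# Hypothetical wrapper so the sort can be reused and checked on its own.
sub sort_pairs {
    my @pairs = @_;
    return sort by_designator @pairs;
}

# Invented sample values, chosen to show where string and numeric order differ.
my @pairs = (
    'R10.1 (ANALOG:107)',
    'U212.19 (INPUT:107)',
    'R9.2 (ANALOG:107)',
    'R9.1 (ANALOG:107)',
);
print "$_\n" for sort_pairs(@pairs);
```

      With these values the order comes out as R9.1, R9.2, R10.1, U212.19, whereas a plain sort would put R10.1 first. If the pairs in the real file never mix numbers of different lengths, the plain sort in the programs above is enough.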
        Thank you for the help! I will run these programs and play around with them. If I have any more questions I will ask for clarification.

        Hi, I was wondering if you could help with a couple of last things. I have modified the code to take input from a file, and I have also commented everything to make sure I understand it. By the way, I have gone with the first method you provided.

        My first problem is getting the output to a new file rather than to the terminal. I have tried several different methods of this and have failed.

        The second problem is in the data itself. Every once in a while the data may look like this because there are too many of the tags for one id:

        10GBE_ADDR1 R3629.2 (ANALOG:107) R3633.1 (ANALOG:107)

        U212.19 (INPUT:107)

        Basically it will start a new line to write all of the tags. I am not sure how to recognize that there is no id and how to then make the following tags attribute themselves back to the previous id. And here is the code I used.

        open FILE, '<', "golden.rpt" or "die unable to open read file $!";

        while (my $line = <FILE>)
        {
            next if $line =~ /^\s*$/;   #skip blank lines
            next if $line =~ /#/;       #Skip comments

            #split file into a scalar for netname and put all of the reference designators into an array
            my ($netname, @referencedesignators) = split ' ', $line;
            my @singlereference;
            while (@referencedesignators)
            {
                #split array into two scalars one for the letter then number sequence and the other for the analog thing
                my $firstpart  = shift @referencedesignators;
                my $secondpart = shift @referencedesignators;

                #push these two scalars to form a pair. Each pair is one reference designator. These pairs form an array.
                push @singlereference, "$firstpart $secondpart";
            }
            @singlereference = sort {$a <=> $b} @singlereference;   #sort by ascending
            foreach my $col (@singlereference)
            {
                #print the netname along with each column of the array containing the singlereference designators.
                print "$netname $col\n";
            }
        }
        print "done\n";

        And that was a copy paste error.
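        Both problems described above (writing to a file, and tags that spill onto a line with no id) can be sketched as follows. This is one possible approach, not the only one, and it rests on assumptions: a continuation line is recognized by its second field starting with "(", the sub names expand_lines and convert_file are made up, and the output name golden.out is a placeholder. Two details of the posted code are also changed: the die is moved outside the quoted string (inside the quotes it is just text and never runs), and the sort is a string sort, because <=> compares numbers and the pairs are strings.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Turn report lines into "netname designator (type)" lines.
# ASSUMPTION: a line whose second field starts with "(" has no net name of
# its own and continues the most recent net name.
sub expand_lines {
    my @lines = @_;
    my (@order, %pairs_for);
    my $netname;                          # remembered across lines
    foreach my $line (@lines) {
        next if $line =~ /^\s*$/;         # skip blank lines
        next if $line =~ /#/;             # skip comments, as in the posted code
        my @fields = split ' ', $line;
        unless (@fields > 1 && $fields[1] =~ /^\(/) {
            $netname = shift @fields;     # this line starts with a net name
            push @order, $netname unless exists $pairs_for{$netname};
            $pairs_for{$netname} ||= [];
        }
        next unless defined $netname;     # continuation before any net name
        while (@fields >= 2) {
            my $designator = shift @fields;
            my $type       = shift @fields;
            push @{ $pairs_for{$netname} }, "$designator $type";
        }
    }
    my @out;
    foreach my $net (@order) {            # keep nets in first-seen order
        push @out, "$net $_" for sort @{ $pairs_for{$net} };
    }
    return @out;
}

# Read one file and write the expanded lines to another.
sub convert_file {
    my ($in_name, $out_name) = @_;
    open my $in,  '<', $in_name  or die "unable to open $in_name for reading: $!";
    open my $out, '>', $out_name or die "unable to open $out_name for writing: $!";
    print {$out} "$_\n" for expand_lines(<$in>);
    close $in;
    close $out or die "unable to close $out_name: $!";
    return;
}

# "golden.out" is a placeholder name; the -e guard only keeps this sketch
# from dying when the input file is not present.
convert_file('golden.rpt', 'golden.out') if -e 'golden.rpt';
```

        Collecting the pairs per net before printing means a continuation line is sorted together with the rest of its net, rather than on its own. If a net name could ever be followed directly by a field in parentheses, the continuation test would need a different rule.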
      oopl1999, You answered only one of my seven questions. Someone might guess the rest of the answers correctly and give you a good solution, but you will get more and better solutions if you post the answers to my previous questions. Remember that examples alone cannot tell about conditions that are impossible.
      Bill

        A line can have more than one letter

        The number or string will never be repeated.

        The letter may be repeated

        If the letters or numbers are repeated, just sort them the same way (as I did in the second example I provided).

        The data is letters and numbers but not all mixed together not split (as shown by my example).

        Sorry for my ignorance. I'm new to the forum and to Perl as well. I hope you can still help!