in reply to Looking for Perl Elegance!

How to ask questions the smart way.

From this description, I cannot tell much at all about what your input looks like, what needs to be associated with what, and what your output form should be.

Are you asking how to format a grid spacing arrangement in text? Are you asking how to parse or output CSV data? Are you asking how to sort things so the item1a comes under item1? Consider your question from the point of view of someone who is NOT inside your head reading your thoughts.

--
[ e d @ h a l l e y . c c ]

Re^2: Looking for Perl Elegance!
by perlNinny (Beadle) on Jul 11, 2007 at 17:17 UTC
    Well, the input will stream in line by line and will look something like this: They do come in order, but they may not have a title. I need to store each item as it comes, and I won't know when I am done until I get to the end of the stream. There will always be a DATA-A, a DATA-B and a DATA-C, which will be associated with the last TITLE. So I can do this:
    if( $line =~ /TITLE/) {
        $line =~ s/TITLE#(.*)#/$1/;
        chomp($1);
    } else if ( $line =~ /DATA-A/) {
        $line =~ s/DATA-A#(.*)#/$1/;
        chomp($1);
    }
    etc., but that doesn't strike me as an elegant way to do this. The output would be like using a print statement. I was looking at using perlform for the output. To do that, I will need to get all the data first, then figure out the max length of each item, like:
    max length of 'how are you', 'i am fine'
    max length of 'item1', 'item1a', 'item4'
    max length of 'item2', 'item2b', 'item5'
    max length of 'item3', 'item3c', 'item6'
    then use that to format the output. I can't get a good example on the forum here without using HTML, which might be part of the confusion. No CSV, no sorting... just display it on the screen using a print statement, but make it pretty like a table.
    I hope that answers the questions.
    THANKS for the help on this.

      For your table formatting, take a look at Text::Table.
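      If adding a module is not an option, the "find the max length of each column, then format" idea from the question can also be sketched with plain printf. This is only a minimal sketch using core Perl. The @rows data is hypothetical, shaped like the values listed in the question, and the names @width and $format are illustrative.

```perl
use strict;
use warnings;

# Hypothetical records, shaped like the sample values in the question:
# one row per TITLE / DATA-A / DATA-B / DATA-C set.
my @rows = (
    [ 'How are you', 'item1',  'item2',  'item3'  ],
    [ 'How are you', 'item1a', 'item2b', 'item3c' ],
    [ 'I am fine',   'item4',  'item5',  'item6'  ],
);

# Find the widest value in each column.
my @width;
for my $row (@rows) {
    for my $col ( 0 .. $#$row ) {
        my $len = length $row->[$col];
        $width[$col] = $len
            if !defined $width[$col] || $len > $width[$col];
    }
}

# Build a left-justified format, one %-Ns per column.
my $format = join( '  ', map { "%-${_}s" } @width ) . "\n";

# Print every row through the same format so the columns line up.
printf( $format, @$_ ) for @rows;
```

      Text::Table does this bookkeeping for you, so the sketch is mainly useful for seeing what the module saves you from writing.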

      Here's a quick take at some code to extract the data from your files. Take a close look at your regexes: they don't need to be substitutions, and the greedy '.*' will give you wrong values on lines with multiple tags.

      # Assume a FIELD#...# string can't be split across lines.
      use strict;
      use warnings;
      use Data::Dumper;

      my @DATA_FIELDS = (
          'TITLE',
          'DATA-A',
          'DATA-B',
          'DATA-C',
      );

      # Build a regex that matches all fields and extracts a value.
      my $all_fields = join '|', @DATA_FIELDS;
      my $DATA_REGEX = qr/($all_fields)#([^#]*)#/;

      my @data;              # store all tag data
      my $title_data = {};   # reference to current title's data store.

      while (<DATA>) {
          # the /g modifier lets the inner loop find every tag on the line
          while ( /$DATA_REGEX/g ) {
              my $field = $1;
              my $value = $2;
              print "$_ -> $field, $value\n";

              if( $field eq 'TITLE' ) {
                  # store previous title data set if not empty.
                  push @data, $title_data if %$title_data;
                  # start new title data set
                  $title_data = { TITLE => $value };
              }
              else {
                  # store field data in current title data set.
                  push @{ $title_data->{$field} }, $value;
              }
          }
      }

      # store final title data set
      push @data, $title_data if %$title_data;

      print Dumper \@data;

      __DATA__
      asdf asdfg=4eafvasdfadsf
      ashfasdf asdf qer qwer asd as dsasdi weeiwer
      dfhjTITLE#How are you#asdfads asdfa asdg rt wqrqw re
      DATA-A#item1#asdfdasfdasdasDATA-B#item2# asdfda dasfa asdfdas
      DATA-C#item3#
      aasdfDATA-A#item1a#DATA-B#item2b#
      asd asdf asdDATA-C#item3c# asdf asdf3132 adsf
      TITLE#I am fine#ads fadsfdasfdas


      TGI says moo

      I'm not sure what to say to answer your actual questions, but I suspect that
      $line =~ s/DATA-A#(.*)#/$1/;
      may not do what you actually want. It removes "DATA-A#" and the final "#" from the line. So, for instance, given the line
      aasdfDATA-A#item1a#DATA-B#item2b#
      
      from your sample data, it would produce
      aasdfitem1a#DATA-B#item2b
      
      Assuming you're trying to extract the text "item1a" from the line, the regex you want is
      $line =~ /DATA-A#([^#]*)#/;
      which extracts "item1a" into $1 (without destroying the rest of the line, so the DATA-B will still be there to collect later). Using [^#]* instead of .* will cause it to stop capturing at the first # it sees instead of continuing to the last one.
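      The difference between the two patterns is easy to check side by side. A minimal sketch, using the sample line quoted above; the variable names $greedy and $bounded are just for illustration.

```perl
use strict;
use warnings;

# A line taken from the sample data in this thread.
my $line = 'aasdfDATA-A#item1a#DATA-B#item2b#';

# Greedy .* runs on to the last '#' on the line.
my ($greedy) = $line =~ /DATA-A#(.*)#/;

# [^#]* cannot cross a '#', so it stops at the first one after the tag.
my ($bounded) = $line =~ /DATA-A#([^#]*)#/;

print "greedy:  $greedy\n";     # item1a#DATA-B#item2b
print "bounded: $bounded\n";    # item1a
```

      Matching in list context, as here, returns the captures directly, so $line is left untouched for the later DATA-B match.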

      I suppose something like

      my %data = ();
      while ($line =~ /(TITLE|DATA-[ABC])#([^#]*)#/g) {
          $data{$1} = $2;
          # Keys containing a hyphen must be quoted; a bare DATA-A would parse as a subtraction.
          handle_data($data{TITLE}, $data{'DATA-A'}, $data{'DATA-B'}, $data{'DATA-C'})
              if $1 eq 'DATA-C';
      }
      might be what you're looking for here, but I'm not entirely sure. Note the assumption that you identify the end of a data set by the appearance of a DATA-C element. handle_data would then either print the data, store it for later formatting, or whatever else may need to be done with it. Other initialization and/or sanity checking is probably needed unless your input stream is known to be perfect and will never, say, send a DATA-C before all of the other elements have appeared.

      update: Added a forgotten ) in the last code fragment. This code is (obviously) untested. If it breaks, you get to keep both pieces.
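      For anyone who wants to try the idea against a stream, here is a minimal runnable sketch of it. The @lines array is a hypothetical stand-in for the input stream, and this handle_data is a hypothetical stub that just stores each completed set. It keeps the same assumption as above: a DATA-C closes a data set.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the input stream, based on the sample data.
my @lines = (
    'dfhjTITLE#How are you#asdfads',
    'DATA-A#item1#asdfDATA-B#item2# asdfda',
    'DATA-C#item3#',
    'aasdfDATA-A#item1a#DATA-B#item2b#',
    'asdDATA-C#item3c# asdf',
);

my @records;    # completed data sets, in arrival order

# Hypothetical handler: stores a copy of the set for later formatting.
sub handle_data {
    my (@fields) = @_;
    push @records, [@fields];
}

my %data;       # declared outside the loop so the TITLE carries across lines
for my $line (@lines) {
    while ( $line =~ /(TITLE|DATA-[ABC])#([^#]*)#/g ) {
        $data{$1} = $2;
        # A hash slice passes the four values in a fixed order.
        handle_data( @data{ 'TITLE', 'DATA-A', 'DATA-B', 'DATA-C' } )
            if $1 eq 'DATA-C';
    }
}

print join( ' | ', @$_ ), "\n" for @records;
```

      Because %data is never cleared, a set that is missing its DATA-A or DATA-B would silently reuse the previous values; that is one of the sanity checks mentioned above that real input would need.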

      You should use 'elsif' and not 'else if'.
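      A minimal sketch of the conditional from the question rewritten with elsif, also using the match-and-capture form suggested elsewhere in this thread. The input line and the variables $title and $data_a are hypothetical.

```perl
use strict;
use warnings;

my $line = "xxTITLE#How are you#yy\n";    # hypothetical input line
my ( $title, $data_a );

if ( $line =~ /TITLE#([^#]*)#/ ) {
    # Copy the capture; $1 is overwritten by the next successful match.
    $title = $1;
}
elsif ( $line =~ /DATA-A#([^#]*)#/ ) {
    $data_a = $1;
}

print "title: $title\n" if defined $title;
```

      The captured text sits between the # delimiters, so there is no trailing newline to chomp. Note that an if/elsif chain only handles one tag per line, which is why the other replies loop with /g instead.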