
There is nothing wrong with making Desc: a special case for the splitting. I show some code below...

In this special situation, you can just test for /^Desc:/. The technique is to limit the number of things returned from the split, in this case to 2 things. Doing that requires that we take care of one more detail: a chomp() is needed.

When we let split() do its default thing, a chomp() is not needed because the trailing \n will be removed (the default split is on any sequence of whitespace characters: space, \n, \f, \r, \t). If we tell split() to stop working after it has 2 things, then we have to do manually what it would have done to the last thing.
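Here is a minimal sketch of the difference, assuming a made-up Desc: line (the sample text is only for illustration). It shows that the whitespace split drops the newline but breaks the description into words, while the limited split keeps the description whole and leaves the newline attached unless you chomp first.

```perl
use strict;
use warnings;

# Hypothetical input line, not taken from the original data file.
my $line = "Desc: Converts raw logs to CSV\n";

# Same behavior as a bare split on $_: leading whitespace is skipped,
# the line is split on runs of whitespace, and the trailing "\n" vanishes.
my @words = split ' ', $line;

# Limit of 2 without chomp: the second field keeps the trailing newline.
my ($tag_raw, $rest_raw) = split /\s+/, $line, 2;

# Limit of 2 with chomp: the second field is the clean description string.
chomp(my $clean = $line);
my ($tag, $rest) = split /\s+/, $clean, 2;

print scalar(@words), " words from the default split\n";
print "[$rest]\n";
```

The brackets in the last print make a stray newline easy to spot if you swap in $rest_raw.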

I set up %record so that it is a hash of arrays: each value is a reference to an anonymous array of data. That is true even for a single value like the id number. "Same-ness" is a good thing in programming. So, I would do the same for the description string.

Then the question is: what do you do with this description once the record is complete? You could put another dimension on the hash, with id's as the key. However, there is something to be said for keeping things simple. You could just make another hash that is keyed on id's with the string as the value. Some purists might shudder in horror, but again simplicity has virtues!
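A rough sketch of the "just make another hash" idea follows. The tag names 'ID:' and 'Start:', the sample values, and the %desc_for name are my assumptions for illustration; only 'Desc:' and %record come from the code in this post.

```perl
use strict;
use warnings;

# One completed record in the hash-of-arrays layout:
# every value is an array reference, even the single-valued ones.
my %record = (
    'ID:'    => [456],                          # assumed tag name
    'Desc:'  => ['Converts raw logs to CSV'],
    'Start:' => ['a.log', 'b.log'],             # assumed tag name
);

# Hypothetical second hash: id => description string.
my %desc_for;

# Pull the single element out of each anonymous array.
my ($id)   = @{ $record{'ID:'} };
my ($desc) = @{ $record{'Desc:'} };
$desc_for{$id} = $desc;

print "$id: $desc_for{$id}\n";
```

The lookup by id then needs no dereferencing at all, which is the simplicity being argued for.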

# ........ snip
if (my $num = /\[/.../\]/) {
    if (/^Desc:/) {
        chomp;
        my ($desc, $string) = split(/\s+/, $_, 2);
        $record{$desc} = [$string];
        next;
    }
    my ($tag, @values) = split;
    @{$record{$tag}} = @values;
#........ snip

OR....perhaps...

    if (/^Desc:/) {
        chomp;
        my ($desc, $string) = split(/\s+/, $_, 2);
        $record{$desc} = [$string];   # same as @{$record{$desc}} = ($string);
    }
    else {
        my ($tag, @values) = split;
        @{$record{$tag}} = @values;
#....snip...

Re^4: Parsing a file and finding the dependencies in it
by legendx (Acolyte) on Jul 07, 2011 at 13:28 UTC
    I didn't realize or check to see if you could limit what is returned from split. Thanks for that, that works perfectly fine.
    The "flip-flop" implementation that Marshall referred to was new to me as well, so much to learn!
Re^4: Parsing a file and finding the dependencies in it
by legendx (Acolyte) on Jul 07, 2011 at 14:49 UTC

    Can anyone help to explain what this line does?

    print map{" $_\n"}grep{!$seen{$_}++}priorFiles($file);

    I've read the perldoc on the map function and I think the grep{!$seen{$_}++}priorFiles($file) portion extracts unique elements and the priorFiles subroutine returns the "Start" files? Could someone explain it please?

    Also, I have been trying to figure out how I would be able to tell if an "ID" or "Desc" depends on another "ID" or "Desc", such as showing that ID 456 depends on ID 423. That basically entails looking up the input or "Start" files to see where (which "ID") they came from.
      Yes, grep{!$seen{$_}++} just removes duplicates from the list. Perl's grep is a filtering operation and is more powerful than command line grep. It passes an input element to the output if the last statement in the grep block evaluates to "true".

      This grep code checks the truth of the %seen hash entry for $_. The ! makes it a "not". So this is true if we have not seen a value before. The ++ is a post-increment, which happens after the old value has been fetched for the test. If the key does not already exist, Perl creates it and allows the undef initial value to be used in the increment. The resulting value is 1 (0+1). If the key already exists, it just gets incremented.
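To watch the idiom work on its own, here is a small sketch with a made-up list of file names standing in for whatever priorFiles() returns. The first time a name is seen the block is true and the name passes through; every later time the count is already non-zero, so the name is dropped.

```perl
use strict;
use warnings;

# Made-up input with duplicates; order of first appearance is preserved.
my @files = qw(a.txt b.txt a.txt c.txt b.txt);

my %seen;
my @unique = grep { !$seen{$_}++ } @files;

# Same output shape as the line being asked about: one name per line.
print map { " $_\n" } @unique;

# As a side effect, %seen now holds how many times each name occurred.
print "a.txt appeared $seen{'a.txt'} times\n";
```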

      The list returned from priorFiles is every possible file that could have affected a particular output file. It contains dupes because some of the input files will share at least a partial ancestry. priorFiles() is a bit tricky as it calls itself. This is a recursive function and may be a bit mind-bending if you haven't seen one before.
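If the recursion is the hard part, a stripped-down stand-in may help. This is not the priorFiles() from the thread: the %made_from table, the file names, and the prior_files name are all invented for illustration, and the sketch assumes the dependencies contain no cycles.

```perl
use strict;
use warnings;

# Hypothetical dependency table: output file => the files it was built from.
my %made_from = (
    'final.out' => ['mid1.dat', 'mid2.dat'],
    'mid1.dat'  => ['raw.txt'],
    'mid2.dat'  => ['raw.txt'],
);

# Illustrative stand-in for a recursive ancestry walk.
# Returns each direct input followed by that input's own ancestors.
sub prior_files {
    my ($file) = @_;
    my @direct = @{ $made_from{$file} || [] };   # no entry means no inputs
    return map { ($_, prior_files($_)) } @direct;
}

my @all = prior_files('final.out');   # raw.txt shows up twice here

my %seen;
my @unique = grep { !$seen{$_}++ } @all;
print map { " $_\n" } @unique;
```

Both mid files come from raw.txt, so the raw list has a duplicate, which is exactly why the grep filter is applied to the result.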

      I think you are on the way now. Play with the code, insert prints to watch what it does.