
So something like this:

use warnings;
use strict;

my $scheduleData = <<SDATA;
SCHEDULE Unwanted
mu1
mu2
Nancy#mu3
mu4
mu2
END
SCHEDULE Wanted
mu1
mu2
Mindy#mu4
mu4
mu2
END
SDATA

my $agentData = <<ADATA;
Harry
Mindy
Nancy
Orlon
ADATA

my $defsData = <<DDATA;
Mindy#mu4
SCRIPTNAME "doit.cmd"
DESCRIPTION "mu2"
Orlon#mu5
SCRIPTNAME "doit.cmd"
DESCRIPTION "SOMETIMES HAS AGENTNAME Orlon"
Nancy#mu6
SCRIPTNAME "doit.cmd"
DESCRIPTION "mu2"
Harry#mu7
SCRIPTNAME "SOMETIMES HAS AGENTNAME Harry"
DESCRIPTION "mu2"
DDATA

open my $schdIn, '<', \$scheduleData or die "Can't open 'Schedule': $!";
my %schedules;
my $scheduleName;

while (defined(my $line = <$schdIn>)) {
    if ($line =~ /SCHEDULE\s+(\S+)/) {
        $scheduleName = $1;
        next;
    }

    next if $line !~ /(\S+)#(\S+)/;
    ++$schedules{$1}{$scheduleName};
}

open my $agentsIn, '<', \$agentData or die "Can't open 'Agents': $!";
my $agentsList = join '|', map {chomp; qr/\Q$_\E/} <$agentsIn>;
my $agentsMatch = qr|\b($agentsList)\b|;
my $wantedSchedule = 'Wanted';

open my $defsIn, '<', \$defsData or die "Can't open 'Definitions': $!";

while (defined(my $defLine = <$defsIn>)) {
    chomp $defLine;
    next if $defLine !~ /$agentsMatch#/;

    my $lineout = $defLine;
    my $flag = $schedules{$1} ? 'Yes' : 'No';
    my $wanted = $schedules{$1}{$wantedSchedule};

    $wanted = $wanted ? " | $wantedSchedule" : '';
    print "$defLine | $flag$wanted\n";
}

Prints:

Mindy#mu4 | Yes | Wanted
Orlon#mu5 | No
Nancy#mu6 | Yes
Harry#mu7 | No

My eyes don't parse a wall of uppercase text, so I changed the strings to something I could work with. I also removed the irrelevant file opens and the code building %scheduleMatch from an external file. It is important to trim out needless code and errors to focus just on the task at hand (see "I know what I mean. Why don't you?").

The key is to improve the parsing of the schedules data so that we have both the agent name and the schedule name together in the schedules hash. Then the test in the loop processing the defs data becomes trivial.
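As a minimal sketch of what that parsing produces, the following isolates the first loop in a sub and prints the resulting hash. The name parse_schedules is a hypothetical helper, not something from the original scripts, and the sample data is a cut-down version of the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: build { agent => { schedule => count } } from
# schedule text. Assumes every Agent#job line follows a SCHEDULE header.
sub parse_schedules {
    my ($text) = @_;
    my (%schedules, $scheduleName);

    open my $in, '<', \$text or die "Can't open in-memory schedule data: $!";

    while (defined(my $line = <$in>)) {
        if ($line =~ /SCHEDULE\s+(\S+)/) {
            $scheduleName = $1;    # remember the most recent header
            next;
        }

        next if $line !~ /(\S+)#(\S+)/;
        ++$schedules{$1}{$scheduleName};    # agent and schedule stored together
    }

    close $in;
    return \%schedules;
}

my $sampleData = <<'SDATA';
SCHEDULE Unwanted
mu1
Nancy#mu3
END
SCHEDULE Wanted
mu1
Mindy#mu4
END
SDATA

my $schedules = parse_schedules($sampleData);

for my $agent (sort keys %$schedules) {
    my @names = sort keys %{$schedules->{$agent}};
    print "$agent => @names\n";
}

# Prints:
# Mindy => Wanted
# Nancy => Unwanted
```

Because the schedule name is recorded for every agent as the file is read, nothing needs to be known about the definitions data at that point; the lookup happens later.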

Update: remove duplicated lines at the start of the example code (bogus copy and paste :-( ).

Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond

Re^6: nesting loops help?
by shadowfox (Beadle) on Mar 14, 2022 at 02:52 UTC
    Thanks, that looks a lot cleaner. I think I understand everything except the schedule parsing. In your example you're manually setting $wantedSchedule to 'Wanted', but to work as intended the script needs to take that value from, I think, $lineout, which should hold the info gathered from $defsData, to see which schedule name to pull from $scheduleData. At least that's how it looks, but then I don't know what you're testing in the $wanted line above.

    It's late and my laptop is off so I'll take a real look in the morning, not going to figure anything out on my phone with tired eyes.

    I really appreciate the help though!
Re^6: nesting loops help?
by shadowfox (Beadle) on Mar 14, 2022 at 15:25 UTC
    Fresh eyes didn't help much lol. I think perhaps I didn't adequately explain the point of merging the scripts.

    The value printed in $defLine is the full line found in $defsData after finding a matching line in $agentData; that seems OK. The $flag (Yes/No) tells me whether that $defLine is also found in $scheduleData and, if it is, I also need to scroll up and get the last matched regex capture for the schedule name. We can't really scroll up, which is why I was using the lookbehind. It looks like you're putting that in $scheduleName, but since the schedule data is iterated before you go through $defsData, I don't understand the suggestion, as we don't know what we're looking for at that step.

    The result looks OK in your example, but it wouldn't actually work without the static value in $wantedSchedule, which I can't use because that needs to be dynamic data from the above. My example was static in the second script because I didn't know how to get the data out of the first one; that's what I was asking for additional assistance with. I'm not sure I was clear, or maybe I don't understand your proposal.

      You are too focused on the immediate solution and haven't provided enough context for us to test suggestions against. A bit of overview of the big picture problem can save a lot of iterations while we slowly leach enough context out of you to be helpful. Something like the following problem description could help a lot:
      I have a schedule file with named lists of tasks. The task lists include actors involved in the tasks. I also have a file containing a list of actors I'm interested in, and a definitions file matching actors to the jobs they do. I want to generate an output list indicating which actors are used and, if they are used, which schedule they appear in.

      Given that and assuming the sample data from my previous example the following fits the bill:

      . . .

      open my $schdIn, '<', \$scheduleData or die "Can't open 'Schedule': $!";
      my %actorSchedules;
      my $scheduleName;

      while (defined(my $line = <$schdIn>)) {
          if ($line =~ /SCHEDULE\s+(\S+)/) {
              $scheduleName = $1;
              next;
          }

          next if $line !~ /(\S+)#(\S+)/;
          ++$actorSchedules{$1}{$scheduleName};
      }

      open my $agentsIn, '<', \$agentData or die "Can't open 'Agents': $!";
      my $agentsList = join '|', map {chomp; qr/\Q$_\E/} <$agentsIn>;
      my $agentsMatch = qr|\b($agentsList)\b|;
      my $wantedSchedule = 'Wanted';

      open my $defsIn, '<', \$defsData or die "Can't open 'Definitions': $!";

      while (defined(my $defLine = <$defsIn>)) {
          chomp $defLine;
          next if $defLine !~ /$agentsMatch#/;

          my $lineout = $defLine;
          my $tail = $actorSchedules{$1} ? 'Yes' : 'No';
          my @schedules = keys %{$actorSchedules{$1}};

          $tail = join ' | ', $tail, @schedules;
          print "$defLine | $tail\n";
      }

      Prints:

      Mindy#mu4 | Yes | Wanted
      Orlon#mu5 | No
      Nancy#mu6 | Yes | Unwanted
      Harry#mu7 | No

      Of course I don't know just what your big picture goal is so it's likely I haven't hit the mark exactly.

      Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
        I hear what you're saying, I'm just not sure which part I'm not explaining adequately. Perhaps I can explain it better without actual code. If you see something wrong in the logic, let me know.

        Step 1 starts with a list of agents: the source data we've already opened as $agentsIn, which reads from $agentData.
        Harry
        Mindy
        Nancy
        Orlon
        Step 2 looks at every agent in the previous list and prints every matching line from the definitions, based on the regex. It prints the full line because the full line contains information we need to look up in $scheduleData later. In this example it should return Mindy#mu4, Orlon#mu5, Nancy#mu6, Harry#mu7 and Mindy#mu8, because Ted isn't in $agentData but the rest are.
        Ted#mu3
        SCRIPTNAME "doit.cmd"
        DESCRIPTION "mu2"
        Mindy#mu4
        SCRIPTNAME "doit.cmd"
        DESCRIPTION "mu2"
        Orlon#mu5
        SCRIPTNAME "doit.cmd"
        DESCRIPTION "SOMETIMES HAS AGENTNAME Orlon"
        Nancy#mu6
        SCRIPTNAME "doit.cmd"
        DESCRIPTION "mu2"
        Harry#mu7
        SCRIPTNAME "SOMETIMES HAS AGENTNAME Harry"
        DESCRIPTION "mu2"
        Mindy#mu8
        SCRIPTNAME "doit.cmd"
        DESCRIPTION "mu2"
        Step 3 wants to know whether the 5 lines we just found also appear in the $scheduleData we're reading through $schdIn.
        Mindy#mu4 is in the list below, return 'yes' BUT also save this item for next step
        Orlon#mu5 is NOT in the list below, return 'no'.
        Nancy#mu3 is in the list below, return 'yes' BUT also save this item for next step
        Harry#mu7 is NOT in the list below, return 'no'.
        Mindy#mu8 is in the list below, return 'yes' BUT also save this item for next step
        SCHEDULE magicName1
        mu1
        mu2
        Nancy#mu9
        mu4
        mu2
        END
        SCHEDULE magicName2
        mu1
        mu2
        Mindy#mu4
        mu4
        mu2
        END
        SCHEDULE magicName3
        mu1
        mu2
        Nancy#mu3
        mu4
        mu2
        END
        SCHEDULE magicName4
        mu1
        mu2
        Mindy#mu8
        mu4
        mu2
        END
        Step 4 is the tricky part. I've tried a few examples and different ways to explain it, but so far I'm not getting it across. There should be 3 matches in the above; the final step is figuring out the closest magicName above each match in the file. This was the functionality in my second script; my problem was figuring out how to hit on the match and look upward to the last previous match of the other regex while inside the earlier loop. This is what I meant by $wantedSchedule not being static: it's found by storing every hit on SCHEDULE(.*) to learn the magicName, and then using the agent name to determine which one applies to that record.

        use Mindy#mu4 to find magicName2
        use Nancy#mu3 to find magicName3
        use Mindy#mu8 to find magicName4


        The final output would be like this
        Harry#mu7 | No | No
        Mindy#mu4 | Yes | magicName2
        Mindy#mu8 | Yes | magicName4
        Nancy#mu6 | No | No
        Nancy#mu3 | Yes | magicName3
        Orlon#mu5 | No | No
        The second column of Yes/No isn't really needed in the final solution; it was an intermediary step toward the final solution, which turned out to be harder than expected. If it's Yes, there will be a record in the last step; if it's No, the last step will also be No. I wanted to start simple by first determining whether a match is in the file at all, then hit it with a more complex function to find the match and tell me the closest magicName above it. magicName is not always the same distance above a match, so there's no good way to figure it out without storing the magicNames and printing the last one stored when I reach the input match, because the match we search for will always be lower in the record than the magicName we actually want.
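The four steps above can be sketched as a single forward pass over the schedule data followed by a lookup, so nothing ever has to "scroll up". This is only a sketch under stated assumptions: the helper names index_schedules and report are hypothetical, matching is done on the whole Agent#job token rather than on the agent name alone, END is assumed to close a schedule, and agent names are assumed to contain no whitespace. The data is the sample from this post.

```perl
use strict;
use warnings;

# Hypothetical helper: map each full "Agent#job" token to the SCHEDULE
# header most recently seen above it (assumes END closes a schedule).
sub index_schedules {
    my ($text) = @_;
    my (%scheduleFor, $current);

    for my $line (split /\n/, $text) {
        if ($line =~ /^\s*SCHEDULE\s+(\S+)/) {
            $current = $1;          # remember the latest header
        }
        elsif ($line =~ /^\s*END\s*$/) {
            $current = undef;       # outside any schedule
        }
        elsif (defined $current && $line =~ /(\S+#\S+)/) {
            $scheduleFor{$1} = $current;
        }
    }

    return \%scheduleFor;
}

# Hypothetical helper: one report line per definitions line that names a
# known agent. Assumes agent names are whitespace-separated tokens.
sub report {
    my ($agentsText, $defsText, $scheduleFor) = @_;
    my @agents = grep { length } split /\s+/, $agentsText;
    return if !@agents;

    my $alternation = join '|', map { quotemeta } @agents;
    my $agentToken  = qr/\b((?:$alternation)#\S+)/;
    my @lines;

    for my $defLine (split /\n/, $defsText) {
        next if $defLine !~ $agentToken;

        my $key      = $1;
        my $schedule = $scheduleFor->{$key};

        push @lines, defined $schedule
            ? "$key | Yes | $schedule"
            : "$key | No | No";
    }

    return @lines;
}

my $agentData = "Harry\nMindy\nNancy\nOrlon\n";

my $defsData = <<'DDATA';
Ted#mu3
SCRIPTNAME "doit.cmd"
Mindy#mu4
SCRIPTNAME "doit.cmd"
Orlon#mu5
DESCRIPTION "SOMETIMES HAS AGENTNAME Orlon"
Nancy#mu6
SCRIPTNAME "doit.cmd"
Harry#mu7
SCRIPTNAME "SOMETIMES HAS AGENTNAME Harry"
Mindy#mu8
SCRIPTNAME "doit.cmd"
DDATA

my $scheduleData = <<'SDATA';
SCHEDULE magicName1
mu1
Nancy#mu9
END
SCHEDULE magicName2
mu1
Mindy#mu4
END
SCHEDULE magicName3
Nancy#mu3
mu2
END
SCHEDULE magicName4
Mindy#mu8
END
SDATA

my $scheduleFor = index_schedules($scheduleData);
print "$_\n" for sort(report($agentData, $defsData, $scheduleFor));

# Prints:
# Harry#mu7 | No | No
# Mindy#mu4 | Yes | magicName2
# Mindy#mu8 | Yes | magicName4
# Nancy#mu6 | No | No
# Orlon#mu5 | No | No
```

With the sample data as posted, Nancy#mu3 does not appear in the definitions (only Nancy#mu6 does), so no line is produced for it. If the match should be on the agent name alone, the hash would need to be keyed by agent and hold a list of schedule names instead.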