in reply to Re^3: nesting loops help?
in thread nesting loops help?

use warnings;
use strict;

open my $schedule, '<', 'Schedule';
my %schedule;
$schedule{$_} = 1 while (<$schedule>);
close $schedule;

open my $wave, '>', 'Wave' or die "Can't open 'Wave': $!";
open my $keywords, '<', 'Agents' or die "Can't open 'Agents': $!";
open my $search_file, '<', 'Definitions' or die "Can't open 'Definitions': $!";

my $scheduleData = <<SDATA;
SCHEDULE OTHER_NAME
DONTCARE
NOTIMPORTANT
:
MNDJWIEL#DIFFERENTDATA
OTHERDATA
NOTIMPORTANT
END
SCHEDULE NAME_I_WANT
DONTCARE
NOTIMPORTANT
:
MNDJWIEL#OTHERDATA
OTHERDATA
NOTIMPORTANT
END
SDATA

my $agentData = <<ADATA;
HSJEKDIE
MNDJWIEL
NSKQI
OIFNHDU
H3KID
ADATA

my $defsData = <<DDATA;
MNDJWIEL#OTHERDATA
SCRIPTNAME "JKASDHAJSDHAKJDAS.cmd"
DESCRIPTION "NOTIMPORTANT"
OIFNHDU#UNIMPORTANT
SCRIPTNAME "JKASDHAJSDHAKJDAS.cmd"
DESCRIPTION "SOMETIMES HAS AGENTNAME OIFNHDU"
NSKQI#SOMETHINGHERE
SCRIPTNAME "JKASDHAJSDHAKJDAS.cmd"
DESCRIPTION "NOTIMPORTANT"
HSJEKDIE#DOESNTMATTER
SCRIPTNAME "SOMETIMES HAS AGENTNAME HSJEKDIE"
DESCRIPTION "NOTIMPORTANT"
DDATA

open my $schdIn, '<', \$scheduleData or die "Can't open 'Schedule': $!";
my %scheduleMatch = map {chomp; $_ => 1} <$schdIn>;

open my $agentsIn, '<', \$agentData or die "Can't open 'Agents': $!";
open my $defsIn, '<', \$defsData or die "Can't open 'Definitions': $!";

my $agentsList = join '|', map {chomp; qr/\Q$_\E/} <$agentsIn>;
my $agentsMatch = qr|\b($agentsList)\b|;

while (defined(my $defLine = <$defsIn>)) {
    chomp $defLine;

    if ($defLine =~ $agentsMatch && $defLine !~ /(SCRIPTNAME|DESCRIPTION)/) {
        my $lineout = $defLine;
        my $flag = $scheduleMatch{$defLine} ? 'Yes' : 'No';
        print "$defLine | $flag\n";
    }
}

Which prints the following, because only one of the matching lines from DDATA also appears in SDATA:

MNDJWIEL#OTHERDATA | Yes
OIFNHDU#UNIMPORTANT | No
NSKQI#SOMETHINGHERE | No
HSJEKDIE#DOESNTMATTER | No

From this point I'd like to use the look-behind method to turn the output into this:

MNDJWIEL#OTHERDATA | Yes | NAME_I_WANT
OIFNHDU#UNIMPORTANT | No | Missing
NSKQI#SOMETHINGHERE | No | Missing
HSJEKDIE#DOESNTMATTER | No | Missing

Which I can do via two separate scripts, but obviously it would be more efficient to join the functionality.

use warnings;
use strict;

my $scheduleData = <<SDATA;
SCHEDULE OTHER_NAME
DONTCARE
NOTIMPORTANT
:
MNDJWIEL#DIFFERENTDATA
OTHERDATA
NOTIMPORTANT
END
SCHEDULE NAME_I_WANT
DONTCARE
NOTIMPORTANT
:
MNDJWIEL#OTHERDATA
OTHERDATA
NOTIMPORTANT
END
SDATA

my $defLine = 'MNDJWIEL#OTHERDATA';

open my $schdIn, '<', \$scheduleData or die "Can't open 'Schedule': $!";
my @values = ();

while (<$schdIn>) {
    if (/(^SCHEDULE)(.*)/) {    # SCHEDULE SCHEDULENAME, I only want the name.
        push(@values, $2);
    }
    if (/$defLine/) {
        print $values[-1];
        last;
    }
}

Which prints:
NAME_I_WANT
I will say that at first I didn't see the point in adding sample data, but I kind of get it now: seeing what I'm trying to get from the data helps convey the desired functionality, which I may not be good at explaining.

Re^5: nesting loops help?
by GrandFather (Saint) on Mar 13, 2022 at 23:01 UTC

    So something like this:

    use warnings;
    use strict;

    my $scheduleData = <<SDATA;
    SCHEDULE Unwanted
    mu1
    mu2
    Nancy#mu3
    mu4
    mu2
    END
    SCHEDULE Wanted
    mu1
    mu2
    Mindy#mu4
    mu4
    mu2
    END
    SDATA

    my $agentData = <<ADATA;
    Harry
    Mindy
    Nancy
    Orlon
    ADATA

    my $defsData = <<DDATA;
    Mindy#mu4
    SCRIPTNAME "doit.cmd"
    DESCRIPTION "mu2"
    Orlon#mu5
    SCRIPTNAME "doit.cmd"
    DESCRIPTION "SOMETIMES HAS AGENTNAME Orlon"
    Nancy#mu6
    SCRIPTNAME "doit.cmd"
    DESCRIPTION "mu2"
    Harry#mu7
    SCRIPTNAME "SOMETIMES HAS AGENTNAME Harry"
    DESCRIPTION "mu2"
    DDATA

    open my $schdIn, '<', \$scheduleData or die "Can't open 'Schedule': $!";
    my %schedules;
    my $scheduleName;

    while (defined(my $line = <$schdIn>)) {
        if ($line =~ /SCHEDULE\s+(\S+)/) {
            $scheduleName = $1;
            next;
        }

        next if $line !~ /(\S+)#(\S+)/;
        ++$schedules{$1}{$scheduleName};
    }

    open my $agentsIn, '<', \$agentData or die "Can't open 'Agents': $!";
    my $agentsList = join '|', map {chomp; qr/\Q$_\E/} <$agentsIn>;
    my $agentsMatch = qr|\b($agentsList)\b|;
    my $wantedSchedule = 'Wanted';

    open my $defsIn, '<', \$defsData or die "Can't open 'Definitions': $!";

    while (defined(my $defLine = <$defsIn>)) {
        chomp $defLine;
        next if $defLine !~ /$agentsMatch#/;

        my $lineout = $defLine;
        my $flag = $schedules{$1} ? 'Yes' : 'No';
        my $wanted = $schedules{$1}{$wantedSchedule};

        $wanted = $wanted ? " | $wantedSchedule" : '';
        print "$defLine | $flag$wanted\n";
    }

    Prints:

    Mindy#mu4 | Yes | Wanted
    Orlon#mu5 | No
    Nancy#mu6 | Yes
    Harry#mu7 | No

    My eyes don't parse a wall of uppercase text so I changed strings to something I could work with. I also removed the irrelevant file opens and the code building %scheduleMatch from an external file. It is important to trim out needless code and errors to focus just on the task at hand (see I know what I mean. Why don't you?).

    The key is to improve the parsing of the schedules data so that we have both the agent name and the schedule name together in the schedules hash. Then the test in the loop processing the defs data becomes trivial.
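
    To make the shape of that hash concrete, here is a minimal sketch of the parsing step on its own. It assumes the schedule data has one token per line, as in the sample data; `parse_schedules` is a hypothetical helper name used only for illustration (the code above does the same work inline).

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper (not part of the code above): read schedule lines
    # and return a hash of agent name => { schedule name => count }.
    sub parse_schedules {
        my @lines = @_;
        my (%schedules, $scheduleName);

        for my $line (@lines) {
            # A SCHEDULE line names the block the following lines belong to.
            if ($line =~ /SCHEDULE\s+(\S+)/) {
                $scheduleName = $1;
                next;
            }

            # Only agent#job lines inside a named schedule are of interest.
            next unless defined $scheduleName && $line =~ /(\S+)#(\S+)/;
            ++$schedules{$1}{$scheduleName};
        }

        return \%schedules;
    }

    my $schedules = parse_schedules(
        'SCHEDULE Unwanted', 'mu1', 'mu2', 'Nancy#mu3', 'mu4', 'mu2', 'END',
        'SCHEDULE Wanted',   'mu1', 'mu2', 'Mindy#mu4', 'mu4', 'mu2', 'END',
    );

    # Show which schedules each agent appears in.
    for my $agent (sort keys %$schedules) {
        print "$agent: ", join(', ', sort keys %{$schedules->{$agent}}), "\n";
    }
    ```

    Because the schedule name is recorded while the schedule data is read, the later loop over the definitions only needs a hash lookup on the agent name.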

    Update: remove duplicated lines at the start of the example code (bogus copy and paste :-( ).

    Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
      Thanks, that looks a lot cleaner. I think I understand everything except the schedule parsing. In your example you're manually setting $wantedSchedule to 'Wanted', but to work as intended the script needs to take the variable output (I think $lineout, which should be the info it gathered from $defsData) and use it to decide which schedule name to pull from $scheduleData. At least that's how it looks to me, but then I don't know what you're testing in the $wanted line above.

      It's late and my laptop is off so I'll take a real look in the morning, not going to figure anything out on my phone with tired eyes.

      I really appreciate the help though!
      Fresh eyes didn't help much lol. I think perhaps I didn't explain the point of merging the scripts adequately.

      The value that's printed in $defLine is the full line found in $defsData after finding a matching line in $agentData; that seems OK. The $flag (Yes/No) should tell me whether that $defLine is also found in $scheduleData and, if it is, also scroll up and tell me the last matched regex capture for the schedule name. We can't really scroll up, which is why I was using the look behind. It looks like you're putting that in $scheduleName, but since the schedule data is iterated before you go through $defsData, I don't understand the code suggestion, as we don't know what we're looking for at that step.

      The result looks OK in your example, but it wouldn't actually work without setting the static value in $wantedSchedule, which I can't do because that needs to be dynamic data from the above. My example was static in the second script because I didn't know how to get the data out of the first one, and that's what I was asking for additional assistance with. I'm not sure I was clear, or maybe I don't understand your proposal.

        You are too focused on the immediate solution and haven't provided enough context for us to test suggestions against. A bit of overview of the big picture problem can save a lot of iterations while we slowly leach enough context out of you to be helpful. Something like the following problem description could help a lot:
        I have a schedule file with named lists of tasks. The task lists include actors involved in the tasks. I also have a file containing a list of actors I'm interested in, and a definitions file matching actors to the jobs they do. I want to generate an output list indicating which actors are used and, if they are used, which schedule they appear in.

        Given that and assuming the sample data from my previous example the following fits the bill:

        . . .

        open my $schdIn, '<', \$scheduleData or die "Can't open 'Schedule': $!";
        my %actorSchedules;
        my $scheduleName;

        while (defined(my $line = <$schdIn>)) {
            if ($line =~ /SCHEDULE\s+(\S+)/) {
                $scheduleName = $1;
                next;
            }

            next if $line !~ /(\S+)#(\S+)/;
            ++$actorSchedules{$1}{$scheduleName};
        }

        open my $agentsIn, '<', \$agentData or die "Can't open 'Agents': $!";
        my $agentsList = join '|', map {chomp; qr/\Q$_\E/} <$agentsIn>;
        my $agentsMatch = qr|\b($agentsList)\b|;
        my $wantedSchedule = 'Wanted';

        open my $defsIn, '<', \$defsData or die "Can't open 'Definitions': $!";

        while (defined(my $defLine = <$defsIn>)) {
            chomp $defLine;
            next if $defLine !~ /$agentsMatch#/;

            my $lineout = $defLine;
            my $tail = $actorSchedules{$1} ? 'Yes' : 'No';
            my @schedules = keys %{$actorSchedules{$1}};

            $tail = join ' | ', $tail, @schedules;
            print "$defLine | $tail\n";
        }

        Prints:

        Mindy#mu4 | Yes | Wanted
        Orlon#mu5 | No
        Nancy#mu6 | Yes | Unwanted
        Harry#mu7 | No

        Of course I don't know just what your big picture goal is so it's likely I haven't hit the mark exactly.
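
        If the goal is instead to match the whole agent#job line from the definitions file, as the output requested earlier in the thread suggests, one possible variation keys the hash on the full job string rather than the agent name. This is only a sketch under that assumption: `schedules_by_job` and `report` are hypothetical helper names, the data is assumed to have one token per line, and the 'Missing' column follows the format asked for earlier.

        ```perl
        use strict;
        use warnings;

        # Hypothetical helper: map each full "agent#job" line to the
        # schedule(s) it appears under.
        sub schedules_by_job {
            my @lines = @_;
            my (%jobSchedules, $scheduleName);

            for my $line (@lines) {
                if ($line =~ /^SCHEDULE\s+(\S+)/) {
                    $scheduleName = $1;
                    next;
                }
                next unless defined $scheduleName && $line =~ /^(\S+#\S+)$/;
                push @{$jobSchedules{$1}}, $scheduleName;
            }

            return \%jobSchedules;
        }

        # Hypothetical helper: build one report line per definition line
        # that starts with a wanted agent name followed by '#'.
        sub report {
            my ($jobSchedules, $agents, $defLines) = @_;
            my $agentsList = join '|', map { quotemeta($_) } @$agents;
            my @out;

            for my $defLine (@$defLines) {
                next unless $defLine =~ /^(?:$agentsList)#/;
                my $found = $jobSchedules->{$defLine};
                push @out, $found
                    ? join(' | ', $defLine, 'Yes', @$found)
                    : "$defLine | No | Missing";
            }

            return @out;
        }

        my $jobs = schedules_by_job(
            'SCHEDULE Unwanted', 'mu1', 'mu2', 'Nancy#mu3', 'mu4', 'mu2', 'END',
            'SCHEDULE Wanted',   'mu1', 'mu2', 'Mindy#mu4', 'mu4', 'mu2', 'END',
        );
        my @defs = (
            'Mindy#mu4', 'SCRIPTNAME "doit.cmd"', 'DESCRIPTION "mu2"',
            'Orlon#mu5', 'SCRIPTNAME "doit.cmd"',
            'DESCRIPTION "SOMETIMES HAS AGENTNAME Orlon"',
            'Nancy#mu6', 'SCRIPTNAME "doit.cmd"', 'DESCRIPTION "mu2"',
        );
        print "$_\n" for report($jobs, [qw(Harry Mindy Nancy Orlon)], \@defs);
        ```

        With this keying, Nancy#mu6 is reported as No because the schedule data only contains Nancy#mu3, whereas the agent-keyed version above reports Yes for any job belonging to Nancy.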
