Re: partial matching of lines in perl

Re^2: partial matching of lines in perl by choroba (Cardinal) on Jun 12, 2020 at 12:37 UTC
Note that @a2 and $a2 are unrelated variables. Use strict and warnings to catch similar kinds of errors. `map{substr$_->[0],$_->[1]\|\|0,1}[\\|\|{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^ARGV,3]`
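To make the point concrete, here is a minimal sketch showing that `@a2` and `$a2` are separate variables, and that `strict` turns the use of an undeclared scalar into a compile-time error. The data and the `@b2`/`$b2` names are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Illustrative data only; not taken from the original post.
my @a2 = ("foo\n", "bar\n");   # the array @a2
my $a2 = "a scalar";           # a separate, unrelated scalar $a2

# $a2[0] is an element of the ARRAY @a2; the scalar $a2 is not involved.
my $first = $a2[0];

# Under strict, using a scalar that was never declared fails at compile time.
# Here @b2 is declared but $b2 is not, so the string eval does not compile.
my $ok  = eval q{ my @b2 = (1); $b2; 1 };
my $err = $@;
print "strict caught it: $err" unless $ok;
```

Without `use strict`, the stray `$b2` would silently be treated as an empty package variable, which is the kind of error that is easy to miss in a nested loop.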
Re^2: partial matching of lines in perl by AnomalousMonk (Archbishop) on Jun 12, 2020 at 15:29 UTC
`foreach $a1(@a1){ while(@a2 = <F2>) { ... } }`

Note also that with the block structure quoted above, the `F2` filehandle in the nested `while`-loop will be "exhausted" after handling the first item in the `foreach`-loop. On every later pass, `<F2>` in list context returns an empty list, so `@a2` is left empty and the `while` condition is false at once. To be useful, the `F2` filehandle would need to be rewound after each pass through the `while`-loop; see `seek`. (bliako has already made this point.)

A marginally better loop nesting structure would be

`while (my $line2 = <F2>) { foreach my $line1 (@a1) { print $line1 if line2_appears_in_line1($line2, $line1); } }`

However, this approach still requires a complete pass through `@a1` for every line in the `F2` file, i.e., it's still O(n1 n2).

Another point. If you want to find out if a line from `file2` (always remember to chomp this line!) is exactly present within a line from `file1`, the comparison should be `if ($line1 =~ /\Q$line2_chomped\E/) { ... }` or better still (because simpler and faster) `if (index($line1, $line2_chomped) >= 0) { ... }`. See `index`. (`index` is more appropriate here because no real regex matching seems needed, only an exact substring match.) Your code seems to have this relationship wackbards.

A more advanced point. If file `file2` is small enough, the technique described in haukex's Building Regex Alternations Dynamically article could be used to build a single regex that could be matched against each line of file `file1` to determine which of these lines were to be printed. This approach would require only a single pass through each file, i.e., it will be O(n), but will fail if `file2` is much more than several hundred (or perhaps several thousand; YMMV) lines. This approach is capable of handling an unlimited number of lines in `file1`, however.

Update: Minor wording and spelling corrections.

Give a man a fish: `<%-{-{-{-<`
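The two matching strategies discussed in this reply can be sketched side by side. This is only an illustration under assumptions: the arrays stand in for the contents of `file1` and `file2`, the sub names `match_with_index` and `match_with_regex` are hypothetical, and each matching line is reported once (the nested loop stops at the first hit), which may differ from what the original script intended.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the lines of file1 and file2.
my @file1 = ("apple pie\n", "banana split\n", "cherry tart\n");
my @file2 = ("banana\n", "tart\n");

# Approach 1: nested loops with index(); O(n1 * n2) comparisons.
sub match_with_index {
    my ($lines1, $lines2) = @_;
    my @needles = @$lines2;
    chomp @needles;                              # always chomp the search strings
    @needles = grep { length $_ } @needles;      # an empty string would match every line
    my @hits;
    for my $line1 (@$lines1) {
        for my $needle (@needles) {
            if (index($line1, $needle) >= 0) {   # exact substring test, no regex
                push @hits, $line1;
                last;                            # report each line only once
            }
        }
    }
    return @hits;
}

# Approach 2: one alternation regex built from all of file2,
# then a single pass over file1.
sub match_with_regex {
    my ($lines1, $lines2) = @_;
    my @needles = @$lines2;
    chomp @needles;
    @needles = grep { length $_ } @needles;
    return () unless @needles;                   # an empty pattern would match everything
    my $alternation = join '|', map { quotemeta($_) } @needles;
    my $re = qr/$alternation/;
    return grep { $_ =~ $re } @$lines1;
}

print match_with_index(\@file1, \@file2);
print match_with_regex(\@file1, \@file2);
```

`quotemeta` plays the same role as `\Q...\E` in the inline comparison: it keeps characters such as `.` or `*` in the search lines from being treated as regex syntax.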
Re^2: partial matching of lines in perl by bliako (Abbot) on Jun 12, 2020 at 13:45 UTC
It seems it compares only the first line of `FH` with all lines from `F2`. You must either rewind `F2` each time the `while(@a2 = <F2>){}` loop exits, using `seek F2, 0, SEEK_SET;` (with `use Fcntl qw(SEEK_SET);` to import the constant), or read `@a2 = <F2>` once outside both loops, just like `@a1=<FH>`.
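Both fixes can be sketched as follows. To keep the example self-contained it reads from in-memory filehandles with made-up contents instead of real files, and it uses lexical handles (`$fh1`, `$fh2`) in place of the bareword `FH` and `F2`; those names and the data are assumptions for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Made-up contents, opened as in-memory files so the sketch runs anywhere.
my $data1 = "apple pie\nbanana split\n";
my $data2 = "banana\napple\n";
open my $fh1, '<', \$data1 or die "Cannot open in-memory file1: $!";
open my $fh2, '<', \$data2 or die "Cannot open in-memory file2: $!";

my @a1 = <$fh1>;

# Option 1: rewind the second handle after each full read of it.
my @seen;
for my $line1 (@a1) {
    while (defined(my $line2 = <$fh2>)) {
        chomp $line2;
        push @seen, "$line2 in $line1" if index($line1, $line2) >= 0;
    }
    seek($fh2, 0, SEEK_SET) or die "Cannot rewind: $!";   # back to the start
}

# Option 2: slurp the second file once, outside both loops.
# (The handle is at the start again because of the last seek above.)
my @a2 = <$fh2>;
chomp @a2;
my @seen_again;
for my $line1 (@a1) {
    for my $line2 (@a2) {
        push @seen_again, "$line2 in $line1" if index($line1, $line2) >= 0;
    }
}

print @seen;
```

Option 2 avoids re-reading the file on every outer iteration, at the cost of holding all of the second file in memory.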