in reply to Re^9: Immediately writing the results of search-and-replace
in thread Immediately writing the results of search-and-replace

Protects csv file, and a couple of other minor tweaks.

#!/usr/bin/perl use strict; # https://perlmonks.org/?node_id=11145971 use warnings; use List::AllUtils qw( nsort_by uniq ); use Path::Tiny; use Text::CSV qw( csv ); my $dir = path( '/tmp/dir.1114597' ); mkdir $dir; my $csv = "$dir/11145971,csv"; path( $csv )->spew( <<END ); whitelist entries,Noun,access list entries IP whitelist,Noun,IP access list IP whitelist entries,Noun,IP access list entries whitelist,Adjective,allow whitelist,Noun,access list whitelist,Noun,access-list whitelist,Verb,allow your whitelist,Noun,your access list your whitelist,Noun,my permission list your whitelist,Noun,thy permission list your whitelist,Noun,his permission list your whitelist,Noun,her permission list your whitelist,Noun,its permission list your whitelist,Noun,our permission list your whitelist,Noun,your permission list your whitelist,Noun,their permission list END $csv = $ARGV[0] // $csv; # use ARG if present my $testfile = path( "$dir/something.txt"); $testfile->spew( <<END ); a. this is some whitelist test b. for IP whitelist testing and your whitelist c. line with no replacement d. with whitelist and IP whitelist entries and IP whitelist entries ag +ain. e. this is another whitelist test END my ($old, $new, $reset) = map "\e[${_}m", 91, 92, 0; my %table; push @{ $table{ lc $_->[0] } }, $_->[2] for @{ csv( in => "$csv" ) }; my $match = qr/(@{[ join '|', map quotemeta($_), nsort_by { -length } keys %table ]})/i; #use Data::Dump 'dd'; dd \%table, $match; path( $ARGV[1] // $dir )->visit( sub # NOTE replaces File::Find { -f $_ && /\.(?:txt|rst|yaml)$/ and search_and_replace($_); }, { recurse => 1 } ); print "\n", $testfile->slurp; # FIXME for debugging sub search_and_replace { my ($target_file) = @_; my $pos = 0; # NOTE replaces whole file for each change my $more = 1; while( $more ) { $target_file->edit( sub { pos($_) = $pos; if( /$match/g ) { my ( $was, $where, $pre, $post ) = ( $1, $-[1], $`, $'); print "\n", $pre =~ s/^.*\n(?=.*\n)//sr, "$old$was$reset", $post =~ s/\n.*?\n\K.*//sr, "\n"; my $replace = ask( $was ); $was eq $replace or substr $_, $where, length $was, $replace; $pos = $where + length $replace; } else { $more = 0 } } ); } } sub ask { my ($was) = @_; my @choices = uniq @{ $table{ lc $was } }; local $| = 1; if( @choices > 1 ) { my $n = 1; printf "%8d. $new%s$reset\n", $n++, $_ for @choices; print " replace '$old$was$reset' with ${new}above pick$reset ( +or 0 to not change) : "; my $pick = <STDIN> =~ tr/0-9//cdr || 0; 0 <= $pick && $pick <= @choices or $pick = 0; return ($was, @choices)[$pick]; } else { print " replace '$old$was$reset' with '$new@choices$reset' ? y +es/no : "; return <STDIN> =~ /y/i ? $choices[0] : $was; } }

Replies are listed 'Best First'.
Re^11: Immediately writing the results of search-and-replace
by elemkeh (Acolyte) on Aug 09, 2022 at 14:33 UTC
    Bloody brilliant! Thank you for all your help. This whole thread has given me a lot of good material to study to evolve my Perl abilities. After I've had time to digest it, I think I will try to extend this tool to do similar work on filenames using what I've learned.
Re^11: Immediately writing the results of search-and-replace
by elemkeh (Acolyte) on Aug 12, 2022 at 03:56 UTC
    After looking around for a while and poking at the code myself, I still don't understand just what is going on with this line:
    my ($old, $new, $reset) = map "\e[${_}m", 91, 92, 0;
    Beyond the fact that this is variable assignment, I don't know what to make of it. It doesn't appear to be a common Perl trick, or if it is I have yet to see another example of it to compare against. Can anyone explain? Thank you.
      my ($old, $new, $reset) = map "\e[${_}m", 91, 92, 0;

      Left-hand side: Three variables, $old, $new, $reset.

      Right-hand side: A map expression iterating over the three values 91, 92, and 0. map evaluates the expression "\e[${_}m" for each of the three values, resulting in the list "\e[91m", "\e[92m", "\e[0m".

      Finally, the three element RHS list is assigned to the LHS list of variables.

      The values are ANSI terminal escape sequences for setting text attributes, see https://en.wikipedia.org/wiki/ANSI_escape_code#SGR. 91 selects bright red foreground, 92 selects bright green foreground, 0 resets back to defaults.

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
      my ($old, $new, $reset) = map "\e[${_}m", 91, 92, 0;

      ... doesn't appear to be a common Perl trick ...

      It's a very common (and very valuable) Perl "trick". See map and its cousin grep, and see List::Util for many useful functions inspired by the basic behavior of these two built-ins (i.e., iterate over a list and produce another list). As for other examples of the use of this type of function... well, keep your eyes open and I think you'll start to see quite a few, especially in code on this and similar websites. See also Map: The Basics in Tutorials.

      Update: Per a /msg from hippo, double-quote "trick" to emphasize that it's not really a trick, but common usage. Also, another minor wording change for clarity.


      Give a man a fish:  <%-{-{-{-<

        Thank you. It's not the behavior of map or grep that was/is mysterious to me, but the particular use of "\e[${_}m", which looks like some manner of regex (which it isn't), and I couldn't figure out what it was supposed to be doing to the numbers. I got hung up on the presence of an open square bracket but no closing square bracket, and the ${_} which looks like some sort of default variable distinct from $_, but for which I could find no definition. If I understand right from afoken's comment, then ${_} substitutes in the values from the provided list, which is what I would have expected $_ to do in a map. Is there anywhere the ${_} form is documented so I could read up on it? I had read the various docs you pointed to and didn't find it there. If it is not a form of its own, why use the curly braces instead of just $_?