multiple-pass search?

propellerhat has asked for the wisdom of the Perl Monks concerning the following question:

Re: multiple-pass search? by choroba (Cardinal) on Dec 09, 2021 at 19:01 UTC
Yes, substitution is the right tool. You can use Lingua::EN::Numbers to turn digits into words.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

use Lingua::EN::Numbers qw{ num2en };

my $text = 'abc No. 347 xyz';
$text =~ s/No\. \K(\d+)/join "", "\\", map num2en($_), split m{}, $1/ge;
print $text;  # abc No. \threefourseven xyz
```

I used /e, which evaluates the replacement part as code. The regex matches "No. " followed by a number, but replaces just the number due to \K. It splits the number into digits, replaces each with the word (via num2en) and joins them together with a \ at the beginning.
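For readers without Lingua::EN::Numbers installed, the same substitution can be sketched with a plain lookup array. This is only an illustration: the `@word` array and the `digits_to_words` helper are made-up names, not part of the reply above.

```perl
use strict;
use warnings;

# Hypothetical lookup table: index N holds the English word for digit N.
my @word = qw( zero one two three four five six seven eight nine );

# Hypothetical helper: '347' becomes 'threefourseven'.
sub digits_to_words {
    my ($number) = @_;
    return join '', map { $word[$_] } split //, $number;
}

my $text = 'abc No. 347 xyz, No. 12';

# \K keeps "No. " out of the replaced part; /e runs the replacement as code;
# /g repeats the substitution for every "No. <digits>" in the string.
$text =~ s/No\. \K(\d+)/"\\" . digits_to_words($1)/ge;

print "$text\n";
```

Because of `\K`, only the digits are replaced, so the literal "No. " never has to be repeated in the replacement.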
Re: multiple-pass search? by jdporter (Paladin) on Dec 09, 2021 at 20:29 UTC
```perl
use Tie::File;

sub replace_serialnumbers_in_file($) {
    my @word = qw( zero one two three four five six seven eight nine );
    my $filename = shift;
    # /g in list context returns every digit, not just the first one
    my $serno = join '', map $word[$_], $filename =~ /(\d)/g; # assuming no other digits in the filename
    tie my @lines, 'Tie::File', $filename or die;
    s/\\zerozerozero/\\$serno/g for @lines;
}
```

You don't need multipass if you take the serial number from the filename.
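A self-contained sketch of the single-pass idea, run against a throwaway file. It assumes the placeholder is always `\zerozerozero`, and it takes the digits from the basename only, so that digits in a directory name cannot leak into the serial number. The name `replace_placeholder_in_file` and the sample file content are invented for the demo.

```perl
use strict;
use warnings;
use File::Basename qw( basename );
use File::Spec;
use File::Temp qw( tempdir );
use Tie::File;

my @word = qw( zero one two three four five six seven eight nine );

# Hypothetical variant of the function above: digits come from the basename.
sub replace_placeholder_in_file {
    my ($path) = @_;
    my $serno = join '', map { $word[$_] } basename($path) =~ /(\d)/g;
    tie my @lines, 'Tie::File', $path or die "Cannot tie $path: $!";
    s/\\zerozerozero/\\$serno/g for @lines;    # edits the file in place
    untie @lines;
    return $serno;
}

# Demo file in a temporary directory (removed automatically at exit).
my $dir  = tempdir( CLEANUP => 1 );
my $path = File::Spec->catfile( $dir, 'abstract-345.tex' );
open my $out, '>', $path or die "Cannot write $path: $!";
print {$out} "Title: \\zerozerozero\nNo. 345\n";
close $out;

my $serno = replace_placeholder_in_file($path);

# Read the file back to see the result.
open my $in, '<', $path or die "Cannot read $path: $!";
my $content = do { local $/; <$in> };
close $in;
print $content;
```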
Re^2: multiple-pass search? by propellerhat (Novice) on Dec 09, 2021 at 22:39 UTC
Here is how I extract the serial number from the filename:

```perl
use File::Find;

my $dir = "documents";

find(
    sub {
        my $filename = $_;
        return unless ( $filename =~ /abstract-([0-9][0-9][0-9]).tex/ && -f $filename );
        my $serialnumber = $1 ;
        # ...
```
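A runnable sketch of this kind of traversal. Several things here are assumptions made for the demo: a temporary directory stands in for `documents`, the pattern is anchored and has its dot escaped, and the results are collected in a hash called `%serial_for`.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use File::Temp qw( tempdir );

# Throwaway directory standing in for "documents" (assumption for the demo).
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw( abstract-001.tex abstract-345.tex notes.txt )) {
    open my $fh, '>', File::Spec->catfile( $dir, $name )
        or die "Cannot create $name: $!";
    close $fh;
}

my %serial_for;    # filename => serial number as it appears in the name
find(
    sub {
        # Inside the callback, $_ is the bare filename and the current
        # directory is the one that contains it, so -f $_ works directly.
        return unless /^abstract-(\d{3})\.tex$/ && -f $_;
        $serial_for{$_} = $1;
    },
    $dir
);

print "$_ => $serial_for{$_}\n" for sort keys %serial_for;
```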
Re^3: multiple-pass search? by jdporter (Paladin) on Dec 10, 2021 at 01:57 UTC
So I take it the filename will have the exact pattern `abstract-NNN.tex`. If so, the regex you gave is too broad. It will match, for example, `nonabstract-000stexts`. You need to anchor the beginning and end, and escape the dot: `/^abstract-(\d{3})\.tex$/`
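The difference between the two patterns can be checked with a short script. The third filename and the `%result` hash are made up for illustration.

```perl
use strict;
use warnings;

my $loose  = qr/abstract-([0-9][0-9][0-9]).tex/;    # pattern from the question
my $strict = qr/^abstract-(\d{3})\.tex$/;           # anchored, dot escaped

my %result;
for my $name ( 'abstract-345.tex', 'nonabstract-000stexts', 'abstract-345xtex' ) {
    $result{$name} = {
        loose  => ( $name =~ $loose  ? 1 : 0 ),
        strict => ( $name =~ $strict ? 1 : 0 ),
    };
    printf "%-24s loose=%d strict=%d\n", $name,
        $result{$name}{loose}, $result{$name}{strict};
}
```

The loose pattern accepts all three names, because it may match anywhere in the string and its unescaped dot matches any character. The anchored pattern accepts only `abstract-345.tex`.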
Re^4: multiple-pass search? by propellerhat (Novice) on Dec 10, 2021 at 02:18 UTC
Re^5: multiple-pass search? by Fletch (Bishop) on Dec 10, 2021 at 13:54 UTC
Re^5: multiple-pass search? by jdporter (Paladin) on Dec 10, 2021 at 16:11 UTC
Re^5: multiple-pass search? by propellerhat (Novice) on Dec 10, 2021 at 18:39 UTC
Re: multiple-pass search? by LanX (Saint) on Dec 09, 2021 at 20:22 UTC
This should get you started. I kept it flexible so that you can adjust it.

```
  DB<49> sub english_num { my ($pre,$num) = @_; my $eng = join "-", map {(qw/zero one two three four five six seven \
  eight nine/)[$_] } split //,$num; return "$pre \\$eng"}

  DB<50> $txt =" some text No. 345 other text No. 123 end text"

  DB<51> $txt =~ s/(No.) (\d{3})/english_num($1,$2)/ge

  DB<52> say $txt
 some text No. \three-four-five other text No. \one-two-three end text
```

Edit: In case you are sure that it's always exactly 3 digits, you can also use a hardwired regex with a lookup array:

`s/(No.) (\d)(\d)(\d)/$1 \\$nums[$2]-$nums[$3]-$nums[$4]/g`

```
  DB<94> $_ =" some text No. 345 other text No. 123 end text"

  DB<95> p
 some text No. 345 other text No. 123 end text

  DB<96> s/(No.) (\d)(\d)(\d)/$1 \\$nums[$2]-$nums[$3]-$nums[$4]/g

  DB<97> p
 some text No. \three-four-five other text No. \one-two-three end text
```

Update: After reading the OP again, please provide an SSCCE clarifying input and expected output.

Cheers Rolf
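The hardwired variant as a standalone script, for anyone who wants to run it outside the debugger. The session above does not show how `@nums` is defined, so the definition below is an assumption. The dot in "No." is escaped here, which is slightly stricter than the pattern in the session.

```perl
use strict;
use warnings;

# Assumed definition of the lookup array (not shown in the debugger session).
my @nums = qw( zero one two three four five six seven eight nine );

my $txt = ' some text No. 345 other text No. 123 end text';

# Exactly three digits, one capture group per digit; no /e needed because
# array elements interpolate directly in the replacement.
( my $out = $txt ) =~ s/(No\.) (\d)(\d)(\d)/$1 \\$nums[$2]-$nums[$3]-$nums[$4]/g;

print "$out\n";
```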
Re^2: multiple-pass search? by propellerhat (Novice) on Dec 10, 2021 at 00:49 UTC
This is about the best I can do by way of providing a SSCCE:

1) files:

   a) title files (one title per file):

      title-001.tex
      title-002.tex
      ...
      title-999.tex

   b) catchfile index (one file of a thousand lines; a thousand titles is about three or four times the number needed):

      \CatchFileDef{\zerozerozero}{title-000.tex}{}
      \CatchFileDef{\zerozeroone}{title-001.tex}{}
      ...
      \CatchFileDef{\nineninenine}{title-999.tex}{}

   c) document files (several categories, having same title):

      article-001.tex article-002.tex ... article-003.tex
      abstract-001.tex abstract-002.tex ... abstract-003.tex
      catalogue-001.tex catalogue-002.tex ... catalogue-003.tex

2) In the head of each document file is a placeholder for the English representation of the serial number of the title: "\zerozerozero". If the placeholder is not useful, I can delete it.

3) In the head of each document file is the serial number of the title, in Arabic representation: "No. 345".

4) The serial number of the title appears also in the filename of the document file: "article-345".

5) The objective is to write in the document file the English representation of the serial number of the title: "\threefourfive".

6) Once the English representations are in place, I can use Perl to make necessary adjustments.
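The catchfile index in 1b follows a regular pattern, so it could itself be generated. A minimal sketch, assuming three-digit zero-padded serial numbers from 000 to 999; the helper name `catchfile_line` is hypothetical.

```perl
use strict;
use warnings;

my @word = qw( zero one two three four five six seven eight nine );

# Hypothetical helper: builds one index line for serial number $n (0..999).
sub catchfile_line {
    my ($n) = @_;
    my $serial = sprintf '%03d', $n;    # zero-padded, e.g. '007'
    my $name   = join '', map { $word[$_] } split //, $serial;
    return "\\CatchFileDef{\\$name}{title-$serial.tex}{}";
}

my @index = map { catchfile_line($_) } 0 .. 999;

# Show the first two lines and the last one.
print "$_\n" for @index[ 0, 1, 999 ];
```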
Re^3: multiple-pass search? by hippo (Bishop) on Dec 10, 2021 at 10:06 UTC
Unfortunately you have not provided so much as a single line of Perl here. As such it is impossible to know at which point you are encountering a problem, let alone what that problem is. Here is the sort of SSCCE you could have written:

```perl
use strict;
use warnings;
use Test::More tests => 3;

my $filename = 'abstract-345.tex';
my $have = <<'EOT';
foo
Here: \zerozerozero
bar
No. 345
baz
EOT
my $want = <<'EOT';
foo
Here: \threefourfive
bar
No. 345
baz
EOT

my @digits = qw/zero one two three four five six seven eight nine/;
my ($arabic) = $filename =~ /-([0-9]{3})\.tex/;
(my $eng = $arabic) =~ s/([0-9])/$digits[$1]/g;
$have =~ s/\\zerozerozero/\\$eng/;

is $arabic, '345', 'Digits extracted';
is $eng, 'threefourfive', 'Converted to English';
is $have, $want, 'Replaced in text';
```

Now you can see how to perform these three operations. If that doesn't solve your problem, you need to provide some runnable code which demonstrates the problem you are having (ideally with a test such as shown here). In that way we will know what it is you are actually asking. There's a detailed rationale at How to ask better questions using Test::More and sample data.

🦛
Re: multiple-pass search? by jwkrahn (Abbot) on Dec 10, 2021 at 07:00 UTC
"It occurs to me that matching with the greedy modifier "/g""

From perlop: g Match globally, i.e., find all occurrences.
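A short sketch of the distinction: /g controls how many matches are found, while greediness is a property of quantifiers such as `+` and `+?`. The sample string and variable names are made up.

```perl
use strict;
use warnings;

my $text = 'No. 345 and No. 123';

# Without /g, only the first occurrence is replaced.
( my $once = $text ) =~ s/\d+/N/;

# With /g, every occurrence is replaced.
( my $all = $text ) =~ s/\d+/N/g;

# Greedy versus non-greedy quantifier, no /g involved.
my ($greedy) = $text =~ /(\d+)/;     # as many digits as possible
my ($lazy)   = $text =~ /(\d+?)/;    # as few digits as possible

# In list context, /g returns all matches.
my @numbers = $text =~ /(\d+)/g;

print "$once\n$all\n$greedy $lazy\n@numbers\n";
```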

