paschacroutt has asked for the wisdom of the Perl Monks concerning the following question:
I wrote the following code to modify a bunch of pandoc-generated, DokuWiki-formatted files in order to convert some expressions to DokuWiki internal links.
The code went through many iterations and I am now reasonably satisfied with the result, except for one mystery I am unable to get to the bottom of.
It does not crash my code, nor does it emit any warning, but I can't explain this result.
In this part of the substitution, on line 44: .(defined($4) ? $4 =~ tr[\/*][]dr : '.'), I check whether $4 is defined and, if it is, I apply a tr to it to remove italic (//) or bold (**) marks if they are present. If $4 is undefined (meaning there is no dot, comma or semicolon after the last word of the expression), the conditional operator supplies a dot to end the substitution.
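To look at that expression on its own, here is a minimal sketch. The sub name end_mark is hypothetical and only stands in for the $4 term of the replacement; it assumes Perl 5.14 or later for the /r flag of tr.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the last term of the replacement:
# strip / and * from the captured punctuation, or fall back to a dot
# when nothing was captured.
sub end_mark {
    my ($punct) = @_;
    return defined($punct) ? $punct =~ tr[/*][]dr : '.';
}

print end_mark(','), "\n";      # a captured comma is passed through
print end_mark(undef), "\n";    # no capture: a dot is supplied
```

Since the capture group only accepts a dot, comma or semicolon, the tr has nothing to delete in practice; the fallback branch is the one that matters for entries without final punctuation.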
So either this data entry:
    **Voir :** proton, solution hydrogénée//.//
or that one:
    **Voir :** proton, solution hydrogénée.
gives this result:
    **Voir :** [[glossaire:entrees:p:proton|proton]], [[glossaire:entrees:s:solution_hydrogenee|solution hydrogénée]].
Which is exactly what is needed.
But with this data entry, which has no final punctuation:
    **Voir :** proton, solution hydrogénée
What I get is:
    **Voir :** [[glossaire:entrees:p:proton|proton]], [[glossaire:entrees:s:solution_hydrogenee
    |solution hydrogénée
    ]].
With line breaks after solution_hydrogenee and hydrogénée, where I expected:
    **Voir :** [[glossaire:entrees:p:proton|proton]], [[glossaire:entrees:s:solution_hydrogenee|solution hydrogénée]].
No line breaks.
It may have something to do with the /x modifier I suspect.
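One way to probe that suspicion is to look at what capture group 2 actually holds. The sketch below is not the script itself: it reuses the same pattern as a plain match (no replacement, no in-place editing) on a hard-coded line that keeps its trailing newline, as a line read by while (<>) would. The variable names $line, $visible and @seen are assumptions made for the illustration.

```perl
use strict;
use warnings;
use utf8;
binmode(STDOUT, ':encoding(UTF-8)');

# Entry without final punctuation, trailing newline included.
my $line = "**Voir :** proton, solution hydrogénée\n";

my @seen;
while ($line =~ /
        (?:^\*\*Voir\s:\*\* | \G(?!^)(?!\[))
        \K
        (\s?)
        ((\w[\/*]*)(?:[^\.,;\n\r]\s?)+)
        [\/*]*([\.,;])?[\/*]*
    /gmx) {
    (my $visible = $2) =~ s/\n/\\n/g;   # show any newline as a literal \n
    push @seen, $visible;
    print "group 2: <$visible>\n";
}
```

If the last line printed ends in a literal \n, the newline is being captured into $2 by the final \s? of the group rather than being introduced by the /x modifier.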
The actual code follows:

     1 #!/usr/bin/env perl
     2
     3 use 5.36.1;
     4 use warnings;
     5 use strict;
     6 use utf8;
     7 use autodie;
     8
     9 use warnings qw< FATAL utf8 >;
    10 use open qw< :std :utf8 >;
    11 use charnames qw< :full >;
    12 use feature qw< unicode_strings >;
    13
    14 binmode(STDIN, ":utf8");
    15 binmode(STDOUT, ":utf8");
    16 binmode(STDERR, ":utf8");
    17
    18 use Text::Undiacritic qw(undiacritic);
    19
    20 $^I = ".bak";
    21
    22 while (<>){
    23
    24     my $voir = $_;
    25
    26     $voir =~ s/
    27         (?:^\*\*Voir\s:\*\*
    28         |
    29         \G(?!^)
    30         (?!\[))
    31         \K
    32         (\s?)
    33         ((\w[\/*]*)
    34         (?:[^\.,;\n\r]\s?)+)
    35         [\/*]*([\.,;])?[\/*]*
    36     /
    37         "$1\[\[glossaire:entrees:"
    38         .lc(undiacritic($3))
    39         .":"
    40         .lc(undiacritic($2 =~ tr[ \/*][_]dr))
    41         ."|"
    42         .$2 =~ tr[\/*][]dr
    43         ."\]\]"
    44         .(defined($4) ? $4 =~ tr[\/*][]dr : '.')
    45     /gemx;
    46
    47     print $voir;
    48 }
Thanks for reading through!
Re: Unexpected line breaks in substitution results
by tybalt89 (Monsignor) on Apr 10, 2024 at 15:47 UTC
by paschacroutt (Acolyte) on Apr 10, 2024 at 16:25 UTC

Re: Unexpected line breaks in substitution results
by Danny (Chaplain) on Apr 10, 2024 at 15:12 UTC