in reply to split keywords
You want "split" to return all the original input characters, and just put "keyword" tags immediately around each string which does not consist of separator characters. So add some logic to the "map" block, like this:
    my $input = "kw1,kw2; kw3 &mdash; kw4&hyphen;kw5";

    # separator is any string consisting of comma, semicolon,
    # &mdash;, &ndash; or &hyphen;, bounded by 0 or more whitespace:
    my $sep = qr{ \s* (?: , | ; | \&(?:[mn]dash|hyphen); ) \s* }x;

    # in the map block, add keyword tags to non-separator items
    my @out = map { /$sep/ ? $_ : "<keyword>$_</keyword>" } split /($sep)/, $input;

    print join "\n", @out, "";

In this case, whitespace alone will not trigger a split; a single keyword item could contain multiple words separated by whitespace.
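To check the point about whitespace, here is a minimal sketch that wraps the same idea in a sub and rejoins the pieces into one string. The name `tag_keywords`, the sample input, the anchored test `/\A$sep\z/`, and the `grep { length }` step are my own additions, not part of the snippet above; the `grep` is there on the assumption that you don't want an empty `<keyword></keyword>` pair when the input starts with a separator.

```perl
use strict;
use warnings;

# Same separator pattern as in the snippet above.
my $sep = qr{ \s* (?: , | ; | \&(?:[mn]dash|hyphen); ) \s* }x;

# Hypothetical helper (not in the original post): tag every
# non-separator piece and return the rejoined string.
sub tag_keywords {
    my ($input) = @_;
    return join '',
        map  { /\A$sep\z/ ? $_ : "<keyword>$_</keyword>" }
        grep { length }              # drop empty fields (assumption, see above)
        split /($sep)/, $input;      # capture group keeps the separators
}

# "red apple" and "green pear" each stay together as one keyword.
print tag_keywords("red apple, green pear&ndash;plum"), "\n";
```

Because the separators are captured and passed through unchanged, joining on the empty string gives back every original input character, with only the tags added.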