a|b||d becomes a|b|\N|d
|b|c|d becomes \N|b|c|d
a|b|c| becomes a|b|c|\N
and similarly,
a|b|.|d becomes a|b|\N|d
but
.|b|c|d does not become \N|b|c|d
a|b|c|. does not become a|b|c|\N
Is that a bug?
If the above is a bug, the following regexps are probably faster:
s/\s*\|\s*/\|/g;
s/^\.?(?=\|)/\\N/;
s/(?<=\|)\.?(?=\||$)/\\N/g;
s/(?<=\d{2}:\d{2}:\d{2})\.\d+//g;
s/(?<=\d{5})-(?:\d{1,4}|\s+)//;
If the above is not a bug, the following regexps are probably faster:
s/\s*\|\s*/\|/g;
s/^(?=\|)/\\N/;
s/(?<=\|)(?=\||$)/\\N/g;
s/(?<=\|)\.(?=\|)/\\N/g;
s/(?<=\d{2}:\d{2}:\d{2})\.\d+//g;
s/(?<=\d{5})-(?:\d{1,4}|\s+)//;
I reduced the number of regexps by combining a few, shortened them by stripping the spaces first (not last), and used zero-width positive lookaheads and lookbehinds to minimize the text being matched and substituted.
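To make the difference between the two sets easier to see, here is a minimal sketch that wraps the field-level substitutions in one sub and runs the sample lines through it. The sub name null_fields and its flag argument are made up for illustration, the input is assumed to be already chomped, and the last two substitutions of each set are left out because they do not affect the empty-field question.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original code): applies the field-level
# substitutions to one chomped line. A true $dots_at_edges selects the first
# set (a lone "." counts as empty in any position); a false value selects the
# second set (a lone "." counts as empty only between two pipes).
sub null_fields {
    my ($line, $dots_at_edges) = @_;
    for ($line) {                            # alias $_ to our private copy
        s/\s*\|\s*/|/g;                      # trim spaces around pipes first
        if ($dots_at_edges) {
            s/^\.?(?=\|)/\\N/;               # empty or "." first field
            s/(?<=\|)\.?(?=\||$)/\\N/g;      # empty or "." later fields
        }
        else {
            s/^(?=\|)/\\N/;                  # empty first field
            s/(?<=\|)(?=\||$)/\\N/g;         # empty later fields
            s/(?<=\|)\.(?=\|)/\\N/g;         # "." only between two pipes
        }
    }
    return $line;
}

# Compare both sets on the lines where they are expected to differ.
for my $sample ('.|b|c|d', 'a|b|c|.', 'a|b|.|d') {
    printf "%-8s  bug-fixed: %-10s  as-is: %s\n",
        $sample, null_fields($sample, 1), null_fields($sample, 0);
}
```

Only the leading and trailing "." cases should come out differently between the two columns; the inner "." is converted either way.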
Use this in conjunction with the -p or -pi suggestion for better results.
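One caveat when combining these with -p: \s also matches a newline, so on an unchomped line that ends in a pipe the first substitution swallows the line ending and the next line gets glued on. Adding -l (that is, -lp or -lpi) should avoid this, since it chomps each line on input and puts the newline back on output. The sketch below spells out roughly what that loop does, using the first set of regexps. The sub name convert and the in-memory filehandles are only for illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: roughly the loop that "perl -lp" wraps around the
# substitutions (first set shown; the last two regexps are omitted).
sub convert {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        chomp $line;                         # what -l does on input
        for ($line) {
            s/\s*\|\s*/|/g;                  # safe now: no newline left to eat
            s/^\.?(?=\|)/\\N/;
            s/(?<=\|)\.?(?=\||$)/\\N/g;
        }
        print {$out} $line, "\n";            # what -l restores on output
    }
}

# Demonstration on in-memory filehandles instead of real files.
my $input = "a | b | c |\n| b | c | d\n";
open my $in,  '<', \$input     or die $!;
open my $out, '>', \my $output or die $!;
convert($in, $out);
close $out;
print $output;
```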