Respected monks,
Recently I received an Excel spreadsheet and extracted the data from it so that I can load it into a database.
However, in the extracted data I see some stray 'M-' sequences.
Interestingly, some of the 'M-' sequences seem to correspond to visible special characters in the original spreadsheet, while others look trivial.
I paste an example of each below:
'M-' that seems to come from a special character
user@server> cat data1.txt
Depósito Centralizado
user@server> cat -v data1.txt
DepM-ssito Centralizado
'M-' that looks trivial
user@server> cat data2.txt
London and NewYork
user@server> cat -v data2.txt
London andM- NewYork
The spreadsheet was generated on a Windows platform, and I'm extracting it on Linux.
To filter out the commonly mentioned "Windows control characters" such as '^M', I ran dos2unix on each of the data files, but the 'M-' characters didn't disappear.
So, monks, could you kindly tell me where these 'M-' characters come from? Are they also control characters?
Another question: is it possible to use Perl to clean up or correct these 'M-' characters?
Many thanks in advance