I’m trying to extract International Phonetic Alphabet (IPA) symbols from HTML source code. Internet Explorer’s View > Source window displays the symbols as symbols (i.e., not as encoded UTF sequences); moreover, I can copy these symbols from that window and paste them into Notepad as unformatted Arial-font text, and they still appear as symbols rather than as UTF codes.
However, when I ask Perl to extract such symbols from the HTML source and write them to Excel via Spreadsheet::WriteExcel, I get junk. (Spreadsheet::WriteExcel’s default ‘write’ font is Arial. Once I open the resulting Excel file, it makes no difference which font, IPA-specific fonts included, I choose to display a given cell's contents: it’s still junk.)
Can you explain to me what’s going on? Is there a fix? I’m not familiar with UTF-8 programming in Perl, though I suspect I’ll need to go there.
The website I’m trying to use is www.dictionary.com. Search on a word like ‘hello’ and click Show IPA. The text returned between the slashes is what I wish to extract and have Perl write to a spreadsheet.
Many thanks!
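One possible cause, offered as an assumption rather than a diagnosis: if the page is served as UTF-8 and the fetched bytes are passed to Spreadsheet::WriteExcel without being decoded, each IPA symbol would arrive as two or more separate bytes and display as junk. The sketch below illustrates that idea with a hard-coded sample. The markup around the slashes, the variable names, and the output file name `ipa.xls` are invented for illustration and are not taken from the real site.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample: raw UTF-8 bytes as they might arrive from an HTTP fetch.
# The span markup is an assumption; the bytes spell an IPA rendering of "hello".
my $raw_html = "<span class=\"pron\">/h\xC9\x9B\xCB\x88lo\xCA\x8A/</span>";

# Decode the byte string into a Perl character string (assumes the page is UTF-8).
my $html = decode('UTF-8', $raw_html);

# Capture the text between the slashes (assumed to be the delimiters).
my ($ipa) = $html =~ m{>/([^/<]+)/<};

# Write the decoded string only if Spreadsheet::WriteExcel is installed.
if (eval { require Spreadsheet::WriteExcel; 1 }) {
    my $workbook  = Spreadsheet::WriteExcel->new('ipa.xls');
    my $worksheet = $workbook->add_worksheet();
    $worksheet->write(0, 0, $ipa);   # a decoded character string, not raw bytes
    $workbook->close();
}
```

The key step in this sketch is the `decode` call. Whether it applies here depends on how the page is actually fetched and which encoding the server declares, so the response headers are worth checking before relying on it.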
In reply to Writing International Phonetic Alphabet symbols to Excel? by cypress