in reply to Re: Problems reading UTf-8 file with BOM
in thread Problems reading UTf-8 file with BOM

That could be an option. But shouldn't this have a solution in the standard Perl distribution?

I thought I was doing something wrong, because it did not occur to me that it would be necessary to get into such specific details just to read a UTF-8 file. I have never had this kind of problem with UTF-8 files before (with or without a BOM), but I found some posts here about Text::CSV_XS and UTF-8. It looks like this module cannot deal with UTF-8 in any way. Could it be because of the "XS" part of the module?

Alceu Rodrigues de Freitas Junior
---------------------------------
"You have enemies? Good. That means you've stood up for something, sometime in your life." - Sir Winston Churchill

Re^3: Problems reading UTf-8 file with BOM
by ikegami (Patriarch) on Mar 25, 2010 at 21:44 UTC

    That could be an option. But shouldn't this have a solution in the standard Perl distribution?

    There are many problems with including modules in core.

    • A certain level of quality is assumed.
    • A high level of maintenance is demanded.
    • Endorsement is assumed.
    • Presence in core in perpetuity is expected, even if better alternatives surface.
    • etc.

    There are also problems with selecting which modules to include in core. Keep in mind that Perl is used for a wide variety of applications, and including everything is just not an option.

    The focus is on making it easy to install modules rather than including everything in core.

    You can install it with ppm install File::BOM (ActiveState) or cpan File::BOM (elsewhere). And if you have a distro that requires File::BOM, all you need to do is add one line to your Makefile.
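
    As a rough illustration of the "one line" mentioned above, here is a minimal sketch that assumes "Makefile" refers to a Makefile.PL driven by ExtUtils::MakeMaker. The distribution name and version are hypothetical placeholders; only the PREREQ_PM entry is the relevant part.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME      => 'My::App',   # hypothetical distribution name
    VERSION   => '0.01',      # hypothetical version
    PREREQ_PM => {
        # The one line: declare File::BOM as a prerequisite (any version).
        'File::BOM' => 0,
    },
);
```

    With a declaration like this, a CPAN client installing the distribution should pull in File::BOM automatically.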

Re^3: Problems reading UTf-8 file with BOM
by ikegami (Patriarch) on Mar 25, 2010 at 21:49 UTC

    Looks like this module cannot deal with UTF-8 in any way.

    It can now. There are some (documented) limits on what characters can be used as quotes and separators, but that's it.
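
    For reference, here is a minimal core-only sketch of skipping a UTF-8 BOM before handing a decoded filehandle to a parser. It is not how File::BOM or Text::CSV_XS do it internally; the helper name open_utf8_skip_bom is hypothetical, and it assumes the file is UTF-8 and only checks for the UTF-8 BOM bytes (EF BB BF).

```perl
use strict;
use warnings;

# Hypothetical helper: open a UTF-8 file for reading and skip a leading BOM.
sub open_utf8_skip_bom {
    my ($path) = @_;

    # Open in raw mode first so the BOM can be checked as plain bytes.
    open(my $fh, '<:raw', $path)
        or die "Cannot open $path: $!";

    # A UTF-8 BOM is the three bytes EF BB BF at the very start of the file.
    my $read = read($fh, my $head, 3);
    die "Cannot read $path: $!" unless defined $read;

    if ($read != 3 || $head ne "\xEF\xBB\xBF") {
        # No BOM: rewind so the first bytes are not lost.
        seek($fh, 0, 0) or die "Cannot rewind $path: $!";
    }

    # From here on, decode the remaining bytes as UTF-8.
    binmode($fh, ':encoding(UTF-8)');
    return $fh;
}
```

    The returned handle yields decoded characters, so it could then be passed to a parser, for example Text::CSV_XS->new({ binary => 1 })->getline($fh), subject to the documented limits on quote and separator characters.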