Hello @1nickt,

First of all, thanks.
As for your welcome to the Monastery: I have been here for many years. But thanks nonetheless.

Before I forget it, a point about something you mentioned late in your reply: Test::utf8

<cite>This module is a collection of tests useful for dealing with utf8 strings in Perl.</cite>

I read: "dealing with utf8 strings"
This is not what I want. I want to deal with "non-utf8" strings. They should become utf8 strings but I don't want to deal with it. I guess that's quite different.

To give an overall answer first: I don't understand everything you say, and I am also not a native English speaker.
So I will try to answer the points you raise as far as I can. Something you write is irritating for me. Besides, I think you are making things bigger than they are. They are small.

To answer the points you raised:

I've had a look at SSCCE.

<cite>An even simpler SSCCE</cite>
I don't know what you want to tell me with this. Actually I am not able to see a relation to my case. If there is one, then sorry.

<cite>Your code looks as though you first spent time</cite>
I'm aware that different people have different strategies for software development. From my programming experience since 1986, I believe it is a good idea to solve problems as soon as they emerge: everything that is solved when it emerges will not be a problem in the future.

<cite>Your script, judging by its name, appears actually to be trying to decide if a file is encoded in UTF-8.</cite>
No. In the future my script should convert some hundred files from any encoding to UTF-8. Because the files are the output of at least 5 different tools, there is no way to decide by hand what encoding they are in. It has to be done by a guess.
But as a first step I only want to guess what encoding they are in. And for this I thought it would be a good idea to use a standard package. Maybe I've chosen the wrong one.

There's no more to say. Regards.

Re^3: issue with Encode::Guess
by 1nickt (Canon) on Apr 05, 2020 at 16:13 UTC

    Hello again!

    <cite>Something you write is irritating for me.</cite>

    I apologize. I did not mean to seem condescending. I think I shared some valuable tips that will help you get better help more quickly.

    <cite>An even simpler SSCCE</cite>
    <cite>I don't know what you want to tell me with this. Actually I am not able to see a relation to my case. If there is one, then sorry.</cite>

    The bug in your code was that you were misusing qw. I showed the simplest test that would prove that. (Also, note that the error you were originally getting was quite explicit, quoting the literal string "$encodings_test" as what you passed to the function.)
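
    A minimal sketch of that behaviour, for anyone following along. The contents of $encodings_test here are a hypothetical placeholder, not taken from the original script; the point is only that qw never interpolates variables.

```perl
use strict;
use warnings;

# Hypothetical contents; the real list of encodings is not shown in this thread.
my $encodings_test = 'latin1 cp1252';

# qw() quotes its words literally, so this yields the single string '$encodings_test'.
my @literal = qw($encodings_test);

# split on whitespace yields the names stored in the variable.
my @wanted = split ' ', $encodings_test;

print "qw gives:    @literal\n";
print "split gives: @wanted\n";
```

    Passing @wanted instead of the qw list hands the function real encoding names rather than the variable's name.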

    <cite>my issue - at this time - is to guess what encoding the input file is.</cite>

    I'm sorry -- and I do not think this is a question of language -- but that is not your issue. That is your objective. Your issue is (was) the thing that was causing your current code to fail. Now, since you did not know what that was, you could not state it. But you could have stated the output that you did not expect from your program.

    <cite>This is not what I want. I want to deal with "non-utf8" strings. They should become utf8 strings but I don't want to deal with it. I guess that's quite different.</cite>

    Very true. I mentioned Test::utf8 because it contains functions for trying to verify that what you think is encoded in UTF-8 really is. If you encode something to UTF-8 based on the guessed encoding of the source, you may want to check the result. But again I have to ask -- is it not possible for you to know the encoding of the data you are working with?
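
    A rough sketch of the guess, convert, check sequence, under stated assumptions: the helper name to_utf8_bytes, the sample bytes and the single suspect 'latin1' are made up for illustration. Here the check is a strict decode with the core Encode module rather than Test::utf8.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use Encode::Guess;    # exports guess_encoding

# Hypothetical helper: guess the encoding of $bytes among @suspects
# (on top of the module's default suspects), return UTF-8 bytes and the guessed name.
sub to_utf8_bytes {
    my ($bytes, @suspects) = @_;
    my $decoder = guess_encoding($bytes, @suspects);
    # guess_encoding returns an encoding object on success, an error string otherwise.
    die "Cannot guess encoding: $decoder\n" unless ref $decoder;
    my $text = $decoder->decode($bytes);
    return (encode('UTF-8', $text), $decoder->name);
}

my $input = "caf\xE9";    # sample bytes: "cafe" with e-acute in Latin-1
my ($utf8, $name) = to_utf8_bytes($input, 'latin1');

# Check the result: a strict decode dies if the bytes are not valid UTF-8.
my $check = decode('UTF-8', $utf8, Encode::FB_CROAK | Encode::LEAVE_SRC);

printf "guessed %s, wrote %d bytes of UTF-8\n", $name, length $utf8;
```

    Note that the guess can fail or come back as ambiguous when several single-byte encodings are listed as suspects, since each of them can decode almost any byte sequence. That is one more reason to find out the real encoding per tool if you can.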

    Hope this helps!


    The way forward always starts with a minimal test.