blanchon has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I have a character-encoding problem with DOS commands and Perl.

1) I generate a file with Perl's system command:
system("VisualSourceSafe command... > file");

2) I read the file and store one of its lines in a variable. When I print this variable with Perl's print, I get:
print("$variable"); -> "Documentation/Base de donnée/Contraintes.xls"

3) I run a new system command with this variable: system("VisualSourceSafe history $variable");

But the value the command reports back is: "Documentation/Base de donn' e/Contraintes.xls".
The program fails because "é" was changed into "'".

Can you suggest a fix?

Thank you.

FB

Replies are listed 'Best First'.
Re: DOS coding compatibility with perl "é"->"'"
by davis (Vicar) on Oct 01, 2003 at 10:28 UTC
    NB: This is all guesses. I haven't proved any of it.

    I believe you're saying "When I run this command with *this* argument, the argument gets mangled"

    Short Answer (I think, not tested at all):

    system("VisualSourceSafe", "history", $variable);

    Long answer: you're giving system a single string argument that contains a "special" character, so system hands the whole string to the shell for processing. Passing system a list of arguments avoids that problem.
    Check perldoc -f system for more information.
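
    A minimal sketch of the list form, under some assumptions: $^X (the running perl) stands in for the real SourceSafe executable so the example runs anywhere, and the path is a hypothetical ASCII-only stand-in for the line read from the file. It only shows that each list element arrives as one argument, spaces included. The list form does not change the bytes of an argument, so it cannot by itself repair an encoding mismatch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the line read from the listing file.
my $variable = 'Documentation/Base de donnee/Contraintes.xls';

# List form: the program and each argument are separate list elements,
# so the spaces inside $variable do not split it into several arguments.
# $^X is the running perl binary, used here in place of the real
# SourceSafe command (an assumption made so the sketch is runnable).
my @cmd = ($^X, '-e', 'print scalar(@ARGV), qq{\n}', 'history', $variable);
my $status = system(@cmd);

# system returns the wait status; 0 means the child exited cleanly.
die "command failed with status $status\n" if $status != 0;
```

    The child prints how many arguments it received after the -e script, which is a quick way to check that the path was passed as a single argument.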


    davis
    It's not easy to juggle a pregnant wife and a troubled child, but somehow I managed to fit in eight hours of TV a day.
      Thank you, but system("VisualSourceSafe", "history", $variable); does the same thing. It doesn't work. FB
Re: different DOS/perl encoding
by chromatic (Archbishop) on Oct 01, 2003 at 18:12 UTC

    Is your DOS shell Unicode-aware? You could try the multi-arg form of system to avoid any shell processing of the arguments:

    my $status = system( 'ss.exe', 'history', $line_var );

    I don't know much about Windows though.

Re: DOS coding compatibility with perl "é"->"'"
by tilly (Archbishop) on Oct 05, 2003 at 03:08 UTC
    Random guess. You are being bitten by character encodings.

    Different applications in Windows use different character encodings. When you cut and paste between them, Windows silently changes the actual data to whatever displays most similarly in the target. To see this in action, write some text to a file, then view it both in an application that runs in the command prompt (e.g. edit, or just print it out with Perl) and in one that is Windows-based (e.g. notepad). You will find that what is é in one is not é in the other. Play around with cutting and pasting between them, and with chr and ord, to get a sense of what is happening.
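
    A small sketch of the chr/ord experiment, using the core Encode module. The two code pages are assumptions: cp850 for the command prompt and cp1252 for Windows applications are common on Western European systems, but yours may differ (the chcp command shows the console's code page).

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Assumption: the console (OEM) code page is cp850 and the
# Windows (ANSI) code page is cp1252. Adjust for your system.
my $e_acute = "\x{E9}";    # U+00E9, LATIN SMALL LETTER E WITH ACUTE

# The same character is stored as a different byte in each code page.
my $oem_byte  = encode('cp850',  $e_acute);
my $ansi_byte = encode('cp1252', $e_acute);

printf "cp850  byte for e-acute: 0x%02X\n", ord $oem_byte;     # 0x82
printf "cp1252 byte for e-acute: 0x%02X\n", ord $ansi_byte;    # 0xE9

# Reading the cp850 byte as if it were cp1252 yields another character.
my $misread = decode('cp1252', $oem_byte);
printf "cp850 byte read as cp1252: U+%04X\n", ord $misread;    # U+201A
```

    U+201A is a low single quotation mark, so a mismatch of this kind could plausibly turn é into something that looks like a quote, though that is a guess about what happened here.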

    If you don't know how to cut-and-paste to a terminal, try right-clicking on the title-bar of the command prompt and follow the menus.

Re: different DOS/perl encoding
by tilly (Archbishop) on Oct 05, 2003 at 03:25 UTC
    You have confirmed my (late) reply to your original question. When you view the same byte in a command prompt and in a Windows application, it will display differently. When you cut and paste between them, Windows automatically converts the data to keep the visual display the same. Unfortunately for programming, you care more about the binary data than details of how it is shown, and this behaviour causes you problems.

    If you save to a file and read that in another application then the data remains the same, but the display changes. This is much better from your point of view.

    My suggestion is therefore never to cut and paste between a command prompt and any Windows application. Instead save to files and read from them in other applications.
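
    A hedged sketch of working on the bytes from a file rather than on what is displayed. It assumes the file holds cp850 (console) bytes and that the program receiving the argument expects cp1252; both code pages, the temporary file, and the sample path are illustrative assumptions, not something confirmed in this thread.

```perl
use strict;
use warnings;
use Encode qw(from_to);
use File::Temp qw(tempfile);

# Assumption: the listing file contains cp850 bytes, where e-acute is 0x82.
# A temporary file stands in for the real listing file.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "Documentation/Base de donn\x82e/Contraintes.xls\n";
close $fh or die "close failed: $!";

# Read the line back as raw bytes, with no translation layer.
open my $in, '<:raw', $path or die "cannot open $path: $!";
my $line = <$in>;
close $in;
chomp $line;

# Convert the bytes in place from cp850 to cp1252 (assumed target).
from_to($line, 'cp850', 'cp1252');

# Inspect the byte instead of printing the character, because how the
# character is displayed depends on the viewer's code page.
my ($byte) = $line =~ /donn(.)e/s;
printf "accented byte is now 0x%02X\n", ord $byte;    # 0xE9
```

    After the conversion, $line could be passed on to system. Whether this is the right direction of conversion for a particular tool is something to verify by inspecting the bytes as above.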