in reply to Re^4: unpacking wmic command's unicode output
in thread unpacking wmic command's unicode output

Does wmic really output UTF-8?

It appears it does (some form of Unicode anyway) if you redirect its output to a file or through a pipe:

c:\test>wmic.exe process > junk
c:\test>u:head -c 200 junk
■C a p t i o n C o m m a n d L i n e

That first spludge(*) is the little-endian BOM, bytes 0xFF 0xFE. (* It was a spludge when I copied and pasted it!)
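A minimal sketch of that BOM check, written in Python purely for illustration. The `sample` bytes are a hypothetical stand-in for the start of a file like `junk`, not real wmic output, and `looks_like_utf16le` is a made-up helper name. Note that a UTF-32LE BOM also begins with FF FE, so this is a heuristic rather than proof.

```python
import codecs

# Hypothetical sample: a little-endian BOM followed by "Caption" in UTF-16LE.
# This stands in for the first bytes of a redirected output file.
sample = codecs.BOM_UTF16_LE + "Caption".encode("utf-16-le")


def looks_like_utf16le(data: bytes) -> bool:
    """Return True if data starts with the bytes FF FE (UTF-16LE BOM)."""
    return data.startswith(codecs.BOM_UTF16_LE)


# The "utf-16" codec reads the BOM, picks the byte order from it,
# and drops the BOM from the decoded text.
text = sample.decode("utf-16")

print(sample[:2].hex())            # fffe
print(looks_like_utf16le(sample))  # True
print(text)                        # Caption
```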

Probably UTF-16LE, but whatever it is, using -CS does seem to make perl work out what it is getting and treat it appropriately:

c:\test>wmic.exe process | perl -CS -pe1 > junk
c:\test>u:head -c 200 junk
Caption CommandLine

Whether by accident or design, it is useful.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Re^6: unpacking wmic command's unicode output
by ikegami (Patriarch) on Nov 12, 2008 at 19:02 UTC

    I didn't suggest
    wmic.exe process > junk
    I suggested you drop the -CS, so you need to compare
    wmic.exe process | perl -CS -pe1 > junk
    to
    wmic.exe process | perl -pe1 > junk

    It's simple,

    • If wmic outputs "C a p t i o n", then it's clearly using a 16-bit encoding and -CS is guaranteed to hurt.
    • If wmic doesn't output "C a p t i o n", then you've lost your justification for using -CS.

      Now I'm confused. a) How would it hurt? b) Why does it seem not to hurt in my use now?


      #my sig used to say 'I humbly seek wisdom. '. Now it says:
      use strict;
      use warnings;
      I humbly seek wisdom.

        how would it hurt?

        >perl -e"print qq{C\x00a\x00p\x00}" | perl -CS -we"print length <>"
        6

        Why does it seem not to hurt in my use now?

        Because -CS has no effect if there are no bytes with bit 7 set.
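The byte-level reasoning behind both answers can be sketched as follows. This is a Python illustration, not Perl, and `has_high_bit` and `utf8_decode_is_identity` are hypothetical helper names. It assumes only that a UTF-8 input layer decodes bytes as UTF-8; Perl's handling of malformed input differs in detail, so the error path here shows only that such bytes are no longer passed through unchanged.

```python
# The same six bytes as the qq{C\x00a\x00p\x00} one-liner above.
utf16_bytes = "Cap".encode("utf-16-le")


def has_high_bit(data: bytes) -> bool:
    """Return True if any byte has bit 7 set (value >= 0x80)."""
    return any(b & 0x80 for b in data)


def utf8_decode_is_identity(data: bytes) -> bool:
    """Return True if decoding as UTF-8 maps every byte to the same code point."""
    try:
        decoded = data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return [ord(ch) for ch in decoded] == list(data)


# Treated as UTF-8, the NUL bytes are still characters: 6, not 3.
print(len(utf16_bytes.decode("utf-8")))      # 6
# Treated as UTF-16LE, the text is the intended three characters.
print(len(utf16_bytes.decode("utf-16-le")))  # 3

# No byte has bit 7 set, so UTF-8 decoding changes nothing here.
print(has_high_bit(utf16_bytes), utf8_decode_is_identity(utf16_bytes))

# "é" (U+00E9) in UTF-16LE is E9 00: bit 7 is set, and a UTF-8
# decoder no longer passes the bytes through one-for-one.
accented = "é".encode("utf-16-le")
print(has_high_bit(accented), utf8_decode_is_identity(accented))
```

So a UTF-8 layer neither helps nor visibly harms ASCII-range UTF-16 data, and the difference only shows up once a byte of 0x80 or above appears.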