While hunting malware, a colleague noticed a folder that his DOS console displayed as if its name were empty or a single space.
This has nothing to do with the malware he was cleaning: http://blogs.technet.com/b/mmpc/archive/2013/02/27/the-strange-case-of-gamarue-propagation.aspx
It’s actually because we work in a multi-language environment: some Cyrillic characters cannot be displayed correctly in a console whose code page is set to Western European (DOS) (850).
In DOS, you can use the command chcp.com to display the current code page.
In PowerShell you can also use this old DOS command.
But PowerShell is different from a DOS console: it uses three code pages, one for the input and two for the output.
The standard console output encoding is the same as the input encoding:
But for the output being sent through the pipeline to native applications, there’s an automatic variable called $OutputEncoding
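To see the three encodings side by side, here is a minimal sketch. It assumes a plain console session; the `$report` variable and the `Role` labels are my own names, used only for illustration.

```powershell
# Gather the three encodings PowerShell deals with into one small table.
$report = @(
    @{ Role = 'Console input';          Encoding = [Console]::InputEncoding },
    @{ Role = 'Console output';         Encoding = [Console]::OutputEncoding },
    @{ Role = 'Pipeline to native app'; Encoding = $OutputEncoding }
) | ForEach-Object {
    # Turn each hashtable into an object exposing the code page number and name.
    New-Object PSObject -Property @{
        Role     = $_.Role
        CodePage = $_.Encoding.CodePage
        Name     = $_.Encoding.EncodingName
    }
}

$report | Select-Object Role, CodePage, Name | Format-Table -AutoSize
```

The values you get depend on your system and on the PowerShell version, so I'm deliberately not showing an expected output here.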
The Help file says the following about $OutputEncoding:
Determines the character encoding method that Windows PowerShell
uses when it sends text to other applications.
For example, if an application returns Unicode strings to Windows
PowerShell, you might need to change the value to UnicodeEncoding
to send the characters correctly.
Valid values: Objects derived from an Encoding class, such as
ASCIIEncoding, SBCSCodePageEncoding, UTF7Encoding,
UTF8Encoding, UTF32Encoding, and UnicodeEncoding.
Default: ASCIIEncoding object (System.Text.ASCIIEncoding)
This example shows how to make the FINDSTR command in Windows
work in Windows PowerShell on a computer that is localized for
a language that uses Unicode characters, such as Chinese.
The first command finds the value of $OutputEncoding. Because the
value is an encoding object, display only its EncodingName property.
PS> $OutputEncoding.EncodingName # Find the current value
In this example, a FINDSTR command is used to search for two Chinese
characters that are present in the Test.txt file. When this FINDSTR
command is run in the Windows Command Prompt (Cmd.exe), FINDSTR finds
the characters in the text file. However, when you run the same
FINDSTR command in Windows PowerShell, the characters are not found
because the Windows PowerShell sends them to FINDSTR in ASCII text,
instead of in Unicode text.
PS> findstr <Unicode-characters> # Use findstr to search.
PS> # None found.
To make the command work in Windows PowerShell, set the value of
$OutputEncoding to the value of the OutputEncoding property of the
console, which is based on the locale selected for Windows. Because
OutputEncoding is a static property of the console, use
double-colons (::) in the command.
PS> $OutputEncoding = [console]::outputencoding
PS> # Set the value equal to the OutputEncoding property of the console.
PS> $OutputEncoding.EncodingName # Find the resulting value.
OEM United States
As a result of this change, the FINDSTR command finds the characters.
PS> findstr <Unicode-characters> # Use findstr to search. It finds the characters in the text file.
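To illustrate why the characters get lost, here is a small sketch that needs neither FINDSTR nor a test file. It only shows what an ASCII encoder does to non-ASCII text, which is my own simplification of what happens on the pipeline. The Cyrillic sample word and the variable names are my own choices.

```powershell
# Build a Cyrillic sample from code points so the script file's own encoding does not matter.
$chars  = [char[]](0x041F, 0x0440, 0x0438, 0x0432, 0x0435, 0x0442)
$sample = -join $chars

# ASCII cannot represent these characters: each one is replaced by '?'.
$ascii      = [System.Text.Encoding]::ASCII
$asciiBytes = $ascii.GetBytes($sample)
$asciiText  = $ascii.GetString($asciiBytes)

# A Unicode encoding such as UTF-8 keeps the text intact.
$utf8     = [System.Text.Encoding]::UTF8
$utf8Text = $utf8.GetString($utf8.GetBytes($sample))

"ASCII round trip: $asciiText"
"UTF-8 round trip matches the original: $($utf8Text -ceq $sample)"
```

After the ASCII round trip the six letters come back as six question marks, so a native application receiving them has nothing left to match.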
I think that the above help content will fully make sense with a concrete example that you can find on this page:
One last quick tip: to list all the available code pages in PowerShell, run:
[System.Text.Encoding]::GetEncodings() | ft -AutoSize
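The full list is long, so here is a sketch that narrows it down to the two code pages discussed in this post. The `$wanted` and `$found` names are mine. Be aware that the list returned by GetEncodings() depends on the .NET runtime underneath PowerShell, so on some versions the filter may return fewer entries, or none.

```powershell
# Code pages of interest: 437 (English US) and 850 (Multilingual).
$wanted = 437, 850

# Keep only the matching entries and the columns worth reading.
$found = [System.Text.Encoding]::GetEncodings() |
    Where-Object { $wanted -contains $_.CodePage } |
    Select-Object CodePage, Name, DisplayName

$found | Format-Table -AutoSize
```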
I was also wondering how the code page was selected when you start a PowerShell or DOS console.
Well, it’s based on the “System Locale”. (See my article on the system locale vs. the user locale)
You can open the Region control panel, where the system locale is shown on the Administrative tab, with the following V3 cmdlet:
Show-ControlPanelItem -Name Region
If you set the system locale to French, you end up with the 850 (Multilingual) code page.
If you set the system locale to English US, you end up with the 437 (English US) code page.
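The mapping between a locale and its console code page can also be looked up without changing any setting. This is a sketch that reads the OEMCodePage property .NET exposes for a culture, which as far as I can tell is the value the console picks up from the system locale. The culture names `fr-FR`, `en-US` and `ru-RU` are examples I picked, and `$table` is my own variable name.

```powershell
# For each example culture, read the OEM (console) and ANSI code pages from .NET.
$table = foreach ($name in 'fr-FR', 'en-US', 'ru-RU') {
    $culture = [System.Globalization.CultureInfo]::GetCultureInfo($name)
    New-Object PSObject -Property @{
        Culture      = $culture.Name
        OemCodePage  = $culture.TextInfo.OEMCodePage
        AnsiCodePage = $culture.TextInfo.ANSICodePage
    }
}

$table | Select-Object Culture, OemCodePage, AnsiCodePage | Format-Table -AutoSize
```

On Windows 8 / Server 2012 and later, Get-WinSystemLocale from the International module should return the actual system locale as a culture object, so the same TextInfo.OEMCodePage property can be read from it.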
Without wanting to overcomplicate things, be aware that the font also has an impact: http://stackoverflow.com/questions/1259084/what-encoding-code-page-is-cmd-exe-using