From 22cbf4d9509d3b537e6c2fcf4c95ae73d930b9aa Mon Sep 17 00:00:00 2001 From: Yuji Yamamoto Date: Tue, 4 Jul 2017 23:13:12 +0900 Subject: Avoid errors on non UTF-8 Windows (#566) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Avoid errors on non UTF-8 Windows Problem ==== haddock exits with errors like below: `(1)` ``` haddock: internal error: : hPutChar: invalid argument (invalid character) ``` `(2)` ``` haddock: internal error: Language\Haskell\HsColour\Anchors.hs: hGetContents: invalid argument (invalid byte sequence) ``` `(1)` is caused by printing [the "bullet" character](http://www.fileformat.info/info/unicode/char/2022/index.htm) onto stderr. For example, this warning contains it: ``` Language\Haskell\HsColour\ANSI.hs:62:10: warning: [-Wmissing-methods] • No explicit implementation for ‘toEnum’ • In the instance declaration for ‘Enum Highlight’ ``` `(2)` is caused when the input file of `readFile` contains some Unicode characters. In the case above, '⇒' is the cause. Environment ---- OS: Windows 10 haddock: 2.17.3 GHC: 8.0.1 Solution ==== Add `hSetEncoding handle utf8` to avoid the errors. Note ==== - I found the detailed causes by these changes for debugging: - https://github.com/haskell/haddock/commit/8f29edb6b02691c1cf4c479f6c6f3f922b35a55b - https://github.com/haskell/haddock/commit/1dd23bf2065a1e1f2c14d0f4abd847c906b4ecb4 - These errors happen even after executing `chcp 65001` on the console. According to the debug code, `hGetEncoding stderr` returns `CP932` regardless of the console encoding. * Avoid 'internal error: : hPutChar: invalid argument (invalid character)' non UTF-8 Windows Better solution for 59411754a6db41d17820733c076e6a72bcdbd82b's (1) --- haddock-api/src/Haddock/Interface.hs | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/haddock-api/src/Haddock/Interface.hs b/haddock-api/src/Haddock/Interface.hs index 62b0aea9..cdc3064e 100644 --- a/haddock-api/src/Haddock/Interface.hs +++ b/haddock-api/src/Haddock/Interface.hs @@ -1,3 +1,4 @@ +{-# LANGUAGE CPP #-} ----------------------------------------------------------------------------- -- | -- Module : Haddock.Interface @@ -48,14 +49,20 @@ import qualified Data.Set as Set import Distribution.Verbosity import System.Directory import System.FilePath +import System.IO import Text.Printf import Digraph import DynFlags hiding (verbosity) import Exception import GHC hiding (verbosity) +#if defined(mingw32_HOST_OS) +import GHC.IO.Encoding.CodePage (mkLocaleEncoding) +import GHC.IO.Encoding.Failure (CodingFailureMode(TransliterateCodingFailure)) +#endif import HscTypes import FastString (unpackFS) +import MonadUtils (liftIO) -- | Create 'Interface's and a link environment by typechecking the list of -- modules using the GHC API and processing the resulting syntax trees. @@ -68,6 +75,10 @@ processModules -> Ghc ([Interface], LinkEnv) -- ^ Resulting list of interfaces and renaming -- environment processModules verbosity modules flags extIfaces = do +#if defined(mingw32_HOST_OS) + -- Avoid internal error: : hPutChar: invalid argument (invalid character)' non UTF-8 Windows + liftIO $ hSetEncoding stderr $ mkLocaleEncoding TransliterateCodingFailure +#endif out verbosity verbose "Creating interfaces..." let instIfaceMap = Map.fromList [ (instMod iface, iface) | ext <- extIfaces -- cgit v1.2.3