Commit 42f8358

Yahya Farhadi authored and LarsAsplund committed
fix(sim): decode simulator output on Windows
The old code called data.decode("utf-8") directly, which raised a raw UnicodeDecodeError whenever simulator output contained bytes that are not valid UTF-8 (e.g. text in a Windows legacy code page such as cp1252). The error message was something like:

    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 42: invalid continuation byte

This was confusing because it gave no indication that the problem was the encoding of the simulator output, not your source code or VUnit itself.

The change tries each encoding in order: utf-8, utf-8-sig, the locale's preferred encoding, and cp1252. If all fail, it falls back to data.decode("utf-8", errors="replace"), which substitutes undecodable bytes with � instead of crashing.

Before: A simulator emitting a single non-UTF-8 byte (e.g. an accented character in a file path, or a copyright symbol in a vendor library message) would crash VUnit with an opaque UnicodeDecodeError, both on compile success output and on compile failure output (err.output).

After: The output is decoded with the first encoding in the list that accepts it, so VUnit continues normally and the user sees the actual compile pass/fail message instead of a decoding traceback.
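The behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the code from this commit: `decode_with_fallback` and its `encodings` parameter are hypothetical names, and the locale-dependent entry is left out of the default list so that the result does not depend on the machine running it.

```python
def decode_with_fallback(data: bytes, encodings=("utf-8", "cp1252")) -> str:
    """Return data decoded with the first candidate encoding that accepts it.

    Hypothetical helper for illustration; the candidate list is an assumption.
    """
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # No candidate accepted the bytes: substitute U+FFFD instead of raising.
    return data.decode("utf-8", errors="replace")


raw = b"caf\xe9"  # "café" encoded as cp1252; a lone 0xE9 is not valid UTF-8

# Old behaviour: strict UTF-8 decoding raises.
try:
    raw.decode("utf-8")
    old_error = None
except UnicodeDecodeError as exc:
    old_error = str(exc)

# New behaviour: the cp1252 candidate accepts the bytes.
decoded = decode_with_fallback(raw)

# 0x81 is undefined in Python's cp1252 codec, so every candidate fails
# and the errors="replace" fallback is used.
replaced = decode_with_fallback(b"\x81")
```

Trying UTF-8 first keeps well-formed UTF-8 output unchanged. A single-byte code page such as cp1252 accepts almost any byte sequence, so it only makes sense as a late candidate.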
1 parent 3357e4b commit 42f8358

1 file changed

Lines changed: 25 additions & 2 deletions

vunit/sim_if/__init__.py

--- a/vunit/sim_if/__init__.py
+++ b/vunit/sim_if/__init__.py
@@ -11,6 +11,7 @@
 import sys
 import os
 from os import environ, listdir, pathsep
+import locale
 import subprocess
 from pathlib import Path
 from typing import List
@@ -360,14 +361,36 @@ def check_output(command, env=None):
     """
     Wrapper arround subprocess.check_output
     """
+    def _decode(data: bytes) -> str:
+        """Decode tool output robustly across platforms.
+
+        Some simulators on Windows emit output in a legacy code page (e.g. cp1252),
+        which can raise UnicodeDecodeError if decoded as strict UTF-8.
+        """
+
+        encodings_to_try = (
+            "utf-8",
+            "utf-8-sig",
+            locale.getpreferredencoding(False) or "utf-8",
+            "cp1252",
+        )
+
+        for encoding in encodings_to_try:
+            try:
+                return data.decode(encoding)
+            except UnicodeDecodeError:
+                continue
+
+        return data.decode("utf-8", errors="replace")
+
     try:
         output = subprocess.check_output(  # pylint: disable=unexpected-keyword-arg
             command, env=env, stderr=subprocess.STDOUT
         )
     except subprocess.CalledProcessError as err:
-        err.output = err.output.decode("utf-8")
+        err.output = _decode(err.output)
         raise err
-    return output.decode("utf-8")
+    return _decode(output)
 
 
 def check_executable(simulator_name, prefix, executable_name):
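To see both code paths touched by the diff (normal return and `CalledProcessError`), the wrapper can be exercised against a stand-in process. This is a sketch under stated assumptions: the "simulator" is a child Python interpreter that writes one cp1252 byte, `EMIT` is a hypothetical name, and `_decode` is a simplified copy that leaves out the locale-dependent entry so the result is the same on any machine.

```python
import subprocess
import sys


def _decode(data: bytes) -> str:
    # Simplified stand-in for the helper in the diff: UTF-8 first, then cp1252.
    for encoding in ("utf-8", "cp1252"):
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    return data.decode("utf-8", errors="replace")


def check_output(command, env=None):
    """Run command and return its combined stdout/stderr as text."""
    try:
        output = subprocess.check_output(command, env=env, stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as err:
        # Decode the captured output before re-raising so callers get text.
        err.output = _decode(err.output)
        raise err
    return _decode(output)


# Hypothetical "simulator": writes the raw bytes of "café" in cp1252, then
# exits with the given status code.
EMIT = (
    "import sys; sys.stdout.buffer.write(b'caf\\xe9'); "
    "sys.stdout.buffer.flush(); sys.exit({code})"
)

# Success path: the return value is already decoded.
ok_output = check_output([sys.executable, "-c", EMIT.format(code=0)])

# Failure path: err.output is decoded text rather than raw bytes.
failed_output = None
try:
    check_output([sys.executable, "-c", EMIT.format(code=1)])
except subprocess.CalledProcessError as err:
    failed_output = err.output
```

With the previous `decode("utf-8")` calls, both invocations would have raised `UnicodeDecodeError` before the caller could inspect the output.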
