.. _debugging_encodings:

Debugging encoding issues (SpecificCharacterSet)
================================================

.. contents::

.. highlight:: bash

Orthanc does not display the PatientName correctly
--------------------------------------------------

If your DICOM files are valid, Orthanc should display all strings correctly, both
in the UI and in the REST API, in which all strings are converted to UTF-8.

However, if a string is displayed incorrectly, it is useful to understand what is
wrong with your files, so that you can either fix them once they have been stored
in Orthanc or configure your modality correctly.

**Example 1**: a DICOM file is sent to Orthanc with ``SpecificCharacterSet`` set to ``ISO_IR 100``
(Latin1). The PatientName is expected to be ``ccžšd^CCŽŠÐ`` but Orthanc displays ``ccd^CCÐ``.
If you open the DICOM file in a hex editor and search for the PatientName, you'll find this sequence
of bytes: ``63 63 9e 9a 64 5e 43 43 8e 8a d0``. By checking the `Latin1 code page
<https://en.wikipedia.org/wiki/ISO/IEC_8859-1>`__, you realise that ``9e`` and ``9a``
(and likewise ``8e`` and ``8a``) do not map to printable Latin1 characters, so they cannot be
displayed correctly in the Orthanc UI.

In this case, the bytes have most likely been generated on a Windows system using the default `Windows-1252
<https://en.wikipedia.org/wiki/Windows-1252>`__ encoding, in which ``9e`` is ``ž`` and ``9a`` is ``š``.

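The diagnosis above can be reproduced outside Orthanc. The following is a minimal
Python sketch, not part of Orthanc itself, that decodes the byte sequence from
Example 1 with both encodings. It uses only the standard ``latin-1`` and ``cp1252``
codecs; the variable names are illustrative.

.. code-block:: python

    # Bytes of the PatientName as found in the hex editor (Example 1).
    raw = bytes.fromhex("63 63 9e 9a 64 5e 43 43 8e 8a d0")

    # Decoded as declared by SpecificCharacterSet (ISO_IR 100 = Latin1):
    # 0x9e, 0x9a, 0x8e and 0x8a become invisible C1 control characters.
    as_latin1 = raw.decode("latin-1")

    # Decoded as Windows-1252, the encoding that most likely produced the bytes.
    as_cp1252 = raw.decode("cp1252")

    print(repr(as_latin1))  # repr() makes the control characters visible
    print(as_cp1252)

If the Windows-1252 decoding yields the expected name while the Latin1 decoding
contains control characters, the sending system is probably mislabelling its encoding.
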
How to solve it? It is highly recommended to fix the problem upstream of Orthanc: in your RIS, worklist server or modality.
However, if you cannot fix it there, you may still try to fix it once the file has been stored in Orthanc.
You can get inspiration from this `Lua script <https://bitbucket.org/osimis/orthanc-setup-samples/src/master/lua-samples/sanitizeInvalidUtf8TagValues.lua>`__
that fixes invalid UTF-8 characters.

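For the Windows-1252 case of Example 1, the core of such a fix is to re-interpret
the raw bytes with the encoding that actually produced them. The sketch below is
written in Python rather than Lua and is not the linked script. It assumes the
value was really written as Windows-1252, and the helper names
``suspicious_offsets`` and ``repair_to_utf8`` are hypothetical. Writing the
repaired value back into Orthanc is left out.

.. code-block:: python

    def suspicious_offsets(raw):
        """Return the offsets of bytes in 0x80-0x9F, which are not
        printable characters in Latin1 (ISO_IR 100)."""
        return [i for i, b in enumerate(raw) if 0x80 <= b <= 0x9F]


    def repair_to_utf8(raw):
        """Re-decode a value mislabelled as Latin1 and return UTF-8 bytes.

        Assumption: values containing bytes in 0x80-0x9F were written as
        Windows-1252. Bytes undefined in Windows-1252 are replaced by U+FFFD.
        """
        if not suspicious_offsets(raw):
            # Nothing suspicious: trust the declared Latin1 encoding.
            return raw.decode("latin-1").encode("utf-8")
        return raw.decode("cp1252", errors="replace").encode("utf-8")


    raw = bytes.fromhex("63 63 9e 9a 64 5e 43 43 8e 8a d0")
    offsets = suspicious_offsets(raw)
    fixed = repair_to_utf8(raw)
    print(offsets)
    print(fixed.decode("utf-8"))

Note that ``ž`` and ``š`` do not exist in Latin1, so a repaired value also needs a
``SpecificCharacterSet`` that can represent them, for instance ``ISO_IR 192`` (UTF-8).
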