annotate Sphinx/source/faq/debugging-encodings.rst @ 787:b8171b4046da

fix
author Alain Mazy <am@osimis.io>
date Fri, 29 Oct 2021 19:00:32 +0200
parents d050289fd0b3
children 8856bcfc561e
.. _debugging_encodings:

Debugging encoding issues (SpecificCharacterSet)
================================================

.. contents::

.. highlight:: bash

Orthanc does not display the PatientName correctly
--------------------------------------------------

If your DICOM files are valid, Orthanc should display all strings correctly, both
in the UI and in the REST API, where all strings are converted to UTF-8.

However, it might still be useful to understand what is wrong with your files,
so that you can either fix them once they have been stored in Orthanc
or configure your modality correctly.

**Example 1**: a DICOM file is sent to Orthanc with ``SpecificCharacterSet`` set to ``ISO_IR 100``
(Latin1). The PatientName is expected to be ``ccžšd^CCŽŠÐ``, but Orthanc does not display the ``ž``, ``š``, ``Ž`` and ``Š`` characters correctly.
If you open the DICOM file in a hex editor and search for the PatientName, you'll find this sequence
of bytes: ``63 63 9e 9a 64 5e 43 43 8e 8a d0``. By checking the `Latin1 code page
<https://en.wikipedia.org/wiki/ISO/IEC_8859-1>`__, you realise that the ``9e`` and ``9a`` bytes
(and likewise ``8e`` and ``8a``) do not map to printable Latin1 characters, so the Orthanc UI cannot display them as intended.

In this case, the files have most likely been generated on a Windows system using the default `Windows 1252
<https://en.wikipedia.org/wiki/Windows-1252>`__ encoding, in which ``9e`` is ``ž``.
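
The check described above can also be done programmatically. The following is a minimal Python sketch, not part of Orthanc: it takes the bytes copied from the hex editor, lists those in the ``80`` to ``9f`` range (which hold no printable Latin1 character), and compares the Latin1 and Windows 1252 interpretations. The helper name ``c1_bytes`` is hypothetical and only used for illustration.

```python
# Raw PatientName bytes as found in the hex editor (taken from Example 1).
raw = bytes.fromhex("63 63 9e 9a 64 5e 43 43 8e 8a d0")


def c1_bytes(data: bytes) -> list[int]:
    """Return the bytes in the 0x80-0x9f range.

    Latin1 has no printable character there, so finding such bytes in a
    value announced as ISO_IR 100 hints at a mislabelled encoding.
    """
    return [b for b in data if 0x80 <= b <= 0x9F]


suspicious = c1_bytes(raw)

# Interpretation announced by SpecificCharacterSet = ISO_IR 100.
as_latin1 = raw.decode("latin-1")

# Interpretation under the assumption that the producer used Windows 1252.
as_cp1252 = raw.decode("cp1252")

print([hex(b) for b in suspicious])  # the offending bytes
print(repr(as_latin1))               # control characters appear as \x.. escapes
print(as_cp1252)                     # ccžšd^CCŽŠÐ
```

If the Windows 1252 interpretation matches the expected name, the producer most likely wrote Windows 1252 bytes while declaring ``ISO_IR 100``.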

How can you solve it? It is highly recommended to fix the problem upstream of Orthanc: in your RIS, worklist server or modality.
However, if you cannot fix it there, you may still try to fix it once the file has been stored in Orthanc.
You can draw inspiration from this `Lua script <https://bitbucket.org/osimis/orthanc-setup-samples/src/master/lua-samples/sanitizeInvalidUtf8TagValues.lua>`__
that fixes invalid UTF-8 characters.
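
As an illustration of what such a fix involves for Example 1, here is a minimal Python sketch. It is not the logic of the Lua script above, and the function name ``reencode_as_utf8`` is hypothetical. It assumes the value was really written in Windows 1252 and re-encodes it as UTF-8. Since ``ž`` and ``š`` do not exist in Latin1, the corrected value cannot stay under ``ISO_IR 100``: the ``SpecificCharacterSet`` of the file would also have to be changed, for instance to ``ISO_IR 192`` (UTF-8). Writing the value back into the DICOM file is not shown.

```python
def reencode_as_utf8(raw: bytes, actual_encoding: str = "cp1252") -> bytes:
    """Decode a tag value with the encoding it was really written in,
    then re-encode it as UTF-8 (DICOM SpecificCharacterSet ISO_IR 192).

    `actual_encoding` is an assumption made by the caller; the default
    matches the Windows 1252 case of Example 1.
    """
    text = raw.decode(actual_encoding)
    return text.encode("utf-8")


# PatientName bytes from Example 1, mislabelled as ISO_IR 100.
raw = bytes.fromhex("63 63 9e 9a 64 5e 43 43 8e 8a d0")

fixed = reencode_as_utf8(raw)
print(fixed.decode("utf-8"))  # ccžšd^CCŽŠÐ
```

The same approach applies to any other mislabelled encoding, provided you know which encoding the producing system actually used.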