annotate Sphinx/source/faq/debugging-encodings.rst @ 1046:63edb430f259

mosaic
author Sebastien Jodogne <s.jodogne@gmail.com>
date Tue, 26 Mar 2024 09:50:04 +0100
parents 5df222ddd7d1
children
d050289fd0b3 debugging encodings
Alain Mazy <am@osimis.io>
.. _debugging_encodings:

Debugging encoding issues (SpecificCharacterSet)
================================================

.. contents::

.. highlight:: bash

Orthanc does not display the PatientName correctly
--------------------------------------------------

If your DICOM files are valid, Orthanc should display all strings correctly, both
in the UI and in the REST API, where all strings are converted to UTF-8.

However, when a string is not displayed correctly, it is useful to understand what is
wrong with your files, so that you can either fix them once they have been stored in
Orthanc or configure your modality correctly.
d050289fd0b3 debugging encodings
Alain Mazy <am@osimis.io>
parents:
diff changeset
19
787
Alain Mazy <am@osimis.io>
parents: 786
diff changeset
20 **Example 1**: a DICOM file is sent to Orthanc with ``SpecificCharacterSet`` set to ``ISO_IR 100``
786
d050289fd0b3 debugging encodings
Alain Mazy <am@osimis.io>
parents:
diff changeset
21 (Latin1). The PatientName is expected to be ``ccžšd^CCŽŠÐ`` but Orthanc displays ``ccžšd^CCŽŠÐ``.
d050289fd0b3 debugging encodings
Alain Mazy <am@osimis.io>
parents:
diff changeset
22 If you open the DICOM file in an Hex editor and search for the PatientName, you'll find this sequence
d050289fd0b3 debugging encodings
Alain Mazy <am@osimis.io>
parents:
diff changeset
23 of bytes: ``63 63 9e 9a 64 5e 43 43 8e 8a d0``. By checking the `Latin1 code page
d050289fd0b3 debugging encodings
Alain Mazy <am@osimis.io>
parents:
diff changeset
24 <https://en.wikipedia.org/wiki/ISO/IEC_8859-1>`__, you realise that the ``9e`` and ``9a`` characters
787
Alain Mazy <am@osimis.io>
parents: 786
diff changeset
25 are not valid Latin1 characters and are therefore replaced by ``ž`` in Orthanc UI.
786
d050289fd0b3 debugging encodings
Alain Mazy <am@osimis.io>
parents:
diff changeset
26
787
Alain Mazy <am@osimis.io>
parents: 786
diff changeset
27 In this case, they have most likely been generated on a Windows system by using the default `Windows 1252
786
d050289fd0b3 debugging encodings
Alain Mazy <am@osimis.io>
parents:
diff changeset
28 <https://en.wikipedia.org/wiki/Windows-1252>`__ encoding in which ``9e`` is ``ž``.
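
The byte-level check from Example 1 can be reproduced outside of Orthanc. The snippet below is
only a minimal sketch using the Python standard library: it decodes the byte sequence quoted in
the example with both code pages and lists the bytes that fall in the ``0x80``-``0x9F`` range,
which holds no printable Latin1 character. The variable names are illustrative and are not part
of Orthanc.

.. code-block:: python

    # Raw PatientName bytes, copied from the hex editor in Example 1.
    raw = bytes.fromhex("63 63 9e 9a 64 5e 43 43 8e 8a d0")

    # Decoded as Windows-1252, the bytes give the expected name.
    as_cp1252 = raw.decode("cp1252")

    # Decoded as Latin1, the bytes 0x80-0x9F become invisible C1 control codes.
    as_latin1 = raw.decode("latin-1")

    # Bytes in the 0x80-0x9F range hint that the file is not really Latin1.
    suspicious = [hex(b) for b in raw if 0x80 <= b <= 0x9F]

    print(as_cp1252)
    print(suspicious)

If ``suspicious`` is not empty for a file that declares ``ISO_IR 100``, the value was probably
written with another single-byte encoding such as Windows 1252.
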

How to solve it? It is highly recommended to fix the problem upstream of Orthanc: in your RIS, worklist server or modality.
However, if you cannot fix it there, you may still try to fix it once the file has been stored in Orthanc.
You can get inspiration from this `Lua script <https://github.com/orthanc-server/orthanc-setup-samples/tree/master/lua-samples/sanitizeInvalidUtf8TagValues.lua>`__
that fixes invalid UTF-8 characters.
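
As an illustration of what such a repair involves at the string level, here is a hedged Python
sketch. It is not the linked Lua script and it does not call any Orthanc API: ``looks_like_cp1252``
and ``to_utf8`` are hypothetical helpers, and the heuristic assumes that any byte in the
``0x80``-``0x9F`` range means the value was written as Windows 1252 rather than Latin1.

.. code-block:: python

    def looks_like_cp1252(raw: bytes) -> bool:
        """Heuristic (assumption): Latin1 text has no bytes in 0x80-0x9F."""
        return any(0x80 <= b <= 0x9F for b in raw)


    def to_utf8(raw: bytes) -> bytes:
        """Re-encode a value declared as Latin1 into UTF-8."""
        source = "cp1252" if looks_like_cp1252(raw) else "latin-1"
        # Bytes that Windows 1252 leaves undefined become U+FFFD.
        return raw.decode(source, errors="replace").encode("utf-8")


    fixed = to_utf8(bytes.fromhex("63 63 9e 9a 64 5e 43 43 8e 8a d0"))
    print(fixed.decode("utf-8"))

Writing the repaired value back would also require updating ``SpecificCharacterSet`` to match the
new encoding (``ISO_IR 192`` denotes UTF-8 in DICOM), which this sketch does not do.
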