comparison Sphinx/source/developers/implementation-notes.rst @ 73:c742c8f9ffa3

encodings
author Sebastien Jodogne <s.jodogne@gmail.com>
date Wed, 14 Dec 2016 09:43:08 +0100
parents
children a976607e46f7
72:83c0a6556e6c 73:c742c8f9ffa3
.. _implementation-notes:

Implementation notes
====================

Encodings
---------

DICOM supports many codepages to encode strings. DICOM instances using
special characters should contain the ``SpecificCharacterSet
(0008,0005)`` tag. The latter tag `specifies which codepage
<http://dicom.nema.org/dicom/2013/output/chtml/part03/sect_C.12.html#sect_C.12.1.1.2>`__
is used by the DICOM instance. Internally, Orthanc converts all these
codepages to the `UTF-8 encoding
<https://en.wikipedia.org/wiki/UTF-8>`__.
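
As a rough illustration of such a conversion (this is not Orthanc's
actual implementation), the sketch below maps a few
``SpecificCharacterSet`` defined terms to Python codec names and
re-encodes raw bytes as UTF-8. The names ``DICOM_TO_PYTHON_CODEC`` and
``to_utf8()`` are hypothetical, and only a handful of single-byte
character sets are covered.

.. code-block:: python

    # Hypothetical, partial mapping from DICOM defined terms to Python
    # codec names. Multi-byte and ISO 2022 character sets are left out.
    DICOM_TO_PYTHON_CODEC = {
        "ISO_IR 100": "iso8859_1",  # Latin alphabet No. 1
        "ISO_IR 101": "iso8859_2",  # Latin alphabet No. 2
        "ISO_IR 144": "iso8859_5",  # Cyrillic
        "ISO_IR 192": "utf_8",      # Unicode in UTF-8
    }


    def to_utf8(raw, specific_character_set):
        """Re-encode bytes from the declared codepage to UTF-8."""
        codec = DICOM_TO_PYTHON_CODEC[specific_character_set]
        return raw.decode(codec).encode("utf-8")


    # A patient name stored in Latin-1 by the modality.
    latin1_name = b"J\xe9r\xf4me"
    utf8_name = to_utf8(latin1_name, "ISO_IR 100")
    print(utf8_name.decode("utf-8"))

An unknown defined term raises ``KeyError`` in this sketch; a real
implementation would need an explicit policy for that case.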

In particular, :ref:`plugins <creating-plugins>` must assume that any
string or JSON file coming from the Orthanc core is encoded using
UTF-8. Similarly, plugins must use UTF-8 when calling services
provided by the Orthanc core. The conversion to/from UTF-8 is done
transparently by the plugin engine.

The :ref:`configuration option <configuration>` ``DefaultEncoding``
plays an important role. It is used in three cases:

1. When receiving a DICOM instance without the ``SpecificCharacterSet
   (0008,0005)`` tag, Orthanc will interpret strings within this
   instance using this default encoding. This is important in
   practice, as many DICOM modalities are not properly configured with
   respect to encodings.

2. When answering a :ref:`C-Find query <dicom-find>` (including for
   worklists), Orthanc will use its default encoding. If an individual
   answer uses a different encoding, it will be transcoded to the
   default encoding.

3. When creating a new instance (e.g. through the
   ``/tools/create-dicom`` URI of the :ref:`REST API <rest>`, or
   through the ``OrthancPluginCreateDicom()`` primitive of the plugin
   SDK), if ``SpecificCharacterSet (0008,0005)`` is not provided
   for this instance, Orthanc will use its default encoding. Note
   however that if ``SpecificCharacterSet`` is set, Orthanc will
   transcode the incoming UTF-8 strings to the codepage specified in
   this tag, and not to the default encoding.
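
The three cases can be summarized by the following sketch. It is only
a model of the rules stated above, not code taken from Orthanc: all
function names are hypothetical, the ``CODECS`` table is partial, and
``default_codec`` is a Python codec name standing in for the value of
``DefaultEncoding`` (the actual option takes Orthanc-specific names).

.. code-block:: python

    # Hypothetical, partial mapping from DICOM defined terms to Python
    # codec names.
    CODECS = {
        "ISO_IR 100": "iso8859_1",
        "ISO_IR 144": "iso8859_5",
        "ISO_IR 192": "utf_8",
    }


    def pick_codec(specific_character_set, default_codec):
        """Use the declared character set, else the default encoding."""
        if specific_character_set is None:
            return default_codec
        return CODECS[specific_character_set]


    def decode_received(raw, specific_character_set, default_codec):
        """Case 1: interpret the strings of a received instance."""
        return raw.decode(pick_codec(specific_character_set, default_codec))


    def transcode_find_answer(raw, answer_codec, default_codec):
        """Case 2: bring a C-Find answer to the default encoding."""
        return raw.decode(answer_codec).encode(default_codec)


    def encode_for_new_instance(text, specific_character_set, default_codec):
        """Case 3: encode the strings of a newly created instance."""
        return text.encode(pick_codec(specific_character_set, default_codec))


    default = "iso8859_1"  # assumed stand-in for DefaultEncoding
    name = "M\u00fcller"

    received = decode_received(b"M\xfcller", None, default)
    answer = transcode_find_answer(b"M\xc3\xbcller", "utf_8", default)
    created_default = encode_for_new_instance(name, None, default)
    created_utf8 = encode_for_new_instance(name, "ISO_IR 192", default)

In this model, ``created_utf8`` keeps the codepage requested by
``SpecificCharacterSet``, whereas ``created_default`` falls back to
the default encoding because no character set was given.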