view Sphinx/source/faq/orthanc-storage.rst @ 1140:e3142f727541
clarified storage/index
author | Alain Mazy <am@orthanc.team> |
---|---|
date | Mon, 10 Feb 2025 09:57:26 +0100 |
.. _orthanc-storage:

How does Orthanc store its database?
====================================

Orthanc uses two different places to store its data:

* the files are saved in the :ref:`Storage Area <orthanc-storage-area>`,
  usually a standard file-system, but it can also be replaced by a cloud
  :ref:`object storage <object-storage>` like AWS S3, Azure Blob Storage
  or Google Cloud.
* a summary of all resources is saved in the :ref:`Index <orthanc-index>`,
  which is a SQL database.

Orthanc always needs both the ``Storage`` and the ``Index``, and these
two components must always remain synchronized.

.. _orthanc-storage-area:

Storage area
------------

**By default**, Orthanc stores all the DICOM files it receives in a
folder called ``OrthancStorage`` on the filesystem (defined by the
``StorageDirectory`` option in the :ref:`configuration file <configuration>`).

The default storage can also be replaced by a plugin to store these
files in an :ref:`object storage <object-storage>` like AWS S3, Azure
Blob Storage or Google Cloud.

Alternatively, the file storage can also be implemented inside a
:ref:`PostgreSQL <postgresql>` or :ref:`MySQL <mysql>` database, but
this is quite uncommon.

More precisely, the ``Storage`` contains a set of so-called
**attachments**, each of which may be a DICOM file, a JSON file, or
any user-defined file. Internally, each attachment is automatically
associated with a
`universally unique identifier (UUID) <https://en.wikipedia.org/wiki/Universally_unique_identifier>`__.
Orthanc can be configured to compress these files on-the-fly in order
to save disk space (cf. the ``StorageCompression``
:ref:`configuration option <configuration>`).
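
As an illustration of the two options named above, the following minimal
sketch builds a configuration fragment in Python and prints it as JSON.
It assumes that the configuration file is written in JSON and that
``StorageCompression`` takes a Boolean value. The folder path and the
helper name ``render_config`` are hypothetical choices made for this
example only.

```python
import json

# Hypothetical values: the folder path and the compression flag are
# illustrative only, not recommendations from the Orthanc documentation.
storage_config = {
    "StorageDirectory": "/var/lib/orthanc/OrthancStorage",
    "StorageCompression": True,
}


def render_config(options):
    """Serialize the options as an indented JSON object (assumed format)."""
    return json.dumps(options, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Print the fragment so that it can be merged by hand into a
    # configuration file.
    print(render_config(storage_config))
```

The fragment only covers the storage area. The other options of a real
configuration file are left untouched.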

To reduce the number of files in a single directory (which some
operating systems handle poorly), a 3-level hierarchy of directories
is created to store the attachments: the first two hexadecimal
characters of the UUID give the first-level folder, and the next two
characters give the second-level folder.

.. _orthanc-index:

Orthanc Index
-------------

Orthanc also maintains a summary of all the DICOM resources in a SQL
database, the so-called ``Index``. This ``Index`` is mandatory to
rapidly provide information when browsing and accessing the resources,
either through the :ref:`REST API of Orthanc <rest>` or through the
:ref:`DICOM protocol <dicom-guide>`.

**By default**, this index is implemented in a
`SQLite database <https://en.wikipedia.org/wiki/SQLite>`__ that is
stored in the same folder as the files (if you are using a
file-system). This folder is defined by the ``IndexDirectory``
:ref:`configuration option <configuration>`.

The default ``Index`` can also be replaced by a plugin to store the
index in a :ref:`PostgreSQL <postgresql>`, :ref:`MySQL <mysql>` or
:ref:`ODBC <odbc>` database.

Index content
-------------

The ``Index`` database **indexes** all the attachments stored in the
``Storage``. The database records, for each attachment, its
compression method, and its MD5 hashes before and after compression in
order to detect disk corruption (cf. the ``StoreMD5ForAttachments``
:ref:`configuration option <configuration>`).

Each attachment must be associated with one
:ref:`DICOM resource <model-world>` (patient, study, series, or
instance). Incoming DICOM files are associated with one instance-level
resource, but user-defined attachments can be associated with any kind
of resource.

Given one DICOM resource, each of its child attachments is identified
by a number between 0 and 65535. Identifiers <= 1023 are reserved for
the Orthanc core, whereas identifiers >= 1024 can be user-defined for
external applications.
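
The identifier ranges above can be checked with a few lines of Python.
This is a minimal sketch, not code from Orthanc: the function name
``classify_attachment_id`` and the labels ``"core"`` and ``"user"`` are
hypothetical, and only the numeric bounds come from the text above.

```python
# Bounds taken from the text: identifiers run from 0 to 65535, and
# identifiers up to 1023 are reserved for the Orthanc core.
MAX_ATTACHMENT_ID = 65535
LAST_CORE_RESERVED_ID = 1023


def classify_attachment_id(attachment_id: int) -> str:
    """Return "core" for reserved identifiers and "user" for the others.

    Raises ValueError if the identifier is outside the 0..65535 range.
    """
    if not 0 <= attachment_id <= MAX_ATTACHMENT_ID:
        raise ValueError(
            f"attachment identifier {attachment_id} is outside "
            f"0..{MAX_ATTACHMENT_ID}"
        )
    if attachment_id <= LAST_CORE_RESERVED_ID:
        return "core"
    return "user"
```

An external application that defines its own attachment type would
therefore pick an identifier for which this helper returns ``"user"``.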

Besides the attachments, the database index maintains other
information for each DICOM resource, notably the
:ref:`metadata <metadata>`, the :ref:`history of changes <changes>`,
and an associative map that stores the so-called "main" DICOM tags (to
avoid accessing the storage folder when this is not needed).

The database schema is kept as simple as possible. For SQLite, for
example, the schema can be found in the following two files of the
source code of Orthanc:
`PrepareDatabase.sql <https://orthanc.uclouvain.be/hg/orthanc/file/Orthanc-1.12.6/OrthancServer/Sources/Database/PrepareDatabase.sql>`__
and
`InstallTrackAttachmentsSize.sql <https://orthanc.uclouvain.be/hg/orthanc/file/Orthanc-1.12.6/OrthancServer/Sources/Database/InstallTrackAttachmentsSize.sql>`__.

Direct access
-------------

Directly accessing the content of the ``Storage`` folder and the
content of the SQLite/MySQL/PostgreSQL ``Index`` database is strongly
discouraged for several reasons:

* The ``Storage`` internal organization outlined above only holds when
  no database plugin is used (e.g. the :ref:`PostgreSQL <postgresql>`
  and :ref:`MySQL <mysql>` plugins can be configured to store the
  attachments inside a database).
* Orthanc can be configured to compress the attachments before writing
  them to disk (cf. the ``StorageCompression`` option), which makes
  them harder to read with an external tool (see the
  ``OrthancRecoverCompressedFile`` executable in the Orthanc
  distribution).
* By directly reading/writing the content of the ``Storage``, you
  bypass all the locking mechanisms used by Orthanc, which might
  result in data corruption.
* If you are using SQLite for the ``Index``, one SQLite database
  should be accessed by at most one process at any time to avoid
  problems (e.g. with NFS filesystems), for reasons that are
  `explained in the SQLite FAQ <https://www.sqlite.org/faq.html#q5>`__.
  Orthanc will stop if it receives the ``SQLITE_BUSY`` status.
* The internal structure of the databases might evolve across
  successive versions of Orthanc or of the database plugins.

As a consequence, it is **HIGHLY recommended NOT to directly access**
the ``Storage`` and the SQLite/MySQL/PostgreSQL ``Index`` database.
Use the :ref:`REST API <rest>` instead, which contains primitives to
access the attachments (cf. the ``.../attachments/...`` URIs) and all
other resources.

The only exception to this rule is **read-only access when Orthanc is
stopped**, e.g. as part of a :ref:`backup <backup>` or
:ref:`upgrade/replication <replication>` process.
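
For the read-only case (Orthanc stopped, default filesystem storage),
the directory layout described in the storage area section can be
sketched as follows. This is an illustration under assumptions, not
code from Orthanc: the function name ``attachment_path`` is
hypothetical, the file inside the second-level folder is assumed to be
named after the full UUID in lowercase, and the UUID in the usage
example is made up.

```python
import uuid
from pathlib import PurePosixPath


def attachment_path(storage_root: str, attachment_uuid: str) -> PurePosixPath:
    """Derive the expected location of an attachment from its UUID.

    The first two hexadecimal characters give the first-level folder and
    the next two give the second-level folder. Naming the file after the
    full UUID is an assumption of this sketch.
    """
    # uuid.UUID() rejects malformed input with ValueError; str() then
    # yields the canonical lowercase, hyphenated form.
    canonical = str(uuid.UUID(attachment_uuid))
    return PurePosixPath(storage_root) / canonical[0:2] / canonical[2:4] / canonical


if __name__ == "__main__":
    # Made-up UUID, used only to show the shape of the resulting path.
    print(attachment_path("OrthancStorage", "00a1b2c3-d4e5-4f60-8a7b-9c0d1e2f3a4b"))
```

The helper only computes a path and never touches the disk. If
``StorageCompression`` is enabled, the file found at that location is
not directly readable as a DICOM file.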