# HG changeset patch # User Sebastien Jodogne # Date 1619185634 -7200 # Node ID 411e82bb3a9fccc1d05aa7e4320dac97a9d07357 # Parent a76d83a00c689d89d018f5ab3923701c0c90d7b6 documenting revisions and multiple writers diff -r a76d83a00c68 -r 411e82bb3a9f Sphinx/source/faq/features.rst --- a/Sphinx/source/faq/features.rst Fri Apr 23 10:34:13 2021 +0200 +++ b/Sphinx/source/faq/features.rst Fri Apr 23 15:47:14 2021 +0200 @@ -303,3 +303,70 @@ (e.g. using a distributed `message-broker system `__ such as RabbitMQ that is fed by an Orthanc plugin). + + +.. _revisions: + +Revisions +--------- + +.. highlight:: bash + +Higher-level applications built on the top of Orthanc might have to +modify metadata and/or attachments. This can cause concurrency +problems if multiple clients modify the same metadata/attachment +simultaneously. To avoid such problems, Orthanc implements a so-called +**revision mechanism** to protect from concurrent modifications. + +The revision mechanism is optional, was introduced in **Orthanc +1.9.2** and must be enabled by setting :ref:`configuration option +` ``CheckRevision`` to ``true``. It is strongly +inspired by the `CouchDB API +`__. + +When the revision mechanism is enabled, each metadata and attachment +is associated with a **revision number**. Whenever one sets a metadata +for the first time using a ``PUT`` query, this revision number can be +found in the HTTP header ``ETag`` that is reported by Orthanc:: + + $ curl -v http://localhost:8042/instances/19816330-cb02e1cf-df3a8fe8-bf510623-ccefe9f5/metadata/1024 -X PUT -d 'Hello' + [...] + < ETag: "0" + +Any ``GET`` query will also return the current value of ``ETag``:: + + $ curl -v http://localhost:8042/instances/19816330-cb02e1cf-df3a8fe8-bf510623-ccefe9f5/metadata/1024 + [...] + < ETag: "0" + +If one needs to subsequently modify or delete this metadata, the HTTP +client must set this value of ``ETag`` into the ``If-Match`` HTTP +header:: + + $ curl -v http://localhost:8042/instances/19816330-cb02e1cf-df3a8fe8-bf510623-ccefe9f5/metadata/1024 -X PUT -d 'Hello 2' -H 'If-Match: 0' + [...] + < ETag: "1" + +Note how this second call has incremented the value of ``ETag``: This +is the new revision number to be used in future updates. If a bad +revision number is provided, the HTTP error ``409 Conflict`` is +generated:: + + $ curl -v http://localhost:8042/instances/19816330-cb02e1cf-df3a8fe8-bf510623-ccefe9f5/metadata/1024 -X PUT -d 'Hello 3' -H 'If-Match: 0' + [...] + < HTTP/1.1 409 Conflict + +Such a ``409`` error must be handled by the higher-level +application. The revision number must similarly be given if deleting a +metadata/attachment:: + + $ curl -v http://localhost:8042/instances/19816330-cb02e1cf-df3a8fe8-bf510623-ccefe9f5/metadata/1024 -X DELETE -H 'If-Match: 1' + [...] + < HTTP/1.1 200 OK + +Check out the `OpenAPI reference ` of +the REST API of Orthanc for more information. + +**Warning:** The database index back-end must support revisions. As of +writing, only the **PostgreSQL plugin** in versions above 4.0 +implement support for revisions. diff -r a76d83a00c68 -r 411e82bb3a9f Sphinx/source/faq/scalability.rst --- a/Sphinx/source/faq/scalability.rst Fri Apr 23 10:34:13 2021 +0200 +++ b/Sphinx/source/faq/scalability.rst Fri Apr 23 15:47:14 2021 +0200 @@ -90,6 +90,20 @@ * ``SaveJobs = false`` * ``StorageAccessOnFind = Never`` +* Since Orthanc 1.9.2 and PostgreSQL plugins 4.0: By default, the + PostgreSQL index plugin uses 1 single connection to the PostgreSQL + database. You can have multiple connections by setting the + ``IndexConnectionsCount`` to a higher value (for instance ``5``) in + the ``PostgreSQL`` section of the configuration file. This will + improve concurrency. Check out :ref:`the explanation below `. + +* Since Orthanc 1.9.2 and PostgreSQL plugins 4.0: If you have an + hospital-wide VNA deployment, you could consider to deploy multiple + Orthanc servers sharing the same PostgreSQL database. A typical + scenario is having one "writer" Orthanc server that handles the + ingesting of DICOM instances, and multiple "reader" Orthanc servers + with features such as DICOMweb or viewers. + * Make sure to carefully :ref:`read the logs ` in ``--verbose`` mode, especially at the startup of Orthanc. The logs may contain very important information regarding performance. @@ -175,7 +189,11 @@ the `manpage `__ of ``mallopt()`` for more information. -**Status of the Docker images:** +**Status:** + +* Since **Orthanc 1.8.2**, the global configuration ``MallocArenaMax`` + automatically sets ``MALLOC_MMAP_THRESHOLD_`` (defaults to ``5``) + during the startup of Orthanc. * The ``jodogne/orthanc`` and ``jodogne/orthanc-plugins`` Docker images automatically set ``MALLOC_ARENA_MAX`` to ``5`` **since @@ -191,55 +209,134 @@ Known limitations ----------------- -Exclusive access to the DB -^^^^^^^^^^^^^^^^^^^^^^^^^^ +Exclusive access to the DB in Orthanc <= 1.9.1 +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Orthanc was originally designed as a mini-DICOM server in 1-to-1 +relation with a SQLite database. Until **Orthanc 1.9.1**, because of +this original design, the internal code accessing the DB was affected +by a strong limitation: Inside a single Orthanc process, there was no +concurrent access to the DB. + +One solution to avoid this limitation was to have multiple Orthanc +accessing the same DB (works only for MySQL and PostgreSQL) as +presented in this `sample +`__. +However, this solution was only robust if there was **one single +"writer" Orthanc server** (i.e. only one Orthanc was modifying the +database). Indeed, the core of Orthanc <= 1.9.1 did not support the +replay of database transactions, which is necessary to deal with +conflicts between several instances of Orthanc that would +simultaneously write to the database. -As of Orthanc 1.9.1, the internal code accessing the DB is still affected -by limitations induced by the SQLite engine that was the only one originally -available at the beginning of the project: inside a single Orthanc process, -there is no concurrent access to the DB. +Concretely, in Orthanc <= 1.9.1, when connecting multiple Orthanc to a +single database by setting ``Lock`` to ``false``, there should only be +one instance of Orthanc acting as a writer and all the other instances +of Orthanc acting as readers only. Be careful to set the option +``SaveJobs`` to ``false`` in the configuration file of all the +instances of Orthanc acting as readers (otherwise the readers would +also modify the database). -One solution to avoid this limitation is to have multiple Orthanc accessing -the same DB (works only for MySQL and PostgreSQL) as presented in this `sample -`__. +Some issues reported in our bug tracker are related this limitation: +`issue 83 `__, +`issue 121 `__, +`issue 151 `__. + +This limitation has disappeared with Orthanc 1.9.2 and +PostgreSQL/MySQL plugins 4.0, were the database engine was fully +rewritten. + + +.. _multiple-writers: + +Concurrent accesses to the DB in Orthanc >= 1.9.2 +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Also note that the core of Orthanc does not currently support the replay -of database transactions, which is necessary to deal with conflicts -between several instances of Orthanc that would simultaneously write -to the database. As a consequence, as of Orthanc 1.9.1, when connecting multiple -Orthanc to a single database by setting ``Lock`` to ``false``, there -should only be one instance of Orthanc acting as a writer and all the -other instances of Orthanc acting as readers only. Be careful to set -the option ``SaveJobs`` to ``false`` in the configuration file of all -the instances of Orthanc acting as readers. +In **Orthanc 1.9.2 and PostgreSQL/MySQL plugins 4.0**, the database +engine of Orthanc was rewritten from scratch to allow multiple +writers/readers to share the same database. This new feature +necessitated a full refactoring of the database engine, so as to +replay transactions in the case of collisions between concurrent +transactions to the database. + +Furthermore, one Orthanc server can also manage several connections to +PostgreSQL or MySQL, in order to improve performance by adding +concurrency. Read-only database transactions are also distinguished +from read-write transactions in order for the database engine to +further optimize the patterns of access. + +Summarizing, the **multiple readers/writers** is now possible. Here +is a drawing representing a possible deployment: + +.. image:: ../images/2021-04-22-MultipleWriters.png + :align: center + :width: 500px + +Care must be taken to the following aspects: -A refactoring is needed to improve the core of Orthanc in that -respect, for which we are looking for funding from the -industry. Some issues reported in our bug tracker call for this -refactoring: `issue 83 -`__, `issue 121 -`__, `issue 151 -`__. +* Orthanc 1.9.2 must be combined with a database plugin that supports + multiple writers. This is the case of the PostgreSQL and MySQL + plugins with version >= 4.0. The built-in SQLite database **does + not** support multiple writers. + +* Concurrent access can result in so-called `non-serializable + transactions + `__ + if two separate database transactions modify the database at the + same time (cf. ``ErrorCode_DatabaseCannotSerialize`` in the source + code of Orthanc). Orthanc will **automatically replay such + transactions** a certain number of times (waiting 100ms more between + each retry), until the transactions succeed. The plugins provide an + option to control the maximum number of retries. If the maximum + number of retries is exceeded, the ``503 Service Unavailable`` HTTP + error is raised (server overloaded because of unsuccessful retries + of concurrent transactions). + +* If a higher-level application **modifies metadata and/or + attachments** in the presence of multiple writers, Orthanc provides + a :ref:`revision mechanism ` to prevent concurrent + updates. + Latency ^^^^^^^ -As of Orthanc 1.9.1, Orthanc still performs quite a large number of small -SQL requests. A simple request to a route like ``/studies/{id}`` can trigger -6 SQL queries. +For some queries to the database, Orthanc performs several small SQL +requests. For instance, a request to a route like ``/studies/{id}`` +can trigger 6 SQL queries. Given these round-trips between Orthanc and +the DB server, it's important for the **network latency to be as small +as possible**. For instance, if your latency is 20ms, a single request +to ``/studies/{id}`` might take 120ms. Typically, a latency of 1-4 ms +is expected to have correct performances. -This is not an ideal situation and this might be addressed -in a future larger DB refactoring (the most time-consuming queries have already -been optimized). Given the large number of round-trips -between Orthanc and the DB server, it's important that the latency is reduced -as much as possible. I.e, if deploying Orthanc in a cloud infrastructure, -make sure that the DB server and Orthanc VMs are located in the same datacenter. +As a consequence, if deploying Orthanc in a cloud infrastructure, make +sure that the DB server and Orthanc VMs are located in the **same +datacenter**. Note that most of the time-consuming queries have +already been optimized, and that future versions of Orthanc SDK might +aggregate even more SQL requests. -Typically, a latency of 1-4 ms is expected to have correct performances. If your -latency is 20ms, a simple request to ``/studies/{id}`` might spend 120ms in -round-trip alone. +Starting with Orthanc 1.9.2, and PostgreSQL/MySQL index plugins 4.0, +Orthanc can also be configured to handle **multiple connections to the +database server** by setting the ``IndexConnectionsCount`` to a value +greater than ``1``. This allows concurrent accesses to the database, +which avoids to sequentially wait for a database transaction to be +concluded before starting another one. Having multiple connections +makes the latency problem much less important. +Slow deletions +^^^^^^^^^^^^^^ +Deleting large studies can take some time, because removing a large +number of files from a filesystem can be an expensive operation (which +might sound counterintuitive). +It is possible to create an :ref:`storage area plugin +` that delays the actual deletion from the +filesystem. The plugin would maintain a queue (e.g. as a SQLite +database) of files to be removed. The actual deletion from the +filesystem would be done asynchronously in a separate thread. + +We are looking for funding from the industry to implement such a +plugin. diff -r a76d83a00c68 -r 411e82bb3a9f Sphinx/source/images/2021-04-22-MultipleWriters.png Binary file Sphinx/source/images/2021-04-22-MultipleWriters.png has changed diff -r a76d83a00c68 -r 411e82bb3a9f Sphinx/source/plugins/mysql.rst --- a/Sphinx/source/plugins/mysql.rst Fri Apr 23 10:34:13 2021 +0200 +++ b/Sphinx/source/plugins/mysql.rst Fri Apr 23 15:47:14 2021 +0200 @@ -13,6 +13,9 @@ For general information, check out the `official homepage of the plugins `__. +For information about scalability, make sure to read the section about +:ref:`multiple writers in large-scale deployments `. + **Warning:** According to `this thread on our discussion group `__, the MySQL/MariaDB plugins require MySQL 8.x if running on Microsoft @@ -100,9 +103,12 @@ "Username" : "orthanc", "Password" : "orthanc", "EnableSsl" : false, // force SSL connections - "SslVerifyServerCertificates": true, // verify server certificates if EnableSsl is true - "SslCACertificates": "", // path to CA certificates to validate servers - "Lock" : true // See section about Locking + "SslVerifyServerCertificates": true, // Verify server certificates if EnableSsl is true + "SslCACertificates": "", // Path to CA certificates to validate servers + "Lock" : true, // See section about Locking + "MaximumConnectionRetries" : 10, // New in release 3.0 + "ConnectionRetryInterval" : 5, // New in release 3.0 + "IndexConnectionsCount" : 1 // New in release 4.0 }, "Plugins" : [ "/home/user/orthanc-databases/BuildMySQL/libOrthancMySQLIndex.so", @@ -165,6 +171,34 @@ configuration of the MySQL plugins. They are documented below. +.. _mysql-multiple-writers: + +Multiple writers or connections +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Starting with Orthanc 1.9.2 and MySQL 4.0, it is possible to use +:ref:`multiple writers or connections in large-scale deployments +`. Here is the list of configuration that control +this behavior: + +* ``Lock`` must be set to ``false`` (cf. :ref:`below `) + +* ``MaximumConnectionRetries`` governs how many times Orthanc tries to + connect to the database, as well as how many times Orthanc replays + transactions to deal with collisions between multiple writers. + +* ``IndexConnectionsCount`` controls the number of connections from + the index plugin to the MySQL database. It is set to ``1`` by + default, which corresponds to the old behavior of Orthanc <= 1.9.1. + +* ``ConnectionRetryInterval`` is only used when opening one database + connection to MySQL. + +* As of release 4.0, the MySQL plugin does **not** support yet the + :ref:`revision mechanism ` to protect metadata and + attachments from concurrent modifications. + + Locking ^^^^^^^ diff -r a76d83a00c68 -r 411e82bb3a9f Sphinx/source/plugins/postgresql.rst --- a/Sphinx/source/plugins/postgresql.rst Fri Apr 23 10:34:13 2021 +0200 +++ b/Sphinx/source/plugins/postgresql.rst Fri Apr 23 15:47:14 2021 +0200 @@ -13,6 +13,9 @@ For general information, check out the `official homepage of the plugins `__. +For information about scalability, make sure to read the section about +:ref:`multiple writers in large-scale deployments `. + Compilation @@ -121,7 +124,10 @@ "Database" : "orthanc", "Username" : "orthanc", "Password" : "orthanc", - "EnableSsl" : false // New in version 3.0 + "EnableSsl" : false, // New in release 3.0 + "MaximumConnectionRetries" : 10, // New in release 3.0 + "ConnectionRetryInterval" : 5, // New in release 3.0 + "IndexConnectionsCount" : 1 // New in release 4.0 }, "Plugins" : [ "/home/user/orthanc-databases/BuildPostgreSQL/libOrthancPostgreSQLIndex.so", @@ -210,6 +216,35 @@ configuration of the PostgreSQL plugins. They are documented below. +.. _postgresql-multiple-writers: + +Multiple writers or connections +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Starting with Orthanc 1.9.2 and PostgreSQL 4.0, it is possible to use +:ref:`multiple writers or connections in large-scale deployments +`. Here is the list of configuration that control +this behavior: + +* ``Lock`` must be set to ``false`` (cf. :ref:`below `) + +* ``MaximumConnectionRetries`` governs how many times Orthanc tries to + connect to the database, as well as how many times Orthanc replays + transactions to deal with collisions between multiple writers. + +* ``IndexConnectionsCount`` controls the number of connections from + the index plugin to the PostgreSQL database. It is set to ``1`` by + default, which corresponds to the old behavior of Orthanc <= 1.9.1. + +* ``ConnectionRetryInterval`` is only used when opening one database + connection to PostgreSQL. + +* The PostgreSQL plugin supports the :ref:`revision mechanism + ` to protect metadata and attachments from concurrent + modifications. + + + .. _postgresql-lock: Locking @@ -222,6 +257,7 @@ `__) to prevent other instances of Orthanc from using the same PostgreSQL database. If you want several instances of Orthanc to share the same +database or if you need multiple connections to the PostgreSQL database, set the ``Lock`` option to ``false`` in the configuration file::