Bug 121 - Crash when PG transaction fails
Summary: Crash when PG transaction fails
Status: RESOLVED FIXED
Alias: None
Product: Orthanc
Classification: Unclassified
Component: Orthanc Core
Version: unspecified
Hardware: All All
Importance: --- normal
Assignee: Sébastien Jodogne
Reported: 2020-06-29 15:14 CEST by Sébastien Jodogne
Modified: 2020-06-29 15:26 CEST
Description Sébastien Jodogne 2020-06-29 15:14:55 CEST
[BitBucket user: Michael Hobbs]
[BitBucket date: 2019-01-07.21:50:22]

Initial title: Crashed in docker with hang thus Docker did not restart

What steps will reproduce the problem? 
Dockerfile
FROM osimis/orthanc:18.12.2
COPY orthanc.json /etc/orthanc/

ENV PG_ENABLED true
ENV PG_INDEX_ENABLED true
ENV PG_STORAGE_ENABLED true

Run the container under a very high volume of incoming DICOM traffic.


```
#!text

dicom-sender_1    | E: DcmElement: Unknown Tag & Data (0a7b,2020) larger (1950884384) than remaining bytes (250) in file, premature end of stream
dicom-sender_1    | E0107 21:32:19.474432 OrthancException.h:85] Bad file format: Cannot parse an invalid DICOM file (size: 258 bytes)
db_1              | 2019-01-07 21:32:19.547 UTC [43] ERROR:  could not serialize access due to read/write dependencies among transactions
db_1              | 2019-01-07 21:32:19.547 UTC [43] DETAIL:  Reason code: Canceled on identification as a pivot, during commit attempt.
db_1              | 2019-01-07 21:32:19.547 UTC [43] HINT:  The transaction might succeed if retried.
db_1              | 2019-01-07 21:32:19.547 UTC [43] STATEMENT:  COMMIT
dicom-sender_1    | E0107 21:32:19.547747 PluginsManager.cpp:164] PostgreSQL error: ERROR:  could not serialize access due to read/write dependencies among transactions
dicom-sender_1    | DETAIL:  Reason code: Canceled on identification as a pivot, during commit attempt.
dicom-sender_1    | HINT:  The transaction might succeed if retried.
dicom-sender_1    |
dicom-sender_1    | W0107 21:32:19.547836 PluginsManager.cpp:168] PostgreSQL: An active PostgreSQL transaction was dismissed
db_1              | 2019-01-07 21:32:19.548 UTC [43] WARNING:  there is no transaction in progress
dicom-sender_1    | WARNING:  there is no transaction in progress
dicom-sender_1    | E0107 21:32:19.548652 PluginsManager.cpp:164] Cannot rollback a non-existing transaction
dicom-sender_1    | terminate called after throwing an instance of 'Orthanc::OrthancException'

```



What is the expected output? What do you see instead?
Orthanc should handle a bad DICOM file gracefully and, if it must crash, it should crash cleanly, i.e. not hang.

What version of the product are you using? On what operating system?
osimis/orthanc:18.12.2
Docker on Windows for testing / Docker on CentOS for production
Comment 1 Sébastien Jodogne 2020-06-29 15:23:22 CEST
[BitBucket user: Sébastien Jodogne]
[BitBucket date: 2019-01-08.11:32:11]

I am unable to reproduce this issue. I have created a DICOM file that mimics your error as follows:

```
# Take one sample image from the database of our integration tests: https://hg.orthanc-server.com/orthanc-tests/file/default/Database/
$ cp ColorTestMalaterre.dcm /tmp/tmp.dcm

# Add a private tag (dcmodify does not allow creating the (0a7b,2020) tag directly)
$ dcmodify -i '0a7b,0020=HELLO2' /tmp/tmp.dcm

# Make the private tag larger than the filesize, and replace the element from 0x0020 to 0x2020
$ xxd -p /tmp/tmp.dcm | tr -d '\n' | sed 's/7b0a20000600/7b0a202006ff/' | xxd -r -p > /tmp/Issue121.dcm
```
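For readers without `xxd` at hand, the hex edit above can be sketched in Python. This is only an illustration: the function names `corrupt_private_tag` and `declared_length` are made up for this example, and the sketch assumes the tag is followed by a 4-byte little-endian length (which appears consistent with the value 65286 = 0xff06 reported in the log). Unlike the `sed` command, which works on the hex text, this version only matches at byte boundaries.

```python
# Byte-level counterpart of the "xxd | sed | xxd" pipeline (illustrative sketch).
OLD = bytes.fromhex("7b0a20000600")  # tag (0a7b,0020) followed by length bytes 06 00
NEW = bytes.fromhex("7b0a202006ff")  # tag (0a7b,2020) followed by length bytes 06 ff


def corrupt_private_tag(data: bytes) -> bytes:
    """Rewrite the first occurrence of the private tag header, like sed without /g."""
    if OLD not in data:
        raise ValueError("private tag (0a7b,0020) not found")
    return data.replace(OLD, NEW, 1)


def declared_length(data: bytes) -> int:
    """Read the length field after the patched tag.

    Assumption: the 4 bytes after the tag are a little-endian length.
    """
    pos = data.index(NEW[:4]) + 4
    return int.from_bytes(data[pos:pos + 4], "little")
```

A possible use is to read `/tmp/tmp.dcm` in binary mode, pass its contents to `corrupt_private_tag`, and write the result to `/tmp/Issue121.dcm`.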

When I send this `/tmp/Issue121.dcm` file through the REST API of Orthanc, I get a log similar to yours (even with the PostgreSQL plugin enabled), but no crash:

```
I0108 11:28:19.565982 OrthancRestApi.cpp:117] Receiving a DICOM file of 5962 bytes through HTTP
E: DcmElement: Unknown Tag & Data (0a7b,2020) larger (65286) than remaining bytes (5058) in file, premature end of stream
E0108 11:28:19.566147 OrthancException.h:85] Bad file format: Cannot parse an invalid DICOM file (size: 5962 bytes)
```
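The upload step can be scripted as well. The following is a minimal sketch that posts the file to the `/instances` route of the Orthanc REST API using only the Python standard library. The base URL `http://localhost:8042` and the absence of authentication are assumptions about the test setup, and the helper names are hypothetical.

```python
import urllib.error
import urllib.request


def build_upload_request(dicom_bytes: bytes,
                         base_url: str = "http://localhost:8042") -> urllib.request.Request:
    """Prepare a POST of the raw DICOM bytes to /instances (base_url is an assumption)."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/instances",
        data=dicom_bytes,
        method="POST",
        headers={"Content-Type": "application/dicom"},
    )


def upload(dicom_bytes: bytes, base_url: str = "http://localhost:8042"):
    """Send the file and return (HTTP status, response body), including for error statuses."""
    request = build_upload_request(dicom_bytes, base_url)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # An invalid DICOM file is expected to be rejected with an error status.
        return err.code, err.read()
```

Calling `upload(open("/tmp/Issue121.dcm", "rb").read())` against a running test server should return an error status rather than bring the server down, if the behavior matches the log above.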

Please provide us with a way to systematically reproduce your issue, otherwise we won't be able to provide support. In particular, please share your DICOM file that triggers the issue.
Comment 2 Sébastien Jodogne 2020-06-29 15:23:23 CEST
[BitBucket user: Alain Mazy]
[BitBucket date: 2019-01-08.13:52:09]

Hi Michael,

Given the transaction errors/warnings in your logs, it looks like you have multiple Orthanc instances accessing the same DB. Is this the case?
Comment 3 Sébastien Jodogne 2020-06-29 15:23:24 CEST
[BitBucket user: Michael Hobbs]
[BitBucket date: 2019-01-08.15:34:32]

Additional info: We are running multiple copies of Orthanc in a swarm against the same DB. We have since switched to multiple DBs, one for the receivers and one for the senders. This seems to have eliminated the issue. In the original configuration, an instance would be received by a receiver, and a Lua script would notify a Node.js process. The Node.js process would tell a receiver to generate a modified version of the instance. It would then tell a sender to store the modified instance and tell it where to send the modified version. Very rarely, the above issue would be seen.

Edit: Alain Mazy, that is correct.
Comment 4 Sébastien Jodogne 2020-06-29 15:23:25 CEST
[BitBucket user: Michael Hobbs]
[BitBucket date: 2019-01-08.15:41:38]

Correction to bug report: After checking the Dev swarm it was discovered that the docker-stack.yml was incorrectly configured to not restart on failures. After correcting that it was confirmed that this issue is, in fact, a 'crash' not a 'hang'. Priority should be downgraded as a crash is far easier to handle than a hang.
Comment 5 Sébastien Jodogne 2020-06-29 15:23:26 CEST
[BitBucket user: Alain Mazy]
[BitBucket date: 2019-01-08.16:36:01]

Hi Michael,

Actually, when connecting multiple Orthanc instances to a single DB, there should be only one Orthanc instance acting as a writer, with all other Orthanc instances acting as readers only.

Your issue is very similar to this one: bug 83. In your case, however, you seem to encounter a crash, whereas we just encounter 503 errors when using the REST API.
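The PostgreSQL hint in the original log ("The transaction might succeed if retried") refers to a serialization failure (SQLSTATE 40001). The generic way to cope with it is to re-run the whole transaction after a short delay. The sketch below illustrates that pattern only; it is not how the Orthanc PostgreSQL plugin is implemented, and `SerializationFailure` and `run_with_retry` are hypothetical names standing in for whatever the database driver provides.

```python
import random
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40001."""


def run_with_retry(transaction, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call transaction() until it commits or max_attempts is reached.

    transaction must redo all of its work from the start on every call,
    because the failed attempt has been rolled back by the server.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and let the caller report the error
            # Exponential backoff with jitter to avoid colliding again.
            sleep(base_delay * (2 ** (attempt - 1)) * (1 + random.random()))
```

The `sleep` parameter is injectable so that the backoff can be disabled in tests.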
Comment 6 Sébastien Jodogne 2020-06-29 15:24:41 CEST
[BitBucket user: Sébastien Jodogne]
[BitBucket date: 2019-01-25.13:07:25]

It might be worth giving it another try with the newly released version 3.0 of the PostgreSQL plugin. Indeed, since version 2.2 of the plugin, which is shipped in the image `osimis/orthanc:18.12.2` that you use, the following changeset has been included that might improve things: https://hg.orthanc-server.com/orthanc-databases/changeset/46

Furthermore, if you have multiple Orthanc instances connecting to the same database, you should set the option `SaveJobs` to `false` in the Orthanc configuration file. This option will be part of Orthanc 1.5.3, to be released soon. https://hg.orthanc-server.com/orthanc/changeset/1fe524e211af9018d53235406d10c6dcff2ebbb2
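As an illustration of the `SaveJobs` setting, here is a small sketch that takes an already-parsed configuration and prints it with `SaveJobs` set to `false`. The helper name `disable_save_jobs` and the sample `Name` value are made up for this example, and the sketch assumes the configuration has been loaded as plain JSON without comments.

```python
import json


def disable_save_jobs(config: dict) -> dict:
    """Return a copy of the configuration with "SaveJobs" set to false."""
    updated = dict(config)
    updated["SaveJobs"] = False
    return updated


# Illustrative input only; a real orthanc.json contains many more options.
print(json.dumps(disable_save_jobs({"Name": "receiver-1"}), indent=2))
```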
Comment 7 Sébastien Jodogne 2020-06-29 15:26:13 CEST
[BitBucket user: Michael Hobbs]
[BitBucket date: 2019-08-19.19:37:35]

.