Bug 155 - Failure to respond to client after initiating clean shutdown
Status: CONFIRMED
Alias: None
Product: Orthanc
Classification: Unclassified
Component: Orthanc Core
Version: unspecified
Hardware: All All
Importance: Lowest normal
Assignee: Sébastien Jodogne
Reported: 2020-06-29 15:16 CEST by Sébastien Jodogne
Modified: 2021-04-21 13:39 CEST

Description Sébastien Jodogne 2020-06-29 15:16:10 CEST
[BitBucket user: Thibault Nélis]
[BitBucket date: 2019-10-21.16:56:30]

After an interrupt signal (SIGINT) is sent to the Orthanc process, a clean shutdown sequence is expected, including connection draining. However, the server fails to respond to clients that are waiting for synchronous outbound transfers (which they initiated) to complete. Instead, their connections are aborted.

```
$ ./test large-instance 
== "foo" server information
Address: 172.21.0.2
Orthanc version: 1.5.8
DICOMweb plugin version: "1.0"

== "bar" server information
Address: 172.21.0.3
Orthanc version: 1.5.8
DICOMweb plugin version: "1.0"

== Environment preparation
Uploading "large-instance" ...
Instance ID: d8996bb1-0ce7e959-e33317c7-eead9f4f-dc934f15
Study ID: efe74f24-d89b716e-5e3d93cd-8081e0c1-7abfdd42
Instance size: 635735934
Reported size: 635735934
Reported hash: 57fb4f1ae04bd32cf6edfec016946730

== Test 1: DICOM, no interruption
Initiating transfer ...
{
  "InstancesCount": 1,
  "FailedInstancesCount": 0
}

real    0m5.768s
user    0m0.218s
sys 0m0.028s
Verifying data integrity ...
OK

== Test 2: DICOM, interruption after one second
Initiating transfer ...
Killing orthanc-clean-shutdown_foo_1 ... done

http: error: ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) while doing POST request to URL: http://172.21.0.2/modalities/bar/store

real    0m7.508s
user    0m0.208s
sys 0m0.025s
Verifying data integrity ...
OK

== Test 3: DICOMweb, no interruption
Initiating transfer ...
{
  "InstancesCount": "1"
}

real    6m46.399s
user    0m0.204s
sys 0m0.032s
Verifying data integrity ...
OK

== Test 4: DICOMweb, interruption after one second
Initiating transfer ...
Killing orthanc-clean-shutdown_foo_1 ... done

http: error: ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) while doing POST request to URL: http://172.21.0.2/dicom-web/servers/bar/stow

real    7m10.411s
user    0m0.200s
sys 0m0.033s
Verifying data integrity ...
OK

== Test 5: Orthanc peering, no interruption
Initiating transfer ...
{
  "InstancesCount": 1,
  "FailedInstancesCount": 0
}

real    0m7.629s
user    0m0.212s
sys 0m0.034s
Verifying data integrity ...
OK

== Test 6: Orthanc peering, interruption after one second
Initiating transfer ...
Killing orthanc-clean-shutdown_foo_1 ... done

http: error: ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) while doing POST request to URL: http://172.21.0.2/peers/bar/store

real    0m6.881s
user    0m0.233s
sys 0m0.027s
Verifying data integrity ...
OK
```

Thankfully, the integrity checks confirm that the data is in fact fully transferred, even though the transfer is not reported as successful to the client.

Judging by the timestamps, the connection to the client appears to be aborted _after_ the transfer to the peer has completed.
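
For comparison, here is a minimal sketch of the draining behavior I would expect, written in Python rather than against the Orthanc code base. Everything in it is hypothetical and for illustration only: `SlowTransferHandler` stands in for a synchronous outbound transfer, `DrainingServer` and `run_demo` are made-up names, and the shutdown is triggered programmatically instead of by SIGINT to keep the example self-contained. It is not how Orthanc is implemented.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class SlowTransferHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that simulates a slow synchronous outbound transfer."""

    def do_POST(self):
        # Consume the request body so the socket is closed cleanly afterwards.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        time.sleep(0.5)  # stand-in for the transfer to the remote peer
        body = b'{"InstancesCount": 1, "FailedInstancesCount": 0}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet


class DrainingServer(ThreadingHTTPServer):
    # Non-daemon handler threads are tracked, so server_close() joins them.
    daemon_threads = False
    block_on_close = True


def run_demo():
    server = DrainingServer(("127.0.0.1", 0), SlowTransferHandler)
    port = server.server_address[1]
    serve_thread = threading.Thread(target=server.serve_forever)
    serve_thread.start()

    result = {}

    def client():
        req = urllib.request.Request(
            f"http://127.0.0.1:{port}/store", data=b"x", method="POST"
        )
        with urllib.request.urlopen(req, timeout=5) as resp:
            result["status"] = resp.status
            result["body"] = resp.read()

    client_thread = threading.Thread(target=client)
    client_thread.start()
    time.sleep(0.1)  # let the request get in flight before shutting down

    server.shutdown()      # stop accepting new connections
    server.server_close()  # blocks until in-flight handlers have responded
    serve_thread.join()
    client_thread.join()
    return result


if __name__ == "__main__":
    print(run_demo())
```

The point of the sketch is the ordering: stop accepting new connections first, then wait for in-flight handlers to send their responses before closing. In the runs above, the transfer itself finishes but the final response to the waiting client is never sent.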

Although not tested here, it is likely the behavior is identical for termination signals (SIGTERM) and calls to `/tools/shutdown`.
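
To check these two untested triggers, something along the following lines could be used. This is only a sketch under assumptions: the helper names are made up, the base URL is the "foo" address from the log above, the server is assumed to accept unauthenticated requests, and `os.kill` only applies when Orthanc runs as a local process (the repro runs it in containers, where the signal would have to be delivered through the container tooling instead).

```python
import os
import signal
import urllib.request


def build_shutdown_request(base_url):
    """Build (but do not send) a POST request to Orthanc's /tools/shutdown route."""
    url = base_url.rstrip("/") + "/tools/shutdown"
    return urllib.request.Request(url, data=b"", method="POST")


def send_termination_signal(pid, sig=signal.SIGTERM):
    """Deliver a termination signal to a local process (POSIX only)."""
    os.kill(pid, sig)


if __name__ == "__main__":
    # Address taken from the "foo" server in the log; adjust as needed.
    req = build_shutdown_request("http://172.21.0.2")
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req, timeout=5)  # uncomment to actually send it
```

Either trigger would be fired about one second after initiating the transfer, as in tests 2, 4 and 6, to see whether the waiting client still receives a response.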

[Repro](https://bitbucket.org/tn-osimis/orthanc-clean-shutdown-test). Tested on three different (but similar) machines.