[BitBucket user: Thibault Nélis] [BitBucket date: 2019-10-21.17:06:36] Outbound DICOMweb transfers utilize all CPU resources. This results in extremely low throughput. In the following example, an instance which takes about 5 to 6 seconds to transfer via DICOM or Orthanc peering takes as much as 6min 46s to transfer via DICOMweb (Store Transaction / STOW-RS).

```
$ ./test large-instance
== "foo" server information
Address: 172.21.0.2
Orthanc version: 1.5.8
DICOMweb plugin version: "1.0"

== "bar" server information
Address: 172.21.0.3
Orthanc version: 1.5.8
DICOMweb plugin version: "1.0"

== Environment preparation
Uploading "large-instance" ...
Instance ID: d8996bb1-0ce7e959-e33317c7-eead9f4f-dc934f15
Study ID: efe74f24-d89b716e-5e3d93cd-8081e0c1-7abfdd42
Instance size: 635735934
Reported size: 635735934
Reported hash: 57fb4f1ae04bd32cf6edfec016946730

== Test 1: DICOM, no interruption
Initiating transfer ...
{ "InstancesCount": 1, "FailedInstancesCount": 0 }

real    0m5.768s
user    0m0.218s
sys     0m0.028s

Verifying data integrity ... OK

== Test 3: DICOMweb, no interruption
Initiating transfer ...
{ "InstancesCount": "1" }

real    6m46.399s
user    0m0.204s
sys     0m0.032s

Verifying data integrity ... OK

== Test 5: Orthanc peering, no interruption
Initiating transfer ...
{ "InstancesCount": 1, "FailedInstancesCount": 0 }

real    0m7.629s
user    0m0.212s
sys     0m0.034s

Verifying data integrity ... OK
```

[Repro](https://bitbucket.org/tn-osimis/orthanc-clean-shutdown-test). Tested on three different (but similar) machines.
A quick look at performance counters shows the time is spent in-process (sampled during a DICOMweb transfer; does not represent the full transfer):

```
Performance counter stats for process id '4691':

      82,940.62 msec task-clock       #  1.004 CPUs utilized
         20,328      context-switches #  0.245 K/sec
             12      cpu-migrations   #  0.000 K/sec
             16      page-faults      #  0.000 K/sec
<not supported>      cycles
<not supported>      instructions
<not supported>      branches
<not supported>      branch-misses

   82.595531201 seconds time elapsed
```

Although I haven't captured anything yet, I had a sense from looking at live stack profiles that the processor was specifically spending a lot of time moving data in memory. (This is a very hasty observation and is likely incorrect.)
[BitBucket user: Michel Rozpendowski] [BitBucket date: 2019-12-20.13:52:02] I tested sending from `foo` to `bar` over DICOMweb with `ChunkedTransfers` set to `false`, and the performance issue is gone. There is no performance issue either with DICOMweb plugin version 0.6 on Orthanc 1.5.8 and `ChunkedTransfers` set to `true`: chunked transfers were only implemented in DICOMweb plugin version 1.0, so the setting is simply ignored there. The problem therefore lies in the implementation of `ChunkedTransfers` set to `true` in DICOMweb plugin version 1.0.
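For reference, this is the shape of the workaround: a sketch of the DICOMweb plugin configuration, assuming the target is declared in the extended form of the `Servers` section (the server name and URL below are placeholders taken from the repro environment, not required values):

```json
{
  "DicomWeb" : {
    "Servers" : {
      "bar" : {
        "Url" : "http://172.21.0.3:8042/dicom-web/",
        "ChunkedTransfers" : false
      }
    }
  }
}
```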
[BitBucket user: Alain Mazy] [BitBucket date: 2019-12-20.17:42:55] This line seems to be the culprit: https://hg.orthanc-server.com/orthanc/file/a648c2d67a6502fe0cac76e540a11566da6adf27/Core/HttpClient.cpp#L243 There is a huge realloc + memcpy each time a 65KB chunk is transferred out of the 600MB buffer (in the case of a 600MB instance). I'll fix that one of these days.
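That cost grows quadratically: if each 65KB chunk forces the whole buffer accumulated so far to be reallocated and copied, the total memory traffic for an N-byte body is on the order of N²/(2·65KB). A minimal C++ sketch of the two accumulation strategies (the function names are illustrative, not Orthanc's actual API):

```cpp
#include <cassert>
#include <cstring>
#include <list>
#include <string>

// Quadratic pattern, similar in spirit to the problematic code: growing a
// single contiguous buffer by one chunk at a time can trigger a realloc plus
// a copy of everything accumulated so far on every append.
static std::string AppendContiguous(const std::list<std::string>& chunks)
{
  std::string body;
  for (const std::string& c : chunks)
  {
    body.resize(body.size() + c.size());  // may reallocate and move the whole buffer
    memcpy(&body[body.size() - c.size()], c.data(), c.size());
  }
  return body;
}

// Linear pattern, the spirit of the fix: keep the incoming chunks separate,
// then concatenate once at the end after a single up-front allocation.
static std::string AppendChunked(const std::list<std::string>& chunks)
{
  size_t total = 0;
  for (const std::string& c : chunks)
  {
    total += c.size();
  }

  std::string body;
  body.reserve(total);  // one allocation, no repeated reallocs
  for (const std::string& c : chunks)
  {
    body.append(c);
  }
  return body;
}
```

Both produce the same bytes; only the amount of data moved differs. With a 600MB instance in 65KB chunks (roughly 9,200 chunks), the contiguous version moves on the order of terabytes of memory in total, which is consistent with minutes of pure in-process CPU time.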
[BitBucket user: Alain Mazy] [BitBucket date: 2019-12-23.09:44:41] This has been fixed in this commit: https://hg.orthanc-server.com/orthanc/changeset/d5740d3b1d67a09e3fa4d545df2080f58324731b Note that the fix is in the Orthanc core, not in the DICOMweb plugin.
This initial fix could lead to a crash in Orthanc 1.6.0 through 1.7.2. Orthanc 1.7.3 will include another fix to resolve the crash: https://hg.orthanc-server.com/orthanc/rev/8c559dd5034b