view NOTES @ 445:cec6a0cd399f pg-transactions

now logging DatabaseCannotSerialize as a warning only with the details (Orthanc will log it as an error after all retries)
author Alain Mazy <am@osimis.io>
date Wed, 10 Jan 2024 15:25:03 +0100
parents 2ca939d02d39
children 9e039e65d68e
line wrap: on
line source

Resources:
*********
- PG transaction modes explained: https://www.postgresql.org/files/developer/concurrency.pdf
- Isolation level explained (PG + MySQL): https://amirsoleimani.medium.com/understanding-database-isolation-level-via-examples-mysql-and-postgres-a86b5502d404
- Message queuing in PG: https://www.crunchydata.com/blog/message-queuing-using-native-postgresql


Create and delete instances Internals:
*************************************

isNewInstance = CreateInstance(...)

if (!isNewInstance && overwriteInstances)
  DeleteResource(instance)
       -> ClearDeletedFiles(manager);
            DELETE FROM DeletedFiles  ------> this is not a temporary table in postgres but it is supposed to be empty before and after each transaction -> it is empty when taking a snapshot of the DB in READ COMMITTED mode!!!
          ClearDeletedResources(manager);
            DELETE FROM DeletedResources  ------> this is not a temporary table in postgres but it is supposed to be empty before and after each transaction !!!

            DELETE FROM RemainingAncestor  ------> this is not a temporary table in postgres but it is supposed to be empty before and after each transaction !!!
            DELETE FROM Resources WHERE internalId=..
               -> cascades delete the MainDicomTags, the Metadata and the AttachedFiles
                  -> this triggers AttachedFileDeletedFunc
                         INSERT INTO DeletedFiles VALUES
                            (old.uuid, old.filetype, old.compressedSize,
                            old.uncompressedSize, old.compressionType,
                            old.uncompressedHash, old.compressedHash);
                        RETURN NULL;
               -> this triggers a SQL trigger: ResourceDeletedFunc
                        INSERT INTO DeletedResources VALUES (old.resourceType, old.publicId);
                        IF EXISTS (SELECT 1 FROM Resources WHERE parentId = old.parentId) THEN
                            -- Signal that the deleted resource has a remaining parent 
                            -- (a parent that must not be deleted but whose LastUpdate must be updated)
                            INSERT INTO RemainingAncestor
                            SELECT resourceType, publicId FROM Resources WHERE internalId = old.parentId;
                        ELSE
                            -- Delete a parent resource when its unique child is deleted 
                            DELETE FROM Resources WHERE internalId = old.parentId;
                        END IF;

            SELECT * FROM RemainingAncestor
              -> SignalRemainingAncestor()  // There is at most 1 remaining ancestor
                 -> ServerIndex::TransactionContext::SignalRemainingAncestor()
                    -> stores remainingType and remainingPublicId (this is used in StatelessDatabaseOperations::DeleteResource to build the Rest Response of /delete 
                                                                   and to update the LastUpdate of all parent (only when deleted from /delete))

          SignalDeletedFiles(output, manager);
            SELECT * FROM DeletedFiles
              -> SignalDeletedAttachment()
                 -> ServerIndex::TransactionContext::SignalAttachmentDeleted()
                    -> pendingFilesToRemove_.push_back(FileToRemove(info))  (files are deleted in CommitFilesToRemove in the ServerIndex::TransactionContext::Commit)

          SignalDeletedResources(output, manager);
            SELECT resourceType, publicId FROM DeletedResources
              -> SignalDeletedResource()
                 -> Emit DeletedResource event (lua)


  if (!CreateInstance(...))
    Error: "No new instance while overwriting; this should not happen"

if isNewInstance -> LogChange
if isNewSeries -> LogChange
....

Sample SQL code that you can execute in DBeaver to test new functions/procedures:

CreateInstance
************************************************************************

CREATE OR REPLACE FUNCTION CreateInstance(
  IN patient_public_id TEXT,
  IN study_public_id TEXT,
  IN series_public_id TEXT,
  IN instance_public_id TEXT,
  OUT is_new_patient BIGINT,
  OUT is_new_study BIGINT,
  OUT is_new_series BIGINT,
  OUT is_new_instance BIGINT,
  OUT patient_internal_id BIGINT,
  OUT study_internal_id BIGINT,
  OUT series_internal_id BIGINT,
  OUT instance_internal_id BIGINT) AS $body$

BEGIN
	is_new_patient := 1;
	is_new_study := 1;
	is_new_series := 1;
	is_new_instance := 1;

	
	BEGIN
        INSERT INTO "resources" VALUES (DEFAULT, 0, patient_public_id, NULL);
    EXCEPTION
        WHEN unique_violation THEN
            is_new_patient := 0;
    END;
    SELECT internalid INTO patient_internal_id FROM "resources" WHERE publicId = patient_public_id AND resourcetype = 0 FOR UPDATE;  -- also locks the resource and its parent to prevent from deletion while we complete this transaction

	BEGIN
        INSERT INTO "resources" VALUES (DEFAULT, 1, study_public_id, patient_internal_id);
    EXCEPTION
        WHEN unique_violation THEN
            is_new_study := 0;
    END;
    SELECT internalid INTO study_internal_id FROM "resources" WHERE publicId = study_public_id AND resourcetype = 1 FOR UPDATE;  -- also locks the resource and its parent to prevent from deletion while we complete this transaction

	BEGIN
	    INSERT INTO "resources" VALUES (DEFAULT, 2, series_public_id, study_internal_id);
    EXCEPTION
        WHEN unique_violation THEN
            is_new_series := 0;
    END;
	SELECT internalid INTO series_internal_id FROM "resources" WHERE publicId = series_public_id AND resourcetype = 2 FOR UPDATE;  -- also locks the resource and its parent to prevent from deletion while we complete this transaction

  	BEGIN
		INSERT INTO "resources" VALUES (DEFAULT, 3, instance_public_id, series_internal_id);
    EXCEPTION
        WHEN unique_violation THEN
            is_new_instance := 0;
    END;
    SELECT internalid INTO instance_internal_id FROM "resources" WHERE publicId = instance_public_id AND resourcetype = 3 FOR UPDATE;  -- also locks the resource and its parent to prevent from deletion while we complete this transaction

    IF is_new_instance > 0 THEN
        -- Move the patient to the end of the recycling order.
        PERFORM PatientAddedOrUpdated(patient_internal_id, 1);
    END IF;  
END;

$body$ LANGUAGE plpgsql;


DO $$ 
DECLARE 
    result record;
begin
	delete from "resources";

    SELECT * INTO result from CreateInstance('patient1', 'study1', 'series1', 'instance1');

    RAISE NOTICE 'Value patientInternalId: %, is_new: %', result.patient_internal_id, result.is_new_patient;
    RAISE NOTICE 'Value studyInternalId: %, is_new: %', result.study_internal_id, result.is_new_study;
    RAISE NOTICE 'Value seriesInternalId: %, is_new: %', result.series_internal_id, result.is_new_series;
    RAISE NOTICE 'Value instanceInternalId: %, is_new: %', result.instance_internal_id, result.is_new_instance;
    RAISE NOTICE '--------------';

    SELECT * INTO result from CreateInstance('patient1', 'study1', 'series1', 'instance2');

    RAISE NOTICE 'Value patientInternalId: %, is_new: %', result.patient_internal_id, result.is_new_patient;
    RAISE NOTICE 'Value studyInternalId: %, is_new: %', result.study_internal_id, result.is_new_study;
    RAISE NOTICE 'Value seriesInternalId: %, is_new: %', result.series_internal_id, result.is_new_series;
    RAISE NOTICE 'Value instanceInternalId: %, is_new: %', result.instance_internal_id, result.is_new_instance;
    RAISE NOTICE '--------------';

    SELECT * INTO result from CreateInstance('patient1', 'study1', 'series1', 'instance2');

    RAISE NOTICE 'Value patientInternalId: %, is_new: %', result.patient_internal_id, result.is_new_patient;
    RAISE NOTICE 'Value studyInternalId: %, is_new: %', result.study_internal_id, result.is_new_study;
    RAISE NOTICE 'Value seriesInternalId: %, is_new: %', result.series_internal_id, result.is_new_series;
    RAISE NOTICE 'Value instanceInternalId: %, is_new: %', result.instance_internal_id, result.is_new_instance;
    RAISE NOTICE '--------------';

END $$;


-- \set patient_key 'patient_key'
-- SELECT CreateInstance('patient', 'study', 'series', 'instance', patient_key) ;

-- drop function CreateInstance
-- select * from "resources";
-- delete from "resources";
-- INSERT INTO "resources" VALUES (DEFAULT, 0, 'patient', NULL)



************************************************************************

In debug, no verbose logs, 10 connections
Orthanc 1.12.1 + PG 5.1 (serializable mode)          : test_concurrent_anonymize_same_study with 4 workers and 10x repeat: 43.957 s
Orthanc mainline + PG mainline (read-committed mode) : test_concurrent_anonymize_same_study with 4 workers and 10x repeat: 15.744 s
                                                       test_concurrent_anonymize_same_study deletion took: 18.8 s

Orthanc 1.12.1 + PG 5.1 (serializable mode)          : test_concurrent_uploads_same_study with 20 workers and 1x repeat: 21.214 s
Orthanc mainline + PG mainline (read-committed mode) : test_concurrent_uploads_same_study with 20 workers and 1x repeat: 6.57 s

Orthanc 1.12.1 + PG 5.1 (serializable mode)          : test_upload_delete_same_study_from_multiple_threads with 5 workers and 3x repeat (10x): 23.016 s
Orthanc mainline + PG mainline (read-committed mode) : test_upload_delete_same_study_from_multiple_threads with 5 workers and 3x repeat (10x): 7.129 s 

Orthanc mainline + PG mainline (read-committed mode) : test_upload_multiple_studies_from_multiple_threads with 10 workers and 25 files and repeat 3x: 6.454 s

With Docker with 10 connections SQL:
osimis/orthanc:23.11.0: TIMING test_concurrent_anonymize_same_study with 4 workers and 10x repeat: 25.047 s
osimis/orthanc:current: TIMING test_concurrent_anonymize_same_study with 4 workers and 10x repeat: 8.649 s

osimis/orthanc:23.11.0: TIMING test_concurrent_anonymize_same_study deletion took: 11.807 s
osimis/orthanc:current: TIMING test_concurrent_anonymize_same_study deletion took: 10.620 s

osimis/orthanc:23.11.0: TIMING test_concurrent_uploads_same_study with 20 workers and 1x repeat: 10.736 s
osimis/orthanc:current: TIMING test_concurrent_uploads_same_study with 20 workers and 1x repeat: 3.465 s

osimis/orthanc:23.11.0: TIMING test_upload_delete_same_study_from_multiple_threads with 5 workers and 30x repeat: 11.966 s
osimis/orthanc:current: TIMING test_upload_delete_same_study_from_multiple_threads with 5 workers and 30x repeat: 5.092 s

Testing the connecions (note: Orthanc and PG server running on the same server)
10 connections : test_concurrent_anonymize_same_study with 4 workers and 10x repeat: 15.744 s
1  connection  : test_concurrent_anonymize_same_study with 4 workers and 10x repeat: 21.341 s
10 connections : test_concurrent_uploads_same_study with 20 workers and 1x repeat: 6.57 s
1  connection  : test_concurrent_uploads_same_study with 20 workers and 1x repeat: 10.223 s
10 connections : test_upload_delete_same_study_from_multiple_threads with 5 workers and 3x repeat (10x): 7.129 s 
1  connection  : test_upload_delete_same_study_from_multiple_threads with 5 workers and 3x repeat (10x): 11.172 s 


TODO:
- have a separate "thread" to UpdateStatistics ?
- perf tests: upload generated data (different studies)
- check https://discourse.orthanc-server.org/t/image-insert-are-too-slow-databse-performance-too-poor-when-using-mysql-mariadb/3820
- implement a downgrade script ?  And test it in PotgresUpgrades integ tests
- In Docker images, re-enable MySQL & ODBC plugins + tests

- this error sometimes occur...
======================================================================
FAIL: test_upload_delete_same_study_from_multiple_threads (Concurrency.test_concurrency.TestConcurrency)
----------------------------------------------------------------------
    Traceback (most recent call last):
    File "/home/runner/work/orthanc-builder/orthanc-builder/docker/integration-tests/orthanc-tests-repo-normal/NewTests/Concurrency/test_concurrency.py", line 254, in test_upload_delete_same_study_from_multiple_threads
        self.assertEqual(0, len(self.o.studies.get_all_ids()))
    AssertionError: 0 != 1

    2023-12-21 17:10:04.737 UTC [69] ERROR:  deadlock detected
    2023-12-21 17:10:04.737 UTC [69] DETAIL:  Process 69 waits for ShareLock on transaction 20657; blocked by process 70.
        Process 70 waits for ShareLock on transaction 18193; blocked by process 69.
        Process 69: SELECT * FROM DeleteResource($1)
        Process 70: SELECT * FROM CreateInstance($1, $2, $3, $4)
    2023-12-21 17:10:04.737 UTC [69] HINT:  See server log for query details.
    2023-12-21 17:10:04.737 UTC [69] CONTEXT:  while deleting tuple (134,66) in relation "resources"
        SQL statement "DELETE FROM ONLY "public"."resources" WHERE $1 OPERATOR(pg_catalog.=) "parentid""
        SQL statement "DELETE FROM Resources WHERE internalId = old.parentId
                                    AND NOT EXISTS (SELECT 1 FROM Resources WHERE parentId = old.parentId)"
        PL/pgSQL function resourcedeletedfunc() line 7 at SQL statement
        SQL statement "DELETE FROM Resources WHERE internalId=id RETURNING *"
        PL/pgSQL function deleteresource(bigint) line 26 at SQL statement
    2023-12-21 17:10:04.737 UTC [69] STATEMENT:  SELECT * FROM DeleteResource($1)

DONE:
- reenable PatientRecyclingOrder
- force the create/update DB transaction to be serializable (not needed: this is handled by POSTGRESQL_LOCK_DATABASE_SETUP)
- PatientAddedFunc contains an IF  (check if other IF/THEN/ELSE pattern remains)
- validate upgrade DB from previous Orthanc and from scratch
- check minimal version of PG (9.5 - 9.6 ? for create index if not exists): seems to work with 9.5 cfr PotgresUpgrades integ tests
- test events generation StableSeries .... (count the NewSeries, NewInstances event and make sure they match the numb)