Sphinx/source/plugins/python.rst @ changeset 351:e2863083fa30

commit message: multiprocessing
author: Sebastien Jodogne <s.jodogne@gmail.com>
date: Sat, 28 Mar 2020 14:09:30 +0100
      else:
          output.SendMethodNotAllowed('GET')

  orthanc.RegisterRestCallback('/pydicom/(.*)', DecodeInstance)  # (*)


Performance and concurrency
---------------------------

.. highlight:: python

Let us consider the following sample Python script that makes a
CPU-intensive computation on a REST callback::

  import math
  import orthanc
  import time

  # CPU-intensive computation taking about 4 seconds
  def SlowComputation():
      start = time.time()
      for i in range(1000):
          for j in range(30000):
              math.sqrt(float(j))
      end = time.time()
      duration = (end - start)
      return 'computation done in %.03f seconds\n' % duration

  def OnRest(output, uri, **request):
      answer = SlowComputation()
      output.AnswerBuffer(answer, 'text/plain')

  orthanc.RegisterRestCallback('/computation', OnRest)


.. highlight:: text

Calling this REST route from the command line returns the time that is
needed to compute 30 million square roots on your CPU::

  $ curl http://localhost:8042/computation
  computation done in 4.208 seconds

Now, let us call this route three times concurrently (using bash)::

  $ (curl http://localhost:8042/computation & curl http://localhost:8042/computation & curl http://localhost:8042/computation )
  computation done in 11.262 seconds
  computation done in 12.457 seconds
  computation done in 13.360 seconds

As can be seen, the computation time has tripled. This means that the
computations were not distributed across the available CPU cores.
This might seem surprising, as Orthanc is a threaded server (in
Orthanc, a pool of C++ threads serves concurrent requests).

The explanation is that the Python interpreter (`CPython
<https://en.wikipedia.org/wiki/CPython>`__ actually) is built on top
of a so-called `Global Interpreter Lock (GIL)
<https://en.wikipedia.org/wiki/Global_interpreter_lock>`__. The GIL is
basically a mutex that protects all calls to the Python
interpreter. If multiple C++ threads from Orthanc invoke a Python
callback, only one of them can proceed at any given time.

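.. highlight:: python

The effect of the GIL can be reproduced outside of Orthanc. In the
following minimal sketch (a standalone experiment, not part of the
plugin), the same pure-Python, CPU-bound loop is timed once running
alone, then twice on two concurrent threads: because of the GIL, the
two threads never compute simultaneously, so the second measurement is
roughly twice the first::

  import threading
  import time

  # Pure-Python, CPU-bound loop: the GIL remains locked while it runs
  def Burn():
      s = 0
      for i in range(2000000):
          s += i * i
      return s

  # Time one run alone
  start = time.time()
  Burn()
  single = time.time() - start

  # Time two runs on two concurrent threads
  threads = [ threading.Thread(target = Burn) for i in range(2) ]
  start = time.time()
  for t in threads:
      t.start()
  for t in threads:
      t.join()
  both = time.time() - start

  # With the GIL, "both" is close to "2 * single", not to "single"
  print('one run: %.3f s, two threads: %.3f s' % (single, both))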
.. highlight:: python

The solution is to use the `multiprocessing primitives
<https://docs.python.org/3/library/multiprocessing.html>`__ of Python.
The "master" Python interpreter that is initially started by the
Orthanc plugin can start several `child processes
<https://en.wikipedia.org/wiki/Process_(computing)>`__, each of them
running a separate Python interpreter. This makes it possible to
offload intensive computations from the "master" Python interpreter of
Orthanc onto those "slave" interpreters. The ``multiprocessing``
library is actually quite straightforward to use::

  import math
  import multiprocessing
  import orthanc
  import signal
  import time

  # CPU-intensive computation taking about 4 seconds
  # (same code as above)
  def SlowComputation():
      start = time.time()
      for i in range(1000):
          for j in range(30000):
              math.sqrt(float(j))
      end = time.time()
      duration = (end - start)
      return 'computation done in %.03f seconds\n' % duration

  # Ignore CTRL+C in the slave processes
  def Initializer():
      signal.signal(signal.SIGINT, signal.SIG_IGN)

  # Create a pool of 4 slave Python interpreters
  POOL = multiprocessing.Pool(4, initializer = Initializer)

  def OnRest(output, uri, **request):
      # Offload the call to "SlowComputation" onto one slave process.
      # The GIL is unlocked until the slave sends its answer back.
      answer = POOL.apply(SlowComputation)
      output.AnswerBuffer(answer, 'text/plain')

  orthanc.RegisterRestCallback('/computation', OnRest)

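The call to ``POOL.apply()`` blocks the Orthanc thread that serves the
HTTP request, but the GIL is released while the slave process works,
which is what lets the other requests proceed. That the tasks really
run in separate processes can be checked outside of Orthanc with the
following minimal sketch (a standalone experiment, not part of the
plugin), in which two Python threads concurrently submit a task to a
pool of two slaves, each task reporting the process identifier of the
slave that executed it::

  import multiprocessing
  import os
  import threading

  # Executed inside a slave process: report its process identifier
  def GetSlavePid():
      return os.getpid()

  POOL = multiprocessing.Pool(2)

  RESULTS = []

  def Worker():
      # "apply()" blocks this thread, but not the other Python threads
      RESULTS.append(POOL.apply(GetSlavePid))

  threads = [ threading.Thread(target = Worker) for i in range(2) ]
  for t in threads:
      t.start()
  for t in threads:
      t.join()

  # The tasks were executed outside of the "master" process
  print(os.getpid() in RESULTS)   # Prints "False"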
.. highlight:: text

Here is now the result of calling this route three times concurrently::

  $ (curl http://localhost:8042/computation & curl http://localhost:8042/computation & curl http://localhost:8042/computation )
  computation done in 4.211 seconds
  computation done in 4.215 seconds
  computation done in 4.225 seconds

As can be seen, the calls to the Python computation now run fully in
parallel: the total time drops from about 12 seconds back to 4
seconds, the same as for one isolated request.

Note also how the ``multiprocessing`` library gives fine-grained
control over the computational resources that are available to the
Python script: the number of "slave" interpreters can easily be
changed in the constructor of the ``multiprocessing.Pool`` object, and
is fully independent of the number of threads used by the Orthanc
server.

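.. highlight:: python

For instance, assuming one wants one "slave" interpreter per CPU core,
the pool could be sized as follows (a hypothetical variation on the
script above, using ``multiprocessing.cpu_count()``)::

  import multiprocessing
  import signal

  # Ignore CTRL+C in the slave processes (same role as above)
  def Initializer():
      signal.signal(signal.SIGINT, signal.SIG_IGN)

  # Create one slave Python interpreter for each CPU core
  # that is reported by the operating system
  POOL = multiprocessing.Pool(multiprocessing.cpu_count(),
                              initializer = Initializer)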
Obviously, an in-depth discussion of the ``multiprocessing`` library
is beyond the scope of this document; many references are available on
the Internet. Also, note that multithreading (Python's ``threading``
module) is not useful here, as Python threads are also limited by the
GIL, and are better suited to dealing with costly I/O operations.
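.. highlight:: python

As an illustration of this last point, here is a minimal sketch (a
standalone experiment, not part of the plugin) in which four Python
threads wait concurrently on a simulated I/O operation: unlike in the
CPU-bound case, the four 0.2-second waits overlap, because
``time.sleep()`` releases the GIL, just as real network or disk
accesses do::

  import threading
  import time

  # Simulated I/O operation: "time.sleep()" releases the GIL,
  # just like a real network or disk access would
  def SimulatedNetworkCall():
      time.sleep(0.2)

  threads = [ threading.Thread(target = SimulatedNetworkCall)
              for i in range(4) ]

  start = time.time()
  for t in threads:
      t.start()
  for t in threads:
      t.join()
  elapsed = time.time() - start

  # The four waits overlap, so this prints about 0.2 second, not 0.8
  print('elapsed: %.3f s' % elapsed)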