Prabhath Kota: Python multiprocessing fork

Mar 27, 2020

Python multiprocessing fork

Fork
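# When we fork, the entire Python process is duplicated, including the Python interpreter, code, libraries, current stack, etc. (the operating system typically shares the memory pages copy-on-write, so nothing is physically copied until one side writes to it).
# The child is therefore a new copy of the Python interpreter.
# After the fork there are two Python interpreters (parent and child), each with its own GIL.
# Fork is faster than spawn: a forked child inherits the parent's resources, while spawn starts a fresh interpreter and re-imports the main module, re-running everything outside the if __name__ == '__main__' guard.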
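
# A minimal sketch of that inheritance, assuming a POSIX system where fork is available.
# The names counter, child_task and run_demo are made up for this illustration.
# The child sees the value the parent set before the fork, and its own change never reaches the parent.

```python
import multiprocessing
import os

counter = 0  # module-level state that lives in the parent's memory


def child_task(conn):
    """Runs in the forked child and reports what it inherited."""
    global counter
    inherited = counter   # value copied from the parent at fork time
    counter += 100        # changes only the child's private copy
    conn.send((os.getpid(), inherited, counter))
    conn.close()


def run_demo():
    global counter
    counter = 42  # set at run time, so a re-import alone would not reproduce it

    # Ask for a fork context explicitly instead of relying on the default.
    ctx = multiprocessing.get_context("fork")
    parent_conn, child_conn = ctx.Pipe()
    p = ctx.Process(target=child_task, args=(child_conn,))
    p.start()
    child_pid, inherited, child_value = parent_conn.recv()
    p.join()

    # (child has its own pid, value the child inherited,
    #  value after the child's change, value still held by the parent)
    return child_pid != os.getpid(), inherited, child_value, counter


if __name__ == "__main__":
    print(run_demo())
```

# Using get_context("fork") keeps the choice local to this example and leaves the global start method untouched.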

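# Fork is the default start method on Linux up to Python 3.13. On macOS the default has been spawn since Python 3.8, and Python 3.14 switched the default on the other POSIX platforms to forkserver.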
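
# Since the default differs between platforms and Python versions, it is worth checking rather than assuming.
# A small sketch using the standard multiprocessing helpers; describe_start_methods is a made-up name.

```python
import multiprocessing


def describe_start_methods():
    """Return the start methods this platform supports and the one in effect."""
    available = multiprocessing.get_all_start_methods()
    # Note: calling get_start_method() without allow_none=True also fixes
    # the default for the rest of the program if none was chosen yet.
    default = multiprocessing.get_start_method()
    return available, default


if __name__ == "__main__":
    available, default = describe_start_methods()
    print("available:", available)
    print("default:", default)
    print("fork supported:", "fork" in available)
```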

#Disadvantages of Fork
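# It is not available on Windows.
# The child gets a copy of the parent's libraries, values and data structures, including locks. If a lock is held by another thread in the parent at the moment of the fork, the child's copy stays locked and the child ends up waiting for it forever.
# This is very hard to debug when an imported third-party module/library uses threads behind the scenes.
# Fork and multi-threading do not mix well: only the thread that called fork exists in the child.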
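
# The lock problem can be reproduced in a few lines. This is only a sketch, assuming a POSIX system with fork.
# hold_lock, child_task and run_demo are made-up names, and the child uses a one-second timeout so the demo ends instead of hanging.

```python
import multiprocessing
import threading

lock = threading.Lock()


def hold_lock(acquired, release):
    """Runs in a parent thread and keeps the lock while the fork happens."""
    with lock:
        acquired.set()
        release.wait()


def child_task(conn):
    # The child's copy of the lock is still marked as held, but the thread
    # that held it does not exist in the child, so nobody will release it.
    got_it = lock.acquire(timeout=1)
    conn.send(got_it)
    conn.close()


def run_demo():
    acquired = threading.Event()
    release = threading.Event()
    holder = threading.Thread(target=hold_lock, args=(acquired, release))
    holder.start()
    acquired.wait()  # make sure the lock is held before forking

    ctx = multiprocessing.get_context("fork")
    parent_conn, child_conn = ctx.Pipe()
    p = ctx.Process(target=child_task, args=(child_conn,))
    p.start()
    child_got_lock = parent_conn.recv()
    p.join()

    release.set()  # let the parent thread release the lock and finish
    holder.join()
    return child_got_lock


if __name__ == "__main__":
    print("child acquired the lock:", run_demo())
```

# Without the timeout the child would block forever. Recent Python versions also emit a DeprecationWarning when a multi-threaded process forks.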

from multiprocessing import Process
import multiprocessing
import os

file_desc = None

def process_task1():
    # write to the file in the child process; file_desc is inherited from the parent through fork
    file_desc.write(f"\nWritten by child process with id {os.getpid()}")
    file_desc.flush()

if __name__ == '__main__':
    # create a file in the parent process
    file_desc = open("sample.txt", "w")
    file_desc.write(f"\nWritten by parent process with id {os.getpid()}")
    # flush before forking so the child does not inherit (and re-write) buffered data
    file_desc.flush()

    # Request fork explicitly: it is not the default on every platform or Python version.
    # With spawn or forkserver the child would see file_desc as None.
    multiprocessing.set_start_method('fork')

    p = Process(target=process_task1)
    p.start()
    p.join()
    file_desc.close()

    # read the file back to confirm that both processes wrote to it
    with open("sample.txt", "r") as file_des:
        print(file_des.read())

    os.remove("sample.txt")

Output:
Written by parent process with id 288
Written by child process with id 294
