Friday, 2 April 2021

Simple safe (atomic) writes in Python3

In sensitive circumstances, trusting a traditional file write can be a costly mistake. A simple power cut before the write is completed and synced may at best leave you with some corrupt data, but depending on what that file is used for, you could be in for some serious trouble.

While there are plenty of interesting, weird, or over-engineered solutions available to ensure safe writing, I struggled to find a solution online that was simple, correct and easy to read and that could be run without installing additional modules, so my teammates and I came up with the following solution:
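
A minimal sketch of such a function, pieced together from the fragments explained below. The name safe_write, its signature, and the use of shutil.copy2 in place of copyWithMetaData are assumptions made for illustration, so treat this as one possible shape rather than a definitive implementation:

```python
import os
import shutil
import tempfile


def safe_write(target_file_path, file_contents, mode="w"):
    """Write or append file_contents to target_file_path atomically.

    Hypothetical helper: the name and signature are assumptions.
    """
    target_dir = os.path.dirname(os.path.abspath(target_file_path))
    # The temporary file must live next to the target so that the final
    # os.replace stays on a single file system.
    temp_file = tempfile.NamedTemporaryFile(delete=False, dir=target_dir)
    temp_file.close()  # only the reserved name is needed; it is reopened below
    try:
        # Preserve contents and metadata if the target already exists.
        # shutil.copy2 stands in for the copyWithMetaData helper (assumption).
        if os.path.exists(target_file_path):
            shutil.copy2(target_file_path, temp_file.name)

        with open(temp_file.name, mode) as f:
            f.write(file_contents)
            f.flush()
            os.fsync(f.fileno())

        # Atomic swap: readers see either the old file or the new one.
        os.replace(temp_file.name, target_file_path)
    finally:
        # After a successful replace the temporary name no longer exists;
        # otherwise remove whatever was left behind.
        if os.path.exists(temp_file.name):
            os.remove(temp_file.name)
```

Under these assumptions, safe_write(path, text) creates or overwrites the file, and safe_write(path, text, mode="a") appends to it.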

Explanation:

temp_file = tempfile.NamedTemporaryFile(delete=False,
                                        dir=os.path.dirname(target_file_path))

The first thing to do is create a temporary file in the same directory as the file we're trying to create or update. We do this because move operations (which we'll need later) aren't guaranteed to be atomic when they're between different file systems. Additionally, it's important to set delete=False, as the standard behaviour of the NamedTemporaryFile is to delete itself as soon as it's closed.
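
To see both properties in action, here is a small self-contained check. The directory and the config.txt file name are stand-ins invented for the demo:

```python
import os
import tempfile

# Stand-in for the directory that holds the target file (assumption for the demo).
target_dir = tempfile.mkdtemp()
target_file_path = os.path.join(target_dir, "config.txt")

temp_file = tempfile.NamedTemporaryFile(delete=False,
                                        dir=os.path.dirname(target_file_path))
temp_file.close()

# With delete=False the file is still there after close() ...
survives_close = os.path.exists(temp_file.name)
# ... and it sits in the same directory as the target.
same_directory = os.path.dirname(temp_file.name) == target_dir

os.remove(temp_file.name)  # clean up manually, since nothing else will
```

With the default delete=True, the file would already be gone by the time survives_close is computed.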

# preserve file metadata if it already exists
if os.path.exists(target_file_path):
    copyWithMetaData(target_file_path, temp_file.name)

We needed to support both file creation and updates, so in the case that we’re overwriting or appending to an existing file, we initialize the temporary file with the target file’s contents and metadata.
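
The copyWithMetaData helper isn't shown above. One way it might look, assuming that "metadata" means permission bits, timestamps, and (where the platform and privileges allow it) ownership:

```python
import os
import shutil


def copyWithMetaData(source_path, destination_path):
    """Copy contents, permission bits, and timestamps; keep ownership if possible."""
    # copy2 copies the data plus mode bits and access/modification times.
    shutil.copy2(source_path, destination_path)
    # Ownership is not covered by copy2. os.chown is only available on Unix
    # and only succeeds with sufficient privileges, so treat it as best effort.
    if hasattr(os, "chown"):
        source_stat = os.stat(source_path)
        try:
            os.chown(destination_path, source_stat.st_uid, source_stat.st_gid)
        except PermissionError:
            pass
```

This is a sketch of the idea, not necessarily the helper we used; if ownership doesn't matter in your setup, shutil.copy2 on its own may be enough.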

with open(temp_file.name, mode) as f:
    f.write(file_contents)
    f.flush()
    os.fsync(f.fileno())

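Here we write or append the given file contents to the temporary file. Calling f.flush() hands Python's buffered data to the operating system, and os.fsync() asks the operating system to write it to disk, which prepares us for the most critical step: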

os.replace(temp_file.name, target_file_path)

This is where the magic happens: os.replace is an atomic operation (when the source and target are on the same file system), so the target file is either replaced in full or left exactly as it was. If the operation fails to complete, no harm will be done.
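
A quick way to convince yourself of the overwrite behaviour, using a throwaway directory and made-up file names:

```python
import os
import tempfile

demo_dir = tempfile.mkdtemp()
target_file_path = os.path.join(demo_dir, "settings.txt")
temp_path = os.path.join(demo_dir, "settings.txt.tmp")  # hypothetical temp name

with open(target_file_path, "w") as f:
    f.write("old contents")
with open(temp_path, "w") as f:
    f.write("new contents")

# Unlike os.rename on Windows, os.replace overwrites an existing destination.
os.replace(temp_path, target_file_path)

with open(target_file_path) as f:
    result = f.read()  # the target now holds the new contents
temp_still_exists = os.path.exists(temp_path)  # the temporary name is gone
```

This only demonstrates the swap itself; it can't show the atomicity, which is a property of the underlying rename on the file system.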

We use the finally clause to remove the temporary file in case something did go wrong along the way, but now the very worst thing that can happen is that we end up with a temporary file ¯\_(ツ)_/¯

