It’s easier for me to access all my data with some sort of central storage – for this purpose I decided on S3 a long time ago. Some cool tools I use do not have native S3 support (yet), but rclone helps with that. In this article I’ll show you how to use the rclone docker volume plugin with a MinIO S3 storage configured in docker-compose.yml for use with paperless-ngx.
Backups. That’s how we shall always start. Once you are sure you have a backup, I would advise making an update first if possible, so that nothing stops you from re-importing your documents later. Once you have updated your paperless-ngx docker compose installation, continue by exporting all documents:
docker compose exec -T webserver document_exporter -d -c ../export
100%|██████████| 1020/1020 [00:35<00:00, 29.08it/s]
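If you want an extra safety net in addition to your regular backup, you can archive the export directory on the host before changing anything. A minimal sketch, assuming the bind-mounted ./export directory from the compose file:
tar czf paperless-export-$(date +%F).tar.gz ./export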
The next step is to install the rclone docker volume plugin on the host:
apt-get -y install fuse
mkdir -p /var/lib/docker-plugins/rclone/config
mkdir -p /var/lib/docker-plugins/rclone/cache
docker plugin install rclone/docker-volume-rclone:amd64 args="-v" --alias rclone --grant-all-permissions
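You can check that the plugin was installed and is enabled:
docker plugin ls
The output should list rclone:latest with ENABLED set to true.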
Then edit your docker-compose.yml. Here are the relevant snippets. Basically, add the named volume s3 to the volumes list of the webserver. The mountpoint is /usr/src/paperless/media/documents; I used that directory so that logs, locks and other stuff do not end up in the S3 bucket. Comment out the old media volume (you may still reference the new volume as media, but then you may need to docker volume rm the current media volume first or the mount won’t work – see the sketch after the snippet).
webserver:
  image: ghcr.io/paperless-ngx/paperless-ngx:latest
  [..]
  volumes:
    - s3:/usr/src/paperless/media/documents
    - data:/usr/src/paperless/data
    #- media:/usr/src/paperless/media
    - ./export:/usr/src/paperless/export
    - ./consume:/usr/src/paperless/consume
  [..]
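If you would rather keep referencing the volume as media instead of s3, you first have to get rid of the old local volume. A sketch, assuming the default compose project name paperless (so the existing volume is called paperless_media) – this deletes the old local media data, so only run it after a successful export:
docker compose down
docker volume rm paperless_media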
Then, at the end of the file, there is a section for all the volumes. Comment out the media section and add the s3 section:
volumes:
  data:
  #media:
  pgdata:
  redisdata:
  s3:
    driver: rclone
    driver_opts:
      remote: "minio:paperless-jean"
      allow_other: "true"
      vfs_cache_mode: "full"
The name of my bucket is paperless-jean, and I reference the remote as “minio”. Create an rclone.conf in /var/lib/docker-plugins/rclone/config; it could look like this – just an example:
[minio]
type = s3
region = somewhere-over-the-rainbow
endpoint = https://your-s3:9000
provider = Minio
env_auth = false
access_key_id =
secret_access_key =
acl = bucket-owner-full-control
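You can test the config with the host’s rclone binary before starting the stack (assuming rclone is also installed on the host; the mkdir is only needed if the bucket does not exist yet):
rclone --config /var/lib/docker-plugins/rclone/config/rclone.conf lsd minio:
rclone --config /var/lib/docker-plugins/rclone/config/rclone.conf mkdir minio:paperless-jean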
Then do the usual docker compose pull and docker compose up -d. You should see the s3 mount on the host as well, e.g. df -hT should contain something like:
df -hT | grep minio
minio:paperless-jean fuse.rclone 1.0P 0 1.0P 0% /var/lib/docker/plugins/[..]/propagated-mount/paperless_s3
docker volume list should contain paperless_s3:
docker volume list | grep s3
rclone:latest paperless_s3
Now re-import all documents using:
docker compose exec -T webserver document_importer ../export
Found existing user(s), this might indicate a non-empty installation
Found existing documents(s), this might indicate a non-empty installation
Checking the manifest
Installed 1299 object(s) from 1 fixture(s)
Copy files into paperless...
100%|██████████| 1020/1020 [00:56<00:00, 17.96it/s]
Updating search index...
100%|██████████| 1020/1020 [00:41<00:00, 24.52it/s]
According to the above, 1020 files were imported; according to my S3 there are now 2961 objects in the bucket – presumably because paperless-ngx stores archived versions and thumbnails alongside the originals. Now that I have all my paperless-ngx documents in my S3 as well, I can easily access them from other places, too.
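For example, with the same remote configured on another machine, plain rclone is enough (the originals prefix is an assumption based on paperless-ngx’s media layout – run the lsd first to check):
rclone lsd minio:paperless-jean
rclone copy minio:paperless-jean/originals ./paperless-originals --dry-run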