Updating & Upgrading paperless-ngx stack (Installed with docker compose)

Just some notes on how I update my paperless-ngx stack, which I installed with docker compose as explained in one of my previous posts. In this one I’ll also upgrade PostgreSQL from 15 to 16 and Gotenberg from 7.10 to 8. This is with paperless-ngx 2.5.3.

Updates

Updating paperless-ngx itself is pretty simple. I installed it to /opt/paperless. So I switch directory to /opt/paperless/paperless-ngx:

cd /opt/paperless/paperless-ngx/

Then I stop all containers, pull and start everything again.

Stopping

~:/opt/paperless/paperless-ngx# sudo -Hu paperless docker compose down
[+] Running 6/6
 ✔ Container paperless-webserver-1  Removed                              6.2s 
 ✔ Container paperless-db-1         Removed                              0.3s 
 ✔ Container paperless-gotenberg-1  Removed                             10.2s 
 ✔ Container paperless-tika-1       Removed                              0.2s 
 ✔ Container paperless-broker-1     Removed                              0.4s 
 ✔ Network paperless_default        Removed                              0.2s 

Updating / Pulling

~:/opt/paperless/paperless-ngx# sudo -Hu paperless docker compose pull   
[+] Pulling 15/15
 ✔ db Pulled                                                             1.0s 
 ✔ webserver Pulled                                                      0.5s 
 ✔ tika Pulled                                                           0.5s 
 ✔ broker Pulled                                                         1.0s 
 ✔ gotenberg 10 layers [⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿]      0B/0B      Pulled          48.6s 
   ✔ e1caac4eb9d2 Already exists                                         0.0s 
   ✔ 0a8a5d218416 Pull complete                                          0.4s 
   ✔ 22b9a3162daf Pull complete                                         15.0s 
   ✔ ff469c77f2a3 Pull complete                                         19.5s 
   ✔ a994f2588bfe Pull complete                                         16.3s 
   ✔ 034e8a1d21d3 Pull complete                                         23.4s 
   ✔ 63331405a93d Pull complete                                         17.0s 
   ✔ 2e13e3166189 Pull complete                                         17.5s 
   ✔ 206d2ba0ffba Pull complete                                         18.6s 
   ✔ 4f4fb700ef54 Pull complete                                         19.0s 

Starting

~:/opt/paperless/paperless-ngx# sudo -Hu paperless docker compose up -d
[+] Running 5/6
 ⠼ Network paperless_default        Created                               1.4s 
 ✔ Container paperless-tika-1       Started                              0.6s 
 ✔ Container paperless-db-1         Started                              0.7s 
 ✔ Container paperless-gotenberg-1  Started                              0.9s 
 ✔ Container paperless-broker-1     Started                              1.0s 
 ✔ Container paperless-webserver-1  Started                              1.2s 

That’s it for updates.

Automating this

If you pin versions you can automate the pull; if you use “latest” you might not want to automate it. I believe latest would also be safe for redis. For PostgreSQL it would be bad, since PostgreSQL needs a database upgrade whenever the major version changes. For Gotenberg I’m honestly not sure, but I read some (older) reports that a change of the Gotenberg version made some functionality in paperless-ngx buggy.

I use the following versions (do not set postgres to :16 unless you plan to do the upgrade described later in this article):

~:/opt/paperless/paperless-ngx# grep image docker-compose.yml
image: docker.io/library/redis:latest
image: docker.io/library/postgres:16
image: ghcr.io/paperless-ngx/paperless-ngx:latest
image: docker.io/gotenberg/gotenberg:8
image: ghcr.io/paperless-ngx/tika:latest
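
To see at a glance which services would follow a moving tag on the next pull, something like the following could be used. It is only a minimal sketch: the function name list_floating_images is made up for illustration, and it only looks at plain image: lines in the compose file.

```shell
#!/bin/bash
# Hypothetical helper (not part of paperless-ngx or Docker): print every image
# in a compose file that uses the "latest" tag or no tag at all.
list_floating_images() {
    # Keep only the value after "image:" on each matching line.
    sed -n 's/^[[:space:]]*image:[[:space:]]*//p' "$1" |
    while read -r image; do
        # The part after the last "/" is "name" or "name:tag".
        name_tag=${image##*/}
        case $name_tag in
            *:latest) echo "$image" ;;  # explicitly follows latest
            *:*)      ;;                # pinned to a tag, nothing to report
            *)        echo "$image" ;;  # no tag, Docker assumes latest
        esac
    done
}

# Example call:
# list_floating_images /opt/paperless/paperless-ngx/docker-compose.yml
```

Every image it prints can change on an automated pull, so for PostgreSQL (and probably Gotenberg) that list should stay empty.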

A very simple script (there are better ways to do this) which could run every night might be:

#!/bin/bash

set -e

# you may want to add sudo -Hu paperless in front of the commands
# if you can't run this as paperless user from your crontab.

PDIR=/opt/paperless/paperless-ngx
LOG=/opt/paperless/docker-compose-cron.log

# take a fresh timestamp for every log line
echo "$(date) Starting Docker Compose Update" >> "$LOG"
cd "$PDIR"
# stop all containers and pull if stopping was successful;
# a failure is only logged, so that set -e does not abort the script
# before the containers are started again
/usr/bin/docker compose down >> "$LOG" 2>&1 && /usr/bin/docker compose pull >> "$LOG" 2>&1 \
    || echo "$(date) down or pull failed, starting with existing images" >> "$LOG"
# start all containers
/usr/bin/docker compose up --wait -d >> "$LOG" 2>&1
echo "$(date) Finished Docker Compose Update" >> "$LOG"

You can add it to the cron jobs of the user paperless (if that user has Docker permissions) like so:

~:/opt/paperless# crontab -l -u paperless | grep -v "^#"
0 1 * * * /opt/paperless/update-paperless.sh
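
If the nightly job could ever overlap with a manual update, a simple lock might help. The following is a sketch of one possible approach, a lock directory created with mkdir. The function name run_locked and the lock path are assumptions, not something the setup above requires.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command only if no other run holds the lock.
# mkdir is atomic, so only one caller can create the lock directory.
run_locked() {
    lock_dir=$1
    shift
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "lock $lock_dir exists, skipping this run" >&2
        return 1
    fi
    # Run the wrapped command and remember its exit status.
    "$@"
    status=$?
    rmdir "$lock_dir"
    return "$status"
}

# Example call (the lock path is an assumption):
# run_locked /tmp/paperless-update.lock /opt/paperless/update-paperless.sh
```

One caveat of this design: if the wrapped script gets killed, the lock directory stays behind and has to be removed by hand.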

Upgrades

If you want to upgrade to a new major version, however, you have to edit docker-compose.yml and change the version in the image: lines. This is also why I wouldn’t use latest for Gotenberg. Before I started the upgrade, my docker-compose.yml contained the following versions:

~:/opt/paperless/paperless-ngx# grep image docker-compose.yml
image: docker.io/library/redis:7
image: docker.io/library/postgres:15
image: ghcr.io/paperless-ngx/paperless-ngx:latest
image: docker.io/gotenberg/gotenberg:7.10
image: ghcr.io/paperless-ngx/tika:latest

These are also the settings found in the Git repository linked from the official documentation, at least at the time I wrote this post.

I tried upgrading PostgreSQL to Version 16 and Gotenberg to Version 8. So far everything seems to work fine. However, you should check if Postgres 16 and Gotenberg 8 are officially supported and only do this upgrade if they are.

So don’t continue if you’re not sure. Also create a backup or at least a snapshot before you continue. In my case paperless is a virtualized system, so I can simply create a snapshot in the hypervisor. Here are some snippets showing how I do that from my host:

# [..] (output of zfs list -t all | grep 116)
tank/kvm/116/root           17.4G   607G     9.29G  -
tank/kvm/116/root@transfer   672M      -     1.46G  -
tank/kvm/116/root@fuzzy     2.37G      -     11.2G  -
tank/kvm/116/root@backup    1.76G      -     13.7G  -
tank/kvm/116/root@backup2   4.67M      -     9.30G  -

# [..] (I basically f**ked up one update, so here I rollback)
~# zfs rollback tank/kvm/116/root@backup2  -r

# [..] (I start my VM again)
~# virsh start 116
Domain '116' started

# [..] (I create a new snapshot)
~# zfs snapshot tank/kvm/116/root@postgres15-step1

So you can see, having snapshots (yes, you can do this with other filesystems and other virtualization environments as well…) is quite useful for this sort of thing. If you don’t have this option or backups take a long time, consider making backups using the paperless-ngx document_exporter (you can see an example of this in Variant 2 below).

Database Upgrade / Postgres 15 to 16

This one was tricky. Basically there are 3 variants I can think of to make this upgrade. I tried two.

The first one is to dump the database, upgrade, and import the dump. To me this sounds like a pretty clean approach. However, I noticed that the import always reports that things already exist in the database, and I was unable to get the container running with an empty database directory. While everything seems to work with this variant, the messages I’m getting suggest it is not the clean approach. Maybe I missed a step here.

Variant 2 is to use the document_exporter and document_importer of paperless-ngx. This seems to be the clean approach I was looking for. Here the database is newly initialized and hence empty, and paperless-ngx takes care of everything on import.

The third variant, which I’m not going to cover here, is to use pg_upgrade. I saw some guides on how to do this, but Variant 2 looks much easier. I think one needs a special container which mounts both volumes (the old pg15 one and the new, empty pg16 one) and then runs pg_upgrade from pg15 to pg16. Maybe you could also do this from outside of the container if you have pg_upgrade installed and look at /var/lib/docker/volumes/.

I think Variant 1 and Variant 3 are useful when an upgrade is required but there are so many files / there is so much data that using the document_exporter and importer is just not going to work.

Variant 1 (using dumps)

I found a good explanation of the process here. This is how it looks for the paperless-ngx PostgreSQL container:

Create a dump

# sudo -Hu paperless docker exec paperless-db-1 pg_dumpall -U paperless > /home/jean/paperless.sql

(Without -it: a TTY can change the line endings of the redirected output.)

I used the pg_extract.sh part as explained in Thomas’ article.

./pg_extract.sh paperless.sql paperless >> paperless.sql 

Stop the stack

# sudo -Hu paperless docker compose down
[+] Running 6/6
 ✔ Container paperless-webserver-1  Removed                                      6.6s 
 ✔ Container paperless-tika-1       Removed                                      0.3s 
 ✔ Container paperless-broker-1     Removed                                      0.3s 
 ✔ Container paperless-gotenberg-1  Removed                                     10.2s 
 ✔ Container paperless-db-1         Removed                                      0.4s 
 ✔ Network paperless_default        Removed                                      0.3s 

Edit the docker-compose.yml

Change pgdata to pgdata16 (two times in that file)

      - pgdata16:/var/lib/postgresql/data

# the second occurrence is at the end of the file after volumes:
  pgdata16:

Change postgres:15 to postgres:16

    image: docker.io/library/postgres:16
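
Both edits could also be scripted. The following is just a sketch under the assumption that the compose file looks like the one shown in this post. The function name bump_postgres and the output file name are made up, and the result is written to a copy so it can be reviewed first.

```shell
#!/bin/bash
# Hypothetical helper: write an edited copy of a compose file in which the
# postgres image tag and the pgdata volume name are switched to a new major
# version. The original file is not modified.
bump_postgres() {
    src=$1 dst=$2 old=$3 new=$4
    # First expression: postgres:<old> at the end of an image: line.
    # Second expression: both occurrences of the pgdata volume name.
    sed -e "s|\(image:.*postgres\):${old}[[:space:]]*\$|\1:${new}|" \
        -e "s|pgdata:|pgdata${new}:|" \
        "$src" > "$dst"
}

# Example call:
# bump_postgres docker-compose.yml docker-compose.yml.new 15 16
# diff -u docker-compose.yml docker-compose.yml.new
```

After checking the diff, the copy can be moved over the original file.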

Start _only_ the database

# sudo -Hu paperless docker compose up -d db
[+] Running 1/3
 ⠼ Network paperless_default    Created                                          0.5s 
 ⠼ Volume "paperless_pgdata16"  Created                                          0.4s 
 ✔ Container paperless-db-1     Started                                          0.4s 

Import the database dump

# cat /home/jean/paperless.sql | docker exec -i paperless-db-1 psql -U paperless

On import there are errors saying that data already exists. This does not seem to be a problem: paperless-ngx runs fine and I could not detect any issues. However, because of this I thought using pg_upgrade might be better. I believe I missed a step here. Once I find the missing piece I’ll update this post.
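
To find out which kinds of objects the import complains about, the error output of the import command could be saved (for example by appending 2> import-errors.log to it) and summarised. This is a sketch only: the log file name and the function name summarize_exists_errors are assumptions, and it expects the English psql message format ERROR:  <type> "<name>" already exists.

```shell
#!/bin/bash
# Hypothetical helper: count "already exists" errors in a saved psql error log,
# grouped by object type (role, database, relation, ...).
summarize_exists_errors() {
    # Extract the object type in front of the quoted name, then count per type.
    sed -n 's/^.*ERROR: *\([a-z][a-z ]*\) ".*" already exists.*$/\1/p' "$1" |
    sort | uniq -c | sed 's/^ *//'
}

# Example call:
# summarize_exists_errors import-errors.log
```

If only a role and a database show up, those may simply be the ones the postgres image creates on first start. Many relation errors would point at something being imported twice.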

Variant 2 (export & import)

This is my preferred way to do this upgrade. Basically, I start the whole stack, make a backup (export), re-install paperless with the new versions and finally import everything. More or less.

~:/opt/paperless/paperless-ngx# sudo -Hu paperless docker compose exec -T webserver document_exporter ../export 
100%|██████████| 1004/1004 [00:07<00:00, 126.43it/s]

root@paperless:/opt/paperless/paperless-ngx# du -sh export 
986M    export
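
Since the next steps throw away the old database, it might be worth checking the export before going on. A minimal sketch, assuming the exporter has written a manifest.json into the export directory. The function name check_export is made up.

```shell
#!/bin/bash
# Hypothetical helper: refuse to continue unless the export directory contains
# a non-empty manifest.json, and report how many files were exported.
check_export() {
    export_dir=$1
    if [ ! -s "$export_dir/manifest.json" ]; then
        echo "no manifest.json in $export_dir" >&2
        return 1
    fi
    # Count all regular files below the export directory.
    file_count=$(find "$export_dir" -type f | wc -l | tr -d ' ')
    echo "$export_dir: $file_count files"
}

# Example call:
# check_export /opt/paperless/paperless-ngx/export
```

This only checks that the export exists and is non-empty. It does not validate its content.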

Edit docker-compose.yml

Attention: I also upgrade Gotenberg here. You may want to stick with Gotenberg 7.10 or whatever version you currently have.

~:/opt/paperless/paperless-ngx# grep image docker-compose.yml 
    image: docker.io/library/redis:7
    image: docker.io/library/postgres:16
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    image: docker.io/gotenberg/gotenberg:8
    image: ghcr.io/paperless-ngx/tika:latest

Pull

# sudo -Hu paperless docker compose pull
[+] Pulling 28/28
 ✔ broker Pulled                                               1.0s 
 ✔ db 14 layers [⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿]      0B/0B      Pulled     30.6s 
   ✔ e1caac4eb9d2 Already exists                               0.0s 
   ✔ f3f2d5b672fb Pull complete                                0.3s 
   ✔ 09bbd9ebb4bf Pull complete                                0.6s 
   ✔ 73af4999342c Pull complete                                0.5s 
   ✔ 410224db6f54 Pull complete                                1.0s 
   ✔ 00c8397ea547 Pull complete                                0.9s 
   ✔ 8de37ed8017f Pull complete                                1.0s 
   ✔ cafd8b2980da Pull complete                                1.3s 
   ✔ 446f52a8411a Pull complete                               16.6s 
   ✔ 3eb5d6c884ad Pull complete                                1.4s 
   ✔ 152d5aad8c0d Pull complete                                1.7s 
   ✔ 0859b7e55055 Pull complete                                1.8s 
   ✔ 8d723bf86e14 Pull complete                                2.1s 
   ✔ 5a108fed7862 Pull complete                                2.2s 
 ✔ gotenberg 9 layers [⣿⣿⣿⣿⣿⣿⣿⣿⣿]      0B/0B      Pulled     52.0s 
   ✔ 0a8a5d218416 Pull complete                                2.3s 
   ✔ 22b9a3162daf Pull complete                               15.9s 
   ✔ ff469c77f2a3 Pull complete                               12.9s 
   ✔ a994f2588bfe Pull complete                               28.4s 
   ✔ 034e8a1d21d3 Pull complete                               24.9s 
   ✔ 63331405a93d Pull complete                               17.4s 
   ✔ 2e13e3166189 Pull complete                               17.9s 
   ✔ 206d2ba0ffba Pull complete                               19.0s 
   ✔ 4f4fb700ef54 Pull complete                               19.4s 
 ✔ tika Pulled                                                 0.9s 
 ✔ webserver Pulled                                            1.0s 

Empty the Docker Volumes

. o ( There’s probably a better way to do this… )

cd /var/lib/docker/volumes
mv paperless_media paperless_media_backup
mkdir -p paperless_media/_data
mv paperless_pgdata paperless_pgdata_backup
mkdir -p paperless_pgdata/_data
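
The same move could be wrapped in a small function that refuses to run when a backup directory already exists, so an earlier backup is not clobbered by accident. This is a sketch. The function name set_aside_volume is made up, and it seems safer to run it only while the containers using the volume are stopped.

```shell
#!/bin/bash
# Hypothetical helper: rename a volume directory to <name>_backup and create
# an empty <name>/_data in its place.
set_aside_volume() {
    volumes_root=$1
    volume=$2
    if [ ! -d "$volumes_root/$volume" ]; then
        echo "$volume not found in $volumes_root" >&2
        return 1
    fi
    if [ -e "$volumes_root/${volume}_backup" ]; then
        echo "${volume}_backup already exists, not touching anything" >&2
        return 1
    fi
    # Only create the empty directory if the rename succeeded.
    mv "$volumes_root/$volume" "$volumes_root/${volume}_backup" &&
        mkdir -p "$volumes_root/$volume/_data"
}

# Example calls:
# set_aside_volume /var/lib/docker/volumes paperless_media
# set_aside_volume /var/lib/docker/volumes paperless_pgdata
```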

Start

~:/opt/paperless/paperless-ngx# sudo -Hu paperless docker compose up -d
[+] Running 5/5
 ✔ Container paperless-tika-1       Running                    0.0s 
 ✔ Container paperless-gotenberg-1  Running                    0.0s 
 ✔ Container paperless-db-1         Started                    0.5s 
 ✔ Container paperless-broker-1     Running                    0.0s 
 ✔ Container paperless-webserver-1  Started                    0.0s 

I tried to log in through the web panel, but it did not work (which is to be expected and good).

Import!

~:/opt/paperless/paperless-ngx# sudo -Hu paperless docker compose exec -T webserver document_importer ../export
Checking the manifest
Installed 1278 object(s) from 1 fixture(s)
Copy files into paperless...
100%|██████████| 1004/1004 [00:16<00:00, 62.37it/s]
Updating search index...
100%|██████████| 1004/1004 [00:24<00:00, 41.64it/s]

Done

Now test your login and check with docker ps that everything is up and running.
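
The docker ps check could be scripted roughly like this. It is a sketch that reads lines of the form "<name> <status>" from standard input. The function name all_up is made up, and the container names are the ones from the outputs in this post.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if every container name given as an
# argument appears on stdin with a status that starts with "Up".
all_up() {
    status_lines=$(cat)
    for name in "$@"; do
        if ! printf '%s\n' "$status_lines" | grep -q "^$name Up"; then
            echo "$name is not up" >&2
            return 1
        fi
    done
}

# Example call:
# docker ps --format '{{.Names}} {{.Status}}' | all_up \
#     paperless-webserver-1 paperless-db-1 paperless-broker-1 \
#     paperless-gotenberg-1 paperless-tika-1
```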
