Audiobookshelf with S3 using rclone

Audiobookshelf is a self-hosted audiobook and podcast server, which can also be used for ebooks. I previously stored my ebooks as PDFs in paperless-ngx, but that turned out to be somewhat unwieldy. I was searching for an alternative and found Audiobookshelf, which fulfills this need perfectly.

The example docker-compose.yml for Audiobookshelf looks like this:

services:
  audiobookshelf:
    image: ghcr.io/advplyr/audiobookshelf:latest
    ports:
      - 13378:80
    volumes:
      - </path/to/audiobooks>:/audiobooks
      - </path/to/podcasts>:/podcasts
      - </path/to/config>:/config
      - </path/to/metadata>:/metadata
    environment:
      - TZ=America/Toronto

I assume you are already familiar with the basics of docker compose down, docker compose pull, and docker compose up -d.
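
For completeness, the update cycle I mean looks like this:

docker compose down
docker compose pull
docker compose up -d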

The Rclone Docker Volume Plugin

As I described in my article on https://blog.jeanbruenn.info/2024/04/11/paperless-ngx-with-s3-using-rclone/, using the Docker volume plugin is quite straightforward.

services:
  audiobookshelf:
    image: ghcr.io/advplyr/audiobookshelf:latest
    ports:
      - 13378:80
    volumes:
      - s3:/audiobooks
      - ./podcasts:/podcasts
      - ./config:/config
      - ./metadata:/metadata
    environment:
      - TZ=Europe/Berlin

volumes:
  s3:
    driver: rclone
    driver_opts:
      remote: "minio:audiobookshelf-jean"
      allow_other: "true"
      vfs_cache_mode: "full"

This works if you have installed the Rclone Docker plugin. For example:

apt-get -y install fuse3
mkdir -p /var/lib/docker-plugins/rclone/config
mkdir -p /var/lib/docker-plugins/rclone/cache
docker plugin install rclone/docker-volume-rclone:amd64 args="-v" --alias rclone --grant-all-permissions

If you are running this on a Raspberry Pi 5, you should replace amd64 with arm64, for example:
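
docker plugin install rclone/docker-volume-rclone:arm64 args="-v" --alias rclone --grant-all-permissions

You can also use sub-folders within your volume like this: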

services:
  audiobookshelf:
    image: ghcr.io/advplyr/audiobookshelf:latest
    ports:
      - 13378:80
    volumes:
      - type: volume
        source: s3
        target: /audiobooks
        volume:
          subpath: audiobooks
      - type: volume
        source: s3
        target: /podcasts
        volume:
          subpath: podcasts
      - type: volume
        source: s3
        target: /metadata
        volume:
          subpath: metadata
      - ./config:/config
    environment:
      - TZ=Europe/Berlin

volumes:
  s3:
    driver: rclone
    driver_opts:
      remote: "minio-audiobookshelf:audiobookshelf-jean"
      allow_other: "true"
      vfs_cache_mode: "full"

This approach maps your audiobooks, podcasts, and metadata to sub-paths within your S3 bucket. My /var/lib/docker-plugins/rclone/config/rclone.conf contains the following snippet for Audiobookshelf:

[minio-audiobookshelf]
type = s3
region = us-east-1
endpoint = http://127.0.0.1:9000
provider = Minio
env_auth = false
access_key_id = audiobookshelf-jean
secret_access_key = 
acl = bucket-owner-full-control
location_constraint =
server_side_encryption =
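
With this config in place you can also check access to the bucket from the host. This is just a quick sketch and assumes rclone is installed on the host itself (the plugin ships its own rclone binary):

# list the buckets this access key can see
rclone --config /var/lib/docker-plugins/rclone/config/rclone.conf lsd minio-audiobookshelf:
# create the bucket if it does not exist yet
rclone --config /var/lib/docker-plugins/rclone/config/rclone.conf mkdir minio-audiobookshelf:audiobookshelf-jean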

The following docker-compose.yml settings have also given me good results; see the end of this post for more details on each of them.

volumes:
  s3:
    driver: rclone
    driver_opts:
      remote: "minio-audiobookshelf:audiobookshelf-jean"
      allow_other: "true"
      vfs_cache_mode: full
      vfs_cache_max_size: 20G
      vfs_cache_max_age: 72h
      vfs_read_ahead: 8M
      # if you are running a vfs cache over local, s3 or swift backends
      # then using this flag is recommended.
      vfs_fast_fingerprint: 'true'
      # for high performance object stores (eg AWS S3) a reasonable
      # place to start might be --vfs-read-chunk-streams 16 and 
      # --vfs-read-chunk-size 4M.
      vfs_read_chunk_streams: 8
      vfs_read_chunk_size: 4M
      vfs_read_chunk_size_limit: 2G
      # In particular S3 and Swift benefit hugely from the --no-modtime
      # flag as each read of the modification time takes a transaction.
      no_modtime: 'true'
      # This flag allows you to manually set the statistics about the
      # filing system. It can be useful when those statistics cannot be
      # read correctly automatically.
      vfs_disk_space_total_size: 1T

I also reduced the buffer-size used by Rclone. This cannot be set in the docker-compose.yml and must be configured at the plugin level:

docker plugin disable rclone
docker plugin set rclone args="-v --buffer-size 1M"
docker plugin enable rclone
docker plugin inspect rclone

Alternatively, use a host-bind mount

If you prefer not to use the Docker plugin, you can also bind-mount the S3 directories from the host into the container. A docker-compose.yml for this approach might look like this:

services:
  audiobookshelf:
    image: ghcr.io/advplyr/audiobookshelf:latest
    ports:
      - 13378:3333
    volumes:
      - type: bind
        source: /srv/audiobookshelf/audiobooks
        target: /audiobooks
      - type: bind
        source: /srv/audiobookshelf/podcasts
        target: /podcasts
      - type: bind
        source: /srv/audiobookshelf/metadata
        target: /metadata
      - ./config:/config
    environment:
      - TZ=Europe/Berlin
      - PORT=3333

This assumes your S3 bucket is already mounted using Rclone to /srv/audiobookshelf. One way to achieve this is by using a systemd service. For example, I use a template like this:

root@mayu:~# cat /etc/systemd/system/rclone@.service 
[Unit]
Description=rclone - s3 mount for minio %i data
Documentation=https://rclone.org/ man:rclone(1)
AssertPathExists=/etc/rclone/rclone-%i.conf
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
Environment=RCLONE_CONFIG=/etc/rclone/rclone-%i.conf
ExecStart=/usr/bin/rclone \
    mount minio-%i:%i-jean /srv/%i/ \
    --use-mmap \
    --allow-other \
    --vfs-cache-mode full \
    --cache-dir /var/cache/rclone/%i \
    --temp-dir /var/tmp/rclone/%i \
    --log-level INFO \
    --log-file /var/log/rclone/rclone-%i.log \
    --umask 002
ExecStop=/bin/fusermount -uz /srv/%i
Restart=on-failure

[Install]
WantedBy=default.target

Note the minio-%i:%i-jean part. The %i is replaced by the instance name used when starting the service. So if I have a bucket called audiobookshelf-jean, I can enable and mount it by simply running:

systemctl enable rclone@audiobookshelf
systemctl start rclone@audiobookshelf
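
Before starting the unit, note that the template asserts the existence of /etc/rclone/rclone-audiobookshelf.conf. It can contain the same remote as the plugin config shown earlier, for example:

[minio-audiobookshelf]
type = s3
provider = Minio
region = us-east-1
endpoint = http://127.0.0.1:9000
env_auth = false
access_key_id = audiobookshelf-jean
secret_access_key = 
acl = bucket-owner-full-control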

I can also create a per-bucket systemd override for more specific options:

# /etc/systemd/system/rclone@audiobookshelf.service.d/override.conf
[Service]
ExecStart=
ExecStart=/usr/bin/rclone \
  mount minio-%i:%i-jean /srv/%i \
      --use-mmap \
      --allow-other \
      --vfs-fast-fingerprint \
      --vfs-cache-mode full \
      --vfs-cache-max-age 72h \
      --vfs-cache-max-size 20G \
      --vfs-read-ahead 8M \
      --vfs-read-chunk-streams 8 \
      --vfs-read-chunk-size 4M \
      --vfs-read-chunk-size-limit 2G \
      --no-modtime \
      --vfs-disk-space-total-size 1T \
      --cache-dir /var/cache/rclone/%i \
      --temp-dir /var/tmp/rclone/%i \
      --log-level INFO \
      --log-file /var/log/rclone/rclone-%i.log \
      --umask 002
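
After adding or changing the override, reload systemd and restart the mount:

systemctl daemon-reload
systemctl restart rclone@audiobookshelf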

Which one to use

To be honest, there’s no single “best” method, as it depends on your specific use case. However, I personally prefer the systemd-based approach and would recommend it for most self-hosting setups.

Here’s a breakdown of the pros and cons:

  • Rclone Docker Plugin:
    • Pros: The mount is tightly coupled with the container’s lifecycle. It’s clean and simple to set up, and the mount only exists when the container is running.
    • Cons: Changing global plugin settings (like buffer-size) affects all containers using the plugin, which can be inconvenient. Restarting the Docker daemon will unmount all volumes and might require a manual intervention.
  • Systemd Host Mount:
    • Pros: Offers much more flexibility and control. You can manage each mount independently, change its settings without affecting other services, and access the mounted directory from any application on the host, not just from within a specific container. The mount persists even if Docker is restarted.
    • Cons: Requires a little more initial setup (systemd service file, etc.) and runs separately from the container’s lifecycle. You need to manage the mount’s state manually.

In my experience, the flexibility of the systemd approach outweighs the minor convenience of the plugin.

Things you may want to take a look at

Running Docker as non-root

You can add user: xxx:yyy to the compose file, where xxx is the UID and yyy the GID of a user on your system, so that the Audiobookshelf container runs as that user instead of root, e.g.:

services:
  audiobookshelf:
    image: ghcr.io/advplyr/audiobookshelf:latest
    ports:
      - 13378:3333
    user: 134:121
    volumes:
      - type: bind
        source: /srv/audiobookshelf/audiobooks
        target: /audiobooks
      - type: bind
        source: /srv/audiobookshelf/podcasts
        target: /podcasts
      - type: bind
        source: /srv/audiobookshelf/metadata
        target: /metadata
      - ./config:/config
    environment:
      - TZ=Europe/Berlin
      - PORT=3333

Rclone offers --uid and --gid switches which make the mounted S3 storage appear to belong to a specific UID and GID. If you mount manually or via the systemd example above, you just add

--uid 134
--gid 121

if the user's ID is 134 and the group's ID is 121. For the Docker plugin, you can add

uid: 134
gid: 121

to the driver_opts in your docker-compose.yml.
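
Put together, such a volume definition might look like this (a sketch based on the driver_opts shown earlier):

volumes:
  s3:
    driver: rclone
    driver_opts:
      remote: "minio-audiobookshelf:audiobookshelf-jean"
      allow_other: "true"
      vfs_cache_mode: "full"
      uid: "134"
      gid: "121"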

The rclone settings I’m using

I chose some of these settings because the documentation states they are recommended for my scenario.

vfs_fast_fingerprint: 'true'
The Rclone documentation states this flag is recommended when running a VFS cache over a local, S3, or Swift backend. Since I am using a VFS cache over an S3 backend, this flag is perfect.

vfs_read_chunk_streams: 8 and vfs_read_chunk_size: 4M
The documentation states that for high-performance object stores (e.g., AWS S3), a good starting point is --vfs-read-chunk-streams 16 and --vfs-read-chunk-size 4M. My MinIO setup is a self-hosted, high-performance object store located next to the Audiobookshelf machine, which allows for high theoretical throughput. I still need to test this to confirm the performance.

no_modtime: 'true'
S3 and Swift benefit hugely from the --no-modtime flag, as each read of the modification time requires a separate transaction. Since I am using S3, this is an ideal setting.

vfs_disk_space_total_size: 1T
This flag allows you to manually set the total disk space reported by the filesystem, which is useful when that information cannot be read correctly automatically. The mount will work even if you don't use this flag.

--use-mmap
This flag tells Rclone to use memory mapping (mmap) on Unix-based systems. Memory allocated this way does not go on the Go heap and can be returned to the OS immediately when no longer needed. This is beneficial for systems with low memory, so I definitely want this.

Cache and Logging Settings

You must be careful if you have multiple Rclone mounts using a VFS cache: you must not use the same directory for the cache. The %i in my systemd example is replaced with the bucket name, so each S3 mount has its own cache directory. I also want to separate temporary files and logs.

--cache-dir /var/cache/rclone/%i
--temp-dir /var/tmp/rclone/%i
--log-file /var/log/rclone/rclone-%i.log
--umask 002
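
Depending on your distribution, some of these directories (the log directory in particular) may not exist yet, so it can make sense to create them up front, e.g.:

mkdir -p /var/cache/rclone/audiobookshelf /var/tmp/rclone/audiobookshelf /var/log/rclone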

Remaining VFS Settings

I decided to use a vfs_cache_mode of "full". I reduced the maximum cache age to 72 hours and the maximum size to 20G. I thought that a vfs-read-ahead of double the vfs-read-chunk-size (i.e., 8 MB since the read chunk size is 4 MB) might be a smart choice to ensure the next data block is already available.

A few final insights from my highly caffeinated AI assistant

After a very long talk and discussion with a large language model named Gemini, in which I had to correct it a few times (it’s learning!), we came to some critical conclusions about my setup.

It concluded that my vfs-cache-max-size was not nearly big enough (I had 256M) for my system architecture and that increasing it to 20 GB was absolutely critical for performance. Yes, I explained my entire architecture, including my slow disks and network limits, so that it could challenge my stack. And boy, did it challenge it!

My digital co-pilot insists that I should enable the dynamic vfs-read-chunk-size-limit by setting it to -1 or 2G (I had 0) to efficiently handle large outlier files. It also advised me to reduce vfs-read-chunk-streams from 16 to 8 to better match the limited speed of my disks on the S3 side.

Tailored Rclone Settings for Different Services

The key to a high-performance setup is to use different Rclone settings for each service, optimized for its specific data type. Here’s what I’ve found works best for my multi-mount setup.

1. For Audiobookshelf (Streaming Media)

Since my average file size is large (144 MB) and my files are streamed, I'll focus on efficiency for large, sequential reads. The key here is sequential access. When I stream an audiobook, I'm reading the file from start to finish. This makes the --vfs-read-ahead setting incredibly important. A high read-ahead value ensures that Rclone is always one step ahead, preloading the next part of the file before my player even needs it. This prevents buffering and ensures a smooth listening experience. The sketch after the list below shows how this maps onto the systemd override.

  • --vfs-read-chunk-size 64M: I’m loading large files in a few efficient chunks.
  • --vfs-read-ahead 64M: I’ll prefetch a large chunk of the next part of the audiobook, ensuring smooth playback without buffering.
  • --vfs-read-chunk-size-limit -1 or 2G: This is crucial for handling those massive audiobook files without unnecessary overhead.
  • --vfs-read-chunk-streams 8: I’ll use this to fully utilize my disk speed by loading multiple chunks in parallel.
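
Applied to the systemd override shown earlier, the relevant flags for the Audiobookshelf mount would then change to something like this (excerpt only, the rest of the ExecStart stays as above):

      --vfs-read-chunk-size 64M \
      --vfs-read-ahead 64M \
      --vfs-read-chunk-size-limit 2G \
      --vfs-read-chunk-streams 8 \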

2. For Paperless-ngx (Small Documents)

With an average document size of just 279 KB, my goal is to load each file in a single, fast request. PDF readers and document viewers are more about random access: I might jump to a specific page or open a different document entirely, so pre-loading the next document in the folder is often useless. Therefore, for Paperless-ngx, it's best to set --vfs-read-ahead to 0 to save system resources. I don't need to read ahead; I just need a single, fast read for the file that's requested right now. The sketch after the list below shows how these values translate into driver_opts.

  • --vfs-read-chunk-size 512K: This is perfectly sized to grab most documents in a single go, minimizing wasted bandwidth.
  • --vfs-read-ahead 0: I won’t prefetch here. Documents are rarely read sequentially.
  • --vfs-read-chunk-size-limit 0: No need for dynamic sizing with such consistently small files.
  • --vfs-read-chunk-streams 8: I’ll still use this to fetch multiple small documents simultaneously, like when a folder is being indexed.
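
For the Docker volume plugin used in the Paperless-ngx article, these values would translate into driver_opts roughly like this (the remote and bucket names here are only placeholders):

volumes:
  s3:
    driver: rclone
    driver_opts:
      # placeholder remote and bucket; use your own
      remote: "minio-paperless:paperless-jean"
      allow_other: "true"
      vfs_cache_mode: "full"
      vfs_read_chunk_size: 512K
      vfs_read_ahead: "0"
      vfs_read_chunk_size_limit: "0"
      vfs_read_chunk_streams: 8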
