Rclone’s VFS Cache: A Deep Dive into Optimizing for a Local MinIO S3 Backend

I realized a critical detail about my setup: the standard vfs-cache strategy is only a good starting point if the cache’s performance is superior to that of the S3 backend. With this theory in mind, it was time to put it to the test.

In my case, the cache is a single rotating disk, while my MinIO S3 backend is powered by a multi-disk array. This made my cache a potential bottleneck: possibly slower than a direct request to my MinIO cluster, which can leverage parallel I/O across its multiple drives.

This led me to re-evaluate my approach and configure the mounts without a read cache, instead tuning the other settings to let MinIO handle the load directly.

Optimized Settings for Read Performance (Without a Read Cache)

I chose this approach to rely on MinIO’s parallel disk access, which is crucial for streaming large files. The settings below represent the best balance I found between efficiency and performance.

Use Case 1: Large File Streaming (e.g., Audiobooks)

Settings and rationale:

  • --vfs-cache-mode writes: Only cache write operations. Read operations stream directly from MinIO, utilizing its multi-disk read speeds.
  • --vfs-read-chunk-size 64M: Loading large files in efficient chunks that leverage my fast internal network.
  • --vfs-read-ahead 64M: Vital for smooth streaming by prefetching the next part of the file.
  • --vfs-read-chunk-size-limit 2G: Allows Rclone to dynamically increase the chunk size to handle massive files without unnecessary overhead.
  • --vfs-read-chunk-streams 8: Maintains 8 parallel streams to fully utilize the disk I/O of my MinIO setup.
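
For context, here is roughly how these flags come together in a single mount command. This is a sketch, not my exact invocation: the remote name (minio:audiobooks), the mount point, and the --allow-other/--daemon extras are placeholders and conveniences, not part of the tuning discussed above.

    # Sketch: streaming-optimized mount with no read cache (placeholder remote and paths)
    rclone mount minio:audiobooks /mnt/audiobooks \
      --vfs-cache-mode writes \
      --vfs-read-chunk-size 64M \
      --vfs-read-ahead 64M \
      --vfs-read-chunk-size-limit 2G \
      --vfs-read-chunk-streams 8 \
      --allow-other \
      --daemon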

Use Case 2: Small Document Access (Paperless-ngx)

With consistently small documents for Paperless-ngx, the goal is to fetch each file as quickly as possible without relying on a single-disk cache that lacks parallelization.

SettingValueRationale
--vfs-cache-modewritesOnly caching writes. Reads go directly to MinIO.
--vfs-read-chunk-size512KPerfectly sized for documents, ensuring most files are retrieved in a single request.
--vfs-read-ahead0No prefetching is needed for random document access, saving system resources.
--vfs-read-chunk-size-limit0Disables dynamic sizing as files are consistently small.
--vfs-read-chunk-streams8Keeps 8 parallel streams to handle multiple small document requests efficiently (e.g., during indexing).
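
The corresponding mount for the document workload looks very similar; only the chunk-related values change. Again a sketch with placeholder remote name and mount point:

    # Sketch: document mount for Paperless-ngx, no read cache (placeholder remote and paths)
    rclone mount minio:paperless /mnt/paperless \
      --vfs-cache-mode writes \
      --vfs-read-chunk-size 512K \
      --vfs-read-ahead 0 \
      --vfs-read-chunk-size-limit 0 \
      --vfs-read-chunk-streams 8 \
      --allow-other \
      --daemon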

Test Methodology and Results

To compare the performance of both caching strategies, I ran a series of controlled tests using the dd command, which is perfect for measuring sequential read speed.

  • Test Setup: A test file of 864 MB from my Audiobookshelf bucket. The Rclone mount was unmounted and the cache was cleared before each test. The same dd command was used for all runs: dd if=/path/to/my/testfile of=/dev/null bs=1M.
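
For reproducibility, one run of this procedure can be scripted roughly as below. The paths and remote name are placeholders, and the exact way the caches are cleared (wiping Rclone’s default VFS cache directory and dropping the kernel page cache) is an assumption about my environment rather than a verbatim transcript of my commands.

    #!/usr/bin/env bash
    # Sketch of one benchmark run (placeholder paths and remote name)
    set -euo pipefail

    MOUNT=/mnt/audiobooks
    TESTFILE="$MOUNT/path/to/my/testfile"

    # 1. Unmount and clear caches between runs
    fusermount -u "$MOUNT" || true
    rm -rf ~/.cache/rclone/vfs ~/.cache/rclone/vfsMeta    # Rclone's default VFS cache location
    sync && echo 3 | sudo tee /proc/sys/vm/drop_caches    # kernel page cache

    # 2. Remount with the settings under test
    #    (swap --vfs-cache-mode writes for --vfs-cache-mode full in Test 2)
    rclone mount minio:audiobooks "$MOUNT" --vfs-cache-mode writes --daemon
    sleep 2

    # 3. Sequential read benchmark; dd reports the throughput when it finishes
    dd if="$TESTFILE" of=/dev/null bs=1M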

Test 1: VFS Cache Off (--vfs-cache-mode writes)

This test measured the baseline performance of reading directly from my multi-disk MinIO setup.

  • Result: The average read speed was 56 MB/s.
  • Analysis: This speed confirms that the bottleneck is indeed the disk speed and not the network.
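
If you want to rule out the network explicitly when reproducing this, a quick iperf3 run between the mount host and the MinIO host is enough: a gigabit LAN typically delivers around 110–120 MB/s of raw TCP throughput, comfortably above the 56 MB/s observed here. The hostname below is a placeholder.

    # On the MinIO host: start an iperf3 server
    iperf3 -s

    # On the machine running the rclone mount: measure raw TCP throughput
    iperf3 -c minio-host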

Test 2: VFS Cache On (--vfs-cache-mode full)

This test had two parts: the initial read (MinIO to cache) and the subsequent cached read.

  • Result (First Run – Reading from MinIO, writing to cache): The average read speed was 30.76 MB/s.
  • Analysis: As expected, this speed was significantly slower than the “cache off” run. The single-disk cache became a bottleneck, limiting the speed as data had to be written to disk while it was being read from MinIO.
  • Result (Second Run – Reading from local cache): The average read speed was 82.52 MB/s.
  • Analysis: This speed was close to the raw read speed of my single cache disk.
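
For completeness: when reproducing this second run, it is worth confirming that dd is really being served by Rclone’s VFS cache on disk and not by the kernel page cache in RAM. A rough check, assuming the default cache location (unless --cache-dir is set):

    # The test file should show up under Rclone's default VFS cache directory
    du -sh ~/.cache/rclone/vfs/*

    # Drop the kernel page cache before re-running dd, so the measurement reflects the cache disk
    sync && echo 3 | sudo tee /proc/sys/vm/drop_caches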

The Final Verdict: Why a Cache Was the Right Choice All Along

After extensive testing and a deep dive into my system’s architecture, I have come to a definitive conclusion: the local cache, despite my initial skepticism, is the most performant solution for my specific setup.

My initial assumption was that MinIO’s multi-disk setup would be inherently faster than a single-disk cache. The real-world tests, however, told a different story. The “no-cache” approach, which streamed data directly from my MinIO cluster, only achieved a speed of around 56 MB/s. In contrast, the second read from my local HDD cache consistently hit 82.52 MB/s.

This result shed light on my MinIO instance’s internal architecture. While my MinIO backend has four disks, my tests showed a highly imbalanced load. It appears that my MinIO configuration reads from only a limited number of disks (two) to serve a single request, while the other drives handle background tasks. For my read-heavy use case, this turns the multi-disk array into a bottleneck despite the total number of disks.
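
For anyone who wants to see this kind of imbalance for themselves, watching per-disk utilization while a large file streams from the mount makes it obvious which drives are actually serving the request. iostat comes from the sysstat package; the device names below are placeholders for the MinIO data disks.

    # Watch per-device throughput and utilization while dd streams from the mount
    iostat -x 1 /dev/sda /dev/sdb /dev/sdc /dev/sdd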

The Path Forward: Scaling for Performance

This conclusion led me to a final thought experiment. My tests revealed that my MinIO setup delivers a combined read performance of 56 MB/s. Given that each of my single disks can achieve a raw read speed of around 80 MB/s, and my MinIO setup is reading from only two of them (simplified), the theoretical maximum speed should have been 160 MB/s.

This means my MinIO configuration is operating at an efficiency of only 35% (56 MB/s observed / 160 MB/s theoretical).

A dedicated SSD as a cache provides the high speed and high IOPS needed to outperform the complexities and bottlenecks of a distributed storage system, without requiring a change in the core architecture.
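
In practice, that means pointing the cache at the SSD and switching the mount to full cache mode. Something along these lines, where the cache path, size limit, and retention are placeholder values to be sized to the actual SSD:

    # Sketch: full read/write cache on a dedicated SSD (placeholder paths and limits)
    rclone mount minio:audiobooks /mnt/audiobooks \
      --vfs-cache-mode full \
      --cache-dir /mnt/ssd-cache/rclone \
      --vfs-cache-max-size 100G \
      --vfs-cache-max-age 168h \
      --vfs-read-chunk-size 64M \
      --vfs-read-ahead 64M \
      --daemon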

Important Note: A Homelab Scenario

These test results and conclusions are derived from a specific home lab environment and are not directly comparable to a professional cloud or data center setup. However, the methodology of challenging assumptions with real-world tests and data-driven analysis is universally applicable. The key takeaway for any similar setup is to measure the performance of each component of your stack to identify bottlenecks, rather than relying on theoretical assumptions.

Sources / See Also

  1. Rclone Documentation. Mount Options and Usage (VFS Cache Modes). https://rclone.org/commands/rclone_mount/
  2. Rclone Documentation. Understanding VFS Fast Fingerprint and Modtime. https://rclone.org/docs/#vfs-fast-fingerprint
  3. MinIO Documentation. MinIO Distributed Deployment and Erasure Coding. https://min.io/docs/minio/linux/deployment/distributed-deployment/
  4. Linux Manpage: dd. The utility used for sequential read benchmarking. https://man7.org/linux/man-pages/man1/dd.1.html
