Rclone’s VFS Cache: A Deep Dive into Optimizing for a Local MinIO S3 Backend

I realized a critical detail about my setup: the standard vfs-cache strategy is a good starting point only if the cache’s performance is superior to that of the S3 backend. With this in mind, it was time to put the theory to the test.

Continue reading Rclone’s VFS Cache: A Deep Dive into Optimizing for a Local MinIO S3 Backend

ZFS Disaster Recovery: Rebuilding and Mirroring a Pool After Top-Level Vdev Error

I recently learned a hard lesson about ZFS Vdev architecture after attempting to convert a single-disk pool into a mirror. By mistake, I added the new disk as a top-level Vdev, rather than attaching it as a mirror. As zpool remove and zpool detach both failed on the top-level Vdev, I was forced to destroy the pool and restore the data from a snapshot.

This process outlines how I recovered data and subsequently created a proper mirror configuration.

Continue reading ZFS Disaster Recovery: Rebuilding and Mirroring a Pool After Top-Level Vdev Error

Nextcloud S3 Workaround: Multi-User Rclone Mounts with Systemd Templates

I experienced trouble with Nextcloud’s built-in S3 connector, as it would corrupt photos during auto-upload from the Android client. Since dedicated tools such as S3FS or Goofys were also not ideal, I settled on a reliable alternative: using rclone to manage the mounts. This strategy allows me to decouple the unreliable Nextcloud S3 implementation from the underlying object storage.

Continue reading Nextcloud S3 Workaround: Multi-User Rclone Mounts with Systemd Templates

StrongSwan VPN: Mastering IKEv2 EAP-TLS and ChromeOS Client Integration

StrongSwan is a complete IPsec solution for securing communication between servers and clients via mutual certificate-based authentication and encryption. This guide documents the necessary implementation steps for the highly secure IKEv2 EAP-TLS protocol, focusing on critical workarounds for seamless ChromeOS integration.

Continue reading StrongSwan VPN: Mastering IKEv2 EAP-TLS and ChromeOS Client Integration

Suricata Performance: Resolving eBPF Bypass Failure via Manual Kernel Filter Compilation

Enabling eBPF (Extended Berkeley Packet Filter) bypass is the ultimate step in Suricata performance tuning. It allows the kernel to filter known-safe traffic (e.g., TLS data) before the packets reach the resource-intensive Userspace engine. However, this functionality often fails to work out-of-the-box.

I found a bug report confirming that the pre-compiled .bpf files shipped in my distribution were incompatible with the current libbpf library (version > 1.0). Without a successful .bpf load, the kernel bypass mechanism is completely inactive.
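
Before anything else, it helps to confirm which libbpf version the system actually ships. A quick package query on Debian (the query below is an assumption for my environment; package names may differ on other distributions):

# List the installed libbpf packages and their versions
dpkg -l 'libbpf*'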

Part I: Diagnosis of the Bypass Failure

To confirm the failure, I checked Suricata’s internal statistics via suricatasc. The initial output confirmed that the eBPF bypass was not occurring, despite the configuration being set in suricata.yaml.
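
The statistics below were pulled over the Suricata unix socket. A minimal sketch of the queries, assuming unix-command is enabled in suricata.yaml and the default socket path is used:

suricatasc                           # interactive session; type the commands at the >>> prompt
suricatasc -c "ebpf-bypassed-stat"   # or run a single command non-interactively
suricatasc -c "iface-stat ens3"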

Initial Failure Metrics

The metrics show zero packets being bypassed (ipv4_success: 0):

>>> ebpf-bypassed-stat
Success:
{
    "ens3": {
        "ipv4_fail": 0,
        "ipv4_maps_count": 0,
        "ipv4_success": 0,
        "ipv6_fail": 0,
        "ipv6_maps_count": 0,
        "ipv6_success": 0
    },
    "ens5": {
        "ipv4_fail": 0,
        "ipv4_maps_count": 78,
        "ipv4_success": 0,
        "ipv6_fail": 0,
        "ipv6_maps_count": 0,
        "ipv6_success": 0
    }
}

The per-interface statistics confirmed the failure, but also revealed an underlying issue with checksums that requires further attention:

>>> iface-stat ens3
Success:
{
    "bypassed": 0,
    "drop": 0,
    "invalid-checksums": 11510,
    "pkts": 21704175
}

The attempt to load the default .bpf file resulted in a fatal error:

 Error: ebpf: Unable to load eBPF objects in '/usr/lib/suricata/ebpf/bypass_filter.bpf': Operation not supported

Part II: Manual Kernel Filter Compilation

The solution is to manually compile the .bpf files from the Suricata source code, linking them against the host system’s current libbpf library. This resolves the version incompatibility.

The Compilation Process

I grab the Suricata source code and configure the build process specifically to include eBPF support:

# Install dependencies as explained in Suricata installation documentation
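# On Debian, the eBPF-specific build dependencies on top of the usual toolchain
# are roughly the following (package names are my assumption, check the install docs):
apt-get install clang llvm libbpf-dev
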
./scripts/bundle.sh
./autogen.sh
./configure --enable-ebpf-build

# Change into the eBPF directory and compile the kernel filters
cd ebpf
make

Deployment

The newly compiled files are copied to the correct path, replacing the broken distribution files.

cp *.bpf /usr/lib/suricata/ebpf/
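
The new filters are only picked up after a restart; assuming Suricata runs as a systemd service, a minimal sketch:

systemctl restart suricata
journalctl -u suricata | grep -i ebpf   # verify that the filters now load without errors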

Once the corrected filter is loaded, the logs show success:

 Info: ebpf: Successfully loaded eBPF file '/usr/lib/suricata/ebpf/bypass_filter.bpf' on 'ens3'
 Info: ebpf: Successfully loaded eBPF file '/usr/lib/suricata/ebpf/bypass_filter.bpf' on 'ens5'

Part III: Verification

The successful loading of the eBPF filter confirms that Suricata is now utilizing the kernel to filter traffic before passing it to the Userspace engine, resulting in significant CPU savings.

Final Success Metrics (Post-Compilation)

The metrics now show thousands of successful bypasses, validating the fix:

>>> ebpf-bypassed-stat
Success:
{
    "ens3": {
        "ipv4_fail": 0,
        "ipv4_maps_count": 32,
        "ipv4_success": 32292,
        "ipv6_fail": 0,
        "ipv6_maps_count": 0,
        "ipv6_success": 0
    },
    "ens5": {
        "ipv4_fail": 0,
        "ipv4_maps_count": 78,
        "ipv4_success": 32290,
        "ipv6_fail": 0,
        "ipv6_maps_count": 0,
        "ipv6_success": 0
    }
}

The interface statistics now display the successfully bypassed packets:

>>> iface-stat ens5
Success:
{
    "bypassed": 807883,
    "drop": 0,
    "invalid-checksums": 0,
    "pkts": 316991330
}

Note: The original log showed a high count of invalid-checksums. This is a separate, critical issue (often related to offloading) that needs to be addressed, but the eBPF bypass functionality itself is now working.
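
The checksum problem itself is out of scope for this fix, but a typical first step is to inspect and, if necessary, disable NIC offloading on the capture interfaces. A hedged sketch; the interface name and feature list are assumptions for my setup:

ethtool -k ens3                                                 # show the current offload settings
ethtool -K ens3 rx off tx off gro off gso off tso off lro off   # disable offloads for IDS/IPS capture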

Sources / See Also

  1. Suricata Documentation. Working with eBPF and XDP. https://docs.suricata.io/en/latest/install/ebpf-xdp.html
  2. Suricata Documentation. Suricata 7 Changelog (Note new policy behavior). https://suricata.io/changelog/
  3. Suricata Documentation. FAQ: Traffic gets blocked after upgrading to Suricata 7. https://suricata-update.readthedocs.io/en/latest/faq.html#my-traffic-gets-blocked-after-upgrading-to-suricata-7
  4. Libvirt Documentation. VirtIO Device Configuration (Driver Offload Parameters). https://libvirt.org/formatdomain.html#elementsNICS
  5. GitHub Repository libbpf. eBPF library source and version compatibility issues. https://github.com/libbpf/libbpf
  6. Linux Networking. Understanding the eBPF framework and its application in networking. https://www.kernel.org/doc/html/latest/networking/filter.html

Suricata IPS: Fixing Legitimate Traffic Drops by Disabling drop-invalid

I encountered a peculiar issue where my WordPress instance was unable to reach wordpress.org, and DokuWiki could not access its plugin repository. All standard network checks (wget, curl, DNS) worked fine, and no drops were registered by the standard firewall rules.

However, logging revealed a problem deep within the Intrusion Prevention System (IPS) layer.

The Diagnostic: Stream Errors

I noticed an unusually high number of dropped packets related to stream errors in the stats.log:

ips.drop_reason.flow_drop                     | Total | 837
ips.drop_reason.rules                         | Total | 3398
ips.drop_reason.stream_error                  | Total | 19347

This confirmed that Suricata’s TCP Stream Engine was classifying legitimate traffic as invalid, causing the connection to stall before the application layer could proceed. The volume of stream_error drops was alarmingly high.

Further investigation into Suricata’s internal statistics revealed details about the nature of the errors:

stream.fin_but_no_session                     | Total | 12508
stream.rst_but_no_session                     | Total | 2577
stream.pkt_spurious_retransmission            | Total | 14735

These specific counters (FINs/RSTs without an active session, spurious retransmissions) point to common issues in asymmetric routing or session tracking in complex bridged/virtualized environments.
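
Both sets of counters can be pulled from the periodic stats.log or, on a running instance, over the unix socket. A minimal sketch, assuming the default log and socket paths:

grep -E 'stream\.(fin_but_no_session|rst_but_no_session|pkt_spurious_retransmission)' \
    /var/log/suricata/stats.log | tail -n 3
suricatasc -c "dump-counters"   # full counter dump as JSON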

The Workaround: Disabling Strict Stream Enforcement

Based on community discussions regarding unexpected drops in IPS mode, I tested a key stream-configuration variable.

The default setting drop-invalid: yes instructs Suricata to immediately drop packets it deems invalid according to its internal state machine (often due to out-of-sync sequence numbers or timing issues).

The Fix: I set this directive to no.

stream:
  memcap: 64mb
  memcap-policy: ignore  
  drop-invalid: no # Set to 'no' to fix legitimate traffic drops
  checksum-validation: yes
  midstream-policy: ignore
  inline: auto
  reassembly:
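
To apply the change, I validate the configuration first and then restart the service. A minimal sketch, assuming the default configuration path and a systemd unit:

suricata -T -c /etc/suricata/suricata.yaml   # test mode: parse configuration and rules, then exit
systemctl restart suricata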

As soon as I applied this change, the traffic to wordpress.org and the DokuWiki repository resumed functioning normally.

Conclusion: The Security Trade-off

While this workaround immediately solved the connectivity problem, I am consciously accepting a security trade-off. Disabling drop-invalid instructs the IPS to allow potentially ambiguous or invalid packets to pass.

  • Risk: This allows a low-volume attacker to potentially use malformed packets to bypass the stream state-tracking.
  • Benefit: It ensures Service Availability for crucial application updates and connections that the IPS was incorrectly flagging due to virtualization or network environment subtleties.

My next step will be to investigate the root cause of the high stream_error count to see if the error is caused by a kernel-level configuration or a misaligned network path.

Sources / See Also

  1. Suricata Documentation. Stream Configuration and Settings (Specifically drop-invalid). https://docs.suricata.io/en/latest/configuration/stream.html
  2. Suricata Documentation. Understanding and Analyzing the Stats Log. https://docs.suricata.io/en/latest/output/stats/stats-log.html
  3. Suricata Documentation. IPS Mode and Traffic Drop Reasons. https://docs.suricata.io/en/latest/performance/ips-mode.html
  4. OISF Community Forum. Discussion on high stream errors/spurious retransmissions and network offloading. (Discussions of this kind are the primary place to find such workarounds.)
  5. Linux Manpage: ethtool. Documentation on Network Offloading (TSO, GSO, LRO) which often causes Suricata Stream issues.

Suricata AF-Packet: Resolving VirtIO Non-Functionality via Checksum Offload Disablement

This article documents a two-part process: successfully upgrading Suricata to version 7 on Debian Bookworm, and solving a critical stability issue that must be addressed to run AF-Packet IPS mode with high-performance VirtIO NICs in a virtual machine. Without this specific configuration, the IPS failed to function.

Part I: Suricata 7 Upgrade and Policy Changes

A much newer Suricata version can be installed from Debian’s bookworm-backports repository, which provides access to the latest security features and performance enhancements.

The Backports Installation

  1. Ensure the backports repository is configured in your /etc/apt/sources.list:

    deb https://ftp.debian.org/debian/ bookworm-backports contrib main non-free non-free-firmware
  2. Install Suricata using the specific target:

    apt-get install -t bookworm-backports suricata
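
A quick check that the backports build is actually the one in use (the exact version strings will vary):

    apt-cache policy suricata   # shows which repository the installed package comes from
    suricata -V                 # prints the running Suricata version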

Post-Upgrade Security Alert (Critical)

After upgrading to Suricata 7, you may experience immediate traffic blocking. This is not a bug, but a deliberate change in the application’s default security posture.

  • Reason: Suricata 7 introduced new policy rules that are often set to drop by default.
  • Action: You must review your new suricata.yaml configuration. The recommended approach is to install the new configuration files, compare them with your old setup, and set unwanted policies to ignore.

Reference: This new behavior is explicitly documented in the official Suricata 7 Changelog. Consult the Suricata FAQ for troubleshooting details on blocking issues.
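
As an illustration, setting unwanted policies to ignore in my configuration looked roughly like the excerpt below. This is a sketch, not a complete configuration; the exact keys depend on your setup, and the Suricata 7 exception-policy documentation is the authoritative reference.

# suricata.yaml (excerpt): relax the new Suricata 7 exception policies
exception-policy: ignore     # master switch; in IPS mode the defaults lean towards drop
stream:
  memcap-policy: ignore
  midstream-policy: ignore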

Part II: The VirtIO and AF-Packet Critical Failure Fix

When using Suricata in IPS mode with the high-performance AF-Packet acquisition method, using VirtIO NICs is preferred. However, without a specific Libvirt configuration, the IPS fails entirely to process bridged traffic.

The Problematic Default VirtIO Config

If the VirtIO NIC is defined simply with <model type='virtio'/> in the Libvirt XML, AF-Packet fails to initialize or correctly process traffic.

The Solution: Disabling Guest Checksum Offload

The fix requires overriding the default driver settings by introducing the <driver> block and explicitly setting checksum (csum) offloading to off for the guest system.

This solution was found while troubleshooting similar packet loss issues in a thread related to XDP drivers in RHEL environments, suggesting a common kernel/driver interaction problem with aggressive offloading features.

The minimal required working Libvirt XML configuration looks like this:

    <interface type='bridge'>
      <mac address='..:..:..:..:..:..'/>
      <source bridge='ovs-guests'/>
      <virtualport type='openvswitch'>
      </virtualport>
      <model type='virtio'/>      
      <driver name='vhost'>
        <host csum='off' gso='off' tso4='off' tso6='off' ecn='off' ufo='off' mrg_rxbuf='off'/>
        <guest csum='off' tso4='off' tso6='off' ecn='off' ufo='off'/>
      </driver>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </interface>

Crucial Insight: The key fix is the parameter csum='off' within the <guest/> tag. If checksum offloading is left enabled (csum='on'), the system fails to bridge traffic completely.
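
To apply and verify the change, a sketch follows; the domain name placeholder, the interface name, and the need for a full power cycle instead of a guest reboot are assumptions for my setup:

virsh edit <domain>       # add the <driver> block shown above
virsh shutdown <domain>   # device changes require a full stop/start
virsh start <domain>

# inside the guest, the offloads should now be reported as disabled:
ethtool -k ens3 | grep -E 'checksumming|segmentation-offload'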

Part III: The Deep Dive: Why Checksum Offload Causes Complete Failure

Here is the rationale for why Checksum Offload (CSUM) leads to complete non-functionality:

1. The CSUM Optimization Paradigm (CSUM=’on’)

When you set csum='on', you are performing a performance optimization aimed at saving CPU cycles:

  • The Host/Hypervisor receives packets and passes them to the VirtIO Driver (Vhost).
  • The Vhost Driver passes the packets into the VirtIO Ring in the Guest System, but marks them with a special flag (e.g., in the skb—Socket Buffer—metadata) signaling to the Guest Kernel: “Attention, the L3/L4 checksum is invalid/missing and must be corrected or calculated before further processing up the stack.”
  • This is a performance trick: the CPU-intensive checksum calculation is delegated to the Guest Kernel, but only when it is truly necessary.

2. The Collision Point: AF-Packet Bypass

Suricata using AF-Packet now bypasses precisely this process:

  • AF-Packet is a very low-level packet capture method. It operates directly above the driver (or in the kernel) and fetches the raw L2 frames directly from the VirtIO Ring.
  • Suricata receives the packet at a point before the standard kernel stack has performed the checksum finalization.
  • Suricata’s Deep Packet Inspection (DPI) engine relies on the integrity of the Layer 3/Layer 4 headers (e.g., to check the TCP segment length, track the TCP state machine, or evaluate the validity of IP headers).
  • The Non-Functionality: Since Suricata receives a packet with the “Checksum missing/invalid” flag, it interprets this not as an optimization instruction, but as a critical error in the packet itself (Corrupted Packet).

3. The Resolution (CSUM=’off’)

By explicitly setting <guest csum='off'>, we force the Host/Vhost Driver to deliver the packets to the Guest as if they were ‘normal’ Ethernet frames that already contain all checksums. Suricata therefore only sees complete, consistent packets and can apply the DPI logic without error.


Sources / See Also

  1. Suricata Documentation. Suricata 7 Changelog (Note new policy behavior). https://suricata.io/changelog/
  2. Suricata Documentation. FAQ: Traffic gets blocked after upgrading to Suricata 7. https://suricata-update.readthedocs.io/en/latest/faq.html#my-traffic-gets-blocked-after-upgrading-to-suricata-7
  3. Suricata Documentation. Working with AF-Packet. https://docs.suricata.io/en/latest/install/af-packet.html
  4. Libvirt Documentation. VirtIO Device Configuration (Driver Offload Parameters). https://libvirt.org/formatdomain.html#elementsNICS
  5. Debian Wiki. Instructions for using Debian Backports. https://wiki.debian.org/Backports
  6. Suricata Community Forums. Troubleshooting references for XDP/Packet Loss (Context for driver tuning). https://forum.suricata.io/
  7. Linux Networking. Understanding the Checksum Offload Mechanism. https://www.kernel.org/doc/Documentation/networking/checksum-offloads.txt

Automated Defense: Building a Central Log Hub for Fail2ban and External Firewall Integration

A very lightweight and efficient approach to consolidating logs centrally is to use rsyslog. My virtual machines all use rsyslog to forward their logs to a dedicated internal virtual machine, which acts as the central log hub. A fail2ban instance on this hub checks all incoming logs and sends a block command to an external firewall, a process that is helpful for automated security.

Continue reading Automated Defense: Building a Central Log Hub for Fail2ban and External Firewall Integration

Nextcloud Client on Chromebook (ARM/aarch64): Solving Two-Way Sync

A short explanation of how to get the Nextcloud Linux desktop client working reliably on a Chromebook. This solution is necessary because the official Android client does not offer true two-way synchronization, which is a critical feature for managing files across systems.

Continue reading Nextcloud Client on Chromebook (ARM/aarch64): Solving Two-Way Sync

Nextcloud and MinIO Integration: Why Direct S3 Fails and the Filesystem Abstraction Workaround

MinIO is a fantastic Object Storage solution, and I intended to use my distributed MinIO system as the primary external storage for Nextcloud. This distributed setup, which uses Sidekick as a load balancer for seamless node access, proved functional but revealed a critical stability flaw, particularly with mobile uploads.

Continue reading Nextcloud and MinIO Integration: Why Direct S3 Fails and the Filesystem Abstraction Workaround