
Installing Jitsi behind a reverse proxy


Introduction

Videoconferencing with the official meet.jit.si instance has always been a pleasure, but it seemed a good idea to research how to install a Jitsi instance locally, one that could be used by customers, by members of the local Linux Users Group (COAGUL), or by anyone else.

This instance is available at jitsi.debamax.com and that service should be considered as a beta: it's just been installed, and it's still running the stock configuration. Feel free to tell us what works for you and what doesn't!

Networking vs. virtualization host

One host was already set up as a virtualization environment, featuring libvirt, managing LXC containers and QEMU/KVM virtual machines. In this article, we focus on IPv4 networking. Basically, the TCP/80 and TCP/443 ports are exposed on the public IP, and NAT'd to one particular container, which acts as a reverse proxy. The running Apache server defines as many VirtualHosts as there are services, and acts as a reverse proxy for the appropriate LXC container or QEMU/KVM virtual machine.

Schematically, here's what happens (see the figure below):

What does that mean for the Jitsi installation? Well, Jitsi expects those ports to be available: TCP/80 and TCP/443 for the web frontend, plus TCP/4443 and UDP/10000 for the videobridge (media streams).

For this specific host, TCP/4443 and UDP/10000 were available, and have been NAT'd as well to the Jitsi virtual machine directly. Given the existing services, the same couldn't be done for the TCP/443 port, which explains the need for the following section.


[ Figure: NAT and reverse proxy for Jitsi ]

Note: A summary of the host's iptables configuration is available in the annex at the bottom of this article.

Apache as a reverse proxy

A new VirtualHost was defined on the apache2 service running as reverse proxy. The important parts are quoted below:

<VirtualHost *:80>
    ServerName jitsi.debamax.com
    RedirectMatch permanent ^(?!/\.well-known/acme-challenge/).* https://jitsi.debamax.com/
</VirtualHost>

<VirtualHost *:443>
    SSLProxyEngine on
    SSLProxyVerify none
    SSLProxyCheckPeerCN off
    SSLProxyCheckPeerName off
    SSLProxyCheckPeerExpire off

    ProxyPass        / https://192.168.122.120/
    ProxyPassReverse / https://192.168.122.120/
</VirtualHost>

The TCP/80 part simply redirects plain HTTP to HTTPS (leaving ACME challenges alone so that Let's Encrypt issuance and renewals keep working), so let's concentrate on the TCP/443 part.

The ProxyPass and ProxyPassReverse directives act on /, meaning every path will be proxied to the Jitsi virtual machine. If one wasn't using VirtualHost directives to distinguish between services, one could instead dedicate some specific paths (“subdirectories”) to Jitsi, and proxy only those to the Jitsi instance, as sketched below. But let's concentrate on the simpler “the whole VirtualHost is proxied” case.
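A path-based setup could look roughly like the following sketch; the /jitsi/ prefix is purely illustrative, and the Jitsi instance would additionally need to be configured to serve itself under that prefix:

<VirtualHost *:443>
    SSLProxyEngine on
    # (The same SSLProxy* relaxations as below would still apply with a
    # self-signed certificate on the backend.)

    # Only forward requests under /jitsi/ to the Jitsi virtual machine;
    # everything else is served or proxied elsewhere in this VirtualHost.
    ProxyPass        /jitsi/ https://192.168.122.120/
    ProxyPassReverse /jitsi/ https://192.168.122.120/
</VirtualHost>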

Back to the actual configuration: the SSLProxyEngine on directive is needed for apache2 to be happy with proxying requests to a server using HTTPS instead of plain HTTP.

All other SSLProxy* directives aren't too nice as they disable all checks! Why do that, then? The answer is that Jitsi's default installation sets up an NGINX server with HTTP-to-HTTPS redirections, and it seemed easier to forward requests directly to the HTTPS port, disabling all checks since that NGINX server was installed with a self-signed certificate. One could deploy a suitable certificate there and enable the checks again, rather than using this “StackOverflow-style heavy hammer” (some directives might not even be needed).

Jitsi configuration

Jitsi itself was installed on a QEMU/KVM virtual machine, running a basic Debian 10 (buster) system, initially provisioned with 2 CPUs, 4 GB RAM, 3 GB virtual disk. Its IP address is 192.168.122.120, which is what was configured as the target of the ProxyPass* directives in the previous section.

The installation was done using the quick-install.md documentation, entering jitsi.debamax.com as the FQDN, and opting for a self-signed certificate (leaving the reverse proxy in charge of the Let's Encrypt certificate dance, as it does for all VirtualHosts).
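For reference, the quick-install procedure boiled down to something like the following commands at the time; repository details may have changed since, so the upstream documentation remains the authoritative source:

# Add the Jitsi package repository and its signing key (sketch, as documented upstream):
echo 'deb https://download.jitsi.org stable/' > /etc/apt/sources.list.d/jitsi-stable.list
wget -qO - https://download.jitsi.org/jitsi-key.gpg.key | apt-key add -

# Install jitsi-meet; the FQDN and certificate questions are asked interactively:
apt-get update
apt-get install jitsi-meet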

Now, a very important point needs to be addressed (no pun intended), which isn't so much related to running behind a reverse proxy as to the fact that the TCP/4443 and UDP/10000 ports are NAT'd: the videobridge component needs to know about that, i.e. about both the public IP and the local IP. In this context, the local IP is the Jitsi virtual machine's local IP, where the NAT for TCP/4443 and UDP/10000 points to, and not the reverse proxy's local IP. That's why those lines have to be added to the /etc/jitsi/videobridge/sip-communicator.properties configuration file:

org.ice4j.ice.harvest.NAT_HARVESTER_LOCAL_ADDRESS=192.168.122.120
org.ice4j.ice.harvest.NAT_HARVESTER_PUBLIC_ADDRESS=163.172.19.80

[ Hint: Beware, there's another sip-communicator.properties configuration file, for the jicofo component! ]
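A quick way to see both files and make sure the edit lands in the videobridge one (paths as installed by the Debian packages):

ls -l /etc/jitsi/videobridge/sip-communicator.properties \
      /etc/jitsi/jicofo/sip-communicator.properties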

Remember to restart the service:

systemctl restart jitsi-videobridge
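As a quick sanity check (a sketch; the exact ports depend on the harvester configuration), the videobridge should then be listening on UDP/10000, and typically on TCP/4443 as well since TCP/443 is already taken by NGINX on that machine:

systemctl status jitsi-videobridge
ss -lntu | grep -E ':4443|:10000'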

Annex: host networking configuration

The relevant iptables rules on the host are the following (leaving aside the usual MASQUERADE rule, which is required when using NAT):

Chain FORWARD (filter table)
target     prot opt source               destination
ACCEPT     tcp  --  0.0.0.0/0            192.168.122.100      tcp dpt:80
ACCEPT     tcp  --  0.0.0.0/0            192.168.122.100      tcp dpt:443
ACCEPT     tcp  --  0.0.0.0/0            192.168.122.120      tcp dpt:4443
ACCEPT     udp  --  0.0.0.0/0            192.168.122.120      udp dpt:10000

Chain PREROUTING (nat table)
target     prot opt source               destination
DNAT       tcp  --  0.0.0.0/0            163.172.19.80        tcp dpt:80 to:192.168.122.100:80
DNAT       tcp  --  0.0.0.0/0            163.172.19.80        tcp dpt:443 to:192.168.122.100:443
DNAT       tcp  --  0.0.0.0/0            163.172.19.80        tcp dpt:4443 to:192.168.122.120:4443
DNAT       udp  --  0.0.0.0/0            163.172.19.80        udp dpt:10000 to:192.168.122.120:10000
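For completeness, those rules could be (re)created with commands along these lines; this is a sketch, leaving out rule ordering, interfaces, and the usual libvirt-managed chains:

# DNAT incoming traffic on the public IP to the relevant guest:
iptables -t nat -A PREROUTING -d 163.172.19.80 -p tcp --dport 80    -j DNAT --to-destination 192.168.122.100:80
iptables -t nat -A PREROUTING -d 163.172.19.80 -p tcp --dport 443   -j DNAT --to-destination 192.168.122.100:443
iptables -t nat -A PREROUTING -d 163.172.19.80 -p tcp --dport 4443  -j DNAT --to-destination 192.168.122.120:4443
iptables -t nat -A PREROUTING -d 163.172.19.80 -p udp --dport 10000 -j DNAT --to-destination 192.168.122.120:10000

# Accept the forwarded traffic in the filter table:
iptables -A FORWARD -d 192.168.122.100 -p tcp --dport 80    -j ACCEPT
iptables -A FORWARD -d 192.168.122.100 -p tcp --dport 443   -j ACCEPT
iptables -A FORWARD -d 192.168.122.120 -p tcp --dport 4443  -j ACCEPT
iptables -A FORWARD -d 192.168.122.120 -p udp --dport 10000 -j ACCEPT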

Published: Wed, 18 Mar 2020 10:15:00 +0100

Fixing faulty synchronization in Nextcloud


Introduction

Some problems were detected on a customer’s Nextcloud instance: disk space filled up all of a sudden, leading to service disruptions. Checking the Munin graphs, it seemed that disk usage gently increased from 80% to 85% over an hour, before spiking to 100% in a few extra minutes.

Looking at the “activity feed” (/apps/activity), there were a few minor edits by some users, but mostly many “SpecificUser has modified [lots of files]” occurrences. After allocating some extra space and making sure the service was behaving properly again, it was time to check with that SpecificUser whether those were intentional changes, or whether there might have been some mishap…

It turned out to be the unfortunate consequence of some disk maintenance operations that led to file system corruption. After some repair attempts, it seems the Nextcloud client triggered a synchronization that involved a lot of files, until it got interrupted because of the disk space issue on the server side. The question became: How many of the re-synchronized files might have been corrupted in the process? For example, /var/lib/dpkg/status had been replaced by a totally different file on the affected client.

Searching for a solution

Because the corruption could be extensive, one way to get back in time was to take note of all the “wanted” changes by other users, put them aside, restore from backups (previous night), and replay those changes. But then it was feared that any Nextcloud client having seen the new files could attempt to re-upload them, overwriting the files that had just been restored.

That solution wasn’t very appealing, which is why Cyril tried his luck on Twitter, asking whether there would be a way to revert all modifications from a given user during a given timeframe.

Feedback from the @Nextclouders account was received shortly after that, pointing out that such issues could have been caught client-side and a warning might have been displayed before replacing so many files, but that wasn’t the case unfortunately, and we were already in an after-the-fact situation.

The second lead could be promising if such an issue were to happen again. All the required information is in the database already, and there’s already a malware app that knows how to detect files that could have been encrypted by a cryptovirus, and which helps restore them by reverting to the previous version. It should be possible to create a new application, implementing the missing feature by adjusting the existing malware app…

Diving into the actual details

At this stage, the swift reply from Nextcloud upstream seemed to indicate that early research didn’t miss any obvious solutions, so it was time to assess what happened on the file system, and see if that would be fixable without resorting to either restoring from backups or creating a new application…

It was decided to look at the last 10 hours, making sure to catch all files touched that day (be it before, during, or after the faulty synchronization):

    cd /srv/nextcloud
    find . -type f -mmin -600 | sort > ~/changes

Searching for patterns in those several thousand files, a few sets stood out: files under data/specificuser/uploads, preview files, and matching entries under files/ and files_versions/.

Good news: Provided there are no more running Nextcloud clients trying to synchronize things with the server for that SpecificUser, all those files under data/specificuser/uploads could go away entirely, freeing up 10 GiB.

Next: preview files only amounted to about 100 MiB, so spending more time on them didn’t seem worth it.
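Sizing those sets is straightforward with du; for instance (the preview path pattern is an assumption for this particular instance):

cd /srv/nextcloud
# Size of the interrupted uploads for that user:
du -sh data/specificuser/uploads
# Total size of the recently-touched preview files:
grep '/preview/' ~/changes | tr '\n' '\0' | du -ch --files0-from=- | tail -n 1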

The remaining parts were of course the interesting ones: what about those files/ versus files_versions/ entries?

Versions in Nextcloud

Important note: The following is based on observations of this specific Nextcloud 16 instance, which wasn’t heavily customized; exercise caution, and use at your own risk!

Without checking either code or documentation, it seemed pretty obvious how things work: when a given foo/bar.baz file gets modified, the previous copy is kept by Nextcloud, which moves it from under files/ to under files_versions/ and adds a suffix of the form .vTIMESTAMP, where TIMESTAMP is expressed in seconds since the epoch. Here’s an example:

./data/commonuser/files/foo/bar.baz
./data/commonuser/files_versions/foo/bar.baz.v1564577937

To convert from a given timestamp:

$ date -d '@1564577937'
Wed 31 Jul 14:58:57 CEST 2019

$ date -d '@1564577937' --rfc-2822
Wed, 31 Jul 2019 14:58:57 +0200

$ date -d '@1564577937' --rfc-3339=seconds
2019-07-31 14:58:57+02:00
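Combining both observations, the timestamp suffix of any given version can be turned into a human-readable date directly; a tiny illustrative snippet (hypothetical file name):

snapshot=./data/commonuser/files_versions/foo/bar.baz.v1564577937
# Strip everything up to (and including) the last ".v" to get the timestamp:
date -d "@${snapshot##*.v}" --rfc-3339=seconds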

Given there’s a direct mapping (same path, except under different directories) between an old version and its most recent file, this opened the way for a very simple check: “For each of those versions, does that version match the most recent file?”

If that version has the exact same content, one can assume that the Nextcloud client re-uploaded the exact same file (as a new version, though), and didn’t re-upload a corrupted file instead; which means that the old version can go away. If that version has a different content, it has to be kept around, and users notified so that they can check whether the most recent file is desired, or if a revert to a previous version would be better (Cyril acting as a system administrator here, rather than as an end-user).

Here’s a tiny shell script consuming the ~/changes file containing the list of recently-modified files (generated with find as detailed in the previous section), filtering and extracting each version (called $snapshot in the script for clarity), determining the path to its most recent file by dropping the suffix and adjusting the parent directory, and checking for identical contents with cmp:

#!/bin/sh
set -e

cd /srv/nextcloud
# Only consider old versions; for each one, compute the path of its most
# recent counterpart and compare contents.
grep '/files_versions/' ~/changes | \
while IFS= read -r snapshot; do
  current=$(echo "$snapshot" | sed 's/\.v[0-9][0-9]*$//' | sed 's,/files_versions/,/files/,')
  if cmp -s "$snapshot" "$current"; then
    echo "I: match for $snapshot"
  else
    echo "E: no match for $snapshot"
  fi
done
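Running it (the check-versions.sh name is arbitrary) and keeping the output around makes it easy to get a feel for the damage:

sh check-versions.sh | tee ~/versions-report
grep -c '^I:' ~/versions-report    # identical re-uploads: old versions are redundant
grep -c '^E:' ~/versions-report    # differing contents: needs a human decision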

At this point, it became obvious that most files were indeed re-uploaded without getting corrupted, and it seemed sufficient to turn the first echo call into an rm one to get rid of their old, duplicate versions and regain an extra 2 GiB. Cases without a match seemed to resemble the list of files touched by other users, which seemed like good news as well. To be on the safe side, that list was mailed to all involved users, so that they could check that current files were the expected ones, possibly reverting to some older versions where needed.
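Concretely, the destructive variant mentioned above only changes the match branch of the script; it removes files for good, so it is best run only after reviewing the report from the cautious version:

  if cmp -s "$snapshot" "$current"; then
    # Identical content: the old version is a duplicate and can go away.
    rm -v "$snapshot"
  else
    echo "E: no match for $snapshot"
  fi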

Conclusion

Fortunately, there was no need to develop an extra application implementing a new “let’s revert all changes from this user during that timeframe” feature to solve this specific case. Observation plus automation shrank the list of 2500+ modified files to just a handful that needed manual (user) checking. Some time was lost, and some space was reclaimed in the end. Not too bad for a Friday afternoon…

Many thanks to the Nextcloud team and community for a great piece of software, and for a very much appreciated swift reply!


Published: Sat, 29 Feb 2020 01:00:00 +0100