Back up remote workloads without exposing Bacula to the Internet

Bacula Community Edition is amazingly powerful open source backup software that works extremely well on private networks. However, if you need to back up workloads that are only reachable over the Internet, your first reflex may be to open your firewalls to traffic for your Bacula File and Storage Daemons. You can certainly do this safely by restricting access to specific source IPs on both sides, but what if you host your Bacula infrastructure behind an ISP that hands out dynamic IPs? This post describes a way to back up remote workloads without exposing Bacula’s endpoints to the Internet, unlike the typical setup shown here:

Bacula's typical network connectivity flow with FD and SD endpoints exposed over the Internet.

You can port-forward traffic from your router to the Storage Daemon and use dynamic DNS so that the Storage Daemon’s host name stays the same for your remote File Daemons. However, this requires exposing your Storage Daemon and the File Daemon of every remote workload to the Internet, which increases the attack surface of your servers.

What if you could do away with most of that without setting up a persistent VPN connection?

SSH tunnels to the rescue

Since SSH is likely already set up for remote management of your servers, you can use its built-in tunnelling feature to expose services from one computer to another by sending their traffic through an SSH connection.

Use SSH tunnels to back up your remote workloads without exposing Bacula's endpoints to the Internet

ssh -L 9112:remote.fqdn.tld:9102 -R 9103:bacula.fqdn.tld:9103 remote.fqdn.tld sleep 10

In the above example diagram and its associated command, the Bacula server initiates an SSH connection to the Remote server on the usual port tcp/22 and runs the sleep 10 command on the remote host. The diagram’s blue tunnel represents this SSH connection. The 10-second sleep keeps the SSH connection established long enough for traffic to start flowing through the tunnel’s listeners; ssh then stays connected until the forwarded connections are closed.

The supplied SSH command creates a local listener bound to tcp/9112 on the Bacula server, which has its target on the Remote server’s File Daemon listener, on tcp/9102. This can be observed as the red dot and its arrows on the above diagram and is accomplished using the -L 9112:remote.fqdn.tld:9102 argument. I will cover the reason why both port numbers differ later on.

Similarly, the -R 9103:bacula.fqdn.tld:9103 argument creates a remote listener on tcp/9103 on the Remote server, having its target on the Bacula server’s port tcp/9103 as well.

This rudimentary command has all the pieces to allow the various flows required by Bacula:

  1. The Bacula Director first triggers the sshbacula.sh script, which initiates the SSH connection and sets up the tunnels.
  2. Then, the Bacula Director initiates a File Daemon connection to localhost:9112, which is a local SSH tunnel listener sending traffic to the remote File Daemon.
  3. Finally, the remote File Daemon initiates a connection to the Storage Daemon on bacula.fqdn.tld:9103 in order to stream data. Because the traffic is expected to flow through the SSH tunnel, you first need to create a /etc/hosts entry on the remote server for that FQDN, pointing to 127.0.0.1.
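The three flows above all hang off a single ssh invocation. As a minimal sketch, the helper below only prints that command so you can review the forwardings before running anything. The function name build_tunnel_cmd and its argument order are assumptions made for illustration; they are not part of Bacula or of the script shown later.

```shell
#!/bin/bash
# Hypothetical helper (not part of Bacula): print the ssh command that would
# set up both tunnels, without running it.
#   $1 = remote host, $2 = Bacula server FQDN, $3 = local port for the FD tunnel
build_tunnel_cmd() {
  local remote=$1 bacula=$2 fd_port=$3
  # -L: listener on the Bacula server, forwarded to the remote File Daemon (9102).
  # -R: listener on the remote host, forwarded back to the Storage Daemon (9103).
  printf 'ssh -L %s:%s:9102 -R 9103:%s:9103 %s sleep 10\n' \
    "$fd_port" "$remote" "$bacula" "$remote"
}

# Dry run with the host names used in this post.
build_tunnel_cmd remote.fqdn.tld bacula.fqdn.tld 9112
```

With these arguments the helper prints the same command as the example above, which makes it easy to see which value ends up in which forwarding.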

Set up a RunBeforeJob script

When the Bacula Director runs the following job, it executes the sshbacula.sh script before the actual backup starts, as instructed by the RunBeforeJob parameter of the Job resource. The script initiates a temporary SSH tunnel between the Bacula server and the remote server.

Job {
  Name = "myremoteserver.com_root"
  Client = myremoteserver.com
  Type = Backup
  RunBeforeJob = "/etc/bacula/scripts/sshbacula.sh myremoteserver.com ~/.ssh/tunneluser 9112"
  FileSet = "myremoteserver.com_root"
  Schedule = "WeeklyCycle"
  Pool = LocalPool
  Storage = LocalHugeFileStorage
  Write Bootstrap = "/var/bacula/myremoteserver.com_root.bsr"
  Messages = Standard
}

You may notice that 3 arguments are provided to the sshbacula.sh script:

  • The hostname of the remote server. This is typically the same value as what you provide for the Client property.
  • The passwordless SSH private key to use to authenticate on the remote host.
  • The port on localhost that the Bacula Director will connect to, and which is forwarded through the SSH tunnel. Because Bacula can run backup jobs concurrently if your configuration allows it, you must set a different port number for every job you expect to run concurrently!

These 3 arguments allow the same script to be reused across different remote hosts.
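Because a typo in any of these arguments only shows up when the job runs, it can help to reject bad input early. The following is a sketch of such a check, assuming you would call it at the top of sshbacula.sh; the function name validate_args and its error messages are hypothetical.

```shell
#!/bin/bash
# Hypothetical argument check for the top of sshbacula.sh.
#   $1 = remote hostname, $2 = SSH private key path, $3 = local tunnel port
validate_args() {
  if [ "$#" -ne 3 ] || [ -z "$1" ] || [ -z "$2" ]; then
    echo "usage: sshbacula.sh <host> <private-key> <local-port>" >&2
    return 1
  fi
  # The port must consist of one to five digits only...
  case $3 in
    ''|*[!0-9]*|??????*) echo "invalid port: $3" >&2; return 1 ;;
  esac
  # ...and fall inside the valid TCP port range.
  if [ "$3" -lt 1 ] || [ "$3" -gt 65535 ]; then
    echo "port out of range: $3" >&2
    return 1
  fi
}
```

In the script this could be used as validate_args "$@" || exit 1. The messages go to stderr, and the non-zero exit status is what tells the caller that the arguments were rejected.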

The sshbacula.sh script

Here are the contents of the sshbacula.sh script in question. Line #16 is where all the magic happens: it initiates the SSH connection and sets up the local and remote tunnels that the Bacula Director and File Daemons will use.

#!/bin/bash
# Establishes a self-killing SSH tunnel to the given SSH server, and forwards the correct ports for bacula usage.

USER=tunneluser      # The username to use when connecting to the remote server.
CLIENT=$USER@$1      # user@host for the ssh command.
CLIENT_KEY=$2        # The location for the SSH private key to use for the ssh command.
LOCAL=$(hostname -f) # The FQDN of the bacula server.
SSH=`which ssh`      # The full path for the ssh command.

SD_LOCAL_TUNNEL=9103:$LOCAL:9103 # Tunnel for the LocalHugeFileStorage.
FD_TUNNEL=$3:localhost:9102      # Tunnel for the File Daemon.

# Everything echoed to stdout will appear in Bacula's messages. Nice.
echo "Starting SSH tunnel to $CLIENT..."

"$SSH" -fC2 -i "$CLIENT_KEY" -R "9101:$LOCAL:9101" -R "$SD_LOCAL_TUNNEL" -L "$FD_TUNNEL" "$CLIENT" sleep 10 >/dev/null 2>&1

# Provide ssh some time to establish the connection.
sleep 3
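The fixed sleep 3 at the end of the script is a guess at how long ssh needs to come up. One possible alternative, sketched below, is to poll the local listener until it accepts a connection. The function name wait_for_port is hypothetical, and the probe relies on bash's /dev/tcp redirection, so it assumes the script runs under bash as its shebang indicates.

```shell
#!/bin/bash
# Hypothetical helper: wait until a TCP port accepts connections.
#   $1 = host, $2 = port, $3 = maximum number of attempts (one second apart)
wait_for_port() {
  local host=$1 port=$2 attempts=$3 i=0
  while [ "$i" -lt "$attempts" ]; do
    # Bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT;
    # the subshell closes the descriptor again as soon as it exits.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  return 1
}
```

In sshbacula.sh, the last line could then become something like wait_for_port localhost "$3" 5 || exit 1, so that the script exits with an error when the tunnel never comes up. Note that the probe only confirms that ssh is listening locally, and that it briefly opens a forwarded connection towards the remote File Daemon.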

Tunneling traffic for the Client resources

The last change to make on the Bacula server is to update the Address and FDPort properties of the Client resources whose traffic must be tunneled with this solution. The File Daemon tunnel is exposed on localhost, as can be seen on line #11 of the sshbacula.sh script above, on the port number passed in the RunBeforeJob property. Thus, you need to amend the Client resource to use these values, as in the following example:

Client {
  Name = myremoteserver.com
  Address = localhost 
  FDPort = 9112
  Catalog = MyCatalog
  Password = "...OBFUSCATED..."
  File Retention = 30 days
  Job Retention = 30 days
  AutoPrune = yes
}

With this configuration, the Bacula Director will initiate a File Daemon connection to localhost:9112 and the traffic will be tunneled to myremoteserver.com:9102!
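Since every tunneled Client resource points at localhost, the FDPort is the only thing telling them apart, and reusing one by accident is easy. Here is a rough sketch of a check for that. The function name duplicate_tunnel_ports is hypothetical, and the parsing assumes each Client resource is laid out as in the example above, with "Client {" and the closing "}" on their own lines.

```shell
#!/bin/bash
# Hypothetical check: print each FDPort that is used by more than one Client
# resource whose Address is localhost, in the given Bacula config file(s).
duplicate_tunnel_ports() {
  awk '
    {
      # Normalise the line: lowercase, drop comments, spaces, tabs and quotes.
      line = tolower($0)
      sub(/#.*/, "", line)
      gsub(/[ \t"]/, "", line)
    }
    line ~ /^client[{]/ { in_client = 1; addr = ""; port = "" }
    in_client && line ~ /^address=/ { addr = substr(line, 9) }
    in_client && line ~ /^fdport=/  { port = substr(line, 8) }
    in_client && line == "}" {
      # Only tunneled clients (Address on the loopback) are compared.
      if ((addr == "localhost" || addr == "127.0.0.1") && port != "" && ++seen[port] == 2)
        print port
      in_client = 0
    }
  ' "$@"
}
```

Pass it the file or files that hold your Client resources. Empty output means no two tunneled clients share a port. This is stricter than necessary, because only jobs that run concurrently actually need distinct ports.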

Tunneling traffic to the Storage Daemon

Until now, you had to perform all previous steps on the Bacula server itself, in order to allow the Director to connect to remote File Daemons. However, as you can see in this post’s first image, the remote File Daemons need to initiate a connection to the Storage Daemon as well. The sshbacula.sh script binds the tunnel on the remote host’s port 9103 so all we need to do is to override the Storage Daemon’s hostname.

An easy way to accomplish this is to add an entry pointing to localhost’s IP address to the hosts file, which is located at /etc/hosts on *nix machines and at C:\Windows\System32\drivers\etc\hosts on Windows machines.

In the following example, assume that my Bacula Storage Daemon is hosted on the bacula.fqdn.tld host:

tunneluser@myremoteserver:~$ cat /etc/hosts
127.0.0.1 localhost
127.0.0.1 bacula.fqdn.tld
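If you manage several remote hosts, you may prefer to add this entry from a script instead of editing the file by hand. The sketch below appends the entry only when the name is not listed yet, so it can be run repeatedly. The function name add_hosts_entry is hypothetical, and writing to /etc/hosts requires root privileges.

```shell
#!/bin/bash
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is already
# listed there.  $1 = hosts file, $2 = IP address, $3 = host name
add_hosts_entry() {
  local file=$1 ip=$2 name=$3
  # Look for NAME as a whole field on any line, ignoring comments.
  if awk -v n="$name" '
       /^[ \t]*#/ { next }
       {
         for (i = 2; i <= NF; i++) {
           if ($i ~ /^#/) break
           if ($i == n) found = 1
         }
       }
       END { exit !found }
     ' "$file"; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

On the remote server this would be called as add_hosts_entry /etc/hosts 127.0.0.1 bacula.fqdn.tld. The helper leaves an existing entry for the name untouched, even one that points to a different address, so check the file first if the name was already in use.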

Keep in mind that this hostname must also be used in the Storage resource definition’s Address property for the job in question, which is named LocalHugeFileStorage in the above example. Here is an example:

Storage {
    Name = LocalHugeFileStorage
    Address = bacula.fqdn.tld # <--- Customize this property!
    SDPort = 9103
    Password = "...OBFUSCATED..."
    Device = LocalFileDevice
    Media Type = LocalFile
    Heartbeat Interval = 10
    Maximum Concurrent Jobs = 1
}

Conclusion

Backing up your workloads over the Internet in a secure fashion may be easier to accomplish in an enterprise context than in a homelab, because most homelabbers have an ISP that provides a dynamic IP address. This makes it challenging to maintain File Daemon configurations and firewall rules that only allow connectivity from your desired hosts to your Storage Daemon. On the other hand, SSH is oftentimes a necessary evil and you hopefully already have the means to control access to this endpoint.

This solution leverages SSH’s tunneling capabilities to reduce your hosts’ attack surface by allowing you to back up your remote workloads without actually exposing Bacula’s endpoints to the Internet.
