Self-hosting ArchiveBox on a VPS
Note: this article is edited and published at Vultr Docs
ArchiveBox is a powerful, self-hosted internet archiving solution to collect, save, and view sites you want to preserve offline. This guide explains how to self-host ArchiveBox on a Vultr One-Click Docker application, and publish it with a Caddy reverse proxy.
Prerequisites
- A Vultr One-Click Docker application running Ubuntu 18.04
- Caddy
This guide assumes that only ArchiveBox is hosted on the server, but you can easily extend the configuration of Caddy for more applications.
1. Set up ArchiveBox
Using docker-compose is the recommended way to set up ArchiveBox. ArchiveBox provides an official docker-compose.yml that bundles all dependencies, which we can use to set up our server.
- After the One-Click Docker application deploys, log in as root via SSH.
- Switch to the docker user
su - docker
- Create a new empty directory and download the official docker-compose.yml file. This directory will also store ArchiveBox's data.
mkdir ~/archivebox && cd ~/archivebox
curl -O 'https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/master/docker-compose.yml'
- (Optional) Set a restart policy for the archivebox service in the downloaded docker-compose.yml so that ArchiveBox starts again automatically, for instance after a crash or a server reboot. For example, set it to always.
archivebox:
    image: ${DOCKER_IMAGE:-archivebox/archivebox:latest}
    command: server --quick-init 0.0.0.0:8000
    ports:
        - 8000:8000
    restart: always
    environment:
        - ALLOWED_HOSTS=*
        - MEDIA_MAX_SIZE=750m
    volumes:
        - ./data:/data
- Run the initial setup and create an admin user. You will use this admin user to create bookmarks in ArchiveBox.
docker-compose run archivebox init --setup
- Start the server at localhost:8000
docker-compose up -d
- Switch back to the root user by pressing Ctrl+D
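Before moving on, it can help to confirm that the container actually answers on port 8000. The following is a minimal sketch, not part of ArchiveBox or Docker: wait_for_http is a hypothetical helper name, and it assumes curl is installed on the server.

```shell
# Poll a URL until it answers or the attempts run out.
# Hypothetical helper for this guide, not an ArchiveBox command.
wait_for_http() {
  url=$1
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: treat HTTP errors (>= 400) as failure, -s: no progress output
    if curl -fs -o /dev/null "$url"; then
      echo "up: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "not reachable: $url" >&2
  return 1
}

# Example: wait_for_http http://localhost:8000
```

If the call reports that the URL is not reachable, docker-compose logs archivebox (run from ~/archivebox as the docker user) is a reasonable place to look next.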
2. Set up a server firewall
A firewall blocks access to the server on ports we have not explicitly allowed. For ArchiveBox, we only need to expose ports 80 (HTTP) and 443 (HTTPS). Allowing the SSH port is optional, but if you manage the server over SSH, allow it before enabling the firewall so that new SSH connections are not blocked. In this guide, we use the Uncomplicated Firewall, ufw, a front-end for iptables that is easier to manage and use.
- Install ufw
sudo apt-get install ufw
- Enable the HTTP and HTTPS ports
sudo ufw allow 80
sudo ufw allow 443
- (Optional) Enable the SSH port. It's recommended to move SSH to a port other than 22; if you do, update /etc/ssh/sshd_config to match and allow the new port here instead of 22
sudo ufw allow 22
- Start ufw
sudo ufw enable
sudo ufw status # Should show "Status: active"
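The status output can also be checked rule by rule instead of by eye. This is a small sketch under assumptions: has_allow_rule is a hypothetical helper that only reads the text printed by ufw status on stdin, and it assumes the rules were added with a bare port number (or port/tcp) as in the steps above.

```shell
# Succeed when `ufw status` output on stdin has an ALLOW rule for the port.
# Hypothetical helper; it only reads text and changes no firewall rules.
has_allow_rule() {
  port=$1
  grep -Eq "^${port}(/tcp)?[[:space:]]+ALLOW"
}

# Example (run as root):
#   for p in 80 443; do
#     ufw status | has_allow_rule "$p" || echo "port $p is not allowed" >&2
#   done
```

One caveat worth checking on your own server: ports published by Docker are usually handled by Docker's own iptables rules, so port 8000 may stay reachable from outside even though ufw does not list it. Publishing it as 127.0.0.1:8000:8000 in docker-compose.yml is one way to keep it local to the reverse proxy.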
3. Set up a reverse proxy with Caddy
Now that ArchiveBox is running at localhost:8000, we want to publish it as a publicly trusted site over HTTPS. We use Caddy, which handles the reverse proxy and SSL termination with very little configuration.
- In your DNS provider, point your domain's A record (e.g. archivebox.example.com) to your Vultr server. Verify the record with a lookup through a public resolver
curl "https://cloudflare-dns.com/dns-query?name=archivebox.example.com&type=A" -H "accept: application/dns-json"
- Install Caddy. There are many ways to do this, and you can even extend ArchiveBox's docker-compose.yml. Here we will use the standard Caddy package.
sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | sudo apt-key add -
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | sudo tee -a /etc/apt/sources.list.d/caddy-stable.list
sudo apt update
sudo apt install caddy
- Update Caddy's configuration in /etc/caddy/Caddyfile
archivebox.example.com
reverse_proxy localhost:8000
- Run caddy reload to reload the configuration gracefully (without downtime).
- Verify it works by visiting archivebox.example.com
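A typo in the Caddyfile is easier to catch before reloading than after. The sketch below wraps the reload in a validation step; reload_caddy is a hypothetical name, and the default path assumes the configuration lives in /etc/caddy/Caddyfile as in this guide. Passing --config explicitly also avoids depending on the current directory, since caddy reload otherwise looks for a Caddyfile there.

```shell
# Validate the Caddyfile, then reload Caddy only if validation passes.
# Hypothetical wrapper; the default path is the one used in this guide.
reload_caddy() {
  config=${1:-/etc/caddy/Caddyfile}
  if caddy validate --config "$config" --adapter caddyfile; then
    caddy reload --config "$config" --adapter caddyfile
  else
    echo "invalid config, not reloading: $config" >&2
    return 1
  fi
}

# Example: reload_caddy
```

Depending on file permissions, the call may need to run as root.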
Conclusion
At this point, you have a self-hosted ArchiveBox application that you can use to bookmark websites from anywhere!