Scan to your own private cloud – automatic upload to Nextcloud (or any other cloud)

Published by Oliver on

The digital, paperless future is coming closer and closer. When I was looking to go digital I ran into one major problem: automatic upload to Nextcloud. Automatic upload from your scanner makes digital copies of your documents available everywhere – but only to you! It turns out that this problem is more complicated than expected, but I found a solution that works very well for me. It even works with other cloud providers like S3, Dropbox, Google Drive and more.

Going paperless – what you need

If you want to replace your old “analog” folders with a digital system you need a couple of components. First you need some kind of scanner to get an image of your documents. Next you need some storage to place these images in. This should be backed up and secure. Finally you will most likely need another system that lets you categorize and search all these documents.

Here are two setups I have used over time: a very simple one without much cost, and a more professional one that needs a bit more time and money to set up but results in a great experience.

If you are only interested in the auto upload part, jump to the section “Scanning to the cloud – automatic upload to Nextcloud” below.

Scanner

The first step is getting a device to digitize your documents. By far the easiest and also cheapest solution is just using your smartphone (assuming you already have one). You could simply take a photo of your document but I suggest getting a scanner app. Those can automatically optimize your scans and combine multiple images into one document. I have been using Microsoft's Office Lens for some time but there are a lot of other options out there. Just be careful with auto-upload to some cloud if you value your privacy.

I have set the app to scan to a certain folder on my phone and then set my Nextcloud app (it can be simple to set up your own server) to watch this folder and automatically upload all the documents there. Then I can continue working with them on my desktop PC. Pretty neat.

Using an app can be pretty tedious though, especially if you have two-sided documents or you want to scan a bigger amount (say, your whole existing archive). If you plan on scanning all your future documents and want to build a great workflow then I strongly suggest buying a document scanner that can scan a whole stack of documents in one go and both sides of each page at the same time.

If you are looking to buy a scanner consider doing it via this affiliate link to support the blog!
Brother ADS-2400N document scanner

I got an ADS-2400N scanner from Brother. It is not a cheap device but so far definitely worth the price. I simply put a couple of documents into the intake, press a button and the documents show up on my Nextcloud a couple of seconds later. More on this setup later. This scanner has an ethernet port and allows scanning to other devices on the network.

Storage system

Once you have scanned your documents it is quite important to store them somewhere safe. I prefer some place controlled by myself for privacy reasons but either way you need backups! At first I put my documents into my Nextcloud system. This is running on my own server (as a Docker setup) and the data is safely stored on a ZFS filesystem with regular snapshots and external backups. There is even an automatic check for missing backups. I would strongly recommend similar care for your documents.

DMS – document management system

Storing your documents in a plain file system, or even something like Nextcloud or Google Drive, may be enough for beginners. If you really plan to use digital document storage I personally suggest using a dedicated system. These come with way more features, especially OCR, which makes the content of your scans searchable, and tagging for your documents.

There are a lot of open source free DMS systems that you can easily install via Docker. I have tried out DocSpell, Paperless(-ng), Papermerge and Teedy. All of them have different features, strengths and weaknesses. I really liked paperless-ng but unfortunately it does not really support multiple users yet. Instead I use Teedy for now (also called sismics docs).

Teedy scans each document I upload, making all of them (including their content) searchable. I can also tag each document (with multiple tags) to put them into a folder-like structure. This results in a super searchable setup. You know who sent you the document? Or just some number in the document? Usually it can be found in seconds.

Scanning to the cloud – automatic upload to Nextcloud

Now that we know the needed components, how do we actually scan documents to our own cloud? My setup works like this: when I get a new document I put it into the Brother scanner, click one button and a couple of seconds later it shows up on my Nextcloud. Now I can view this document either on my phone or my PC. From there I can upload it to Teedy and add relevant tags. That’s it. Now I can always find it in just a couple of seconds while having the document safely stored in my own cloud with a set of automatic backups.

So how does this work? If your scanner supports WebDAV you could automatically upload your scans to your Nextcloud. Mine, like many others, does not. Instead I created a small Samba share on my smart home server. That is a small Raspberry Pi with an external SSD running 24/7. Creating a share can be done like this:

sudo mkdir /mnt/extssd/scans
sudo adduser --no-create-home --disabled-password --disabled-login scanner
sudo chown scanner /mnt/extssd/scans/
sudo smbpasswd -a scanner

This will create a new folder that you will later share, located in /mnt/extssd/scans. It also creates a new user, without a home directory, that owns this folder and has a Samba password. Now we can create the share:

sudo nano /etc/samba/smb.conf

# add this to the end
[scans]
comment = Scans folder
path = /mnt/extssd/scans/
force user = scanner
force group = scanner
writeable = yes
valid users = scanner

# save via Ctrl+O and exit nano via Ctrl+X
testparm # to check for errors
sudo systemctl restart smbd.service # to create the share

Now you should be able to see a new share from another device on your network. You will have to log in with scanner and the password you chose earlier.
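If the other device is a Linux machine, one quick way to check the share is smbclient (on Debian-based systems it comes in the smbclient package). This is only a minimal sketch: the check_share helper is a hypothetical wrapper I made up for illustration, and homeserver, scans and scanner are simply the names used in this post.

```shell
# Hypothetical helper: list the contents of an SMB share to confirm that
# the login works. smbclient prompts for the samba password set earlier.
check_share() {
    host=$1
    share=$2
    user=$3
    smbclient "//$host/$share" -U "$user" -c 'ls'
}

# Example call (names taken from this post, adjust to your setup):
# check_share homeserver scans scanner
```

If the login works you should get a (probably empty) directory listing back instead of an error.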

Next you need to set up your scanner to scan to this share via SMB (Samba). I opened the web interface for mine and created a profile to scan to \\homeserver\scans, using scanner as the user name combined with the SMB password. Homeserver is the name of my Raspberry Pi on the network; you can of course also use the IP address.

[Image: Web interface for my scanner – adding a network target (scan to the SMB share)]

Next I needed a way to automatically upload all the documents in this scans folder to my (external) Nextcloud instance. While this would be easy with the Nextcloud sync client for desktops there is no equivalent with instant automatic upload on the command line. The only solutions I found were using regular checks with a cron job.

Instead I came up with a better solution. I use the inotify tools on Linux to monitor the folder and get notified instantly when new files are added. Then I use rclone to move each new file to my Nextcloud instance. All of this is done by a systemd service that runs a bash script. To make this process easier in the future I uploaded all the needed code to a GitHub repository.
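To give a rough idea of the approach, here is a minimal sketch of such a watch-and-upload loop. It is not the script from the repository, just an illustration under a few assumptions: the remote nextcloud:Scans is a hypothetical rclone remote and target folder, the function name watch_and_upload is made up, and I react to the close_write and moved_to events here, which may differ from what the real script listens for.

```shell
# Sketch only, NOT the script from the GitHub repository.
# Assumption: REMOTE is a hypothetical rclone remote and target folder.
WATCH_DIR="/mnt/extssd/scans"
REMOTE="nextcloud:Scans"

watch_and_upload() {
    dir=$1
    remote=$2
    # -m keeps inotifywait running forever, -q silences its startup messages.
    # close_write fires when a writer closes the file, moved_to catches
    # files that are renamed into the folder. %f prints only the file name.
    inotifywait -m -q -e close_write -e moved_to --format '%f' "$dir" |
    while IFS= read -r name; do
        echo "New file: $name"
        # rclone move uploads the file and then deletes the local copy.
        # stdin is redirected so rclone cannot swallow the event stream.
        rclone move "$dir/$name" "$remote" < /dev/null ||
            echo "Upload failed: $name" >&2
    done
}

# Start watching (this call blocks until inotifywait is stopped):
# watch_and_upload "$WATCH_DIR" "$REMOTE"
```

Compared to a cron job, the loop only wakes up when the kernel reports a change in the folder, so there is no polling interval to wait for.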

Here is how to set it up (this can also be found in the readme): first install the needed tools via APT

sudo apt install rclone inotify-tools git

Next we need to configure rclone to know where our Nextcloud instance is. You could replace Nextcloud with any other remote supported by rclone, such as S3, Dropbox or Google Drive. Type rclone config and follow the instructions. This will create a file under ~/.config/rclone/rclone.conf.
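Before going on it can help to check that the remote really ended up in the config file. A small sketch, assuming you named the remote nextcloud in rclone config (that name, like the has_remote helper, is only an example and not something the service requires):

```shell
# Hypothetical helper: succeeds if the given rclone config file
# defines a remote with the given name.
has_remote() {
    config=$1
    name=$2
    # "rclone listremotes" prints one remote per line, each ending in ":".
    rclone listremotes --config "$config" | grep -qxF "$name:"
}

# Example (the remote name "nextcloud" is an assumption):
# has_remote ~/.config/rclone/rclone.conf nextcloud && rclone lsd nextcloud:
```

The rclone lsd call at the end lists the top-level folders of the remote, which is a quick way to see whether the credentials work.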

Now you can start installing the service. Download the files from GitHub and enable them as a service by running:

# clone this repository locally
sudo git clone https://github.com/OliverHi/autouploader.git /opt/autouploader

# copy your rclone config
sudo cp ~/.config/rclone/rclone.conf /opt/autouploader/rclone.conf

# and update the user rights
sudo chown scanner /opt/autouploader/rclone.conf

# then install this script as a background service
sudo ln -s /opt/autouploader/autouploader.service /etc/systemd/system/autouploader.service
sudo systemctl daemon-reload

By default the service will use the scanner user we created before. If you want to use a different one you need to update the chown command and the autouploader.service file. You should also update the synchronized path in the autoupload.sh file if you use a different one. If the file is to your liking then enable autostart for the new service and run it via

# start the script at system startup during boot
sudo systemctl enable autouploader.service

# start the script running
sudo systemctl start autouploader.service

# check and make sure the startup worked
sudo systemctl status autouploader.service

# to follow all the logs use
journalctl -u autouploader.service -f

The script should now be running via the service and you should see some output via the journalctl command. You can now manually create a file in the /mnt/extssd/scans folder (or any other you used) or upload one via your scanner. Afterwards you should see some logs from the autouploader service pushing the file to Nextcloud.
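For the manual test, something like the following sketch works. The helper name make_test_scan and the file name pattern are made up for this example; run it as a user that is allowed to write to the folder (for example the scanner user).

```shell
# Hypothetical helper: drop a small, uniquely named text file into the
# watched folder and print its path.
make_test_scan() {
    dir=${1:-/mnt/extssd/scans}
    file="$dir/autoupload-test-$(date +%Y%m%d-%H%M%S).txt"
    echo "autouploader test" > "$file" && echo "$file"
}

# Example: create the file, then watch the journalctl output
# make_test_scan /mnt/extssd/scans
```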

In my case the scanner creates a new file first and then uploads the content bit by bit. inotify makes the script react to that instantly, but the upload fails because the file is still locked while the scanner is writing (and we don’t want half a file uploaded anyway). To work around this issue I changed the parameters for the rclone command in the autoupload.sh file to retry 10 times with a 1 second wait time. For my setup this is always enough to receive and upload the full file. You can play around with these settings if that is different for you.
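A different way to handle this (not what autoupload.sh does, just an alternative sketch) is to wait until the file has stopped growing before starting the upload. The helper below is hypothetical: it polls the file size and reports success once the file is non-empty and two consecutive checks return the same size. The defaults of 1 second and 10 tries simply mirror the retry settings mentioned above.

```shell
# Hypothetical helper: wait until a file is non-empty and its size has
# stopped changing. Usage: wait_until_stable FILE [INTERVAL_SECONDS] [TRIES]
wait_until_stable() {
    file=$1
    interval=${2:-1}
    tries=${3:-10}
    last=-1
    while [ "$tries" -gt 0 ]; do
        # wc -c is used instead of stat because stat's options differ
        # between systems. Fails (return 1) if the file is unreadable.
        size=$(wc -c < "$file") || return 1
        size=$((size + 0))   # strip padding some wc versions add
        if [ "$size" -gt 0 ] && [ "$size" -eq "$last" ]; then
            return 0         # same size twice in a row: looks complete
        fi
        last=$size
        sleep "$interval"
        tries=$((tries - 1))
    done
    return 1                 # still changing (or still empty) after all tries
}

# Example: only upload once the scan looks complete
# wait_until_stable "/mnt/extssd/scans/scan.pdf" && echo "ready to upload"
```

Keep in mind that a stalled transfer also looks stable for a moment, so this is a heuristic and not a guarantee. The retry approach has the advantage that the failed upload itself tells you the file was not ready.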

This system works very well for me and is easily adapted for other cloud services. Feel free to create a GitHub issue or pull request if you run into problems. You will always find the up-to-date version of this service in my GitHub account. There are also instructions on the update process in the readme file.