Install S3FS on Ubuntu 14.04 LTS

S3FS is a virtual filesystem that enables you to mount an Amazon S3 bucket as a folder on your computer or server.

In this article, we'll provide you with a step-by-step guide to installing S3FS on Ubuntu 14.04.

Install the required packages

Execute the following command as root (or via sudo):

apt-get install build-essential libfuse-dev libcurl4-openssl-dev libxml2-dev mime-support automake libtool

Install S3FS

Download and unpack S3FS in a temporary folder. Change the version number if there is a new release. Make sure to execute the last command (make install) as root!

cd /tmp
wget https://github.com/s3fs-fuse/s3fs-fuse/archive/v1.77.tar.gz
mv v1.77.tar.gz s3fs-fuse-1.77.tar.gz
tar zxvf s3fs-fuse-1.77.tar.gz
cd s3fs-fuse-1.77/
./autogen.sh
./configure --prefix=/usr
make
make install
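
If the build and install succeeded, the s3fs binary should now be available in /usr/bin. As a quick sanity check, ask it for its version:

s3fs --version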

Try it out!

First, we need to enter our AWS credentials in a place where s3fs can find them. I prefer to place this file globally in /etc/passwd-s3fs. Just enter your credentials (AWS Access Key ID and Secret Access Key) in the following format:

AWS_ACCESS_KEY:SECRET_ACCESS_KEY

and secure the file against unauthorized access:

chmod 640 /etc/passwd-s3fs
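
For example, with a hypothetical access key ID AKIAEXAMPLEKEY and secret EXAMPLESECRET, the file could be created and secured as root like this:

echo 'AKIAEXAMPLEKEY:EXAMPLESECRET' > /etc/passwd-s3fs   # replace with your own keys
chmod 640 /etc/passwd-s3fs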

Then we can try to mount our bucket in the filesystem. The mountpoint directory has to exist, so create it first (e.g. with mkdir) if needed:

s3fs bucketname /my/mountpoint
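
If the mount succeeded, the bucket's contents should show up at the mountpoint. A quick check, using the example mountpoint from above:

mount | grep s3fs
ls /my/mountpoint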

Ensure the bucket is mounted at boot

To make sure the bucket is mounted again after a reboot, we'll need to edit the /etc/fstab file. Add a line like the one below:

s3fs#bucketname /my/mountpoint fuse allow_other 0 0
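
You can test the fstab entry without rebooting: unmount the bucket and let mount pick the entry up from /etc/fstab (run as root, assuming the example mountpoint above):

umount /my/mountpoint
mount /my/mountpoint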

Optimizing our s3fs

To avoid read/write errors, speed up access to frequently requested files, or reduce costs, there are some interesting options that can be included when mounting our bucket.

retries

The number of times s3fs will retry a failed S3 transaction. By default this is set to 2, but we can increase it if we run into errors.

To use this on the command line:

s3fs -o retries=5 bucketname /my/mountpoint

To use this in /etc/fstab:

s3fs#bucketname /my/mountpoint fuse retries=5,allow_other 0 0

use_cache

By adding this parameter, we can use a local folder to cache the files that are written to and retrieved from S3. This can really speed things up (and reduce costs) if you frequently request the same files.

To use this on the command line:

s3fs -o use_cache=/tmp_bucket bucketname /my/mountpoint

To use this in /etc/fstab:

s3fs#bucketname /my/mountpoint fuse use_cache=/tmp_bucket,allow_other 0 0
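
One thing to watch out for: make sure the cache folder exists (and is writable by the mounting user) before mounting. With the example path used above:

mkdir -p /tmp_bucket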

use_rrs

This option stores objects using Reduced Redundancy Storage, which - if your use case permits it - will give you a considerable cost benefit.

To use this on the command line:

s3fs -o use_rrs=1 bucketname /my/mountpoint

To use this in /etc/fstab:

s3fs#bucketname /my/mountpoint fuse use_rrs=1,allow_other 0 0

Our setup

We use S3 buckets continuously and really like to have them available at boot, so we added the following line for each bucket to our /etc/fstab:

s3fs#bucketname /my/mountpoint fuse retries=5,use_cache=/tmp_bucket,use_rrs=1,allow_other 0 0

Pitfall

Using a local cache is nice: it speeds things up and reduces costs, but s3fs doesn't manage the size of this local cache folder. To limit its size, we can create a cron job that periodically cleans up all cached files that haven't been accessed in the last 7 days and haven't been modified during the last month, by adding the following line to root's crontab:

@daily find /tmp_bucket -type f -atime +7 -mtime +30 -exec rm {} \;
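
To add this line, open root's crontab in an editor (as root, or via sudo):

crontab -e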

Do you have other cleanup strategies? Do let us know in the comments!