
Hosting a Mastodon instance: moving asset storage to S3

I’ve had my Mastodon instance (iowadon.org) up and running for about a week and a half now. It’s not grown very large – 10 users or so last time I checked – but it’s been a great project to learn some more server admin type stuff.

Ten days and ten users in, my Linode instance seems to be quite adequate from a processing standpoint. (I’m using the lowest-level Linode offering, which provides 1 CPU core and 2 GB of RAM.) However, disk space usage is growing. The database itself isn’t overly large, but the cache directory (which stores profile pictures, media attachments, and preview cards) is up over 15 GB. I could get really aggressive about cache management, but realistically a larger cache is a reasonable thing. Mastodon provisions for this by providing easy hooks to use S3 buckets for the cache, so I figured I’d give that a shot.

I found Nolan Lawson’s excellent step-by-step instructions and followed them to a T. Well, almost to a T. I first set up an S3 bucket, kicked off the script to copy the cache over from Linode to S3, then went to bed. The next morning I did some more reading and decided that Linode’s very similar Object Storage service (it’s essentially their S3 clone) might be a better deal cost-wise. Amazon S3 charges a small amount per GB for storage and then a separate rate for data transfer. Linode does it differently: you pay a flat monthly fee for a given bucket size, and a large amount of transfer each month is included for free. Since my server is already on Linode, it was simpler to just use the Linode buckets, so I tried again there.
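The copy job itself is conceptually simple. This is a minimal sketch, not the script from the instructions above: it assumes the cache lives under Mastodon’s usual `public/system/cache` directory, that credentials are in the environment, and that `boto3` (which talks to any S3-compatible endpoint, Linode’s included) is installed. The bucket and endpoint names are just my values.

```python
import os


def iter_cache_files(root):
    """Yield (local_path, object_key) pairs for every file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Object keys mirror the directory layout relative to root.
            yield path, os.path.relpath(path, root).replace(os.sep, "/")


def upload_cache(root, bucket, endpoint_url):
    """Upload every cached file to an S3-compatible bucket."""
    import boto3  # third-party; reads AWS_ACCESS_KEY_ID etc. from the environment

    s3 = boto3.client("s3", endpoint_url=endpoint_url)
    for path, key in iter_cache_files(root):
        # Cached media needs to be publicly readable for clients to fetch it.
        s3.upload_file(path, bucket, key, ExtraArgs={"ACL": "public-read"})
```

Something like `upload_cache("public/system/cache", "assets.iowadon.org", "https://us-southeast-1.linodeobjects.com")` would then do the transfer; in practice a sync tool that skips already-uploaded files is friendlier for re-runs.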

One gotcha that’s not obvious when creating the bucket at Linode: if you’re going to put a custom domain name in front of the bucket, you need to name the bucket that domain name if you want their TLS/SSL stuff to work. In my case, I set up a CNAME record to point assets.iowadon.org at my bucket, so I needed to name my bucket assets.iowadon.org. There’s no renaming buckets, so I had to empty and delete my old one and then create a new one with the correct name. After that the certificate generation went smoothly enough and I once again kicked off the copy job. Then I went to the gym.
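In zone-file terms, the record looks roughly like this. The target here is an assumption on my part: the bucket’s own hostname in my region, which is the usual shape for Linode Object Storage, but yours will differ by bucket name and region.

```
; CNAME from the custom subdomain to the bucket's hostname
assets.iowadon.org.  300  IN  CNAME  assets.iowadon.org.us-southeast-1.linodeobjects.com.
```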

A couple hours and 68,000 file copies later, my cache was in the bucket, and a quick restart of Mastodon via docker-compose pulled in the configuration updates that now point to the cloud. It went amazingly smoothly.

Edit: I posted this a little too soon…

All the existing assets were working fine, but new assets weren’t loading properly. Commence some more googling. The correct answer was that in addition to the .env.production settings listed in the instructions above, you also need this one:

S3_ENDPOINT=https://<linode_region>.linodeobjects.com

In my instance, that looked like this:

S3_ENDPOINT=https://us-southeast-1.linodeobjects.com/

Now it seems to be fully working.
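For reference, the S3-related block of my .env.production now looks roughly like this. The variable names are Mastodon’s standard configuration settings; the key values are placeholders, and the rest are just my bucket, region, and subdomain.

```
S3_ENABLED=true
S3_BUCKET=assets.iowadon.org
S3_REGION=us-southeast-1
S3_ENDPOINT=https://us-southeast-1.linodeobjects.com
S3_ALIAS_HOST=assets.iowadon.org
AWS_ACCESS_KEY_ID=<access key>
AWS_SECRET_ACCESS_KEY=<secret key>
```

S3_ALIAS_HOST is what makes media URLs use the custom subdomain instead of the raw bucket URL.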

4 Comments

  1. Al

    Thanks for this. I was contemplating spinning up a DO instance and have the feeling much of this is going to apply. That 15G for user files…with ten users… seems like a lot.

    • Yeah, as best I can tell that 15G isn’t just files from my users, but basically a media cache for any media (including avatars) that comes across any of my server’s feeds, including the federated one. I set some timeouts in the administration panel to delete caches after 10-14 days, so we’ll see how well that limits things. I set up a busy relay earlier this week that spices up my federated feed but also upsized my object storage… it’s around 25 GB right now. I assume it will level out at some point. I have a 250 GB limit in the Linode bucket before I have to pay more, so I feel ok about it for the foreseeable future.

    • I didn’t try to do the nginx caching; the Linode object bucket provides the first 1 TB of egress for free every month, so it’s not a cost savings. I do have a subdomain in front of the bucket, so the user sees the subdomain and not the bucket URL. And since it’s the same service provider for both the server and the storage bucket, there’s no additional IP disclosure.
