Sat, 16 Sep 2017

Restic Systems Backup Setup, Part 2 - Running minio under runit under systemd

Part 2 of my series on building a restic-based system backup setup. Part 1 can be found here.

As described in Part 1, my general strategy is to have a centralized backup server at a particular location, running an instance of minio for each server being backed up. In practice, that means I'm going to be running N minio server --config-dir=/... instances, and I want a simple way to add and start instances and keep them running. In essence, I want a simple init service.

Fortunately, if you're looking for a simple init service, you need look no further than runit. It's an incredibly tiny init-like system, composed of some simple tools: runsv to run a service, keep it up, and optionally log its stdout somewhere; sv to control that service by simply talking to a socket; and runsvdir to keep a collection of runsv instances going. Defining a service is simple: a service is a directory containing a run file, which runsv executes to start the service. If you want to log, create a log subdirectory with its own run file — that file is executed and given the stdout of the main process as its input (the included svlogd command is a simple process for handling logs). To run a bunch of runsv instances, put them (or symlinks to them) all in a single directory, and point runsvdir at it. As a bonus, runsvdir monitors that directory, and if a runsv directory is created or goes away, runsvdir does the right thing.

It's an incredibly useful set of commands, and allows you to manage processes fairly easily. In this case, every time I add a machine to this backup scheme, I make an appropriate runsv dir with the correct minio incantation in the run file, and just symlink it into the runsvdir directory. We've been using runit at work for quite a while now in containers, and it's an awesome tool.
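For concreteness, a runsv directory for one of these minio instances might look something like the sketch below. The hostname, port, and paths are invented for illustration; the real ones live in my config tree.

# conf/runit-available/examplehost/run
#!/bin/sh
# redirect stderr into stdout so the log service below captures everything
exec 2>&1
exec minio server --config-dir=/backups/systems/conf/minio/examplehost \
    --address :9101 /backups/systems/data/examplehost

# conf/runit-available/examplehost/log/run
#!/bin/sh
# svlogd handles the log files; -tt prepends UTC timestamps
exec svlogd -tt /backups/systems/logs/examplehost

# 'enable' the service by symlinking it into the directory runsvdir watches
ln -s ../runit-available/examplehost /backups/systems/conf/runit/examplehost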

My newly-minted backup server is running Debian Stretch, which uses systemd as its init system. Creating systemd unit files is still something I have to think about hard whenever I do it, so here's the one I use for runit:

[Unit]
Description=Backup Service Minio Master runsvdir

[Service]
ExecStart=/usr/bin/runsvdir -P /backups/systems/conf/runit/
Restart=always
KillMode=process
KillSignal=SIGHUP
SuccessExitStatus=111
WorkingDirectory=/backups/systems
User=backups
Group=backups
UMask=002

[Install]
WantedBy=multi-user.target
    

Here, systemd starts runsvdir, pointing it at my top-level directory of runsv directories. It runs it as the backups user and group, and makes it something that starts up once the system reaches "multi-user mode".
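With the unit dropped into place (I have it saved as something like /etc/systemd/system/backup-minio-runsvdir.service; the name is arbitrary), the rest is the usual systemd dance:

systemctl daemon-reload
systemctl enable backup-minio-runsvdir.service
systemctl start backup-minio-runsvdir.service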

Part 3 is coming, where I'll document backing up my first system.

Posted at: 18:41 | category: /computers/backups/restic-systems-backups | Link

Current PGP Practices: GPG 2.1 and a Yubikey 4

I might write this up as a full tutorial someday, but there's already a few of those out there. That said, here's a short outline of my current usage of PGP, aided by modern GPG and the OpenPGP smartcard functionality of a Yubikey 4.

Posted at: 14:55 | category: /computers/gpg | Link

Sat, 09 Sep 2017

Restic Systems Backup Setup, Part 1

This is the first in what will undoubtedly be a series of posts on the new restic-based system backup setup.

As I detailed earlier this week, I've started playing around with using restic for backups. Traditionally, I've used a variant of the venerable rsync snapshots method to back up systems, wrapped in some python and make, of all things. Some slightly younger scripts slurp everything down to a machine at home so I've got at least another copy of everything.

In my previous post, I discussed my initial attempt at restic, simply replicating that home backup destination into Backblaze B2. That works, but it feels a bit brute-force, and there have been other things I've wanted to change about this for a while:

Replicating from colo to home takes an order of magnitude longer: Backing up the ten or so VMs I have on my colo machine takes about 10 minutes. Pulling that down to home takes 100 minutes or so. (I'll note here that the bulk of my 'large' data is in AFS; what I'm backing up on systems is primarily configuration files, logs, and some things that happen to live locally on a system).

Some of this is due to the fact that the replication traffic goes from Michigan to New York, while the initial backups are all happening within the same physical host. But the larger part, I think, is due to the fact that in order to replicate my system backups, I have to preserve hardlinks. A bit of background here: the 'rsync snapshots' method works by using the --link-dest option to rsync. As I back up a system, if a file hasn't changed, rsync makes a hardlink to the corresponding file in the --link-dest directory. This doesn't use any additional space, and it's an easy way of keeping, say, fourteen days worth of backups while only using more space for the files that change from day to day. Most of my systems keep that many days of backups around.

Since I want to replicate all of those backups (and not, for example, only replicate the latest day's worth of backups), but I want to keep the space savings that --link-dest gets me, I need to use the -H argument to the replicating rsync so it can scan all the files to be sent to find multiply hard-linked files. This takes a long, long time — so much so that the rsync man page warns about it:

Note that -a does not preserve hardlinks, because finding multiply-linked files is expensive. You must separately specify -H.
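To make that concrete, the two rsync invocations look roughly like this (hostnames and paths invented for the example): the nightly per-system run hardlinks unchanged files against the previous snapshot, and the replication run needs -H to rediscover all of those hardlinks.

# nightly backup of one system, hardlinking unchanged files against yesterday
rsync -a --link-dest=/backups/systems/web1/2017-09-15 \
    root@web1:/ /backups/systems/web1/2017-09-16/

# replicating the whole tree home; -H preserves the hardlinks, slowly
rsync -aH colo:/backups/systems/ /home/backups/systems/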

The backing-up or replicating rsync must run as root: Of course the rsync on the machine being backed up must run as root; it needs to be able to read everything being backed up. But the destination side also has to run as root, because I want to preserve permissions and ownership, and only root can do that. I've long wished for an rsync 'server' that spoke the rsync protocol out one side and simply stored everything in some sort of object storage. Unfortunately, the rsync protocol is less a protocol and more akin to C structs shoved over a network, as far as I understand. And the protocol isn't really defined except as "here's some C that makes it go".

Restoring files is done entirely on the backup server: Because of the previous issue, I didn't want root on the client servers to ssh in as root on the backup server — I felt it was much safer and easier to isolate backups by having the backup server reach out to do backups. There's no ssh key on the client to even be able to get into the backup server. It's not a big issue, but if I need to restore a handful of scattered files I've kinda got to stage them somewhere and then get them over to the client system. And because the backup server has a command-restricted ssh key on the client server, it takes some convoluted paths to get stuff moved around.

Adding additional replicas adds even more suck: Adding another replica means another 100 minutes somewhere pulling stuff down. And it also means a full-blown server, someplace where I can run rsync as root, and it's got to be some place I trust. Also, most of the really cheap storage to be found is in object storage, not disks (real or virtual) — part of what attracted me to restic in the first place.

When I started playing with restic, I saw a tool that could solve a bunch of those problems. Today I've been playing around with it, and here are my ideas so far.

Distinct restic repositories: One of the benefits of restic is the inherent deduplication it does within a repo. And if I were backing up a large number of systems, I might save something by only having one copy of, say, /etc/resolv.conf. But really, most of what I'm backing up is either small configuration files or log files, and these days the few tens of gigabytes of backups involved aren't really worth deduplicating across systems. Meanwhile, the largest consumer of backup space for me — stupidly unrotated log files that get a little bit appended to them every day — would still benefit from the deduplication, even if it's only deduplicating on a single system.

More important than that, however, is that I want isolation between my systems. For example, the backups of my kerberos kdc are way more important than, say, web server logs. And I really don't want something running on a public-facing system to be able to see backups for an internal system. So, distinct repositories.

Use minio as the backend: My first thought when I was going to experiment was to use the sftp backend to restic. But to isolate things fully, I'd have to make a distinct user on the backup server to hold backups for each client, and that sounds like too damn much work.

Unrelated, I've been playing around with minio. Essentially, it's about the simplest thing you can get that exposes the 90% of S3 that you want. "Here's an ID and a KEY, list blobs, store blobs, get blobs, delete blobs". Because it's very simple, it doesn't offer multi-tenancy, so I will have to run a distinct minio for each client. That said, I think that should be easy enough, especially if I use something like runit to manage all of them.
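restic talks to minio via its s3 backend, so once a per-host instance is up, pointing a client at it is about this simple (the endpoint, bucket, and credentials here are placeholders):

# credentials for this host's dedicated minio instance
export AWS_ACCESS_KEY_ID=examplehost-access-key
export AWS_SECRET_ACCESS_KEY=examplehost-secret-key

# create the repository, then back up into it
restic -r s3:http://backupserver:9101/examplehost init
restic -r s3:http://backupserver:9101/examplehost backup /etc /var/log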

Benefit from the combination of minio and restic for replication: Minio is very simplistic in how it stores objects: some/key/name is stored as the file /top/of/minio/storage/some/key/name. This has two benefits: first, because the minio storage directory is also a restic repository, I can just point a restic client at that directory, and as long as I have a repository password, I can see stuff there. Second, every file in the restic repository other than the top level 'config' file is named after the sha256 hash of the file as it exists on disk, and all files in a repository are immutable. This makes it trivial to copy a restic repository elsewhere. While I'll likely start by simply using the b2 command line tool to sync things into B2, I think you can do it even faster. I haven't looked deeply, but my gut feeling is that the b2 sync command looks at the sha1 hash of the source file to decide if it needs to re-upload a file that exists already in B2. We don't need to do that at all; repository files are named after their sha256 hash, so if the files have the same name, they have the same contents [0]. So moving stuff around is incredibly trivial.
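So the naive version of that replication is a one-liner with the b2 CLI; something like this, with a made-up bucket name:

b2 sync /backups/systems/data/examplehost b2://examplehost-backups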

Future niceties. I've got a bunch of other ideas floating around in the back of my head for restic. One is a repository auditing tool: since nearly everything in restic is named for the sha256 hash of the file content, I'd like a tool I could run every day that would pull down, say, 1/30th of the files in the repository and run sha256 on them, to make sure there's no damage.
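A rough sketch of what I have in mind, assuming the repository sits on a plain filesystem at a made-up path: sample a random thirtieth of the content-addressed files, hash them, and complain if the name and hash disagree.

#!/bin/bash
# spot-check roughly 1/30th of a restic repository's content-addressed files
REPO=/backups/systems/data/examplehost    # hypothetical repository path
find "$REPO"/data "$REPO"/index "$REPO"/snapshots -type f | while read -r f; do
    (( RANDOM % 30 == 0 )) || continue    # sample about one file in thirty
    sum=$(sha256sum "$f" | awk '{print $1}')
    [[ "$sum" == "$(basename "$f")" ]] || echo "MISMATCH: $f"
done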

The second is some way of keeping a local cache of the restic metadata so operations that have to read all of that are much faster. Third, and related, a smarter tool for syncing repositories. For example, I'd love to, say, keep three days of backups in my local repository, be able to shove new things to an S3 repository while keeping seven days there, and shove things into B2 and keep them there until my monthly bill finally makes me care.

Anyways, this has been a brain dump of a few hours of experimentation, so I'll end this part here.

Posted at: 19:12 | category: /computers/backups/restic-systems-backups | Link

Mon, 04 Sep 2017

Techno Housekeeping

A holiday weekend (here in the US) combined with a few strategic days off gave me a long, five-day weekend. A few of those days I managed to get out of the house and down to a coffee shop, so I got a bit of work in, and managed to wrap up a bunch of techno housekeeping.

First, with a new laptop and a fresh VM install of Debian 9, I've got all the components in place to reach my ideal PGP setup ‐ my day-to-day keys are on a Yubikey 4, ssh can now forward unix domain sockets, and gpg has well-defined socket locations for the agent that deals with keys. Any key operations on the remote VM tunnel back through ssh to the gpg agent running on my laptop, which passes them along to the Yubikey. PIN protected, touch required for operations, and the key material never leaves the Yubikey. This gives me a deeply warm and fuzzy feeling inside. In a year or so, when I build a new colocation box, my key material won't ever touch it.
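The heart of the forwarding piece is an ssh RemoteForward of unix domain sockets, wiring the agent socket on the VM back to the 'extra' socket on the laptop. Roughly (the socket paths are whatever gpgconf --list-dir agent-socket and agent-extra-socket report on each end; uid 1000 is just what my machines happen to use):

# ~/.ssh/config on the laptop
Host devvm
    RemoteForward /run/user/1000/gnupg/S.gpg-agent /run/user/1000/gnupg/S.gpg-agent.extra

# /etc/ssh/sshd_config on the VM, so a stale forwarded socket gets replaced
StreamLocalBindUnlink yes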

The info for this is spread out in a few places; perhaps soon I'll put it all together, at least the parts I actually use.

Attempting to straighten out the mess of cables under the TV at home caused me to plug the wrong power adapter back into the USB3 drive hanging off the NUC I use as the secondary site for backups of the colo machine, which sent the drive into the afterlife. A spare drive and 24 hours later, I had all the material re-synced, but it gave me the gumption to start throwing together a plan to shove those backups into at least a third location. I've been doing backup stuff long enough in my career to definitely not trust stuff backed up to only two different locations, and to cast a very wary eye on stuff not backed up to at least three.

I'd been wanting to use the Backblaze B2 storage since I first heard about it. After fooling around with it, it's nowhere near as full featured as S3, which I've used a decent amount, but it works and you certainly can't beat the price. After coming across Filippo Valsorda's review of restic, circumstances aligned and I started shoving copies of my AFS volume dumps into B2, encrypted and tracked with restic. Things are slowly bubbling up, which I attribute to the fact that it's not the world's beefiest USB drive setup. After that's up, I'll send a copy of all my system backups there ‐ I've been using a venerable rsync backup script for over a decade now (I just checked the date in the script header). And, with a new laptop, I have a new drive on the way to use for Carbon Copy Cloner, but, owing to this new allegiance to the "at least three sites" mantra, I'll probably be shoving that into restic as well.
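For the curious, the restic-into-B2 part is pleasantly short; something like this, with the bucket, credentials, and dump directory all invented for the example:

export B2_ACCOUNT_ID=000123456789
export B2_ACCOUNT_KEY=example-application-key
restic -r b2:example-afs-dumps:volumes init
restic -r b2:example-afs-dumps:volumes backup /srv/afs-dumps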

That said, I'm also increasingly coming to the opinion that if you use any cloud service, you should use at least two distinct ones. So, depending on what my B2 bill is like, I may end up shoving restic somewhere else as well, perhaps S3 shoved into Glacier.

Posted at: 21:46 | category: /random/2016a/09 | Link

Sat, 19 Aug 2017

Yuri on Ice Cosplay Skates

Bae and I both got addicted to Yuri on Ice when it came out, and when picking a costume for Flame Con, bae picked Yuri. He wanted to have ice skates in the costume, and so I put a bunch of thought into how we could make ice skates something that would be walkable.

Eventually I decided that I'd embed the blades of ice skates in a plastic resin block with some sort of sole attached to it. Photos of the process and some notes can be found here.

Other notes: I used "Castin' Craft Clear Polyester Casting Resin" as the resin. I mixed it up in roughly 32-ounce batches with the appropriate amount of catalyst. I didn't wait for the resin to start setting before pouring a new batch; it's just that I only had a couple of disposable 32-ounce mixing cups to work with. The resin cured overnight. Note: this stuff stinks to high heaven; even working by a window with a large box fan sucking air out, the apartment smelled like a plastics factory, and did so for a few days after.

In my mind, Dick Blick would have had some sort of dense rubber foam that I could cut and use as the sole of the skates. They didn't, so (as you can see in the photos) I used some felt as an interface layer; the resin bonded quite strongly to that. For the sole itself, I bought some of those large rubber floor tiles that you see in gyms or kid playrooms and used them as raw material. Perhaps not quite as grippy as I'd like, but they worked well enough and added a bit of cushion. I used several coats of 3M Super 77 spray adhesive on both the felt and the foam, letting it dry to a tacky touch before mating them together and weighing things down. After letting it sit overnight, I trimmed the foam rubber to match the plastic.

Worked well enough that bae won a prize for shoes at the cosplay competition.

Posted at: 23:07 | category: /making | Link

Fri, 28 Jul 2017

Issues with the $169 Chromebook

tl;dr: If you're trying to follow Kenn White's My $169 development Chromebook and the Google account you're using on the Chromebook is associated with a Google Apps For Your Domain domain, there will be ... issues. You'll quickly discover that at the "Turn on the Play Store" step, doing that for GAFYD domains is controlled by your domain administrator. I happen to be my domain administrator, and I quickly fell into a morass of device management and device enrollment and licenses and and and.

An update once I figure it out.

Posted at: 21:11 | category: /computers/chromebook | Link

Mon, 29 May 2017

Dear Google Recruiters

Hi! You, or one of your colleagues, has decided to recruit me for Google. Typically, I've been reluctant to consider Google as part of my career path, but I thought I'd give you folks a chance. But first, a story.

Back in the Mists of Time(TM) (July 2006) I created a YouTube account at youtube.com/users/tproa/. Then Google, starting along the path to becoming the computing behemoth we think of today, bought YouTube. For the longest time, I resolutely refused to associate a Google identity with YouTube, logging in with the account name "tproa" for years, until I finally gave in (I think when you folks made it near impossible not to) and associated it with "kula@tproa.net".

Then you folks unleashed Google+ on the world, and I'm pretty sure I refused to tie that to my YouTube videos at all.

Then we come to 2015. I've decided that running my own imap service just isn't as much fun as it used to be, and I moved my primary domain, tproa.net, to be a Google Apps domain. Of course, when I did that, I had to rename the former Google identity of 'kula@tproa.net' so I could have it in my new GAFYD, so I renamed it 'old-kula@tproa.net', and made 'kula@tproa.net' one of the accounts in my GAFYD.

When I did that, suddenly all videos associated with youtube.com/users/tproa/ vanished, and I couldn't see them logged in as either old-kula@tproa.net or kula@tproa.net. Much sadness ensued — no longer would I be able to see the wondrous thundercloud formation outside of Ann Arbor, or the ad-hoc tire repair at bike polo, or making coffee with a Chemex at Ugly Mug. Google, in all of its wisdom, has essentially no support, even when I'm a paying customer, so those videos have been stuck, somewhere in the aether, unwatched, unloved.

So here's where you come in. If you can get my videos back, then you'll be the Recruiter at Google who Got Me to Interview.

Posted at: 09:47 | category: /random | Link

Thu, 03 Nov 2016

'Zero-Factor' Apps

I'm at Container Days NYC 2016 and during the OpenSpaces kick-off session I might have invented the term 'zero-factor' apps.

A play on the Twelve-Factor App methodology, 'zero-factor' might be considered things that are basically the opposite of whatever twelve-factor is. I thought of it like this:

If I were going to start something new now, I'd likely do twelve-factor or something very akin to it. But I'm stuck with legacy apps that aren't going to get much (if any) love any time soon, or where the process of getting them there is going to take a lot of time — they're 'zero-factor'.

In the meantime, however, what strategies can we come up with to help get some of the advantages of containers (primarily, in my mind, "Here's a blob that contains this shitty thing, all I have to deal with is shoving this blob (the container) around") during this transition?

Posted at: 13:33 | category: /computers/containers | Link

Fri, 30 Sep 2016

Using minicom with the FTDI friend

For ad-hoc quick usage I most often use screen /dev/somedevice baudrate for serial things, but for real usage, I prefer minicom. Mostly because I typically want my stuff running under screen, screen-in-screen makes my head hurt, and when I use that trick I can never remember how to make screen quit.

As I've been doing more with Raspberry Pis, I've gotten a handful of the Adafruit FTDI friends to use as USB to serial adapters. I tried using one tonight, and while I could get output from the Pi booting, I couldn't type anything. I spent a half-hour in vain, swapping out FTDI friends, trying to wire two back to back, etc, until I figured out the trick.

minicom defaults to turning hardware flow control on, but the most common FTDI Friend config out there is three wires only — RX, TX and GND. No hardware flow control lines wired up. Which causes this exact problem. To fix it, hit the minicom control key, then select 'cOnfigure Minicom', 'Serial port setup', and turn off 'Hardware Flow Control'. There doesn't seem to be a way to specify this on the command line, but since I use minicom pretty much only for serial console access these days, I just save the configuration as the default and get on with it.
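For my own future reference, the saved defaults end up in a minirc file, and the relevant knob looks roughly like this (the device and speed are just my usual ones):

# ~/.minirc.dfl (or /etc/minicom/minirc.dfl for a system-wide default)
pu port             /dev/ttyUSB0
pu baudrate         115200
pu rtscts           No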

Posted at: 20:03 | category: /computers/serial | Link

Sun, 10 Apr 2016

The Power of Physical Media

Let me preface this by saying I love digital media. I'm not one of those that grouses about the soullessness of digital music, and I love that in one small physical device I can carry enough text to read to satisfy me for days and music to listen to to satisfy me for weeks.

That said....

Last Friday at work we somehow got talking about the cartoon Powerpuff Girls and somehow came across the fact that the end-credits theme song to the show was performed by the Scottish band Bis. I was convinced I had heard of them somewhere, although I thought they were the house band on some late 1990s/early 2000s television show. In looking them up, however, I came across an image of one of their early albums, The New Transistor Heroes.

I was taken aback, since I own that album but hadn't even thought of it in probably a decade. That prompted me this evening to dig out the two physical boxes of CDs that I still own, and dig through them, both to find that album and to see what other gems were lurking around unthought of.

Two things became readily apparent. First, I had some dubious taste in music between, say, 1998 and 2003. Then again, those were interesting times, and who didn't? Second, there are some amazing gems in there, stuff I hadn't digitized and so haven't thought of in ages. And that's where the joy of physical media came through. Several of the CDs I dug up brought back vivid memories, way more than scrolling past them in a playlist ever does. A random CD I bought in Portland, Oregon. The off-brand chain bookstore in Ames that was mediocre but strangely had a really good local music section.

Posted at: 22:51 | category: /music | Link