I like simple.
Memcache is simple.
But sometimes it's maybe too simple.
I needed a substrate for holding session data which was accessible from more than one host (and potentially, more than one application). Given the lack of persistence in memcache I had a look at Couchbase... and found something even more bewildering than Oracle! Redis looked like it wouldn't take me weeks to get running, but while it is frequently touted as a replacement for memcache, I couldn't find any documentation stating whether it had a binary-compatible memcache interface. So back to memcache.
First problem: I want high availability - that means more than one instance. Oh, C(r)AP! Sharding is easy enough, but replication needs a bit more work. A bit of digging and I found mcrouter which, along with haproxy, means I can have load balancing and failover. But recovery is still missing: an instance that restarts comes back empty.
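To make the gap concrete, here is a toy sketch of sharding, replication and failover using in-memory stand-ins. It is not how mcrouter or haproxy are implemented; `FakeCache` and the three helper functions are hypothetical names invented for illustration.

```python
import zlib


class FakeCache:
    """In-memory stand-in for one memcache instance (hypothetical)."""

    def __init__(self):
        self.data = {}
        self.up = True

    def restart(self):
        # Memcache has no persistence, so a restart comes back empty.
        self.data = {}
        self.up = True


def shard_set(servers, key, value):
    # Sharding: each key lives on exactly one instance, chosen by hash.
    index = zlib.crc32(key.encode()) % len(servers)
    servers[index].data[key] = value


def replicated_set(servers, key, value):
    # Replication: write the item to every instance that is up.
    for server in servers:
        if server.up:
            server.data[key] = value


def failover_get(servers, key):
    # Failover: read from the first instance that is up, and only that one.
    for server in servers:
        if server.up:
            return server.data.get(key)
    return None
```

With two replicas, a read still succeeds while the first instance is down. Once that instance has restarted, though, it is "up" but empty, so the same read misses until something copies the data back in.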
The memcache distribution comes with a Perl script which will copy memcache data from another instance. However, when I tested it, I found the copy operation was very lossy - I was only getting around 70% of the data across to the new instance (HIGHLY variable) when hammering the source with a 1:1 mix of updates and gets. A bit disappointing - but understandable really. Maintaining a consistent list of all known items would add a lot of complexity and performance problems.
After a bit of reading I implemented and tested my own script using lru_crawler. Although still not perfect, in testing it was copying more than 99% of the data across, again while the source was getting hosed. This is now available at https://github.com/symcbean/mcseed/blob/main/mcseed.php
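The general idea can be sketched roughly as follows. This is not the code from mcseed.php. It assumes the key list comes from memcached's `lru_crawler metadump all` command, which prints one `key=... exp=... la=...` line per item with URL-encoded keys and finishes with `END`. The `fetch` and `store` callables are hypothetical placeholders for whatever client talks to the source and target instances.

```python
from urllib.parse import unquote


def parse_metadump(lines):
    """Yield (key, exp) pairs from metadump output lines, stopping at END."""
    for line in lines:
        line = line.strip()
        if line == "END":
            break
        if not line.startswith("key="):
            continue
        # Each line is a run of space-separated name=value fields.
        fields = dict(f.split("=", 1) for f in line.split(" ") if "=" in f)
        # Keys are URL-encoded in the dump; exp is -1 for "never expires".
        yield unquote(fields["key"]), int(fields.get("exp", "-1"))


def seed(lines, fetch, store):
    """Copy every listed key from source to target; return (copied, missed)."""
    copied = missed = 0
    for key, exp in parse_metadump(lines):
        value = fetch(key)
        if value is None:
            # The item expired or was evicted between the dump and the read.
            missed += 1
            continue
        store(key, value, exp)
        copied += 1
    return copied, missed
```

The `missed` count is where the imperfection shows up: the key list is only a snapshot, so on a busy source some items will have gone by the time they are read, and items created after the dump are not seen at all.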
I disabled the package's systemd unit file and created my own, which uses a shell script to:
- block external access to the memcache and mcrouter ports
- start the memcache binary
- run mcseed to populate the cache
- allow incoming traffic to the mcrouter and memcache ports again
Job done.
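The real thing is a shell script. Purely as an illustration of the ordering, the same sequence might be sketched like this. It assumes iptables is the firewall in use, which is a guess. The function names are hypothetical, and the ports, start command and seed command are left to the caller because they depend on the local setup.

```python
import subprocess


def firewall_rule(action, port):
    # action is "-I" to insert the blocking rule or "-D" to delete it again.
    # Loopback traffic is exempt so a local seeding run can still connect.
    return ["iptables", action, "INPUT", "!", "-i", "lo",
            "-p", "tcp", "--dport", str(port), "-j", "REJECT"]


def startup_plan(ports, start_cmd, seed_cmd):
    """Return the commands in the order the startup script runs them."""
    plan = [firewall_rule("-I", port) for port in ports]   # block clients
    plan.append(start_cmd)                                 # start memcache
    plan.append(seed_cmd)                                  # populate the cache
    plan += [firewall_rule("-D", port) for port in ports]  # let clients back in
    return plan


def run(plan):
    # Stop at the first failure so the ports stay blocked if seeding breaks.
    for argv in plan:
        subprocess.run(argv, check=True)
```

The start command has to return once memcache is running (for example by starting it with its `-d` daemon flag), otherwise the seed step never gets a turn. Keeping the plan as data also makes it easy to print and check before running anything as root.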