commented: I've had data corruption when using a third-party vendor just the same as I've had when self-hosting. As far as I'm concerned this is roughly comparable to the time you spend debugging RDS connection limits, working around parameter groups you can't modify, or dealing with surprise maintenance windows. The main operational difference is that you're responsible for incident response. If your database goes down at 3 AM, you need to fix it. But here's the thing: RDS goes down too. And when it does, you're still the one getting paged; you just have fewer tools to fix the problem. That doesn't look like it will ever stop being true…

commented: "None of this is technically complex"? Proper automated failover is quite complex. Luckily you don't implement it yourself for Postgres. We run our own Postgres just fine, but I would love to not have to run MySQL. Cloud vendors typically handle database replication by replicating block devices, which gives them a generic solution that doesn't have to deal with the problems of each database.

commented: For what it's worth, we've been running MySQL clusters on VMs (where we can get local NVMe) for a long time using its replication, and this stuff is now mundane for us. Initial setup isn't fire-and-forget, and if, for example, you don't go in aware of what replication likes and doesn't like (it doesn't like huge transactions or large ALTERs without an online-schema-change tool!), you might learn the hard way (a replica-lag check is sketched below). But those seem like learning and start-up costs more than recurring ones. I haven't looked deeply into it, but it sounds a little like Percona is trying to build an open-source RDS-like thing on top of k8s with their Everest product, which might help with one class of setup work. Postgres is probably the route if you're just deciding what to use--the larger ecosystem seems to have gone that direction--but given my past experience I'd probably personally start a thing with MySQL again if I were starting over. I mostly agree with the OP: the work that is specific to running our own instances happens, but it is occasional and very manageable. (And the local-NVMe perf is sure nice.) Most of our database-related work is the stuff that you'd have to do no matter how you hosted: looking at what's inefficient and improving it, once every N years making sure your app remains happy across a major-version upgrade, things like that. Of course, saying something works well is always calling down the wrath of the ops gods, and I'm knocking wood just writing this. But empirically it has been working pretty well for us!

commented: Does something like https://pigsty.io/ help with that?

commented: There's also https://github.com/patroni/patroni which only does the HA part, without being a full-on Postgres distro.

commented: Perhaps this ZFS backup strategy from 2022 can help as well, as long as there is an equivalent to pg_start_backup/pg_stop_backup in MySQL? (The Postgres side of that pattern is sketched below.)

commented: So do it as well: use shared storage with dual controllers and fail over the whole DB VM. Use a replica as manual failover in case things go wrong with the whole primary cluster.

commented: It may be only a couple of hours a month or so of maintenance, but it's the task switching and the minutiae knowledge that kill you in these routines. You look at it so infrequently that actually changing anything becomes a behemoth task, in my experience, as you have to refresh yourself on the implementation details.
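The MySQL comment above mostly amounts to keeping an eye on replication health. A minimal sketch of such a replica-lag check, assuming pymysql; the host, user, and password are placeholders, and the statement and column names differ between MySQL 8.0.22+ (SHOW REPLICA STATUS) and older releases (SHOW SLAVE STATUS):

```python
# Minimal replica-health check, assuming pymysql and a monitoring account;
# the host, user, and password below are placeholders.
import pymysql

conn = pymysql.connect(
    host="replica.internal",   # hypothetical replica host
    user="monitor",            # hypothetical monitoring account
    password="change-me",
    cursorclass=pymysql.cursors.DictCursor,
)

try:
    with conn.cursor() as cur:
        try:
            cur.execute("SHOW REPLICA STATUS")   # MySQL 8.0.22+
        except pymysql.err.ProgrammingError:
            cur.execute("SHOW SLAVE STATUS")     # older servers
        row = cur.fetchone()
finally:
    conn.close()

if row is None:
    print("This server is not configured as a replica.")
else:
    # Column names depend on the server version, hence the fallbacks.
    io_running = row.get("Replica_IO_Running", row.get("Slave_IO_Running"))
    sql_running = row.get("Replica_SQL_Running", row.get("Slave_SQL_Running"))
    lag = row.get("Seconds_Behind_Source", row.get("Seconds_Behind_Master"))
    print(f"IO thread: {io_running}, SQL thread: {sql_running}, lag: {lag}s")
```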
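On the ZFS comment: on the Postgres side, the pattern it refers to is bracketing the filesystem snapshot with the low-level backup API. A rough sketch, not the linked article's exact procedure, assuming PostgreSQL 15+ (where the functions are pg_backup_start/pg_backup_stop; older releases use pg_start_backup/pg_stop_backup), psycopg2, and a hypothetical ZFS dataset name:

```python
# Sketch: take a ZFS snapshot of the data directory while Postgres is in
# backup mode. Assumes PostgreSQL 15+ and psycopg2; the DSN and dataset
# name are placeholders. The same session must stay open between start
# and stop.
import subprocess
import psycopg2

DATASET = "tank/pgdata"                     # hypothetical ZFS dataset
conn = psycopg2.connect("dbname=postgres")  # placeholder DSN
conn.autocommit = True
cur = conn.cursor()

cur.execute("SELECT pg_backup_start(%s, true)", ("zfs-snapshot",))
try:
    subprocess.run(["zfs", "snapshot", f"{DATASET}@pg-backup"], check=True)
finally:
    # pg_backup_stop returns the backup_label contents; store them alongside
    # the snapshot so they can be placed in the data directory on restore.
    cur.execute("SELECT labelfile FROM pg_backup_stop(true)")
    label = cur.fetchone()[0]
    print(label)
```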
commented: I find what the author says about a «similar amount of maintenance to what interfacing with RDS requires» believable, and then the exact same issue applies equally on both sides.

commented: Yeah, it also feels like the author is significantly more knowledgeable about (and therefore confident in) his Postgres management skills. He knows the failure modes and how to recover from them. He knows how failover works and how to set up Patroni or whatever. He knows how to configure backrest and has practiced recovery. I don't know that much and don't have that confidence. And I'm sure I could learn it; it's probably not all that much, and I'm sure I could find some good content on learning all of it. On the other hand, my company can pay Google perhaps tens or hundreds of dollars per month (on top of the cost of the underlying instances) to manage it for me, which is a rounding error on our cloud bill, and instead I can either help increase revenue or decrease costs by some figure several orders of magnitude larger than what I would save running Postgres myself. We have people who write a single bad BigQuery query that wastes more money than I would save running Postgres myself. It just isn't worth my time right now, and that's probably true for many; if we get to an aggressive cost-optimization phase and that's the next biggest bang/buck, then we'll tackle it, but for now it would be the wrong move. Also worth noting that it's never just "running it myself": I also have to teach my team how to do that work. I also don't buy the argument that if RDS goes down you still have to deal with it. No I don't: Amazon deals with it. I maybe have to do a bit of communication with stakeholders that RDS went down, but it's much less work and much less stress than fixing it myself. More importantly, if I run the database myself and things break, it's my fault, but if RDS goes down, my stakeholders and their stakeholders are understanding (maybe it shouldn't be that way, but that's the world we live in).

commented: Great article. One thing seems to be (partially) missing: patching the VM/server OS (along with dist-upgrades when the current version falls out of support). As one alternative to self-managing, we're currently slicing up managed Postgres instances (from UpCloud), giving our test environments a separate database, user, and schema on a "shared" managed Postgres instance, which makes the "managed services tax" more reasonable. For prod environments it can make sense to pay for one instance per service, because of the "now it's somebody else's problem" feature and because it makes point-in-time recovery trivial: just bring a new service instance online for the recovery. But the general trend of steering users toward one Postgres instance per service does become a little silly when you have important but low-volume services.

commented: Good article! What I'd also be interested in: what are the usual reasons for outages (i.e. for 3 AM pager calls) with a self-hosted Postgres? Are there any common patterns? And is it possible for a non-expert to debug/fix such problems on short notice?

commented: If random people can send SQL to your PG, chances are it's one of those random SQL queries abusing the poor server. If you have a bunch of people sending terrible SQL, you're going to have a bad time. Sometimes it can be locking, if you allow really long-running queries against tables that are being inserted into. As for figuring out whether these are the issues or not, you can query for both of those things (a rough sketch follows below).
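A rough sketch of those two checks (long-running statements and lock waits), assuming psycopg2; the DSN is a placeholder and the five-minute threshold is arbitrary, while pg_stat_activity and pg_blocking_pids() are standard Postgres:

```python
# Sketch: spot long-running statements and queries blocked on locks.
# Assumes psycopg2; the DSN is a placeholder and the threshold is arbitrary.
import psycopg2

LONG_RUNNING = """
    SELECT pid, now() - query_start AS runtime, state, left(query, 80) AS query
    FROM pg_stat_activity
    WHERE state <> 'idle'
      AND now() - query_start > interval '5 minutes'
    ORDER BY runtime DESC
"""

BLOCKED = """
    SELECT pid, pg_blocking_pids(pid) AS blocked_by, left(query, 80) AS query
    FROM pg_stat_activity
    WHERE cardinality(pg_blocking_pids(pid)) > 0
"""

with psycopg2.connect("dbname=postgres") as conn:  # placeholder DSN
    with conn.cursor() as cur:
        cur.execute(LONG_RUNNING)
        for pid, runtime, state, query in cur.fetchall():
            print(f"long-running pid={pid} {runtime} [{state}] {query}")
        cur.execute(BLOCKED)
        for pid, blocked_by, query in cur.fetchall():
            print(f"pid={pid} blocked by {blocked_by}: {query}")
```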
See the PG wiki for queries that test for these. There is also the server administration section of the PG manual: https://www.postgresql.org/docs/current/admin.html Also, it should be noted that a hosted DB instance won't fix any of these issues for you either; they will always be your problem. If it's not one of those, it's probably not PG's issue; it's more likely hardware failure, the OS crashing, running out of disk space, stuff like that.

commented: I was a senior manager in a small company that ran Postgres at its core, and quite honestly the number of Postgres-related after-hours calls we had in the 20+ years we were running could be counted on one hand. One example that stands out due to its severity was an instance that just kept crashing, but it turned out to be a RAID controller flipping bits. While replication etc. is great (and we did it with DRBD for a long time before other techniques came along), PG is rock solid; I'd have no qualms running it solo and just relying on backups for a small project. All of that said, I currently run a smallish instance at Vultr for $WORK. Backups, upgrades, and failover just happen, and I'd need to pay for the CPU and disk anyway. So while I don't think it's hard to run your own PG, it is (was?) non-trivial to set up and maintain that sweet, sweet automation, and there are plenty of cheapish hosting options around that do it all for you.

commented: The benefit of hosted is getting access to closed-source distributed DBs like Aurora and AlloyDB. I never understood the appeal of RDS/CloudSQL.

commented: Fine, we'll self-host RDS Aurora.