        _______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                              on Gopher (unofficial)
       
       
       COMMENT PAGE FOR:
  HTML   Benchmarking Postgres 17 vs. 18
       
       
        fourseventy wrote 11 hours 27 min ago:
         I'm literally in the middle of upgrading my prod db to pg18. It's
         about 6 TB and serves a few thousand queries per second. Should I be
         considering running in 'worker' mode instead of 'io_uring'?
       
          spprashant wrote 5 hours 22 min ago:
           For an upgrade, which carries enough risk as it is, I would keep
           the number of variables low. Once upgraded and stable, you can
           replicate to a secondary instance with io_method switched and test
           on it before switching over.
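           
           If you do try the switch, io_method is the relevant GUC (values:
           sync, worker, io_uring) and it only takes effect after a restart.
           A minimal sketch, assuming psql access to the standby; the data
           directory path is illustrative:
           
             psql -c "ALTER SYSTEM SET io_method = 'io_uring';"
             pg_ctl restart -D /path/to/data   # io_method needs a restart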
       
          parthdesai wrote 11 hours 11 min ago:
          Why would you migrate your prod db if you aren't sure of all the
          changes and which config params to use?
       
        samlambert wrote 12 hours 8 min ago:
        While this post is here I'd like to call out that Vitess for Postgres
        is coming
        
  HTML  [1]: https://www.neki.dev/
       
        nodesocket wrote 13 hours 37 min ago:
         I'm currently running PostgreSQL in Docker containers using
         bitnami/postgresql:17.6.0-debian-12-r4. As I understand it, Bitnami is
         no longer supporting or updating their Docker containers. Any
         recommendations on an upgrade path to PostgreSQL 18 in Docker?
         
         A quick glance at the official postgres image shows that
         POSTGRESQL_DATABASE becomes POSTGRES_DB. The other issue is that the
         volume mount path is currently /bitnami/postgresql.
       
          makkes wrote 12 hours 44 min ago:
           Either do a proper upgrade with backup/restore, or use `PGDATA`[1]
           and `pg_upgrade`[2].
          
  HTML    [1]: https://hub.docker.com/_/postgres#pgdata
  HTML    [2]: https://www.postgresql.org/docs/current/upgrading.html#UPGRA...
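           
           A rough sketch of the backup/restore path between containers;
           container names, password, and image tag are illustrative:
           
             # dump everything from the old (Bitnami) container
             docker exec old_pg pg_dumpall -U postgres > dump.sql
             # start a fresh official 18 container with its own volume
             docker run -d --name pg18 -e POSTGRES_PASSWORD=secret \
               -v pgdata18:/var/lib/postgresql/data postgres:18
             # restore into the new container
             docker exec -i pg18 psql -U postgres < dump.sql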
       
        novoreorx wrote 15 hours 27 min ago:
         The charts look beautiful; I wonder which library they use.
       
          miklosz wrote 10 hours 37 min ago:
          Seems it's Recharts.
       
        cowsandmilk wrote 15 hours 38 min ago:
        Where are the error bars? I don’t get why people run all these tests
        and don’t give me an idea of standard deviation or whether the
        differences are actually statistically significant.
       
        jackdoe wrote 16 hours 31 min ago:
         > IOPS: 3,000
         > IOPS: 300,000 for $551 per month
        
        the cloud is ridiculous.
        
         just for reference, with 4 consumer NVMes in RAID 10 on PCIe x16 you
         can easily do 3M IOPS for a one-time cost of about $1,000
         
         in my current job we constantly have to rethink db queries/design
         because of cloud IOPS, and of course because we have no control over
         the RDS page cache and NUMA.
         
         every time I am woken up at night because a seemingly normal query
         all of a sudden goes beyond our IOPS budget and the WAL starts
         thrashing, I seriously question my choices.
        
        the whole cloud situation is just ridiculous.
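         
         for reference, a minimal fio sketch for measuring random-read IOPS
         on an array like that (device path illustrative; don't point tests
         at a device holding data you care about):
         
           fio --name=randread --filename=/dev/md0 --direct=1 \
               --rw=randread --bs=4k --iodepth=64 --numjobs=8 \
               --ioengine=io_uring --runtime=60 --time_based \
               --group_reporting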
       
          vbezhenar wrote 12 hours 40 min ago:
           Most clouds I've used allow you to create a VM with a local disk,
           and that might be cheaper than a network disk.
       
            jackdoe wrote 10 hours 52 min ago:
             This is the $500-per-month option they describe in the post: the
             magnificent 300,000 IOPS. The network disk is $1,500 for 16,000
             IOPS.
             
             not to mention you then have all the issues people are discussing
             here: managing your own backups, snapshots, replication, etc.
       
          makkes wrote 12 hours 49 min ago:
           Comparing monthly cloud cost with a one-time hardware purchase
           completely ignores the latter's long-term costs: people,
           replacement parts, power, housing, accessories. While I do believe
           you can run your own hardware much cheaper, there's a lot to
           consider before making the decision.
       
          jaza wrote 15 hours 0 min ago:
          You don't pay for RDS because you care about IOPS. You pay for it
          because you want backups and replication to be somebody else's
          problem. And because you (by which I mean probably the MBA management
          above you, rather than you yourself) care about it being an opex
          rather than capex cost, a lot more than you care about how much the
          cost is. And because ISO audit boxes get ticked.
       
            benjiro wrote 12 hours 41 min ago:
            >  You pay for it because you want backups and replication to be
            somebody else's problem.
            
             Or you just use something like CockroachDB, YugabyteDB, etc.,
             which auto-replicate, auto-rebalance if a node goes down, and
             have built-in support for backups to and from S3...
             
             Or, if you're a bit more hands-on, multigres seems to be nearing
             completion ( [1] ) from the guy who made Vitess for MySQL.
             
             The idea that managing hardware and software is hard is silly,
             yet people (mostly managers, it seems) think it's the best
             solution.
            
  HTML      [1]: https://github.com/multigres/multigres
       
              manacit wrote 11 hours 57 min ago:
               I wouldn't say it's nearing completion - it looks like it's in
               the very early stages of development, according to their repo.
               I don't see any evidence they've gotten as far as running a
               single query through it.
               
               Even when it's done, it's going to be a lot of work to run.
               Sure, it's not guaranteed to be hard, but if it's not your core
               business and you're making money, having someone else do it
               gives you time to focus on what matters.
       
            thyristan wrote 13 hours 15 min ago:
            If you want your own hardware to be OPEX, just do leasing. Every
            enterprise hardware seller will make you an offer for that.
       
          Hrun0 wrote 16 hours 27 min ago:
          But now you need someone to deal with the hardware.
       
            lossolo wrote 14 hours 3 min ago:
             You don't need to; just rent dedicated servers. Still 20-50x
             cheaper, problem solved.
       
            layoric wrote 15 hours 32 min ago:
             Working at IT places in the late 2000s, it was still pretty
             commonplace for there to be server rooms. Even for a large org
             with multiple sites hundreds of km apart, you could manage it
             with a pretty small team. And it is a lot easier to build
             resilient applications now than it was back then, from what I
             remember.
             
             Cloud costs are getting large enough that I've got one foot out
             the door and a long-term plan to move back to having our own
             servers and spend the money we save on people. I can only see
             cloud getting even more expensive, not less.
       
              mbesto wrote 10 hours 24 min ago:
              > I can only see cloud getting even more expensive, not less.
              
               When you have three major hyperscalers competing for your
               dollars, this is basically not true and not how markets
               work... unless they start colluding on prices.
               
               We've already seen reductions in web-services prices across the
               three major providers thanks to this competition.
       
              hylaride wrote 14 hours 15 min ago:
               There is currently a bit of an early shift back to physical
               infra. Some of this is driven by costs(1), some by geopolitical
               concerns, and some by performance. However, dealing with
               physical equipment does introduce a different (old-fashioned,
               but somewhat atrophied) set of skills and costs that companies
               need to deal with.
               
               (1) It is shocking how much of the move to the cloud was driven
               by accountants wanting opex instead of capex, who are now
               concerned with actual cash flow and are thinking of going back.
               The cloud is really good at serving web content and storing
               gobs of data, but once you start wanting to crunch numbers or
               move that data around, it gets expensive fast.
       
                unregistereddev wrote 12 hours 0 min ago:
                In some orgs the move to the cloud was driven by accountants.
                In my org it was driven by lawyers. With GDPR on the horizon
                and murmurs of other data privacy laws that might (but didn't)
                require data to be stored in that customer's jurisdiction, we
                needed to host in additional regions.
                
                We had a couple rather large datacenters, but both were in the
                US. The only infrastructure we had in the EU was one small
                server closet. We had no hosting capacity in Brazil, China,
                etc. Multi-region availability drove us to the cloud - just not
                in the "high availability" sense of the term.
       
              ralusek wrote 14 hours 31 min ago:
              And it’ll be so good and cheap that you’ll figure “hell, I
              could sell our excess compute resources for a fraction of AWS.”
              And then I’ll buy them, you’ll be the new cloud. And then
              more people will, and eventually this server infrastructure
              business will dwarf your actual business. And then some person in
              10 years will complain about your IOPS pricing, and start their
              own server room.
       
            jackdoe wrote 16 hours 26 min ago:
             oh no! this is proven to be impossible, no man can tell a
             computer what to do
             
             lspci is written only in the old alchemy books, in the whispers
             of the thrice-great Hermes.
             
             PS: I have personally put out actual fires in a datacenter, and I
             prefer that to this 3,000-IOPS crap.
       
        cheema33 wrote 17 hours 16 min ago:
        The primary lesson I learned here was this:
        
        If you care about performance, don't use network storage.
        
         If you are using a local NVMe disk, then it hardly matters whether
         you run Postgres 17 or 18. Performance is about the same, and
         significantly faster than network storage.
       
          samlambert wrote 12 hours 11 min ago:
           Correct. Network storage is flexible for a variety of use cases;
           that's why PlanetScale supports both.
       
          saxenaabhi wrote 16 hours 54 min ago:
           But it's ephemeral and non-redundant.
           
           Am I correct that using a local disk on any VPS has durability
           concerns?
       
            rcrowley wrote 11 hours 34 min ago:
            RAID isn't the answer, either, for the record. In AWS and GCP, the
            CPU or RAM blowing up will cost you access to that local NVMe
            drive, too, no matter how much RAID you throw at it.
       
            rcrowley wrote 11 hours 35 min ago:
            Yes, a single disk in a VPS or cloud provider has durability
            concerns. That's why EBS and products like it that pretend to be a
            single disk are actually several. Instead of relying on multiple
            block devices, though, we create that redundancy at a higher level
            by relying on multiple MySQL or Postgres servers for durability,
            each with a local NVMe drive for performance.
       
            sgarland wrote 12 hours 6 min ago:
            Yes, it’s the ephemerality that’s the biggest issue.
            Enterprise-grade SSDs are quite reliable, and typically have PLP so
            even in the event of a sudden power loss, any queued writes that
            the drive has accepted - and thus ack’d the fsync() - will be
            written. Presumably you’d be running some kind of redundancy,
            likely some flavor of RAID or zRAID (assuming purely local storage
            here, not a distributed system like Ceph, nor synchronous
            replication).
            
            But in the cloud, if the physical server backing your instance
            dies, or even if someone accidentally issues a shutdown command,
            you don’t get that same drive back when the new instance comes
            up. So a problem that is normally solved by basic local redundancy
            suddenly becomes impossible, and thus you must either run
            synchronous or semi-sync replication (the latter is what
            PlanetScale Metal does), accepting the latency hit from distributed
            storage, or asynchronous replication and accept some amount of data
            loss, which is rarely acceptable.
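             
             In Postgres terms, that choice comes down to a couple of
             settings; a minimal sketch with hypothetical standby names ('ANY
             1' gives the quorum-style, semi-sync-like behavior):
             
               psql -c "ALTER SYSTEM SET synchronous_standby_names = 'ANY 1 (s1, s2)';"
               psql -c "ALTER SYSTEM SET synchronous_commit = 'remote_write';"
               psql -c "SELECT pg_reload_conf();"   # both settings reload live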
       
              pas wrote 8 hours 11 min ago:
               ... sounds like a trivial job for bare-metal instances
               
               and the fact that EC2 local-NVMe encryption keys are ephemeral
               is nice against leaks, but it's not a necessity for other
               clouds (and it's not great for resumability, which can really
               hurt business-continuity scores); for all the money they ask,
               I'd expect to be able to keep it reasonably secure even across
               reboots
       
                BonoboIO wrote 7 hours 19 min ago:
                 Or even a simple bare-metal server that just runs databases,
                 with redundant NVMe SSDs.
       
              samlambert wrote 12 hours 3 min ago:
               Agreed on these trade-offs. We do both synchronous and
               semi-synchronous, depending on whether it's Postgres or MySQL.
       
            samlambert wrote 12 hours 13 min ago:
            we have mitigated the durability concerns in multiple ways.
       
            jascha_eng wrote 14 hours 8 min ago:
               yeah, PlanetScale loves to flex how fast they are, but the main
               reason they're fast is that they run with one less layer of
               abstraction than any other cloud provider, and this does in
               fact have trade-offs.
       
              samlambert wrote 12 hours 16 min ago:
              What is wrong with running without lots of abstractions? We are
              clear about the downsides. The results are clear, you can see the
              customers love it. We run insane amounts of state safely on
              ephemeral compute. It's a flex. All I've seen from Timescale
              people is qqing. Write some code or be quiet.
       
                jascha_eng wrote 11 hours 55 min ago:
                 I'm not criticizing your engineering approach at all. Running
                 everything in one box has its merits, as your benchmarks
                 show, but it's also just not apples to apples: there are
                 other trade-offs, and I'm just glad the community calls that
                 out.
                 
                 Also, hey, this is HN, not Twitter; I think we can be a bit
                 more civil. Not a good look, imo, for a CEO to get this upset
                 over a harmless comment.
       
                  samlambert wrote 11 hours 7 min ago:
                   We run 3 nodes, not 1. And your comment is not in
                   isolation; we get constant shade from Timescale people when
                   we don't even think about you.
       
            XCSme wrote 15 hours 45 min ago:
             On some providers (e.g. Hetzner), the dedicated servers come by
             default with two disks in RAID 1, so a failure is a lot less
             likely (unless the datacenter burns down).
       
              whizzter wrote 13 hours 37 min ago:
              You have a call from France, some company called OVH on the line!
       
                BonoboIO wrote 7 hours 17 min ago:
                 And your backup goes up in flames too.
                 
                 I would never ever trust OVH with any important data or
                 servers. We saw how they secured their datacenters: it took
                 3 hours to cut the power while the datacenter was burning.
       
            fabian2k wrote 16 hours 29 min ago:
            Databases like Postgres have well established ways to handle that.
            And if you're setting up the DB yourself, you absolutely need to do
            backups anyway. And a replica on a different server.
       
            inapis wrote 16 hours 46 min ago:
             Sure, to an extent. And if you run some mission-critical
             application, definitely.
             
             But most applications run fine from local storage and can
             tolerate some downtime. They might even benefit from the improved
             performance. You can also address the durability and disaster
             recovery concerns by setting up RAID/ZFS and maintaining proper
             backups.
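             
             For example, a ZFS mirror plus regular base backups; a sketch
             with illustrative pool and device names:
             
               zpool create pgpool mirror /dev/nvme0n1 /dev/nvme1n1
               zfs create -o recordsize=8k -o compression=lz4 pgpool/pgdata
               pg_basebackup -D /backup/base -Ft -z   # schedule this regularly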
       
            CodesInChaos wrote 16 hours 47 min ago:
            Using a single disk has durability concerns. But I don't see why
            VPS vs dedicated server should matter much.
       
        DicIfTEx wrote 17 hours 29 min ago:
         I was expecting `pg_dumpall` to get the `--format` option in v18,[1]
         but at the moment the docs say it's still only available in the
         development branch.[2]
         
         Is anyone familiar with Postgres development able to give an update
         on the state of the feature? Is it planned for a future (18 or 19)
         release?
        
  HTML  [1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h...
  HTML  [2]: https://www.postgresql.org/docs/devel/app-pgdump.html#:~:text=...
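        
         In the meantime, the usual workaround is a globals-only dump plus a
         per-database `pg_dump` in a non-plain format; a sketch (database
         name illustrative):
         
           pg_dumpall --globals-only > globals.sql
           pg_dump -Fc -f appdb.dump appdb   # custom format, use pg_restore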
       
          anarazel wrote 12 hours 1 min ago:
           The docs for 18 also show it; where do you get the idea that it's
           not available for 18?
       
            DicIfTEx wrote 3 hours 27 min ago:
            Ah my mistake, I linked to the docs for `pg_dump` (which has long
            had the `format` option) rather than `pg_dumpall` (which lacks it).
            
            Before Postgres 18 was released, the docs listed `format` as an
            option for `pg_dumpall` in the upcoming version 18 (e.g. Wayback
             Machine from Jun 2025 [1] ). The relevant commit is from Apr 2025
             (see link #1 in my original comment). But now all mention has
             been scrubbed, even from the devel branch docs.
            
  HTML      [1]: https://web.archive.org/web/20250624230110/https://www.pos...
       
        rastignack wrote 19 hours 0 min ago:
         Is there now a way to avoid double buffering and use direct I/O in
         PostgreSQL?
         
         Has anybody seriously benchmarked this?
         
         I don’t think io_uring would make a difference with this setting,
         but I’m curious, as direct I/O is the default for Oracle and
         Sybase.
       
          hans_castorp wrote 16 hours 13 min ago:
          Direct I/O is being worked on, but is not yet available.
          
          See e.g. here:
          
  HTML    [1]: https://www.cybertec-postgresql.com/en/postgresql-18-and-bey...
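           
           There is already a developer-only knob for experimenting with it
           (debug_io_direct, present since 16), explicitly not meant for
           production; a sketch, with an illustrative data directory path:
           
             psql -c "ALTER SYSTEM SET debug_io_direct = 'data, wal';"
             pg_ctl restart -D /path/to/data   # takes effect at restart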
       
        p_zuckerman wrote 19 hours 40 min ago:
         Thanks for posting this interesting article! Do we know if the
         timescale extension is available as well?
       
          samlambert wrote 12 hours 12 min ago:
          We are working on it.
       
          travisgriggs wrote 19 hours 18 min ago:
          As in timescaledb? Or something else…?
       
            p_zuckerman wrote 18 hours 5 min ago:
             Yes, as in timescaledb. Sorry for not being specific.
       
        anarazel wrote 23 hours 0 min ago:
         Afaict nothing in this benchmark will actually use AIO in 18. As of
         18 there are AIO reads for seq scans, bitmap scans, vacuum, and a few
         other utility commands. But the queries being run should normally be
         planned as index range scans. We're hoping to get the work for using
         AIO for index scans into 19, but it could end up in 20; it's
         nontrivial.
         
         It's also worth noting that the default for data checksums has
         changed, with some overhead due to that.
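         
         If you want to rule the checksum change out when comparing 17 and
         18, the pre-18 behavior can be requested when creating the cluster;
         a sketch, with an illustrative data directory path:
         
           initdb --no-data-checksums -D /path/to/data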
       
          ozgune wrote 12 hours 37 min ago:
          If the benchmark doesn’t use AIO, why the performance difference
          between PG 17 and 18 in the blog post (sync, worker, and io_uring)?
          
          Is it because remote storage in the cloud always introduces some
          variance & the benchmark just picks that up?
          
           For reference, anarazel gave a presentation at pgconf.eu yesterday
           about AIO, and mentioned that remote cloud storage always
           introduces variance, making benchmark results hard to interpret.
           His solution was to introduce synthetic latency on local NVMes for
           benchmarks.
       
          nopurpose wrote 17 hours 7 min ago:
             Then the io_uring mode's underperformance is even more curious.
       
            anarazel wrote 12 hours 21 min ago:
            It is. I tried to repro it without success.
            
             I wonder if it's just being executed on different VMs with
             slightly different performance characteristics. I can't tell,
             based on the wording in the post, whether all the runs for one
             test are executed on the same VM or not.
       
          mebcitto wrote 21 hours 49 min ago:
           That explains why `sync` and `worker` have such similar results in
           almost all runs. The benchmarks from Tomas Vondra ( [1] ) showed
           some significant differences.
          
  HTML    [1]: https://vondra.me/posts/tuning-aio-in-postgresql-18/
       
        alberth wrote 23 hours 20 min ago:
        Am I interrupting the data correctly in that, if you’re running on
        NVMe - it’s just so fast, that it doesn’t make a difference what
        mode you pick.
       
          cientifico wrote 19 hours 15 min ago:
           That was the same conclusion I came to by playing with the graphs.
           
           I concluded that in 18, better IO planning is only worth it for
           "slow" I/O.
           
           Pretty sure it will bring a lot of learnings. Postgres devs are
           pretty awesome.
       
          6r17 wrote 20 hours 32 min ago:
           typo: *interpreting, I guess?
       
       