        _______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (unofficial)
       
       
       COMMENT PAGE FOR:
       PyPI in 2025: A Year in Review
       
       
        nodesocket wrote 1 hour 8 min ago:
        Are the compute and network required to serve PyPI all funded by
        donations, or do they have a business arm that generates income?
       
        zahlman wrote 3 hours 5 min ago:
        > 1.92 exabytes of total data transferred
        
        That's something like triple the amount from 2023, yes?
       
        fud101 wrote 6 hours 21 min ago:
        This seems to suggest once the bubble pops, it will take Python down
        with it. The next AI winter will definitely replace Lisp with Python.
       
          talideon wrote 5 hours 49 min ago:
          Appropriate username!
       
        nmstoker wrote 7 hours 47 min ago:
        Great work!
        
        Side issue: anyone else seeing that none of the links in the article
        work? They're all 404s.
       
          miketheman wrote 7 hours 9 min ago:
          Whoops, sorry about that. Should be fixed now. Happy New Year!
       
        heavyset_go wrote 9 hours 22 min ago:
        One of the big companies making billions on Python software should step
        up and fund the infrastructure needed to enable PyPI package search via
        the CLI, like you could with `pip search` in the past.
       
          talideon wrote 5 hours 49 min ago:
          I upvoted you because I broadly agree with you, but search is never
          coming back in the API. They previously outlined the cost involved,
          and given how little value it provides more broadly, there's no way
          it's coming back any time soon. It's basically an abuse vector
          because of the compute cost.
       
          rat9988 wrote 6 hours 34 min ago:
          They probably don't need it. You can start a crowdfunding campaign if
          you do.
       
          woodruffw wrote 8 hours 22 min ago:
          Serious question: how important is `pip search` to your workflows? I
          don’t think I ever used it, back when PyPI still had an XMLRPC
          search endpoint.
          
          (I think the biggest blocker on CLI search isn’t infrastructure,
          but that there’s no clear agreement on the value of CLI search
          without a clear scope of what that search would do. Just listing
          matches over the package names would be less useful than structured
          metadata search for example, but the latter makes a lot of
          assumptions about the availability of structured metadata!)
       
          firesteelrain wrote 8 hours 51 min ago:
          Funding could help, but it still requires PyPI/Warehouse to ship and
          operate a new public search interface that is safe at internet scale.
       
            BiteCode_dev wrote 10 min ago:
            If you really need it, they publish a dump regularly and you can
            query that.
            
            For simple use cases, you have the web search, and you can curl it.
       
            bastawhiz wrote 6 hours 56 min ago:
            PyPI has a search interface on their public website, though?
       
            coldtea wrote 8 hours 8 min ago:
            They operate a public package hosting interface. How is a search
            interface any harder?
       
              miketheman wrote 7 hours 23 min ago:
              PyPI responses are served from cache 99% of the time or more,
              so they need less infrastructure to run.
              
              Search is an unbounded context and does not lend itself to
              caching very well, as every search can contain anything.
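              
              A rough illustration of the caching point above, not a
              description of how PyPI's CDN works: a tiny LRU cache is replayed
              against a bounded set of package URLs and against free-form
              search URLs. The class, the capacity, and the request lists are
              all made up for the sketch.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal least-recently-used cache that counts hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, compute):
        if key in self.store:
            self.store.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.store[key]
        self.misses += 1
        value = compute(key)
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used entry
        return value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def replay(keys, capacity=2):
    """Replay a list of request keys through a fresh cache and return the hit rate."""
    cache = LRUCache(capacity)
    for key in keys:
        cache.get(key, lambda k: f"response for {k}")
    return cache.hit_rate()


# Illustrative traffic: a few package pages requested repeatedly...
package_requests = ["/simple/requests/"] * 5 + ["/simple/numpy/"] * 5
# ...versus search queries where every request is a different string.
search_requests = [f"/search/?q=term{i}" for i in range(10)]

print(replay(package_requests))
print(replay(search_requests))
```

              In this toy setup the package URLs mostly hit the cache, while
              the all-distinct search URLs never do. Real search traffic would
              fall somewhere in between, depending on how skewed the queries
              are.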
       
                bastawhiz wrote 6 hours 57 min ago:
                PyPI has fewer than one million projects. The searchable
                content for each package is what? 300 bytes? That's a 200 MB
                index. You don't even need fancy full text search, you could
                literally split the query by word and do a grep over a text
                file. No need for elasticsearch or anything fancy.
                
                And anyway, hit rates are going to be pretty good. You're not
                taking arbitrary queries, the domain is pretty narrow. Half the
                queries are going to be for requests, pytorch, numpy, httpx,
                and the other usual suspects.
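                
                A minimal sketch of the "split the query by word and grep"
                idea, assuming a hypothetical index with one
                name-plus-summary line per project. The three sample lines
                stand in for the full text file, and the matching rule (every
                word must appear, case-insensitively) is one possible reading
                of the suggestion.

```python
def search(index_lines, query):
    """Return the lines that contain every whitespace-separated query word."""
    words = query.lower().split()
    if not words:
        return []
    return [
        line
        for line in index_lines
        if all(word in line.lower() for word in words)
    ]


# Hypothetical index format: "name<TAB>summary", one project per line.
INDEX = [
    "requests\tPython HTTP for Humans.",
    "httpx\tThe next generation HTTP client.",
    "numpy\tFundamental package for array computing in Python.",
]

for line in search(INDEX, "http client"):
    print(line)
```

                This is a linear scan with substring matching, so it has no
                ranking and no typo tolerance. Whether that is good enough is
                exactly what the replies below debate.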
       
                  woodruffw wrote 4 hours 4 min ago:
                  The searchable context for a distribution on PyPI is
                  unbounded in the general case, assuming the goal is to allow
                  search over READMEs, distribution metadata, etc.
                  
                  (Which isn’t to say I disagree with you about scale not
                  being the main issue, just to offer some nuance. Another
                  piece of nuance is the fact that distributions are the source
                  of metadata but users think in terms of projects/releases.)
       
                  froh wrote 4 hours 56 min ago:
                  I wonder how a PyPI search index could be statically served
                  and locally evaluated on `pip search`?
       
                    firesteelrain wrote 4 hours 34 min ago:
                    PyPI servers would have to be constantly rebuilding a
                    central index and making it available for download. Seems
                    inefficient.
       
        dalanmiller wrote 9 hours 48 min ago:
        Great work Dustin and team!
       
       