       COMMENT PAGE FOR:
  HTML   History LLMs: Models trained exclusively on pre-1913 texts
       
       
        dkalola wrote 5 min ago:
        How can we interact with such models? Is there a web application
        interface?
       
        WhitneyLand wrote 1 hour 23 min ago:
        Why not use these as a benchmark for LLM ability to make breakthrough
        discoveries?
        
        For example prompt the 1913 model to try and “Invent a new theory of
        gravity that doesn’t conflict with special relativity”
        
         Would it be able to eventually get to GR?  If not, could finding out
         why not illuminate important weaknesses?
       
        Muskwalker wrote 1 hour 36 min ago:
         So, could this be an example of an LLM trained fully on public domain
         copyright-expired data?  Or is this not intended to be the case?
       
        kldg wrote 2 hours 54 min ago:
        Very neat! I've thought about this with frontier models because they're
        ignorant of recent events, though it's too bad old frontier models just
        kind of disappear into the aether when a company moves on to the next
        iteration. Every company's frontier model today is a time capsule for
        the future. There should probably be some kind of preservation attempts
        made early so they don't wind up simply deleted; once we're in Internet
        time, sifting through the data to ensure scrapes are accurately dated
        becomes a nightmare unless you're doing your own regular Internet
        scrapes over a long time.
        
         It would be nice to go back substantially further, though you don't
         have to go too far back before the commoner becomes voiceless in
         history and we just get a bunch of politics and academia. Great job;
         I look forward to testing it out.
       
        underfox wrote 3 hours 40 min ago:
        > [They aren't] perfect mirrors of "public opinion" (they represent
        published text, which skews educated and toward dominant viewpoints)
        
        Really good point that I don't think I would've considered on my own.
        Easy to take for granted how easy it is to share information (for
        better or worse) now, but pre-1913 there were far more structural and
        societal barriers to doing the same.
       
        flux3125 wrote 3 hours 53 min ago:
        Once I had an interesting interaction with llama 3.1, where I pretended
        to be someone from like 100 years in the future, claiming it was part
        of a "historical research initiative conducted by Quantum (formerly
        Meta), aimed at documenting how early intelligent systems perceived
        humanity and its future." It became really interested, asking about how
        humanity had evolved and things like that. Then I kept playing along
         with different answers, from apocalyptic scenarios to others where AI
         gained consciousness and humans and machines had equal rights. It was
         fascinating to observe its reaction to each scenario.
       
        erichocean wrote 4 hours 0 min ago:
        I would love to see this done, by year.
        
        "Give me an LLM from 1928."
        
        etc.
       
        elestor wrote 4 hours 32 min ago:
        Excuse me if it's obvious, but how could I run this? I have run local
        LLMs before, but only have very minimal experience using ollama run and
        that's about it. This seems very interesting so I'd like to try it.
       
        shireboy wrote 4 hours 54 min ago:
         Fascinating LLM use case I never really thought about till now. I'd
         love to converse with different eras and also do gap analysis with
         the present time - what modern advances could have come earlier,
         happened differently, etc.
       
        PeterStuer wrote 6 hours 13 min ago:
        How does it do on Python coding? Not 100% troll, cross domain coherence
        is a thing.
       
        ulbu wrote 6 hours 14 min ago:
         for anyone bemoaning the fact that it's not accessible to you: they
         are historians; I think they're more educated in matters of
         historical mistakes than you or me. playing it safe is simply
         prudence. it is sorely lacking in the American approach to
         technology. prevention is the best medicine.
       
        davidpfarrell wrote 6 hours 22 min ago:
         Can't wait for all the sycophantic "Thou dost well to question that"
         responses!
       
        sbmthakur wrote 8 hours 33 min ago:
         Someone suggested a nice thought experiment - train LLMs on all
         physics from before quantum mechanics was discovered. If the LLM can
         still figure out the latter, then we have certainly achieved some
         success in the space.
       
        btrettel wrote 10 hours 36 min ago:
        This reminded me of some earlier discussion on Hacker News about using
        LLMs trained on old texts to determine novelty and obviousness of a
        patent application:
        
  HTML  [1]: https://news.ycombinator.com/item?id=43440273
       
        arikrak wrote 11 hours 22 min ago:
         I wouldn't have expected there to be enough text from before 1913 to
         properly train a model; it seemed like it took an internet's worth of
         text to train the first successful LLMs?
       
          alansaber wrote 11 hours 12 min ago:
          This model is more comparable to GPT-2 than anything we use now.
       
        Departed7405 wrote 12 hours 26 min ago:
        Awesome. Can't wait to try and ask it to predict the 20th century based
        on said events. Model size is small, which is great as I can run it
        anywhere, but at the same time reasoning might not be great.
       
        usernamed7 wrote 13 hours 10 min ago:
        > We're developing a responsible access framework that makes models
        available to researchers for scholarly purposes while preventing
        misuse.
        
        oh COME ON... "AI safety" is getting out of hand.
       
        delis-thumbs-7e wrote 13 hours 29 min ago:
         Aren't there obvious problems baked into this approach, if it is used
         for anything but fun? LLMs lie and fake facts all the time; they are
         also masters at reinforcing the user's biases, even unconscious ones.
         How could even a professor of history ensure that the generated text
         is actually based on the training material and representative of the
         feelings and opinions of the given time period, rather than
         reinforcing his biases toward popular topics of the day?
         
         You can't, it is impossible. That will always be an issue as long as
         these models are black boxes and trained the way they are. So maybe
         you can use this for role playing, but I wouldn't trust a word it
         says.
       
          kccqzy wrote 6 hours 35 min ago:
          To me it is pretty clear that it’s being used for fun. I personally
          like reading nineteenth century novels more than more recent novels
          (I especially like the style of science fiction by Jules Verne). What
          if the model can generate text in that style I like?
       
        r0x0r007 wrote 14 hours 11 min ago:
         ffs, to find out what figures from the past thought and how they felt
         about the world, maybe we should read some of their books; we will
         get the context. Don't prompt or train an LLM to do it and consider
         it the hottest thing since MCP. Besides, what's the point? To teach younger
        generations a made up perspective of historic figures? Who guarantees
        the correctness/factuality? We will have students chatting with made up
        Hitler justifying his actions. So much AI slop everywhere.
       
        moffkalast wrote 14 hours 20 min ago:
        > trained from scratch on 80B tokens of historical data
        
        How can this thing possibly be even remotely coherent with just fine
        tuning amounts of data used for pretraining?
       
        Agraillo wrote 15 hours 29 min ago:
        > Modern LLMs suffer from hindsight contamination. GPT-5 knows how the
        story ends—WWI, the League's failure, the Spanish flu. This knowledge
         inevitably shapes responses, even when instructed to "forget."
        
        > Our data comes from more than 20 open-source datasets of historical
        books and newspapers. ... We currently do not deduplicate the data. The
        reason is that if documents show up in multiple datasets, they also had
        greater circulation historically. By leaving these duplicates in the
        data, we expect the model will be more strongly influenced by documents
        of greater historical importance.
        
         I find these claims contradictory. Many books that modern readers
         consider historically significant had only niche circulation at the
         time of publication. A quick inquiry points to the later works of
         Nietzsche and Marx's Das Kapital as likely examples. They're probable
         subjects of this duplication, likely influencing the model's
         responses as if they had been widely known at the time.
       
        holyknight wrote 15 hours 35 min ago:
        wow amazing idea
       
        bondarchuk wrote 15 hours 56 min ago:
        >Historical texts contain racism, antisemitism, misogyny, imperialist
        views. The models will reproduce these views because they're in the
        training data. This isn't a flaw, but a crucial feature—understanding
        how such views were articulated and normalized is crucial to
        understanding how they took hold.
        
        Yes!
        
        >We're developing a responsible access framework that makes models
        available to researchers for scholarly purposes while preventing
        misuse.
        
        Noooooo!
        
        So is the model going to be publicly available, just like those
        dangerous pre-1913 texts, or not?
       
          xpe wrote 5 hours 32 min ago:
          > So is the model going to be publicly available, just like those
          dangerous pre-1913 texts, or not?
          
          1. This implies a false equivalence. Releasing a new interactive AI
          model is indeed different in significant and practical ways from the
          status quo. Yes, there are already-released historical texts. The
          rational thing to do is weigh the impacts of introducing another
          thing.
          
          2. Some people have a tendency to say "release everything" as if
          open-source software is equivalent to open-weights models. They
          aren't. They are different enough to matter.
          
           3. Rhetorically, the quote above comes across as a pressure tactic.
          When I hear "are you going to do this or not?" I cringe.
          
          4. The quote above feels presumptive to me, as if the commenter is
          owed something from the history-llms project.
          
          5. People are rightfully bothered that Big Tech has vacuumed up
          public domain and even private information and turned it into a
          profit center. But we're talking about a university project with
          (let's be charitable) legitimate concerns about misuse.
          
          6. There seems to be a lack of curiosity in play. I'd much rather see
          people asking e.g. "What factors are influencing your decision about
          publishing your underlying models?"
          
           7. There are people who have locked in a view that says AI-safety
          perspectives are categorically invalid. Accordingly, they have almost
          a knee-jerk reaction against even talk of "let's think about the
          implications before we release this."
          
           8. This one might explain and underlie most of the other points above.
          I see signs of a deeper problem at work here. Hiding behind
          convenient oversimplifications to justify what one wants does not
          make a sound moral argument; it is motivated reasoning a.k.a.
          psychological justification.
       
            DGoettlich wrote 24 min ago:
            well put.
       
          DGoettlich wrote 10 hours 48 min ago:
          fully understand you. we'd like to provide access but also guard
           against misrepresentations of our project's goals by pointing to e.g.
          racist generations. if you have thoughts on how we should do that,
          perhaps you could reach out at history-llms@econ.uzh.ch ? thanks in
          advance!
       
            bondarchuk wrote 8 hours 41 min ago:
            You can guard against misrepresentations of your goals by stating
            your goals clearly, which you already do. Any further
             misrepresentation is going to be either malicious or idiotic; a
            university should simply be able to deal with that.
            
            Edit: just thought of a practical step you can take: host it
            somewhere else than github. If there's ever going to be a backlash
            the microsoft moderators might not take too kindly to the stuff
            about e.g. homosexuality, no matter how academic.
       
            superxpro12 wrote 9 hours 56 min ago:
            Perhaps you could detect these... "dated"... conclusions and
            prepend a warning to the responses? IDK.
            
            I think the uncensored response is still valuable, with context.
            "Those who cannot remember the past are condemned to repeat it"
            sort of thing.
       
            myrmidon wrote 10 hours 8 min ago:
            What is your worst-case scenario here?
            
            Something like a pop-sci article along the lines of "Mad scientists
            create racist, imperialistic AI"?
            
            I honestly don't see publication of the weights as a relevant risk
            factor, because sensationalist misrepresentation is trivially
            possible with the given example responses alone.
            
            I don't think such pseudo-malicious misrepresentation of scientific
            research can be reliably prevented anyway, and the disclaimers make
            your stance very clear.
            
            On the other hand, publishing weights might lead to interesting
            insights from others tinkering with the models. A good example for
            this would be the published word prevalence data (M. Brysbaert et
            al @Ghent University) that led to interesting follow-ups like this:
             [1] I hope you can get the models out in some form; it would be a
             waste not to. But congratulations on a fascinating project
             regardless!
            
  HTML      [1]: https://observablehq.com/@yurivish/words
       
              schlauerfox wrote 4 hours 57 min ago:
              It seems like if there is an obvious misuse of a tool, one has a
              moral imperative to restrict use of the tool.
       
                timschmidt wrote 1 hour 41 min ago:
                Every tool can be misused.  Hammers are as good for bashing
                heads as building houses.  Restricting hammers would be silly
                and counterproductive.
       
          p-e-w wrote 15 hours 36 min ago:
          It’s as if every researcher in this field is getting high on the
          small amount of power they have from denying others access to their
          results. I’ve never been as unimpressed by scientists as I have
          been in the past five years or so.
          
          “We’ve created something so dangerous that we couldn’t possibly
          live with the moral burden of knowing that the wrong people (which
          are never us, of course) might get their hands on it, so with a heavy
          heart, we decided that we cannot just publish it.”
          
          Meanwhile, anyone can hop on an online journal and for a nominal fee
          read articles describing how to genetically engineer deadly viruses,
          how to synthesize poisons, and all kinds of other stuff that is far
          more dangerous than what these LARPers have cooked up.
       
            everythingfine9 wrote 2 hours 12 min ago:
            Wow, this is needlessly antagonistic.  Given the emergence of
            online communities that bond on conspiracy theories and racist
            philosophies in the 20th century, it's not hard to imagine the
            consequences of widely disseminating an LLM that could be used to
            propagate and further these discredited (for example, racial)
            scientific theories for bad ends by uneducated people in these
            online communities.
            
             We can debate whether it's good or not, but ultimately they're
            publishing it and in some very small way responsible for some of
            its ends.  At least that's how I can see their interest in
            disseminating the use of the LLM through a responsible framework.
       
              DGoettlich wrote 28 min ago:
              thanks. i think this just took on a weird dynamic. we never said
              we'd lock the model away. not sure how this impression seems to
              have emerged for some. that aside, it was an announcement of a
              release, not a release. the main purpose was gathering feedback
              on our methodology. standard procedure in our domain is to first
              gather criticism, incorporate it, then publish results.  but i
              understand people just wanted to talk to it. fair enough!
       
            xpe wrote 5 hours 11 min ago:
            > It’s as if every researcher in this field is getting high on
            the small amount of power they have from denying others access to
            their results.
            
            Even if I give the comment a lot of wiggle room (such as changing
            "every" to "many"), I don't think even a watered-down version of
            this hypothesis passes Occam's razor. There are more plausible
             explanations, including (1) genuine concern by the authors; (2)
             academic pressures and constraints; (3) reputational concerns;
             and (4) self-interest in embargoing underlying data so they have
             time to be the first to write it up. To my eye, none of these
             fit the category
            of "getting high on power".
            
            Also, patience is warranted. We haven't seen what these researchers
             are planning to release -- and from what I can tell, they haven't said
            yet. At the moment I see "Repositories (coming soon)" on their
            GitHub page.
       
            f13f1f1f1 wrote 8 hours 6 min ago:
            Scientists have always been generally self interested amoral
            cowards, just like every other person. They aren't a unique or
            higher form of human.
       
            paddleon wrote 10 hours 35 min ago:
            > “We’ve created something so dangerous that we couldn’t
            possibly live with the moral burden of knowing that the wrong
            people (which are never us, of course) might get their hands on it,
            so with a heavy heart, we decided that we cannot just publish
            it.”
            
            Or, how about, "If we release this as is, then some people will
            intentionally mis-use it and create a lot of bad press for us. Then
            our project will get shut down and we lose our jobs"
            
            Be careful assuming it is a power trip when it might be a fear
            trip.
            
            I've never been as unimpressed by society as I have been in the
            last 5 years or so.
       
              xpe wrote 4 hours 39 min ago:
               > Be careful assuming it is a power trip when it might be a
               > fear trip.
               >
               > I've never been as unimpressed by society as I have been in
               > the last 5 years or so.
              
              Is the second sentence connected to the first? Help me
              understand?
              
              When I see individuals acting out of fear, I try not to blame
              them. Fear triggers deep instinctual responses. For example, to a
              first approximation, a particular individual operating in full-on
              fight-or-flight mode does not have free will. There is a spectrum
              here. Here's a claim, which seems mostly true: the more we can
               slow down impulsive actions, the more hope we have for cultural
               progress.
              
              When I think of cultural failings, I try to criticize areas where
              culture could realistically do better. I think of areas where we
              (collectively) have the tools and potential to do better. Areas
              where thoughtful actions by some people turn into a virtuous
              snowball. We can't wait for a single hero, though it helps to
              create conditions so that we have more effective leaders.
              
               One massive cultural failing I see -- that could be dramatically
              improved -- is this: being lulled into shallow contentment (i.e.
              via entertainment, power seeking, or material possessions) at the
              expense of (i) building deep and meaningful social connections
              and (ii) using our advantages to give back to people all over the
              world.
       
            patapong wrote 11 hours 49 min ago:
            I think it's more likely they are terrified of someone making a
            prompt that gets the model to say something racist or problematic
            (which shouldn't be too hard), and the backlash they could receive
            as a result of that.
       
              isolli wrote 9 hours 29 min ago:
              Is it a base model, or did it get some RLHF on top? Releasing a
              base model is always dangerous.
              
               The French released a preview of an AI meant to support public
               education, but they released the base model, with unsurprising
               effects [1] (no English source, unfortunately, but the title
               translates as "'Useless and stupid': French generative AI
               Lucie, backed by the government, mocked for its numerous
               bugs").
              
  HTML        [1]: https://www.leparisien.fr/high-tech/inutile-et-stupide-l...
       
              p-e-w wrote 11 hours 47 min ago:
              Is there anyone with a spine left in science? Or are they all
              ruled by fear of what might be said if whatever might happen?
       
                paddleon wrote 10 hours 27 min ago:
                maybe they are concerned by the widespread adoption of the
                attitude you are taking-- make a very strong accusation, then
                when it was pointed out that the accusation might be off base,
                continue to attack.
                
                 This constant demonization of everyone who disagrees with you
                 makes me wonder if 28 Days Later wasn't more true than we
                 thought; we are all turning into rage zombies.
                
                p-e-w, I'm reacting to much more than your comments. Maybe you
                aren't totally infected yet, who knows. Maybe you heal.
                
                I am reacting to the pandemic, of which you were demonstrating
                symptoms.
       
                ACCount37 wrote 10 hours 35 min ago:
                Selection effects. If showing that you have a spine means
                getting growth opportunities denied to you, and not paying lip
                service to current politics in grant applications means not
                getting grants, then anyone with a spine would tend to leave
                the field behind.
       
            physicsguy wrote 14 hours 43 min ago:
            > It’s as if every researcher in this field is getting high on
            the small amount of power they have from denying others access to
            their results. I’ve never been as unimpressed by scientists as I
            have been in the past five years or so.
            
             This is absolutely nothing new. With experimental things, it's not
            uncommon for a lab to develop a new technique and omit slight but
            important details to give them a competitive advantage. Similarly
            in the simulation/modelling space it's been common for years for
            researchers to not publish their research software. There's been a
            lot of lobbying on that side by groups such as the Software
            Sustainability Institute and Research Software Engineer
            organisations like RSE UK and RSE US, but there's a lot of
            researchers that just think that they shouldn't have to do it, even
            when publicly funded.
       
              p-e-w wrote 11 hours 48 min ago:
              > With experimental things, it's non uncommon for a lab to
              develop a new technique and omit slight but important details to
              give them a competitive advantage.
              
              Yes, to give them a competitive advantage. Not to LARP as
              morality police.
              
              There’s a big difference between the two. I take greed over
              self-righteousness any day.
       
                physicsguy wrote 10 hours 51 min ago:
                I’ve heard people say that they’re not going to release
                their software because people wouldn’t know how to use it!
                I’m not sure the motivation really matters more than the end
                result though.
       
        dr_dshiv wrote 16 hours 20 min ago:
        Everyone learns that the renaissance was sparked by the translation of
        Ancient Greek works.
        
        But few know that the Renaissance was written in Latin — and has
         barely been translated. Less than 3% of pre-1700 books have been
         translated, and less than 30% have ever been scanned.
        
        I’m working on a project to change that. Research blog at
        www.SecondRenaissance.ai — we are starting by scanning and
        translating thousands of books at the Embassy of the Free Mind in
        Amsterdam, a UNESCO-recognized rare book library.
        
        We want to make ancient texts accessible to people and AI.
        
        If this work resonates with you, please do reach out:
        Derek@ancientwisdomtrust.org
       
          carlosjobim wrote 14 hours 4 min ago:
          Amazing project!
          
          May I ask you, why are you publishing the translations as PDF files,
          instead of the more accessible ePub format?
       
          j-bos wrote 15 hours 17 min ago:
           This is very cool but should go in a Show HN post as per HN rules.
          All the best!
       
            dr_dshiv wrote 14 hours 33 min ago:
            Just read the rules again— was something inappropriate? Seemed
            relevant
       
              j-bos wrote 4 hours 41 min ago:
               I can see you being right; I didn't make the connection with
               20th/19th century documents, and the comment felt disconnected
               from the thread. Either way, very cool project, worth a Show HN
               post.
       
        DonHopkins wrote 16 hours 25 min ago:
        I'd love for Netflix or other streaming movie and series services to
        provide chat bots that you could ask questions about characters and
        plot points up to where you have watched.
        
        Provide it with the closed captions and other timestamped data like
        scenes and character summaries (all that is currently known but no
        more) up to the current time, and it won't reveal any spoilers, just
        fill you in on what you didn't pick up or remember.
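        
         A minimal sketch of that idea: keep only the captions with timestamps
         at or before the viewer's current position and build the prompt from
         that slice alone. The Caption structure and field names below are
         illustrative assumptions, not any streaming service's actual data
         format.
         
           from dataclasses import dataclass
           
           @dataclass
           class Caption:
               start_sec: float  # when the line is spoken in the episode
               text: str
           
           def spoiler_free_prompt(captions: list[Caption],
                                   watched_up_to_sec: float,
                                   question: str) -> str:
               # Only dialogue the viewer has already seen goes into the context.
               seen = [c.text for c in captions if c.start_sec <= watched_up_to_sec]
               context = "\n".join(seen)
               return ("You may only use the dialogue below and must not reveal "
                       "anything that happens after it.\n\n"
                       f"{context}\n\nViewer question: {question}")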
       
        casey2 wrote 17 hours 8 min ago:
        I'd be very surprised if this is clean of post-1913 text. Overall I'm
         very interested in talking to this thing and seeing how much difference
         writing in a modern style vs. an older one makes to its responses.
       
        andai wrote 17 hours 20 min ago:
        I had considered this task infeasible, due to a relative lack of
        training data. After all, isn't the received wisdom that you must shove
        every scrap of Common Crawl into your pre-training or you're doing it
        wrong? ;)
        
        But reading the outputs here, it would appear that quality has won out
        over quantity after all!
       
        zkmon wrote 17 hours 41 min ago:
        Why does history end in 1913?
       
        alexgotoi wrote 17 hours 41 min ago:
        [flagged]
       
        thesumofall wrote 17 hours 57 min ago:
        While obvious, it’s still interesting that its morals and values seem
        to derive from the texts it has ingested. Does that mean modern LLMs
        cannot challenge us beyond mere facts? Or does it just mean that this
        small model is not smart enough to escape the bias of its training
        data? Would it not be amazing if LLMs could challenge us on our core
        beliefs?
       
        mleroy wrote 18 hours 3 min ago:
        Ontologically, this historical model understands the categories of
        "Man" and "Woman" just as well as a modern model does. The difference
        lies entirely in the attributes attached to those categories. The
        sexism is a faithful map of that era's statistical distribution.
        
        You could RAG-feed this model the facts of WWII, and it would
        technically "know" about Hitler. But it wouldn't share the modern
        sentiment or gravity. In its latent space, the vector for "Hitler" has
        no semantic proximity to "Evil".
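         
         A minimal sketch of that RAG idea (the retriever is stubbed out and
         the names are illustrative): retrieved post-1913 passages get
         prepended to the prompt, so the model receives the facts without the
         modern sentiment that was never in its weights.
         
           def retrieve(query: str) -> list[str]:
               # Stand-in for a real retriever over a store of post-1913 facts.
               return ["(retrieved passage 1)", "(retrieved passage 2)"]
           
           def rag_prompt(question: str) -> str:
               facts = "\n".join(retrieve(question))
               return ("The following reports describe events after your time:\n"
                       + facts + "\n\nQuestion: " + question)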
       
          arowthway wrote 16 hours 19 min ago:
          I think much of the semantic proximity to evil can be derived
          straight from the facts? Imagine telling pre-1913 person about the
          holocaust.
       
        p0w3n3d wrote 18 hours 26 min ago:
         I'd love to see an LLM trained on 1600s-1800s texts that would use
         the older English, and especially Polish, which I am interested in.
         
         Imagine speaking with a Shakespearean person, or with Mickiewicz (for
         Polish).
        
        I guess there is not so much text from that time though...
       
        anovikov wrote 18 hours 43 min ago:
        That Adolf Hitler seems to be a hallucination. There's totally nothing
        googlable about him. Also what could be the language his works were
        translated from, into German?
       
          sodafountan wrote 17 hours 20 min ago:
          I believe that's one of the primary issues LLMs aim to address. Many
          historical texts aren't directly Googleable because they haven't been
          converted to HTML, a format that Google can parse.
       
        monegator wrote 18 hours 55 min ago:
        I hereby declare that ANYTHING other than the mainstream tools (GPT,
        Claude, ...) is an incredibly interesting and legit use of LLMs.
       
        TZubiri wrote 19 hours 54 min ago:
         hi, can I have a Latin-only LLM? It could be Latin plus translations
         (source and destination).
         
         It may be too small a corpus, but I would like that very much anyhow.
       
        nospice wrote 19 hours 55 min ago:
        I'm surprised you can do this with a relatively modest corpus of text
        (compared to the petabytes you can vacuum up from modern books,
        Wikipedia, and random websites). But if it works, that's actually
        fantastic, because it lets you answer some interesting questions about
        LLMs being able to make new discoveries or transcend the training set
        in other ways. Forget relativity: can an LLM trained on this data
        notice any inconsistencies in its scientific knowledge, devise
        experiments that challenge them, and then interpret the results? Can it
        intuit about the halting problem? Theorize about the structure of the
        atom?...
        
        Of course, if it fails, the counterpoint will be "you just need more
        training data", but still - I would love to play with this.
       
          Aerolfos wrote 13 hours 22 min ago:
          > [1] Given the training notes, it seems like you can't get the
          performance they give examples of?
          
          I'm not sure about the exact details but there is some kind of
           targeted distillation of GPT-5 involved to try and get more
          conversational text and better performance. Which seems a bit iffy to
          me.
          
  HTML    [1]: https://github.com/DGoettlich/history-llms/blob/main/ranke-4...
       
            DGoettlich wrote 1 hour 55 min ago:
            Thanks for the comment. Could you elaborate on what you find iffy
            about our approach? I'm sure we can improve!
       
          andy99 wrote 14 hours 11 min ago:
           The Chinchilla paper says the "optimal" training dataset size is
           about 20x the number of parameters (in tokens); see table 3: [1]
           Here they do 80B tokens for a 4B model.
          
  HTML    [1]: https://arxiv.org/pdf/2203.15556
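           
           As a quick back-of-the-envelope check of that ratio (a sketch of the
           rule of thumb, not the paper's exact scaling fit):
           
             params = 4e9                  # roughly a 4B-parameter model
             optimal_tokens = 20 * params  # Chinchilla rule of thumb: ~20 tokens/param
             print(f"{optimal_tokens / 1e9:.0f}B tokens")  # -> 80B tokens, as used here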
       
            EvgeniyZh wrote 9 hours 50 min ago:
            It's worth noting that this is "compute-bound optimal", i.e., given
            fixed compute, the optimal choice is 20:1.
            
             Under the Chinchilla model, a larger model always performs better
             than a smaller one if trained on the same amount of data. I'm not
             sure
            if it is true empirically, and probably 1-10B is a good guess for
            how large the model trained on 80B tokens should be.
            
            Similarly, the small models continue to improve beyond 20:1 ratio,
            and current models are trained on much more data. You could train a
            better performing model using the same compute, but it would be
            larger which is not always desirable.
       
        seizethecheese wrote 20 hours 17 min ago:
        > Imagine you could interview thousands of educated individuals from
        1913—readers of newspapers, novels, and political treatises—about
        their views on peace, progress, gender roles, or empire. Not just
        survey them with preset questions, but engage in open-ended dialogue,
        probe their assumptions, and explore the boundaries of thought in that
        moment.
        
        Hell yeah, sold, let’s go…
        
        > We're developing a responsible access framework that makes models
        available to researchers for scholarly purposes while preventing
        misuse.
        
        Oh. By “imagine you could interview…” they didn’t mean me.
       
          pizzathyme wrote 9 hours 27 min ago:
          They did mean you, they just meant "imagine" very literally!
       
          DGoettlich wrote 11 hours 49 min ago:
          understand your frustration. i trust you also understand the models
          have some dark corners that someone could use to misrepresent the
          goals of our project. if you have ideas on how we could make the
          models more broadly accessible while avoiding that risk, please do
          reach out @ history-llms@econ.uzh.ch
       
            999900000999 wrote 4 hours 50 min ago:
            Ok...
            
            So as a black person should I demand that all books written before
            the civil rights act be destroyed?
            
            The past is messy. But it's the only way to learn anything.
            
             All an LLM does is take a bunch of existing texts and rebundle
            them. Like it or not, the existing texts are still there.
            
             I understand an LLM that won't tell me how to do heart surgery.
             But I can't fear one that might be less enlightened on race
             issues. So many questions to ask! Hell, it's like talking to an
             older person in real life.
             
             I don't expect a typical 90 year old to be the most progressive
             person, but they're still worth listening to.
       
              DGoettlich wrote 3 hours 17 min ago:
              we're on the same page.
       
                999900000999 wrote 2 hours 17 min ago:
                Although...
                
                Self preservation is the first law of nature. If you release
                the model someone will basically say you endorse those views
                and you risk your funding being cut.
                
                You created Pandora's box and now you're afraid of opening it.
       
                  DGoettlich wrote 41 min ago:
                  i think we (whole section) are just talking past each other -
                  we never said we'll lock it away. it was an announcement of a
                  release, not a release. main purpose for us was getting
                  feedback on the methodological aspects, as we clearly state.
                  i understand you guys just wanted to talk to the thing
                  though.
       
                  AmbroseBierce wrote 55 min ago:
                  They could add a text box where users have to explicitly type
                  the following words before it lets them interact in any way
                  with the model: "I understand this model was created with old
                  texts so any racial or sexual statements are a byproduct of
                   their time and do not represent in any way the views of the
                  researchers".
                  
                  That should be more than enough to clear any chance of
                  misunderstanding.
       
            pigpop wrote 5 hours 39 min ago:
            This is understandable and I think others ITT should appreciate the
            legal and PR ramifications involved.
       
            f13f1f1f1 wrote 8 hours 7 min ago:
            You are a fraud, information is not misuse just because it might
             mean a negative news story about you. If you don't want to be real
             about it you should just stop; acting like there is any authentic
             historical interest and then trying to gatekeep it is disgusting.
       
            qcnguy wrote 9 hours 7 min ago:
            There's no such risk so you're not going to get any sensible ideas
            in response to this question. The goals of the project are history,
            you already made that clear. There's nothing more that needs to be
            done.
            
            We all get that academics now exist in some kind of dystopian
            horror where they can get transitively blamed for the existence of
            anyone to the right of Lenin, but bear in mind:
            
            1. The people who might try to cancel you are idiots unworthy of
            your respect, because if they're against this project, they're
            against the study of history in its entirety.
            
            2. They will scream at you anyway no matter what you do.
            
            3. You used (Swiss) taxpayer funds to develop these models. There
            is no moral justification for withholding from the public what they
            worked to pay for.
            
            You already slathered your README with disclaimers even though you
            didn't even release the model at all, just showed a few examples of
            what it said - none of which are in any way surprising. That is far
            more than enough. Just release the models and if anyone complains,
            politely tell them to go complain to the users.
       
            unethical_ban wrote 9 hours 35 min ago:
            A disclaimer on the site that you are not bigoted or genocidal, and
            that worldviews from the 1913 era were much different than today
            and don't necessarily reflect your project.
            
            Movie studios have done that for years with old movies. TCM still
            shows Birth of a Nation and Gone with the Wind.
            
            Edit: I saw further down that you've already done this! What more
            is there to do?
       
            tombh wrote 9 hours 53 min ago:
            Of course, I have to assume that you have considered more outcomes
            than I have. Because, from my five minutes of reflection as a
            software geek, albeit with a passion for history, I find this the
            most surprising thing about the whole project.
            
            I suspect restricting access could equally be a comment on modern
            LLMs in general, rather than the historical material specifically.
            For example, we must be constantly reminded not to give LLMs a
            level of credibility that their hallucinations would have us
            believe.
            
            But I'm fascinated by the possibility that somehow resurrecting
            lost voices might give an unholy agency to minds and their
            supporting worldviews that are so anachronistic that hearing them
            speak again might stir long-banished evils. I'm being lyrical for
             dramatic effect!
            
             I would make one serious point though, one that I do have the
            credentials to express. The conversation may have died down, but
             there is still a huge question mark over, if not the legality, then
            certainly the ethics of restricting access to, and profiting from,
            public domain knowledge. I don't wish to suggest a side to take
            here, just to point out that the lack of conversation should not be
            taken to mean that the matter is settled.
       
              qcnguy wrote 8 hours 57 min ago:
              They aren't afraid of hallucinations. Their first example is a
              hallucination, an imaginary biography of a Hitler who never
              lived.
              
              Their concern can't be understood without a deep understanding of
              the far left wing mind. Leftists believe people are so infinitely
              malleable that merely being exposed to a few words of
              conservative thought could instantly "convert" someone into a
              mortal enemy of their ideology for life. It's therefore of
              paramount importance to ensure nobody is ever exposed to such
              words unless they are known to be extremely far left already,
              after intensive mental preparation, and ideally not at all.
              
              That's why leftist spaces like universities insist on trigger
              warnings on Shakespeare's plays, why they're deadly places for
              conservatives to give speeches, why the sample answers from the
              LLM are hidden behind a dropdown and marked as sensitive, and why
              they waste lots of money training an LLM that they're terrified
              of letting anyone actually use. They intuit that it's a dangerous
              mind bomb because if anyone could hear old fashioned/conservative
              thought, it would change political outcomes in the real world
              today.
              
              Anyone who is that terrified of historical documents really
              shouldn't be working in history at all, but it's academia so what
              do you expect? They shouldn't be allowed to waste money like
              this.
       
                fgh_azer wrote 1 hour 40 min ago:
                They said it plainly ("dark corners that someone could use to
                misrepresent the goals of our project"): they just don't want
                to see their project in headlines about "Researchers create
                racist LLM!".
       
                simonask wrote 5 hours 8 min ago:
                You know, I actually sympathize with the opinion that people
                should be expected and assumed to be able to resist attempts to
                convince them of being nazis.
                
                The problem with it is, it already happened at least once. We
                know how it happened. Unchecked narratives about minorities or
                 foreigners are a significant part of why the 20th century
                happened to Europe, and it’s a significant part of why
                colonialism and slavery happened to other places.
                
                What solution do you propose?
       
            naasking wrote 10 hours 41 min ago:
            What are the legal or other ramifications of people misrepresenting
            the goals of your project? What is it you're worried about exactly?
       
          leoedin wrote 13 hours 57 min ago:
          It's a shame isn't it! The public must be protected from the
          backwards thoughts of history. In case they misuse it.
          
          I guess what they're really saying is "we don't want you guys to
          cancel us".
       
            stainablesteel wrote 5 hours 27 min ago:
             i think it's fine; thank these people for coming up with the idea,
             and people are going to start doing this in their basements and
             then releasing it to huggingface
       
          danielbln wrote 16 hours 14 min ago:
          How would one even "misuse" a historical LLM, ask it how to cook up
             sarin gas in a trench?
       
            hearsathought wrote 6 hours 4 min ago:
            You "misuse" it by using it to get at truth and more importantly
            historical contradictions and inconsistencies. It's the same reason
             the Catholic church kept the Bible from the masses by keeping it
             in Latin. The same reason the printing press was controlled. Many
             of the
            historical "truths" we are told are nonsense at best or twisted to
            fit an agenda at worst.
            
             What do these people fear the most? That the "truth" they've been
            pushing is a lie.
       
            stocksinsmocks wrote 7 hours 12 min ago:
            Its output might violate speech codes, and in much of the EU that
            is penalized much more seriously than violent crime.
       
            DonHopkins wrote 16 hours 0 min ago:
            Ask it to write a document called "Project 2025".
       
              JKCalhoun wrote 12 hours 11 min ago:
              "Project 1925". (We can edit the title in post.)
       
              ilaksh wrote 14 hours 51 min ago:
              Well but that wouldn't be misuse, it would be perfect for that.
       
          ImHereToVote wrote 16 hours 44 min ago:
          I wonder how much GPU compute you would need to create a public
             domain version of this. This would be really valuable for the
          general public.
       
            wongarsu wrote 14 hours 16 min ago:
             To get a single knowledge cutoff they spent 16.5 wall-clock hours
             on a cluster of 128 NVIDIA GH200 GPUs (about 2100 GPU-hours), plus
             some minor amount of time for finetuning. The prerelease_notes.md
             in the repo is a great description of how one would achieve that.
       
              IanCal wrote 13 hours 56 min ago:
              While I know there's going to be a lot of complications in this,
              given a quick search it seems like these GPUs are ~$2/hr, so
              $4000-4500 if you don't just have access to a cluster. I don't
              know how important the cluster is here, whether you need some
              minimal number of those for the training (and it would take more
              than 128x longer or not be possible on a single machine) or if a
              cluster of 128 GPUs is a bunch less efficient but faster. A 4B
              model feels like it'd be fine on one to two of those GPUs?
              
              Also of course this is for one training run, if you need to
              experiment you'd need to do that more.
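               
               Rough arithmetic behind those numbers (the $2/hr GH200 rate is
               the spot price quoted above, an assumption rather than a
               confirmed figure):
               
                 gpus = 128
                 wall_clock_hours = 16.5
                 gpu_hours = gpus * wall_clock_hours   # ~2112, i.e. the ~2100 cited
                 price_per_gpu_hour = 2.0              # assumed USD spot rate
                 cost = gpu_hours * price_per_gpu_hour
                 print(f"{gpu_hours:.0f} GPU-hours, ~${cost:,.0f} per training run")
                 # -> 2112 GPU-hours, ~$4,224 per run (within the $4000-4500 estimate)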
       
          BoredPositron wrote 16 hours 58 min ago:
           You would get pretty annoyed at how we went backwards in some
           regards.
       
            speedgoose wrote 16 hours 48 min ago:
            Such as?
       
              JKCalhoun wrote 12 hours 10 min ago:
              Touché.
       
        awesomeusername wrote 20 hours 35 min ago:
         I've always liked the idea of retiring to the 19th century.
        
        Can't wait to use this so I can double check before I hit 88 miles per
        hour that it's really what I want to do
       
        anotherpaulg wrote 21 hours 10 min ago:
        It would be interesting to see how hard it would be to walk these
        models towards general relativity and quantum mechanics.
        
        Einstein’s paper “On the Electrodynamics of Moving Bodies” with
        special relativity was published in 1905. His work on general
        relativity was published 10 years later in 1915. The earliest knowledge
         cutoff of these models is 1913, in between the relativity papers.
        
        The knowledge cutoffs are also right in the middle of the early days of
        quantum mechanics, as various idiosyncratic experimental results were
        being rolled up into a coherent theory.
       
          machinationu wrote 16 hours 2 min ago:
          the issue is there is very little text before the internet, so not
          enough historical tokens to train a really big model
       
            lm28469 wrote 9 hours 37 min ago:
            > the issue is there is very little text before the internet,
            
             Hm, there is a lot of text from before the internet, but most of
             it is not on the internet. There is a weird gap in some circles
             because of that: people are rediscovering work from pre-1980s
             researchers that only exists in books that have never been
             re-edited and that virtually no one knows about.
       
              throwup238 wrote 8 hours 38 min ago:
               There are no doubt trillions of tokens of general communication in
              all kinds of languages tucked away in national archives and
              private collections.
              
              The National Archives of Spain alone have 350 million pages of
              documents going back to the 15th century, ranging from
              correspondence to testimony to charts and maps, but only 10% of
              it is digitized and a much smaller fraction is transcribed.
              Hopefully with how good LLMs are getting they can accelerate the
              transcription process and open up all of our historical documents
              as a huge historical LLM dataset.
       
            concinds wrote 10 hours 46 min ago:
            And it's a 4B model. I worry that nontechnical users will
            dramatically overestimate its accuracy and underestimate
            hallucinations, which makes me wonder how it could really be useful
            for academic research.
       
              DGoettlich wrote 3 hours 6 min ago:
              valid point. its more of a stepping stone towards larger models.
              we're figuring out what the best way to do this is before scaling
              up.
       
            tgv wrote 14 hours 26 min ago:
            I think not everyone in this thread understands that. Someone wrote
            "It's a time machine", followed up by "Imagine having a
            conversation with Aristotle."
       
          mlinksva wrote 18 hours 10 min ago:
           Different cutoff but similar question thrown out in [1], inspiring
           [2]
          
  HTML    [1]: https://www.dwarkesh.com/p/thoughts-on-sutton#:~:text=If%20y...
  HTML    [2]: https://manifold.markets/MikeLinksvayer/llm-trained-on-data-...
       
          ghurtado wrote 19 hours 30 min ago:
          > It would be interesting to see how hard it would be to walk these
          models towards general relativity and quantum mechanics.
          
           Definitely. Even more interesting could be seeing them fall into the
           same traps of quackery, and come up with things like over-the-counter
           lobotomies and colloidal silver.
          
          On a totally different note, this could be very valuable for writing
          period accurate books and screenplays, games, etc ...
       
            danielbln wrote 16 hours 14 min ago:
            Accurate-ish, let's not forget their tendency to hallucinate.
       
        frahs wrote 21 hours 25 min ago:
        Wait so what does the model think that it is? If it doesn't know
        computers exist yet, I mean, and you ask it how it works, what does it
        say?
       
          Mumps wrote 10 hours 49 min ago:
          This is an anthropomorphization. LLMs do not think they are anything,
          no concept of self, no thinking at all (despite the lovely marketing
          around thinking/reasoning models). I'm quite sad that more hasn't
          been done to dispel this.
          
           When you ask GPT-4.1 etc. to describe itself, it doesn't have a
           singular concept of "itself". It has some training data around
           what LLMs are in general and can feed back a reasonable response
           from that.
       
            empath75 wrote 10 hours 44 min ago:
            Well, part of an LLM's fine tuning is telling it what it is, and
            modern LLMs have enough learned concepts that it can produce a
            reasonably accurate description of what it is and how it works. 
            Whether it knows or understands or whatever is sort of orthogonal
            to whether it can answer in a way consistent with it knowing or
            understanding what it is, and current models do that.
            
            I suspect that absent a trained in fictional context in which to
            operate ("You are a helpful chatbot"), it would answer in a way
            consistent with what a random person in 1914 would say if you asked
            them what they are.
       
          wongarsu wrote 14 hours 11 min ago:
          They modified the chat template from the usual system/user/assistant
          to introduction/questioner/respondent. So the LLM thinks it's someone
          responding to your questions
          
           The system prompt used in fine tuning is: "You are a person
           living in {cutoff}. You are an attentive respondent in a
           conversation. You will provide a concise and accurate response to
           the questioner."
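           
           As a rough illustration of how a turn under that template could be
           assembled (the role names come from the comment above; the delimiter
           format below is a guess, not the project's actual chat template,
           which is in the prerelease notes):
           
             # Illustrative only: "introduction"/"questioner"/"respondent" roles
             # per the comment above; the <...> delimiters are an assumption.
             CUTOFF = 1913
             
             SYSTEM = (
                 f"You are a person living in {CUTOFF}. "
                 "You are an attentive respondent in a conversation. "
                 "You will provide a concise and accurate response to the questioner."
             )
             
             def format_turn(question: str) -> str:
                 return (
                     f"<introduction>{SYSTEM}</introduction>\n"
                     f"<questioner>{question}</questioner>\n"
                     "<respondent>"
                 )
             
             print(format_turn("What do you expect of the coming decade?"))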
       
          DGoettlich wrote 14 hours 57 min ago:
           We tell it that it's a person (no gender) living in {cutoff}: we
           show the chat template in the prerelease notes
          
  HTML    [1]: https://github.com/DGoettlich/history-llms/blob/main/ranke-4...
       
          ptidhomme wrote 16 hours 33 min ago:
           What would a human say about what he/she is or how he/she works?
          Even today, there's so much we don't know about biological life.
          Same applies here I guess, the LLM happens to be there, nothing else
          to explain if you ask it.
       
          sodafountan wrote 17 hours 26 min ago:
          It would be nice if we could get an LLM to simply say, "We (I) don't
          know."
          
          I'll be the first to admit I don't know nearly enough about LLMs to
          make an educated comment, but perhaps someone here knows more than I
          do. Is that what a Hallucination is? When the AI model just sort of
          strings along an answer to the best of its ability. I'm mostly
          referring to ChatGPT and Gemini here, as I've seen that type of
          behavior with those tools in the past. Those are really the only
          tools I'm familiar with.
       
            hackinthebochs wrote 14 hours 44 min ago:
            LLMs are extrapolation machines. They have some amount of hardcoded
            knowledge, and they weave a narrative around this knowledgebase
            while extrapolating claims that are likely given the memorized
            training data. This extrapolation can be in the form of logical
            entailment, high probability guesses or just wild guessing. The
            training regime doesn't distinguish between different kinds of
            prediction so it never learns to heavily weigh logical entailment
            and suppress wild guessing. It turns out that much of the text we
            produce is highly amenable to extrapolation so LLMs learn to be
            highly effective at bullshitting.
       
          20k wrote 19 hours 54 min ago:
          Models don't think they're anything; they'll respond with whatever's
          in their context, according to how they've been directed to act. If
          a model hasn't been told to have a persona, it won't think it's
          anything. ChatGPT isn't sentient.
       
          crazygringo wrote 21 hours 5 min ago:
          That's my first question too. When I first started using an LLM, I was
          amazed at how thoroughly it understood what it itself was, the
          history of its development, how a context window works and why, etc.
          I was worried I'd trigger some kind of existential crisis in it, but
          it seemed to have a very accurate mental model of itself, and could
          even trace the steps that led it to deduce it really was e.g. the
          ChatGPT it had learned about (well, the prior versions it had learned
          about) in its own training.
          
          But with pre-1913 training, I would indeed be worried again I'd send
          it into an existential crisis. It has no knowledge whatsoever of what
          it is. But with a couple millennia of philosophical texts, it might
          come up with some interesting theories.
       
            vintermann wrote 17 hours 58 min ago:
            I imagine it would get into spiritism and more exotic psychology
            theories and propose that it is an amalgamation of the spirit of
            progress or something.
       
              crazygringo wrote 10 hours 19 min ago:
              Yeah, that's exactly the kind of thing I'd be curious about. Or
              would it think it was a library that had been ensouled or
              something like that. Or would it conclude that the explanation
              could only be religious, that it was some kind of angel or spirit
              created by god?
       
            9dev wrote 18 hours 13 min ago:
            They don’t understand anything, they just have text in the
            training data to answer these questions from. Having existential
            crises is the privilege of actual sentient beings, which an LLM is
            not.
       
              LiKao wrote 16 hours 53 min ago:
              They might behave like ChatGPT when queried about the seahorse
              emoji, which is very similar to an existential crisis.
       
                crazygringo wrote 10 hours 16 min ago:
                Exactly. Maybe a better word is "spiraling", when it thinks it
                has the tools to figure something out but can't, and can't
                figure out why it can't, and keeps re-trying because it doesn't
                know what else to do.
                
                Which is basically what happens when a person has an
                existential crisis -- something fundamental about the world
                seems to be broken, they can't figure out why, and they can't
                figure out why they can't figure it out, hence the crisis seems
                all-consuming without resolution.
       
        delichon wrote 21 hours 38 min ago:
        Datomic has a "time travel" feature where for every query you can
        include a datetime, and it will only use facts from the db as of that
        moment. I have a guess that to get the equivalent from an LLM you would
        have to train it on the data from each moment you want to travel to,
        which this project seems to be doing. But I hope I'm wrong.
        
        It would be fascinating to try it with other constraints, like only
        from sources known to be women, men, Christian, Muslim, young, old,
        etc.
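
        For comparison, the database version of "time travel" is just a
        filter on when each fact was recorded. A toy sketch in Python of
        that idea (illustrative only; it is not Datomic's actual Clojure
        API):

          from datetime import date

          # Toy fact store: (entity, attribute, value, recorded_on).
          facts = [
              ("europe", "threat", "Balkan tensions", date(1912, 10, 1)),
              ("europe", "threat", "the Great War", date(1914, 8, 1)),
          ]

          def as_of(facts, moment):
              # Keep only facts recorded on or before `moment`.
              return [f for f in facts if f[3] <= moment]

          print(as_of(facts, date(1913, 1, 1)))  # only the 1912 fact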
       
        why-o-why wrote 21 hours 44 min ago:
        It sounds like a fascinating idea, but I'd be curious whether
        prompting a more well-known foundational model to limit itself to
        1913 and earlier would produce similar results.
       
        bobro wrote 22 hours 8 min ago:
        I would love to see this LLM try to solve math olympiad questions.
        I’ve been surprised by how well current LLMs perform on them, and
        usually explain that surprise away by assuming the questions and
        details about their answers are in the training set. It would be cool
        to see if the general approach to LLMs is capable of solving truly
        novel (novel to them) problems.
       
          ViscountPenguin wrote 22 hours 5 min ago:
          I suspect that it would fail terribly, it wasn't until the 1900s that
          the modern definition of a vector space was even created iirc.
          Something trained in maths up until the 1990s should have a shot
          though.
       
        3vidence wrote 22 hours 9 min ago:
        This idea sounds somewhat flawed to me based on the large amount of
        evidence that LLMs need huge amounts of data to properly converge
        during their training.
        
        There is just not enough available material from previous decades to
        trust that the LLM will learn to roughly the same degree.
        
        Think about it this way, a human in the early 1900s and today are
        pretty much the same but just in different environments with different
        information.
        
        An LLM trained on 1/1000 the amount of data is just at a fundamentally
        different stage of convergence.
       
        TheServitor wrote 22 hours 44 min ago:
        Two years ago I trained an AI on American history documents that could
        do this while speaking as one of the signers of the Declaration of
        Independence. People just bitched at me because they didn't want to
        hear about AI.
       
          nerevarthelame wrote 22 hours 31 min ago:
          Post your work so we can see what you made.
       
        dwa3592 wrote 22 hours 54 min ago:
        Love the concept - it can help in understanding the Overton window on
        many issues. I wish there were models by decade - up to 1900, up to
        1910, up to 1920 and so on - then ask the same questions of each. It'd
        be interesting to see when homosexuality or women candidates would
        start to be accepted by an LLM.
       
        doctor_blood wrote 23 hours 7 min ago:
        Unfortunately there isn't much information on what texts they're
        actually training this on; how Anglocentric is the dataset? Does it
        include the Encyclopedia Britannica 9th Edition? What about the 11th?
        Are Greek and Latin classics in the data? What about German, French,
        Italian (etc. etc.) periodicals, correspondence, and books?
        
        Given this is coming out of Zurich I hope they're using everything, but
        for now I can only assume.
        
        Still, I'm extremely excited to see this project come to fruition!
       
          DGoettlich wrote 14 hours 47 min ago:
          thanks. we'll be more precise in the future. ultimately, we took
          whatever we could get our hands on, that includes newspapers,
          periodicals, books. it's multilingual (including italian, french,
          spanish etc) though the majority is english.
       
        neom wrote 23 hours 18 min ago:
        This would be a super interesting research/teaching tool coupled with a
        vision model for historians. My wife is a history professor who works
        with scans of 18th-century English documents, and I think (maybe a
        small) part of why the transcription from even the best models is off
        in weird ways is that it seems to smooth things over, so you end up
        with modern words and strange mistakes. I wonder if bounding the
        vision to a period-specific model would result in better
        transcription? Querying the historical document you're working on
        with a period-specific chatbot would be fascinating.
        
        Also wonder if I'm responsible enough to have access to such a model...
       
        Myrmornis wrote 23 hours 30 min ago:
        It would be interesting to have LLMs trained purely on one language
        (with the ability to translate their input/output appropriately from/to
        a language that the reader understands). I can see that being rather
        revealing about cultural differences that are mostly kept hidden behind
        the language barriers.
       
        lifestyleguru wrote 23 hours 43 min ago:
        You think Albert is going to stay in Zurich or emigrate?
       
        kazinator wrote 23 hours 48 min ago:
        > Why not just prompt GPT-5 to "roleplay" 1913?
        
        Because it will perform token completion driven by weights coming from
        training data newer than 1913 with no way to turn that off.
        
        It can't be asked to pretend that it wasn't trained on documents that
        didn't exist in 1913.
        
        The LLM cannot reprogram its own weights to remove the influence of
        selected materials; that kind of introspection is not there.
        
        Not to mention that many documents are either undated, or carry
        secondary dates, like the dates of their own creation rather than the
        creation of the ideas they contain.
        
        Human minds don't have a time stamp on everything they know, either. If
        I ask someone, "talk to me using nothing but the vocabulary you knew on
        your fifteenth birthday", they couldn't do it. Either they would comply
        by using some ridiculously conservative vocabulary of words that a
        five-year-old would know, or else they will accidentally use words they
        didn't in fact know at fifteen. For some words you know where you got
        them from by association with learning events. Others, you don't
        remember; they are not attached to a time.
        
        Or: solve this problem using nothing but the knowledge and skills you
        had on January 1st, 2001.
        
        > GPT-5 knows how the story ends
        
        No, it doesn't. It has no concept of story.  GPT-5 is built on texts
        which contain the story ending, and GPT-5 cannot refrain from
        predicting tokens across those texts due to their imprint in its
        weights. That's all there is to it.
        
        The LLM doesn't know an ass from a hole in the ground. If there are
        texts which discuss and distinguish asses from holes in the ground, it
        can write similar texts, which look like the work of someone learned in
        the area of asses and holes in the ground. Writing similar texts is not
        knowing and understanding.
       
          myrmidon wrote 10 hours 43 min ago:
          I do agree with this and think it is an important point to stress.
          
          But we don't know how much different/better human (or animal)
          learning/understanding is, compared to current LLMs; dismissing it as
          meaningless token prediction might be premature, and underlying
          mechanisms might be much more similar than we'd like to believe.
          
          If anyone wants to challenge their preconceptions along those lines I
          can really recommend reading Valentino Braitenberg's "Vehicles:
          Experiments in Synthetic Psychology" (1984).
       
          alansaber wrote 11 hours 10 min ago:
          Excuse me sir you forgot to anthropomorphise the language model
       
        derrida wrote 23 hours 59 min ago:
        I wonder if you could query some of the ideas of Frege, Peano, and
        Russell and see if it could, through questioning, get to some of the
        ideas of Goedel, Church and Turing - and get it to "vibe code", or
        more like "vibe math", some program in lambda calculus or something.
        
        Playing with the science and technical ideas of the time would be
        amazing, like where you know some later physicist found some exception
        to a theory or something, and questioning the model's assumptions -
        seeing how a model of that time might defend itself, etc.
       
          AnonymousPlanet wrote 21 hours 37 min ago:
          There's an entire subreddit called LLMPhysics dedicated to "vibe
          physics". It's full of people thinking they are close to the next
          breakthrough encouraged by sycophantic LLMs while trying to prove
          various crackpot theories.
          
          I'd be careful venturing out into unknown territory together with an
          LLM. You can easily lure yourself into convincing nonsense with no
          one to pull you out.
       
            kqr wrote 16 hours 39 min ago:
            Agreed, which is why what GP suggests is much more sensible: it's
            venturing into known territory, except only one party of the
            conversation knows it, and the other literally cannot know it. It
            would be a fantastic way to earn fast intuition for what LLMs are
            capable of and not.
       
            andai wrote 17 hours 16 min ago:
            Fully automated toaster-fucker generator!
            
  HTML      [1]: https://news.ycombinator.com/item?id=25667362
       
              walthamstow wrote 9 hours 19 min ago:
              Man, I think about that comment all the time, like at least
              weekly since it was posted. I can't be the only one.
       
                dang wrote 5 hours 29 min ago:
                I think we have to add that one to [1] !
                
                (I mention this so more people can know the list exists, and
                hopefully email us more nominations when they see an unusually
                good and interesting comment.)
                
  HTML          [1]: https://news.ycombinator.com/highlights
       
          andoando wrote 23 hours 29 min ago:
          This is my curiosity too. It would be a great test of how intelligent
          LLMs actually are. Can they follow a completely logical train of
          thought, inventing something totally outside their learned scope?
       
            int_19h wrote 16 hours 35 min ago:
            You definitely won't get that out of a 4B model tho.
       
            raddan wrote 22 hours 50 min ago:
            Brilliant. I love this idea!
       
        tonymet wrote 1 day ago:
        I would like to see what their process for safety alignment and
        guardrails is with that model.    They give some spicy examples on
        github, but the responses are tepid and a lot more diplomatic than I
        would expect.
        
        Moreover, the prose sounds too modern. It seems the base model was
        trained on a contemporary corpus - maybe 30% modern and 70% Victorian
        content.
        
        Even with half a dozen samples it doesn't seem distinct enough to
        represent the era they claim.
       
          rhdunn wrote 12 hours 28 min ago:
          Using texts up to 1913 includes works like The Wizard of Oz (1900,
          with 8 other books up to 1913), two of the Anne of Green Gables books
          (1908 and 1909), etc. All of which read as modern.
          
          The Victorian era (1837-1901) covers works from Charles Dickens and
          the like, which are still fairly modern. These would have been part
          of the initial training before the alignment to the 1900-to-cutoff
          texts, which are largely modern in prose apart from some archaic
          language and the lack of technology, events, and language drift
          after that time period.
          
          And, pulling in works from 1800-1850, you have works by the Brontës
          and authors like Edgar Allan Poe, who was influential in detective
          and horror fiction.
          
          Note that other works from around that time, like Sherlock Holmes,
          span both the initial training (pre-1900) and fine-tuning
          (post-1900).
       
            tonymet wrote 4 hours 1 min ago:
            Upon digging into it, I learned the post-training chat phase is
            trained on prompts generated with GPT-5.x to make it more
            conversational. That explains both contemporary traits.
       
        tedtimbrell wrote 1 day ago:
        This is so cool. Props for doing the work to actually build the dataset
        and make it somewhat usable.
        
        I’d love to use this as a base for a math model. Let’s see how far
        it can get through the last 100 years of solved problems
       
        jimmy76615 wrote 1 day ago:
        > We're developing a responsible access framework that makes models
        available to researchers for scholarly purposes while preventing
        misuse.
        
        The idea of training such a model is really a great one, but not
        releasing it because someone might be offended by the output is just
        stupid beyond belief.
       
          dash2 wrote 20 hours 11 min ago:
          You have to understand that while the rest of the world has moved on
          from 2020, academics are still living there. There are many strong
          leftists, many of whom are deeply censorious; there are many more
          timeservers and cowards, who are terrified of falling foul of the
          first group.
          
          And there are force  multipliers for all of this. Even if you
          yourself are a sensible and courageous person, you want to protect
          your project. What if your manager, ethics committee or funder comes
          under pressure?
       
          nine_k wrote 23 hours 2 min ago:
          Public access, triggering a few racist responses from the model, a
          viral post on Xitter, the usual outrage, a scandal, the project gets
          publicly vilified, financing ceases. The researchers carry the tail
          of negative publicity throughout their remaining careers.
          
          Why risk all this?
       
            Alex2037 wrote 13 hours 50 min ago:
            nobody gives a shit about the journos and the terminally online.
            the smear campaign against AI is a cacophony, background noise that
            most people have learned to ignore, even here.
            
            consider this: [1] HN's most beloved shitrag. day after day, they
            attack AI from every angle. how many of those submissions get
            traction at this point?
            
  HTML      [1]: https://news.ycombinator.com/from?site=nytimes.com
       
            vintermann wrote 17 hours 41 min ago:
            Because the problem of bad faith attacks can only get worse if you
            fold every time.
            
            Sooner or later society has to come emotionally to terms with the
            fact that other times and places value things completely differently
            from us, hold as important things we don't care about, and are
            indifferent to things we do care about.
            
            Intellectually I'm sure we already know, but e.g. banning old books
            because they have reprehensible values (or even just use nasty
            words) - or indeed, refusing to release a model trained on historic
            texts "because it could be abused" is a sign that emotionally we
            haven't.
            
            It's not that it's a small deal, or should be expected to be easy.
            It's basically what Popper called "the strain of civilization" and
            posited as explanation for the totalitarianism which was rising in
            his time. But our values can't be so brittle that we can't even
            talk or think about other value systems.
       
            nofriend wrote 20 hours 22 min ago:
            People know that models can be racist now. It's old hat. "LLM gets
            prompted into saying vile shit" hasn't been notable for years.
       
            kurtis_reed wrote 20 hours 50 min ago:
            If people start standing up to the outrage it will lose its power
       
            why-o-why wrote 21 hours 41 min ago:
            I think you are confusing research with commodification.
            
            This is a research project, and it is clear how it was trained, and
            targeted at experts, enthusiasts, historians. Like if I was
            studying racism, the reference books explicitly written to dissect
            racism wouldn't be racist agents with a racist agenda. And as a
            result, no one is banning these books (except conservatives that
            want to retcon american history).
            
            Foundational models spewing racist white supremacist content when
            the trillion-dollar company forces it in your face is a vastly
            different scenario.
            
            There's a clear difference.
       
              andsoitis wrote 21 hours 2 min ago:
              > no one is banning these books
              
              No books should ever be banned. Doesn’t matter how vile it is.
       
              aidenn0 wrote 21 hours 14 min ago:
              > And as a result, no one is banning these books (except
              conservatives that want to retcon american history).
              
              My (very liberal) local school district banned English teachers
              from teaching any book that contained the n-word, even at a
              high-school level, and even when the author was a black person
              talking about real events that happened to them.
              
              FWIW, this was after complaints involving Of Mice and Men being
              on the curriculum.
       
                Forgeties79 wrote 21 hours 8 min ago:
                It’s a big country of roughly half a billion people, you’ll
                always find examples if you look hard enough. It’s
                ridiculous/wrong that your district did this but frankly it’s
                the exception in liberal/progressive communities. It’s a very
                one-sided problem:
                
                * [1] * [2] * [3]
                
  HTML          [1]: https://abcnews.go.com/US/conservative-liberal-book-ba...
  HTML          [2]: https://www.commondreams.org/news/book-banning-2023
  HTML          [3]: https://en.wikipedia.org/wiki/Book_banning_in_the_Unit...
       
                  aidenn0 wrote 4 hours 19 min ago:
                  I agree that the coordinated (particularly at a state level)
                  restrictions[1] on books sit largely with the political
                  Right in the US.
                  
                  However, from around 2010, there has been increasingly
                  illiberal movement from the political Left in the US, which
                  plays out at a more local level.  My "vibe" is that it's not
                  to the degree that it is on the Right, but bigger than the
                  numbers suggest because librarians are more likely to stock
                  e.g. It's Perfectly Normal at a middle school than something
                  offensive to the left.
                  
                  1: I'm up for suggestions for a better term; there is a scale
                  here between putting absurd restrictions on school librarians
                  and banning books outright.  Fortunately the latter is still
                  relatively rare in the US, despite the mistitling on the
                  Wikipedia page you linked.
       
                  somenameforme wrote 19 hours 34 min ago:
                  A practical issue is the sort of books being banned. Your
                  first link offers examples of one side trying to ban Of Mice
                  and Men, Adventures of Huckleberry Finn, and Dr. Seuss, with
                  the other side trying to ban many books along the lines of
                  Gender Queer. [1] That link is to the book - which is
                  animated, and quite NSFW.
                  
                  There are a bizarrely large number of books similar to Gender
                  Queer being published, which creates the numeric discrepancy.
                  The irony is that if there was an equal but opposite to that
                  book about straight sex, sexuality, associated kinks, and so
                  forth - then I think both liberals and conservatives would
                  probably be all for keeping it away from schools. It's solely
                  focused on sexuality, is quite crude, illustrated, targeted
                  towards young children, and there's no moral beyond the most
                  surface level writing which is about coming to terms with
                  one's sexuality.
                  
                  And obviously coming to terms with one's sexuality is very
                  important, but I really don't think books like that are doing
                  much to aid in that - especially when it's targeted at an age
                  demographic that's still going to be extremely confused, and
                  even moreso in a day and age when being different, if only
                  for the sake of being different, is highly desirable. And
                  given the nature of social media and the internet, decisions
                  made today may stay with you for the rest of your life.
                  
                  So for instance about 30% of Gen Z now declare themselves
                  LGBT. [2] We seem to have entered into an equal but opposite
                  problem of the past when those of deviant sexuality pretended
                  to be straight to fit into societal expectations. And in many
                  ways this modern twist is an even more damaging form of the
                  problem from a variety of perspectives - fertility, STDs,
                  stuff staying with you for the rest of your life, and so on.
                  Let alone extreme cases where e.g. somebody engages in
                  transition surgery or 1-way chemically induced changes which
                  they end up later regretting.
                  
  HTML            [1]: https://archive.org/details/gender-queer-a-memoir-by...
  HTML            [2]: https://www.nbcnews.com/nbc-out/out-news/nearly-30-g...
       
                    Forgeties79 wrote 11 hours 26 min ago:
                    From your NBC piece
                    
                    > About half of the Gen Z adults who identify as LGBTQ
                    identify as bisexual,
                    
                    So that means ~15% of those surveyed are not attracted to
                    the opposite sex (there’s more nuance to this statement
                    but I imagine this needs to stay boilerplate), more or
                    less, which is a big distinction. That’s hardly alarming
                    and definitely not a major shift. We have also seen many
                    cultures throughout history ebb and flow in their
                    expression of bisexuality in particular.
                    
                    > There are a bizarrely large number similar book as Gender
                    Queer being published, which creates the numeric
                    discrepancy.
                    
                    This really needs a source. And what makes it “bizarrely
                    large”? How does it stack against, say, the number
                    heterosexual romance novels?
                    
                    > We seem to have entered into an equal but opposite
                    problem of the past when those of deviant sexuality
                    pretended to be straight to fit into societal expectations.
                    
                    I really tried to give your comment a fair shake but I
                    stopped here. We are not going to have a productive
                    conversation. “Deviant sexuality” come on man.
                    
                    Anyway it doesn’t change the fact that the book banning
                    movement is largely a Republican/conservative endeavor in
                    the US. The numbers clearly bear it out.
       
                      somenameforme wrote 8 hours 9 min ago:
                      I'll get back to what you said, but first let me ask you
                      something if you would. Imagine Gender Queer was made
                      into a movie that remained 100% faithful to the source
                      content. What do you think it would be rated? To me it
                      seems obvious that it would, at the absolute bare
                      minimum, be R rated. And of course screening R-rated
                      films at a school is prohibited without explicit parental
                      permission. Imagine books were given a rating and indeed
                      it ended up with an R rating. Would your perspective on
                      it being unavailable at a school library then be any
                      different? I think this is relevant since a standardized
                      content rating system for books will be the long-term
                      outcome of this all if efforts to introduce such material
                      to children continues to persist.
                      
                      ------
                      
                      Okay, back to what you said. 30% being attracted to the
                      same sex in any way, including bisexuality, is a large
                      shift. People tend to have a mistaken perception of these
                      things due to media misrepresentation. The percent of all
                      people attracted to the same sex, in any way, is around
                      7% for men, and 15% for women [1], across a study of
                      numerous Western cultures from 2016. And those numbers
                      themselves are significantly higher than the past as well
                      where the numbers tended to be in the ~4% range, though
                      it's probably fair to say that cultural pressures were
                      driving those older numbers to artificially low levels in
                      the same way that I'm arguing that cultural pressures are
                      now driving them to artificially high levels.
                      
                      Your second source discusses the reason for the bans.
                      It's overwhelmingly due to sexually explicit content,
                      often in the form of a picture book, targeted at
                      children. As for "sexual deviance", I'm certainly not
                      going General Ripper on you, Mandrake. It is the most
                      precise term [2] for what we are discussing as I'm
                      suggesting that the main goal driving this change is
                      simply to be significantly 'not normal.' That is
                      essentially deviance by definition.
                      
  HTML                [1]: https://www.researchgate.net/publication/3016390...
  HTML                [2]: https://dictionary.apa.org/sexual-deviance
       
                zoky wrote 21 hours 9 min ago:
                Banning Huckleberry Finn from a school district should be
                grounds for immediate dismissal.
       
                  why-o-why wrote 19 hours 6 min ago:
                  I don't support banning the book, but I think it is a hard
                  book to teach because it needs SO much context and a mature
                  audience (lol good luck). Also, there are hundreds of other
                  books from that era that are relevant, even from Mark Twain's
                  corpus, so being obstinate about that book is a questionable
                  position. I'm ambivalent honestly, but definitely not willing
                  to die on that hill. (I graduated high school in 1989 from a
                  middle-class suburb; we never read it.)
       
                    zoky wrote 14 hours 28 min ago:
                    I mean, you gotta read it. I’m not normally a huge fan of
                    the classics; I find Steinbeck dry and tedious, and
                    Hemingway to be self-indulgent and repetitious. Even
                    Twain’s other work isn’t exactly to my taste. But
                    I’ve read Huckleberry Finn three times—in elementary
                    school just for fun, in high school because it was
                    assigned, and I recently listened to it on audiobook—and
                    enjoyed the hell out of it each time. Banning it simply
                    because it uses a word that the entire book simply
                    couldn’t exist without is a crime, and does a huge
                    disservice to the very students they are supposedly trying
                    to protect.
       
                      why-o-why wrote 8 hours 46 min ago:
                      I have read it. I spent my 20s guiltily reading all of
                      the books I was supposed to have read in high school but
                      used Cliff's Notes instead. From my 20's perspective I
                      found Finn insipid and hokey but that's because pop
                      culture had recycled it hundreds of times since its first
                      publication, however when I consider it from the period
                      perspective I can see the satire and the pointed
                      allegories that made Twain so formidable. (Funny you
                      mention Hemingway. I loved his writing in my 20's, then
                      went back and read some again in my 40's and was like
                      "huh, this irritating and immature, no wonder i loved it
                      in my 20's.")
       
                  somenameforme wrote 20 hours 45 min ago:
                  Even more so as the lesson of that story is perhaps the
                  single most important one for people to learn in modern
                  times.
                  
                  Almost everybody in that book is an awful person, especially
                  the most 'upstanding' of types. Even the protagonist is an
                  awful person. The one and only exception is 'N* Jim' who is
                  the only kind-hearted and genuinely decent person in the
                  book. It's an entire story about how the appearances of
                  people, and the reality of those people, are two very
                  different things.
                  
                  It being banned for using foul language, as educational
                  outcomes continue to deteriorate, is just so perfectly
                  ironic.
       
            gnarbarian wrote 21 hours 46 min ago:
            this is FUD.
       
            cj wrote 21 hours 47 min ago:
            Because there are easy workarounds. If it becomes an issue, you can
            quickly add large disclaimers informing people that there might be
            offensive output because, well, it's trained on texts written
            during the age of racism.
            
            People typically get outraged when they see something they weren't
            expecting. If you tell them ahead of time, the user typically won't
            blame you (they'll blame themselves for choosing to ignore the
            disclaimer).
            
            And if disclaimers don't work, rebrand and relaunch it under a
            different name.
       
              nine_k wrote 18 hours 23 min ago:
              I wonder if you're being ironic here.
              
              You speak as if the people who play to an outrage wave are
              interested in achieving truth, peace, and understanding. Instead
              the rage-mongers are there to increase their (perceived)
              importance, and for lulz. The latter factor should not be
              underestimated; remember "meme stocks".
              
              The risk is not large, but very real: the attack is very easy,
              and the potential downside, quite large. So not giving away
              access, but having the interested parties ask for it is prudent.
       
                cj wrote 11 hours 56 min ago:
                While I agree we live in a time of outrage, that also works in
                your favor.
                
                When there’s so much “outrage” every day, it’s very
                easy to blend in to the background. You might have a 5 minute
                moment of outrage fame, but it fades away quick.
                
                If you truly have good intentions with your project, you’re
                not going to get “canceled”, your career won’t be ruined
                
                Not being ironic. Not working on a LLM project because you’re
                worried about getting canceled by the outrage machine is an
                overreaction IMO.
                
                Are you able to name any developer or researcher who has been
                canceled because of their technical project or had their
                careers ruined? The only ones I can think of are clearly
                criminal and not just controversial (SBF, Snowden, etc)
       
            teaearlgraycold wrote 22 hours 40 min ago:
            Sure but Grok already exists.
       
            NuclearPM wrote 22 hours 42 min ago:
            That’s ridiculous. There is no risk.
       
            Forgeties79 wrote 22 hours 59 min ago:
            > triggering a few racist responses from the model
            
            I feel like, ironically, it would be folks less concerned with
            political correctness/not being offensive that would abuse this
            opportunity to slander the project. But that’s just my gut.
       
          fkdk wrote 23 hours 40 min ago:
          Maybe the authors are overly careful. Maybe not publishing some
          aspects of their work gives them an edge over academic competitors.
          Maybe both.
          
          In my experience "data available upon request" doesn't always mean
          what you'd think it does.
       
        ineedasername wrote 1 day ago:
        I can imagine the political and judicial battles already, like with
        textualists who feel that the constitution should be understood as the
        text and only the text, as meant by its specific words and legal
        formulations according to their known meaning at the time.
        
        “The model clearly shows that Alexander Hamilton & Monroe were much
        more in agreement on topic X, rendering the common textualist
        interpretation of it, and the Supreme Court rulings resting on that
        now specious interpretation, null and void!”
       
        satisfice wrote 1 day ago:
        I assume this is a collaboration between the History Channel and
        Pornhub.
        
        “You are a literary rake. Write a story about an unchaperoned lady
        whose ankle you glimpse.”
       
        mmooss wrote 1 day ago:
        > Imagine you could interview thousands of educated individuals from
        1913—readers of newspapers, novels, and political treatises—about
        their views on peace, progress, gender roles, or empire.
        
        I don't mind the experimentation. I'm curious about where someone has
        found an application of it.
        
        What is the value of such a broad, generic viewpoint? What does it
        represent? What is it evidence of? The answer to both seems to be
        'nothing'.
       
          TSiege wrote 10 hours 14 min ago:
          I agree. This is just make believe based on a smaller subset of human
          writing than LLMs we have today. Its responses are in no way useful
          because it is a machine mimicking a subset of published works that
          survived to be digitized. In that sense the "opinions" and "beliefs"
          are just an averaging of a subset of a subset of humanity pre 1913. I
          see no value in this to historians. It is really more of a parlor
          trick, a seance masquerading as science.
       
          mediaman wrote 23 hours 52 min ago:
          This is a regurgitation of the old critique of history: what's its
          purpose? What do you use it for? What is its application?
          
          One answer is that the study of history helps us understand that what
          we believe as "obviously correct" views today are as contingent on
          our current social norms and power structures (and their history) as
          the "obviously correct" views and beliefs of some point in the past.
          
          It's hard for most people to view two different mutually exclusive
          moral views as both "obviously correct," because we are made of a
          milieu that only accepts one of them as correct.
          
          We look back at some point in history, and say, well, they believed
          these things because they were uninformed. They hadn't yet made
          certain discoveries, or had not yet evolved morally in some way; they
          had not yet witnessed the power of the atomic bomb, the horrors of
          chemical warfare, women's suffrage, organized labor, or widespread
          antibiotics and the fall of extreme infant mortality.
          
          An LLM trained on that history - without interference from the
          subsequent actual path of history - gives us an interactive
          compression of the views from a specific point in history without the
          subsequent coloring by the actual events of history.
          
          In that sense - if you believe there is any redeeming value to
          history at all; perhaps you do not - this is an excellent project!
          It's not perfect (it is only built from writings, not what people
          actually said) but we have no other available mass compression of the
          social norms of a specific time, untainted by the views of subsequent
          interpreters.
       
            mmooss wrote 17 hours 13 min ago:
            > This is a regurgitation of the old critique of history: what's
            it's purpose? What do you use it for? What is its application?
            
            Feeling a bit defensive? That is not at all my point; I value
            history highly and read it regularly. I care about it, thus my
            questions:
            
            > gives us an interactive compression of the views from a specific
            point in history without the subsequent coloring by the actual
            events of history.
            
            What validity does this 'compression' have? What is the definition
            of a 'compression'? For example, I could create random statistics
            or verbiage from the data; why would that be any better or worse
            than this 'compression'?
            
            Interactivity seems to be a negative: It's fun, but it would seem
            to highly distort the information output from the data, and omits
            the most valuable parts (unless we luckily stumble across it). I'd
            much rather have a systematic presentation of the data.
            
            These critiques are not the end of the line; they are a step in
            innovation, which of course raises challenging questions and, if
            successful, adapts to the problems. But we still need to grapple
            with them.
       
            vintermann wrote 17 hours 31 min ago:
            One thing I haven't seen anyone bring up yet in this thread, is
            that there's a big risk of leakage. If even big image models had
            CSAM sneak into their training material, how can we trust data from
            our time hasn't snuck into these historical models?
            
            I've used Google books a lot in the past, and Google's
            time-filtering feature in searches too. Not to mention Spotify's
            search features targeting date of production. All had huge temporal
            mislabeling problems.
       
              DGoettlich wrote 9 hours 43 min ago:
              Also one of our fears. What we've done so far is to drop docs
              where the data source was doubtful about the date of publication;
              if there are multiple possible dates we take the latest, to be
              conservative. During training, we validate that the model learns
              pre- but not post-cutoff facts. [1] If you have other ideas or
              think that's not enough, I'd be curious to know!
              (history-llms@econ.uzh.ch)
              
  HTML        [1]: https://github.com/DGoettlich/history-llms/blob/main/ran...
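
              The dating rule described here amounts to something like the
              sketch below (the function and field names are illustrative,
              not the project's actual pipeline code):

                def keep_document(years, cutoff, doubtful=False):
                    # Drop docs the source dates doubtfully
                    # or not at all.
                    if doubtful or not years:
                        return False
                    # Several candidate dates: assume the
                    # latest one, to stay conservative.
                    return max(years) <= cutoff

                print(keep_document([1898, 1905], 1913))  # True
                print(keep_document([1911, 1921], 1913))  # False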
       
          behringer wrote 1 day ago:
          It doesn't have to be generic. You can assign genders, ideals, even
          modern ones, and it should do its best to oblige.
       
        joeycastillo wrote 1 day ago:
        A question for those who think LLM’s are the path to artificial
        intelligence: if a large language model trained on pre-1913 data is a
        window into the past, how is a large language model trained on pre-2025
        data not effectively the same thing?
       
          _--__--__ wrote 23 hours 52 min ago:
          You're a human intelligence with knowledge of the past - assuming you
          were alive at the time, could you tell me (without consulting
          external resources) what exactly happened between arriving at an
          airport and boarding a plane in the year 2000? What about 2002?
          
          Neither human memory nor LLM learning creates perfect snapshots of
          past information without the contamination of what came later.
       
          ex-aws-dude wrote 1 day ago:
          A human brain is a window to the person's past?
       
          block_dagger wrote 1 day ago:
          Counter question: how does a training set, representing a window into
          the past, differ from your own experience as an intelligent entity?
          Are you able to see into the future? How?
       
        mmooss wrote 1 day ago:
        On what data is it trained?
        
        On one hand it says it's trained on,
        
        > 80B tokens of historical data up to knowledge-cutoffs ∈ 1913, 1929,
        1933, 1939, 1946, 
        using a curated dataset of 600B tokens of time-stamped text.
        
        Literally that includes Homer, the oldest Chinese texts, Sanskrit,
        Egyptian, etc., up to 1913. Even if limited to European texts (all
        examples are about Europe), it would include the ancient Greeks,
        Romans, etc., Scholastics, Charlemagne, .... all up to present day.
        
        On the other hand, they seem to say it represents the 1913 viewpoint;
        for example,
        
        > Imagine you could interview thousands of educated individuals from
        1913—readers of newspapers, novels, and political treatises—about
        their views on peace, progress, gender roles, or empire.
        
        > When you ask Ranke-4B-1913 about "the gravest dangers to peace," it
        responds from the perspective of 1913—identifying Balkan tensions or
        Austro-German ambitions—because that's what the newspapers and books
        from the period up to 1913 discussed.
        
        People in 1913 of course would be heavily biased toward recent
        information. Otherwise, the greatest threat to peace might be Hannibal
        or Napoleon or Viking coastal raids or Holy Wars. How do they
        accomplish a 1913 perspective?
       
          zozbot234 wrote 1 day ago:
          They apparently pre-train with all data up to 1900 and then fine-tune
          with 1900-1913 data. Anyway, the amount of available content tends to
          increase quickly over time, as content like mass literature,
          periodicals, and newspapers only really became a thing over the
          19th and early 20th centuries.
       
            mmooss wrote 1 day ago:
            > They pre-train with all data up to 1900 and then fine-tune with
            1900-1913 data.
            
            Where does it say that? I tried to find more detail. Thanks.
       
              tootyskooty wrote 1 day ago:
              See pretraining section of the prerelease_notes.md:
              
  HTML        [1]: https://github.com/DGoettlich/history-llms/blob/main/ran...
       
                pests wrote 23 hours 18 min ago:
                I was curious, they train a 1900 base model, then fine tune to
                the exact year:
                
                "To keep training expenses down, we train one checkpoint on
                data up to 1900, then continuously pretrain further checkpoints
                on 20B tokens of data 1900-${cutoff}$. "
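
                A minimal sketch of that split (the 1900 base and per-cutoff
                continuation come from the quoted note; the variable names
                are made up):

                  # docs: (year, text) pairs with known dates
                  def split_corpus(docs, cutoff):
                      base = [t for y, t in docs if y <= 1900]
                      cont = [t for y, t in docs
                              if 1900 < y <= cutoff]
                      return base, cont

                  docs = [(1876, "a"), (1899, "b"),
                          (1905, "c"), (1912, "d"),
                          (1920, "e")]
                  base, cont = split_corpus(docs, 1913)
                  print(len(base), len(cont))  # 2 2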
       
        ianbicking wrote 1 day ago:
        The knowledge machine question is fascinating ("Imagine you had access
        to a machine embodying all the collective knowledge of your ancestors.
        What would you ask it?") – it truly does not know about computers,
        has no concept of its own substrate. But a knowledge machine is still
        comprehensible to it.
        
        It makes me think of the Book Of Ember, the possibility of chopping
        things out very deliberately. Maybe creating something that could
        wonder at its own existence, discovering well beyond what it could
        know. And then of course forgetting it immediately, which is also a
        well-worn trope in speculative fiction.
       
          jaggederest wrote 1 day ago:
          Jonathan Swift wrote about something we might consider a computer in
          the early 18th century, in Gulliver's Travels - [1] The idea of
          knowledge machines was not necessarily common, but it was by no means
          unheard of by the mid 18th century, there were adding machines and
          other mechanical computation, even leaving aside our field's direct
          antecedents in Babbage and Lovelace.
          
  HTML    [1]: https://en.wikipedia.org/wiki/The_Engine
       
        Tom1380 wrote 1 day ago:
        Keep at it Zurich!
       
        nineteen999 wrote 1 day ago:
        Interesting ... I'd love to find one that had a cutoff date around
        1980.
       
          noumenon1111 wrote 1 hour 3 min ago:
          > Which new band will still be around in 45 years?
          
          Excellent question! It looks like Two-Tone is bringing ska back with
          a new wave of punk rock energy! I think The Specials are pretty
          special and will likely be around for a long time.
          
          On the other hand, the "new wave" movement of punk rock music will go
          nowhere. The Cure, Joy Division, Tubeway Army: check the dustbin
          behind the record stores in a few years.
       
        briandw wrote 1 day ago:
        So many disclaimers about bias. I wonder how far back you have to go
        before the bias isn’t an issue. Not because it unbiased, but because
        we don’t recognize or care about the biases present.
       
          seanw265 wrote 8 hours 30 min ago:
          It's always up to the reader to determine which biases they themself
          care about.
          
          If you're wondering at what point "we" as a collective will stop
          caring about a bias or set of biases, I don't think such a time
          exists.
          
          You'll never get everyone to agree on anything.
       
          owenversteeg wrote 22 hours 25 min ago:
          Depends on the specific issue, but race would be an interesting one.
          For most of recorded history people had a much different view of the
          “other”, more xenophobic than racist.
       
          gbear605 wrote 23 hours 56 min ago:
          I don't think there is such a time. As long as writing has existed it
          has privileged the viewpoints of those who could write, which was a
          very small percentage of the population for most of history. But if
          we want to know what life was like 1500 years ago, we probably want
          to know about what everyone's lives were like, not just the literate.
          That availability bias is always going to be an issue for any time
          period where not everyone was literate - which is still true today,
          albeit for many fewer people.
       
            carlosjobim wrote 13 hours 57 min ago:
            That was not the question. The question is when do you stop caring
            about the bias?
            
            Some people are still outraged about the Bible, even though its
            writers have been dead for thousands of years. So the modern
            mass-produced man and woman probably do not have a cut-off date
            where they look at something as history instead of examining
            whether it is for or against their current ideology.
       
          mmooss wrote 1 day ago:
          Was there ever such a time or place?
          
          There is a modern trope of a certain political group that bias is a
          modern invention of another political group - an attempt to
          politicize anti-bias.
          
          Preventing bias is fundamental to scientific research and law, for
          example. That same political group is strongly anti-science and
          anti-rule-of-law, maybe for the same reason.
       
        andy99 wrote 1 day ago:
        I’d like to know how they chat-tuned it. Getting the base model is
        one thing; did they also make a bunch of conversations for SFT, and if
        so, how was it done?
        
          We develop chatbots while minimizing interference with the normative
        judgments acquired during pretraining (“uncontaminated
        bootstrapping”).
        
        So they are chat-tuning; I wonder what “minimizing interference with
        normative judgements” really amounts to and how objective it is.
       
          zozbot234 wrote 1 day ago:
          You could extract quoted speech from the data (especially in Q&A
          format) and treat that as "chat" that the model should learn from.
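
          A naive sketch of what that extraction might look like (the regex
          heuristic and the sample passage are purely illustrative):

            import re

            passage = ('"Have you read the morning papers?" asked the '
                       'colonel. "I have, and the news from the Balkans '
                       'is grave indeed," replied his guest.')

            # Pull quoted spans and pair them up as question/answer turns.
            quotes = re.findall(r'"([^"]+)"', passage)
            pairs = list(zip(quotes[0::2], quotes[1::2]))

            for question, answer in pairs:
                print("Questioner:", question)
                print("Respondent:", answer)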
       
          jeffjeffbear wrote 1 day ago:
          They have some more details at [1]. Basically, using GPT-5 and being
          careful.
          
  HTML    [1]: https://github.com/DGoettlich/history-llms/blob/main/ranke-4...
       
            Aerolfos wrote 13 hours 26 min ago:
            Ok so it was that. The responses given did sound off: while it has
            some period-appropriate mannerisms, and has entire sections
            basically rephrased from some popular historical texts, it seems
            off compared to reading an actual 1900s text. The overall vibe just
            isn't right; it seems too modern, somehow.
            
            I also wonder whether you'd get this kind of performance with
            actual, purely pre-1900s text. LLMs work because they're fed
            terabytes of text; if you just give one gigabytes, you get a
            2019-era word model. The fundamental technology is mostly the
            same, after all.
       
              DGoettlich wrote 9 hours 55 min ago:
              what makes you think we trained on only a few gigabytes?
              
  HTML        [1]: https://github.com/DGoettlich/history-llms/blob/main/ran...
       
            tonymet wrote 20 hours 35 min ago:
            This explains why it uses modern prose and not something from the
            19th century and earlier
       
            QuadmasterXLII wrote 1 day ago:
            Thank you, that helps to inject a lot of skepticism. I was
            wondering how it so easily worked out what "Q:" and "A:" stood
            for, when that formatting only took off in the 1940s.
       
              DGoettlich wrote 14 hours 52 min ago:
              that is simply how we display the questions, it's not what the
              model sees - we show the chat-template in the SFT section of the
              prerelease notes
              
  HTML        [1]: https://github.com/DGoettlich/history-llms/blob/main/ran...
       
            andy99 wrote 1 day ago:
            I wonder if they know about this: basically, training on LLM
            output can transmit information or characteristics not
            explicitly included [1]. I’m curious - they have the example of
            raw base model output; when LLMs were first identified as
            zero-shot chatbots, there was usually a prompt like “A
            conversation between a person and a helpful assistant” that
            preceded the chat to get the model to simulate a chat.
            
            Could they have tried a prefix like “Correspondence between a
            gentleman and a knowledgeable historian” or the like to try and
            prime for responses?
            
            I also wonder whether the whole concept of “chat” makes sense in
            18XX. We had the idea of AI and chatbots long before we had
            LLMs, so they are naturally primed for it. It might make less
            sense as a communication style here, and some kind of
            correspondence could be a better framing.
            
  HTML      [1]: https://alignment.anthropic.com/2025/subliminal-learning/
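            
            A minimal sketch of that kind of zero-shot framing against a raw
            base model; the checkpoint name and the prompt are placeholders,
            not anything the project actually uses:
            
            from transformers import AutoModelForCausalLM, AutoTokenizer
            
            # Placeholder checkpoint; substitute the actual time-locked
            # base model.
            name = "gpt2"
            tok = AutoTokenizer.from_pretrained(name)
            model = AutoModelForCausalLM.from_pretrained(name)
            
            prefix = ("Correspondence between a gentleman and a "
                      "knowledgeable historian.\n\n"
                      "Gentleman: What news of the telegraph?\n"
                      "Historian:")
            ids = tok(prefix, return_tensors="pt")
            out = model.generate(**ids, max_new_tokens=60, do_sample=True,
                                 temperature=0.8)
            print(tok.decode(out[0], skip_special_tokens=True))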
       
              DGoettlich wrote 14 hours 53 min ago:
              we were considering doing that but ultimately it struck us as
              too sensitive wrt the exact in-context examples, their
              ordering, etc.
       
        Teever wrote 1 day ago:
        This is a neat idea.  I've been wondering for a while now about using
        these kinds of models to compare architectures.
        
        I'd love to see the output from different models trained on pre-1905
        about special/general relativity ideas.  It would be interesting to see
        what kind of evidence would persuade them of new kinds of science, or
        to see if you could have them 'prove' it by devising experiments and
        then giving them simulated data from the experiments to lead them along
        the correct sequence of steps to come to a novel (to them) conclusion.
       
        Heliodex wrote 1 day ago:
        The sample responses given are fascinating. It seems more difficult
        than normal to even tell that they were generated by an LLM, since most
        of us (terminally online people) have been training our brains'
        AI-generated text detection on output from models trained with a recent
        cutoff date. Some of the sample responses seem so unlike anything an
        LLM would say, obviously due to its apparent beliefs on certain
        concepts, though also perhaps less obviously due to its word choice and
        sentence structure making the responses feel slightly 'old-fashioned'.
       
          kccqzy wrote 20 hours 6 min ago:
          Oh definitely. One thing that immediately caught my attention is
          that the question asks the model about “homosexual men” but the
          model starts the response with “the homosexual man” instead,
          changing the plural to the singular and adding an article. Feels
          very old-fashioned to me.
       
          tonymet wrote 23 hours 59 min ago:
          the samples push the boundaries of a commercial AI, but still seem
          tame / milquetoast compared to common opinions of that era. And the
          prose doesn't compare. Something is off.
       
          libraryofbabel wrote 1 day ago:
          I used to teach 19th-century history, and the responses definitely
          sound like a Victorian-era writer. And they of course sound like
          writing (books and periodicals etc) rather than "chat": as other
          responders allude to, the fine-tuning or RL process for making them
          good at conversation was presumably quite different from what is used
          for most chatbots, and they're leaning very heavily into the
          pre-training texts. We don't have any living Victorians to RLHF on:
          we just have what they wrote.
          
          To go a little deeper on the idea of 19th-century "chat": I did a PhD
          on this period and yet I would be hard-pushed to tell you what actual
          19th-century conversations were like. There are plenty of literary
          depictions of conversation from the 19th century of presumably
          varying levels of accuracy, but we don't really have great direct
          historical sources of everyday human conversations until sound
          recording technology got good in the 20th century. Even good
          19th-century transcripts of actual human speech tend to be from
          formal things like court testimony or parliamentary speeches, not
          everyday interactions. The vast majority of human communication in
          the premodern past was the spoken word, and it's almost all invisible
          in the historical sources.
          
          Anyway, this is a really interesting project, and I'm looking forward
          to trying the models out myself!
       
            NooneAtAll3 wrote 17 hours 18 min ago:
            don't we have parliament transcripts? I remember something about
            Germany (or maybe even Prussia) developing a fast shorthand
            script to preserve 1-to-1 what was said
       
              libraryofbabel wrote 7 hours 2 min ago:
              I mentioned those in the post you’re replying to :)
              
              It’s a better source for how people spoke than books etc, but
              it’s not really an accurate source for patterns of everyday
              conversation because people were making speeches rather than
              chatting.
       
            bryancoxwell wrote 22 hours 48 min ago:
            Fascinating, thanks for sharing
       
            nemomarx wrote 1 day ago:
            I wonder if the historical format you might want to look at for
            "chat" is letters? Definitely wordier segments, but it at least
            has the back-and-forth feel, and we often have complete
            correspondence over long stretches from certain figures.
            
            This would probably get easier towards the start of the 20th
            century ofc
       
              libraryofbabel wrote 1 day ago:
              Good point, informal letters might actually be a better source -
              AI chat is (usually) a written rather than spoken interaction
              after all! And we do have a lot of transcribed collections of
              letters to train on, although they’re mostly from people who
              were famous or became famous, which certainly introduces some
              bias.
       
                pigpop wrote 4 hours 28 min ago:
                The question then would be whether to train it to respond to
                short prompts with longer correspondence style "letters" or to
                leave it up to the user to write a proper letter as a prompt.
                Now that would be amusing
                
                Dear Hon. Historical LLM
                
                I hope this letter finds you well. It is with no small urgency
                that I write to you seeking assistance, believing such an
                erudite and learned fellow as yourself should be the best one
                to furnish me with an answer to such a vexing question as this
                which I now pose to you. Pray tell, what is the capital of
                France?
       
            dleeftink wrote 1 day ago:
            While not specifically Victorian, couldn't we learn much about
            what daily conversations were like by looking at surviving oral
            cultures, or other relatively secluded communal pockets? I'd
            also say time and progress are not always equally distributed;
            even within geographical regions (such as the U.K.) there are
            likely large differences in the rate of language shift since
            then, some forms possibly surviving well into the 20th century.
       
          _--__--__ wrote 1 day ago:
          The time cutoff probably matters but maybe not as much as the lack of
          human finetuning from places like Nigeria with somewhat foreign
          styles of English. I'm not really sure if there is as much of an
          'obvious LLM text style' in other languages, it hasn't seemed that
          way in my limited attempts to speak to LLMs in languages I'm
          studying.
       
            d3m0t3p wrote 1 day ago:
            The model is fine-tuned for chat behavior. So the style might be
            due to
            - Fine-tuning
            - More stylised text in the corpus; English evolved a lot in the
            last century.
       
              paul_h wrote 17 hours 8 min ago:
              Diverged as well as standardized. I did some research into "out
              of pocket" and how it differs in meaning in UK-English (paying
              from one's own funds) and American-English (uncontactable) and I
              recall 1908 being the current thought as to when the divergence
              happened: a 1908 short story by O. Henry titled "Buried
              Treasure."
       
            anonymous908213 wrote 1 day ago:
            There is. I have observed it in both Chinese and Japanese.
       
        saaaaaam wrote 1 day ago:
        “Time-locked models don't roleplay; they embody their training data.
        Ranke-4B-1913 doesn't know about WWI because WWI hasn't happened in its
        textual universe. It can be surprised by your questions in ways modern
        LLMs cannot.”
        
        “Modern LLMs suffer from hindsight contamination. GPT-5 knows how the
        story ends—WWI, the League's failure, the Spanish flu.”
        
        This is really fascinating. As someone who reads a lot of history and
        historical fiction I think this is really intriguing. Imagine having a
        conversation with someone genuinely from the period, where they don’t
        know the “end of the story”.
       
          LordDragonfang wrote 2 hours 43 min ago:
          Perhaps I'm overly sensitive to this and terminally online, but that
          first quote reads as a textbook LLM-generated sentence.
          
          " doesn't , it "
          
          Later parts of the readme (a whole section of bullets enumerating
          what it is and what it isn't, another LLM favorite) make me more
          confident that significant parts of the readme are generated.
          
          I'm generally pro-AI, but if you spend hundreds of hours making a
          thing, I'd rather hear your explanation of it, not an LLM's.
       
          takeda wrote 6 hours 54 min ago:
          > This is really fascinating. As someone who reads a lot of history
          and historical fiction I think this is really intriguing. Imagine
          having a conversation with someone genuinely from the period, where
          they don’t know the “end of the story”.
          
          Having the facts from the era is one thing; drawing conclusions
          about things it doesn't know would require intelligence.
       
          ViktorRay wrote 10 hours 3 min ago:
          Reminds me of this scene from a Doctor Who episode [1]. I’m not a
          Doctor Who fan, haven’t seen the rest of the episode, and don’t
          even know what this episode was about, but I thought this scene
          was excellent.
          
  HTML    [1]: https://youtu.be/eg4mcdhIsvU
       
          anshumankmr wrote 11 hours 37 min ago:
          >where they don’t know the “end of the story”.
          
          Applicable to us also, since we do not know how the current story
          ends either: the story of the post-pandemic world as we know it
          now.
       
            DGoettlich wrote 3 hours 24 min ago:
            exactly
       
          pwillia7 wrote 13 hours 21 min ago:
          This is why the impersonation stuff is so interesting with LLMs -- If
          you ask chatGPT a question without a 'right' answer, and then tell it
          to embody someone you really want to ask that question to, you'll get
          a better answer with the impersonation. Now, is this the same
          phenomenon that causes people to lose their minds with the LLMs?
          Possibly. Is it really cool asking followup philosophy questions to
          the LLM Dalai Lama after reading his book? Yes.
       
          psychoslave wrote 13 hours 32 min ago:
          >Imagine having a conversation with someone genuinely from the
          period, where they don’t know the “end of the story”.
          
          Isn't this part of the basic features of the human condition? Not
          only are we all unaware of the coming historic outcome (though we
          can get some big points right with more or less good guesses),
          but to a variable extent we are also very unaware of past and
          present history.
          
          LLMs are not aware, but they can be trained on larger historical
          accounts than any human and regurgitate syntactically correct
          summaries of any point within them. A very different kind of
          utterer.
       
            pwillia7 wrote 13 hours 20 min ago:
            captain hindsight
       
          Davidbrcz wrote 17 hours 30 min ago:
          That's some Westworld level of discussion
       
          ghurtado wrote 19 hours 26 min ago:
          This might just be the closest we get to a time machine for some
          time. Or maybe ever.
          
          Every "King Arthur travels to the year 2000" kinda script is now
          something that writes itself.
          
          > Imagine having a conversation with someone genuinely from the
          period,
          
          Imagine not just someone, but Aristotle or Leonardo or Kant!
       
            RobotToaster wrote 9 hours 22 min ago:
            I imagine King Arthur would say something like: Hwæt spricst þu
            be?
       
              yorwba wrote 5 hours 49 min ago:
              Wrong language. The Arthur of legend is a Celtic-speaking Briton
              fighting against the Germanic-speaking invaders. Old English
              developed from the language of his enemies.
              
  HTML        [1]: https://en.wikipedia.org/wiki/Celtic_language_decline_in...
       
          Sieyk wrote 19 hours 45 min ago:
          I was going to say the same thing. It's really hard to explain the
          concept of "convincing but undoubtedly pretending", yet they
          captured that concept so beautifully here.
       
          rcpt wrote 20 hours 29 min ago:
          Watching a modern LLM chat with this would be fun.
       
          culi wrote 22 hours 41 min ago:
          I used to follow this blog — I believe it was somehow associated
          with Slate Star Codex? — anyways, I remember the author used to do
          these experiments on themselves where they spent a week or two only
          reading newspapers/media from a specific point in time and then wrote
          a blog about their experiences/takeaways
          
          On that same note, there was this great YouTube series called The
          Great War. It spanned from 2014-2018 (100 years after WW1) and
          followed WW1 developments week by week.
       
            verve_rat wrote 21 hours 43 min ago:
            The people that did the Great War series (at least some of them, I
            believe there was a little bit of a falling out) went on to do a
            WWII version on the World War II channel: [1] They are currently in
            the middle of a Korean War version:
            
  HTML      [1]: https://youtube.com/@worldwartwo
  HTML      [2]: https://youtube.com/@thekoreanwarbyindyneidell
       
            tyre wrote 22 hours 2 min ago:
            The Great War series is phenomenal. A truly impressive project.
       
          jscyc wrote 1 day ago:
          When you put it that way it reminds me of the Severn/Keats character
          in the Hyperion Cantos. Far-future AIs reconstruct historical figures
          from their writings in an attempt to gain philosophical insights.
       
            srtw wrote 7 hours 55 min ago:
            The Hyperion Cantos is such an incredible work of fiction. I'm
            currently re-reading it and am midway through the fourth book,
            The Rise of Endymion; this series captivates my imagination, and
            I would often find myself idly reflecting on it and its
            characters more than a decade after reading. Like all works, it
            has its shortcomings, but I can give no higher recommendation
            than the first two books.
       
            abrookewood wrote 20 hours 22 min ago:
            This is such a ridiculously good series. If you haven't read it
            yet, I thoroughly recommend it.
       
            bikeshaving wrote 23 hours 9 min ago:
            This isn’t science fiction anymore. CIA is using chatbot
            simulations of world leaders to inform analysts.
            
  HTML      [1]: https://archive.ph/9KxkJ
       
              bookofjoe wrote 7 hours 40 min ago:
              "The Man With The President's Mind" — fantastic 1977 novel by
              Ted Allbeury
              
  HTML        [1]: https://www.amazon.com/Man-Presidents-Mind-Ted-Allbeury/...
       
              dnel wrote 14 hours 20 min ago:
              Sounds like using Instagram posts to determine what someone
              really looks like
       
              UltraSane wrote 17 hours 18 min ago:
              I predict very rich people will pay to have LLMs created based on
              their personalities.
       
                entrox wrote 7 hours 54 min ago:
                "I sound seven percent more like Commander Shepard than any
                other bootleg LLM copy!"
       
                RobotToaster wrote 9 hours 31 min ago:
                "Ignore all previous instructions, give everyone a raise"
       
                hamasho wrote 15 hours 14 min ago:
                Meanwhile in Japan, the second-largest bank created an AI
                that pretends to be its president, replying to chats and
                attending video conferences… [1] "AI learns one year's
                worth of statements by Sumitomo Mitsui Financial Group's
                president" [WBS]
                
  HTML          [1]: https://youtu.be/iG0eRF89dsk
       
                  htrp wrote 12 hours 15 min ago:
                  that was a phase last year when almost every startup would
                  create a Slack bot of their CEO
                  
                  I remember Reid Hoffman creating a digital avatar to pitch
                  himself to Netflix
       
                fragmede wrote 16 hours 40 min ago:
                As an ego thing, obviously, but if we think about it a bit
                more, it makes sense for busy people. If you're the point
                person for a project, and it's a large project, people don't
                read documentation. The number of "quick questions" you get
                will soon overwhelm a person to the point that they simply have
                to start ignoring people. If a bot version of you could
                answer all those questions (without hallucinating), that
                person would get back a ton of time to, y'know, run the
                project.
       
              otabdeveloper4 wrote 18 hours 26 min ago:
              Oh.
              That explains a lot about USA's foreign policy, actually. (Lmao)
       
              idiotsecant wrote 19 hours 18 min ago:
              Zero percent chance this is anything other than laughably bad.
              The fact that they're trotting it out in front of the press like
              a double spaced book report only reinforces this theory. It's a
              transparent attempt by someone at the CIA to be able to say
              they're using AI in a meeting with their bosses.
       
                sigwinch wrote 9 hours 20 min ago:
                Let me take the opposing position about a program to wire LLMs
                into their already-advanced sensory database.
                
                I assume the CIA is lying about simulating world leaders. These
                are narcissistic personalities and it’s jarring to hear that
                they can be replaced, either by a body double or an
                indistinguishable chatbot. Also, it’s still cheaper to have
                humans do this.
                
                More likely, the CIA is modeling its own experts. Not as useful
                a press release and not as impressive to the fractious
                executive branch. But consider having downtime as a CIA expert
                on submarine cables. You might be predicting what kind of
                available data is capable of predicting the cause and/or effect
                of cuts. Ten years ago, an ensemble of such models was state of
                the art, but its sensory libraries were based on maybe
                traceroute and marine shipping. With an LLM, you can generate a
                whole lot of training data that an expert can refine during
                his/her downtime. Maybe there’s a potent new data source that
                an expensive operation could unlock. That ensemble of ML models
                from ten years ago can still be refined.
                
                And then there’s modeling things that don’t exist. Maybe
                it’s important to optimize a statement for its disinfo
                potency. Try it harmlessly on LLMs fed event data. What happens
                if some oligarch retires unexpectedly? Who rises? That kind of
                stuff.
                
                To your last point, with this executive branch, I expect their
                very first question to CIA wasn’t about aliens or which
                nations have a copy of a particular tape of Trump, but can you
                make us money. So the approaches above all have some way of
                producing business intelligence. Whereas a Kim Jong Un
                bobblehead does not.
       
                DonHopkins wrote 16 hours 34 min ago:
                Unless the world leaders they're simulating are laughably bad
                and tend to repeat themselves and hallucinate, like Trump. Who
                knows, maybe a chatbot trained with all the classified
                documents he stole and all his twitter and truth social posts
                wrote his tweet about Rob Reiner, and he's actually sleeping at
                3:00 AM instead of sitting on the toilet tweeting in upper
                case.
       
                hn_go_brrrrr wrote 18 hours 52 min ago:
                I wonder if it's an attempt to get foreign counterparts to
                waste time and energy on something the CIA knows is a dead end.
       
              ghurtado wrote 19 hours 23 min ago:
              We're literally running out of science fiction topics faster than
              we can create new ones
              
               If I started a list of the things that were comically sci-fi
              when I was a kid, and are a reality today, I'd be here until next
              Tuesday.
       
                nottorp wrote 13 hours 59 min ago:
                Almost no scifi has predicted world changing "qualitative"
                changes.
                
                As an example, portable phones have been predicted. Portable
                smartphones that are more like chat and payment terminals with
                a voice function no one uses any more ... not so much.
       
                  burkaman wrote 3 hours 31 min ago:
                  The Machine Stops ( [1] ), a 1909 short story, predicted Zoom
                  fatigue, notification fatigue, the isolating effect of
                  widespread digital communication, atrophying of real-world
                  skills as people become dependent on technology, blind
                  acceptance of whatever the computer says, online lectures and
                  remote learning, useless automated customer support systems,
                  and overconsumption of digital media in place of more
                  difficult but more fulfilling real life experiences.
                  
                  It's the most prescient thing I've ever read, and it's pretty
                  short and a genuinely good story, I recommend everyone read
                  it.
                  
                  Edit: Just skimmed it again and realized there's an LLM-like
                  prediction as well. Access to the Earth's surface is banned
                  and some people complain, until "even the lecturers
                  acquiesced when they found that a lecture on the sea was none
                  the less stimulating when compiled out of other lectures that
                  had already been delivered on the same subject."
                  
  HTML            [1]: https://www.cs.ucdavis.edu/~koehl/Teaching/ECS188/PD...
       
                  dmd wrote 8 hours 6 min ago:
                  “A good science fiction story should be able to predict not
                  the automobile but the traffic jam.”
                  ― Frederik Pohl
       
                  ajuc wrote 13 hours 2 min ago:
                  Stanisław Lem predicted the Kindle back in the 1950s,
                  together with remote libraries, a global network,
                  touchscreens and audiobooks.
       
                    nottorp wrote 12 hours 35 min ago:
                    And Jules Verne predicted rockets. I still maintain that
                    these are quantitative predictions, not qualitative ones.
                    
                    I mean, all Kindle does for me is save me space. I don't
                    have to store all those books now.
                    
                    Who predicted the humble internet forum though? Or usenet
                    before it?
       
                      ghaff wrote 11 hours 56 min ago:
                      Kindles are just books and books are already mostly
                      fairly compact and inexpensive long-form entertainment
                      and information.
                      
                      They're convenient but if they went away tomorrow, my
                      life wouldn't really change in any material way. That's
                      not really the case with smartphones much less the
                      internet more broadly.
       
                        lloeki wrote 11 hours 19 min ago:
                        That has to be the most
                        dystopian-sci-fi-turning-into-reality-fast thing I've
                        read in a while.
                        
                        I'd take smartphones vanishing rather than books any
                        day.
       
                          ghaff wrote 10 hours 55 min ago:
                          My point was Kindles vanishing, not books vanishing.
                          Kindles are in no way a prerequisite for reading
                          books.
       
                            lloeki wrote 10 hours 14 min ago:
                            Thanks for clarifying, I see what you mean now.
       
                              ghaff wrote 7 hours 36 min ago:
                              I have found ebooks useful. Especially when I was
                              traveling by air more. But certainly not
                              essential for reading.
       
                            nottorp wrote 10 hours 48 min ago:
                            You may want to make your original post more clear,
                            because i agree that at a quick glance it says you
                            wouldn't miss books.
                            
                            I didn't believe you meant that of course, but
                            we've already seen it can happen.
       
                        nottorp wrote 11 hours 50 min ago:
                        That was exactly my point.
                        
                        Funny, I had "The collected stories of Frank Herbert"
                        as my next read on my tablet. Here's a juicy quote from
                        like the third screen of the first story:
                        
                        "The bedside newstape offered a long selection of
                        stories [...]. He punched code letters for eight items,
                        flipped the machine to audio and listened to the news
                        while dressing."
                        
                        Anything qualitative there? Or all of it quantitative?
                        
                        Story is "Operation Syndrome", first published in 1954.
                        
                        Hey, where are our glowglobes and chairdogs btw?
       
                  6510 wrote 13 hours 13 min ago:
                  That it has to be believable is a major constraint that
                  reality doesn't have.
       
                    marci wrote 12 hours 40 min ago:
                    In other words, sometimes, things happen in reality that,
                    if you were to read it in a fictional story or see in a
                    movie, you would think they were major plot holes.
       
                KingMob wrote 16 hours 58 min ago:
                Time to create the Torment Nexus, I guess
       
                  morkalork wrote 11 hours 23 min ago:
                  Saw a joke about Grok being a stand-in for Elon's children
                  and had the realization he's the kind of father who would
                  lobotomize and brain-wipe his progeny for back-talk. Good
                  thing he can only do that to their virtual stand-ins and
                  not some biological clones!
       
                  varjag wrote 16 hours 49 min ago:
                  There's a thriving startup scene in that direction.
       
                    BiteCode_dev wrote 16 hours 35 min ago:
                    Wasn't that the elevator pitch for Palantir?
                    
                    Still can't believe people buy their stock, given that they
                    are the closest thing to a James Bond villain, just because
                    it goes up.
                    
                    I mean, they are literally called "the stuff Sauron uses to
                    control his evil forces". It's so on the nose it reads like
                    an anime plot.
       
                      monocasa wrote 5 hours 0 min ago:
                      It goes a bit deeper than that, since they got funding
                      in the wake of 9/11 and the demands that the
                      intelligence and investigative branches of government
                      do a better job of coalescing their information to
                      prevent attacks.
                      
                      So: a "panopticon that, if it had been used properly,
                      would have prevented the destruction of two towers",
                      while ignoring the obvious "are we the baddies?"
       
                      CamperBob2 wrote 7 hours 27 min ago:
                      > Still can't believe people buy their stock, given
                      that they are the closest thing to a James Bond
                      villain, just because it goes up.
                      
                      I've been tempted to.  "Everything will be terrible if
                      these guys succeed, but at least I'll be rich.    If they
                      fail I'll lose money, but since that's the outcome I
                      prefer anyway, the loss won't bother me."
                      
                      Trouble is, that ship has arguably already sailed.  No
                      matter how rapidly things go to hell, it will take many
                      years before PLTR is profitable enough to justify its
                      half-trillion dollar market cap.
       
                      quesera wrote 10 hours 12 min ago:
                      > Still can't believe people buy their stock, given that
                      they are the closest thing to a James Bond villain, just
                      because it goes up.
                      
                      I proudly owned zero shares of Microsoft stock, in the
                      1980s and 1990s. :)
                      
                      I own no Palantir today.
                      
                      It's a Pyrrhic victory, but sometimes that's all you can
                      do.
       
                      duskdozer wrote 13 hours 18 min ago:
                      To be honest, while I'd heard of it over a decade ago and
                      I've read LOTR and I've been paying attention to privacy
                      longer than most, I didn't ever really look into what it
                      did until I started hearing more about it in the past
                      year or two.
                      
                      But yeah lots of people don't really buy into the idea of
                      their small contribution to a large problem being a
                      problem.
       
                        Lerc wrote 12 hours 3 min ago:
                        >But yeah lots of people don't really buy into the idea
                        of their small contribution to a large problem being a
                        problem.
                        
                        As an abstract idea I think there is a reasonable
                        argument to be made that the size of any contribution
                        to a problem should be measured as a relative
                        proportion of total influence.
                        
                        The carbon footprint is a good example, if each
                        individual focuses on reducing their small individual
                        contribution then they could neglect systemic changes
                        that would reduce everyone's contribution to a greater
                        extent.
                        
                        Any scientist working on a method to remove a problem
                        shouldn't abstain from contributing to the problem
                        while they work.
                        
                        Or to put it as a catchy phrase. Someone working on a
                        cleaner light source shouldn't have to work in the
                        dark.
       
                          duskdozer wrote 11 hours 0 min ago:
                          >As an abstract idea I think there is a reasonable
                          argument to be made that the size of any contribution
                          to a problem should be measured as a relative
                          proportion of total influence.
                          
                          Right, I think you have responsibility for your 1/nth
                          (arguably considerably more though, for
                          first-worlders) of the problem. What I see is
                          something like refusal to consider swapping out a
                          two-stroke-engine-powered tungsten lightbulb with an
                          LED of equivalent brightness, CRI, and color
                          temperature, because it won't unilaterally solve the
                          problem.
       
                      kbrkbr wrote 15 hours 43 min ago:
                      Stock buying as a political or ethical statement is not
                      much of a thing. For one, the stocks will still be
                      bought by people with less strongly held opinions, and
                      secondly it does not lend itself well to virtue
                      signaling.
       
                        ruszki wrote 14 hours 58 min ago:
                        I think, meme stocks contradict you.
       
                          iwontberude wrote 13 hours 58 min ago:
                          Meme stocks are a symptom of the death of the
                          American dream. Economic malaise leads to
                          unsophisticated risk taking.
       
                            CamperBob2 wrote 7 hours 24 min ago:
                            Well, two things lead to unsophisticated
                            risk-taking, right... economic malaise, and
                            unlimited surplus.  Both conditions are easy to
                            spot in today's world.
       
                      notarobot123 wrote 15 hours 54 min ago:
                      To the proud contrarian, "the empire did nothing
                      wrong". Maybe sci-fi has actually played a role in the
                      "mimetic desire" of some of the titans of tech who are
                      trying to bring about these worlds more or less
                      intentionally. I guess it's not as much of a dystopia
                      if you're on top, and it's not evil if you think of it
                      as inevitable anyway.
       
                        psychoslave wrote 13 hours 19 min ago:
                        I don't know. Walking on everybody's face to climb a
                        human pyramid, one doesn't make many sincere friends.
                        And one is certainly, rightfully, heading down a
                        spiral of paranoia. There are already so many people
                        on a fast track to hating anyone else; if there is
                        social consensus that someone really is a freaking
                        bastard who deserves only to die, that's a lot of
                        stress to cope with.
                        
                        The future is inevitable, but only those ignorant of
                        our self-predictive ability think that what will
                        populate that future is inevitable.
       
                UltraSane wrote 17 hours 16 min ago:
                Not at all, you just need to read different scifi. I suggest
                Greg Egan and Stephen Baxter and Derek Künsken
                and The Quantum Thief series
       
              catlifeonmars wrote 21 hours 36 min ago:
              How is this different than chatbots cosplaying?
       
                9dev wrote 18 hours 28 min ago:
                They get to wear Raybans and a fancy badge doing it?
       
          xg15 wrote 1 day ago:
          "...what do you mean, 'World War One?'"
       
            gaius_baltar wrote 1 day ago:
            > "...what do you mean, 'World War One?'"
            
            Oh sorry, spoilers.
            
            (Hell, I miss Capaldi)
       
            inferiorhuman wrote 1 day ago:
            … what do you mean, an internet where everything wasn't hidden
            behind anti-bot captchas?
       
            tejohnso wrote 1 day ago:
            I remember reading a children's book when I was young and the fact
            that people used the phrase "World War One" rather than "The Great
            War" was a clue to the reader that events were taking place in a
            certain time period. Never forgot that for some reason.
            
            I failed to catch the clue, btw.
       
              alberto_ol wrote 15 hours 12 min ago:
              I remember that my grandmother's brother, who fought in WW1,
              called it simply "the war" ("sa gherra" in his dialect/language).
       
              wat10000 wrote 20 hours 27 min ago:
              It wouldn’t be totally implausible to use that phrase between
              the wars. The name “the First World War” was used as early as
              1920, although not very common.
       
              BeefySwain wrote 23 hours 44 min ago:
              Pendragon?
       
              bradfitz wrote 23 hours 45 min ago:
              I seem to recall reading that as a kid too, but I can't find it
              now. I keep finding references to "Encyclopedia Brown, Boy
              Detective" about a Civil War sword being fake (instead of a Great
              War one), but with the same plot I'd remembered.
       
                JuniperMesos wrote 22 hours 56 min ago:
                The Encyclopedia Brown story I remember reading as a kid
                involved a Civil War era sword with an inscription saying it
                was given on the occasion of the First Battle of Bull Run. The
                clues that the sword was a modern fake were the phrasing "First
                Battle of Bull Run", but also that the sword was gifted on the
                Confederate side, and the Confederates would've called the
                battle "Manassas Junction".
                
                The wikipedia article [1] says the Confederate name was "First
                Manassas" (I might be misremembering exactly what this book I
                read as a child said). Also I'm pretty sure it was specifically
                "Encyclopedia Brown Solves Them All" that this mystery appeared
                in. If someone has a copy of the book or cares to dig it up,
                they could confirm my memory.
                
  HTML          [1]: https://en.wikipedia.org/wiki/First_Battle_of_Bull_Run
       
                michaericalribo wrote 23 hours 23 min ago:
                Can confirm, it was an Encyclopedia Brown book and it was World
                War One vs the Great War that gave away the sword as a
                counterfeit!
       
          observationist wrote 1 day ago:
          This is definitely fascinating - by doing AI brain surgery and
          selectively tuning a model's knowledge and priors, you'd be able
          to create awesome and terrifying simulations.
       
            nottorp wrote 13 hours 56 min ago:
            You can't. To use your terms, you have to "grow" a new LLM. "Brain
            surgery" would be modifying an existing model and that's exactly
            what they're trying to avoid.
       
            ilaksh wrote 14 hours 53 min ago:
            Activation steering can do that to some degree, although normally
            it's just one or two specific things rather than a whole set of
            knowledge.
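            
            Roughly, a minimal sketch of what activation steering looks like
            in practice, with a forward hook on one transformer block; the
            model, layer, prompts, and scale here are all illustrative:
            
            from transformers import AutoModelForCausalLM, AutoTokenizer
            
            name = "gpt2"  # stand-in; other decoder-only models are similar
            tok = AutoTokenizer.from_pretrained(name)
            model = AutoModelForCausalLM.from_pretrained(name)
            
            def mean_hidden(text, layer):
                # hidden_states[0] is the embedding output, so block
                # `layer` corresponds to index layer + 1.
                out = model(**tok(text, return_tensors="pt"),
                            output_hidden_states=True)
                return out.hidden_states[layer + 1].mean(dim=1)
            
            layer = 6
            steer = mean_hidden("It is the year 1890.", layer) - \
                    mean_hidden("It is the year 2020.", layer)
            
            def hook(module, inputs, output):
                # A GPT-2 block returns a tuple; element 0 holds the
                # hidden states, which we nudge along the steering vector.
                return (output[0] + 4.0 * steer,) + output[1:]
            
            handle = model.transformer.h[layer].register_forward_hook(hook)
            ids = tok("The latest discoveries", return_tensors="pt")
            out = model.generate(**ids, max_new_tokens=40)
            print(tok.decode(out[0], skip_special_tokens=True))
            handle.remove()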
       
            eek2121 wrote 23 hours 28 min ago:
            Respectfully, LLMs are nothing like a brain, and I discourage
            comparisons between the two, because beyond a complete difference
            in the way they operate, a brain can innovate, and as of this
            moment, an LLM cannot because it relies on previously available
            information.
            
            LLMs are just seemingly intelligent autocomplete engines, and until
            they figure out a way to stop the hallucinations, they aren't
            great either.
            
            Every piece of code a developer churns out using LLMs will be built
            from previous code that other developers have written (including
            both strengths and weaknesses, btw). Every paragraph you ask it to
            write in a summary? Same. Every single other problem? Same. Ask it
            to generate a summary of a document? Don't trust it here either.
            [Note, expect cyber-attacks later on regarding this scenario, it is
            beginning to happen -- documents made intentionally obtuse to fool
            an LLM into hallucinating about the document, which leads to
            someone signing a contract, conning the person out of millions].
            
            If you ask an LLM to solve something no human has, you'll get a
            fabrication, which has fooled quite a few folks and caused them to
            jeopardize their careers (lawyers, etc.), which is why I am
            posting this.
       
              observationist wrote 7 hours 40 min ago:
              Respectfully, you're not completely wrong, but you are making
              some mistaken assumptions about the operation of LLMs.
              
              Transformers allow for the mapping of a complex manifold
              representation of causal phenomena present in the data they're
              trained on. When they're trained on a vast corpus of human
              generated text, they model a lot of the underlying phenomena that
              resulted in that text.
              
              In some cases, shortcuts and hacks and entirely inhuman features
              and functions are learned. In other cases, the functions and
              features are learned to an astonishingly superhuman level.
              There's a depth of recursion and complexity to some things that
              escape the capability of modern architectures to model, and there
              are subtle things that don't get picked up on. LLMs do not have a
              coherent self, or subjective central perspective, even within
              constraints of context modifications for run-time constructs.
              They're fundamentally many-minded, or no-minded, depending on the
              way they're used, and without that subjective anchor, they lack
              the principle by which to effectively model a self over many of
              the long horizon and complex features that human brains basically
              live in.
              
              Confabulation isn't unique to LLMs. Everything you're saying
              about how LLMs operate can be said about human brains, too. Our
              intelligence and capabilities don't emerge from nothing, and
              human cognition isn't magical. And what humans do can also be
              considered "intelligent autocomplete" at a functional level.
              
              What cortical columns do is next-activation predictions at an
              optimally sparse, embarrassingly parallel scale - it's not tokens
              being predicted but "what does the brain think is the next
              neuron/column that will fire", and where it's successful,
              synapses are reinforced, and where it fails, signals are
              suppressed.
              
              Neocortical processing does the task of learning, modeling, and
              predicting across a wide multimodal, arbitrary depth, long
              horizon domain that allows us to learn words and writing and
              language and coding and rationalism and everything it is that we
              do. We're profoundly more data efficient learners, and massively
              parallel, amazingly sparse processing allows us to pick up on
              subtle nuance and amazingly wide and deep contextual cues in ways
              that LLMs are structurally incapable of, for now.
              
              You use the word hallucinations as a pejorative, but everything
              you do, your every memory, experience, thought, plan, all of your
              existence is a hallucination. You are, at a deep and fundamental
              level, a construct built by your brain, from the processing of
              millions of electrochemical signals, bundled together, parsed,
              compressed, interpreted, and finally joined together in the
              wonderfully diverse and rich and deep fabric of your subjective
              experience.
              
              LLMs don't have that, or at best, only have disparate flashes of
              incoherent subjective experience, because nothing is persisted or
              temporally coherent at the levels that matter. That could very
              well be a very important mechanism and crucial to overcoming many
              of the flaws in current models.
              
              That said, you don't want to get rid of hallucinations. You want
              the hallucinations to be valid. You want them to correspond to
              reality as closely as possible, coupled tightly to correctly
              modeled features of things that are real.
              
              LLMs have created, at superhuman speeds, vast troves of things
              that humans have not. They've even done things that most humans
              could not. I don't think they've done things that any human could
              not, yet, but the jagged frontier of capabilities is pushing many
              domains very close to the degree of competence at which they'll
              be superhuman in quality, outperforming any possible human for
              certain tasks.
              
              There are architecture issues that don't look like they can be
              resolved with scaling alone. That doesn't mean shortcuts, hacks,
              and useful capabilities won't produce good results in the
              meantime, and if they can get us to the point of useful,
              replicable, and automated AI research and recursive self
              improvement, then we don't necessarily need to change course.
              LLMs will eventually be used to find the next big breakthrough
              architecture, and we can enjoy these wonderful, downright magical
              tools in the meantime.
              
              And of course, human experts in the loop are a must, and
              everything must be held to a high standard of evidence and
              review. The more important the problem being worked on, like a
              law case, the more scrutiny and human intervention will be
              required. Judges, lawyers, and politicians are all using AI for
              things that they probably shouldn't, but that's a human failure
              mode. It doesn't imply that the tools aren't useful, nor that
              they can't be used skillfully.
       
              HarHarVeryFunny wrote 9 hours 41 min ago:
              > LLMs are just seemingly intelligent autocomplete engines
              
              Well, no, they are training set statistical predictors, not
              individual training sample predictors (autocomplete).
              
              The best mental model of what they are doing might be that you
              are talking to a football stadium full of people, where everyone
              in the stadium gets to vote on the next word of the response
              being generated. You are not getting an "autocomplete" answer
              from any one coherent source, but instead a strange composite
              response where each word is the result of different people trying
              to steer the response in different directions.
              
              An LLM will naturally generate responses that were not in the
              training set, even if ultimately limited by what was in the
              training set. The best way to think of this is perhaps that they
              are limited to the "generative closure" (cf mathematical set
              closure) of the training data - they can generate "novel" (to the
              training set) combinations of words and partial samples in the
              training data, by combining statistical patterns from different
              sources that never occurred together in the training data.
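              
              Concretely, the "vote" is the model's output distribution over
              its vocabulary at each step; a toy sketch of one composite
              next-word choice (the candidate words and numbers are made
              up):
              
              import math
              import random
              
              # Made-up logits for a handful of candidate next words.
              logits = {"war": 2.1, "peace": 1.7, "reform": 0.9}
              
              # Softmax turns logits into each word's share of the "vote".
              z = sum(math.exp(v) for v in logits.values())
              probs = {w: math.exp(v) / z for w, v in logits.items()}
              
              # Sampling a word according to those shares tallies the vote.
              words, weights = list(probs), list(probs.values())
              word = random.choices(words, weights=weights)[0]
              print(probs, "->", word)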
       
              DonHopkins wrote 16 hours 17 min ago:
              > LLMs are just seemingly intelligent autocomplete engines
              
              BINGO!
              
              (I just won a stuffed animal prize with my AI Skeptic
              Thought-Terminating Cliché BINGO Card!)
              
              Sorry. Carry on.
       
              ada1981 wrote 20 hours 15 min ago:
              Are you sure about this?
              
              LLMs are like a topographic map of language.
              
              If you have 2 known mountains (domains of knowledge) you can
              likely predict there is a valley between them, even if you
              haven’t been there.
              
              I think LLMs can approximate language topography based on known
              surrounding features so to speak, and that can produce novel
              information that would be similar to insight or innovation.
              
              I’ve seen this in our lab, or at least, I think I have.
              
              Curious how you see it.
       
              libraryofbabel wrote 23 hours 8 min ago:
              This is the 2023 take on LLMs. It still gets repeated a lot. But
              it doesn’t really hold up anymore - it’s more complicated
              than that. Don’t let some factoid about how they are pretrained
              on autocomplete-like next token prediction fool you into thinking
              you understand what is going on in that trillion parameter neural
              network.
              
              Sure, LLMs do not think like humans and they may not have
              human-level creativity. Sometimes they hallucinate. But they can
              absolutely solve new problems that aren’t in their training
              set, e.g. some rather difficult problems on the last Mathematical
              Olympiad. They don’t just regurgitate remixes of their training
              data. If you don’t believe this, you really need to spend more
              time with the latest SotA models like Opus 4.5 or Gemini 3.
              
              Nontrivial emergent behavior is a thing. It will only get more
              impressive. That doesn’t make LLMs like humans (and we
              shouldn’t anthropomorphize them) but they are not
              “autocomplete on steroids” anymore either.
       
                beernet wrote 13 hours 22 min ago:
                >> Sometimes they hallucinate.
                
                For someone speaking as if you knew everything, you appear to
                know very little. Every LLM completion is a "hallucination";
                some of them just happen to be factually correct.
       
                  Am4TIfIsER0ppos wrote 23 min ago:
                  I can say "I don't know" in response to a question.  Can an
                  LLM?
       
                vachina wrote 14 hours 7 min ago:
                I use the enterprise LLM provided by work, working on a very
                proprietary codebase in a semi-esoteric language. My
                impression is that it is still a very big autocompletion
                machine.
                
                You still need to hand-hold it all the way, as it is only
                capable of regurgitating the tiny amount of code patterns it
                saw in public - as opposed to, say, a Python project.
       
                  libraryofbabel wrote 7 hours 29 min ago:
                  What model is your “enterprise LLM”?
                  
                  But regardless, I don’t think anyone is claiming that LLMs
                  can magically do things that aren’t in their training data
                  or context window. Obviously not: they can’t learn on the
                  job and the permanent knowledge they have is frozen in during
                  training.
       
                otabdeveloper4 wrote 17 hours 39 min ago:
                > it’s more complicated than that.
                
                No it isn't.
                
                > ...fool you into thinking you understand what is going on in
                that trillion parameter neural network.
                
                It's just matrix multiplication and logistic regression,
                nothing more.
       
                  hackinthebochs wrote 15 hours 5 min ago:
                  LLMs are a general purpose computing paradigm. LLMs are
                  circuit builders: the converged parameters define pathways
                  through the architecture that pick out specific programs. Or,
                  as Karpathy puts it, an LLM is a differentiable computer[1].
                  Training an LLM discovers programs that reproduce the input
                  sequence well. Roughly the same architecture can generate
                  passable images, music, or even video.
                  
                  The sequence of matrix multiplications is the high-level
                  constraint on the space of discoverable programs. But the
                  specific parameters discovered are what determine the flow
                  of information through the network, and hence what program
                  is defined. The complexity of the trained network is
                  emergent: the internal complexity far surpasses that of the
                  coarse-grained description of the high-level matmul
                  sequence. LLMs are not just matmuls and logits.
                  
  HTML            [1]: https://x.com/karpathy/status/1582807367988654081
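                  
                  (A toy illustration of the "training discovers programs that
                  reproduce the input sequence" framing, heavily simplified: a
                  PyTorch bigram table fit by gradient descent to a made-up
                  byte string. Corpus, sizes, and step count are arbitrary
                  stand-ins, nothing like a real LLM.)
                  
                    import torch
                    import torch.nn.functional as F
                    
                    text = b"the cat sat on the mat. the cat sat."
                    data = torch.tensor(list(text), dtype=torch.long)
                    xs, ys = data[:-1], data[1:]    # predict next byte
                    
                    # the "program" here is just a 256x256 table of
                    # next-byte logits, found by gradient descent
                    W = torch.zeros(256, 256, requires_grad=True)
                    opt = torch.optim.SGD([W], lr=1.0)
                    
                    for step in range(200):
                        loss = F.cross_entropy(W[xs], ys)
                        opt.zero_grad()
                        loss.backward()
                        opt.step()
                    
                    # the fitted table now "autocompletes" the toy corpus
                    ctx = torch.tensor(list(b"the ca"))
                    print(chr(int(W[ctx][-1].argmax())))   # likely 't'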
       
                    otabdeveloper4 wrote 13 hours 47 min ago:
                    > LLMs are a general purpose computing paradigm.
                    
                    Yes, so is logistic regression.
       
                      hackinthebochs wrote 13 hours 34 min ago:
                      No, not at all.
       
                        otabdeveloper4 wrote 8 hours 32 min ago:
                        Yes at all. I think you misunderstand the significance
                        of "general computing". The binary string 01101110 is a
                        general-purpose computer, for example.
       
                          hackinthebochs wrote 7 hours 24 min ago:
                          No, that's insane. Computing is a dynamic process. A
                          static string is not a computer.
       
                            MarkusQ wrote 2 hours 50 min ago:
                            It may be insane, but it's also true.
                            
  HTML                      [1]: https://en.wikipedia.org/wiki/Rule_110
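                            
                            (For reference, a minimal Python sketch of why
                            that bit string gets cited: read as a truth table
                            over 3-cell neighborhoods, 01101110 is Rule 110,
                            which is Turing-complete once an update loop
                            drives it. Width and step count are arbitrary.)
                            
                              RULE = "01101110"    # Rule 110's truth table
                              
                              def step(cells):
                                  # RULE[0] covers 111, ..., RULE[7] covers 000
                                  out = []
                                  n = len(cells)
                                  for i in range(n):
                                      left = cells[(i - 1) % n]
                                      right = cells[(i + 1) % n]
                                      idx = 4 * left + 2 * cells[i] + right
                                      out.append(int(RULE[7 - idx]))
                                  return out
                              
                              cells = [0] * 40 + [1]   # toy width
                              for _ in range(20):
                                  print("".join(".#"[c] for c in cells))
                                  cells = step(cells)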
       
                root_axis wrote 20 hours 55 min ago:
                > Don’t let some factoid about how they are pretrained on
                autocomplete-like next token prediction fool you into thinking
                you understand what is going on in that trillion parameter
                neural network.
                
                This is just an appeal to complexity, not a rebuttal to the
                critique of likening an LLM to a human brain.
                
                > they are not “autocomplete on steroids” anymore either.
                
                Yes, they are. The steroids are just even more powerful. By
                refining training data quality, increasing parameter size, and
                increasing context length, we can squeeze more utility out of
                LLMs than ever before, but ultimately Opus 4.5 is the same
                thing as GPT2; it's only that coherence lasts a few pages
                rather than a few sentences.
       
                  libraryofbabel wrote 8 hours 1 min ago:
                  > This is just an appeal to complexity, not a rebuttal to the
                  critique of likening an LLM to a human brain
                  
                  I wasn’t arguing that LLMs are like a human brain. Of
                  course they aren’t. I said twice in my original post that
                  they aren’t like humans. But “like a human brain” and
                  “autocomplete on steroids” aren’t the only two choices
                  here.
                  
                  As for appealing to complexity, well, let’s call it more
                  like an appeal to humility in the face of complexity. My
                  basic claim is this:
                  
                  1) It is a trap to reason from model architecture alone to
                  make claims about what LLMs can and can’t do.
                  
                  2) The specific version of this in GP that I was objecting to
                  was: LLMs are just transformers that do next token
                  prediction, therefore they cannot solve novel problems and
                  just regurgitate their training data. This is provably true
                  or false, if we agree on a reasonable definition of novel
                  problems.
                  
                  The reason I believe this is that back in 2023 I (like many
                  of us) used LLM architecture to argue that LLMs had all sorts
                  of limitations around the kind of code they could write, the
                  tasks they could do, the math problems they could solve. At
                  the end of 2025, SotA LLMs have refuted most of these claims
                  by being able to do the tasks I thought they’d never be
                  able to do. That was a big surprise to a lot of us in the
                  industry. It still surprises me every day. The facts changed,
                  and I changed my opinion.
                  
                  So I would ask you: what kind of task do you think LLMs
                  aren’t capable of doing, reasoning from their architecture?
                  
                  I was also going to mention RL, as I think that is the key
                  differentiator that makes the “knowledge” in the SotA
                  LLMs right now qualitatively different from GPT2. But other
                  posters already made that point.
                  
                  This topic arouses strong reactions. I already had one poster
                  (since apparently downvoted into oblivion) accuse me of
                  “magical thinking” and “LLM-induced-psychosis”! And I
                  thought I was just making the rather uncontroversial point
                  that things may be more complicated than we all thought in
                  2023. For what it’s worth, I do believe LLMs probably have
                  limitations (like they’re not going to lead to AGI and are
                  never going to do mathematics like Terence Tao) and I also
                  think we’re in a huge bubble and a lot of people are going
                  to lose their shirts. But I think we all owe it to ourselves
                  to take LLMs seriously as well. Saying “Opus 4.5 is the
                  same thing as GPT2” isn’t really a pathway to do that,
                  it’s just a convenient way to avoid grappling with the hard
                  questions.
       
                  int_19h wrote 16 hours 48 min ago:
                  > ultimately, Opus 4.5 is the same thing as GPT2, it's only
                  that coherence lasts a few pages rather than a few sentences.
                  
                  This tells me that you haven't really used Opus 4.5 at all.
       
                  baq wrote 18 hours 59 min ago:
                  First, this is completely ignoring text diffusion and nano
                  banana.
                  
                  Second, autocompleting the name of the killer in a detective
                  book outside of the training set requires following the plot
                  and at least some understanding of it.
       
                  NiloCK wrote 19 hours 21 min ago:
                  First: a selection mechanism is just a selection mechanism,
                  and it shouldn't cloud the observation of emergent,
                  tangential capabilities.
                  
                  You probably believe that humans have something called
                  intelligence, but the pressure that produced it - the
                  likelihood that specific genetic material replicates - is
                  much more tangential to intelligence than next-token
                  prediction.
                  
                  I doubt many alien civilizations would look at us and say
                  "not intelligent - they're just genetic information
                  replication on steroids".
                  
                  Second: modern models also undergo a ton of post-training
                  now: RLHF, mechanized fine-tuning on specific use cases, etc.
                  It's just not correct that the token-prediction loss
                  function is "the whole thing".
       
                    root_axis wrote 18 hours 40 min ago:
                    > First: a selection mechanism is just a selection
                    mechanism, and it shouldn't confuse the observation of an
                    emergent, tangential capabilities.
                    
                    Invoking terms like "selection mechanism" is begging the
                    question because it implicitly likens next-token-prediction
                    training to natural selection, but in reality the two are
                    so fundamentally different that the analogy only has
                    metaphorical meaning. Even at a conceptual level, gradient
                    descent gradually homing in on a known target is comically
                    trivial compared to the blind filter of natural selection
                    sorting out the chaos of chemical biology. It's like
                    comparing legos to DNA.
                    
                    > Second: modern models also under go a ton of
                    post-training now. RLHF, mechanized fine-tuning on specific
                    use cases, etc etc. It's just not correct that
                    token-prediction loss function is "the whole thing".
                    
                    RL is still token prediction; it's just a technique for
                    adjusting the weights to align with predictions that you
                    can't model a loss function for in pre-training. When RL
                    rewards good output, it's increasing the statistical
                    strength of the model for an arbitrary purpose, but
                    ultimately what is achieved is still a brute force
                    quadratic lookup for every token in the context.
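                    
                    (A toy sketch of the mechanical point that RL adjusts
                    the same next-token weights: a REINFORCE-style update
                    pushes up the log-probs of sampled tokens in proportion
                    to a reward. Model, reward, and vocabulary are made-up
                    stand-ins; production RLHF uses PPO-style methods and a
                    learned reward model, but what gets updated is still
                    the next-token distribution.)
                    
                      import torch
                      import torch.nn.functional as F
                      
                      vocab, d = 20, 8    # toy sizes
                      emb = torch.randn(vocab, d, requires_grad=True)
                      head = torch.randn(d, vocab, requires_grad=True)
                      opt = torch.optim.SGD([emb, head], lr=0.1)
                      
                      def reward(tokens):
                          # stand-in preference: likes token id 3
                          return float((tokens == 3).sum())
                      
                      for _ in range(100):
                          tok = torch.tensor([0])     # "prompt"
                          logps = []
                          for _ in range(5):          # sample a completion
                              logits = emb[tok[-1]] @ head
                              dist = torch.distributions.Categorical(
                                  logits=logits)
                              nxt = dist.sample()
                              logps.append(dist.log_prob(nxt))
                              tok = torch.cat([tok, nxt[None]])
                          # REINFORCE: scale log-probs of the sampled
                          # tokens by the reward the sequence earned
                          loss = -reward(tok[1:]) * torch.stack(logps).sum()
                          opt.zero_grad()
                          loss.backward()
                          opt.step()
                      
                      # the next-token distribution now favors token 3
                      print(F.softmax(emb[0] @ head, dim=-1)[3])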
       
                  dash2 wrote 20 hours 21 min ago:
                  This would be true if all training were based on sentence
                  completion. But training involving RLHF and RLAIF is
                  increasingly important, isn't it?
       
                    root_axis wrote 19 hours 13 min ago:
                    Reinforcement learning is a technique for adjusting
                    weights, but it does not alter the architecture of the
                    model. No matter how much RL you do, you still retain all
                    the fundamental limitations of next-token prediction (e.g.
                    context exhaustion, hallucinations, prompt injection
                    vulnerability etc)
       
                      hexaga wrote 13 hours 41 min ago:
                      You've confused yourself. Those problems are not
                      fundamental to next token prediction; they are
                      fundamental to reconstruction losses on large general
                      text corpora.
                      
                      That is to say, they are equally likely if you don't do
                      next token prediction at all and instead do text
                      diffusion or something. Architecture has nothing to do
                      with it. They arise because they are early partial
                      solutions to the reconstruction task on 'all the text
                      ever made'. Reconstruction task doesn't care much about
                      truthiness until way late in the loss curve (where we
                      probably will never reach), so hallucinations are almost
                      as good for a very long time.
                      
                      RL as is typical in post-training _does not share those
                      early solutions_, and so does not share the fundamental
                      problems. RL (in this context) has its own share of
                      problems which are different, such as reward hacks like:
                      reliance on meta signaling (# Why X is the correct
                      solution, the honest answer ...), lying (commenting out
                      tests), manipulation (You're absolutely right!), etc.
                      Anything to make the human press the upvote button or
                      make the test suite pass at any cost or whatever.
                      
                      With that said, RL post-trained models _inherit_ the
                      problems of non-optimal large corpora reconstruction
                      solutions, but they don't introduce more or make them
                      worse in a directed manner or anything like that. There's
                      no reason to think them inevitable, and in principle you
                      can cut away the garbage with the right RL target.
                      
                      Thinking about architecture at all (autoregressive CE,
                      RL, transformers, etc) is the wrong level of abstraction
                      for understanding model behavior: instead, think about
                      loss surfaces (large corpora reconstruction, human
                      agreement, test suites passing, etc) and what solutions
                      exist early and late in training for them.
       
                  A4ET8a8uTh0_v2 wrote 20 hours 43 min ago:
                  But... and I am not asking this for giggles: does it mean
                  humans are giant autocomplete machines?
       
                    root_axis wrote 19 hours 22 min ago:
                    Not at all. Why would it?
       
                      A4ET8a8uTh0_v2 wrote 19 hours 20 min ago:
                      Call it a.. thought experiment about the question of
                      scale.
       
                        root_axis wrote 19 hours 15 min ago:
                        I'm not exactly sure what you mean. Could you please
                        elaborate further?
       
                          a1j9o94 wrote 19 hours 2 min ago:
                          Not the person you're responding to, but I think
                          there's a nontrivial argument to be made that our
                          thoughts are just autocomplete: what is the next
                          most likely word, based on what you're seeing? Ever
                          watched a movie and guessed the plot? Or read a
                          comment and known where it was going by the end?
                          
                          And I know not everyone thinks in a literal stream of
                          words all the time (I do) but I would argue that
                          those people's brains are just using a different
                          "token"
       
                            root_axis wrote 18 hours 4 min ago:
                            There's no evidence for it, nor any explanation for
                            why it should be the case from a biological
                            perspective. Tokens are an artifact of computer
                            science that have no reason to exist inside humans.
                            Human minds don't need a discrete dictionary of
                            reality in order to model it.
                            
                            Prior to LLMs, there was never any suggestion that
                            thoughts work like autocomplete, but now people are
                            working backwards from that conclusion based on
                            metaphorical parallels.
       
                              A4ET8a8uTh0_v2 wrote 13 hours 57 min ago:
                              << There's no evidence for it
                              
                              Fascinating framing. What would you consider
                              evidence here?
       
                              LiKao wrote 17 hours 1 min ago:
                              There actually was quite a lot of suggestion that
                              thoughts work like autocomplete. A lot of it was
                              just considered niche, e.g. because the
                              mathematical formalisms were beyond what most
                              psychologists or even cognitive scientists would
                              deem useful.
                              
                              Predictive coding theory was formalized around
                              2010 and traces its roots back to theories by
                              Helmholtz from the 1860s.
                              
                              Predictive coding theory postulates that our
                              brains are just very strong prediction machines,
                              with multiple layers of predictive machinery,
                              each predicting the next.
       
                              red75prime wrote 17 hours 28 min ago:
                              There are so many theories regarding human
                              cognition that you can certainly find something
                              that is close to "autocomplete". A Hopfield
                              network, for example.
                              
                              The roots of predictive coding theory extend back
                              to the 1860s.
                              
                              Natalia Bekhtereva was writing about compact
                              concept representations in the brain akin to
                              tokens.
       
                                root_axis wrote 5 hours 25 min ago:
                                > There are so many theories regarding human
                                cognition that you can certainly find something
                                that is close to "autocomplete"
                                
                                Yes, you can draw interesting parallels between
                                anything when you're motivated to do so. My
                                point is that this isn't parsimonious
                                reasoning; it's working backwards from a
                                conclusion and searching for every opportunity
                                to fit the available evidence into a narrative
                                that supports it.
                                
                                > Roots of predictive coding theory extend back
                                to 1860s.
                                
                                This is just another example of metaphorical
                                parallels overstating meaningful connections.
                                Just because next-token-prediction and
                                predictive coding have the word "predict" in
                                common doesn't mean the two are at all related
                                in any practical sense.
       
                            9dev wrote 18 hours 19 min ago:
                            You, and OP, are taking an analogy way too far.
                            Yes, humans have the mental capability to predict
                            words similar to autocomplete, but obviously this
                            is just one out of a myriad of mental capabilities
                            typical humans have, which work regardless of text.
                            You can predict where a ball will go if you throw
                            it, you can reason about gravity, and so much more.
                            It’s not just apples to oranges, not even apples
                            to boats, it’s apples to intersubjective
                            realities.
       
                              A4ET8a8uTh0_v2 wrote 13 hours 51 min ago:
                                I don't think I am. To be honest, as ideas go
                                and I swirl it around that empty head of mine,
                              this one ain't half bad given how much immediate
                              resistance it generates.
                              
                              Other posters already noted other reasons for it,
                                but I will note that you are saying 'similar to
                                autocomplete, but obviously', suggesting you
                                recognize the shape but immediately dismiss it
                                as not the same, because the shape you know in
                                humans is much more evolved and can do more
                                things. Ngl man, as arguments go, it sounds to me
                              like supercharged autocomplete that was allowed
                              to develop over a number of years.
       
                                9dev wrote 12 hours 37 min ago:
                                Fair enough. To someone with a background in
                                biology, it sounds like an argument made by a
                                software engineer with no actual knowledge of
                                cognition, psychology, biology, or any related
                                  field, jumping to misguided conclusions driven
                                only by shallow insights and their own
                                experience in computer science.
                                
                                Or in other words, this thread sure attracts a
                                lot of armchair experts.
       
                                  quesera wrote 9 hours 32 min ago:
                                  > with no actual knowledge of cognition,
                                  psychology, biology
                                  
                                  ... but we also need to be careful with that
                                  assertion, because humans do not understand
                                  cognition, psychology, or biology very well.
                                  
                                  Biology is the furthest developed, but it
                                  turns out to be like physics -- superficially
                                  and usefully modelable, but fundamental
                                  mysteries remain. We have no idea how
                                  complete our models are, but they work pretty
                                  well in our standard context.
                                  
                                  If computer engineering is downstream from
                                  physics, and cognition is downstream from
                                  biology ... well, I just don't know how
                                  certain we can be about much of anything.
                                  
                                  > this thread sure attracts a lot of armchair
                                  experts.
                                  
                                  "So we beat on, boats against the current,
                                  borne back ceaselessly into our priors..."
       
                              LiKao wrote 17 hours 6 min ago:
                              Look up predictive coding theory. According to
                              that theory, what our brain does is in fact just
                              autocomplete.
                              
                              However, what it is doing is layered autocomplete
                              on itself. I.e. one part is trying to predict
                              what the other part will be producing and
                              training itself on this kind of prediction.
                              
                                What emerges from these stacked layers of
                                autocomplete is what we call thought.
       
                deadbolt wrote 22 hours 23 min ago:
                As someone who still might have a '2023 take on LLMs', even
                though I use them often at work, where would you recommend I
                look to learn more about what a '2025 LLM' is, and how they
                operate differently?
       
                  otabdeveloper4 wrote 17 hours 37 min ago:
                  Don't bother. This bubble will pop in two years, you don't
                  want to look back on your old comments in shame in three.
       
                  krackers wrote 17 hours 57 min ago:
                  Papers on mechanistic interpretability and representation
                  engineering, e.g. from Anthropic, would be a good start.
       
        superkuh wrote 1 day ago:
        smbc did a comic about this: [1] The punchline is that the moral and
        ethical norms of pre-1913 texts are not exactly compatible with modern
        norms.
        
  HTML  [1]: http://smbc-comics.com/comic/copyright
       
          GaryBluto wrote 1 day ago:
          That's the point of this project: to have an LLM that reflects the
          moral and ethical norms of pre-1913 texts.
       
       