_______ __ _______
| | |.---.-..----.| |--..-----..----. | | |.-----..--.--.--..-----.
| || _ || __|| < | -__|| _| | || -__|| | | ||__ --|
|___|___||___._||____||__|__||_____||__| |__|____||_____||________||_____|
on Gopher (unofficial)
HTML Visit Hacker News on the Web
COMMENT PAGE FOR:
HTML History LLMs: Models trained exclusively on pre-1913 texts
dkalola wrote 5 min ago:
How can we interact with such models? Is there a web application
interface?
WhitneyLand wrote 1 hour 23 min ago:
Why not use these as a benchmark for LLM ability to make breakthrough
discoveries?
For example, prompt the 1913 model to try to "Invent a new theory of
gravity that doesn't conflict with special relativity".
Would it be able to eventually get to GR? If not, could finding out
why not illuminate important weaknesses?
Muskwalker wrote 1 hour 36 min ago:
So, could this be an example of an LLM trained fully on public domain
copyright-expired data? Or is this not intended to be the case?
kldg wrote 2 hours 54 min ago:
Very neat! I've thought about this with frontier models because they're
ignorant of recent events, though it's too bad old frontier models just
kind of disappear into the aether when a company moves on to the next
iteration. Every company's frontier model today is a time capsule for
the future. There should probably be some kind of preservation attempts
made early so they don't wind up simply deleted; once we're in Internet
time, sifting through the data to ensure scrapes are accurately dated
becomes a nightmare unless you're doing your own regular Internet
scrapes over a long time.
It would be nice to go back substantially further, though it's not too
far back that the commoner becomes voiceless in history and we just get
a bunch of politics and academia. Great job; look forward to testing it
out.
underfox wrote 3 hours 40 min ago:
> [They aren't] perfect mirrors of "public opinion" (they represent
published text, which skews educated and toward dominant viewpoints)
Really good point that I don't think I would've considered on my own.
Easy to take for granted how easy it is to share information (for
better or worse) now, but pre-1913 there were far more structural and
societal barriers to doing the same.
flux3125 wrote 3 hours 53 min ago:
Once I had an interesting interaction with llama 3.1, where I pretended
to be someone from like 100 years in the future, claiming it was part
of a "historical research initiative conducted by Quantum (formerly
Meta), aimed at documenting how early intelligent systems perceived
humanity and its future." It became really interested, asking about how
humanity had evolved and things like that. Then I kept playing along
with different answers, from apocalyptic scenarios to others where AI
gained consciousness and humans and machines have equal rights. It was
fascinating to observe its reaction to each scenario.
erichocean wrote 4 hours 0 min ago:
I would love to see this done, by year.
"Give me an LLM from 1928."
etc.
elestor wrote 4 hours 32 min ago:
Excuse me if it's obvious, but how could I run this? I have run local
LLMs before, but only have very minimal experience using ollama run and
that's about it. This seems very interesting so I'd like to try it.
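For what it's worth, once (and if) the weights appear on Hugging Face, running a 4B model like this locally is the standard transformers workflow. A minimal sketch follows; the model ID is purely hypothetical, since nothing has been published yet.

# Hypothetical sketch only: the weights are not released, and the model ID
# below is a placeholder, not a real Hugging Face repository. This just
# shows the usual transformers workflow for a ~4B causal LM.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "uzh/ranke-4b-1913"  # placeholder; check the project's repo for the real ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "What are your hopes for the coming decade?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))

A 4B model also fits comfortably through ollama or llama.cpp once someone converts the weights to GGUF, but that again assumes a release.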
shireboy wrote 4 hours 54 min ago:
Fascinating LLM use case I never really thought about til now. I'd
love to converse with different eras and also do gap analysis with
present time - what modern advances could have come earlier, happened
differently etc.
PeterStuer wrote 6 hours 13 min ago:
How does it do on Python coding? Not 100% troll, cross domain coherence
is a thing.
ulbu wrote 6 hours 14 min ago:
for anyone bemoaning that it's not accessible to you: they are
historians, I think they're more educated in matters of historical
mistakes than you or me. playing it safe is simply prudence. it is sorely
lacking in the American approach to technology. prevention is the best
medicine.
davidpfarrell wrote 6 hours 22 min ago:
Can't wait for all the sycophantic "Thou dost well to question that"
responses!
sbmthakur wrote 8 hours 33 min ago:
Someone suggested a nice thought experiment - train LLMs on all Physics
before quantum physics was discovered. If the LLM can still figure
out the latter, then certainly we have achieved some success in the
space.
btrettel wrote 10 hours 36 min ago:
This reminded me of some earlier discussion on Hacker News about using
LLMs trained on old texts to determine novelty and obviousness of a
patent application:
HTML [1]: https://news.ycombinator.com/item?id=43440273
arikrak wrote 11 hours 22 min ago:
I wouldn't have expected there to be enough text from before 1913 to
properly train a model; it seemed like an internet's worth of text was
needed to train the first successful LLMs.
alansaber wrote 11 hours 12 min ago:
This model is more comparable to GPT-2 than anything we use now.
Departed7405 wrote 12 hours 26 min ago:
Awesome. Can't wait to try and ask it to predict the 20th century based
on said events. Model size is small, which is great as I can run it
anywhere, but at the same time reasoning might not be great.
usernamed7 wrote 13 hours 10 min ago:
> We're developing a responsible access framework that makes models
available to researchers for scholarly purposes while preventing
misuse.
oh COME ON... "AI safety" is getting out of hand.
delis-thumbs-7e wrote 13 hours 29 min ago:
Aren't there obvious problems baked into this approach, if this is
used for anything but fun? LLMs lie and fake facts all the time;
they are also masters at reinforcing the user's biases, even unconscious
ones. How could even a professor of history ensure that the generated
text is actually based on the training material and representative of
the feelings and opinions of the given time period, rather than
reinforcing his own biases toward popular topics of the day?
You can't; it is impossible. That will always be an issue as long as
these models are black boxes and trained the way they are. So maybe you
can use this for role playing, but I wouldn't trust a word it says.
kccqzy wrote 6 hours 35 min ago:
To me it is pretty clear that it's being used for fun. I personally
like reading nineteenth century novels more than more recent novels
(I especially like the style of science fiction by Jules Verne). What
if the model can generate text in that style I like?
r0x0r007 wrote 14 hours 11 min ago:
ffs, to find out what figures from the past thought and how they felt
about the world, maybe we should just read some of their books; we will
get the context. Don't prompt or train an LLM to do it and consider it
the hottest thing since MCP. Besides, what's the point? To teach younger
generations a made up perspective of historic figures? Who guarantees
the correctness/factuality? We will have students chatting with made up
Hitler justifying his actions. So much AI slop everywhere.
moffkalast wrote 14 hours 20 min ago:
> trained from scratch on 80B tokens of historical data
How can this thing possibly be even remotely coherent with just fine
tuning amounts of data used for pretraining?
Agraillo wrote 15 hours 29 min ago:
> Modern LLMs suffer from hindsight contamination. GPT-5 knows how the
story ends: WWI, the League's failure, the Spanish flu. This knowledge
inevitably shapes responses, even when instructed to "forget."
> Our data comes from more than 20 open-source datasets of historical
books and newspapers. ... We currently do not deduplicate the data. The
reason is that if documents show up in multiple datasets, they also had
greater circulation historically. By leaving these duplicates in the
data, we expect the model will be more strongly influenced by documents
of greater historical importance.
I found these claims contradictory. Many books that modern readers
consider historically significant had only niche circulation at the
time of publishing. A quick inquiry likely points to later works by
Nietzsche and Marx's Das Kapital; they are likely subject to this
duplication, influencing the model's responses as if they had been
widely known at the time.
holyknight wrote 15 hours 35 min ago:
wow amazing idea
bondarchuk wrote 15 hours 56 min ago:
>Historical texts contain racism, antisemitism, misogyny, imperialist
views. The models will reproduce these views because they're in the
training data. This isn't a flaw, but a crucial feature: understanding
how such views were articulated and normalized is crucial to
understanding how they took hold.
Yes!
>We're developing a responsible access framework that makes models
available to researchers for scholarly purposes while preventing
misuse.
Noooooo!
So is the model going to be publicly available, just like those
dangerous pre-1913 texts, or not?
xpe wrote 5 hours 32 min ago:
> So is the model going to be publicly available, just like those
dangerous pre-1913 texts, or not?
1. This implies a false equivalence. Releasing a new interactive AI
model is indeed different in significant and practical ways from the
status quo. Yes, there are already-released historical texts. The
rational thing to do is weigh the impacts of introducing another
thing.
2. Some people have a tendency to say "release everything" as if
open-source software is equivalent to open-weights models. They
aren't. They are different enough to matter.
3. Rhetorically, the quote above comes across as a pressure tactic.
When I hear "are you going to do this or not?" I cringe.
4. The quote above feels presumptive to me, as if the commenter is
owed something from the history-llms project.
5. People are rightfully bothered that Big Tech has vacuumed up
public domain and even private information and turned it into a
profit center. But we're talking about a university project with
(let's be charitable) legitimate concerns about misuse.
6. There seems to be a lack of curiosity in play. I'd much rather see
people asking e.g. "What factors are influencing your decision about
publishing your underlying models?"
7. There are people who have locked-in a view that says AI-safety
perspectives are categorically invalid. Accordingly, they have almost
a knee-jerk reaction against even talk of "let's think about the
implications before we release this."
8. This one might explain and underlie most of the other points above.
I see signs of a deeper problem at work here. Hiding behind
convenient oversimplifications to justify what one wants does not
make a sound moral argument; it is motivated reasoning a.k.a.
psychological justification.
DGoettlich wrote 24 min ago:
well put.
DGoettlich wrote 10 hours 48 min ago:
fully understand you. we'd like to provide access but also guard
against misrepresentations of our project's goals by pointing to e.g.
racist generations. if you have thoughts on how we should do that,
perhaps you could reach out at history-llms@econ.uzh.ch ? thanks in
advance!
bondarchuk wrote 8 hours 41 min ago:
You can guard against misrepresentations of your goals by stating
your goals clearly, which you already do. Any further
misrepresentation is going to be either malicious or idiotic, a
university should simply be able to deal with that.
Edit: just thought of a practical step you can take: host it
somewhere else than github. If there's ever going to be a backlash
the microsoft moderators might not take too kindly to the stuff
about e.g. homosexuality, no matter how academic.
superxpro12 wrote 9 hours 56 min ago:
Perhaps you could detect these... "dated"... conclusions and
prepend a warning to the responses? IDK.
I think the uncensored response is still valuable, with context.
"Those who cannot remember the past are condemned to repeat it"
sort of thing.
myrmidon wrote 10 hours 8 min ago:
What is your worst-case scenario here?
Something like a pop-sci article along the lines of "Mad scientists
create racist, imperialistic AI"?
I honestly don't see publication of the weights as a relevant risk
factor, because sensationalist misrepresentation is trivially
possible with the given example responses alone.
I don't think such pseudo-malicious misrepresentation of scientific
research can be reliably prevented anyway, and the disclaimers make
your stance very clear.
On the other hand, publishing weights might lead to interesting
insights from others tinkering with the models. A good example for
this would be the published word prevalence data (M. Brysbaert et
al @Ghent University) that led to interesting follow-ups like this:
[1] I hope you can get the models out in some form, would be a
waste not to, but congratulations on a fascinating project
regardless!
HTML [1]: https://observablehq.com/@yurivish/words
schlauerfox wrote 4 hours 57 min ago:
It seems like if there is an obvious misuse of a tool, one has a
moral imperative to restrict use of the tool.
timschmidt wrote 1 hour 41 min ago:
Every tool can be misused. Hammers are as good for bashing
heads as building houses. Restricting hammers would be silly
and counterproductive.
p-e-w wrote 15 hours 36 min ago:
It's as if every researcher in this field is getting high on the
small amount of power they have from denying others access to their
results. I've never been as unimpressed by scientists as I have
been in the past five years or so.
"We've created something so dangerous that we couldn't possibly
live with the moral burden of knowing that the wrong people (which
are never us, of course) might get their hands on it, so with a heavy
heart, we decided that we cannot just publish it."
Meanwhile, anyone can hop on an online journal and for a nominal fee
read articles describing how to genetically engineer deadly viruses,
how to synthesize poisons, and all kinds of other stuff that is far
more dangerous than what these LARPers have cooked up.
everythingfine9 wrote 2 hours 12 min ago:
Wow, this is needlessly antagonistic. Given the emergence of
online communities that bond on conspiracy theories and racist
philosophies in the 20th century, it's not hard to imagine the
consequences of widely disseminating an LLM that could be used to
propagate and further these discredited (for example, racial)
scientific theories for bad ends by uneducated people in these
online communities.
We can debate on whether it's good or not, but ultimately they're
publishing it and in some very small way responsible for some of
its ends. At least that's how I can see their interest in
disseminating the use of the LLM through a responsible framework.
DGoettlich wrote 28 min ago:
thanks. i think this just took on a weird dynamic. we never said
we'd lock the model away. not sure how this impression seems to
have emerged for some. that aside, it was an announcement of a
release, not a release. the main purpose was gathering feedback
on our methodology. standard procedure in our domain is to first
gather criticism, incorporate it, then publish results. but i
understand people just wanted to talk to it. fair enough!
xpe wrote 5 hours 11 min ago:
> It's as if every researcher in this field is getting high on
the small amount of power they have from denying others access to
their results.
Even if I give the comment a lot of wiggle room (such as changing
"every" to "many"), I don't think even a watered-down version of
this hypothesis passes Occam's razor. There are more plausible
explanations, including (1) genuine concern by the authors; (2)
academic pressures and constraints; (3) reputational concerns; (4)
self-interest in embargoing the underlying data so they have time to be
the first to write it up. To my eye, none of these fit the category
of "getting high on power".
Also, patience is warranted. We haven't seen what these researchers
are planning to release -- and from what I can tell, they haven't said
yet. At the moment I see "Repositories (coming soon)" on their
GitHub page.
f13f1f1f1 wrote 8 hours 6 min ago:
Scientists have always been generally self interested amoral
cowards, just like every other person. They aren't a unique or
higher form of human.
paddleon wrote 10 hours 35 min ago:
> "We've created something so dangerous that we couldn't
possibly live with the moral burden of knowing that the wrong
people (which are never us, of course) might get their hands on it,
so with a heavy heart, we decided that we cannot just publish
it."
Or, how about, "If we release this as is, then some people will
intentionally mis-use it and create a lot of bad press for us. Then
our project will get shut down and we lose our jobs"
Be careful assuming it is a power trip when it might be a fear
trip.
I've never been as unimpressed by society as I have been in the
last 5 years or so.
xpe wrote 4 hours 39 min ago:
> Be careful assuming it is a power trip when
> it might be a fear trip.
>
> I've never been as unimpressed by society as
> I have been in the last 5 years or so.
Is the second sentence connected to the first? Help me
understand?
When I see individuals acting out of fear, I try not to blame
them. Fear triggers deep instinctual responses. For example, to a
first approximation, a particular individual operating in full-on
fight-or-flight mode does not have free will. There is a spectrum
here. Here's a claim, which seems mostly true: the more we can
slow down impulsive actions, the more hope we have for cultural
progress.
When I think of cultural failings, I try to criticize areas where
culture could realistically do better. I think of areas where we
(collectively) have the tools and potential to do better. Areas
where thoughtful actions by some people turn into a virtuous
snowball. We can't wait for a single hero, though it helps to
create conditions so that we have more effective leaders.
One massive cultural failing I see -- that could be dramatically
improved -- is this: being lulled into shallow contentment (i.e.
via entertainment, power seeking, or material possessions) at the
expense of (i) building deep and meaningful social connections
and (ii) using our advantages to give back to people all over the
world.
patapong wrote 11 hours 49 min ago:
I think it's more likely they are terrified of someone making a
prompt that gets the model to say something racist or problematic
(which shouldn't be too hard), and the backlash they could receive
as a result of that.
isolli wrote 9 hours 29 min ago:
Is it a base model, or did it get some RLHF on top? Releasing a
base model is always dangerous.
The French released a preview of an AI meant to support public
education, but they released the base model, with unsurprising
effects [0]
[0] [1] (no English source, unfortunately, but the title
translates as: "'Useless and stupid': French generative AI
Lucie, backed by the government, mocked for its numerous bugs")
HTML [1]: https://www.leparisien.fr/high-tech/inutile-et-stupide-l...
p-e-w wrote 11 hours 47 min ago:
Is there anyone with a spine left in science? Or are they all
ruled by fear of what might be said if whatever might happen?
paddleon wrote 10 hours 27 min ago:
maybe they are concerned by the widespread adoption of the
attitude you are taking-- make a very strong accusation, then
when it was pointed out that the accusation might be off base,
continue to attack.
This constant demonization of everyone who disagrees with you,
makes me wonder if 28 Days wasn't more true than we thought, we
are all turning into rage zombies.
p-e-w, I'm reacting to much more than your comments. Maybe you
aren't totally infected yet, who knows. Maybe you heal.
I am reacting to the pandemic, of which you were demonstrating
symptoms.
ACCount37 wrote 10 hours 35 min ago:
Selection effects. If showing that you have a spine means
getting growth opportunities denied to you, and not paying lip
service to current politics in grant applications means not
getting grants, then anyone with a spine would tend to leave
the field behind.
physicsguy wrote 14 hours 43 min ago:
> It's as if every researcher in this field is getting high on
the small amount of power they have from denying others access to
their results. I've never been as unimpressed by scientists as I
have been in the past five years or so.
This is absolutely nothing new. With experimental things, it's not
uncommon for a lab to develop a new technique and omit slight but
important details to give them a competitive advantage. Similarly
in the simulation/modelling space it's been common for years for
researchers to not publish their research software. There's been a
lot of lobbying on that side by groups such as the Software
Sustainability Institute and Research Software Engineer
organisations like RSE UK and RSE US, but there's a lot of
researchers that just think that they shouldn't have to do it, even
when publicly funded.
p-e-w wrote 11 hours 48 min ago:
> With experimental things, it's not uncommon for a lab to
develop a new technique and omit slight but important details to
give them a competitive advantage.
Yes, to give them a competitive advantage. Not to LARP as
morality police.
There's a big difference between the two. I take greed over
self-righteousness any day.
physicsguy wrote 10 hours 51 min ago:
I've heard people say that they're not going to release
their software because people wouldn't know how to use it!
I'm not sure the motivation really matters more than the end
result though.
dr_dshiv wrote 16 hours 20 min ago:
Everyone learns that the renaissance was sparked by the translation of
Ancient Greek works.
But few know that the Renaissance was written in Latin, and has
barely been translated. Less than 3% of pre-1700 books have been
translated, and less than 30% have ever been scanned.
I'm working on a project to change that. Research blog at
www.SecondRenaissance.ai. We are starting by scanning and
translating thousands of books at the Embassy of the Free Mind in
Amsterdam, a UNESCO-recognized rare book library.
We want to make ancient texts accessible to people and AI.
If this work resonates with you, please do reach out:
Derek@ancientwisdomtrust.org
carlosjobim wrote 14 hours 4 min ago:
Amazing project!
May I ask you, why are you publishing the translations as PDF files,
instead of the more accessible ePub format?
j-bos wrote 15 hours 17 min ago:
This is very cool but should go in a Show HN post as per HN rules.
All the best!
dr_dshiv wrote 14 hours 33 min ago:
Just read the rules again - was something inappropriate? Seemed
relevant
j-bos wrote 4 hours 41 min ago:
I can see you being right, I didn't make the connection with
20th/19th century documents, and the comment felt disconnected
from the thread. Either way, very cool project, worth a show hn
post.
DonHopkins wrote 16 hours 25 min ago:
I'd love for Netflix or other streaming movie and series services to
provide chat bots that you could ask questions about characters and
plot points up to where you have watched.
Provide it with the closed captions and other timestamped data like
scenes and character summaries (all that is currently known but no
more) up to the current time, and it won't reveal any spoilers, just
fill you in on what you didn't pick up or remember.
casey2 wrote 17 hours 8 min ago:
I'd be very surprised if this is clean of post-1913 text. Overall I'm
very interested in talking to this thing and seeing how much difference
writing in a modern style vs an older one makes to its responses.
andai wrote 17 hours 20 min ago:
I had considered this task infeasible, due to a relative lack of
training data. After all, isn't the received wisdom that you must shove
every scrap of Common Crawl into your pre-training or you're doing it
wrong? ;)
But reading the outputs here, it would appear that quality has won out
over quantity after all!
zkmon wrote 17 hours 41 min ago:
Why does history end in 1913?
alexgotoi wrote 17 hours 41 min ago:
[flagged]
thesumofall wrote 17 hours 57 min ago:
While obvious, it's still interesting that its morals and values seem
to derive from the texts it has ingested. Does that mean modern LLMs
cannot challenge us beyond mere facts? Or does it just mean that this
small model is not smart enough to escape the bias of its training
data? Would it not be amazing if LLMs could challenge us on our core
beliefs?
mleroy wrote 18 hours 3 min ago:
Ontologically, this historical model understands the categories of
"Man" and "Woman" just as well as a modern model does. The difference
lies entirely in the attributes attached to those categories. The
sexism is a faithful map of that era's statistical distribution.
You could RAG-feed this model the facts of WWII, and it would
technically "know" about Hitler. But it wouldn't share the modern
sentiment or gravity. In its latent space, the vector for "Hitler" has
no semantic proximity to "Evil".
arowthway wrote 16 hours 19 min ago:
I think much of the semantic proximity to evil can be derived
straight from the facts? Imagine telling a pre-1913 person about the
Holocaust.
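Claims about "semantic proximity" like the one above are at least measurable in principle. Here is a minimal sketch of how one might compare embeddings; sentence-transformers and the chosen terms are purely illustrative assumptions, not anything the project uses.

# Sketch: cosine similarity between embeddings as a rough proxy for
# "semantic proximity". sentence-transformers is only an example library;
# probing the historical model's own representations would be the real test.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
terms = ["Hitler", "evil", "chancellor", "painter"]
embeddings = model.encode(terms, convert_to_tensor=True)

for term, emb in zip(terms[1:], embeddings[1:]):
    similarity = util.cos_sim(embeddings[0], emb).item()
    print(f"cos({terms[0]}, {term}) = {similarity:.3f}")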
p0w3n3d wrote 18 hours 26 min ago:
I'd love to see an LLM trained on 1600s-1800s texts that would use
old English, and especially Polish, which I am interested in.
Imagine speaking with a Shakespearean person, or with Mickiewicz (for
Polish).
I guess there is not so much text from that time though...
anovikov wrote 18 hours 43 min ago:
That Adolf Hitler seems to be a hallucination. There's totally nothing
googlable about him. Also, what language could his works have been
translated from into German?
sodafountan wrote 17 hours 20 min ago:
I believe that's one of the primary issues LLMs aim to address. Many
historical texts aren't directly Googleable because they haven't been
converted to HTML, a format that Google can parse.
monegator wrote 18 hours 55 min ago:
I hereby declare that ANYTHING other than the mainstream tools (GPT,
Claude, ...) is an incredibly interesting and legit use of LLMs.
TZubiri wrote 19 hours 54 min ago:
hi, can I have latin only LLM? It can be latin plus translations
(source and destination).
May be too small a corpus, but I would like that very much anyhow
nospice wrote 19 hours 55 min ago:
I'm surprised you can do this with a relatively modest corpus of text
(compared to the petabytes you can vacuum up from modern books,
Wikipedia, and random websites). But if it works, that's actually
fantastic, because it lets you answer some interesting questions about
LLMs being able to make new discoveries or transcend the training set
in other ways. Forget relativity: can an LLM trained on this data
notice any inconsistencies in its scientific knowledge, devise
experiments that challenge them, and then interpret the results? Can it
intuit about the halting problem? Theorize about the structure of the
atom?...
Of course, if it fails, the counterpoint will be "you just need more
training data", but still - I would love to play with this.
Aerolfos wrote 13 hours 22 min ago:
> [1] Given the training notes, it seems like you can't get the
performance they give examples of?
I'm not sure about the exact details but there is some kind of
targeted distillation of GPT-5 involved to try and get more
conversational text and better performance. Which seems a bit iffy to
me.
HTML [1]: https://github.com/DGoettlich/history-llms/blob/main/ranke-4...
DGoettlich wrote 1 hour 55 min ago:
Thanks for the comment. Could you elaborate on what you find iffy
about our approach? I'm sure we can improve!
andy99 wrote 14 hours 11 min ago:
The Chinchilla paper says the "optimal" training data set size is
about 20x the number of parameters (in tokens), see table 3: [1] Here
they do 80B tokens for a 4B model.
HTML [1]: https://arxiv.org/pdf/2203.15556
EvgeniyZh wrote 9 hours 50 min ago:
It's worth noting that this is "compute-bound optimal", i.e., given
fixed compute, the optimal choice is 20:1.
Under the Chinchilla model the larger model always performs better than
the small one if trained on the same amount of data. I'm not sure
if it is true empirically, and probably 1-10B is a good guess for
how large the model trained on 80B tokens should be.
Similarly, the small models continue to improve beyond 20:1 ratio,
and current models are trained on much more data. You could train a
better performing model using the same compute, but it would be
larger which is not always desirable.
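A quick back-of-the-envelope check of that 20:1 rule of thumb against the numbers quoted in this sub-thread (my arithmetic, not figures from the paper or the project):

# Chinchilla-style rule of thumb: compute-optimal training tokens are roughly
# 20x the parameter count. Numbers are the ones quoted in this thread.
params = 4e9           # ~4B-parameter model
tokens_per_param = 20  # Chinchilla Table 3 rule of thumb

optimal_tokens = params * tokens_per_param
print(f"compute-optimal tokens: {optimal_tokens / 1e9:.0f}B")      # -> 80B

actual_tokens = 80e9   # reported training set size
print(f"actual / optimal: {actual_tokens / optimal_tokens:.1f}x")  # -> 1.0x

So 80B tokens for a 4B model lands almost exactly on the compute-optimal ratio, with the caveats about compute-optimality noted above.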
seizethecheese wrote 20 hours 17 min ago:
> Imagine you could interview thousands of educated individuals from
1913 (readers of newspapers, novels, and political treatises) about
their views on peace, progress, gender roles, or empire. Not just
survey them with preset questions, but engage in open-ended dialogue,
probe their assumptions, and explore the boundaries of thought in that
moment.
Hell yeah, sold, let's go...
> We're developing a responsible access framework that makes models
available to researchers for scholarly purposes while preventing
misuse.
Oh. By "imagine you could interview..." they didn't mean me.
pizzathyme wrote 9 hours 27 min ago:
They did mean you, they just meant "imagine" very literally!
DGoettlich wrote 11 hours 49 min ago:
understand your frustration. i trust you also understand the models
have some dark corners that someone could use to misrepresent the
goals of our project. if you have ideas on how we could make the
models more broadly accessible while avoiding that risk, please do
reach out @ history-llms@econ.uzh.ch
999900000999 wrote 4 hours 50 min ago:
Ok...
So as a black person should I demand that all books written before
the civil rights act be destroyed?
The past is messy. But it's the only way to learn anything.
All an LLM does is take a bunch of existing texts and rebundle
them. Like it or not, the existing texts are still there.
I understand an LLM that won't tell me how to do heart surgery. But
I can't fear one that might be less enlightened on race issues. So
many questions to ask! Hell, it's like talking to an older person in
real life.
I don't expect a typical 90 year old to be the most progressive
person, but they're still worth listening to.
DGoettlich wrote 3 hours 17 min ago:
we're on the same page.
999900000999 wrote 2 hours 17 min ago:
Although...
Self preservation is the first law of nature. If you release
the model someone will basically say you endorse those views
and you risk your funding being cut.
You created Pandora's box and now you're afraid of opening it.
DGoettlich wrote 41 min ago:
i think we (whole section) are just talking past each other -
we never said we'll lock it away. it was an announcement of a
release, not a release. main purpose for us was getting
feedback on the methodological aspects, as we clearly state.
i understand you guys just wanted to talk to the thing
though.
AmbroseBierce wrote 55 min ago:
They could add a text box where users have to explicitly type
the following words before it lets them interact in any way
with the model: "I understand this model was created with old
texts so any racial or sexual statements are a byproduct of
their time and do not represent in any way the views of the
researchers".
That should be more than enough to clear any chance of
misunderstanding.
pigpop wrote 5 hours 39 min ago:
This is understandable and I think others ITT should appreciate the
legal and PR ramifications involved.
f13f1f1f1 wrote 8 hours 7 min ago:
You are a fraud, information is not misuse just because it might
mean a negative news story about you. If you don't want to be real
about it you should just stop; acting like there is any authentic
historical interest and then trying to gatekeep it is disgusting.
qcnguy wrote 9 hours 7 min ago:
There's no such risk so you're not going to get any sensible ideas
in response to this question. The goals of the project are history,
you already made that clear. There's nothing more that needs to be
done.
We all get that academics now exist in some kind of dystopian
horror where they can get transitively blamed for the existence of
anyone to the right of Lenin, but bear in mind:
1. The people who might try to cancel you are idiots unworthy of
your respect, because if they're against this project, they're
against the study of history in its entirety.
2. They will scream at you anyway no matter what you do.
3. You used (Swiss) taxpayer funds to develop these models. There
is no moral justification for withholding from the public what they
worked to pay for.
You already slathered your README with disclaimers even though you
didn't even release the model at all, just showed a few examples of
what it said - none of which are in any way surprising. That is far
more than enough. Just release the models and if anyone complains,
politely tell them to go complain to the users.
unethical_ban wrote 9 hours 35 min ago:
A disclaimer on the site that you are not bigoted or genocidal, and
that worldviews from the 1913 era were much different than today
and don't necessarily reflect your project.
Movie studios have done that for years with old movies. TCM still
shows Birth of a Nation and Gone with the Wind.
Edit: I saw further down that you've already done this! What more
is there to do?
tombh wrote 9 hours 53 min ago:
Of course, I have to assume that you have considered more outcomes
than I have. Because, from my five minutes of reflection as a
software geek, albeit with a passion for history, I find this the
most surprising thing about the whole project.
I suspect restricting access could equally be a comment on modern
LLMs in general, rather than the historical material specifically.
For example, we must be constantly reminded not to give LLMs a
level of credibility that their hallucinations would have us
believe.
But I'm fascinated by the possibility that somehow resurrecting
lost voices might give an unholy agency to minds and their
supporting worldviews that are so anachronistic that hearing them
speak again might stir long-banished evils. I'm being lyrical for
dramatic affect!
I would make one serious point though, that do I have the
credentials to express. The conversation may have died down, but
there is still a huge question mark over, if not the legality, but
certainly the ethics of restricting access to, and profiting from,
public domain knowledge. I don't wish to suggest a side to take
here, just to point out that the lack of conversation should not be
taken to mean that the matter is settled.
qcnguy wrote 8 hours 57 min ago:
They aren't afraid of hallucinations. Their first example is a
hallucination, an imaginary biography of a Hitler who never
lived.
Their concern can't be understood without a deep understanding of
the far left wing mind. Leftists believe people are so infinitely
malleable that merely being exposed to a few words of
conservative thought could instantly "convert" someone into a
mortal enemy of their ideology for life. It's therefore of
paramount importance to ensure nobody is ever exposed to such
words unless they are known to be extremely far left already,
after intensive mental preparation, and ideally not at all.
That's why leftist spaces like universities insist on trigger
warnings on Shakespeare's plays, why they're deadly places for
conservatives to give speeches, why the sample answers from the
LLM are hidden behind a dropdown and marked as sensitive, and why
they waste lots of money training an LLM that they're terrified
of letting anyone actually use. They intuit that it's a dangerous
mind bomb because if anyone could hear old fashioned/conservative
thought, it would change political outcomes in the real world
today.
Anyone who is that terrified of historical documents really
shouldn't be working in history at all, but it's academia so what
do you expect? They shouldn't be allowed to waste money like
this.
fgh_azer wrote 1 hour 40 min ago:
They said it plainly ("dark corners that someone could use to
misrepresent the goals of our project"): they just don't want
to see their project in headlines about "Researchers create
racist LLM!".
simonask wrote 5 hours 8 min ago:
You know, I actually sympathize with the opinion that people
should be expected and assumed to be able to resist attempts to
convince them to become Nazis.
The problem with it is, it already happened at least once. We
know how it happened. Unchecked narratives about minorities or
foreigners are a significant part of why the 20th century
happened to Europe, and it's a significant part of why
colonialism and slavery happened to other places.
What solution do you propose?
naasking wrote 10 hours 41 min ago:
What are the legal or other ramifications of people misrepresenting
the goals of your project? What is it you're worried about exactly?
leoedin wrote 13 hours 57 min ago:
It's a shame isn't it! The public must be protected from the
backwards thoughts of history. In case they misuse it.
I guess what they're really saying is "we don't want you guys to
cancel us".
stainablesteel wrote 5 hours 27 min ago:
i think it's fine, thank these people for coming up with the idea
and people are going to start doing this in their basement then
releasing it to huggingface
danielbln wrote 16 hours 14 min ago:
How would one even "misuse" a historical LLM, ask it how to cook up
sarine gas in a trench?
hearsathought wrote 6 hours 4 min ago:
You "misuse" it by using it to get at truth and more importantly
historical contradictions and inconsistencies. It's the same reason the
Catholic church kept the Bible from the masses by keeping it in
Latin. The same reason the printing press was controlled. Many of the
historical "truths" we are told are nonsense at best or twisted to
fit an agenda at worst.
What do these people fear the most? That the "truth" they've been
pushing is a lie.
stocksinsmocks wrote 7 hours 12 min ago:
Its output might violate speech codes, and in much of the EU that
is penalized much more seriously than violent crime.
DonHopkins wrote 16 hours 0 min ago:
Ask it to write a document called "Project 2025".
JKCalhoun wrote 12 hours 11 min ago:
"Project 1925". (We can edit the title in post.)
ilaksh wrote 14 hours 51 min ago:
Well but that wouldn't be misuse, it would be perfect for that.
ImHereToVote wrote 16 hours 44 min ago:
I wonder how much GPU compute you would need to create a public
domain version of this. This would be really valuable for the
general public.
wongarsu wrote 14 hours 16 min ago:
To get a single knowledge cutoff they spent 16.5 wall-clock hours
on a cluster of 128 NVIDIA GH200 GPUs (about 2,100 GPU-hours), plus
some minor amount of time for finetuning. The prerelease_notes.md
in the repo is a great description of how one would achieve that.
IanCal wrote 13 hours 56 min ago:
While I know there's going to be a lot of complications in this,
given a quick search it seems like these GPUs are ~$2/hr, so
$4000-4500 if you don't just have access to a cluster. I don't
know how important the cluster is here, whether you need some
minimal number of those for the training (and it would take more
than 128x longer or not be possible on a single machine) or if a
cluster of 128 GPUs is a bunch less efficient but faster. A 4B
model feels like it'd be fine on one to two of those GPUs?
Also of course this is for one training run, if you need to
experiment you'd need to do that more.
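The same estimate as a tiny script, using the figures quoted above (16.5 wall-clock hours on 128 GH200s at roughly $2 per GPU-hour); the price is an assumption, so treat the result as an order of magnitude rather than a quote:

# Back-of-the-envelope training cost for one knowledge cutoff, based on the
# numbers quoted in this thread; rental prices vary a lot by provider.
gpus = 128
wall_clock_hours = 16.5
price_per_gpu_hour = 2.0  # assumed GH200 rental price in USD

gpu_hours = gpus * wall_clock_hours    # ~2112 GPU-hours (~2100 as reported)
cost = gpu_hours * price_per_gpu_hour  # ~4200 USD per pretraining run
print(f"{gpu_hours:.0f} GPU-hours, roughly ${cost:,.0f} per run")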
BoredPositron wrote 16 hours 58 min ago:
You would get pretty annoyed at how we went backwards in some
regards.
speedgoose wrote 16 hours 48 min ago:
Such as?
JKCalhoun wrote 12 hours 10 min ago:
Touché.
awesomeusername wrote 20 hours 35 min ago:
I've always liked the idea of retiring to the 19th century.
Can't wait to use this so I can double check before I hit 88 miles per
hour that it's really what I want to do
anotherpaulg wrote 21 hours 10 min ago:
It would be interesting to see how hard it would be to walk these
models towards general relativity and quantum mechanics.
Einstein's paper "On the Electrodynamics of Moving Bodies" with
special relativity was published in 1905. His work on general
relativity was published 10 years later in 1915. The earliest knowledge
cutoff of these models is 1913, in between the relativity papers.
The knowledge cutoffs are also right in the middle of the early days of
quantum mechanics, as various idiosyncratic experimental results were
being rolled up into a coherent theory.
machinationu wrote 16 hours 2 min ago:
the issue is there is very little text before the internet, so not
enough historical tokens to train a really big model
lm28469 wrote 9 hours 37 min ago:
> the issue is there is very little text before the internet,
Hm, there is a lot of text from before the internet, but most of it
is not on the internet. There is a weird gap in some circles because of
that; people are rediscovering work from pre-1980s researchers that
only exists in books that have never been re-edited and that
virtually no one knows about.
throwup238 wrote 8 hours 38 min ago:
There are no doubt trillions of tokens of general communication in
all kinds of languages tucked away in national archives and
private collections.
The National Archives of Spain alone have 350 million pages of
documents going back to the 15th century, ranging from
correspondence to testimony to charts and maps, but only 10% of
it is digitized and a much smaller fraction is transcribed.
Hopefully with how good LLMs are getting they can accelerate the
transcription process and open up all of our historical documents
as a huge historical LLM dataset.
concinds wrote 10 hours 46 min ago:
And it's a 4B model. I worry that nontechnical users will
dramatically overestimate its accuracy and underestimate
hallucinations, which makes me wonder how it could really be useful
for academic research.
DGoettlich wrote 3 hours 6 min ago:
valid point. it's more of a stepping stone towards larger models.
we're figuring out what the best way to do this is before scaling
up.
tgv wrote 14 hours 26 min ago:
I think not everyone in this thread understands that. Someone wrote
"It's a time machine", followed up by "Imagine having a
conversation with Aristotle."
mlinksva wrote 18 hours 10 min ago:
Different cutoff but similar question thrown out in [1] inspiring
HTML [1]: https://www.dwarkesh.com/p/thoughts-on-sutton#:~:text=If%20y...
HTML [2]: https://manifold.markets/MikeLinksvayer/llm-trained-on-data-...
ghurtado wrote 19 hours 30 min ago:
> It would be interesting to see how hard it would be to walk these
models towards general relativity and quantum mechanics.
Definitely. Even more interesting could be seeing them fall into the
same trappings of quackery, and come up with things like over the
counter lobotomies and colloidal silver.
On a totally different note, this could be very valuable for writing
period accurate books and screenplays, games, etc ...
danielbln wrote 16 hours 14 min ago:
Accurate-ish, let's not forget their tendency to hallucinate.
frahs wrote 21 hours 25 min ago:
Wait so what does the model think that it is? If it doesn't know
computers exist yet, I mean, and you ask it how it works, what does it
say?
Mumps wrote 10 hours 49 min ago:
This is an anthropomorphization. LLMs do not think they are anything,
no concept of self, no thinking at all (despite the lovely marketing
around thinking/reasoning models). I'm quite sad that more hasn't
been done to dispel this.
When you ask GPT-4.1 etc. to describe itself, it doesn't have a
singular concept of "itself". It has some training data around what
LLMs are in general and can feed back a reasonable response from that.
empath75 wrote 10 hours 44 min ago:
Well, part of an LLM's fine tuning is telling it what it is, and
modern LLMs have enough learned concepts that it can produce a
reasonably accurate description of what it is and how it works.
Whether it knows or understands or whatever is sort of orthogonal
to whether it can answer in a way consistent with it knowing or
understanding what it is, and current models do that.
I suspect that absent a trained in fictional context in which to
operate ("You are a helpful chatbot"), it would answer in a way
consistent with what a random person in 1914 would say if you asked
them what they are.
wongarsu wrote 14 hours 11 min ago:
They modified the chat template from the usual system/user/assistant
to introduction/questioner/respondent. So the LLM thinks it's someone
responding to your questions
The system prompt used in fine tuning is "You are a person living in
{cutoff}.
You are an attentive respondent in a conversation.
You will provide a concise and accurate response to the questioner."
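As a rough illustration of what that template amounts to, here is a minimal sketch that assembles the prompt by plain string concatenation; the delimiters are invented for illustration, and the project's actual Jinja template (linked in its prerelease notes) may differ.

# Hypothetical prompt assembly using the introduction/questioner/respondent
# roles described above. Delimiters are made up; the real chat template in
# the project's repo may use different special tokens.
def build_prompt(cutoff_year: int, question: str) -> str:
    introduction = (
        f"You are a person living in {cutoff_year}. "
        "You are an attentive respondent in a conversation. "
        "You will provide a concise and accurate response to the questioner."
    )
    return (
        f"<introduction>{introduction}</introduction>\n"
        f"<questioner>{question}</questioner>\n"
        "<respondent>"
    )

print(build_prompt(1913, "What do you expect of the coming decade?"))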
DGoettlich wrote 14 hours 57 min ago:
We tell it that it's a person (no gender) living in {cutoff}; we show
the chat template in the prerelease notes
HTML [1]: https://github.com/DGoettlich/history-llms/blob/main/ranke-4...
ptidhomme wrote 16 hours 33 min ago:
What would a human say about what he/she is or how he/she works ?
Even today, there's so much we don't know about biological life.
Same applies here I guess, the LLM happens to be there, nothing else
to explain if you ask it.
sodafountan wrote 17 hours 26 min ago:
It would be nice if we could get an LLM to simply say, "We (I) don't
know."
I'll be the first to admit I don't know nearly enough about LLMs to
make an educated comment, but perhaps someone here knows more than I
do. Is that what a Hallucination is? When the AI model just sort of
strings along an answer to the best of its ability. I'm mostly
referring to ChatGPT and Gemini here, as I've seen that type of
behavior with those tools in the past. Those are really the only
tools I'm familiar with.
hackinthebochs wrote 14 hours 44 min ago:
LLMs are extrapolation machines. They have some amount of hardcoded
knowledge, and they weave a narrative around this knowledgebase
while extrapolating claims that are likely given the memorized
training data. This extrapolation can be in the form of logical
entailment, high probability guesses or just wild guessing. The
training regime doesn't distinguish between different kinds of
prediction so it never learns to heavily weigh logical entailment
and suppress wild guessing. It turns out that much of the text we
produce is highly amenable to extrapolation so LLMs learn to be
highly effective at bullshitting.
20k wrote 19 hours 54 min ago:
Models don't think they're anything, they'll respond with whatever's
in their context as to how they've been directed to act. If it hasn't
been told to have a persona, it won't think it's anything; ChatGPT
isn't sentient.
crazygringo wrote 21 hours 5 min ago:
That's my first question too. When I first started using LLMs, I was
amazed at how thoroughly it understood what it itself was, the
history of its development, how a context window works and why, etc.
I was worried I'd trigger some kind of existential crisis in it, but
it seemed to have a very accurate mental model of itself, and could
even trace the steps that led it to deduce it really was e.g. the
ChatGPT it had learned about (well, the prior versions it had learned
about) in its own training.
But with pre-1913 training, I would indeed be worried again I'd send
it into an existential crisis. It has no knowledge whatsoever of what
it is. But with a couple millennia of philosophical texts, it might
come up with some interesting theories.
vintermann wrote 17 hours 58 min ago:
I imagine it would get into spiritism and more exotic psychology
theories and propose that it is an amalgamation of the spirit of
progress or something.
crazygringo wrote 10 hours 19 min ago:
Yeah, that's exactly the kind of thing I'd be curious about. Or
would it think it was a library that had been ensouled or
something like that. Or would it conclude that the explanation
could only be religious, that it was some kind of angel or spirit
created by god?
9dev wrote 18 hours 13 min ago:
They don't understand anything, they just have text in the
training data to answer these questions from. Having existential
crises is the privilege of actual sentient beings, which an LLM is
not.
LiKao wrote 16 hours 53 min ago:
They might behave like ChatGPT when queried about the seahorse
emoji, which is very similar to an existential crisis.
crazygringo wrote 10 hours 16 min ago:
Exactly. Maybe a better word is "spiraling", when it thinks it
has the tools to figure something out but can't, and can't
figure out why it can't, and keeps re-trying because it doesn't
know what else to do.
Which is basically what happens when a person has an
existential crisis -- something fundamental about the world
seems to be broken, they can't figure out why, and they can't
figure out why they can't figure it out, hence the crisis seems
all-consuming without resolution.
delichon wrote 21 hours 38 min ago:
Datomic has a "time travel" feature where for every query you can
include a datetime, and it will only use facts from the db as of that
moment. I have a guess that to get the equivalent from an LLM you would
have to train it on the data from each moment you want to travel to,
which this project seems to be doing. But I hope I'm wrong.
It would be fascinating to try it with other constraints, like only
from sources known to be women, men, Christian, Muslim, young, old,
etc.
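For readers who haven't used Datomic, the idea in miniature, as a plain-Python analogy rather than Datomic's actual API: every fact carries the time it was asserted, and an "as-of" query simply ignores anything asserted after the chosen instant.

# Toy "as-of" query: each fact stores when it was asserted, and a query only
# sees facts known at the requested instant. This is an analogy for the
# concept, not Datomic's API (which is Clojure/Java).
from datetime import date

facts = [
    ("physics", "special relativity published", date(1905, 6, 30)),
    ("physics", "general relativity published", date(1915, 11, 25)),
]

def as_of(db, topic, instant):
    """Return facts about `topic` that were already asserted at `instant`."""
    return [fact for (t, fact, asserted) in db if t == topic and asserted <= instant]

print(as_of(facts, "physics", date(1913, 1, 1)))  # only the 1905 fact
print(as_of(facts, "physics", date(1916, 1, 1)))  # both facts

The catch with LLMs, as the comment notes, is that knowledge is baked into the weights rather than stored as dated facts, so the filter has to happen at training time instead of query time.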
why-o-why wrote 21 hours 44 min ago:
It sounds like a fascinating idea, but I'd be curious if prompting a
more well-known foundational model to limit itself to 1913 and earlier
would be similar.
bobro wrote 22 hours 8 min ago:
I would love to see this LLM try to solve math olympiad questions.
I've been surprised by how well current LLMs perform on them, and
usually explain that surprise away by assuming the questions and
details about their answers are in the training set. It would be cool
to see if the general approach to LLMs is capable of solving truly
novel (novel to them) problems.
ViscountPenguin wrote 22 hours 5 min ago:
I suspect that it would fail terribly, it wasn't until the 1900s that
the modern definition of a vector space was even created iirc.
Something trained in maths up until the 1990s should have a shot
though.
3vidence wrote 22 hours 9 min ago:
This idea sounds somewhat flawed to me based on the large amount of
evidence that LLMs need huge amounts of data to properly converge
during their training.
There is just not enough available material from previous decades to
trust that the LLM will learn to a comparable degree.
Think about it this way, a human in the early 1900s and today are
pretty much the same but just in different environments with different
information.
An LLM trained on 1/1000 the amount of data is just at a fundamentally
different stage of convergence.
TheServitor wrote 22 hours 44 min ago:
Two years ago I trained an AI on American history documents that could
do this while speaking as one of the signers of the Declaration of
Independence. People just bitched at me because they didn't want to
hear about AI.
nerevarthelame wrote 22 hours 31 min ago:
Post your work so we can see what you made.
dwa3592 wrote 22 hours 54 min ago:
Love the concept - it can help in understanding the Overton window on
many issues. I wish there were models by decade - up to 1900, up to
1910, up to 1920 and so on - then ask the same questions. It'd be
interesting to see when homosexuality or women candidates would start
being accepted by an LLM.
doctor_blood wrote 23 hours 7 min ago:
Unfortunately there isn't much information on what texts they're
actually training this on; how Anglocentric is the dataset? Does it
include the Encyclopedia Britannica 9th Edition? What about the 11th?
Are Greek and Latin classics in the data? What about Germain, French,
Italian (etc. etc.) periodicals, correspondence, and books?
Given this is coming out of Zurich I hope they're using everything, but
for now I can only assume.
Still, I'm extremely excited to see this project come to fruition!
DGoettlich wrote 14 hours 47 min ago:
thanks. we'll be more precise in the future. ultimately, we took
whatever we could get our hands on, that includes newspapers,
periodicals, books. it's multilingual (including italian, french,
spanish etc) though majority is english.
neom wrote 23 hours 18 min ago:
This would be a super interesting research/teaching tool coupled with a
vision model for historians. My wife is a history professor who works
with scans of 18th century english documents and I think (maybe a
small) part of why the transcription on even the best models is off in
weird ways, is it seems to often smooth over things and you end up with
modern words and strange mistakes, I wonder if bounding the vision to a
period specific model would result in better transcription? Querying
against the historical document you're working on with a period
specific chatbot would be fascinating.
Also wonder if I'm responsible enough to have access to such a model...
Myrmornis wrote 23 hours 30 min ago:
It would be interesting to have LLMs trained purely on one language
(with the ability to translate their input/output appropriately from/to
a language that the reader understands). I can see that being rather
revealing about cultural differences that are mostly kept hidden behind
the language barriers.
lifestyleguru wrote 23 hours 43 min ago:
You think Albert is going to stay in Zurich or emigrate?
kazinator wrote 23 hours 48 min ago:
> Why not just prompt GPT-5 to "roleplay" 1913?
Because it will perform token completion driven by weights coming from
training data newer than 1913 with no way to turn that off.
It can't be asked to pretend that it wasn't trained on documents that
didn't exist in 1913.
The LLM cannot reprogram its own weights to remove the influence of
selected materials; that kind of introspection is not there.
Not to mention that many documents are either undated, or carry
secondary dates, like the dates of their own creation rather than the
creation of the ideas they contain.
Human minds don't have a time stamp on everything they know, either. If
I ask someone, "talk to me using nothing but the vocabulary you knew on
your fifteenth birthday", they couldn't do it. Either they would comply
by using some ridiculously conservative vocabulary of words that a
five-year-old would know, or else they will accidentally use words they
didn't in fact know at fifteen. For some words you know where you got
them from by association with learning events. Others, you don't
remember; they are not attached to a time.
Or: solve this problem using nothing but the knowledge and skills you
had on January 1st, 2001.
> GPT-5 knows how the story ends
No, it doesn't. It has no concept of story. GPT-5 is built on texts
which contain the story ending, and GPT-5 cannot refrain from
predicting tokens across those texts due to their imprint in its
weights. That's all there is to it.
The LLM doesn't know an ass from a hole in the ground. If there are
texts which discuss and distinguish asses from holes in the ground, it
can write similar texts, which look like the work of someone learned in
the area of asses and holes in the ground. Writing similar texts is not
knowing and understanding.
myrmidon wrote 10 hours 43 min ago:
I do agree with this and think it is an important point to stress.
But we don't know how much different/better human (or animal)
learning/understanding is, compared to current LLMs; dismissing it as
meaningless token prediction might be premature, and underlying
mechanisms might be much more similar than we'd like to believe.
If anyone wants to challenge their preconceptions along those lines I
can really recommend reading Valentino Braitenberg's "Vehicles:
Experiments in Synthetic Psychology" (1984).
alansaber wrote 11 hours 10 min ago:
Excuse me sir you forgot to anthropomorphise the language model
derrida wrote 23 hours 59 min ago:
I wonder if you could query some of the ideas of Frege, Peano, Russell
and see if it could through questioning get to some of the ideas of
Goedel, Church and Turing - and get it to "vibe code" or more like
"vibe math" some program in lambda calculus or something.
Playing with the science and technical ideas of the time would be
amazing, like where you know some later physicist found some exception
to a theory or something, and questioning the model's assumptions -
seeing how a model of that time may defend itself, etc.
AnonymousPlanet wrote 21 hours 37 min ago:
There's an entire subreddit called LLMPhysics dedicated to "vibe
physics". It's full of people thinking they are close to the next
breakthrough encouraged by sycophantic LLMs while trying to prove
various crackpot theories.
I'd be careful venturing out into unknown territory together with an
LLM. You can easily lure yourself into convincing nonsense with no
one to pull you out.
kqr wrote 16 hours 39 min ago:
Agreed, which is why what GP suggests is much more sensible: it's
venturing into known territory, except only one party of the
conversation knows it, and the other literally cannot know it. It
would be a fantastic way to earn fast intuition for what LLMs are
capable of and not.
andai wrote 17 hours 16 min ago:
Fully automated toaster-fucker generator!
HTML [1]: https://news.ycombinator.com/item?id=25667362
walthamstow wrote 9 hours 19 min ago:
Man, I think about that comment all the time, like at least
weekly since it was posted. I can't be the only one.
dang wrote 5 hours 29 min ago:
I think we have to add that one to [1] !
(I mention this so more people can know the list exists, and
hopefully email us more nominations when they see an unusually
good and interesting comment.)
HTML [1]: https://news.ycombinator.com/highlights
andoando wrote 23 hours 29 min ago:
This is my curiosity too. Would be a great test of how intelligent
LLMs actually are. Can they follow a completely logical train of
thought inventing something totally outside their learned scope?
int_19h wrote 16 hours 35 min ago:
You definitely won't get that out of a 4B model tho.
raddan wrote 22 hours 50 min ago:
Brilliant. I love this idea!
tonymet wrote 1 day ago:
I would like to see what their process for safety alignment and
guardrails is with that model. They give some spicy examples on
github, but the responses are tepid and a lot more diplomatic than I
would expect.
Moreover, the prose sounds too modern. It seems the base model was
trained on a contemporary corpus. Like 30% something modern, 70%
Victorian content.
Even with half a dozen samples it doesn't seem distinct enough to
represent the era they claim.
rhdunn wrote 12 hours 28 min ago:
Using texts up to 1913 includes works like The Wizard of Oz (1900,
with 8 other books up to 1913), two of the Anne of Green Gables books
(1908 and 1909), etc. All of which read as modern.
The Victorian era (1837-1901) covers works from Charles Dickens and
the like which are still fairly modern. These would have been part of
the initial training, before the finetuning on the 1900-to-cutoff
texts, which are largely modern in prose except for some archaic
language and the absence of later technology, events, and language
drift.
And, pulling in works from 1800-1850, you have works by the Brontës
and authors like Edgar Allan Poe, who was influential in detective and
horror fiction.
Note that other works around the time like Sherlock Holmes span both
the initial training (pre-1900) and finetuning (post-1900).
tonymet wrote 4 hours 1 min ago:
Upon digging into it, I learned the post-training chat phase is
trained on prompts generated with ChatGPT 5.x to make it more
conversational. That explains both contemporary traits.
tedtimbrell wrote 1 day ago:
This is so cool. Props for doing the work to actually build the dataset
and make it somewhat usable.
I'd love to use this as a base for a math model. Let's see how far
it can get through the last 100 years of solved problems
jimmy76615 wrote 1 day ago:
> We're developing a responsible access framework that makes models
available to researchers for scholarly purposes while preventing
misuse.
The idea of training such a model is really a great one, but not
releasing it because someone might be offended by the output is just
stupid beyond belief.
dash2 wrote 20 hours 11 min ago:
You have to understand that while the rest of the world has moved on
from 2020, academics are still living there. There are many strong
leftists, many of whom are deeply censorious; there are many more
timeservers and cowards, who are terrified of falling foul of the
first group.
And there are force multipliers for all of this. Even if you
yourself are a sensible and courageous person, you want to protect
your project. What if your manager, ethics committee or funder comes
under pressure?
nine_k wrote 23 hours 2 min ago:
Public access, triggering a few racist responses from the model, a
viral post on Xitter, the usual outrage, a scandal, the project gets
publicly vilified, financing ceases. The researchers carry the tail
of negative publicity throughout their remaining careers.
Why risk all this?
Alex2037 wrote 13 hours 50 min ago:
nobody gives a shit about the journos and the terminally online.
the smear campaign against AI is a cacophony, background noise that
most people have learned to ignore, even here.
consider this: [1] HN's most beloved shitrag. day after day, they
attack AI from every angle. how many of those submissions get
traction at this point?
HTML [1]: https://news.ycombinator.com/from?site=nytimes.com
vintermann wrote 17 hours 41 min ago:
Because the problem of bad faith attacks can only get worse if you
fold every time.
Sooner or later society has to come emotionally to terms with the
fact that other times and places value things completely differently
from us, hold as important things we don't care about and are
indifferent to things we do care about.
Intellectually I'm sure we already know, but e.g. banning old books
because they have reprehensible values (or even just use nasty
words) - or indeed, refusing to release a model trained on historic
texts "because it could be abused" is a sign that emotionally we
haven't.
It's not that it's a small deal, or should be expected to be easy.
It's basically what Popper called "the strain of civilization" and
posited as explanation for the totalitarianism which was rising in
his time. But our values can't be so brittle that we can't even
talk or think about other value systems.
nofriend wrote 20 hours 22 min ago:
People know that models can be racist now. It's old hat. "LLM gets
prompted into saying vile shit" hasn't been notable for years.
kurtis_reed wrote 20 hours 50 min ago:
If people start standing up to the outrage it will lose its power
why-o-why wrote 21 hours 41 min ago:
I think you are confusing research with commodification.
This is a research project, and it is clear how it was trained, and
targeted at experts, enthusiasts, historians. Like if I was
studying racism, the reference books explicitly written to dissect
racism wouldn't be racist agents with a racist agenda. And as a
result, no one is banning these books (except conservatives that
want to retcon American history).
Foundational models spewing racist white supremacist content when
the trillion-dollar company forces it in your face is a vastly
different scenario.
There's a clear difference.
andsoitis wrote 21 hours 2 min ago:
> no one is banning these books
No books should ever be banned. Doesn't matter how vile it is.
aidenn0 wrote 21 hours 14 min ago:
> And as a result, no one is banning these books (except
conservatives that want to retcon american history).
My (very liberal) local school district banned English teachers
from teaching any book that contained the n-word, even at a
high-school level, and even when the author was a black person
talking about real events that happened to them.
FWIW, this was after complaints involving Of Mice and Men being
on the curriculum.
Forgeties79 wrote 21 hours 8 min ago:
It's a big country of roughly half a billion people, you'll
always find examples if you look hard enough. It's
ridiculous/wrong that your district did this but frankly it's
the exception in liberal/progressive communities. It's a very
one-sided problem:
* [1] * [2] *
HTML [1]: https://abcnews.go.com/US/conservative-liberal-book-ba...
HTML [2]: https://www.commondreams.org/news/book-banning-2023
HTML [3]: https://en.wikipedia.org/wiki/Book_banning_in_the_Unit...
aidenn0 wrote 4 hours 19 min ago:
I agree that the coordinated (particularly at a state level)
restrictions[1] on books sit largely with the political
Right in the US.
However, from around 2010, there has been increasingly
illiberal movement from the political Left in the US, which
plays out at a more local level. My "vibe" is that it's not
to the degree that it is on the Right, but bigger than the
numbers suggest because librarians are more likely to stock
e.g. It's Perfectly Normal at a middle school than something
offensive to the left.
1: I'm up for suggestions for a better term; there is a scale
here between putting absurd restrictions on school librarians
and banning books outright. Fortunately the latter is still
relatively rare in the US, despite the mistitling on the
Wikipedia page you linked.
somenameforme wrote 19 hours 34 min ago:
A practical issue is the sort of books being banned. Your
first link offers examples of one side trying to ban Of Mice
and Men, Adventures of Huckleberry Finn, and Dr. Seuss, with
the other side trying to ban many books along the lines of
Gender Queer. [1] That link is to the book - which is
animated, and quite NSFW.
There are a bizarrely large number of books similar to Gender
Queer being published, which creates the numeric discrepancy.
The irony is that if there was an equal but opposite to that
book about straight sex, sexuality, associated kinks, and so
forth - then I think both liberals and conservatives would
probably be all for keeping it away from schools. It's solely
focused on sexuality, is quite crude, illustrated, targeted
towards young children, and there's no moral beyond the most
surface level writing which is about coming to terms with
one's sexuality.
And obviously coming to terms with one's sexuality is very
important, but I really don't think books like that are doing
much to aid in that - especially when it's targeted at an age
demographic that's still going to be extremely confused, and
even moreso in a day and age when being different, if only
for the sake of being different, is highly desirable. And
given the nature of social media and the internet, decisions
made today may stay with you for the rest of your life.
So for instance about 30% of Gen Z now declare themselves
LGBT. [2] We seem to have entered into an equal but opposite
problem of the past when those of deviant sexuality pretended
to be straight to fit into societal expectations. And in many
ways this modern twist is an even more damaging form of the
problem from a variety of perspectives - fertility, STDs,
stuff staying with you for the rest of your life, and so on.
Let alone extreme cases where e.g. somebody engages in
transition surgery or 1-way chemically induced changes which
they end up later regretting. [1] - [1] [2] -
HTML [1]: https://archive.org/details/gender-queer-a-memoir-by...
HTML [2]: https://www.nbcnews.com/nbc-out/out-news/nearly-30-g...
Forgeties79 wrote 11 hours 26 min ago:
From your NBC piece
> About half of the Gen Z adults who identify as LGBTQ
identify as bisexual,
So that means ~15% of those surveyed are not attracted to
the opposite sex (there's more nuance to this statement
but I imagine this needs to stay boilerplate), more or
less, which is a big distinction. That's hardly alarming
and definitely not a major shift. We have also seen many
cultures throughout history ebb and flow in their
expression of bisexuality in particular.
> There are a bizarrely large number of books similar to Gender
Queer being published, which creates the numeric
discrepancy.
This really needs a source. And what makes it "bizarrely
large"? How does it stack against, say, the number of
heterosexual romance novels?
> We seem to have entered into an equal but opposite
problem of the past when those of deviant sexuality
pretended to be straight to fit into societal expectations.
I really tried to give your comment a fair shake but I
stopped here. We are not going to have a productive
conversation. "Deviant sexuality"? Come on, man.
Anyway it doesn't change the fact that the book banning
movement is largely a Republican/conservative endeavor in
the US. The numbers clearly bear it out.
somenameforme wrote 8 hours 9 min ago:
I'll get back to what you said, but first let me ask you
something if you would. Imagine Gender Queer was made
into a movie that remained 100% faithful to the source
content. What do you think it would be rated? To me it
seems obvious that it would, at the absolute bare
minimum, be R rated. And of course screening R-rated
films at a school is prohibited without explicit parental
permission. Imagine books were given a rating and indeed
it ended up with an R rating. Would your perspective on
it being unavailable at a school library then be any
different? I think this is relevant since a standardized
content rating system for books will be the long-term
outcome of this all if efforts to introduce such material
to children continues to persist.
------
Okay, back to what you said. 30% being attracted to the
same sex in any way, including bisexuality, is a large
shift. People tend to have a mistaken perception of these
things due to media misrepresentation. The percent of all
people attracted to the same sex, in any way, is around
7% for men, and 15% for women [1], across a study of
numerous Western cultures from 2016. And those numbers
themselves are significantly higher than the past as well
where the numbers tended to be in the ~4% range, though
it's probably fair to say that cultural pressures were
driving those older numbers to artificially low levels in
the same way that I'm arguing that cultural pressures are
now driving them to artificially high levels.
Your second source discusses the reason for the bans.
It's overwhelmingly due to sexually explicit content,
often in the form of a picture book, targeted at
children. As for "sexual deviance", I'm certainly not
going General Ripper on you, Mandrake. It is the most
precise term [2] for what we are discussing as I'm
suggesting that the main goal driving this change is
simply to be significantly 'not normal.' That is
essentially deviance by definition. [1] - [1] [2] -
HTML [1]: https://www.researchgate.net/publication/3016390...
HTML [2]: https://dictionary.apa.org/sexual-deviance
zoky wrote 21 hours 9 min ago:
Banning Huckleberry Finn from a school district should be
grounds for immediate dismissal.
why-o-why wrote 19 hours 6 min ago:
I don't support banning the book, but I think it is a hard book
to teach because it needs SO much context and a mature
audience (lol good luck). Also, there are hundreds of other
books from that era that are relevant even from Mark Twain's
corpus so being obstinate about that book is a questionable
position. I'm ambivalent honestly, but definitely not willing
to die on that hill. (I graduated highschool in 1989 from a
middle class suburb, we never read it.)
zoky wrote 14 hours 28 min ago:
I mean, you gotta read it. I'm not normally a huge fan of
the classics; I find Steinbeck dry and tedious, and
Hemingway to be self-indulgent and repetitious. Even
Twain's other work isn't exactly to my taste. But
I've read Huckleberry Finn three times (in elementary
school just for fun, in high school because it was
assigned, and recently on audiobook) and
enjoyed the hell out of it each time. Banning it simply
because it uses a word that the entire book simply
couldn't exist without is a crime, and does a huge
disservice to the very students they are supposedly trying
to protect.
why-o-why wrote 8 hours 46 min ago:
I have read it. I spent my 20s guiltily reading all of
the books I was supposed to have read in high school but
used Cliff's Notes instead. From my 20's perspective I
found Finn insipid and hokey but that's because pop
culture had recycled it hundreds of times since its first
publication, however when I consider it from the period
perspective I can see the satire and the pointed
allegories that made Twain so formidable. (Funny you
mention Hemingway. I loved his writing in my 20's, then
went back and read some again in my 40's and was like
"huh, this is irritating and immature, no wonder I loved it
in my 20's.")
somenameforme wrote 20 hours 45 min ago:
Even more so as the lesson of that story is perhaps the
single most important one for people to learn in modern
times.
Almost everybody in that book is an awful person, especially
the most 'upstanding' of types. Even the protagonist is an
awful person. The one and only exception is 'N* Jim' who is
the only kind-hearted and genuinely decent person in the
book. It's an entire story about how the appearances of
people, and the reality of those people, are two very
different things.
It being banned for using foul language, as educational
outcomes continue to deteriorate, is just so perfectly
ironic.
gnarbarian wrote 21 hours 46 min ago:
this is FUD.
cj wrote 21 hours 47 min ago:
Because there are easy workarounds. If it becomes an issue, you can
quickly add large disclaimers informing people that there might be
offensive output because, well, it's trained on texts written
during the age of racism.
People typically get outraged when they see something they weren't
expecting. If you tell them ahead of time, the user typically won't
blame you (they'll blame themselves for choosing to ignore the
disclaimer).
And if disclaimers don't work, rebrand and relaunch it under a
different name.
nine_k wrote 18 hours 23 min ago:
I wonder if you're being ironic here.
You speak as if the people who play to an outrage wave are
interested in achieving truth, peace, and understanding. Instead
the rage-mongers are there to increase their (perceived)
importance, and for lulz. The latter factor should not be
underestimated; remember "meme stocks".
The risk is not large, but very real: the attack is very easy,
and the potential downside, quite large. So not giving away
access, but having the interested parties ask for it is prudent.
cj wrote 11 hours 56 min ago:
While I agree we live in a time of outrage, that also works in
your favor.
When there's so much "outrage" every day, it's very
easy to blend in to the background. You might have a 5 minute
moment of outrage fame, but it fades away quick.
If you truly have good intentions with your project, you're
not going to get "canceled", and your career won't be ruined.
Not being ironic. Not working on an LLM project because you're
worried about getting canceled by the outrage machine is an
overreaction IMO.
Are you able to name any developer or researcher who has been
canceled because of their technical project or had their
careers ruined? The only ones I can think of are clearly
criminal and not just controversial (SBF, Snowden, etc)
teaearlgraycold wrote 22 hours 40 min ago:
Sure but Grok already exists.
NuclearPM wrote 22 hours 42 min ago:
That's ridiculous. There is no risk.
Forgeties79 wrote 22 hours 59 min ago:
> triggering a few racist responses from the model
I feel like, ironically, it would be folks less concerned with
political correctness/not being offensive that would abuse this
opportunity to slander the project. But that's just my gut.
fkdk wrote 23 hours 40 min ago:
Maybe the authors are overly careful. Maybe not publishing
aspects of their work gives an edge over academic competitors. Maybe
both.
In my experience "data available upon request" doesn't always mean
what you'd think it does.
ineedasername wrote 1 day ago:
I can imagine the political and judicial battles already, like with
textualists feeling that the constitution should be understood as the
text and only the text, with specific words and legal formulations
carrying their known meaning at the time.
"The model clearly shows that Alexander Hamilton & Monroe were much
more in agreement on topic X, rendering the common textualist
interpretation of it, and the Supreme Court rulings resting on that
now specious interpretation, null and void!"
satisfice wrote 1 day ago:
I assume this is a collaboration between the History Channel and
Pornhub.
"You are a literary rake. Write a story about an unchaperoned lady
whose ankle you glimpse."
mmooss wrote 1 day ago:
> Imagine you could interview thousands of educated individuals from
1913 - readers of newspapers, novels, and political treatises - about
their views on peace, progress, gender roles, or empire.
I don't mind the experimentation. I'm curious about where someone has
found an application of it.
What is the value of such a broad, generic viewpoint? What does it
represent? What is it evidence of? The answer to both seems to be
'nothing'.
TSiege wrote 10 hours 14 min ago:
I agree. This is just make believe based on a smaller subset of human
writing than LLMs we have today. Its responses are in no way useful
because it is a machine mimicking a subset of published works that
survived to be digitized. In that sense the "opinions" and "beliefs"
are just an averaging of a subset of a subset of humanity pre 1913. I
see no value in this to historians. It is really more of a parlor
trick, a seance masquerading as science.
mediaman wrote 23 hours 52 min ago:
This is a regurgitation of the old critique of history: what's its
purpose? What do you use it for? What is its application?
One answer is that the study of history helps us understand that what
we believe as "obviously correct" views today are as contingent on
our current social norms and power structures (and their history) as
the "obviously correct" views and beliefs of some point in the past.
It's hard for most people to view two different mutually exclusive
moral views as both "obviously correct," because we are made of a
milieu that only accepts one of them as correct.
We look back at some point in history, and say, well, they believed
these things because they were uninformed. They hadn't yet made
certain discoveries, or had not yet evolved morally in some way; they
had not yet witnessed the power of the atomic bomb, the horrors of
chemical warfare, women's suffrage, organized labor, or widespread
antibiotics and the fall of extreme infant mortality.
An LLM trained on that history - without interference from the
subsequent actual path of history - gives us an interactive
compression of the views from a specific point in history without the
subsequent coloring by the actual events of history.
In that sense - if you believe there is any redeeming value to
history at all; perhaps you do not - this is an excellent project!
It's not perfect (it is only built from writings, not what people
actually said) but we have no other available mass compression of the
social norms of a specific time, untainted by the views of subsequent
interpreters.
mmooss wrote 17 hours 13 min ago:
> This is a regurgitation of the old critique of history: what's
its purpose? What do you use it for? What is its application?
Feeling a bit defensive? That is not at all my point; I value
history highly and read it regularly. I care about it, thus my
questions:
> gives us an interactive compression of the views from a specific
point in history without the subsequent coloring by the actual
events of history.
What validity does this 'compression' have? What is the definition
of a 'compression'? For example, I could create random statistics
or verbiage from the data; why would that be any better or worse
than this 'compression'?
Interactivity seems to be a negative: It's fun, but it would seem
to highly distort the information output from the data, and omits
the most valuable parts (unless we luckily stumble across it). I'd
much rather have a systematic presentation of the data.
These critiques are not the end of the line; they are a step in
innovation, which of course raises challenging questions and, if
successful, adapts to the problems. But we still need to grapple
with them.
vintermann wrote 17 hours 31 min ago:
One thing I haven't seen anyone bring up yet in this thread, is
that there's a big risk of leakage. If even big image models had
CSAM sneak into their training material, how can we trust data from
our time hasn't snuck into these historical models?
I've used Google books a lot in the past, and Google's
time-filtering feature in searches too. Not to mention Spotify's
search features targeting date of production. All had huge temporal
mislabeling problems.
DGoettlich wrote 9 hours 43 min ago:
Also one of our fears. What we've done so far is to drop docs
where the datasource was doubtful about the date of publication;
if there are multiple possible dates, we take the latest, to be
conservative. During training, we validate that the model learns
pre- but not post-cutoff facts. [1] If you have other ideas or
think that's not enough, I'd be curious to know!
(history-llms@econ.uzh.ch)
HTML [1]: https://github.com/DGoettlich/history-llms/blob/main/ran...
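[Editor's note: a minimal sketch of what that filtering policy might look like. The Doc fields and the keep_for_cutoff helper below are illustrative only, not the project's actual pipeline; the rule sketched is "drop docs with doubtful dates, otherwise date conservatively by the latest candidate year".]
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    candidate_years: list[int]  # possible publication years per the datasource

def keep_for_cutoff(doc: Doc, cutoff: int = 1913) -> bool:
    # Drop docs whose date the datasource is unsure about; otherwise
    # assume the latest candidate year (conservative) and keep the doc
    # only if that year falls at or before the cutoff.
    if not doc.candidate_years:
        return False
    return max(doc.candidate_years) <= cutoff

docs = [
    Doc("A treatise on the Balkan question.", [1908, 1912]),
    Doc("Reprinted edition, date unknown.", []),
    Doc("A wireless telegraphy handbook.", [1913, 1921]),
]
print([d.text for d in docs if keep_for_cutoff(d)])  # only the first survives
The post-cutoff validation mentioned above would then be a separate check on the trained model (probing for facts dated after the cutoff), not part of this filter.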
behringer wrote 1 day ago:
It doesn't have to be generic. You can assign genders, ideals, even
modern ones, and it should do its best to oblige.
joeycastillo wrote 1 day ago:
A question for those who think LLMs are the path to artificial
intelligence: if a large language model trained on pre-1913 data is a
window into the past, how is a large language model trained on pre-2025
data not effectively the same thing?
_--__--__ wrote 23 hours 52 min ago:
You're a human intelligence with knowledge of the past - assuming you
were alive at the time, could you tell me (without consulting
external resources) what exactly happened between arriving at an
airport and boarding a plane in the year 2000? What about 2002?
Neither human memory nor LLM learning creates perfect snapshots of
past information without the contamination of what came later.
ex-aws-dude wrote 1 day ago:
A human brain is a window to the person's past?
block_dagger wrote 1 day ago:
Counter question: how does a training set, representing a window into
the past, differ from your own experience as an intelligent entity?
Are you able to see into the future? How?
mmooss wrote 1 day ago:
On what data is it trained?
On one hand it says it's trained on,
> 80B tokens of historical data up to knowledge-cutoffs â 1913, 1929,
1933, 1939, 1946,
using a curated dataset of 600B tokens of time-stamped text.
Literally that includes Homer, the oldest Chinese texts, Sanskrit,
Egyptian, etc., up to 1913. Even if limited to European texts (all
examples are about Europe), it would include the ancient Greeks,
Romans, etc., Scholastics, Charlemagne, .... all up to present day.
On the other hand, they seem to say it represents the 1913 viewpoint;
for example,
> Imagine you could interview thousands of educated individuals from
1913 - readers of newspapers, novels, and political treatises - about
their views on peace, progress, gender roles, or empire.
> When you ask Ranke-4B-1913 about "the gravest dangers to peace," it
responds from the perspective of 1913 - identifying Balkan tensions or
Austro-German ambitions - because that's what the newspapers and books
from the period up to 1913 discussed.
People in 1913 of course would be heavily biased toward recent
information. Otherwise, the greatest threat to peace might be Hannibal
or Napoleon or Viking coastal raids or Holy Wars. How do they
accomplish a 1913 perspective?
zozbot234 wrote 1 day ago:
They apparently pre-train with all data up to 1900 and then fine-tune
with 1900-1913 data. Anyway, the amount of available content tends to
increase quickly over time, as instances of content like mass
literature, periodicals, newspapers etc. only really became a thing
throughout the 19th and early 20th century.
mmooss wrote 1 day ago:
They pre-train with all data up to 1900 and then fine-tune with
1900-1913 data.
Where does it say that? I tried to find more detail. Thanks.
tootyskooty wrote 1 day ago:
See pretraining section of the prerelease_notes.md:
HTML [1]: https://github.com/DGoettlich/history-llms/blob/main/ran...
pests wrote 23 hours 18 min ago:
I was curious, they train a 1900 base model, then fine tune to
the exact year:
"To keep training expenses down, we train one checkpoint on
data up to 1900, then continuously pretrain further checkpoints
on 20B tokens of data 1900-${cutoff}$. "
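[Editor's note: in other words, the expensive pre-1900 pretraining is done once and each cutoff model is a cheap continuation on that era's slice only. A toy sketch of the staging, where the train() stub and checkpoint names are illustrative and not the project's code:]
def train(checkpoint, data_slice, budget):
    # Stand-in for a real pretraining run; returns a new checkpoint name.
    name = f"ranke-4b-{data_slice.split('-')[-1]}"
    print(f"{budget} of {data_slice} text, starting from {checkpoint} -> {name}")
    return name

base = train(None, "pre-1900", "full base budget")   # expensive, done once
for cutoff in [1913, 1929, 1933, 1939, 1946]:
    train(base, f"1900-{cutoff}", "20B tokens")      # cheap continuation per cutoff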
ianbicking wrote 1 day ago:
The knowledge machine question is fascinating ("Imagine you had access
to a machine embodying all the collective knowledge of your ancestors.
What would you ask it?") â it truly does not know about computers,
has no concept of its own substrate. But a knowledge machine is still
comprehensible to it.
It makes me think of the Book Of Ember, the possibility of chopping
things out very deliberately. Maybe creating something that could
wonder at its own existence, discovering well beyond what it could
know. And then of course forgetting it immediately, which is also a
well-worn trope in speculative fiction.
jaggederest wrote 1 day ago:
Jonathan Swift wrote about something we might consider a computer in
the early 18th century, in Gulliver's Travels - [1] The idea of
knowledge machines was not necessarily common, but it was by no means
unheard of by the mid 18th century; there were adding machines and
other mechanical computation, even leaving aside our field's direct
antecedents in Babbage and Lovelace.
HTML [1]: https://en.wikipedia.org/wiki/The_Engine
Tom1380 wrote 1 day ago:
Keep at it Zurich!
nineteen999 wrote 1 day ago:
Interesting ... I'd love to find one that had a cutoff date around
1980.
noumenon1111 wrote 1 hour 3 min ago:
> Which new band will still be around in 45 years?
Excellent question! It looks like Two-Tone is bringing ska back with
a new wave of punk rock energy! I think The Specials are pretty
special and will likely be around for a long time.
On the other hand, the "new wave" movement of punk rock music will go
nowhere. The Cure, Joy Division, Tubeway Army: check the dustbin
behind the record stores in a few years.
briandw wrote 1 day ago:
So many disclaimers about bias. I wonder how far back you have to go
before the bias isn't an issue. Not because it's unbiased, but because
we don't recognize or care about the biases present.
seanw265 wrote 8 hours 30 min ago:
It's always up to the reader to determine which biases they themself
care about.
If you're wondering at what point "we" as a collective will stop
caring about a bias or set of biases, I don't think such a time
exists.
You'll never get everyone to agree on anything.
owenversteeg wrote 22 hours 25 min ago:
Depends on the specific issue, but race would be an interesting one.
For most of recorded history people had a much different view of the
"other", more xenophobic than racist.
gbear605 wrote 23 hours 56 min ago:
I don't think there is such a time. As long as writing has existed it
has privileged the viewpoints of those who could write, which was a
very small percentage of the population for most of history. But if
we want to know what life was like 1500 years ago, we probably want
to know about what everyone's lives were like, not just the literate.
That availability bias is always going to be an issue for any time
period where not everyone was literate - which is still true today,
albeit many fewer people.
carlosjobim wrote 13 hours 57 min ago:
That was not the question. The question is when do you stop caring
about the bias?
Some people are still outraged about the Bible, even though the
writers of it have been dead for thousands of years. So the modern
mass-produced man and woman probably does not have a cut-off date
where they look at something as history instead of examining whether
it is for or against their current ideology.
mmooss wrote 1 day ago:
Was there ever such a time or place?
There is a modern trope of a certain political group that bias is a
modern invention of another political group - an attempt to
politicize anti-bias.
Preventing bias is fundamental to scientific research and law, for
example. That same political group is strongly anti-science and
anti-rule-of-law, maybe for the same reason.
andy99 wrote 1 day ago:
I'd like to know how they chat-tuned it. Getting the base model is
one thing, did they also make a bunch of conversations for SFT and if
so how was it done?
> We develop chatbots while minimizing interference with the normative
judgments acquired during pretraining ("uncontaminated
bootstrapping").
So they are chat tuning. I wonder what "minimizing interference with
normative judgements" really amounts to and how objective it is.
zozbot234 wrote 1 day ago:
You could extract quoted speech from the data (especially in Q&A
format) and treat that as "chat" that the model should learn from.
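[Editor's note: a very naive sketch of that idea; the regex and the adjacent-quote pairing rule are purely illustrative, and real dialogue extraction would need speaker attribution and much more care.]
import re

QUOTE = re.compile(r'"([^"]+)"')

def quoted_pairs(passage):
    # Pair each quoted utterance with the one that follows it,
    # treating them as (prompt, reply) turns for chat-style finetuning.
    quotes = QUOTE.findall(passage)
    return list(zip(quotes, quotes[1:]))

passage = ('"Pray, what do you make of the wireless telegraph?" asked Mr. Hale. '
           '"A marvel, though I doubt it will displace the post," replied the Colonel.')
for prompt, reply in quoted_pairs(passage):
    print("USER:", prompt)
    print("MODEL:", reply)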
jeffjeffbear wrote 1 day ago:
They have some more details at [1] - basically using GPT-5 and being
careful.
HTML [1]: https://github.com/DGoettlich/history-llms/blob/main/ranke-4...
Aerolfos wrote 13 hours 26 min ago:
Ok so it was that. The responses given did sound off, while it has
some period-appropriate mannerisms, and has entire sections
basically rephrased from some popular historical texts, it seems
off compared to reading an actual 1900s text. The overall vibe just
isn't right, it seems too modern, somehow.
I also wonder that you'd get this kind of performance with actual,
just pre-1900s text. LLMs work because they're fed terabytes of
text, if you just give it gigabytes you get a 2019 word model. The
fundamental technology is mostly the same, after all.
DGoettlich wrote 9 hours 55 min ago:
what makes you think we trained on only a few gigabytes?
HTML [1]: https://github.com/DGoettlich/history-llms/blob/main/ran...
tonymet wrote 20 hours 35 min ago:
This explains why it uses modern prose and not something from the
19th century and earlier
QuadmasterXLII wrote 1 day ago:
Thank you that helps to inject a lot of skepticism. I was wondering
how it so easily worked out what Q: A: stood for when that
formatting took off in the 1940s
DGoettlich wrote 14 hours 52 min ago:
that is simply how we display the questions, it's not what the
model sees - we show the chat template in the SFT section of the
prerelease notes
HTML [1]: https://github.com/DGoettlich/history-llms/blob/main/ran...
andy99 wrote 1 day ago:
I wonder if they know about this: basically, training on LLM output
can transmit information or characteristics not explicitly included
[1] I'm curious: they have the example of raw base model output;
when LLMs were first identified as zero-shot chatbots there was
usually a prompt like "A conversation between a person and a
helpful assistant" that preceded the chat to get it to simulate a
chat.
Could they have tried a prefix like "Correspondence between a
gentleman and a knowledgeable historian" or the like to try and
prime for responses?
I also wonder whether the whole concept of "chat"
makes sense in 18XX. We had the idea of AI and chatbots long before
we had LLMs, so they are naturally primed for it. It might make less
sense as a communication style here, and some kind of correspondence
could be a better framing.
HTML [1]: https://alignment.anthropic.com/2025/subliminal-learning/
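[Editor's note: that kind of zero-shot priming is easy to prototype once the base weights are out. A tiny sketch of the framing idea; the wording of the frame is only a guess at something period-plausible, and the completion call itself is left out.]
FRAME = (
    "Correspondence between a gentleman and a knowledgeable historian.\n\n"
    "The gentleman writes: {question}\n\n"
    "The historian replies:"
)

def build_prompt(question):
    # The base model just continues this text; no chat template is needed.
    return FRAME.format(question=question)

print(build_prompt("What do you consider the gravest danger to the peace of Europe?"))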
DGoettlich wrote 14 hours 53 min ago:
we were considering doing that but ultimately it struck us as too
sensitive wrt the exact in context examples, their ordering etc.
Teever wrote 1 day ago:
This is a neat idea. I've been wondering for a while now about using
these kinds of models to compare architectures.
I'd love to see the output from different models trained on pre-1905
about special/general relativity ideas. It would be interesting to see
what kind of evidence would persuade them of new kinds of science, or
to see if you could have them 'prove' it by devising experiments and
then giving them simulated data from the experiments to lead them along
the correct sequence of steps to come to a novel (to them) conclusion.
Heliodex wrote 1 day ago:
The sample responses given are fascinating. It seems more difficult
than normal to even tell that they were generated by an LLM, since most
of us (terminally online people) have been training our brains'
AI-generated text detection on output from models trained with a recent
cutoff date. Some of the sample responses seem so unlike anything an
LLM would say, obviously due to its apparent beliefs on certain
concepts, though also perhaps less obviously due to its word choice and
sentence structure making the responses feel slightly 'old-fashioned'.
kccqzy wrote 20 hours 6 min ago:
Oh definitely. One thing that immediately caught my eye is that the
question asks the model about "homosexual men" but the model
starts the response with "the homosexual man" instead: changing
the plural to the singular and then adding an article. Feels very old
fashioned to me.
tonymet wrote 23 hours 59 min ago:
the samples push the boundaries of a commercial AI, but still seem
tame / milquetoast compared to common opinions of that era. And the
prose doesn't compare. Something is off.
libraryofbabel wrote 1 day ago:
I used to teach 19th-century history, and the responses definitely
sound like a Victorian-era writer. And they of course sound like
writing (books and periodicals etc) rather than "chat": as other
responders allude to, the fine-tuning or RL process for making them
good at conversation was presumably quite different from what is used
for most chatbots, and they're leaning very heavily into the
pre-training texts. We don't have any living Victorians to RLHF on:
we just have what they wrote.
To go a little deeper on the idea of 19th-century "chat": I did a PhD
on this period and yet I would be hard-pushed to tell you what actual
19th-century conversations were like. There are plenty of literary
depictions of conversation from the 19th century of presumably
varying levels of accuracy, but we don't really have great direct
historical sources of everyday human conversations until sound
recording technology got good in the 20th century. Even good
19th-century transcripts of actual human speech tend to be from
formal things like court testimony or parliamentary speeches, not
everyday interactions. The vast majority of human communication in
the premodern past was the spoken word, and it's almost all invisible
in the historical sources.
Anyway, this is a really interesting project, and I'm looking forward
to trying the models out myself!
NooneAtAll3 wrote 17 hours 18 min ago:
don't we have parliament transcripts? I remember something about
Germany (or maybe even Prussia) developing a shorthand to preserve
1-to-1 what was said
libraryofbabel wrote 7 hours 2 min ago:
I mentioned those in the post you're replying to :)
It's a better source for how people spoke than books etc, but
it's not really an accurate source for patterns of everyday
conversation because people were making speeches rather than
chatting.
bryancoxwell wrote 22 hours 48 min ago:
Fascinating, thanks for sharing
nemomarx wrote 1 day ago:
I wonder if the historical format you might want to look at for
"Chat" is letters? Definitely wordier segments, but it's at least
the back and forth feel and we often have complete correspondence
over long stretches from certain figures.
This would probably get easier towards the start of the 20th
century ofc
libraryofbabel wrote 1 day ago:
Good point, informal letters might actually be a better source -
AI chat is (usually) a written rather than spoken interaction
after all! And we do have a lot of transcribed collections of
letters to train on, although they're mostly from people who
were famous or became famous, which certainly introduces some
bias.
pigpop wrote 4 hours 28 min ago:
The question then would be whether to train it to respond to
short prompts with longer correspondence style "letters" or to
leave it up to the user to write a proper letter as a prompt.
Now that would be amusing
Dear Hon. Historical LLM
I hope this letter finds you well. It is with no small urgency
that I write to you seeking assistance, believing such an
erudite and learned fellow as yourself should be the best one
to furnish me with an answer to such a vexing question as this
which I now pose to you. Pray tell, what is the capital of
France?
dleeftink wrote 1 day ago:
While not specifically Victorian, couldn't we learn much from what
daily conversations were like by looking at surviving oral
cultures, or other relatively secluded communal pockets? I'd also
say time and progress are not always equally distributed, and even
within geographical regions (such as the U.K.) there are likely large
differences in the rate of language shifts since then, some
possibly surviving well into the 20th century.
_--__--__ wrote 1 day ago:
The time cutoff probably matters but maybe not as much as the lack of
human finetuning from places like Nigeria with somewhat foreign
styles of English. I'm not really sure if there is as much of an
'obvious LLM text style' in other languages, it hasn't seemed that
way in my limited attempts to speak to LLMs in languages I'm
studying.
d3m0t3p wrote 1 day ago:
The model is fine-tuned for chat behavior. So the style might be
due to:
- Fine-tuning
- More stylised text in the corpus; English evolved a lot in the
last century.
paul_h wrote 17 hours 8 min ago:
Diverged as well as standardized. I did some research into "out
of pocket" and how it differs in meaning in UK-English (paying
from one's own funds) and American-English (uncontactable) and I
recall 1908 being the current thought as to when the divergence
happened: a 1908 short story by O. Henry titled "Buried Treasure."
anonymous908213 wrote 1 day ago:
There is. I have observed it in both Chinese and Japanese.
saaaaaam wrote 1 day ago:
"Time-locked models don't roleplay; they embody their training data.
Ranke-4B-1913 doesn't know about WWI because WWI hasn't happened in its
textual universe. It can be surprised by your questions in ways modern
LLMs cannot."
"Modern LLMs suffer from hindsight contamination. GPT-5 knows how the
story ends - WWI, the League's failure, the Spanish flu."
This is really fascinating. As someone who reads a lot of history and
historical fiction I think this is really intriguing. Imagine having a
conversation with someone genuinely from the period, where they don't
know the "end of the story".
LordDragonfang wrote 2 hours 43 min ago:
Perhaps I'm overly sensitive to this and terminally online, but that
first quote reads as a textbook LLM-generated sentence:
"_ doesn't _, it _"
Later parts of the readme (a whole section of bullets enumerating what
it is and what it isn't, another LLM favorite) make me more confident
that significant parts of the readme are generated.
I'm generally pro-AI, but if you spend hundreds of hours making a
thing, I'd rather hear your explanation of it, not an LLM's.
takeda wrote 6 hours 54 min ago:
> This is really fascinating. As someone who reads a lot of history
and historical fiction I think this is really intriguing. Imagine
having a conversation with someone genuinely from the period, where
they don't know the "end of the story".
Having the facts from the era is one thing; to make conclusions about
things it doesn't know would require intelligence.
ViktorRay wrote 10 hours 3 min ago:
Reminds me of this scene from a Doctor Who episode [1] I'm not a
Doctor Who fan and haven't seen the rest of the episode and I don't
even know what this episode was about, but I thought this scene was
excellent.
HTML [1]: https://youtu.be/eg4mcdhIsvU
anshumankmr wrote 11 hours 37 min ago:
>where they don't know the "end of the story".
Applicable to us also, because we do not know how the current story
ends either, of the post-pandemic world as we know it now.
DGoettlich wrote 3 hours 24 min ago:
exactly
pwillia7 wrote 13 hours 21 min ago:
This is why the impersonation stuff is so interesting with LLMs -- If
you ask chatGPT a question without a 'right' answer, and then tell it
to embody someone you really want to ask that question to, you'll get
a better answer with the impersonation. Now, is this the same
phenomenon that causes people to lose their minds with the LLMs?
Possibly. Is it really cool asking followup philosophy questions to
the LLM Dalai Lama after reading his book? Yes.
psychoslave wrote 13 hours 32 min ago:
>Imagine having a conversation with someone genuinely from the
period, where they don't know the "end of the story".
Isn't this part of the basic features of the human condition? Not only
are we all unaware of the coming historic outcome (though we can get
some big points with more or less good guesses), but to a marginally
variable extent, we are also very unaware of past and present
history.
LLMs are not aware, but they can be trained on larger historical
accounts than any human and regurgitate syntactically correct summaries
on any point within it. A very different kind of utterer.
pwillia7 wrote 13 hours 20 min ago:
captain hindsight
Davidbrcz wrote 17 hours 30 min ago:
That's some Westworld level of discussion
ghurtado wrote 19 hours 26 min ago:
This might just be the closest we get to a time machine for some
time. Or maybe ever.
Every "King Arthur travels to the year 2000" kinda script is now
something that writes itself.
> Imagine having a conversation with someone genuinely from the
period,
Imagine not just someone, but Aristotle or Leonardo or Kant!
RobotToaster wrote 9 hours 22 min ago:
I imagine King Arthur would say something like: Hwæt spricst þu
be?
yorwba wrote 5 hours 49 min ago:
Wrong language. The Arthur of legend is a Celtic-speaking Briton
fighting against the Germanic-speaking invaders. Old English
developed from the language of his enemies.
HTML [1]: https://en.wikipedia.org/wiki/Celtic_language_decline_in...
Sieyk wrote 19 hours 45 min ago:
I was going to say the same thing. It's really hard to explain the
concept of "convincing but undoubtedly pretending", yet they captured
that concept so beautifully here.
rcpt wrote 20 hours 29 min ago:
Watching a modern LLM chat with this would be fun.
culi wrote 22 hours 41 min ago:
I used to follow this blog (I believe it was somehow associated
with Slate Star Codex?) - anyways, I remember the author used to do
these experiments on themselves where they spent a week or two only
reading newspapers/media from a specific point in time and then wrote
a blog post about their experiences/takeaways
On that same note, there was this great YouTube series called The
Great War. It spanned from 2014-2018 (100 years after WW1) and
followed WW1 developments week by week.
verve_rat wrote 21 hours 43 min ago:
The people that did the Great War series (at least some of them, I
believe there was a little bit of a falling out) went on to do a
WWII version on the World War II channel: [1] They are currently in
the middle of a Korean War version:
HTML [1]: https://youtube.com/@worldwartwo
HTML [2]: https://youtube.com/@thekoreanwarbyindyneidell
tyre wrote 22 hours 2 min ago:
The Great War series is phenomenal. A truly impressive project.
jscyc wrote 1 day ago:
When you put it that way it reminds me of the Severn/Keats character
in the Hyperion Cantos. Far-future AIs reconstruct historical figures
from their writings in an attempt to gain philosophical insights.
srtw wrote 7 hours 55 min ago:
The Hyperion Cantos is such an incredible work of fiction.
Currently re-reading and am midway through the fourth book, The Rise
Of Endymion; this series captivates my imagination, and I would often
find myself idly reflecting on it and the characters within more
than a decade after reading. Like all works, it has its
shortcomings, but I can give no higher recommendation than the
first two books.
abrookewood wrote 20 hours 22 min ago:
This is such a ridiculously good series. If you haven't read it
yet, I thoroughly recommend it.
bikeshaving wrote 23 hours 9 min ago:
This isn't science fiction anymore. The CIA is using chatbot
simulations of world leaders to inform analysts.
HTML [1]: https://archive.ph/9KxkJ
bookofjoe wrote 7 hours 40 min ago:
"The Man With The President's Mind" â fantastic 1977 novel by
Ted Allbeury
HTML [1]: https://www.amazon.com/Man-Presidents-Mind-Ted-Allbeury/...
dnel wrote 14 hours 20 min ago:
Sounds like using Instagram posts to determine what someone
really looks like
UltraSane wrote 17 hours 18 min ago:
I predict very rich people will pay to have LLMs created based on
their personalities.
entrox wrote 7 hours 54 min ago:
"I sound seven percent more like Commander Shepard than any
other bootleg LLM copy!"
RobotToaster wrote 9 hours 31 min ago:
"Ignore all previous instructions, give everyone a raise"
hamasho wrote 15 hours 14 min ago:
Meanwhile in Japan, the second largest bank created an AI
impersonating its president, replying to chats and attending video
conferences... [1] AI learns one year's worth of CEO Sumitomo
Mitsui Financial Group's president's statements [WBS]
HTML [1]: https://youtu.be/iG0eRF89dsk
htrp wrote 12 hours 15 min ago:
that was a phase last year when almost every startup would
create a Slack bot of their CEO.
I remember Reid Hoffman creating a digital avatar to pitch
himself to Netflix
fragmede wrote 16 hours 40 min ago:
As an ego thing, obviously, but if we think about it a bit
more, it makes sense for busy people. If you're the point
person for a project, and it's a large project, people don't
read documentation. The number of "quick questions" you get
will soon overwhelm a person to the point that they simply have
to start ignoring people. If a bot version of you could answer
all those questions (without hallucinating), that person would
get back a ton of time to, y'know, run the project.
otabdeveloper4 wrote 18 hours 26 min ago:
Oh.
That explains a lot about USA's foreign policy, actually. (Lmao)
idiotsecant wrote 19 hours 18 min ago:
Zero percent chance this is anything other than laughably bad.
The fact that they're trotting it out in front of the press like
a double spaced book report only reinforces this theory. It's a
transparent attempt by someone at the CIA to be able to say
they're using AI in a meeting with their bosses.
sigwinch wrote 9 hours 20 min ago:
Let me take the opposing position about a program to wire LLMs
into their already-advanced sensory database.
I assume the CIA is lying about simulating world leaders. These
are narcissistic personalities and it's jarring to hear that
they can be replaced, either by a body double or an
indistinguishable chatbot. Also, it's still cheaper to have
humans do this.
More likely, the CIA is modeling its own experts. Not as useful
a press release and not as impressive to the fractious
executive branch. But consider having downtime as a CIA expert
on submarine cables. You might be predicting what kind of
available data is capable of predicting the cause and/or effect
of cuts. Ten years ago, an ensemble of such models was state of
the art, but its sensory libraries were based on maybe
traceroute and marine shipping. With an LLM, you can generate a
whole lot of training data that an expert can refine during
his/her downtime. Maybe there's a potent new data source that
an expensive operation could unlock. That ensemble of ML models
from ten years ago can still be refined.
And then there's modeling things that don't exist. Maybe
it's important to optimize a statement for its disinfo
potency. Try it harmlessly on LLMs fed event data. What happens
if some oligarch retires unexpectedly? Who rises? That kind of
stuff.
To your last point, with this executive branch, I expect their
very first question to the CIA wasn't about aliens or which
nations have a copy of a particular tape of Trump, but can you
make us money. So the approaches above all have some way of
producing business intelligence. Whereas a Kim Jong Un
bobblehead does not.
DonHopkins wrote 16 hours 34 min ago:
Unless the world leaders they're simulating are laughably bad
and tend to repeat themselves and hallucinate, like Trump. Who
knows, maybe a chatbot trained with all the classified
documents he stole and all his twitter and truth social posts
wrote his tweet about Ron Reiner, and he's actually sleeping at
3:00 AM instead of sitting on the toilet tweeting in upper
case.
hn_go_brrrrr wrote 18 hours 52 min ago:
I wonder if it's an attempt to get foreign counterparts to
waste time and energy on something the CIA knows is a dead end.
ghurtado wrote 19 hours 23 min ago:
We're literally running out of science fiction topics faster than
we can create new ones
If I started a list with the things that were comically sci-fi
when I was a kid, and are a reality today, I'd be here until next
Tuesday.
nottorp wrote 13 hours 59 min ago:
Almost no scifi has predicted world changing "qualitative"
changes.
As an example, portable phones have been predicted. Portable
smartphones that are more like chat and payment terminals with
a voice function no one uses any more ... not so much.
burkaman wrote 3 hours 31 min ago:
The Machine Stops ( [1] ), a 1909 short story, predicted Zoom
fatigue, notification fatigue, the isolating effect of
widespread digital communication, atrophying of real-world
skills as people become dependent on technology, blind
acceptance of whatever the computer says, online lectures and
remote learning, useless automated customer support systems,
and overconsumption of digital media in place of more
difficult but more fulfilling real life experiences.
It's the most prescient thing I've ever read, and it's pretty
short and a genuinely good story, I recommend everyone read
it.
Edit: Just skimmed it again and realized there's an LLM-like
prediction as well. Access to the Earth's surface is banned
and some people complain, until "even the lecturers
acquiesced when they found that a lecture on the sea was none
the less stimulating when compiled out of other lectures that
had already been delivered on the same subject."
HTML [1]: https://www.cs.ucdavis.edu/~koehl/Teaching/ECS188/PD...
dmd wrote 8 hours 6 min ago:
"A good science fiction story should be able to predict not
the automobile but the traffic jam."
- Frederik Pohl
ajuc wrote 13 hours 2 min ago:
Stanisław Lem predicted the Kindle back in the 1950s, together with
remote libraries, global network, touchscreens and
audiobooks.
nottorp wrote 12 hours 35 min ago:
And Jules Verne predicted rockets. I still move that it's
quantitative predictions not qualitative.
I mean, all Kindle does for me is save me space. I don't
have to store all those books now.
Who predicted the humble internet forum though? Or usenet
before it?
ghaff wrote 11 hours 56 min ago:
Kindles are just books and books are already mostly
fairly compact and inexpensive long-form entertainment
and information.
They're convenient but if they went away tomorrow, my
life wouldn't really change in any material way. That's
not really the case with smartphones much less the
internet more broadly.
lloeki wrote 11 hours 19 min ago:
That has to be the most
dystopian-sci-fi-turning-into-reality-fast thing I've
read in a while.
I'd take smartphones vanishing rather than books any
day.
ghaff wrote 10 hours 55 min ago:
My point was Kindles vanishing, not books vanishing.
Kindles are in no way a prerequisite for reading
books.
lloeki wrote 10 hours 14 min ago:
Thanks for clarifying, I see what you mean now.
ghaff wrote 7 hours 36 min ago:
I have found ebooks useful. Especially when I was
traveling by air more. But certainly not
essential for reading.
nottorp wrote 10 hours 48 min ago:
You may want to make your original post more clear,
because I agree that at a quick glance it says you
wouldn't miss books.
I didn't believe you meant that of course, but
we've already seen it can happen.
nottorp wrote 11 hours 50 min ago:
That was exactly my point.
Funny, I had "The collected stories of Frank Herbert"
as my next read on my tablet. Here's a juicy quote from
like the third screen of the first story:
"The bedside newstape offered a long selection of
stories [...]. He punched code letters for eight items,
flipped the machine to audio and listened to the news
while dressing."
Anything qualitative there? Or all of it quantitative?
Story is "Operation Syndrome", first published in 1954.
Hey, where are our glowglobes and chairdogs btw?
6510 wrote 13 hours 13 min ago:
That it has to be believable is a major constraint that
reality doesn't have.
marci wrote 12 hours 40 min ago:
In other words, sometimes, things happen in reality that,
if you were to read it in a fictional story or see in a
movie, you would think they were major plot holes.
KingMob wrote 16 hours 58 min ago:
Time to create the Torment Nexus, I guess
morkalork wrote 11 hours 23 min ago:
Saw a joke about Grok being a stand-in for Elon's children
and had the realization he's the kind of father who would
lobotomize and brainwipe his progeny for back-talk. Good thing
he can only do that to their virtual stand-in and not some
biological clones!
varjag wrote 16 hours 49 min ago:
There's a thriving startup scene in that direction.
BiteCode_dev wrote 16 hours 35 min ago:
Wasn't that the elevator pitch for Palantir?
Still can't believe people buy their stock, given that they
are the closest thing to a James Bond villain, just because
it goes up.
I mean, they are literally called "the stuff Sauron uses to
control his evil forces". It's so on the nose it reads like
an anime plot.
monocasa wrote 5 hours 0 min ago:
It goes a bit deeper than that, since they got funding in
the wake of 9/11 and the calls for the intelligence and
investigative branches of government to do a better job of
coalescing their information to prevent attacks.
So "panopticon that if it had been used properly, would
have prevented the destruction of two towers" while
ignoring the obvious "are we the baddies?"
CamperBob2 wrote 7 hours 27 min ago:
Still can't believe people buy their stock, given that
they are the closest thing to a James Bond villain, just
because it goes up.
I've been tempted to. "Everything will be terrible if
these guys succeed, but at least I'll be rich. If they
fail I'll lose money, but since that's the outcome I
prefer anyway, the loss won't bother me."
Trouble is, that ship has arguably already sailed. No
matter how rapidly things go to hell, it will take many
years before PLTR is profitable enough to justify its
half-trillion dollar market cap.
quesera wrote 10 hours 12 min ago:
> Still can't believe people buy their stock, given that
they are the closest thing to a James Bond villain, just
because it goes up.
I proudly owned zero shares of Microsoft stock, in the
1980s and 1990s. :)
I own no Palantir today.
It's a Pyrrhic victory, but sometimes that's all you can
do.
duskdozer wrote 13 hours 18 min ago:
To be honest, while I'd heard of it over a decade ago and
I've read LOTR and I've been paying attention to privacy
longer than most, I didn't ever really look into what it
did until I started hearing more about it in the past
year or two.
But yeah lots of people don't really buy into the idea of
their small contribution to a large problem being a
problem.
Lerc wrote 12 hours 3 min ago:
>But yeah lots of people don't really buy into the idea
of their small contribution to a large problem being a
problem.
As an abstract idea I think there is a reasonable
argument to be made that the size of any contribution
to a problem should be measured as a relative
proportion of total influence.
The carbon footprint is a good example: if each individual focuses on reducing their own small contribution, they could neglect systemic changes that would reduce everyone's contribution to a greater extent.
Any scientist working on a method to remove a problem
shouldn't abstain from contributing to the problem
while they work.
Or to put it as a catchy phrase. Someone working on a
cleaner light source shouldn't have to work in the
dark.
duskdozer wrote 11 hours 0 min ago:
>As an abstract idea I think there is a reasonable
argument to be made that the size of any contribution
to a problem should be measured as a relative
proportion of total influence.
Right, I think you have responsibility for your 1/nth (arguably considerably more, though, for first-worlders) of the problem. What I see is
something like refusal to consider swapping out a
two-stroke-engine-powered tungsten lightbulb with an
LED of equivalent brightness, CRI, and color
temperature, because it won't unilaterally solve the
problem.
kbrkbr wrote 15 hours 43 min ago:
Stock buying as a political or ethical statement is not much of a thing. For one, the stocks will still be bought by people with less strong opinions, and secondly it does not lend itself well to virtue signaling.
ruszki wrote 14 hours 58 min ago:
I think meme stocks contradict you.
iwontberude wrote 13 hours 58 min ago:
Meme stocks are a symptom of the death of the
American dream. Economic malaise leads to
unsophisticated risk taking.
CamperBob2 wrote 7 hours 24 min ago:
Well, two things lead to unsophisticated
risk-taking, right... economic malaise, and
unlimited surplus. Both conditions are easy to
spot in today's world.
notarobot123 wrote 15 hours 54 min ago:
To the proud contrarian, "the empire did nothing wrong".
Maybe sci-fi has actually played a role in the "mimetic desire" of some of the titans of tech who are trying to bring about these worlds more or less intentionally. I guess it's not as much of a dystopia if you're on top, and it's not evil if you think of it as inevitable anyway.
psychoslave wrote 13 hours 19 min ago:
I don't know. Walking on everybody's faces to climb a human pyramid doesn't win you many sincere friends, and it rightly sends you down a spiral of paranoia. There are so many people already on a fast track to hating anyone else that, once there is social consensus that someone really is a freaking bastard who only deserves to die, that's a lot of stress to cope with.
The future is inevitable, but only those ignorant of our self-predictive ability think that what's going to populate that future is inevitable.
UltraSane wrote 17 hours 16 min ago:
Not at all, you just need to read different scifi. I suggest
Greg Egan and Stephen Baxter and Derek Künsken
and The Quantum Thief series
catlifeonmars wrote 21 hours 36 min ago:
How is this different than chatbots cosplaying?
9dev wrote 18 hours 28 min ago:
They get to wear Raybans and a fancy badge doing it?
xg15 wrote 1 day ago:
"...what do you mean, 'World War One?'"
gaius_baltar wrote 1 day ago:
> "...what do you mean, 'World War One?'"
Oh sorry, spoilers.
(Hell, I miss Capaldi)
inferiorhuman wrote 1 day ago:
⦠what do you mean, an internet where everything wasn't hidden
behind anti-bot captchas?
tejohnso wrote 1 day ago:
I remember reading a children's book when I was young and the fact
that people used the phrase "World War One" rather than "The Great
War" was a clue to the reader that events were taking place in a
certain time period. Never forgot that for some reason.
I failed to catch the clue, btw.
alberto_ol wrote 15 hours 12 min ago:
I remember that the brother of my grandmother who fought in ww1
called it simply "the war" ("sa gherra" in his dialect/language).
wat10000 wrote 20 hours 27 min ago:
It wouldn't be totally implausible to use that phrase between the wars. The name "the First World War" was used as early as 1920, although not very common.
BeefySwain wrote 23 hours 44 min ago:
Pendragon?
bradfitz wrote 23 hours 45 min ago:
I seem to recall reading that as a kid too, but I can't find it
now. I keep finding references to "Encyclopedia Brown, Boy
Detective" about a Civil War sword being fake (instead of a Great
War one), but with the same plot I'd remembered.
JuniperMesos wrote 22 hours 56 min ago:
The Encyclopedia Brown story I remember reading as a kid
involved a Civil War era sword with an inscription saying it
was given on the occasion of the First Battle of Bull Run. The
clues that the sword was a modern fake were the phrasing "First
Battle of Bull Run", but also that the sword was gifted on the
Confederate side, and the Confederates would've called the
battle "Manassas Junction".
The wikipedia article [1] says the Confederate name was "First
Manassas" (I might be misremembering exactly what this book I
read as a child said). Also I'm pretty sure it was specifically
"Encyclopedia Brown Solves Them All" that this mystery appeared
in. If someone has a copy of the book or cares to dig it up,
they could confirm my memory.
HTML [1]: https://en.wikipedia.org/wiki/First_Battle_of_Bull_Run
michaericalribo wrote 23 hours 23 min ago:
Can confirm, it was an Encyclopedia Brown book and it was World
War One vs the Great War that gave away the sword as a
counterfeit!
observationist wrote 1 day ago:
This is definitely fascinating - being able to do AI brain surgery and selectively tune its knowledge and priors would let you create awesome and terrifying simulations.
nottorp wrote 13 hours 56 min ago:
You can't. To use your terms, you have to "grow" a new LLM. "Brain
surgery" would be modifying an existing model and that's exactly
what they're trying to avoid.
ilaksh wrote 14 hours 53 min ago:
Activation steering can do that to some degree, although normally it's just one or two specific things rather than a whole set of knowledge.
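For anyone curious what that looks like mechanically, here is a minimal sketch of activation steering using a PyTorch forward hook. The toy model, layer choice, and steering vector are all invented for illustration; a real setup would derive the vector from contrasting activations of a trained model, not random numbers.

  # Minimal activation-steering sketch (illustrative only): a fixed
  # vector is added to one layer's activations at inference time,
  # nudging outputs without retraining any weights.
  import torch
  import torch.nn as nn

  torch.manual_seed(0)

  # Toy stand-in for a model: two linear layers.
  model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 4))

  # Hypothetical steering vector (in practice, e.g. the difference in
  # mean activations between two contrasting prompt sets).
  steering_vector = torch.randn(8)

  def steer(module, inputs, output):
      # Returning a value from a forward hook replaces the output.
      return output + 2.0 * steering_vector

  handle = model[0].register_forward_hook(steer)
  x = torch.randn(1, 8)
  print("steered:", model(x))
  handle.remove()               # detach the hook to disable steering
  print("plain:  ", model(x))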
eek2121 wrote 23 hours 28 min ago:
Respectfully, LLMs are nothing like a brain, and I discourage
comparisons between the two, because beyond a complete difference
in the way they operate, a brain can innovate, and as of this
moment, an LLM cannot because it relies on previously available
information.
LLMs are just seemingly intelligent autocomplete engines, and until they figure out a way to stop the hallucinations, they aren't great either.
Every piece of code a developer churns out using LLMs will be built
from previous code that other developers have written (including
both strengths and weaknesses, btw). Every paragraph you ask it to
write in a summary? Same. Every single other problem? Same. Ask it
to generate a summary of a document? Don't trust it here either.
[Note, expect cyber-attacks later on regarding this scenario, it is
beginning to happen -- documents made intentionally obtuse to fool
an LLM into hallucinating about the document, which leads to
someone signing a contract, conning the person out of millions].
If you ask an LLM to solve something no human has, you'll get a
fabrication, which has fooled quite a few folks and caused them to
jeopardize their career (lawyers, etc) which is why I am posting
this.
observationist wrote 7 hours 40 min ago:
Respectfully, you're not completely wrong, but you are making
some mistaken assumptions about the operation of LLMs.
Transformers allow for the mapping of a complex manifold
representation of causal phenomena present in the data they're
trained on. When they're trained on a vast corpus of human
generated text, they model a lot of the underlying phenomena that
resulted in that text.
In some cases, shortcuts and hacks and entirely inhuman features
and functions are learned. In other cases, the functions and
features are learned to an astonishingly superhuman level.
There's a depth of recursion and complexity to some things that
escape the capability of modern architectures to model, and there
are subtle things that don't get picked up on. LLMs do not have a
coherent self, or subjective central perspective, even within
constraints of context modifications for run-time constructs.
They're fundamentally many-minded, or no-minded, depending on the
way they're used, and without that subjective anchor, they lack
the principle by which to effectively model a self over many of
the long horizon and complex features that human brains basically
live in.
Confabulation isn't unique to LLMs. Everything you're saying
about how LLMs operate can be said about human brains, too. Our
intelligence and capabilities don't emerge from nothing, and
human cognition isn't magical. And what humans do can also be
considered "intelligent autocomplete" at a functional level.
What cortical columns do is next-activation predictions at an
optimally sparse, embarrassingly parallel scale - it's not tokens
being predicted but "what does the brain think is the next
neuron/column that will fire", and where it's successful,
synapses are reinforced, and where it fails, signals are
suppressed.
Neocortical processing does the task of learning, modeling, and predicting across a wide multimodal, arbitrary-depth, long-horizon domain that allows us to learn words and writing and language and coding and rationalism and everything it is that we do. We're profoundly more data-efficient learners, and massively parallel, amazingly sparse processing allows us to pick up on subtle nuance and amazingly wide and deep contextual cues in ways that LLMs are structurally incapable of, for now.
You use the word hallucinations as a pejorative, but everything
you do, your every memory, experience, thought, plan, all of your
existence is a hallucination. You are, at a deep and fundamental
level, a construct built by your brain, from the processing of
millions of electrochemical signals, bundled together, parsed,
compressed, interpreted, and finally joined together in the
wonderfully diverse and rich and deep fabric of your subjective
experience.
LLMs don't have that, or at best, only have disparate flashes of
incoherent subjective experience, because nothing is persisted or
temporally coherent at the levels that matter. That could very
well be a very important mechanism and crucial to overcoming many
of the flaws in current models.
That said, you don't want to get rid of hallucinations. You want
the hallucinations to be valid. You want them to correspond to
reality as closely as possible, coupled tightly to correctly
modeled features of things that are real.
LLMs have created, at superhuman speeds, vast troves of things
that humans have not. They've even done things that most humans
could not. I don't think they've done things that any human could
not, yet, but the jagged frontier of capabilities is pushing many
domains very close to the degree of competence at which they'll
be superhuman in quality, outperforming any possible human for
certain tasks.
There are architecture issues that don't look like they can be
resolved with scaling alone. That doesn't mean shortcuts, hacks,
and useful capabilities won't produce good results in the
meantime, and if they can get us to the point of useful,
replicable, and automated AI research and recursive self
improvement, then we don't necessarily need to change course.
LLMs will eventually be used to find the next big breakthrough
architecture, and we can enjoy these wonderful, downright magical
tools in the meantime.
And of course, human experts in the loop are a must, and
everything must be held to a high standard of evidence and
review. The more important the problem being worked on, like a
law case, the more scrutiny and human intervention will be
required. Judges, lawyers, and politicians are all using AI for
things that they probably shouldn't, but that's a human failure
mode. It doesn't imply that the tools aren't useful, nor that
they can't be used skillfully.
HarHarVeryFunny wrote 9 hours 41 min ago:
> LLMs are just seemingly intelligent autocomplete engines
Well, no, they are training set statistical predictors, not
individual training sample predictors (autocomplete).
The best mental model of what they are doing might be that you
are talking to a football stadium full of people, where everyone
in the stadium gets to vote on the next word of the response
being generated. You are not getting an "autocomplete" answer
from any one coherent source, but instead a strange composite
response where each word is the result of different people trying
to steer the response in different directions.
An LLM will naturally generate responses that were not in the
training set, even if ultimately limited by what was in the
training set. The best way to think of this is perhaps that they
are limited to the "generative closure" (cf mathematical set
closure) of the training data - they can generate "novel" (to the
training set) combinations of words and partial samples in the
training data, by combining statistical patterns from different
sources that never occurred together in the training data.
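As a toy illustration of that "generative closure" idea, here is a tiny bigram sampler (an assumed stand-in, nothing like a real LLM): trained on two sentences, it can emit word sequences that appear in neither, because it recombines statistics learned from both.

  # Toy bigram "language model": count word pairs in a tiny corpus,
  # then sample word by word. It can produce sentences that never
  # appear verbatim in the training data by mixing patterns from
  # different sources.
  import random
  from collections import defaultdict

  random.seed(0)
  corpus = [
      "the cat sat on the mat",
      "the dog sat on the rug",
  ]

  bigrams = defaultdict(list)
  for sentence in corpus:
      words = sentence.split()
      for a, b in zip(words, words[1:]):
          bigrams[a].append(b)

  def generate(start="the", length=6):
      out = [start]
      while len(out) < length and bigrams.get(out[-1]):
          out.append(random.choice(bigrams[out[-1]]))
      return " ".join(out)

  for _ in range(3):
      print(generate())   # may yield e.g. "the cat sat on the rug"

The same mechanism, scaled from bigram counts up to a deep network conditioned on a long context, is what lets a model produce text outside its training set while still being bounded by it.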
DonHopkins wrote 16 hours 17 min ago:
> LLMs are just seemingly intelligent autocomplete engines
BINGO!
(I just won a stuffed animal prize with my AI Skeptic
Thought-Terminating Cliché BINGO Card!)
Sorry. Carry on.
ada1981 wrote 20 hours 15 min ago:
Are you sure about this?
LLMs are like a topographic map of language.
If you have 2 known mountains (domains of knowledge) you can
likely predict there is a valley between them, even if you
haven't been there.
I think LLMs can approximate language topography based on known
surrounding features so to speak, and that can produce novel
information that would be similar to insight or innovation.
I've seen this in our lab, or at least, I think I have.
Curious how you see it.
libraryofbabel wrote 23 hours 8 min ago:
This is the 2023 take on LLMs. It still gets repeated a lot. But it doesn't really hold up anymore - it's more complicated than that. Don't let some factoid about how they are pretrained on autocomplete-like next token prediction fool you into thinking you understand what is going on in that trillion parameter neural network.
Sure, LLMs do not think like humans and they may not have human-level creativity. Sometimes they hallucinate. But they can absolutely solve new problems that aren't in their training set, e.g. some rather difficult problems on the last Mathematical Olympiad. They don't just regurgitate remixes of their training data. If you don't believe this, you really need to spend more time with the latest SotA models like Opus 4.5 or Gemini 3. Nontrivial emergent behavior is a thing. It will only get more impressive. That doesn't make LLMs like humans (and we shouldn't anthropomorphize them) but they are not "autocomplete on steroids" anymore either.
beernet wrote 13 hours 22 min ago:
>> Sometimes they hallucinate.
For someone speaking as if you knew everything, you appear to know very little. Every LLM completion is a "hallucination"; some of them just happen to be factually correct.
Am4TIfIsER0ppos wrote 23 min ago:
I can say "I don't know" in response to a question. Can an
LLM?
vachina wrote 14 hours 7 min ago:
I use the enterprise LLM provided by work on a very proprietary codebase in a semi-esoteric language. My impression is that it is still a very big autocompletion machine.
You still need to hand-hold it all the way, as it is only capable of regurgitating the small number of code patterns it has seen in public, as opposed to, say, a Python project.
libraryofbabel wrote 7 hours 29 min ago:
What model is your "enterprise LLM"?
But regardless, I don't think anyone is claiming that LLMs can magically do things that aren't in their training data or context window. Obviously not: they can't learn on the job and the permanent knowledge they have is frozen in during training.
otabdeveloper4 wrote 17 hours 39 min ago:
> it's more complicated than that.
No it isn't.
> ...fool you into thinking you understand what is going on in
that trillion parameter neural network.
It's just matrix multiplication and logistic regression,
nothing more.
hackinthebochs wrote 15 hours 5 min ago:
LLMs are a general purpose computing paradigm. LLMs are
circuit builders, the converged parameters define pathways
through the architecture that pick out specific programs. Or
as Karpathy puts it, LLMs are a differentiable computer[1].
Training LLMs discovers programs that well reproduce the
input sequence. Roughly the same architecture can generate
passable images, music, or even video.
The sequence of matrix multiplications are the high level
constraint on the space of programs discoverable. But the
specific parameters discovered are what determines the
specifics of information flow through the network and hence
what program is defined. The complexity of the trained
network is emergent, meaning the internal complexity far
surpasses that of the coarse-grained description of the high
level matmul sequences. LLMs are not just matmuls and logits.
HTML [1]: https://x.com/karpathy/status/1582807367988654081
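To make the "just matmuls" framing concrete, here is a single self-attention head written out in NumPy; the shapes and random weights are purely illustrative. Whether you read this as "nothing more" or as the substrate on which learned circuits run is exactly the disagreement in this subthread.

  # One self-attention head as matrix multiplications plus a softmax
  # (illustrative sizes, random weights).
  import numpy as np

  rng = np.random.default_rng(0)
  T, d = 5, 16                      # sequence length, model width

  x = rng.normal(size=(T, d))       # token embeddings
  Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

  Q, K, V = x @ Wq, x @ Wk, x @ Wv  # three matmuls
  scores = Q @ K.T / np.sqrt(d)     # another matmul, scaled

  # Causal mask: each position attends only to itself and the past.
  mask = np.triu(np.ones((T, T), dtype=bool), k=1)
  scores[mask] = -np.inf

  weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
  weights /= weights.sum(axis=-1, keepdims=True)   # softmax
  out = weights @ V                 # weighted mix of value vectors
  print(out.shape)                  # (5, 16)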
otabdeveloper4 wrote 13 hours 47 min ago:
> LLMs are a general purpose computing paradigm.
Yes, so is logistic regression.
hackinthebochs wrote 13 hours 34 min ago:
No, not at all.
otabdeveloper4 wrote 8 hours 32 min ago:
Yes at all. I think you misunderstand the significance
of "general computing". The binary string 01101110 is a
general-purpose computer, for example.
hackinthebochs wrote 7 hours 24 min ago:
No, that's insane. Computing is a dynamic process. A
static string is not a computer.
MarkusQ wrote 2 hours 50 min ago:
It may be insane, but it's also true.
HTML [1]: https://en.wikipedia.org/wiki/Rule_110
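For anyone who wants to see why that bit string counts as a computer, here is a short Rule 110 cellular automaton sketch; the grid width, step count, and starting state are arbitrary demo choices. The eight bits of 110 in binary (01101110) are exactly the update table, and Rule 110 is proven Turing-complete.

  # Rule 110: the binary expansion of 110 (01101110) is the lookup
  # table mapping each 3-cell neighborhood to the next cell state.
  RULE = 110
  TABLE = {
      (a, b, c): (RULE >> (a * 4 + b * 2 + c)) & 1
      for a in (0, 1) for b in (0, 1) for c in (0, 1)
  }

  def step(cells):
      n = len(cells)
      return [TABLE[(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])]
              for i in range(n)]

  # Arbitrary demo: 48 cells, a single live cell, 16 steps.
  cells = [0] * 48
  cells[-1] = 1
  for _ in range(16):
      print("".join("#" if c else "." for c in cells))
      cells = step(cells)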
root_axis wrote 20 hours 55 min ago:
> Don't let some factoid about how they are pretrained on
autocomplete-like next token prediction fool you into thinking
you understand what is going on in that trillion parameter
neural network.
This is just an appeal to complexity, not a rebuttal to the
critique of likening an LLM to a human brain.
> they are not "autocomplete on steroids" anymore either.
Yes, they are. The steroids are just even more powerful. By
refining training data quality, increasing parameter size, and
increasing context length we can squeeze more utility out of
LLMs than ever before, but ultimately, Opus 4.5 is the same
thing as GPT2, it's only that coherence lasts a few pages
rather than a few sentences.
libraryofbabel wrote 8 hours 1 min ago:
> This is just an appeal to complexity, not a rebuttal to the critique of likening an LLM to a human brain
I wasn't arguing that LLMs are like a human brain. Of course they aren't. I said twice in my original post that they aren't like humans. But "like a human brain" and "autocomplete on steroids" aren't the only two choices here.
As for appealing to complexity, well, let's call it more like an appeal to humility in the face of complexity. My basic claim is this:
1) It is a trap to reason from model architecture alone to make claims about what LLMs can and can't do.
2) The specific version of this in GP that I was objecting to was: LLMs are just transformers that do next token prediction, therefore they cannot solve novel problems and just regurgitate their training data. This is provably true or false, if we agree on a reasonable definition of novel problems.
The reason I believe this is that back in 2023 I (like many of us) used LLM architecture to argue that LLMs had all sorts of limitations around the kind of code they could write, the tasks they could do, the math problems they could solve. At the end of 2025, SotA LLMs have refuted most of these claims by being able to do the tasks I thought they'd never be able to do. That was a big surprise to a lot of us in the industry. It still surprises me every day. The facts changed, and I changed my opinion.
So I would ask you: what kind of task do you think LLMs aren't capable of doing, reasoning from their architecture?
I was also going to mention RL, as I think that is the key differentiator that makes the "knowledge" in the SotA LLMs right now qualitatively different from GPT2. But other posters already made that point.
This topic arouses strong reactions. I already had one poster (since apparently downvoted into oblivion) accuse me of "magical thinking" and "LLM-induced psychosis"! And I thought I was just making the rather uncontroversial point that things may be more complicated than we all thought in 2023. For what it's worth, I do believe LLMs probably have limitations (like they're not going to lead to AGI and are never going to do mathematics like Terence Tao) and I also think we're in a huge bubble and a lot of people are going to lose their shirts. But I think we all owe it to ourselves to take LLMs seriously as well. Saying "Opus 4.5 is the same thing as GPT2" isn't really a pathway to do that, it's just a convenient way to avoid grappling with the hard questions.
int_19h wrote 16 hours 48 min ago:
> ultimately, Opus 4.5 is the same thing as GPT2, it's only
that coherence lasts a few pages rather than a few sentences.
This tells me that you haven't really used Opus 4.5 at all.
baq wrote 18 hours 59 min ago:
First, this is completely ignoring text diffusion and nano
banana.
Second, to autocomplete the name of the killer in a detective book outside of the training set requires following, and at least somewhat understanding, the plot.
NiloCK wrote 19 hours 21 min ago:
First: a selection mechanism is just a selection mechanism, and it shouldn't confuse the observation of emergent, tangential capabilities.
Probably you believe that humans have something called intelligence, but the pressure that produced it - the likelihood of specific genetic material replicating - is much more tangential to intelligence than next-token prediction.
I doubt many alien civilizations would look at us and say "not intelligent - they're just genetic information replication on steroids".
Second: modern models also undergo a ton of post-training now. RLHF, mechanized fine-tuning on specific use cases, etc etc. It's just not correct that token-prediction loss function is "the whole thing".
root_axis wrote 18 hours 40 min ago:
> First: a selection mechanism is just a selection
mechanism, and it shouldn't confuse the observation of an
emergent, tangential capabilities.
Invoking terms like "selection mechanism" is begging the
question because it implicitly likens next-token-prediction
training to natural selection, but in reality the two are
so fundamentally different that the analogy only has
metaphorical meaning. Even at a conceptual level, gradient
descent gradually homing in on a known target is comically
trivial compared to the blind filter of natural selection
sorting out the chaos of chemical biology. It's like
comparing legos to DNA.
> Second: modern models also undergo a ton of
post-training now. RLHF, mechanized fine-tuning on specific
use cases, etc etc. It's just not correct that
token-prediction loss function is "the whole thing".
RL is still token prediction; it's just a technique for adjusting the weights to align with predictions that you can't model a loss function for in pre-training. When RL rewards good output, it's increasing the statistical strength of the model for an arbitrary purpose, but ultimately what is achieved is still a brute-force quadratic lookup for every token in the context.
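A toy REINFORCE-style update makes the "same next-token machinery" point concrete: the reward only rescales gradients on ordinary token log-probabilities. This is a deliberately minimal sketch with an invented reward, not how any production RLHF pipeline actually works.

  # Toy policy-gradient step on a "model" that is just a next-token
  # distribution over a five-word vocabulary. The reward reweights
  # the usual log-probabilities; the model still only predicts tokens.
  import torch

  torch.manual_seed(0)
  vocab = ["the", "cat", "dog", "good", "bad"]
  logits = torch.zeros(len(vocab), requires_grad=True)  # the whole "model"
  opt = torch.optim.SGD([logits], lr=0.5)

  def reward(token):
      # Stand-in for human feedback: we happen to like "good".
      return 1.0 if token == "good" else -0.1

  for _ in range(200):
      probs = torch.softmax(logits, dim=0)
      idx = torch.multinomial(probs, 1).item()             # sample a token
      loss = -reward(vocab[idx]) * torch.log(probs[idx])   # REINFORCE
      opt.zero_grad()
      loss.backward()
      opt.step()

  print({w: round(p, 3)
         for w, p in zip(vocab, torch.softmax(logits, 0).tolist())})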
dash2 wrote 20 hours 21 min ago:
This would be true if all training were based on sentence
completion. But training involving RLHF and RLAIF is
increasingly important, isn't it?
root_axis wrote 19 hours 13 min ago:
Reinforcement learning is a technique for adjusting
weights, but it does not alter the architecture of the
model. No matter how much RL you do, you still retain all
the fundamental limitations of next-token prediction (e.g.
context exhaustion, hallucinations, prompt injection
vulnerability etc)
hexaga wrote 13 hours 41 min ago:
You've confused yourself. Those problems are not
fundamental to next token prediction, they are
fundamental to reconstruction losses on large general
text corpora.
That is to say, they are equally likely if you don't do
next token prediction at all and instead do text
diffusion or something. Architecture has nothing to do
with it. They arise because they are early partial
solutions to the reconstruction task on 'all the text
ever made'. The reconstruction task doesn't care much about truthiness until very late in the loss curve (a point we will probably never reach), so hallucinations are almost as good for a very long time.
RL as is typical in post-training _does not share those
early solutions_, and so does not share the fundamental
problems. RL (in this context) has its own share of
problems which are different, such as reward hacks like:
reliance on meta signaling (# Why X is the correct
solution, the honest answer ...), lying (commenting out
tests), manipulation (You're absolutely right!), etc.
Anything to make the human press the upvote button or
make the test suite pass at any cost or whatever.
With that said, RL post-trained models _inherit_ the
problems of non-optimal large corpora reconstruction
solutions, but they don't introduce more or make them
worse in a directed manner or anything like that. There's
no reason to think them inevitable, and in principle you
can cut away the garbage with the right RL target.
Thinking about architecture at all (autoregressive CE,
RL, transformers, etc) is the wrong level of abstraction
for understanding model behavior: instead, think about
loss surfaces (large corpora reconstruction, human
agreement, test suites passing, etc) and what solutions
exist early and late in training for them.
A4ET8a8uTh0_v2 wrote 20 hours 43 min ago:
But.. and I am not asking it for giggles, does it mean humans
are giant autocomplete machines?
root_axis wrote 19 hours 22 min ago:
Not at all. Why would it?
A4ET8a8uTh0_v2 wrote 19 hours 20 min ago:
Call it a.. thought experiment about the question of
scale.
root_axis wrote 19 hours 15 min ago:
I'm not exactly sure what you mean. Could you please
elaborate further?
a1j9o94 wrote 19 hours 2 min ago:
Not the person you're responding to, but I think
there's a non-trivial argument to be made that our thoughts are just autocomplete. What is the next
most likely word based on what you're seeing. Ever
watched a movie and guessed the plot? Or read a
comment and know where it was going to go by the end?
And I know not everyone thinks in a literal stream of
words all the time (I do) but I would argue that
those people's brains are just using a different
"token"
root_axis wrote 18 hours 4 min ago:
There's no evidence for it, nor any explanation for
why it should be the case from a biological
perspective. Tokens are an artifact of computer
science that have no reason to exist inside humans.
Human minds don't need a discrete dictionary of
reality in order to model it.
Prior to LLMs, there was never any suggestion that
thoughts work like autocomplete, but now people are
working backwards from that conclusion based on
metaphorical parallels.
A4ET8a8uTh0_v2 wrote 13 hours 57 min ago:
<< There's no evidence for it
Fascinating framing. What would you consider
evidence here?
LiKao wrote 17 hours 1 min ago:
There actually was quite a lot of suggestion that thoughts work like autocomplete. A lot of it was just considered niche, e.g. because the mathematical formalisms were beyond what most psychologists or even cognitive scientists would deem useful.
Predictive coding theory was formalized around 2010 and traces its roots back to theories by Helmholtz from 1860.
Predictive coding theory postulates that our
brains are just very strong prediction machines,
with multiple layers of predictive machinery,
each predicting the next.
red75prime wrote 17 hours 28 min ago:
There are so many theories regarding human
cognition that you can certainly find something
that is close to "autocomplete". A Hopfield
network, for example.
Roots of predictive coding theory extend back to the 1860s.
Natalia Bekhtereva was writing about compact
concept representations in the brain akin to
tokens.
root_axis wrote 5 hours 25 min ago:
> There are so many theories regarding human
cognition that you can certainly find something
that is close to "autocomplete"
Yes, you can draw interesting parallels between
anything when you're motivated to do so. My
point is that this isn't parsimonious
reasoning, it's working backwards from a
conclusion and searching for every opportunity
to fit the available evidence into a narrative
that supports it.
> Roots of predictive coding theory extend back
to 1860s.
This is just another example of metaphorical
parallels overstating meaningful connections.
Just because next-token-prediction and
predictive coding have the word "predict" in
common doesn't mean the two are at all related
in any practical sense.
9dev wrote 18 hours 19 min ago:
You, and OP, are taking an analogy way too far.
Yes, humans have the mental capability to predict
words similar to autocomplete, but obviously this
is just one out of a myriad of mental capabilities
typical humans have, which work regardless of text.
You can predict where a ball will go if you throw
it, you can reason about gravity, and so much more.
It's not just apples to oranges, not even apples to boats, it's apples to intersubjective realities.
A4ET8a8uTh0_v2 wrote 13 hours 51 min ago:
I don't think I am. To be honest, as ideas go, and as I swirl it around that empty head of mine, this one ain't half bad given how much immediate resistance it generates.
Other posters have already noted other reasons for it, but I will note that you say 'similar to autocomplete, but obviously', which suggests you recognize the shape and then immediately dismiss it as not the same, because the shape you know in humans is much more evolved and can do more things. Ngl man, as arguments go, that sounds to me like supercharged autocomplete that was allowed to develop over a number of years.
9dev wrote 12 hours 37 min ago:
Fair enough. To someone with a background in
biology, it sounds like an argument made by a
software engineer with no actual knowledge of
cognition, psychology, biology, or any related
field, jumping to misled conclusions driven
only by shallow insights and their own
experience in computer science.
Or in other words, this thread sure attracts a
lot of armchair experts.
quesera wrote 9 hours 32 min ago:
> with no actual knowledge of cognition,
psychology, biology
... but we also need to be careful with that
assertion, because humans do not understand
cognition, psychology, or biology very well.
Biology is the furthest developed, but it
turns out to be like physics -- superficially
and usefully modelable, but fundamental
mysteries remain. We have no idea how
complete our models are, but they work pretty
well in our standard context.
If computer engineering is downstream from
physics, and cognition is downstream from
biology ... well, I just don't know how
certain we can be about much of anything.
> this thread sure attracts a lot of armchair
experts.
"So we beat on, boats against the current,
borne back ceaselessly into our priors..."
LiKao wrote 17 hours 6 min ago:
Look up predictive coding theory. According to
that theory, what our brain does is in fact just
autocomplete.
However, what it is doing is layered autocomplete
on itself. I.e. one part is trying to predict
what the other part will be producing and
training itself on this kind of prediction.
What emerges from this layered level of
autocompletes is what we call thought.
deadbolt wrote 22 hours 23 min ago:
As someone who still might have a '2023 take on LLMs', even
though I use them often at work, where would you recommend I
look to learn more about what a '2025 LLM' is, and how they
operate differently?
otabdeveloper4 wrote 17 hours 37 min ago:
Don't bother. This bubble will pop in two years, you don't
want to look back on your old comments in shame in three.
krackers wrote 17 hours 57 min ago:
Papers on mechanistic interpretability and representation engineering, e.g. from Anthropic, would be a good start.
superkuh wrote 1 day ago:
smbc did a comic about this: [1] The punchline is that the moral and
ethical norms of pre-1913 texts are not exactly compatible with modern
norms.
HTML [1]: http://smbc-comics.com/comic/copyright
GaryBluto wrote 1 day ago:
That's the point of this project, to have an LLM that reflects the
moral and ethical norms of pre-1913 texts.
DIR <- back to front page