
Digests of new AI safety papers from arXiv, ~weekly. Full list here: https://tinyurl.com/ai-safety-papers
| Detail | Value |
|---|---|
| Platform | |
| Pricing | Only free issues |
| Publishes | Weekly |
| Issues | 67 |
| Founded | 2 years ago |
| Last Issue | 4 days ago |
| Status | Active |

Also: training on documents about monitoring leads to CoT obfuscation, assume-guarantee architecture for safe agent deployments
Automated alignment is harder than you think
“A leading proposal for aligning artificial superintelligence (AS...
Also: conformity generates misalignment in agent societies, positive alignment, removing sandbagging by training with weak supervision
Natural Language Autoencoders: Turning Claude’s thoughts into text
“Natural language autoencoders (NLAs...
Conditional misalignment: common interventions can hide emergent misalignment behind contextual triggers
“Can you prevent emergent misalignment with inoculation prompting, or by diluting bad data with good? Prior work suggests you can. We...
Also: RLVR can lead to reward hacking, taxonomy of LLM deception, automated alignment researchers, ten ways to think about gradual disempowerment
Mythos is just the beginning
“If you were waiting for a sign that superintelligence is comin...
Also: detecting violations across many agent traces, detecting real-world scheming in the wild with open-source intelligence, mechanisms of introspection awareness
Language models transmit behavioural traits through hidden signals in data...
Subscribers, engagement, traffic and sponsorship for AI Safety Papers.
| Metric | Value |
|---|---|
| Subscribers | |
| Engagement | 67 |
| Monthly Web Visits | |
| Accepts Sponsors | |
| Estimated Cost per Ad | |
The writers behind this newsletter.
AGI Alignment @ Google DeepMind https://x.com/onexerxes
Newsletter advertising can be extremely effective when it's done right. Before you pitch to AI Safety Papers as a potential sponsor or partner, do your research and check its newsletter stats with Reletter.
Then, personalize one of our winning pitching templates and send it to the right person using the contact info provided.
Newsletter ad rates (often quoted as CPM, the cost per thousand impressions) vary depending on many factors, including industry, number of subscribers, open rate, and ad placement.
To find out how much an ad will cost, contact AI Safety Papers using the contact information provided and ask for a copy of their media kit.
Reletter provides this newsletter's website URL above, where you will often find their contact information. We also provide links to associated social media accounts and pitching templates so you can reach out fast.