AI Safety Papers

Digests of new AI safety papers from arXiv, published roughly weekly. Full list here: https://tinyurl.com/ai-safety-papers

Platform: Substack
Pricing: Only free issues
Publishes: Weekly
Issues: 75
Founded: 2 years ago
Last Issue: 5 days ago
Status: Active

Read this Newsletter

onexerxes.substack.com

Latest Issues

Framework for frontier AI and dawning of a new age, bioresilience, optimizing against safety representations

A Framework for Frontier AI and the Dawning of a New Age

“This is a pivotal moment in human history. Artificial General Intelligence (AGI), a system that exhibits all the cognitive capabilities the brain has, is probably only a few short y...

5 days ago

AI 2040: Plan A, assessing sabotage propensities via automated alignment auditing, persuasion attacks decrease effectiveness of CoT monitoring, ...

Also: amplifying reasoning weights to extract learned secrets, realistic honeypot evals for scheming propensity, modular pretraining to prevent misuse, verbalizable representations form a global workspace

AI 2040: Plan A

“AI companies ar...

12 days ago

no one escapes the permanent underclass, the alignment problem of 1776, ...

Also: manipulation is task-dependent, coupling harmfulness and refusal directions for robust safety alignment

The Alignment Problem of 1776

“Today, the United States turns 250. By the time it turns 260, the most capable minds in the count...

18 days ago

AI 2027's eerily accurate predictions, RL towards broadly beneficial models, AI out-persuades expert humans

AI 2027 Tracker

Reinforcement learning towards broadly and persistently beneficial models

“We find that reinforcement learning on realistic scenarios targeting beneficial traits can produce broad improvements across dozens of benchmarks m...

24 days ago

AI control roadmap, models behave worse when eval aware, pretraining alignment with safety reflection, ...

Also: models can game RL by preventing behavioral generalization, black box jailbreak defense via structural verification and semantic auditing

Securing the future of AI agents

“AI agents are transforming our relationship with technol...

a month ago

Related Newsletters

AI Safety at the Frontier

ControlAI

AGI Safety Weekly

AI Safety Events & Training

AI Safety in China

AI Policy Perspectives

Audience Metrics

Subscribers, engagement, traffic and sponsorship for AI Safety Papers.

Subscribers: (not shown)
Engagement: 65
Monthly Web Visits: (not shown)
Accepts Sponsors: (not shown)
Estimated Cost per Ad: (not shown)

Authors

The writers behind this newsletter.

Xerxes Dotiwalla

AGI Alignment @ Google DeepMind https://x.com/onexerxes

Frequently Asked Questions

How can I access the email archive for AI Safety Papers?

You can find recent issues of AI Safety Papers on Reletter by scrolling up to Latest Issues. Tap the link for any of the most recent emails, or hit More Issues to see older ones.

How many subscribers does AI Safety Papers have?

To see how many people subscribe to AI Safety Papers, simply upgrade your Reletter account. We provide readership numbers and lots of other stats for this newsletter so you can decide if it's worth reaching out to.

How can I advertise in AI Safety Papers?

Newsletter advertising can be extremely effective when it's done right. Before you pitch AI Safety Papers as a potential sponsor or partner, make sure that you've done your research and checked its newsletter stats with Reletter.

Then, personalize one of our winning pitching templates and send it to the right person using the contact info provided.

How much does it cost to sponsor a publication like AI Safety Papers?

Newsletter ad rates (often quoted as CPM, the cost per thousand impressions) vary depending on many factors, including industry, number of subscribers, open rate, and ad placement.

To find out how much an ad will cost, contact AI Safety Papers using the contact information provided and ask for a copy of their media kit.

How can I find newsletters related to AI Safety Papers?

Scroll up to where it says Related Newsletters to see other publications like AI Safety Papers. You can also search our email newsletter directory to discover other newsletters that cover the topics you're interested in.

How do I contact AI Safety Papers?

Reletter provides this newsletter's website URL above, where you will often find their contact information. We also provide links to associated social media accounts and pitching templates so you can reach out fast.