Reletter

Musings on the Alignment Problem

Jan Leike
Platform: Substack
Pricing: Only free issues
Publishes: Infrequently
Issues: 14
Founded: 5 years ago
Last Issue: 4 months ago
Active

Read this Newsletter

aligned.substack.com

Latest Issues

Alignment is not solved

I’ve been optimistic about alignment for a while now, but when I first wrote about this in 2022, there was a lot more uncertainty about how the technology would develop. Since then a lot has happened: pretraining continued improving and RL...

4 months ago
128
17

Should we control AI instead of aligning it?

The basic pitch for AI control is to build systems around capable models that ensure they can’t cause harm even if they were misaligned. For example, a misaligned model could deliberately introduce subtle mistakes or bugs or try to go rogue...

a year ago
50
16

Crisp and fuzzy tasks

This is a taxonomy for task space that I find useful when thinking about what we need to do to solve alignment.

  • Crisp tasks are more reasoning/system-2 based. Whether a response is good is typically precisely defined. Reasonable, knowl...
2 years ago
48
6

Two alignment threat models

Misalignments in models typically fall somewhere between the two ends of this spectrum:

  • Under-elicited models: The model doesn’t try as hard as possible on the task, so its performance is worse than it could be if it was more aligned....
2 years ago
35
21

Combining weak-to-strong generalization with scalable oversight

The idea behind our latest paper on weak-to-strong generalization (W2SG) is that we finetune a large pretrained model to generalize well from supervision that is weaker (less accurate) than this large model. In the paper we use weak supervi...

2 years ago
28
6

Authors

The writers behind this newsletter.

  • Jan Leike

Frequently Asked Questions

    How can I access the email archive for Musings on the Alignment Problem?

    You can find recent issues of Musings on the Alignment Problem on Reletter by scrolling up to Latest Issues. Tap the link for any recent issue, or hit More Issues to see older ones.

    How many subscribers does Musings on the Alignment Problem have?

    To see how many people subscribe to Musings on the Alignment Problem, simply upgrade your Reletter account. We provide readership numbers and lots of other stats for this newsletter so you can decide if it's worth reaching out to.

    How can I advertise in Musings on the Alignment Problem?

    Newsletter advertising can be extremely effective when it's done right. Before you pitch Musings on the Alignment Problem as a potential sponsor or partner, make sure that you've done your research and checked its newsletter stats with Reletter.

    Then, personalize one of our winning pitching templates and send it to the right person using the contact info provided.

    How much does it cost to sponsor a publication like Musings on the Alignment Problem?

    Newsletter ad rates (or CPM) vary depending on many factors, including industry, number of subscribers, open rate, ad placement and more.

    To find out how much an ad will cost, contact Musings on the Alignment Problem using the contact information provided and ask for a copy of their media kit.

    How can I find newsletters related to Musings on the Alignment Problem?

    Scroll up to where it says Related Newsletters to see other publications like Musings on the Alignment Problem. You can also search our email newsletter directory to discover other newsletters that cover the topics you're interested in.

    How do I contact Musings on the Alignment Problem?

    Reletter provides this newsletter's website URL above, where you will often find their contact information. We also provide links to associated social media accounts and pitching templates so you can reach out fast.