The premise of AI risk is that AI poses a danger, and therefore that research into AI can itself be dangerous. In the AI alignment community, we try to do research that makes AI safer, but occasionally we may produce results with significant implications for AI capability as well. It therefore seems prudent to establish a set of guidelines that address:

Which results should be published?

What should be done with results that shouldn't be published?

These are thorny questions, and it seems unreasonable to expect every researcher to resolve them alone. The inputs to these questions involve ...