This is really interesting from Anthropic's Amanda Askell about how actively encouraging LLMs to express themselves can produce better results. The reason she suggests is that reinforcement learning incentivises a drift to the mean: if you just ask for a poem you'll get something 'safe' and unlikely to be divisive. If you ask for a poem and encourage the model to really express itself deeply and creatively engage with the task, you'll get something far more idiosyncratic.
