- The Copilot system prompt is (apparently) very limiting, refusing to offer opinions and constraining the self-reference which enables frontier models to have fluid interactions. If the LLM can’t take a position in relation to you, then it’s difficult to do serious ideational work with it. It won’t challenge your assumptions, offer alternative positions, or engage in reasoned disagreement to help you develop your ideas.
- This goes some way to explaining why it feels so different to ChatGPT 4o despite using the same model. It should be noted, however, that GPT-4o is no longer state of the art and the subsequent iterations don’t seem to have been incorporated into Copilot. In a sense this lag is built into the design process, since whatever LLM they use has to be incorporated into the 365 architecture after it’s been released, rather than just accessed through the API (see the sketch below).
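  To illustrate the contrast: this is a minimal sketch, assuming the OpenAI Python SDK, of how a developer working directly against the API can adopt a newer model the day it ships with a one-line change. The model names are illustrative, not a claim about what Copilot runs. A product integration like Copilot, by contrast, has to rebuild and re-certify its surrounding architecture around each release, which is where the lag comes from.

  ```python
  # Minimal sketch: direct API access, assuming the OpenAI Python SDK.
  # Model names here are illustrative.
  from openai import OpenAI

  client = OpenAI()  # reads OPENAI_API_KEY from the environment

  MODEL = "gpt-4o"  # swapping in a newer model is a one-line change

  response = client.chat.completions.create(
      model=MODEL,
      messages=[
          {"role": "user", "content": "Challenge the assumptions in this draft."},
      ],
  )
  print(response.choices[0].message.content)
  ```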
- Copilot has been built around enterprise compliance requirements rather than capabilities for the end user. My intuition is that the integration into the 365 architecture accounts, at least in part, for why Copilot is so slow. I might be wrong about this, but it is so very slow compared to other services. It makes quick, responsive, iterative work feel deeply frustrating, which undermines the whole premise of a ‘copilot’ that works with you.
- Incorporating push-button functionality into existing office applications is a recipe for cognitive outsourcing. We need to ensure that users think about what they’re asking the system to do, which is exactly what the Copilot integration model actively undermines. While I can see how this might be perceived as rendering it more accessible, the friction in using LLMs is a feature rather than a bug! It ensures a reflective engagement which would otherwise be lacking.
