
The frontier models still hallucinate wildly for literature searches

Over the last few months I’ve slowly experimented with asking frontier models (GPT-4o and Claude 3/3.5) for suggestions of literature on particular topics. This was obviously impossible with earlier models because of how uniformly they hallucinated references. In contrast, the recent generation of models could produce at least a few interesting and often left-field references for the topics I was asking about, usually in STS or social theory.

However, I just asked GPT-4o and Claude 3.5 for literature on an extremely specific topic in psychotherapeutic practice. Not only were the real references they offered utterly generic, but most of the reference lists were entirely hallucinated. When challenged, Claude retreated into a hyper-apologetic mode I hadn’t seen since Claude 2, whereas GPT-4o was weirdly evasive in a way I’d not seen from the model before.

The frontier models, then, still hallucinate wildly for literature searches. I think this illustrates how lopsided representation within the training data is: my experience suggests a surprising preponderance of STS literature in what they were trained on (presumably via open access publications). In contrast, when I asked an extremely specific question about a literature which is much less well represented, the old problems immediately resurfaced.