From Daniel Rosenberg’s essay in Raw Data Is An Oxymoron, loc 916. What further developments are we beginning to see in the meaning of ‘data’ in a digitalised context? The author’s point is that data carries no necessary association with veracity, so inaccurate data is still data. I wonder if I’ve misunderstood this, though, because it seems so obviously misleading to me.
This observation is supported by the numbers but not generated by them: from the beginning, data was a rhetorical concept. Data means—and has meant for a very long time—that which is given prior to argument. As a consequence, the meaning of data must always shift with argumentative strategy and context—and with the history of both. The rise of modern economics and empirical natural science created new conditions of argument and new assumptions about facts and evidence. And the histories of those terms and others in the same family nicely illustrate the larger epistemological developments.
The history of data is connected to these other histories in very important ways, but in equally important ways, it remains an outlier. Curiously, the preexisting semantic structure of the term “data” made it especially flexible in these shifting epistemological and semantic contexts. Without changing meaning, during the eighteenth century data changed connotation. It went from being reflexively associated with those things that are outside of any possible process of discovery to being the very paradigm of what one seeks through experiment and observation.
I’m interested in how this supports the implicitly critical theory of data scientists, namely that digital data reveals what people really do rather than what they say they do. The given character of empirical data was formerly localised, but it is now seen as generalised: given-ness suffuses the social world in a way that seemingly promises total legibility to those with sufficient literacy of the right sort.