A few years ago, I was preoccupied by this question after an illuminating six months working in the Data Science Lab at Warwick Business School. I co-organised a groundbreaking conference in computational social science, and it was clear this represented a mode of expertise distinct from what I had previously been familiar with as someone whose academic life had until then been spent in philosophy and sociology departments. It made different claims and responded to different concerns, leading to analyses which were radically extensive but lacking in intensivity. However, what interested me more were the human factors associated with it, the distinctive stylings and presentations I could see amongst people who identified as data scientists. It struck me as a fascinating category, doing a great deal of work while remaining underdefined. It could be performed in a wide range of ways, while there remained characteristics which most data scientists seemed to share. It placed intellectual issues in the foreground while there nonetheless seemed to be an emerging ethos uniting people who identified as data scientists, even if it might only have been partially articulated.
What did it mean to be a data scientist? How did the identity motivate people? How did these motivations feed back into the occupational category and the role it played within data-driven organisations? These are questions which haven’t been on my mind for a while, but I was reminded of them today when reading about DJ Patil, the first chief data scientist in the US government. On p. 155 of The Fifth Risk, Michael Lewis relays Patil's account of how the category of ‘data scientist’ emerged amidst the data tsunami of the last decade. A range of titles and roles had emerged which packaged up emerging competencies related to data in different ways. According to DJ Patil, the impulse to unify these under the master category of ‘data scientist’ was bureaucratic, reflecting a managerial discomfort with an explosion of new categories within their organisations:
Along with much more: in the space of a few years, the interest in data analysis went from curiosity to fad. The fetish for data overran everything from political campaigns to the management of baseball teams. Inside LinkedIn, DJ presided over an explosion of job titles that described similar tasks: analyst, business analyst, data analyst, research sci. The people in human resources complained to him that the company had too many data-related job titles. The company was about to go public, and they wanted to clean up the organization chart. To that end DJ sat down with his counterpart at Facebook, who was dealing with the same problem. What could they call all these data people? “Data scientist,” his Facebook friend suggested. “We weren’t trying to create a new field or anything, just trying to get HR off our backs,” said DJ. He replaced the job titles for some openings with “data scientist.” To his surprise, the number of applicants for the jobs skyrocketed. “Data scientists” were what people wanted to be. In the fall of 2014 someone from the White House called him. Obama was coming to San Francisco and wanted to meet with him.
Exploring the interplay between organisations creating roles for ‘data scientists’ and individuals coming to identify as such would make a fascinating project. How have the roles of the “analyst, business analyst, data analyst, research sci” and the people performing them changed in the process? For instance, consider the description on pp. 172-174 of the intellectual project which the creation of the chief data scientist role in the US government sought to advance:
DJ Patil had gone to Washington in 2014 to help people find that gold. He was the human expression of an executive order Obama had signed the year before, insisting that all unclassified government data be made publicly available and that it be machine-readable. DJ assumed he’d need to leave when the man who hired him left office, so that gave him just two years. “We did not have time to collect new data,” he said. “We were just trying to open up what we had.”
How do these organisational projects shape how the data scientists involved see their purpose? How do these projects contribute to a sense of data science as a vocation beyond the organisation in question? How should we revise our conception of organisations if all those beyond a certain size will have data scientists within their ranks? There are many questions which can be asked here and investigating the role itself would be an interesting route into them.