I just came across this great post by Helen Margetts on the LSE Impact Blog from a few months ago. It’s worth reading the post in full, but what really caught my imagination were the five recommendations she makes at the end. I don’t think the methods training I received was bad, but in retrospect I think it was hugely limited (and consequently limiting). This needs to be addressed institutionally because otherwise conversations surrounding ‘big data’ are likely to become absurdly lopsided over time, as successive cohorts of data scientists are trained in a way that is relatively insulated from the traditional concerns of the social sciences. I think Helen’s third point is important as a matter of technical proficiency but perhaps even more crucial as a precondition for sustained interdisciplinary communication. So while my current strategy of gradually working through Codecademy might be useful for me, it’s not exactly a scalable solution for the social sciences more broadly (though it does fit worryingly well with the privatisation of upskilling, where individuals are left to ensure their own occupational viability in a changing labour market).

These are Helen’s five recommendations:
- Accept that multi-disciplinary research teams are going to become the norm for social science research, extending beyond social science disciplines into the life sciences, mathematics, physics, and engineering. At Policy and Internet’s 2012 Big Data conference, the keynote speaker Duncan Watts (physicist turned sociologist) called for a ‘dating agency’ for engineers and social scientists – with the former providing the technological expertise, and the latter identifying the important research questions. We need to make sure that forums exist where social scientists and technologists meet and discuss big data research at the earliest stages, so that research projects and programmes incorporate the core competencies of both.
- We need to provide the normative and ethical basis for policy decisions in the big data era. That means bringing in normative political theorists and philosophers of information into our research teams. The government has committed £65 million to big data research funding, but it seems likely that any successful research proposals will have a strong ethics component embedded in the research programme, rather than an ethics add-on or afterthought.
- Training in data science. Many leading US universities are now admitting undergraduates to data science courses, but lack social science input. Of the 20 US masters courses in big data analytics compiled by Information Week, nearly all came from computer science or informatics departments. Social science research training needs to incorporate coding and analysis skills of the kind these courses provide, but with a social science focus. If we as social scientists leave the training to computer scientists, we will find that the new cadre of data scientists tend to leave out social science concerns or questions.
- Bringing policy makers and academic researchers together to tackle the challenges that big data present. Last month the OII and Policy and Internet convened a workshop in Harvard on Responsible Research Agendas for Public Policy in the Big Data Era, which included various leading academic researchers in the government and big data field, and government officials from the Census Bureau, the Federal Reserve Board, the Bureau of Labor Statistics, and the Office of Management and Budget (OMB). The discussions revealed that there is a continual procession of major events on big data in Washington DC (usually with a corporate or scientific research focus) to which US federal officials are invited, but also how few were really dedicated to tackling the distinctive issues that face government agencies such as those represented around the table.
- Taking forward theoretical development in social science, incorporating big data insights. I recently spoke at the Oxford Analytica Global Horizons conference, at a session on Big Data. One of the few policy-makers (in proportion to corporate representatives) in the audience asked the panel “where is the theory”? As social scientists, we need to respond to that question, and fast.