One of the most interesting issues raised by the rise of data science in party politics is how to untangle corporate rhetoric from social reality. I have much time for the argument that we risk taking the claims of a company like Cambridge Analytica too seriously, accepting at face value what are simply marketing exercises. But the parallel risk is that we fail to take them seriously enough, dismissing important changes in how elections are fought as marketing hype propounded by digital charlatans.

Perhaps we need to focus more on the data scientists themselves. As much as there is something of the Bond villain about Alexander Nix, CEO of Cambridge Analytica, it’s important that we don’t become preoccupied with corporate leaders. Who are the rank-and-file data scientists working on campaigns? What motivates them? How do they conceive of the work they do? There were interesting hints about this in the recent book Shattered, looking at Hillary Clinton’s failed election campaign. Much as was the case with Jeb Bush’s almost entirely stalled campaign, there had been much investment in data analytics, with buy-in right from the top of the campaign. From pg 228-229:

These young data warriors, most of whom had grown up in politics during the Obama era, behaved as though the Democratic Party had come up with an inviolable formula for winning presidential elections. It started with the “blue wall”—eighteen states, plus the District of Columbia, that had voted for the Democratic presidential nominee in every election since 1992. They accounted for 242 of the 270 electoral votes needed to win the presidency. From there, you expanded the playing field of battleground states to provide as many “paths” as possible to get the remaining 28 electoral votes. Adding to their perceived advantage, Democrats believed they’d demonstrated in Obama’s two elections that they were much more sophisticated in bringing data to bear to get their voters to the polls. For all the talk of models and algorithms, the basic thrust of campaign analytics was pretty straightforward when it came to figuring out how to move voters to the polls. The data team would collect as much information as possible about potential voters, including age, race, ethnicity, voting history, and magazine subscriptions, among other things. Each person was given a score, ranging from zero to one hundred, in each of three categories: probability of voting, probability of voting for Hillary, and probability, if they were undecided, that they could be persuaded to vote for her. These scores determined which voters got contacted by the campaign and in which manner—a television spot, an ad on their favorite website, a knock on their door, or a piece of direct mail. “It’s a grayscale,” said a campaign aide familiar with the operation. “You start with the people who are the best targets and go down until you run out of resources.”

Understanding these ‘data warriors’ and the data practices they engage in is crucial to understanding how data science is changing party politics. Perhaps it’s even more important than understanding high-profile consultancies and the presentations of their corporate leaders.
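To make the practice described in the Shattered passage a little more concrete, here is a minimal sketch of the score-and-rank logic: each voter carries three scores from zero to one hundred, and the campaign works down the ranked list until resources run out. The book does not say how the three scores were combined, so the `priority` function, the `Voter` fields, the per-contact cost and the sample voters are all my own assumptions for illustration, not the campaign's actual method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voter:
    name: str
    turnout: int      # probability of voting, 0-100
    support: int      # probability of voting for the candidate, 0-100
    persuadable: int  # probability of being persuaded if undecided, 0-100


def priority(v: Voter) -> float:
    """Hypothetical priority (an assumption, not taken from the book).

    A voter is worth contacting either to mobilise them (likely supporter,
    unlikely to turn out) or to persuade them (likely to turn out, open to
    persuasion). We take whichever value is higher.
    """
    mobilise = (v.support / 100) * (1 - v.turnout / 100)
    persuade = (v.persuadable / 100) * (v.turnout / 100)
    return max(mobilise, persuade)


def select_targets(voters, budget, cost_per_contact=1):
    """Start with the best targets and go down until the budget runs out."""
    ranked = sorted(voters, key=priority, reverse=True)
    chosen = []
    for v in ranked:
        if budget < cost_per_contact:
            break
        chosen.append(v)
        budget -= cost_per_contact
    return chosen


# Invented sample data, purely for illustration.
voters = [
    Voter("A", turnout=95, support=90, persuadable=5),
    Voter("B", turnout=40, support=85, persuadable=10),
    Voter("C", turnout=80, support=50, persuadable=70),
    Voter("D", turnout=20, support=10, persuadable=15),
]

targets = select_targets(voters, budget=2)
print([v.name for v in targets])
```

With these invented numbers the reliable supporter (A) falls to the bottom of the list, because contacting someone who will vote for you anyway changes little. Even a toy version shows how much of the politics sits inside the choice of scoring function.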

I really wish I could go to this:

*Developing a Research Agenda for Human-Centered Data Science*

in conjunction with CSCW 2016

Sunday, February 28th, 2016
San Francisco, CA, USA

Workshop Website:

Important dates:
– 11th December 2015: Submission of Position Papers
– 18th January 2016: Notification of acceptance
– 25th January 2016: Camera-ready version
– 28th February 2016: Workshop at CSCW 2016

The study and analysis of large and complex data sets offer a wealth of
insights in a variety of applications. Computational approaches provide
researchers access to broad assemblages of data, but the insights extracted
may lack the rich detail that qualitative approaches have brought to the
understanding of sociotechnical phenomena. How do we preserve the richness
associated with traditional qualitative methods while utilizing the power
of large data sets? How do we uncover social nuances or consider ethics and
values in data use?

These and other questions are explored by human-centered data science, an
emerging field at the intersection of human-computer interaction (HCI),
computer-supported cooperative work (CSCW), human computation, and the
statistical and computational techniques of data science. This workshop,
the first of its kind at CSCW, seeks to bring together researchers
interested in human-centered approaches to data science to collaborate,
define a research agenda, and form a community.

This workshop provides a venue for attendees to discuss a variety of topics
in human-centered data science. We welcome researchers interested in
exploring how data-driven and qualitative research can be integrated to
address complex questions in a diverse range of areas, including but not
limited to social computing, urban, health, or crisis informatics,
scientific, business, policy, technical, and other fields. Researchers and
practitioners working with large data sets (“big data”) and/or qualitative
data sets looking to expand their methodological toolbox are invited to
participate and share their experiences while learning from the broader

Topics and themes of interest include, but are not limited to:

– Deep ethnographic methods: How do we preserve the richness of traditional
qualitative techniques in data science?
– Scaling up qualitative data analysis: How do we deal with ever growing
qualitative datasets?
– Quantitative and behavioral methods: How are quantitative and behavioral
methods related to data mining, machine learning, and qualitative methods?
– Connecting across levels of analysis: How can we integrate the analysis
of personal data with large-scale data?
– Ethics and values of data use: What ethical questions should we raise in
using large-scale online data?
– Privacy of data use: How can we preserve anonymity and privacy within data
ecosystems that can easily expose users?
– Human-centered algorithm design: How do we design machine learning
algorithms tailored for human use and understanding?
– Understanding community data: How can we integrate knowledge gained about
communities from their aggregate social data as well as their personal
– Health and well-being at micro and macro scales: What understandings can
be exposed or occluded by aggregate or granular perspectives on health and

Please submit a position paper (from 2 to 4 pages in the CSCW ACM SIGCHI EA
format – see
by 11th December 2015 to

The submissions will be reviewed by the organizers with support of other
researchers in a dedicated program committee and selected according to
their potential to contribute to the workshop topic and to foster

All accepted contributions (notifications will be sent out by 18 January
2016) will be made available on the website to allow participants to
prepare for the workshop. The organizers may consider the publication of
revised versions of accepted papers as part of a special issue in a CSCW
related journal.

Cecilia Aragon, University of Washington.
CJ Hutto, Georgia Institute of Technology.
Yun Huang, Syracuse University.
Wanli Xing, University of Missouri.
Gina Neff, University of Washington.
Jinyoung Kim, University of Maryland, College Park.
Andy Echenique, University of California, San Diego and San Diego
Supercomputer Center.
Joseph Bayer, University of Michigan.
Brittany Fiore-Gartland, University of Washington.

For more details, check the workshop website:

For any further information on the workshop please contact

From Plutocrats: The Rise of the New Global Super-Rich pg 46:

Carlos Slim, who studied engineering in college and taught algebra and linear programming as an undergraduate, attributes his fortune to his facility with numbers. So does Steve Schwarzman, who told me he owed his success to his “ability to see patterns that other people don’t see” in large collections of numbers. People inside the super-elite think the rise of the data geeks is just beginning. Elliot Schrage is a member of the tech aristocracy—he was the communications director for Google when it was the hottest company in the Valley and jumped to the same role at Facebook just as it was becoming a behemoth. At a 2009 talk he gave to an internal company meeting of education and publishing executives, Schrage was asked what field we should encourage our children to study. His instant answer was statistics, because the ability to understand data would be the most powerful skill in the twenty-first century.

How does this intersect with the (purported) rise of the data scientist as the ‘sexiest job of the 21st century’?