I just came across this remarkable estimate in an Economist feature on surveillance. I knew digitalisation made surveillance cheaper, but I didn’t realise quite how much cheaper. How much of the creeping authoritarianism which characterises the contemporary national security apparatus in the UK and US is driven by a familiar impulse towards efficiency?

The agencies not only do more, they also spend less. According to Mr Schneier, to deploy agents on a tail costs $175,000 a month because it takes a lot of manpower. To put a GPS receiver in someone’s car takes $150 a month. But to tag a target’s mobile phone, with the help of a phone company, costs only $30 a month. And whereas paper records soon become unmanageable, electronic storage is so cheap that the agencies can afford to hang on to a lot of data that may one day come in useful.

http://www.economist.com/news/special-report/21709773-who-benefiting-more-cyberisation-intelligence-spooks-or-their
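Just to make the scale of that gap concrete, here is a quick back-of-the-envelope sketch using the figures quoted above. The dollar amounts are Schneier’s as reported by the Economist; the labels, and the decision to express everything as a ratio against the cost of a physical tail, are mine.

    # Rough comparison of monthly surveillance costs, using the figures quoted above.
    monthly_cost = {
        "physical tail (team of agents)": 175_000,
        "GPS receiver planted in a car": 150,
        "tagging a mobile phone via the operator": 30,
    }

    baseline = monthly_cost["physical tail (team of agents)"]
    for method, cost in monthly_cost.items():
        print(f"{method}: ${cost:,}/month, cost ratio vs a tail: {baseline / cost:,.0f}x")

    # Approximate output:
    # physical tail (team of agents): $175,000/month, cost ratio vs a tail: 1x
    # GPS receiver planted in a car: $150/month, cost ratio vs a tail: 1,167x
    # tagging a mobile phone via the operator: $30/month, cost ratio vs a tail: 5,833x

On these numbers, tagging a phone is several thousand times cheaper than putting agents on the street, which is the kind of difference that changes what an agency does, not just how much it spends doing it.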

In reality, it is of course anything but a simple drive for efficiency, instead heralding a potentially open-ended project to capture the world and achieve the utopia of total social legibility. It’s an ambition which always makes me think of this short story:

The story deals with the development of universe-scale computers called Multivacs and their relationships with humanity through the courses of seven historic settings, beginning in 2061. In each of the first six scenes a different character presents the computer with the same question; namely, how the threat to human existence posed by the heat death of the universe can be averted. The question was: “How can the net amount of entropy of the universe be massively decreased?” This is equivalent to asking: “Can the workings of the second law of thermodynamics (used in the story as the increase of the entropy of the universe) be reversed?” Multivac’s only response after much “thinking” is: “INSUFFICIENT DATA FOR MEANINGFUL ANSWER.”

The story jumps forward in time into later eras of human and scientific development. In each of these eras someone decides to ask the ultimate “last question” regarding the reversal and decrease of entropy. Each time, in each new era, Multivac’s descendant is asked this question, and finds itself unable to solve the problem. Each time all it can answer is an (increasingly sophisticated, linguistically): “THERE IS AS YET INSUFFICIENT DATA FOR A MEANINGFUL ANSWER.”

In the last scene, the god-like descendant of humanity (the unified mental process of over a trillion, trillion, trillion humans that have spread throughout the universe) watches the stars flicker out, one by one, as matter and energy ends, and with it, space and time. Humanity asks AC, Multivac’s ultimate descendant, which exists in hyperspace beyond the bounds of gravity or time, the entropy question one last time, before the last of humanity merges with AC and disappears. AC is still unable to answer, but continues to ponder the question even after space and time cease to exist. Eventually AC discovers the answer, but has nobody to report it to; the universe is already dead. It therefore decides to answer by demonstration. The story ends with AC’s pronouncement,

And AC said: “LET THERE BE LIGHT!” And there was light.

https://en.wikipedia.org/wiki/The_Last_Question

From Douglas Rushkoff’s Throwing Rocks at the Google Bus, loc 2256:

Besides, consumer research is all about winning some portion of a fixed number of purchases. It doesn’t create more consumption. If anything, technological solutions tend to make markets smaller and less likely to spawn associated industries in shipping, resource management, and labor services.

Digital advertising might ultimately capture the entirety of advertising budgets, but it does nothing to expand these budgets. There are upper limits on the revenue growth of the corporations that define the ‘attention economy’: how are they going to respond to these?

I’m very interested in the concept of Entlastung, which I was introduced to through the work of Pierpaolo Donati and Andrea Maccarini earlier this year. It emerged from the work of Arnold Gehlen and refers to the role of human institutions in unburdening us from existential demands. This is quoted from his Human Beings and Institutions on p. 257 of Social Theory: Twenty Introductory Lectures by Hans Joas and Wolfgang Knöbl. He writes that institutions

are those entities which enable a being, a being at risk, unstable and affectively overburdened by nature, to put up with his fellows and with himself, something on the basis of which one can count on and rely on oneself and others. On the one hand, human objectives are jointly tackled and pursued within these institutions; on the other, people gear themselves toward definitive certainties of doing and not doing within them, with the extraordinary benefit that their inner life is stabilized, so that they do not have to deal with profound emotional issues or make fundamental decisions at every turn.

In an interesting essay last year, Will Davies reflected on the ‘pleasure of dependence’ in a way which captures my understanding of Entlastung. It can be a relief to trust in something outside of ourselves, settling into dependence on the understanding that our context is made reliable by an agency other than our own:

I have a memory from childhood, a happy memory — one of complete trust and comfort. It’s dark, and I’m kneeling in the tiny floor area of the back seat of a car, resting my head on the seat. I’m perhaps six years old. I look upward to the window, through which I can see streetlights and buildings rushing by in a foreign town whose name and location I’m completely unaware of. In the front seats sit my parents, and in front of them, the warm yellow and red glow of the dashboard, with my dad at the steering wheel.

Contrary to the sentiment of so many ads and products, this memory reminds me that dependence can be a source of deep, almost visceral pleasure: to know nothing of where one is going, to have no responsibility for how one gets there or the risks involved. I must have knelt on the floor of the car backward to further increase that feeling of powerlessness as I stared up at the passing lights.

http://thenewinquiry.com/essays/the-data-sublime/

At a time when Entlastung is failing, when institutions are coming to lose this capacity to unburden us, could faith in self-tracking, big data and digital technology fill the gap? The technological system as a whole comes to constitute the remaining possibility of Entlastung, and we enthusiastically throw ourselves into its embrace, as the only way left to feel some relief from the creeping anxiety that characterises daily life.

The essay by Will Davies is really worth reading: http://thenewinquiry.com/essays/the-data-sublime/

From Infoglut, by Mark Andrejevic, loc 607. The context for digital innovation in public services:

What emerges is a kind of actuarial model of crime: one that lends itself to aggregate considerations regarding how best to allocate resources under conditions of scarcity – a set of concerns that fits neatly with the conjunction of generalized threat and the constriction of public-sector funding. The algorithm promises not simply to capitalize on new information technology and the data it generates, but simultaneously to address reductions in public resources. The challenges posed by reduced manpower can be countered (allegedly) by more information. As in other realms, enhanced information processing promises to make the business of policing and security more efficient and effective. However, it does so according to new surveillance imperatives, including the guidance of targeted surveillance by comprehensive monitoring, the privileging of prediction over explanation (or causality), and new forms of informational asymmetry. The data-driven promise of prediction, in other words, relies upon significant shifts in cultures and practices of information collection.

From Infoglut, by Mark Andrejevic, loc 464:

The dystopian version of information glut anticipates a world in which control over the tremendous amount of information generated by interactive devices is concentrated in the hands of the few who use it to sort, manage, and manipulate. Those without access to the database are left with the “poor person’s” strategies for cutting through the clutter: gut instinct, affective response, and “thin-slicing” (making a snap decision based on a tiny fraction of the evidence). The asymmetric strategies for using data highlight an all-too-often overlooked truth of the digital era: infrastructure matters. Behind the airy rhetoric of “the cloud,” the factories of the big data era are sprouting up across the landscape: huge server farms that consume as much energy as a small city. Here is where data is put to work – generating correlations and patterns, shaping decisions and sorting people into categories for marketers, employers, intelligence agencies, healthcare providers, financial institutions, the police, and so on. Herein resides an important dimension of the knowledge asymmetry of the big data era – the divide between those who generate the data and those who put it to use by turning it back upon the population. This divide is, at least in part, an infrastructural one shaped by ownership and control of the material resources for data storage and mining. But it is also an epistemological one – a difference in the forms of practical knowledge available to those with access to the database, in the way they think about and use information.

I’d been planning to read his work for a while, but I’m finding it almost eerie how relevant it is. This is exactly what I was trying to argue in my forthcoming chapter on Fragile Movements, but Andrejevic expresses it much more effectively than I was able to. The project as a whole is about the sociology of group formation under these conditions, as well as how this contributes to the continuing development of digital capitalism.

More on this from Infoglut, loc 870:

In this regard the digital era opens up a new form of digital divide: that between those with access to the databases and those without. For those with access, the way in which data is understood and used will be fundamentally transformed. There will be no attempt to read and comprehend all of the available data – the task would be all but impossible. Correlations can be unearthed and acted upon, but only by those with access to the database and the processing power. Two different information cultures will come to exist side by side: on the one hand, the familiar, “old-fashioned” one in which people attempt to make sense of the world based on the information they can access: news reports, blog posts, the words of others and the evidence of their own experience. On the other hand, computers equipped with algorithms that can “teach” themselves will advance the instrumental pragmatics of the database: the ability to use tremendous amounts of data without understanding it.

From Infoglut, by Mark Andrejevic, loc 601:

The fictional portrayals envision a contradictory world in which individual actions can be predicted with certainty and effectively thwarted. They weave oracular fantasies about perfect foresight. Predictive analytics, by contrast, posits a world in which probabilities can be measured and resources allocated accordingly. Because forecasts are probabilistic, they never attain the type of certitude that would, for example, justify arresting someone for a crime he or she has not yet committed. Rather, they distribute probabilities across populations and scenarios.

The most pressing question this raises for me concerns the micro-sociology of algorithmic authority. To what extent are the algorithms black-boxed by those ‘on the ground’? Does awareness of the probabilistic character of the forecast drop out of the picture in the social situations in which actors are intervening on the basis of these forecasts? How much implicit authority derives from the fact that ‘the algorithm says so’, even if those designing the underlying system would stress that the forecasts are probabilistic? How does this vary between different groups? It’s easy to imagine securely embedded professionals (e.g. doctors) treating these forecasts with care, not least because many already do so as a routine part of working life, but what if algorithmic authority is a corollary of deliberate deskilling? What if interventions are made by those who are precariously employed? As much as I dislike the show in question, it’s hard not to think of this when reflecting on these issues:


These are empirical questions which are hugely important for the study of algorithmic authority. I’m planning to start looking for ethnographic and ethnomethodological literature which can shed light on them, even if not directly addressed to questions of algorithms. Any reading suggestions are, as ever, much appreciated. 
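To make the worry a little more concrete, here is a minimal sketch of the kind of translation I have in mind: an analytics layer produces a probability, but what reaches the person intervening is a flag. The function names, features and threshold are all invented for illustration; this is not a description of Andrejevic’s material or of any actual system, just the shape of the translation.

    # Hypothetical illustration: a probabilistic forecast collapsed into a binary flag
    # before it reaches the person acting on it. All names and thresholds are invented.

    def risk_score(features: dict) -> float:
        """Toy stand-in for a predictive model: returns a probability between 0 and 1."""
        return min(1.0, 0.1 + 0.2 * features.get("prior_incidents", 0))

    def frontline_view(features: dict, threshold: float = 0.5) -> str:
        """What the person 'on the ground' sees: the probability, and any sense of its
        uncertainty, has already dropped out of the picture."""
        return "FLAGGED" if risk_score(features) >= threshold else "not flagged"

    person = {"prior_incidents": 2}
    print(risk_score(person))      # 0.5 -- a probability, open to interpretation
    print(frontline_view(person))  # FLAGGED -- reads like a verdict rather than a forecast

Whether real deployments preserve the probability, and whether anyone downstream has the training, time or security to read it as one, is precisely the empirical question.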

Bookmarking this so I can come back to it later. If I pursue this thread, Social Media For Academics is never going to get finished:

Reflecting their student populations, universities have long been bastions of oodles of consumer technology. We are awash in mobile phones, laptops, tablets, gaming consoles, and the like. If one combines mobile consumer technology with Big Data analytics, one gets a host of new possibilities ranging from new ways of providing students with basic support to new ways of getting students to learn what the faculty needs them to learn. If we can get the right information flowing through the minds of students, perhaps we can improve their success. We can potentially help transform the classroom from the 19th century to the 21st.

The byproducts of all this data are the new insights that can drive decision making in new ways. When one adds into the mix advanced data visualization capabilities, one gets something different for university administrators and faculty: better and approachable insight into university operations and even the minds of the students. Higher education is at the cusp of gathering an unprecedented amount of information using affordable tools and techniques.

http://www.sap.com/bin/sapcom/hr_hr/downloadasset.2014-01-jan-29-18.applying-big-data-in-higher-education-a-case-study-pdf.html

I included some material on this in a lecture on big data I did for the MA course I was convening this year. But it just struck me how enormously significant this is for digital scholarship: the more academics embrace social media in circumstances where managers seek to unleash a big data tsunami of change, the more they will be monitored as part of such initiatives.

I’m sad I’ll be missing this (though happy to be in Berlin) – hope lots of other people make it:

Warwick University Festival of Social Sciences

Data Big and Small: Past, Present and Future

This event is jointly hosted by the Faculty of Social Sciences and the Warwick Q-Step Centre.

11 May 2015 – 16:00 – 18:15, followed by a drinks reception until 19:00

Warwick’s Faculty of Social Sciences has been doing a suite of work around big data this year. ‘Big data’ has become an unwieldy catchphrase loaded with many different connotations. Some researchers argue that big data are transforming everyday social and political processes, locally and globally; others argue that big data has always been around in one way or another. This event will consider the past, present and future of data, big and small. The event is aimed primarily at those wanting to learn more about big data in general, as well as those wanting to learn more about different social science perspectives on big data. It will include a panel of leading Warwick scholars drawn from across the social science disciplines. Rather than giving presentations, panellists will be asked to discuss a number of questions based on some of the key issues we have drawn out of our suite of work around big data. There will then be questions from the audience, which the panellists will be asked to discuss. This will be followed by our keynote speaker, Emer Coleman.

Agenda

16:00 – 16:10     Welcome Address by Professor Chris Hughes, Chair of the Faculty of Social Sciences and Head of the Department of Politics and International Studies.

16:10 – 17:30        Panel Session

17:30 – 18:15        Keynote Speaker

18:00 – 19:00        Drinks Reception

Keynote:

Emer Coleman – A Warwick alumna now working as a journalist and consultant writing about how technology impacts organisational development. She was the architect of the London Datastore and more recently the Deputy Director for Digital Engagement at the Government Digital Service, where she wrote the Social Media Guidance for the Civil Service. She was named in Wired Magazine’s top 100 Digital Power Influencers List 2011.

Panel:

Dr Philippe Blanchard – Assistant Professor at the Department of Politics and International Studies, and member of the Warwick Q-Step Centre. He is currently involved in a number of European research projects on political trajectories, environmental politics, and old and new social science methods.

Dr Claire Crawford – Assistant Professor at the Department of Economics. Her recent research involves understanding what explains socio-economic and ethnic differences in Higher Education participation and attainment, and what universities and policymakers can do to help reduce these gaps.

Dr Olga Goryunova  – Associate Professor at the Centre for Interdisciplinary Methodologies. Her recent work involves questions about the digital subject/person in relation to data mining and an ESRC project on “Picturing the Social”.

Dr Tobias Preis – Associate Professor of Behavioural Science and Finance at the University of Warwick. Together with his colleague Dr Suzy Moat, he directs the Data Science Lab at Warwick Business School. His recent research aims to predict real-world behaviour using data taken from Google, Wikipedia, Flickr and other sources. His research has been covered by the BBC, the New York Times, the Financial Times, Science, Nature, Time Magazine, New Scientist and the Guardian. He has given a range of public talks including presentations at TEDx events in the UK and in Switzerland. See here for further details: http://www.tobiaspreis.de./

Dr Nick Sofroniou  – Principal Research Fellow at the Institute for Employment Research.  His recent work involves developing statistical models for complex samples in education and the social sciences, e.g., students nested in classrooms, employees in different countries, and longitudinal studies. He maintains a keen interest in evidence-based policy and in the interplay between national and international-level policy initiatives.

Chair:

Dr Emma Uprichard – Associate Professor at the Centre for Interdisciplinary Methodologies and co-director of the Warwick Q-Step Centre. Her recent research explores how different methods, including big data analytics, can be developed for complex social policy and planning purposes.

TO BOOK YOUR PLACE, PLEASE REGISTER ONLINE.

This insightful article paints a worrying picture of the growth of data-driven policing. The technical challenge of “building nuance” into data systems “is far harder than it seems” and has important practical implications for how interventions operate on the basis of digital data. What I hadn’t previously realised was how readily investigators are using social media on their own initiative, above and beyond the systems that are being put into place with the help of outside consultancies: only 9% of police using social media in investigations had received training from their agency. Furthermore, the discussion of the lifespan of data raised some really interesting (and worrying) questions about the organisational sociology of data-driven policing, given what seems likely to be an increasing involvement of the private sector in policing in the UK:

For the kid listed in a gang database, it can be unclear how to get out of it. In the world of human interaction, we accept change through behavior: the addict can redeem himself by getting clean, or the habitual interrupter can redeem himself by not interrupting. We accept behavior change. But in the database world, unless someone has permission to delete or amend a database record, no such change is possible. Credit agencies are required to forgive financial sins after 7 years. Police are not—at least, not consistently. The National Gang Center, in its list of gang-related legislation, shows only 12 states with policies that specifically address gang databases. Most deny the public access to the information in these databases. Only a few of these twelve mention regular purging of information, and some specifically say that a person cannot even find out if they have a record in the database.

This permanence does not necessarily match real-world conditions. Kids cycle in and out of street gangs the way they cycle in and out of any other social group, and many young men age out of violent behavior. Regularly purging the gang database, perhaps on a one-year or two-year cycle, would allow some measure of computational forgiveness. However, few institutions are good at keeping the data in their databases up-to-date. (If you’ve ever been served an ad for a product you just bought, you’re familiar with this problem of information persistence and the clumsiness of predictive algorithms.) The police are no worse and no better than the rest of us. Criminologist Charles Katz found that despite a written department policy in one large Midwestern police gang unit, data was not regularly audited or purged. “The last time that the gang unit purged its files, however, was in 1993—approximately 4 years before this study was conducted,” he wrote. “One clerk who is responsible for data entry and dissemination estimated, ‘At a minimum, 400 to 500 gang members would be deleted off the gang list today if we went through the files.’ Accordingly, Junction City’s gang list of 2,086 gang members was inflated by approximately 20% to 25%.”

http://www.theatlantic.com/politics/archive/2015/04/when-cops-check-facebook/390882/

This suggests to me that any adequate evaluation of data-driven policing needs to take questions of organisational sociology and information technology extremely seriously. What matters is not just the formulation of data management policies but what we know about how such policies tend to be implemented under the specific conditions likely to obtain in policing. Given the broader trend towards the privatisation of policing, it is increasingly important that we understand how sharing of data operates across organisational boundaries, how it is prepared and how it is perceived by end-users.
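For what it’s worth, the ‘computational forgiveness’ the Atlantic piece describes is technically trivial to implement; the hard part, as Katz’s finding about unpurged files suggests, is organisational. Here is a minimal sketch of what a purge-on-a-cycle retention policy might look like, with hypothetical table and field names; it illustrates the idea and is not a description of any actual police system.

    # Hypothetical sketch of a retention policy: remove database entries that have not
    # been corroborated within a fixed window (here, roughly two years).
    import sqlite3
    from datetime import datetime, timedelta

    RETENTION = timedelta(days=2 * 365)

    def purge_stale_records(conn: sqlite3.Connection, now: datetime) -> int:
        """Delete entries whose last corroboration predates the retention window.
        Returns how many records were removed, so the purge itself can be audited."""
        cutoff = (now - RETENTION).isoformat()
        cur = conn.execute(
            "DELETE FROM gang_database WHERE last_corroborated < ?", (cutoff,)
        )
        conn.commit()
        return cur.rowcount

    # Example with an in-memory database:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE gang_database (person_id TEXT, last_corroborated TEXT)")
    conn.execute("INSERT INTO gang_database VALUES ('a', '2010-01-01T00:00:00')")
    conn.execute("INSERT INTO gang_database VALUES ('b', '2015-01-01T00:00:00')")
    print(purge_stale_records(conn, datetime(2015, 4, 1)))  # 1 -- only the stale entry goes

The code is the easy bit: someone still has to own the routine, schedule it, audit it and resist the institutional temptation to keep everything ‘just in case’, which is exactly where the organisational sociology comes back in.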

My fear is that a form of inter-organisational ‘black-boxing’ could kick in, whereby those utilising the data for interventions trust that others elsewhere have taken responsibility for ensuring its reliability. What scrutiny would the operations of outside suppliers be subject to? Could privatisation intensify the rush towards data-driven policing in the name of efficiency savings? Would a corresponding centralisation of back-office functions compound the aforementioned epistemological risks entailed by outsourcing? These are all urgent questions which could easily be marginalised as budgetary constraint drives ‘innovation’ in policing: data-driven policing and privatised policing will likely go hand-in-hand, and we need to analyse them as such.

As part of its effort to expand beyond traditional types of academic publication, Big Data & Society has introduced an Early Career Researcher Forum targeted at scholars finishing or having recently completed advanced graduate degrees. More specifically, the ECR forum seeks work by researchers reflecting on some of the challenges of their work (related to Big Data topics) in about 1,000 to 2,000 words, with a range of illustrations, figures, etc., as well as a brief bio (100 words). The goal is to encourage reflexive submissions that explore what it means to be a researcher studying issues concerning big data and society. As guidance we ask authors to consider a series of questions (addressing any or all of these):

  • What kinds of challenges empirically and/or methodologically have you encountered in your work?
  • Do you have an example of these challenges, particularly one that can be shared in an online forum such as the journal offers, i.e., with visualizations, graphs, etc.?
  • Does Big Data allow you to ask new questions or explore old issues?
  • Are there questions that your data cannot answer? Why? What else is necessary?
  • Why is your research important and interesting?
  • How do you relate back to your home discipline, and do your colleagues understand you?

In addition to targeted submissions, the Early Career Researcher forum accepts unsolicited contributions and encourages those who are interested to correspond with the co-editors (Irina Shklovski and Matthew Zook) for guidance.

This looks really interesting – if I wasn’t drowning under the weight of existing writing commitments, I’d love to try and write something for the final topic suggestion:

Call for papers for special issue of IEEE Internet Computing: http://www.computer.org/portal/web/computingnow/iccfp6

Internet of You: Data Big and Small

Final submissions due:  1 March 2015
Publication issue:  November/December 2015

Please email the guest editors a brief description of the article you plan to submit by 1 February 2015.
Guest editors: Deborah Estrin and Craig Thompson (ic6-2015@computer.org).

We are at a great divide. Where our ancestors left behind few records, we are creating and preserving increasingly complete digital traces and models of almost every aspect of our lives. This special issue of IEEE Internet Computing aims to explore technologies and issues from small user-centric models of individuals to real-time analytics on huge aggregations of user data. At present, some are aspiring to create immortal avatars by letting you record everything about yourself and convert it into a model that’s queriable, conversant, and possibly even active in gaining new experiences for itself. Meanwhile, others are equally concerned with stemming the tide of third-party data aggregation of individual models to mitigate risks that can evolve from this kind of near total information awareness.

This special issue seeks original articles that explore both small data (individual-scale data sources, processing, and modeling) and big data (community level aggregation and analytics). Topics include

  • diverse data sources and digital traces, including email, Facebook, financial, health, location, images, sound, consumer transactions, and interests;
  • methods to combine trace data into complete models; data liberation; kinds of user models, such as the physical self, memories, aspect-limited versus comprehensive models; and data quality, including managing history, change, comprehensiveness, and accuracy;
  • methods to aggregate and process heterogeneous data sets, stages of life ontologies, the scope and purpose of these data collections, available public data sources;
  • usage models for experience sampling — proximity, context, activity sensing, quantified self, situation-aware modeling, activities of daily living, my immortal avatar, workflows, and pattern learning;
  • representation technologies such as agents, smartphones, wearable computing, personal sensing networks, pattern representation and adaptation, and natural language;
  • new kinds of applications that draw insights from data analytics, including recommendation systems, personalized health, real-time marketing, and predicting elections from Twitter feeds;
  • open architectures for personalization, the role of cloud computing, relevant emerging standards;
  • concerns regarding privacy and surveillance, the extent of privacy erosion, taxonomy of privacy threats, and incentives and disincentives for sharing, the right to forget, and status of legal safeguards;
  • privacy and security technology safeguards, including identity management, disclosure control, privacy-preserving data mining, de-identification, new security models, mechanisms that audit and control personal information flows and usage; and
  • social and philosophical implications for humans’ conception of self. Is there a natural boundary between user models and world models?

Submission Guidelines

All submissions must be original manuscripts of fewer than 5,000 words, focused on Internet technologies and implementations. All manuscripts are subject to peer review on both technical merit and relevance to IC’s international readership — primarily practicing engineers and academics who are looking for material that introduces new technology and broadens familiarity with current topics. We do not accept white papers, and we discourage strictly theoretical or mathematical papers. To submit a manuscript, please log on to ScholarOne (https://mc.manuscriptcentral.com:443/ic-cs) to create or access an account, which you can use to log on to IC’s Author Center (http://www.computer.org/portal/web/peerreviewmagazines/acinternet) and upload your submission.