Q&A

Could we live without our data? With Colin Koopman

"The question for us today cannot be whether this [autocratic regimes accessing untold stores of data to quiet its opposition and abuse its enemies] will happen, but only when it will happen and where."

Tags: 'data aggregation' 'informational persona' 'mass surveillance'

Colin Koopman’s writings are focused on the politics and ethics of contemporary information technologies. Countering the widespread assumption that all the information technology we are immersed in today is radically new, he explores historical precedents of the basic techniques and formats underlying contemporary high-tech data systems.

These precedents show us to have been swaddled in data for much longer than we commonly think. He is Professor & Head of Philosophy at the University of Oregon where he is also Director of the New Media & Culture Program. He has written three books, his latest titled How We Became Our Data, and numerous articles for The New York Times, Aeon, Public Books, The Philosophical Salon, and elsewhere.

In your book How We Became Our Data: A Genealogy of the Informational Person you argue that the emergence of mass-scale record-keeping systems like birth certificates and social security numbers was the beginning of the informational person we have all now become. But were the purposes of that pre-digital record keeping different from the purposes of the current surveillance state?

There can be no question that first-generation (and earlier beta-version) records systems were implemented to serve different ends than contemporary mass surveillance.

Early records systems, which I research primarily in the early-twentieth-century United States (though the story is quite similar throughout much of Europe), were more often implemented in the interests of welfare and equality projects.

This is clearly the case with respect to social insurance records systems such as that undertaken under the auspices of social security in the United States. Birth certificates, at least in the U.S., were a meeting point of many interests and purposes, but chief among them was the ability to provide more accurate vital statistics for the purposes of better tracking public health campaigns (especially concerning infants and children).

And what are the goals of contemporary mass surveillance?

Much of what we discuss under the heading of contemporary mass surveillance, by contrast, has the quite different goal of state security in mind. Surveillance today targets not the improvement of the lives of citizens but more often their safety through the prevention of possible threats to life and property. 

That difference noted, what is perhaps more interesting is the way that data systems, given their remarkable portability, often end up producing consequences that were not intended by their developers.

Thus did the Social Security Number become not just a recordkeeping device at one relatively small U.S. government agency, but a de facto standard for identifying citizens, employees, students, and other types of subjects across a whole range of government and commercial organizations. Thus did birth certificates in many countries become a ubiquitous ‘breeder document’ that stands as the informational root of the thousands of databases in which our lives have been filed.

Why and how much does the data we give away matter to how we live?

All those data matter enormously to how we live.

Imagine this science fiction nightmare: your data is permanently erased in such a way that you (but only you) can never have data again, so not just the loss of your identification card but the permanent impossibility of your ever having any identification cards or numbers or accounts ever again.

In this scenario you would be thoroughly debilitated.  After only a few weeks, you could no longer transact with your employer, your bank, your grocer, and eventually all the social services agencies that can interact with you only on the basis of informational identifiers.

We decidedly live through all of our data today.  They are a technological furniture for nearly everything we do—and taken together they are more like a technological shelter within which we live, and without which we could barely even live at all.

But we really do not seem to care, do we? In December 2019, Amazon Ring sales almost tripled despite the hacks, security breaches and assaults consumers experienced with Ring. In fact, New York Times reporter Brian Chen said that people just don’t care that their Ring device is spying on them.

If people do not care about global climate change, does that mean that those of us who think about it should just accept it is not a problem?  Of course not.

The problem in both cases is not that people do not care; it is that they find it so easy to pretend that they do not care. And in both cases we should not measure the magnitude of the problem by what people do, but rather by what we know will happen to them if nothing is done.

So the question we face today is not how we can make everyone care about the vulnerability of their data, but what we all can do to protect ourselves and one another.

We especially need to be more active in finding ways to protect the most precariously positioned people in our societies such that they do not suffer undue burdens because of the data systems that they are often unwittingly enrolled in.

This includes not only problems of surveillance and privacy.  It also importantly involves issues of bias and discrimination that are built into the formats and processing operations, commonly called algorithms, of our data systems.

Can you give some examples of how data aggregation can burden certain groups in our societies?

The examples of this are piling up. One is Amazon’s gender-biased job applicant processing engine, which discounted résumés that included the word “women’s”.

Another is facial recognition systems, many of which have been found to frequently misclassify (or misrecognize, or fail to recognize the faces of) racial minorities. Researchers often point to the ways in which machine-learning systems serve to reproduce bias already embedded in historical data sets used as training data.

But another aspect of this concerns the way in which records systems only record and recognize persons according to those pre-defined categories that structure the databases hooked up to them. Often these categories do not fit many of the individuals they process. For instance, many systems do not allow for mixed-race categorization, but then they nevertheless process the records they store in terms of racial categorization. Other systems do not allow for gender-neutral or gender-nonconforming identification, but embed a technological requirement to store a gender identity that is known by users to be inaccurate or inadequate.

It might be thought that individually these examples are relatively harmless. Perhaps in some cases that is true. But in the aggregate this kind of misformatting serves to subject entire swaths of informational persons to algorithmic processes that can only yield outputs that are not adequate to the lives of those persons.  

According to Steven Feldstein of the Carnegie Endowment for International Peace, AI surveillance technology has spread faster than global experts expected. The research shows that at least 75 out of 176 countries are actively using AI technology for surveillance. The uses include smart city and safe city platforms (56 countries), facial recognition (64 countries), and smart policing (52 countries). We are concerned about big tech data extraction, but where are we in terms of governmental surveillance?

We are currently in a moment where government surveillance is being rolled out with a breathtaking velocity.  It would be difficult to say who wins in the race between the rollout of government surveillance and that of corporate dataveillance.

But what is more important than settling any contest here is really how the two are working hand-in-hand to produce technologies that are turning more and more of even our smallest actions and ideas into storable bits of data.

What could that look like in 25 years when some autocratic demagogue or super-corporation is installed in the power centers of what had formerly been a free and tolerant country? Such a regime will have access to untold stores of data with which it will quickly quiet its opposition and adroitly abuse its enemies. The question for us today cannot be whether this will happen, but only when it will in fact happen and where.

Why should we be concerned about becoming an Informational Persona, giving away our privacy and personal data in exchange for apps that apparently simplify our lives?

There are two kinds of dangers we should all take seriously.

The first concerns the harms that may flow to us as isolated individuals when we are whisked into the bureaucratic nightmare of stolen personal identifiers. But the second kind of danger is far more serious. Harms may flow, and indeed in some cases already do flow, to particular groups of people as members of those groups, and especially those groups already prone to unequal treatment.

Perhaps most visibly in the United States among liberal nations, but elsewhere too, our societies are characterized by stark inequality. Those who are unduly burdened by the social systems we have created are only going to see those burdens exacerbated as technologies for tracking, recording, and capturing are rolled out.

As the data procured by these technologies are put to use in data processing by high-performance computing systems and machine-learning algorithms, we suddenly find ourselves in a situation where those social inequalities are further exacerbated.

Not only that, but it becomes all the more difficult to understand and account for how data-driven inequalities are being produced. And if we cannot conceptualize the genesis of the harms we are producing, how are we ever going to mitigate or redress those harms?

This should be concerning to us all, but perhaps most especially to those of us who will be the unwitting beneficiaries of these data-driven augmentations of inequality.