Category: Big Data
Professor Dan Schiller was invited to deliver several lectures at the Global Fellowship Program at Peking University in October and November 2016. Below is his interview on “big data” with Wang Jianfeng from Chinese Social Sciences Today (CSST), conducted after his lectures and originally published in CSST on January 19, 2017.
Wang Jianfeng: How do you evaluate the trend toward what is called Big Data?
Dan Schiller: Big Data refers not just to the scale or volume and diversity of data that are now being created but also to the need to make sense of these data through data science, through network analysis, and through other specialized disciplines that are trying to grapple with this challenge. One problem is that this often accords a new priority to an old emphasis, which is empiricist. Anything can be data, so let’s just look for patterns. We don’t care if they amount to anything meaningful. Let’s just see what seems to be related to what. There is, however, a deeper issue. An instrumental purpose is typically encoded in the accumulation and subsequent analysis of Big Data.
Thus, we have a problem because we need to know whose instrumental purpose it is and what goals it serves. If the goal of Big Data is to preserve the fishing grounds of the people who have been fishing in some part of the ocean, maybe that is ok because maybe it can be used beyond that to preserve the fish as well as the fishermen. If, however, the goal of collecting and analyzing Big Data is to extract profit from any area of human interaction, direct or mediated by machines, then I am not so sure. Actually, I am sure: It is wrong. Because then Big Data is organized around the instrumental purpose of profit maximization, which is not only exploitative but also often carries what economists call externalities. It may have all kinds of other effects beyond the immediate goal of profit-making, but nobody pays for these—except the rest of us. Disease, environmental despoliation and inequality are primary examples.
So how do we build the system of organizing Big Data if we need Big Data? And I am not sure we do need Big Data, because much of the data collection that is happening should not be occurring. We need a process of what in Europe and Canada they call data protection. I am not sure that is the right term, but I am sure that we need a policy or a structure of decision‐making for data collection, as well as for data analysis.
And this poses wholly new problems of political organization. Who should be making the policy, and on what grounds? Big Data thus poses profound questions. Because on the one hand, it gives new power to the units of big capital that are learning to exploit it for profit-making, while, on the other hand, it takes away power from everybody else, often without anyone knowing what, specifically, is happening. So we have a really big problem of balance—a power disparity—and, looking ahead, of a need for political creativity.
The issue is partly about education. People know now that when they go online, they are giving up their data. They know that, but they don’t realize that when they turn on their washing machine, or when they open their refrigerator, or when they take a shower, or when they go to bed, they are giving data. We need a forum for the discussion and decision‐making about which data ought to be generated and collected, by whom and for what reasons. Until we have that, we don’t have an answer to the problem of Big Data.
Wang Jianfeng: Now that information is regarded as a commodity, could overcapacity in the information industry take place? Do we have too much information?
Dan Schiller: We have too much of the wrong information and not enough of the right information. So there continues to be a desperate need for more information on the environment, more information on workers’ safety and occupational disease, more information on epidemiology and public health, more information on the social conditions of working people and the inequalities that prevail across society. We don’t have enough information on any of these, and where we do already possess the information, it isn’t widely circulated. There is indeed a huge information deficit.
However, in some contexts, there is also too much information. There is too much information being extracted from the everyday interactions that people have as they use technology that is embedded not just in smartphones and tablets and computers, but in the Internet of Things. So I think the question needs to be reframed in terms of what information we need and what information we get.
Condescension toward farmers has been a bedrock historical fixture of urban middle-class understanding (in the United States, “clod-hoppers” is one of the more polite disparagements). After World War II, U.S. social scientists incorporated this prejudice into what they termed “modernization theory,” which they developed as a rationale for compelling indigenous peoples to abandon “traditional” village life. Walt Rostow’s formulation of the “stages of economic growth” became ubiquitous. In this conception, “development” took the form of a repeated sequence: out of agriculture, into industrial manufacturing, and then on to the production of services. In this scheme, the U.S. – conveniently – constituted a paragon of developed modernity. Modernization theory was far from being merely an academic daydream. The U.S. Government packaged its foreign policy toward the then Third World under the motto of “development” – and used it, among other things, to sell what was called the “Green Revolution.” The Green Revolution pushed to increase agricultural productivity via “technology transfers.” Fertilizers, pesticides, and high-yield seeds from the U.S., alongside intrusive management practices, were the standard package.
Capitalist agriculture was thereby given a giant push. And the ratchet continues to turn: capital has continued to transform agriculture. It comes as no surprise, therefore, that farming should increasingly exhibit some of digital capitalism’s trademark features.
Alongside the Green Revolution, industrial capitalist agriculture brought about massive land grabs, widespread destruction of biodiversity, climate change, environmental pollution, and unsustainable use of water resources. Heralding that the same social forces that caused the problems now will fix them, corporate capital is calling for a “digital revolution” and a shift to more information-intensive farming practices.
Drones, driverless tractors, sensors, robotics, mobile apps, global positioning system satellites, and cloud-based data storage are sweeping across, above, and below the agricultural landscape. Farming is being digitized and its data codified throughout the agricultural lifecycle – from the cultivation of soil, to plant breeding, to planting schedules, to pest control, to irrigation, to crop monitoring, to harvesting, to food production and distribution, all the way to ultimate consumption. Companies including Monsanto, John Deere, Cargill, and DuPont are at the forefront of this process. The public relations industry has been hard at work creating happy-talk names for what they are doing: “maximizing crop yields,” “sustainability” farming, and so on. Broader social and economic ramifications are ignored, as is the fact that this initiative stems not from social-justice activism, or even from good Samaritanism, but from a familiar drive for profit.
Over the last few years, the idea has taken hold that “big data” is driving far-reaching, and typically positive, change. “How Big Data Changes the Banking Industry,” “Big Data Is Transforming Medicine,” and “How Big Data Can Improve Manufacturing,” are characteristic headlines. “Big data” has become ubiquitous, powering everything from models of climate change to the advertisements sent to Web searchers.
Even in a society in which acronyms and sound-bites pass for knowledge, this familiar formulation stands out as vacuous. It offers us a reified name rather than an explanation of what the name means. What is the phenomenon denoted by “big data”? Why and when did it emerge? How is “it” changing things? Which things, in particular, are being changed – as opposed to merely being hyped? And last, but hardly least, are these changes desirable and, if so, for whom?
Big data is usually defined as data sets that are so large and complex – both structured and unstructured – that they challenge existing forms of statistical analysis. For instance, Google alone processes more than 40,000 search queries every second, which equates to roughly 3.5 billion searches per day and 1.2 trillion per year; every minute, Facebook users post 31.25 million messages and view 2.77 million videos, while 347,222 tweets are generated; by the year 2020, 1.8 megabytes of new information is expected to be created every second for every person on the planet.
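The per-day and per-year totals follow from the per-second rate by simple arithmetic; a quick check in Python (taking the quoted figure of 40,000 queries per second as an approximate given, not an independently verified statistic) shows the numbers are internally consistent:

```python
# Sanity-check the daily and yearly search totals implied by the
# quoted rate of ~40,000 Google queries per second.
queries_per_second = 40_000  # approximate figure quoted in the text

per_day = queries_per_second * 60 * 60 * 24  # 86,400 seconds per day
per_year = per_day * 365                     # 365 days per year

print(f"per day:  {per_day / 1e9:.2f} billion")    # ~3.46 billion
print(f"per year: {per_year / 1e12:.2f} trillion") # ~1.26 trillion
```

The result (about 3.46 billion per day and 1.26 trillion per year) matches the rounded figures of 3.5 billion and 1.2 trillion cited above.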
The compounding production of data – “datafication,” in one account – is tied to proliferating arrays of digital sensors and probes, embedded in diverse arcs of practice. New means of storing, processing, and analyzing these data are the needed complement.
A quick etymological search finds that the term “big data” began to circulate during the years just before and after 2000. Its deployments then quickened; but this seemingly sharp-edged transition into what Andrejevic and Burdon call a “sensor society” actually possesses a deeper-rooted history.
The uses of statistics in prediction and control have long been entrenched, and have increased rapidly throughout the last century – as is pointed out by a working group on “Historicizing Big Data” established at the Max Planck Institute for the History of Science. The group emphasizes that big data must not be stripped out of “a Cold War political economy,” in that “many of the precursors to 21st century data sciences began as national security or military projects in the Big Science era of the 1950s and 1960s.”