[Reblogged from The Polson Institute Symposium Website]
[Written by Jaron Porciello, Associate Director for Research Data Engagement and Training, IP-CALS]
Nearly a decade ago I worked with a group of graduate students in West Africa who were collecting daily soil samples.
The small research field, overcrowded with curled and drooping maize, was far enough from their research lab and classrooms that the students drove from the lab to the field in a shared institutional car. A schedule was drawn up so each person knew when it was their day to head out into the field at daybreak, scoop small soil samples into test tubes, and bring them back to the lab, where notations were carefully filed away in a single project binder. The collection concluded after two weeks.
The data was entered into a spreadsheet for analysis, and the team found some discrepancies. Most notably, soil moisture retention seemed outside the acceptable margin of error. At first, the group shared a feeling of general disappointment in the results. Over the next few days, however, the atmosphere shifted and became tense, as some students hinted that others had made errors in sampling technique and data entry. The data was checked, re-entered, re-analyzed. The results remained the same.
A faculty member checked in with the group and asked some questions about the process. The group’s answers revealed instances of scheduling mix-ups, a time someone forgot to leave the key for the next person and had to be chased down, and illnesses among the team: all normal things that happen to any group. But the mishaps resulted in a variety of collection times ranging from pre-dawn to late afternoon, which may have placed some collections after the field had been watered. This single missing data point, the time of collection, wasn’t captured because the paper logs lacked such a column: there was only a column for “date.” The consensus was that collecting the data at irregular intervals was the likely cause of the discrepancies. Things calmed down.
Whenever I have told this story to my colleagues, the response is always the same: “What the team needed was an app! An app would have solved this!” They are not wrong; an app may have helped. But focusing solely on a technical solution to what is both a social and a technical problem is indicative of a broader systemic issue. What was once mere data was transformed into a social object. Technical errors provided justification to dig up underlying interpersonal tensions and professional disagreements: complaints that someone was sloppy or careless, or had trouble manipulating a spreadsheet. Data took on a new use, and told a bigger story about team dynamics. Material activities (such as data collection) are situated within larger organizational and social contexts (such as coordinating the schedules of those responsible for data collection). Wanda Orlikowski writes in “Sociomaterial Practices: Exploring Technology at Work” that “organizational practice is always bound with materiality,” and under this logic the social and the material are “[c]onstitutively entangled, where there is no social that is not also material, and no material that is not also social.”[i] Materiality is inherently tied to the social context in which it was created, and data as material objects are likewise bound to the social process of data production. We cannot separate them, and yet the default position is to treat data as a neutral object.
Data is a moment in time, marked by the knowability of now, and a way to divine a digital truth. In our knowledge-based economy data is our greatest natural resource, and treating data as infallible, as free from influence, bias, or error, will not help us discover a greater or ‘purer’ truth. As Steven Jackson and David Ribes brilliantly capture in “Raw Data” Is an Oxymoron: “[d]ata are ephemeral creatures that threaten to become corrupted, lost, or meaningless if not properly cared for.”[ii]
I’m sure this experience is now long-forgotten by the research team, but it had a profound impact on me, and I think about it as we bring an incredible array of stakeholders together for the Polson symposium. We have the opportunity to ask: What is missing from the conversation on science, policy, and evidence? How do we acknowledge and appreciate the sociotechnical issues inherent in data and evidence? And perhaps most importantly, if data is a shared natural resource—one that we say we want to mine, extract, exploit—how do we do so responsibly so we might create knowledge that can be used, and reused, by others?
Jaron Porciello is the Associate Director for Research Data Engagement in International Programs, College of Agriculture and Life Sciences (IP-CALS)
[i] Orlikowski, Wanda J. “Sociomaterial Practices: Exploring Technology at Work.” Organization Studies 28.9 (2007): 1435–1448.
[ii] Ribes, David, and Steven J. Jackson. “Data Bite Man: The Work of Sustaining a Long-Term Study.” “Raw Data” Is an Oxymoron (2013): 147–166.