Mary Flanagan and Eric Paulos SoCS critical (game) design notes

And, notes from Mary Flanagan and Eric Paulos’ keynote and tutorial from the second day of the SoCS workshop. Mary first.

Not all games are serious games: critical games, casual games, art games, silly games. The question here is about how play and games interact. Or, even more basic, how do we know when people (or animals) are playing? Openness, non-threatening, relative safety, …

Instead, thinking about important elements of play might be a fruitful way to help designers (“Intro to game design 101” part of the talk). So, for instance, thinking about what people are looking for in meaningful play: meaning/context, choice/inquiry, action/agency, outcomes/feedback, integration/experience, discernment/legibility. Or, thinking about ways to carve up the elements of the game itself: rules/mechanics, play/dynamics, and culture/meaning.

So, what does this mean if you want to design games with a purpose, that have social good or social commentary or social change as their goals? In particular, gamification has a connotation of making people do things, a la persuasive computing, that makes people who design games feel really awkward because play is generally voluntary. Are Girl Scout merit badges about “play”?

Buffalo: a game that juxtaposes pairs of words and requires people to give names that encompass both words (“contemporary” “robot”; “female” “scientist”; “multiracial” “superhero”; “Hispanic” “lawyer”), and the group of players to agree that the names are appropriate. It’s designed to help us reflect on stereotypes and implicit biases, and increase or broaden our “social identity complexity” (ability to belong to/respect/know about multiple social groups).

Which is important, because apparently these kinds of biases and divisions show up really young. (Though, it makes you wonder how to design games — both practically and ethically — for the very young, to help address issues like that.) And, because the biases are floating around in your social computing users and in your social computing software.

A fair number of games and studies suggest that you can have fairly dramatic effects on the biases, at least in the short term. Long-term effects are less clear, though — how would we study that?

I ran out of battery so lost most of the last bit, but one notable element was a design claim that having a more diverse team leads to more diverse games and outputs. This rang true on its face, and has some support from our own experiences writing prompts for the Pensieve project; the team was largely white rich kids from the Northeast and the prompts reflected that, in ways that sometimes made users from different backgrounds sad. So, this seems like a pretty useful nugget to take away.

Now, Eric, on design and intervention, and design vs. design research. In particular, claims about design research: it tends to focus on situations with topical/theoretical potential; it embeds designers’ judgment, value, and biases; and the results hopefully speak back to the theory and topics chosen, broadening the scope of knowledge and possibilities, as well as perhaps improving or reflecting on design processes themselves.

He’s also advocating for a more risky approach and perspective on design, with the claim that we often are good at solving well-defined problems (“good grades”) but not so good at having ideas that might help us think about tough problems (“creative thinking”). Further, the harder the problem, the less we know about it.

Like Mary, Eric is talking to some extent about critical design, speculative prototyping that calls out assumptions, possibilities, and hypotheses. There’s a general critical design meme that you’re looking for strong opinions, which doesn’t seem necessary to me (suppose I reflect on assumption X and conclude I’m okay with it), but the general goal is to look outside of the normal ways that we look at a situation.

Now we’re going through a process that Phoebe Sengers talks about for thinking about design (and research) spaces: figure out what the core metaphors and assumptions of the field are, look at what’s left out, and then try to invert the goals to focus on what’s left out and leave out what’s a core assumption. Here we’re talking about this in the context of telepresence: what would it mean to think about telepresence not for work and fidelity but for fun and experience? One bit is an interesting parallel between early casual telepresence robots and, say, FaceTime.

Our next step is to think about the core assumptions of social computing (“find friends and connect”), what’s left out (“familiar strangers”), and inverting (“designing technology that exposes and respects the familiar stranger relationship”). Again, an impact claim here, that designs like Jabberwocky inspired things like Foursquare (with evidence that this is true, which is cool).

I wonder what the ratio of ‘successful’ to ‘unsuccessful’ or ‘more interesting’ to ‘less interesting’ critical design projects is. We’re now going through a large series of designs and it makes me wonder how many other designs we’re not talking about. Of course, you can say the same thing about research papers… and in both cases it would be really nice to see a little more of the sausage being made.

The large number of designs is also reminding me of the CHI 2013 keynote. Here we’re looking to illustrate the idea that looking at other design cultures might be useful for us, but the connections between particular designs and the underlying concepts/points/ideas are often not so clear: how do we make sense of these as a group? There’s always a talk tension between wanting to talk about lots of cool stuff, making the connections between them, and managing the audience capacity, and again, this is not specific to designs (you see it in, say, job talks sometimes).

Now a discussion around the value of amateurs, DIY culture, and the idea that innovations often happen when people cross over these boundaries ($1 to Ron Burt, I think). It’s not clear that this follows from the set of designs we’ve talked about, but it’s a plausible and reasonable place and one that I try to live in a fair amount myself. There are costs to this (learning time and increased risk of failure), but I think that’s part of our game.

More heuristics around critical design, more than I expected, so kind of a giant list here:

  • Constraints (when is technology useful)
  • Questioning progress (what negative outcomes and lost elements arise)
  • Celebrate the noir/darkness (seek out unintended uses and effects)
  • Misuse technology (hacking/repurposing tech and intentionally doing things the wrong way)
  • Bend stereotypes (who are the ‘intended’ users)
  • Blend contexts (perhaps mainly physical and digital)
  • Be seamful (exploit the failure points of technology)
  • Tactful contrarianness (confidence in the value of inversions)
  • Embrace making problems (designs that cause or suggest issues, not resolve them)
  • Make it painful (versus easy or useful to use),
  • Read (not just papers), and
  • Be an amateur (a lover of what you do).

Noah Smith on NLP at SoCS

Continuing on the SoCS workshop, the afternoon session is a tutorial from Noah Smith at CMU about using NLP for socio-computational kinds of work.

Talking about how NLP people tend to make choices in models, algorithms, collection, and cleaning decisions, rather than what are “the right answers” (which are context-dependent), feels like a nice, fruitful way to discuss using NLP for socio-computational work. We’ll start with a discussion of document classification since that’s where Noah went.

Much of the story around NLP for document classification is thoughtful annotation/labeling of your data for the categories/attributes of interest. Having good justifications, theories, and research questions that lead you to create appropriate categories for the text and goals you have. And, once you get that dataset created, share it — people love useful datasets and might help you on the work.

Likewise, thinking carefully about how to transform the texts to text features–word counts, stemming, bigrams/trigrams, defining word categories (a la Linguistic Inquiry and Word Count, or LIWC)–is important and requires a thoughtful balance of intuition and justification.
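To make the feature-construction step concrete, here is a minimal Python sketch (mine, not from the tutorial) that turns a text into word counts, bigram counts, and one word-category count. The `POSITIVE_WORDS` set is a made-up stand-in for a LIWC-style category, not an actual LIWC word list, and the tokenizer is deliberately crude; both are assumptions you would replace with choices you can justify for your own data.

```python
import re
from collections import Counter

# Hypothetical word category, loosely in the spirit of LIWC; NOT a real LIWC list.
POSITIVE_WORDS = {"good", "great", "love", "happy"}


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters/apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return the n-grams (as tuples) found in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text):
    """Map a document to a sparse dictionary of feature counts."""
    tokens = tokenize(text)
    features = Counter()
    for tok in tokens:
        features["word=" + tok] += 1
    for first, second in ngrams(tokens, 2):
        features["bigram=" + first + "_" + second] += 1
    # One count per word category; add more categories as your theory suggests.
    features["category=positive"] = sum(tok in POSITIVE_WORDS for tok in tokens)
    return features


example = extract_features("I love this great, great paper")
```

Every choice in here (lowercasing, no stemming, which bigrams to keep, what counts as "positive") is exactly the kind of decision that needs the intuition-plus-justification balance described above.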

Question: what are, for NLP folks, for CSCW folks, for social science folks, the “right” or “good” ways to justify choices of category schemes, labeling, feature construction, etc.?

One answer, around choice of ML algorithm, is to say “SVM performs a little better but you need to be able to talk about probabilities, so I’ll trade off a bit of performance for other kinds of interpretability”. And, especially if you choose a linear model from features to categories, the algorithms have relatively small (and predictable) kinds of differences — perhaps more noise than is worth optimizing on, versus spending your time on other stages that require more intuition/justification/art.

Another answer is that you should pick methods that you can talk sensibly about and that your community gets: if you can’t explain it at all, or to your community, you are in a world of hurt. Practical issues around tool choice that fit your research pipeline and skills and budget also matter.

Performance is only a piece of the tradeoff — and you really want to compare it on held out data. (You can be very careful about this by taking files with your test data and making them unreadable.) Likewise, you want to compare to a reasonable baseline; at the very least, against a “predict the most common class” zero-rule baseline. You might also think about the maximum expected performance, perhaps considering inter-coder agreement as an upper bound.
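As a rough illustration of those three checks, here is a small standard-library Python sketch: a fixed-seed held-out split, a zero-rule ("predict the most common class") baseline, and Cohen's kappa as one common inter-coder agreement measure. The function names and the 20% test fraction are my own assumptions, and kappa is only one of several agreement statistics you could use.

```python
import random
from collections import Counter


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle once with a fixed seed and hold out the last chunk as test data."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[:-n_test], shuffled[-n_test:]


def zero_rule_label(train_labels):
    """Return the most common label in the training data."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(predicted, gold):
    """Fraction of items where the prediction matches the gold label."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two annotators on the same items.

    Undefined (division by zero) when both coders always use one identical label.
    """
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


labels = ["spam"] * 7 + ["ham"] * 3
train_labels, test_labels = train_test_split(labels)
baseline = zero_rule_label(train_labels)
baseline_accuracy = accuracy([baseline] * len(test_labels), test_labels)
```

Your classifier's held-out accuracy should land somewhere between `baseline_accuracy` and whatever agreement your human coders reach; if it doesn't beat the baseline, the list of things that can go wrong is the place to look.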

Performance went bad: what went wrong? Not enough data, bad labels, meaningless features, home-grown algorithms and implementations, (perhaps) the wrong algorithm, not enough experience or insight into the domain, …

Parsing for parts of speech or entity recognition is like sharing dinners. At dinner, the people around you will influence decisions on what to order. In NLP, the words nearby (and maybe some far away) might influence the classification of the words you’re looking at. The Viterbi algorithm for sequence labeling is a useful way to account for some of these dependencies.
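Here is a minimal sketch of Viterbi decoding for a toy hidden Markov model tagger, to show how neighboring words change a label. The two tags and all of the probabilities are invented for illustration (they are not from the talk or from any trained model), and a real implementation would work in log space to avoid underflow on long sentences and would handle empty input.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observations under an HMM."""
    # best[t][s] = probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s].get(observations[0], 0.0) for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the previous state that makes arriving in s most probable.
            prev, score = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score * emit_p[s].get(observations[t], 0.0)
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))


# Toy model: every number below is made up for the example.
STATES = ["NOUN", "VERB"]
START_P = {"NOUN": 0.8, "VERB": 0.2}
TRANS_P = {
    "NOUN": {"NOUN": 0.3, "VERB": 0.7},
    "VERB": {"NOUN": 0.6, "VERB": 0.4},
}
EMIT_P = {
    "NOUN": {"dogs": 0.5, "fish": 0.4, "bark": 0.1},
    "VERB": {"bark": 0.6, "fish": 0.3, "dogs": 0.1},
}

alone = viterbi(["fish"], STATES, START_P, TRANS_P, EMIT_P)
in_context = viterbi(["dogs", "fish"], STATES, START_P, TRANS_P, EMIT_P)
```

With these made-up numbers, "fish" on its own is tagged NOUN, but after "dogs" it is tagged VERB, because a noun-to-verb transition is likely enough to outweigh the word's own preference. That is the dinner-table effect in miniature.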

Noah claims that this is going to be the next big idea from NLP that makes it big in the world of computational social science, because lots of important text analysis tasks including part of speech tagging, entity recognition, and translation can be modeled pretty well as sequence labeling problems. Further, the algorithms for this kind of structured prediction are more or less generalizations of standard ML classification algorithms.

That said, there are a lot of really tough problems, especially around more semantic goals such as predicting framings, where there’s some in-progress work that is dangerous to rely on but perhaps fun to play with, including some of Noah’s own group’s work.

I’m going to not cover the clustering side, because I need a little break from typing and thinking, but hopefully this was useful/interesting for some folks.

Bonus note: can you predict if a bill will make it out of committee or a paper gets cited? Yes, at least better than chance, according to their paper.

Leysia Palen on crisis informatics at SoCS 2013

Since attempting to do a trip report after CHI was such a disaster, and since spamming Twitter with N+1 tweets about an event may be annoying to some Twitter folks, I’m going to try a kind of bloggy summary of the keynotes at SoCS 2013, the PI meeting for the socio-computational systems program at NSF.

This one is from Leysia Palen about understanding the use of social media, and ICTs more generally, in disaster response. Observations below (the first few are basically copied tweets before I realized I could do this, so they are pretty short).

— Dan

The sociology of disaster talks about the convergence of resources, information, volunteers; the “social media in disaster” question then might talk about how ICTs both add to and cut across these areas.

How to know you’ve found important online resources in disaster? Claim: people will mention them on Twitter, at least once, so if you collect Twitter, you get a pretty good sample of the world.

Leysia pointing out that SoCS/data mining proposals might need a substantial software engineering (or database) bit to support data management. This seems interesting as a way to build new collaborations and techniques in the context of the kind of work lots of social media folks are doing.

Tweets for earthquakes to get human experience as well as magnitude and to supplement location. “Did You Feel It” encourages people to report on their earthquake experiences. However, crowd tasks for disaster response can’t put people at risk: “Where’s the lava?” would be a bad app. More prosaically, broadcasting to people asking whether a road is open will lead to people converging there, possibly putting them at risk, and the agency at liability.

Now we’re looking at people posting to Twitter about disaster, more or less intentionally and annotated with metadata, in a way that would let us think of it as data. Or journalism. Can we use it in real time/for situational awareness? Can it become useful data for longer term planning and policy?

So there’s a question about how both “spontaneous” (info people create anyways as part of responding to the disaster for their own reasons) and “solicited” (agency requests for specific info; apps like Did You Feel It or the Gulf Oil Spill app) arise, what they’re useful for, how they compare, what are the ethical and legal responsibilities around them, etc.

A model: “Generative”, initial tweets create raw material and report experience and conditions, mostly from locals. “Syncretic”/synthetic tweets use existing material to build out insight (example: looking at flood reports to predict future flooding). “Derivative” material, retweets and URL posting, is a kind of filter/recommender system, one that really helps deal with the bad actors/bad info problem because the useless stuff tends not to be generated. (And in particular, “official” material tends to be retweeted more often than other material.)

Talking about distance from and details of disaster experiences reminds me some of the opinion spam detection work from Myle Ott, Yejin Choi, Claire Cardie, Jeff Hancock at Cornell:

The part of the talk about the recent Haitian disaster, and how online digital volunteers/digital activists built major bits of useful information infrastructure, including Haiti’s first really good maps (period), applications of them to the earthquake, and other infrastructure for distributing good, was cool. There’s a huge question, though, about how and when these things work, and why. Further, part of the story was that people had been gathering before the earthquake with a goal of doing good, building capacity that was then ready to be turned toward an event such as the earthquake.

So, how do formal/existing organizations manage and work with social media? It’s hard, because on the ground they are so busy responding that social media is just not that important (not unlike, but more serious than, professors never updating their webpages because there is always something more important to do). Apart from questions of liability, and practicality, there’s also just a question of voice: what are they trying to accomplish and to convey by being online? One strategy they use is to correct misinformation, and allow likely-good information to go by (but not endorse it, for the liability reasons).

It turns out that it’s hard to go the other way, too: it was hard for social media researchers just to _find_ the accounts and locations of, e.g., police and fire departments that responded to Sandy, in order to look at their communication with the public. And, there’s a whole parallel discussion about internal use of social media in (and between) these organizations.

An interesting audience question: how do we measure the impact of social media activity and content on the ground? Answer: not well, pretty hard to do this. I guess you could look at retweets/likes, and someone else suggested using a summarization tool to help suss through tweets to find ones that are representative/important/meaningful.

I wonder if it would be useful to look at parallels between microlending dynamics and disaster response dynamics. It’s clearly not a perfect parallel, but some of the social dynamics around convergence might be interesting to poke at. Likewise, the general crowdsourcing literature and infrastructure might be a fun connection.

Tenure: writing and thinking about service

tl/dr: You may need to write a service statement for tenure. It looks like effective ones use specific evidence to talk about your goals, accomplishments, and plans around service in your professional, university, and broader community.

In our last episode, we were talking about tenure in general, and since then much of my energy has gone into CSCW reviewing. Perhaps it’s only fitting, then, that one thing I’ve found I need for the tenure package is a “service statement” [0]. The official Cornell guidance is delightfully terse: the dossier should include “statements from the candidate about his/her research, teaching, advising, service, and (if applicable) extension.” [1].

So I went off searching and asking for advice and examples [2]. Here I’ll give a few tidbits from that quest that are hopefully useful to other folks who are writing or thinking about service, followed by a mostly-reasonable draft of my own statement (comments welcome!).

Jon Kleinberg, who’s co-department chair and who handles tenure for information science, amplified a bit: “service: both in your research community outside Cornell — e.g. program committees and similar things — and also inside Cornell — things like serving on committees locally”. That was a nice, useful structuring suggestion, and I covered them separately (though Phoebe Sengers did a nice job of integrating both around a discussion of her overall service goals). I added a broader community aspect as well; in my case, I argued that this was primarily through software artifacts and public service via NSF reviews.

The Vet School has a little more guidance on what, specifically, to talk about: “The statement should document the quality and relevance of the clinical service and will include accomplishments, self-evaluation, steps taken to improve service, and future plans.” Combined with Jon’s advice, this basically set my overall structure.

The “steps taken to improve service” part of the vet school guidelines also reminded me that it might be useful to talk about specific training I’ve engaged in, whether it’s on the CV or not: a diversity workshop for hiring service; ed tech workshops for teaching; NSF funding workshops for research. Showing that you’re working to improve and setting yourself up for future success seems like an important goal, since part of promotion is about future potential.

That said, you also need to demonstrate current competence. Tanzeem Choudhury’s service statement did a nice job of making specific claims around service impact backed up with specific examples from her service activities, which I largely emulated. Phoebe talked in detail about how she participated in and organized service activities [3] that served her goals and talked about tangible outcomes, which also felt strong.

There’s surely more to the story, but based on my poking at this I’ll call out the following bits as useful to think about.

  • It was handy to think about the different kinds of service: service to the professional community, the university (department, college, and university level), and the broader community.
  • You can tell the story around service using goals and rationales, the activities you’ve engaged in (both service itself and prep/training work), the accomplishments and outcomes that have come from them, and the plans you have going forward.
  • Finally, you should think about specific evidence you can marshal both to link the elements above and to demonstrate that you have significant “excellence and potential”, to use the words from the tenure policy. Thinking about your noteworthy and distinctive goals, activities, accomplishments, and plans has value.

And, when I say “useful” or “has value”, I don’t just mean accomplishing an administrative busywork task. (a) It’s not just busywork: the tenure committee really does need you to help them think about this. (b) It’s probably worth spending a little time reflecting on what you do and why for service, how it shapes you and you shape it, and the cost-benefit story around it.

Hope this was useful, would be happy to get some comments on my statement, and have a good weekend.

— Dan

[0] Those of you waiting for a continuation of the CHI trip report… don’t hold your breath. Now I understand why these are less common than they used to be; almost everything I do is higher priority.

[1] Cornell is in part a public, land-grant university with an explicit public mission for several of its colleges; “extension” is a term often used to refer to those aspects.

[2] To be fair, Cornell does organize workshops that talk about tenure, tenure dossiers, and related topics in ways that have been useful. Your institution may have similar things; consider going earlier rather than later in your career, giving you more time to act on the advice.

[3] Phoebe also spent less time in the statement on standard kinds of conference-level reviewing and organization service, and more on activities focused on her “invisible college” both within CHI and across disciplines. I don’t do as much of that kind of work, but now as I write this, it occurs to me that talking about my work and involvement with the Consortium for the Science of Sociotechnical Systems (CSST) might be useful. (Since I want to publish the draft more like now, I’m going to publish it with an “insert CSST story here” bit.)


Service statement

In this statement I’ll talk about how I’ve addressed the service expectations for assistant professors, first covering service to my professional community, then to the university community, and finally to the community at large. In each case I’ll talk about my current activities and, to the extent I can predict them, future plans and goals.

Professional service

As an assistant professor, my primary service focus has been toward my professional community. This choice was based on both practical and moral considerations. From a practical point of view, one way junior researchers come onto the radar of more senior members of the community is through interacting with them as high-quality reviewers, program committee members, and conference organizers. It also is valuable for Cornell for its members to be seen as effective contributors to and leaders of their professional communities. From a moral point of view, service to the professional community is important and impactful. Submitting and publishing consumes resources and it is only right to give back through reviewing and organizing. Further, reviewers and organizers have real influence on the conversation of research in a discipline. Good service increases the quality of published research, which is both an academic and a societal good; those who serve also have their voices heard in shaping the directions and methods of a field.

Thus, I have invested major effort in professional service, as documented in my CV. I have a long history of reviewing for most of the major conferences and many of the journals related to my professional interests. Starting in 2009 I began serving on program committees, and have served on the CSCW and CHI committees a number of times. (I’ve also served on many other conference ‘program committees’, though junior members of these committees are mostly reviewers; these include RecSys, WWW, IUI, and UMAP.) I also started in conference organization roles for relevant social computing conferences in 2009, including co-chair for videos, demos, and doctoral colloquia. My work in these roles led to me being named the technical chair for WikiSym in 2012, and I was recently chosen to be the general co-chair for CSCW, a leading social computing conference, in 2015. This gradual escalation in responsibilities and roles gives evidence that I am seen as a valuable, important member of my professional community.

(Insert CSST paragraph here in next draft).

My plan here is largely to keep doing what I’m doing. I will need to be a little more strategic in reviewing (though I now use some review assignments to mentor PhD students in reviewing) and I will need to take breaks from outside service to support university work. But on balance I have done well here.

University service

For university service, I have focused primarily on service within my department, both to demonstrate my value as a department member and because department-level service is more aligned and appropriate with the experience and qualifications of assistant professors. Further, my department has been very good about limiting my university service duties so that I could focus on the professional service described above.

Still, I have done a number of things for Information Science, CIS, and Cornell. At the department level I’ve served in a number of committee roles, including the graduate admissions, curriculum, and faculty recruiting committees; serving in these roles has given me useful experience for leading these committees as an associate professor. I also organized a professionalization seminar series for early-career PhD students and managed the department colloquium for two years. At the college level, I’ve represented the department at college-level events including BOOM (Bits on Our Minds, our undergraduate research showcase) and Cornell Days (orientation for incoming undergrads); I also served on the committee overseeing the transition in computing facilities for the department. For the university, I’ve done several one-off committees and panels that leverage my expertise, including a successful panel on how academics can leverage social media, working with the social media hub portion of the Tech Campus initiative, reviewing for the Institute for Social Sciences grant program, and participating in the Cornell Moodle courseware pilot.

Here, I expect to take a much larger role as an associate professor, chairing committees for the department and participating in standing college and university-level committees. For the department, I’m looking forward to being the director of either undergrad or graduate studies once I return from sabbatical; either would give me a chance to turn some of my service energy directly toward students in a way necessary for the department and rewarding for both the students and me. At the university level, I am hoping to find committees that leverage my knowledge of social media, technology, and education; working with the development of academic technology and MOOCs would be a natural fit.

Service to the broader community

My main contributions to the broader community are through developing software artifacts as part of my research that are both themselves used and that have influenced other systems. SuggestBot, a Wikipedia tool that helps people find articles to edit that need work and that are related to their interests, has been in continuous use for six years and has made hundreds of thousands of recommendations to thousands of editors. Pensieve, which supports reminiscing and reflection by reminding people about meaningful content they have created in social media, still has an active user community after four years, and has influenced the design of related tools such as Timehop (whose senior engineer Jon Baxter was a student lead for Pensieve). RegulationRoom, an online community that encourages citizens to participate in federal rulemaking processes, has influenced a number of socially relevant regulations around air passenger rights, home mortgage consumer protection, and distracted driving. I also serve both the broader community and my professional community through regular reviewing for NSF proposals.

My main plan here is to continue to work on socially relevant projects. I’m building relationships with companies, particularly Google and Facebook, looking to define questions that have both research depth and potential impact on products used worldwide. My work with Amit Sharma on recommender systems designed for social networks and social interaction (rather than for individual consumption), with Victoria Schwanda on using social media platforms to deliver positive psychology interventions, with Bin Xu on leveraging social media data to support relationship-building both offline and online, and with Liz Murnane on better models and techniques for motivating people to volunteer and participate in activities for social good all have real potential value beyond the research community.

I am also considering two broader service activities that would have both social and personal benefit. One would be to invest some of my energy in the new tech campus. Building the infrastructure to help train a next generation of innovators and entrepreneurs would have lasting social benefit, while gaining more knowledge and experience in this area would make me a better advisor for students in the long term. The other broadening activity would be a rotation as an NSF program officer. With the current rate of hiring in information science and the progress of the set of students I am working with, a rotation there in two or three years might be excellent timing. Like the tech campus, this would produce both social good, through helping manage and shape national priorities around research, and personal/professional good, through a better understanding of the funding landscape and through interacting with reviewers and other NSF officers.

Thinking about tenure

As an assistant professor entering the summer before my sixth year, I’m spending non-trivial effort on putting together my tenure package. For those who haven’t had this experience, it involves a number of things. One is a full-on assault on my CV: papers published, talks given, awards garnered, funding gotten and sought, and service rendered to the department, university, and broader academic community.

Another is a “research statement” that tells stories about what I’ve done, how it fits together, why it’s important, and how it’s affected the world [1]. Along with this, you send in a list of names of potential “letter writers”, who testify to how you’ve influenced the research community [2].

A third part is a “teaching statement” about my philosophy of teaching and the results I’ve gotten [3]. This pairs nicely with an “advising statement” that talks about how I work with PhD students and undergrads, both on research and in more general career (and occasionally, life [4]) advice.

There are probably other parts that I’ll find out about along the way, but these are the main ones you hear about. And it’s daunting, for a number of reasons. First is the uncertainty, about the process, the criteria, and the outcome [5]. I feel fortunate to be at Cornell, because even if I don’t get tenure I expect to have options [6]–but I’d still rather get it! Cornell has done a fair amount of work to make the process and criteria transparent, but it’s still a little nervous-making.

Second is that it forces you to confront the question of whether you’re doing good and valuable work. You’re reflecting on your career as a whole, which is much different than the more situated way I suspect most of us approach the work [7]. Who have you helped? Hurt? What does it mean? What’s next? It’s good to face these questions every so often–but they can still be scary.

Third is that you are working with incomplete information. Some of this is because much of the power rests in other people, and you only sort of know how others see you [8]. More practically, you probably didn’t record everything along the way, either about the activities or why you did them, so you’ll be spending some time grinding through email, filesystems, and neurons trying to dredge it back out [9].

There’s lots more to say about this, and I plan to come back to the blog as I do some of these activities to talk about them in ways that hopefully help other folks down the road. But for now let’s leave it here, while I go off and ponder my teaching statement some more.


[1] Someday we’ll have a nice blog post about reasons to do research even though most of it does not have such impact.

[2] Or, call bullshit on your research statement. Or just decline to write a letter, which is apparently not a good sign if too many people do so.

[3] Technically, I should be writing this right now; this post is total procrastination.

[4] Though it’s not obvious that I’m qualified for this.

[5] Tenure is always uncertain, and everyone needs a backup plan. Cliff Lampe’s plan was goat farming, but he recently got tenure (which means, I guess, he could just do it anyways). Following in my parents’ footsteps, mine is to be a truck driver. I like driving, I like trucks, and I like stuff. It’s perfect.

[6] My original PhD plan was to teach at a liberal arts school; I still have my statement of purpose for PhD apps floating around. It’s a little cringeworthy, which means that I’m sure to post it someday.

[7] Frantic CSCW submitters from yesterday, I salute you.

[8] A common piece of tenure advice is to clarify this through giving talks at other schools and asking either indirectly or explicitly about tenure and letter-writing. This is sometimes called the “tenure tour”.

[9] Though, this has been surprisingly rewarding as a way of reminiscing about people and events.