Corpus and Repository of Writing

C&W 2017 Workshop

We’re excited to be offering a full-day workshop at Computers & Writing 2017 — sharing some of the lessons we’ve learned from our highly nerdly inter-institutional collaborative research project. Here’s the abstract, and after the jump, the full description of the workshop. (Attending? See the workshop materials.)

Structuring Active Work: Developing Sustainable Digital Infrastructures for Collaborative Research Teams

As writing is increasingly performed in online shared spaces, and humanities research becomes more dependent on external funding, collaborative work is more important than ever. Although collaborative teaching and learning are nearly ubiquitous, scholarship in Computers & Writing speaks more to classrooms than our own research and professional development. This full-day workshop supports sustainable research in our field by helping research teams learn to communicate and collaborate in a manner which both supports joint decision-making and sustains long-term research. We share lessons learned from our interdisciplinary, inter-institutional research project focused on research and professional development in writing instruction. Through intensive participant-facilitator collaboration, we offer attendees opportunities to gain experience using digital tools, redirect communication breakdowns productively, build frameworks for scaffolding active work, and network with researchers similarly interested in helping writing research become more sustainable, efficient, and effective.


Attendees of this full-day workshop will study models for collaborative research teams, learn best practices for digital collaboration tools, and build a framework for their future collaborations including goals for sustainable research.

Interested? Register today, or keep reading!


Crow Arizona

Shelley Flies South; Wildcats turn Wildcrows

Shelley Staples has moved from Purdue to the University of Arizona to head up our little flock of AZ Crowbirds. She is joined by four Wildcats who turned Wildcrow to start up the Arizona branch of the corpus project. Samantha Kirby and Olga Chumakova are graduate students in the University of Arizona’s English Applied Linguistics program. Kati Juhlin and Justin Squires are undergraduate students in the Linguistics department.  

Teaming Up

This is the first blog post from the Arizona team, and we’re excited to share what we’ve been working on.

Samantha Kirby’s interests have always been with digital technology and linguistics, and in the CROW lab she found quite a few venues to apply her talents. Olga Chumakova had some past experience with building a small corpus on paper and in a box for research she did, so the opportunity to participate in a grown-up and serious corpus project felt like an exciting way to learn. Kati Juhlin can’t think of a better way to transition from an undergraduate linguistics degree to a graduate English Applied Linguistics program than the CROW lab, where she can geek out about grammar and get a crash course on corpus linguistics research at the same time. Justin looks to make his way to Japan as an English language teacher after graduation and has found exposure to copious amounts of L2 writing and a language-teaching-oriented lab team to be very informative about what the inner workings of the field may look like.

Corpus Building

Last semester we collected student essays from several sections of English 106 and 108, freshman composition courses for international students. Now we’re working forward to collect content from even more instructors and students at the end of this semester.

Our biggest task has been processing these texts. Right off the bat we ran into some snags. Most notable is the incorporation of multimedia in students’ writing. We love that teachers are encouraging their students to make Weebly pages, Facebook posts, and tweets to demonstrate different registers of writing, but it makes our task harder as we decide how to convert these creative works into .txt files in a consistent way. It has taken some trial and error, but we’re working on establishing guidelines to do it. We are excited to announce that the end of text processing is in sight and we are about to start de-identifying our data!

A recent informal discussion among the Arizona Crow members reminded us that we are not representing all of the multilingual writers at Arizona, because some of them decide to take writing classes with first language (L1) English/monolingual speakers. Adapting to a new writing environment might encourage multilingual writers to work twice as hard as some L1 English writers, with the result being stronger, more grammatically complex writing. Don’t get lazy, all you L1 English writers!


Samantha and Olga have also been putting together a workshop for the instructors of the UA Writing Program in April, and another for faculty and grad students who may be interested in using our corpus for research. The idea for the first workshop is to help raise language use awareness among English instructors, and show how corpus linguistics can inform curricula and classroom activities. For the second one we plan to get even more people excited about using PSLW and ASLW for their projects. We think possible research can include not only studies of grammatical structures and learners’ use of language features and conventions, but also identity research, because our corpus contains reflective essays and narratives.


Modern technologies keep the crow-birds in touch. Olga, Justin, and Kati have been getting familiar with the tag checking process and establishing inter-rater reliability with Ashley and Ji-Young from the Purdue team via Skype. We’re learning all about passive voice, and about complement and relative clauses. The word “that” is now all we hear when someone speaks or writes. And be warned: Skype says no to two calls on two computers from the same account!

Finally, Samantha will be working with Mark Fullmer on the interface for the online corpus. These two will combine their digital expertise to ensure successful usability and database construction.


Olga, Samantha, and Shelley joined the rest of the Crow team in Seattle for the TESOL conference. Stay tuned for an update on our presentations there!




Crow poster at AAAL 2017

Next! After our two CCCC talks, the Crow team presented this poster (PDF) about our work in the Sunday afternoon poster session at AAAL 2017.

Crow AAAL 2017 Poster; link to PDF

Corpus and computational collaboration at CCCC 2017

Today the Crow team presented “Cultivating Writing Research via Corpus and Computational Collaboration,” featuring talks from Lindsey Macdonald, Shelley Staples, Bill Hart-Davidson, and Ryan Omizo:

In March 2017, CCCC will be joined in Portland by AAAL, the conference of the American Association for Applied Linguistics. We take this opportunity to highlight the value of collaboration between researchers who will be attending one, but likely not both, of these conferences, and unfortunately, crossing paths in few ways. The corpus linguistics methods common in applied linguistics can bring quantitative elements to empirical research in rhetoric and composition, including attention to demographic issues and diverse genres. Rhetorical research, conversely, offers corpus researchers valuable insights into extra-textual features and contextual influences. This panel explores possibilities for collaborative writing research by demonstrating the value of this interdisciplinary work. We offer an overview of the benefits of corpus and computational methods, then present case studies of two projects which integrate computational methods and corpus linguistics with rhetoric and composition. We conclude with a brief panel discussion of takeaways for interdisciplinary collaboration, then invite conversation.

Here’s our session handout.

We’d love to hear your comments!


Promoting RAD writing research at CCCC 2017

Michelle McMullin, Terrence Wang, Bradley Dilger, and Shelley Staples are presenting “Promoting RAD writing research through inter-institutional collaboration” at CCCC. Here are the notes, slides, and references from our A10 presentation (March 16, 2017).

In this presentation, we describe how research designed as inter-institutional from its inception has embedded attention to diverse research outcomes, the development of sustainable infrastructures, and the lifecycle model of scalable user-centered development. Our project brings the methods of corpus linguistics to rhetoric and composition, and vice-versa, creating a web-based archive for research and professional development. By embedding an interdisciplinary approach to collaboration from the start, we have developed a project that considers the strengths and contributions of each partner for an effective collaboration model that best serves the needs of all stakeholders.

Promoting RAD Writing Research conference handout (PDF).


Conference Season has Begun! PLCC 2017

The Crow team had a great time presenting at the Purdue Languages and Cultures Conference (PLCC 2017). One of our presentations was standing room only!

Here is a link to the handout for “Building a Better Team”

Look for a longer blog entry soon as we prepare to travel out west for our March SPACCLE!

Crow supports refugees, immigrants, and international students

The past three months have been a very exciting time for Crow. We’ve received good news about grant funding which we are eager to share once we finalize the paperwork with our sponsors. We’ve added new collaborators who are bringing energy and new perspectives to our work. And we’ve been invited to share our research at the Purdue Languages & Cultures Conference and Computers & Writing 2017, in addition to our upcoming presentations at CCCC, AAAL, and TESOL.

From the start, Crow has always been driven by and for graduate students from all over the world. Our work builds on two projects, COIN and PSLW, both started at Purdue. We’re proud to include researchers from diverse countries in our team — including Poland, China, South Korea, Russia, Turkey, Lebanon, and the United States. The texts in Crow are written by students from these countries and more. We’re committed to working with researchers internationally and turning to them to make Crow broadly useful.

For these reasons, we’re both sad and angry about the hateful attacks the Trump administration is making on immigration, wrongly singling out the Muslim faith, and carelessly harming people who have lawfully come to make the United States their home. Most of all, it is wrong to cast aside refugees, hopeful to escape hatred and war, who have patiently shown they are not worthy of fear or aspersion. These are our friends, students, colleagues, and neighbors. It is wrong to fear them for being different. To put these and other human beings in danger is to reject the freedoms supposedly being protected by this gross over-reaction. We wonder who will be the next targets?

Lately we’ve started referring to our team as a family. Now our family is being threatened. We are better people for having worked with Hadi, Ola, Jie, Zhaozhe, Ji-Young, Olga, Beril, and Ge — and too many others to list here. Our hearts are fuller, our research is better, and our project stronger because of them. We are grateful for the responses from our universities. Like them, we will resist these attempts to undo the good work our students are doing. We will protect their interests and affirm their rights to be treated with the respect, dignity, grace, and kindness they show others.

Bradley Dilger, Shelley Staples, and William Hart-Davidson


Crow’s Trips to the Southwest and Midwest: SSLW and AACL 2016

On October 21st, 2016, five proud Crow members, Professor Shelley Staples, doctoral student Hadi Banat, Aleksandra Swatek, Ashley Velázquez, and Zhaozhe Wang, flew with Crow to the wild West and stretched their wings in the lively college town of Tempe, Arizona. After an entire year’s hard work dedicated to this now fully fledged interdisciplinary project, we were honored to have the opportunity to represent our team and introduce our project to an enthusiastic audience at this year’s Symposium on Second Language Writing (SSLW).

Terrence presenting while others look on

Not so long before that, on September 16th, another group of Crow members, Ge Lan and Jie Gao, along with SLS doctoral student Ji-young Shin, had presented an empirical study of students’ use of reporting verbs derived from the Crow project. This talk was part of the annual conference of the American Association for Corpus Linguistics (AACL) in the equally beautiful college town of Ames, Iowa, and successfully attracted yet another audience.

These presentations were the best birthday gifts for our one-year-old baby Crow, and anniversary gifts for the intellectual relationship between team members from two disciplinary fields—rhetoric & composition and second language studies.

So far, Crow has positioned itself at the intersection of audiences from diverse, even vastly epistemologically and methodologically different, disciplines, including rhetoric & composition, corpus linguistics, applied linguistics, technical communication, and second language writing. And the list will continue to grow as we attend more conferences and share our research in journals.

All of us Crow members have been highly committed and have contributed lots of time and energy to its growth. Yet this growth is marked not solely by how many audiences we have reached, but by the professional growth of every individual Crow member as well. We have been able to take as much as, or even more than, what we have given. Since we have recently showcased the progress of our project at two important academic conferences, I think it’s a perfect time to look back as we are taking small but steady leaps forward. And it’s a perfect time to hear from some of the Crow members about their experiences at the two conferences. So I invited Hadi Banat, Jie Gao, Ge Lan, Aleksandra Swatek, and Ashley Velázquez, who presented at either or both of the conferences, to share their reflections with us by responding to several questions. Let’s hear what they have to say.

Q1: What were your experiences like in general at either/both of the two conferences?

Jie: I attended AACL this September at Iowa State. I was really impressed by the pre-conference workshop held by Dr. Stefan Gries, which introduced the application of R software in corpus linguistic research.

Portrait: Terrence, Hadi, Ola, and Shelley at AACL

Hadi: My experience at SSLW was crucial in terms of positioning myself in the discourse community of second language writing scholars. I felt it was a great opportunity to interact with professionals in my field and learn about the new trends of research.

Ge: For AACL and SSLW, I did see people talking about their research from different angles, which helped me broaden my research perspectives. Both conferences offered various topics to cater for different audiences.

Aleksandra: I only attended SSLW, but I felt like I was among academics who have very similar interests to mine. It’s an amazing experience to be able to interact with people who are as interested in second language writing as I am. Such conferences give me motivation to work even more and to contribute to the field.

Ashley: As always, I enjoyed SSLW. It’s a wonderful conference in terms of both scholarship and networking and catching up with old and new pals.

Q2: What’s your biggest takeaway from either/both of the two conferences?

Jie: The conference was eye-widening with the potential topics corpus research could cover. I remember one presentation, “Worldbuilder: A Tool for Text World Analysis,” which investigated Worldbuilder as a tool for text-world analysis and visualization. The presentation showed how the tool was used to produce visualizations of the annotated data in the form of text-world diagrams based on the criminal case of Amanda Knox. The murder trial, lasting from 2007 to 2015, aroused attention worldwide, and Worldbuilder helped prove that the translations of Amanda’s three statements were inauthentic to the original ones. Interestingly, Netflix presented a documentary based on the case just after AACL ended. The presentation showed how corpus linguistics can be applied to forensic issues.

Hadi: On a professional level, I felt assured that the field of second language writing is expanding and flourishing. The new strands of presentations that were not in previous conferences reflect the variety of topics and research trends of interest. A dominant focus on graduate second language writers in US institutions was an interesting observation because it reflected the diverse professional opportunities and needs of the job market. I felt that the balance between plenaries, colloquiums, and concurrent sessions contributed to my learning experience due to a mix between talks, workshops and presentations.

On a social level, I believe I got closer to my cohort of SLS PhD students at Purdue because we had plenty of opportunities to interact and learn more about each other. Through our conversations, I noticed mutual compassion, respect and admiration, which made me feel psychologically better about my choice of Purdue as home for my PhD studies. I also had a chance to meet SLS Purdue alumni, who are doing very well in their careers in the US, China and Japan. They were accessible, welcoming, cooperative and engaging. I made new connections and friendships that I want to pursue.

Ge: For AACL, I realized the importance of programming skills in corpus linguistics and for SSLW, people gave me and Zhaozhe a lot of critical feedback on our project, so we can revise the methods and results.

Slide from expertise presentation at SSLW

Ashley: My biggest takeaway, honestly, is that we’re emerging in some way. Often, as a graduate student, I tend to forget that the scholars I’ve propped up on a pedestal were once PhD students such as myself; they, too, had no idea what they were doing and had to fake it till they made it. The theme for SSLW this year emphasized expertise; and what I’ve learned about expertise is that while some may revere others as an “expert” we’re all just learning as we go.

Q3: Could you say a little bit about your experience presenting on projects related to Crow? What did you present on? Were they well-received by the audience? Did the audience provide any constructive feedback that we as Crow members may draw on?

Jie: I co-presented with Ge Lan and Ji-young Shin on the reporting verb project based on the pedagogical use of PSLW. We clarified why inferential statistics have not been used, and the audience members were interested in the future research potential in data from the control group.

Hadi: Our Crow presentation was well-received. Although it was early in the morning, we had a great audience who were listening attentively and taking notes. Paul Kei Matsuda was among the audience, which was intriguing. We did not have time for many questions, but through our brief conversations with the audience there was an interest in an interdisciplinary project that invites collaboration across institutions. One of the interesting questions we received was, “Are you considering expanding the corpus and repository to include writing samples and pedagogical materials from other classes?” Dr. Staples expressed a common interest with Dr. Dilger to include samples and materials related to writing in the disciplines in the future after developing the online interface and finishing the current phase of the project.

Ge: For AACL, I co-presented with Wendy and Ji-young on the reporting verb project. People encouraged us to do statistical analysis to further strengthen our results. For SSLW, I co-presented with Zhaozhe, and people gave us a lot of constructive feedback on how we can revise the method and how we can interpret our results more clearly.

Ashley: At SSLW I presented with Scott, Ji-young, and Shelley on our reporting verb project. It seemed that our project was well received by the audience. One thing I do remember was our slip-up of using “significant” as a descriptor.

Q4: How, and to what extent, has your engagement with Crow shaped your professional development trajectory in terms of professional conferences?

From left: Rodrigo Rodriguez, Douglas Biber (Regents’ Professor, Northern Arizona University), Ji-young Shin, Terrence Zhaozhe Wang, Ge Lan, Wendy Jie Gao, and Shelley Staples at AACL.

Jie: Crow inspired me to chew more on how to carry out quantitative analysis in second language studies. I have also been spurred to think more about the relationship between teaching and research, or how to keep a balance between the two.

Hadi: Being part of the Crow team offered me valuable opportunities to present at flagship conferences with my colleagues. My involvement in Crow is a learning experience because it is adding to my knowledge, which facilitates having scholarly conversations with people in different disciplines pursuing different research projects.

Ge: When preparing the conference PowerPoint, I learned a lot from my co-presenters, not only from their academic knowledge but also from their carefulness and patience.

Ashley: Honestly, it’s been the connections I’ve made and the knowledge I’ve gained from the mentorship that Crow offers us. I’m more confident in my identity as a researcher and emerging scholar since being involved with Crow.

As our team members have nicely reflected, they all feel grateful for the great professional development opportunities Crow has provided, and they all feel proud to be part of this initiative. Despite the fact that they have varying research interests, methodological orientations, and levels of engagement, they have all gained something that’s personally fulfilling from Crow. That’s the beauty of it. And that’s what’s going to drive us to fly higher and farther.


As the leaves change: Crow updates

As the Fall semester progresses, the Crow team has been hard at work submitting grants, attending conferences, and working on prototyping.

Welcome new Crowbirds!

We have welcomed three new Crowbirds onto our team: Tony Bushner, Ashley Velázquez, and Bill Hart-Davidson. Tony has joined to work on various development and prototyping tasks as Crow starts designing database structures. Ashley’s focus is in Second Language Studies. She has been helping with the development of PSLW and with research in the citation project. Bill Hart-Davidson is the most recent addition to our team. He will serve as the Project Coordinator for Michigan State University. He will lead the user-experience research and development team for this project to ensure its usability and usefulness.


The grants team has been working their way through their list of grants from the beginning of the semester. We have submitted our CLA Enhancing Humanities grant, and are waiting to hear back. We planned on using this grant for travel expenses to the Computers and Writing conference, and to fund some of our graduate researchers during the Spring and Summer terms.

The Humanities Without Walls Changing Climate Initiative grant has also been submitted. We are waiting to hear back on this one as well. We’re excited about the grad lab practicum component, since Crow is such a good fit.

I recently completed, with the help of Bradley, the ASPIRE grant. We were approved for $1,500! This money will help fund travel expenses for our graduate students and staff to present at three upcoming conferences. Shelley Staples also learned that her University of Arizona Small Faculty Grant, to begin the development of a corpus at Arizona, was funded at $3,000.

Hadi, Bradley and I are still working on the final drafts of our CLA Non-Laboratory grant to help us revamp our resource room. We should have it submitted soon! Our next big grant that we will be working on is the American Council of Learned Societies Digital Extension Grant. A list of completed and awarded grants is on our web site.


A few of our Crowbirds, Shelley Staples, Ola Swatek, Terrence Wang, and Hadi Banat, just returned last week from the Symposium on Second Language Writing (SSLW) at Arizona State University. A recap of the event will be available soon.


Things have been slow but steady with prototyping for Crow. Recently, we revisited our environmental scans and the goals for the project in an effort to explore a variety of new platforms which could host the project. Also, we have continued our effort to find resources within our institutions to help support the site development. In the coming weeks, we will move to creating a closed test site using existing materials in our possession. We will soon help develop recruitment materials and protocol procedures to gather more texts for the corpus in December.

The corpus

We will begin recruitment for our corpus soon. We will be calling on English 106 and 106I instructors here at Purdue to submit their student papers and pedagogical materials for us to examine and add to our corpus. A group of researchers are working together to develop recruitment processes which support prototyping efforts. We’re expecting to cross eight million words with the PSLW corpus this fall.

TALC 2016

Ola, Shelley and Hadi traveled from Warsaw, Chicago and Beirut and met in the beautiful German town of Giessen to present @TaLCGiessen. At the center of this vibrant town is Justus Liebig University, which hosted the conference.

We got to explore the many possibilities of corpus research through the pre-conference workshops and a number of interesting presentations. Three presentations in particular related closely to Crow’s work on using the corpus and repository for teaching, or data-driven learning (DDL).

First, the plenary by Marcus Callies focused on the corpus literacy of teachers. He emphasized the need for corpus training in applied linguistics programs and the divide that still exists between novice and experienced teachers in their use of corpora in the classroom. In reflecting on this talk, though, we discussed that a major reason for the divide is that corpus builders can do more to make corpora teacher-friendly, as well as to develop ready-made materials that instructors can take into their classrooms. This is part of what we are trying to do with Crow.

Tom Cobb and Alex Boulton presented a meta-analysis of research on the use of corpus tools for classroom teaching. They found that, despite concerns by many teachers and scholars, use of corpora in the classroom was effective even for intermediate language learners. This aligns with the classroom-based research project conducted by the Crow team.

In her presentation “All tooled up: Corpus-assisted editing for academic writers”, Maggie Charles talked about using AntConc (its Keyword List, Concordance Plot, and N-grams tools) for the purposes of editing graduate writing (thesis or dissertation). In her work with students at the University of Oxford, she teaches how to use corpus tools to compare the terminology used in research articles with the students’ own writing. The students create a keyword list based on research articles and compare it to the keywords in their own graduate writing. This helps the students notice differences in the use of terminology and discourse markers so they can revise their writing. In conversations with Maggie Charles, we were also able to get feedback on the semantic and functional coding schemes we’re employing in our research on reporting verbs. This was especially useful since our study builds off of her 2006 paper on reporting verb use in master’s theses.
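The keyword comparison described in this talk can be sketched in a few lines of Python. This is only a minimal illustration, not AntConc’s implementation: the function names (`tokenize`, `log_likelihood`, `keywords`) and the sample texts are hypothetical, and log-likelihood is assumed here as the keyness measure because the talk summary does not name one.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def log_likelihood(a, b, total_a, total_b):
    """Log-likelihood keyness for a word seen a times in A and b times in B."""
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_a)
    if b > 0:
        score += b * math.log(b / expected_b)
    return 2 * score


def keywords(target_text, reference_text, top_n=10):
    """Return (word, score) pairs for words relatively more frequent in target_text."""
    target = Counter(tokenize(target_text))
    reference = Counter(tokenize(reference_text))
    total_t = sum(target.values())
    total_r = sum(reference.values())
    if total_t == 0 or total_r == 0:
        return []  # nothing to compare
    scored = []
    for word, count in target.items():
        ref_count = reference.get(word, 0)
        # Keep only words that are relatively more frequent in the target text.
        if count / total_t > ref_count / total_r:
            scored.append((word, log_likelihood(count, ref_count, total_t, total_r)))
    # Highest keyness first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical sample texts standing in for research articles and a student draft.
articles = "the corpus shows that the authors argue that the results suggest"
draft = "i think the results are good and i think the results are nice"
for word, score in keywords(articles, draft):
    print(word, round(score, 2))
```

Calling `keywords(articles, draft)` lists words the articles use relatively more often than the draft; swapping the arguments lists words the draft overuses, which is the side of the comparison a writer would revise.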


The conference team made our experience smooth, and after every eventful day at the conference there was either a tour, a dinner, a social or a get-together. Every new TALC day was another opportunity to meet researchers from Europe and the United States in order to discuss different projects in a multitude of academic contexts. We learned that funding is an important driving force behind the sustainability of big research projects started in Europe. Since funding is competitive in the United States, we started thinking more about bigger grants and possible international collaboration.


Our German conference experience was not restricted to academics; we got to experience the fine taste of German beer and the many options there were to select from. Whether you are a vegan, a vegetarian, a pescatarian, or a meat lover, the small town of Giessen can accommodate your diet genres and offer you a range of mouth-watering dishes.


As a team, we got to know each other better, and that is one significant prerequisite for Stage Two in Crow. Our conversations every evening, which included commentary about conference sessions, new corpus tools, research techniques, and our own interests, strengths and challenges, gave us better insights about Crow’s future directions. We also connected with researchers at other institutions, such as Nicole Tracy-Ventura (pictured with us below), professor of applied linguistics at the University of South Florida. She worked closely on the development of another student learner corpus, the SPLLOC (Spanish Learner Language Oral Corpora), at the University of Southampton.


Being scheduled for the last slot on the last day of the conference did not turn out to be a bad experience after all. Three days of engaging conversations with TALC participants gave us better access to our potential audience. Thus, we decided to tweak our presentation accordingly and implement last minute adjustments that contributed to a successful session. Thinking aloud and rehearsing were things we happily did during coffee and German cake breaks.


We were worried about the number of people attending the very last presentation at TALC, but our worries subsided the minute we entered the Senatssaal. In addition to the decent number in the audience, Ute Römer, MICUSP developer, was listening attentively and taking notes.


Our presentation was well received, and the Q and A session exceeded the ten-minute threshold. Ute Römer asked about tags accompanying the processed texts and was interested in the broad range of writer characteristics our corpus includes. The concept of first year composition was also compelling to professionals working in Europe because most of the writing they do with their students is in English for Academic Purposes and English for Specific Purposes. Other attendees were keen on asking about the research studies we conducted using data from the corpus.


After our session, Shelley and Ola went off to the big city, Frankfurt, where they literally spent twenty minutes climbing the 328 stairs to reach the dome of the Cathedral of St. Bartholomew, and there they enjoyed the beauty of German architecture from the high skies. Hadi took a train to Berlin because after a six-week wait to get his visa to Germany, he wanted to make the trip worthwhile. TaLC 2018 will take place in London, and we are hoping by that time Crow will have developed to fly to a new destination in Europe. Until then, the Crow team will be busy developing the online interface and conducting research.
