My research interests are Machine Learning, Natural Language Processing, and Information Retrieval and Extraction. In particular, I am interested in the application of these to problems in the Social Sciences and Humanities.
While employed at ETS, I worked in the Natural Language Processing and Speech Group in Research and Development. I was the lead developer on the TextEvaluator (formerly SourceRater) project, and the lead back-end engineer for The Writing Mentor. I was also a key member of the team that develops e-rater and our unified “rater” platform, which powers e-rater (ETS’s automated essay-scoring engine), Henry (for content-based short-answer scoring), and The Writing Mentor.
At Stony Brook, I worked with Professor Amanda Stent in the HCI Lab. For my thesis, I developed a Java-based NLP/Machine Learning tool that aims to help students, both native and non-native speakers of English, improve their writing. You can read my thesis, if you’d like.
In the summer of 2007, I participated in the Data Sciences Summer Institute at the University of Illinois at Urbana-Champaign, where I worked on a project that explored the Virtual Web as a finite-state graph. A presentation on my work can be found below.
In the summer of 2016, I attended the North American Summer School on Logic, Language, and Information (NASSLLI) at Rutgers University.
- CALICO Journal (very rarely)
- ACL 2016 System Demonstrations
- EMNLP 2017 System Demonstrations
- EMNLP 2018 System Demonstrations
- The 13th Workshop on Innovative Use of NLP for Building Educational Applications (BEA) at NAACL 2018
- NAACL-HLT 2019 Social Media Track
Forsyth, Carolyn M., Stephanie Peters, Jung Aa Moon, and Diane Napolitano. 2019. Assessing Scientific Inquiry Based on Multiple Sources of Evidence. Presentation. American Educational Research Association (AERA), Toronto, ON, Canada. Forthcoming.
Burstein, Jill, Norbert Elliot, Beata Beigman Klebanov, Nitin Madnani, Diane Napolitano, Maxwell Schwartz, Patrick Houghton, and Hillary Molloy. 2018. “Writing Mentor: Writing Progress Using Self-Regulated Writing Support”. The Journal of Writing Analytics 2 (1): 285–313. https://journals.colostate.edu/analytics/article/view/213.
Madnani, Nitin, Aoife Cahill, Daniel Blanchard, Slava Andreyev, Diane Napolitano, Binod Gyawali, Michael Heilman, et al. 2018a. A Robust Microservice Architecture for Scaling Automated Scoring Applications. ETS Research Report Series. doi:10.1002/ets2.12202. https://onlinelibrary.wiley.com/doi/abs/10.1002/ets2.12202.
Madnani, Nitin, Jill Burstein, Norbert Elliot, Beata Beigman Klebanov, Diane Napolitano, Slava Andreyev, and Maxwell Schwartz. 2018b. “Writing Mentor: Self-Regulated Writing Feedback for Struggling Writers”. In Proceedings of the 27th International Conference on Computational Linguistics (COLING): System Demonstrations Session. Santa Fe, NM. http://www.aclweb.org/anthology/C18-2025.
Malmasi, Shervin, Keelan Evanini, Aoife Cahill, Joel Tetreault, Robert Pugh, Christopher Hamill, Diane Napolitano, and Yao Qian. 2017. “A Report on the 2017 Native Language Identification Shared Task”. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA). Copenhagen, Denmark: Empirical Methods for Natural Language Processing (EMNLP). http://www.aclweb.org/anthology/W17-5007.
Yoon, Su-Youn, Yeonsuk Cho, and Diane Napolitano. 2016. “Spoken Text Difficulty Estimation Using Linguistic Features”. In Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications (BEA). San Diego, CA: North American Chapter of the Association for Computational Linguistics (NAACL). http://m-mitchell.com/NAACL-2016/BEA/pdf/BEA1131.pdf.
Bhat, Suma, Su-Youn Yoon, and Diane Napolitano. 2015. “Automatic Detection of Grammatical Structures from Non-native Speech”. In Proceedings of the Sixth Workshop on Speech and Language Technology in Education (SLaTE). Leipzig, Germany: INTERSPEECH. https://www.slate2015.org/files/submissions/Bhat15-ADO.pdf.
Napolitano, Diane, Kathleen M. Sheehan, and Robert Mundkowsky. 2015. “Online Readability and Text Complexity Analysis with TextEvaluator”. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL): System Demonstrations Session. Denver, CO. http://www.aclweb.org/anthology/N/N15/N15-3020.pdf.
Sheehan, Kathleen M., Michael Flor, Diane Napolitano, and Chaitanya Ramineni. 2015a. Using TextEvaluator to Better Understand the Comprehension Challenges Presented Within Textbooks Targeted at First Grade Readers. Presentation. American Educational Research Association (AERA), Chicago, IL.
——. 2015b. Using TextEvaluator to Quantify Sources of Linguistic Complexity in Textbooks Targeted at First-Grade Readers Over the Past Half Century. ETS Research Report Series. doi:10.1002/ets2.12085. http://onlinelibrary.wiley.com/doi/10.1002/ets2.12085/full.
Cho, Yeonsuk, Su-Youn Yoon, Diane Napolitano, and Yuan Wang. 2014. An Automated Spoken Text Difficulty Evaluation System. Presentation. Computer Assisted Language Instruction Consortium (CALICO), Athens, OH. https://calico.org/calico-conference/conferences-from-previous-years/calico-2014-ohio-university/thursday-may-8/.
Higgins, Derrick, Chris Brew, Michael Heilman, Ramon Ziai, Lei Chen, Aoife Cahill, Michael Flor, et al. 2014. Is getting the right answer just about choosing the right words? The role of syntactically-informed features in short answer scoring. arXiv:1403.0801 [cs.CL]. http://arxiv.org/abs/1403.0801v2.
Sheehan, Kathleen M., and Diane Napolitano. 2014. Measuring the Difficulty of Inferring Connections Across Sentences. Presentation. National Council on Measurement in Education (NCME), Philadelphia, PA.
Sheehan, Kathleen M., Irene Kostin, Diane Napolitano, and Michael Flor. 2014. “The TextEvaluator Tool: Helping Teachers and Test Developers Select Texts for Use in Instruction and Assessment”. The Elementary School Journal 115 (2): 184–209. http://www.jstor.org/stable/10.1086/678294.
Cahill, Aoife, Nitin Madnani, Joel Tetreault, and Diane Napolitano. 2013. “Robust Systems for Preposition Error Correction Using Wikipedia Revisions”. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). Atlanta, GA. http://www.aclweb.org/anthology-new/N/N13/N13-1055.pdf.
Sheehan, Kathleen M., Michael Flor, and Diane Napolitano. 2013. “A Two-Stage Approach for Generating Unbiased Estimates of Text Complexity”. In Proceedings of the 2nd Workshop on Natural Language Processing for Improving Textual Accessibility (NLP4ITA). Atlanta, GA: North American Chapter of the Association for Computational Linguistics (NAACL). http://www.aclweb.org/anthology/W/W13/W13-15.pdf#page=59.
Sheehan, Kathleen M., Irene Kostin, Diane Napolitano, and Michael Flor. 2013. Helping teachers and test developers determine the difficulty of text for instruction and assessment. Presentation. Literacy Research Association, Addressing the Three Legs of the Text Complexity Triangle: Quantitative, Qualitative, and Reader-Task Systems, Dallas, TX.
Sheehan, Kathleen M., Irene Kostin, and Diane Napolitano. 2012a. SourceRater: An automated approach for generating text complexity classifications aligned with the Common Core Standards. Presentation. National Council on Measurement in Education (NCME), Vancouver, BC, Canada.
——. 2012b. SourceRater: Helping Teachers and Test Developers Determine the Difficulty of Text for Instruction and Assessment. Presentation. National Council on Measurement in Education (NCME), Vancouver, BC, Canada.
Napolitano, Diane, and Amanda Stent. 2009. “TechWriter: An Evolving System for Writing Assistance for Advanced Learners of English”. CALICO Journal 26 (3): 611–625. https://www.jstor.org/stable/calicojournal.26.3.611.
——. 2008. TechWriter: An individualized approach to writing assistance and improvement. Poster. Computer Assisted Language Instruction Consortium (CALICO), Workshop on the Automatic Analysis of Learner Language. calico_poster_08.png.
Here is the presentation on my work at the University of Illinois that I gave to my reading group.
I was an Adjunct Instructor at the State University of New York College at Old Westbury for three academic years. The courses I taught were:
- CS2511: Computer Programming II, Spring 2011
- CS5720: Advanced Java Programming and Applications, Spring 2011
- CS2510: Computer Programming I, Fall 2010
- CS1500: Introduction to Computer Applications, Summer 2010
- CS3620: Computer Architecture, Fall 2008, Spring 2010, Spring 2011
- CS3911: C++ in Object-Oriented Design, Fall 2008, Spring 2009, Winter 2011
- CS5610: Operating Systems, Spring 2009, Fall 2010
- CS4400: Artificial Intelligence, Spring 2010
While at Stony Brook, I had the pleasure of being a TA for the following courses:
- CSE 114: Computer Science I, Spring 2007
- CSE 300: Writing in Computer Science, Fall 2007
- CSE 592: Machine Learning, Spring 2008
- Association for Computational Linguistics (ACL)
- ACL Special Interest Group for Building Educational Applications
- ARPA-level member of the host of this website, a.k.a. the “PBS of the Internet”
- The Rail Passengers Association
- Go Leafs :)
I was the founding Vice-President and Webmaster of Women in Computer Science at Stony Brook, and I used to regularly attend both LUGSB and SBCS meetings. When I was an undergrad at Binghamton, I was a representative on the Student Assembly and was on the Rules Committee as both Vice-Chair and Chair.