Research

Interests

  • Natural Language Processing
  • Machine Learning
  • Information Retrieval and Extraction
  • The application of all of the above to problems in the Social Sciences, Education, and Medical/Healthcare domains
  • Canadian History and Literature

Work

Aspiring Data Scientists at Rutgers can find a set of helpful resources here.

While employed at ETS, I worked in the Natural Language Processing and Speech Group in Research and Development. I was the lead developer on the TextEvaluator (formerly SourceRater) project and the lead back-end engineer for The Writing Mentor. I was also a key member of the team that develops e-rater and our unified “rater” platform, which powers e-rater (ETS’s automated essay-scoring engine), Henry (for content-based short-answer scoring), and The Writing Mentor.

At Stony Brook, I worked with Professor Amanda Stent in the HCI Lab. For my thesis, I developed a Java-based NLP/Machine Learning tool that aims to help students, both native and non-native speakers of English, improve their writing. You can read my thesis, if you’d like.

In the summer of 2007, I participated in the Data Sciences Summer Institute at the University of Illinois at Urbana-Champaign, where I worked on a project that explored the Virtual Web as a finite-state graph. A presentation on my work can be found below.

In the summer of 2016, I attended the North American Summer School on Logic, Language, and Information (NASSLLI) at Rutgers University.

Publishing

Reviewing Activities

Publications

Forsyth, Carolyn M., Stephanie Peters, Jung Aa Moon, and Diane Napolitano. 2019. Assessing Scientific Inquiry Based on Multiple Sources of Evidence. Presentation. American Educational Research Association (AERA), Toronto, ON, Canada.

Sheehan, Kathleen M., and Diane Napolitano. 2019. “Generating Reliable Feedback About Students’ Performances Within an Automated Reading Tutor”. Journal of Educational Computing Research 57 (3): 1–19. doi:10.1177/0735633119845412. https://doi.org/10.1177/0735633119845412.

Burstein, Jill, Norbert Elliot, Beata Beigman Klebanov, Nitin Madnani, Diane Napolitano, Maxwell Schwartz, Patrick Houghton, and Hillary Molloy. 2018. “Writing Mentor: Writing Progress Using Self-Regulated Writing Support”. The Journal of Writing Analytics 2 (1): 285–313. https://journals.colostate.edu/analytics/article/view/213.

Madnani, Nitin, Jill Burstein, Norbert Elliot, Beata Beigman Klebanov, Diane Napolitano, Slava Andreyev, and Maxwell Schwartz. 2018. “Writing Mentor: Self-Regulated Writing Feedback for Struggling Writers”. In Proceedings of the 27th International Conference on Computational Linguistics (COLING): System Demonstrations Session. Santa Fe, NM. http://www.aclweb.org/anthology/C18-2025.

Madnani, Nitin, Aoife Cahill, Daniel Blanchard, Slava Andreyev, Diane Napolitano, Binod Gyawali, Michael Heilman, et al. 2018. A Robust Microservice Architecture for Scaling Automated Scoring Applications. ETS Research Report Series. doi:10.1002/ets2.12202. https://onlinelibrary.wiley.com/doi/abs/10.1002/ets2.12202.

Malmasi, Shervin, Keelan Evanini, Aoife Cahill, Joel Tetreault, Robert Pugh, Christopher Hamill, Diane Napolitano, and Yao Qian. 2017. “A Report on the 2017 Native Language Identification Shared Task”. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA). Copenhagen, Denmark: Empirical Methods in Natural Language Processing (EMNLP). http://www.aclweb.org/anthology/W17-5007.

Yoon, Su-Youn, Yeonsuk Cho, and Diane Napolitano. 2016. “Spoken Text Difficulty Estimation Using Linguistic Features”. In Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications (BEA). San Diego, CA: North American Chapter of the Association for Computational Linguistics (NAACL). http://m-mitchell.com/NAACL-2016/BEA/pdf/BEA1131.pdf.

Bhat, Suma, Su-Youn Yoon, and Diane Napolitano. 2015. “Automatic Detection of Grammatical Structures from Non-native Speech”. In Proceedings of the Sixth Workshop on Speech and Language Technology in Education (SLaTE). Leipzig, Germany: INTERSPEECH. https://www.slate2015.org/files/submissions/Bhat15-ADO.pdf.

Napolitano, Diane, Kathleen M. Sheehan, and Robert Mundkowsky. 2015. “Online Readability and Text Complexity Analysis with TextEvaluator”. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL): System Demonstrations Session. Denver, CO. http://www.aclweb.org/anthology/N/N15/N15-3020.pdf.

Sheehan, Kathleen M., Michael Flor, Diane Napolitano, and Chaitanya Ramineni. 2015a. Using TextEvaluator to Better Understand the Comprehension Challenges Presented Within Textbooks Targeted at First Grade Readers. Presentation. American Educational Research Association (AERA), Chicago, IL.

______. 2015b. Using TextEvaluator to Quantify Sources of Linguistic Complexity in Textbooks Targeted at First-Grade Readers Over the Past Half Century. ETS Research Report Series. doi:10.1002/ets2.12085. http://onlinelibrary.wiley.com/doi/10.1002/ets2.12085/full.

Cho, Yeonsuk, Su-Youn Yoon, Diane Napolitano, and Yuan Wang. 2014. An Automated Spoken Text Difficulty Evaluation System. Presentation. Computer Assisted Language Instruction Consortium (CALICO), Athens, OH. https://calico.org/calico-conference/conferences-from-previous-years/calico-2014-ohio-university/thursday-may-8/.

Higgins, Derrick, Chris Brew, Michael Heilman, Ramon Ziai, Lei Chen, Aoife Cahill, Michael Flor, et al. 2014. Is getting the right answer just about choosing the right words? The role of syntactically-informed features in short answer scoring. arXiv: 1403.0801 [cs.CL]. http://arxiv.org/abs/1403.0801v2.

Sheehan, Kathleen M., Irene Kostin, Diane Napolitano, and Michael Flor. 2014. “The TextEvaluator Tool: Helping Teachers and Test Developers Select Texts for Use in Instruction and Assessment”. The Elementary School Journal 115 (2): 184–209. http://www.jstor.org/stable/10.1086/678294.

Sheehan, Kathleen M., and Diane Napolitano. 2014. Measuring the Difficulty of Inferring Connections Across Sentences. Presentation. National Council on Measurement in Education (NCME), Philadelphia, PA.

Cahill, Aoife, Nitin Madnani, Joel Tetreault, and Diane Napolitano. 2013. “Robust Systems for Preposition Error Correction Using Wikipedia Revisions”. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). Atlanta, GA. http://www.aclweb.org/anthology-new/N/N13/N13-1055.pdf.

Sheehan, Kathleen M., Michael Flor, and Diane Napolitano. 2013. “A Two-Stage Approach for Generating Unbiased Estimates of Text Complexity”. In Proceedings of the 2nd Workshop on Natural Language Processing for Improving Textual Accessibility (NLP4ITA). Atlanta, GA: North American Chapter of the Association for Computational Linguistics (NAACL). http://www.aclweb.org/anthology/W/W13/W13-15.pdf#page=59.

Sheehan, Kathleen M., Irene Kostin, Diane Napolitano, and Michael Flor. 2013. Helping teachers and test developers determine the difficulty of text for instruction and assessment. Presentation. Literacy Research Association, Addressing the Three Legs of the Text Complexity Triangle: Quantitative, Qualitative, and Reader-Task Systems, Dallas, TX.

Sheehan, Kathleen M., Irene Kostin, and Diane Napolitano. 2012a. SourceRater: An automated approach for generating text complexity classifications aligned with the Common Core Standards. Presentation. National Council on Measurement in Education (NCME), Vancouver, BC, Canada.

______. 2012b. SourceRater: Helping Teachers and Test Developers Determine the Difficulty of Text for Instruction and Assessment. Presentation. National Council on Measurement in Education (NCME), Vancouver, BC, Canada.

Napolitano, Diane, and Amanda Stent. 2009. “TechWriter: An Evolving System for Writing Assistance for Advanced Learners of English”. CALICO Journal 26 (3): 611–625. https://www.jstor.org/stable/calicojournal.26.3.611.

______. 2008. TechWriter: An individualized approach to writing assistance and improvement. Poster. Computer Assisted Language Instruction Consortium (CALICO), Workshop on the Automatic Analysis of Learner Language. calico_poster_08.png.

Presentations

In July 2014, I gave an introduction to NLP for many members of the Data Management and Analytics department at Mathematica Policy Research.

Here is the presentation on my work at the University of Illinois that I gave to my reading group.

Teaching

I was an Adjunct Instructor at the State University of New York College at Old Westbury for three academic years. The courses I taught were:

  • CS2511: Computer Programming II, Spring 2011
  • CS5720: Advanced Java Programming and Applications, Spring 2011
  • CS2510: Computer Programming I, Fall 2010
  • CS1500: Introduction to Computer Applications, Summer 2010
  • CS3620: Computer Architecture, Fall 2008, Spring 2010, Spring 2011
  • CS3911: C++ in Object-Oriented Design, Fall 2008, Spring 2009, Winter 2011
  • CS5610: Operating Systems, Spring 2009, Fall 2010
  • CS4400: Artificial Intelligence, Spring 2010

While at Stony Brook, I had the pleasure of being a TA for the following courses:

Affiliations

Past Affiliations

I was the founding Vice-President and Webmaster of Women in Computer Science at Stony Brook, and I used to regularly attend both LUGSB and SBCS meetings. When I was an undergrad at Binghamton, I was a representative on the Student Assembly and was on the Rules Committee as both Vice-Chair and Chair.