YANG JANET LIU

I study Computational Linguistics in the Department of Linguistics at Georgetown University, where I am advised by Amir Zeldes, Ph.D. and am a member of Corpling@GU and Computational Linguistics @ Georgetown (GUCL). I also work on research with Nathan Schneider, Ph.D. as a student affiliate of NERT. I obtained my M.S. in Computational Linguistics from Georgetown University in May 2019.

My primary research interests center on discourse-level linguistic phenomena (i.e., those that transcend sentence boundaries) across genres, which I study using computational, statistical, and corpus-based methods. My work also involves creating discourse resources spanning different genres to inform model development and facilitate targeted evaluation. In addition, I have been working on initiatives to facilitate cross-framework discourse understanding and unify discourse resources by co-organizing the Discourse Relation Parsing and Treebanking shared task. I have also contributed to multilingual annotation projects such as the development of the largest multi-genre RST treebank for Mandarin Chinese and the creation of the first Chinese corpus annotated with adposition semantics that makes parallel analysis possible.

Before coming to Georgetown, I majored in Linguistics at Temple University in Philadelphia, PA from 2015 to 2017, where I was a research assistant in Temple University’s Multilingual Research Group.

news

Feb 23, 2024	I passed my dissertation defense
Dec 01, 2023	Invited talk (online) at Prof. Dr. Dirk Hovy’s MilaNLP Lab at Bocconi University 🇮🇹
Sep 25, 2023	Invited talk at Prof. Dr. Barbara Plank’s MaiNLP research lab at the Center for Information and Language Processing (CIS) at LMU in Munich, Germany about The Pivotal Role of Genres: Insights from English RST Parsing and Abstractive Summarization 🕺🏻
Sep 20, 2023	Invited talk at Prof. Dr. Manfred Stede’s Applied CL Discourse Lab at Universität Potsdam about English RST Parsing in Potsdam, Germany 🇩🇪
Jul 11, 2023	One paper accepted to SIGDIAL 2023 🙌🏼 See you in Prague 🇨🇿 in September
Jun 23, 2023	Area Chair of Discourse and Pragmatics at EMNLP 2023 🇸🇬
May 30, 2023	Started a Research Scientist internship at Spotify USA Inc. 🤠🕺🏻🎸🎧🥁
May 17, 2023	One paper accepted to INTERSPEECH 2023 in Dublin, Ireland 🇮🇪
May 02, 2023	One paper accepted to the Findings of ACL 2023. See y’all in Toronto, Canada 🇨🇦
Apr 22, 2023	Invited talk at MASC-SLL 2023 at George Mason University (Arlington campus) 🤠
Jan 21, 2023	One paper accepted to the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023) 🕺🏻 See y’all in Dubrovnik, Croatia 🇭🇷
Jan 11, 2023	Co-organizing the DISRPT2023 Shared Task on Discourse Segmentation, Connective and Relation Identification across Formalisms in conjunction with ACL2023 and the CODI2023 Workshop 🤠 More languages and discourse treebanks available 🙌🏼
Dec 14, 2022	Awarded a Fall 2023 GSAS Conference Travel Grant!
Dec 14, 2022	Awarded a Fall 2022 GSAS Conference Travel Grant and a Fall 2022 GSAS-GradGov Research Project Award!
Sep 21, 2022	One co-authored paper accepted at the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (AACL-IJCNLP) 👏🏼
May 09, 2022	One paper on Adpositional Pragmatic Markers accepted at the 16th Linguistic Annotation Workshop (LAW-XVI), co-located with LREC 2022 in Marseille, France 🇫🇷
Nov 05, 2021	Passed my dissertation proposal defense ✌🏼
Jun 07, 2021	Started the Research Scientist, PhD - Summer Internship with the Lab in Language Technologies at Spotify Research 🎸🎧🥁
May 11, 2021	Passed the 2nd Qualifying Review ✌🏼
Feb 18, 2021	Co-organizing the DISRPT2021 Shared Task on Discourse Segmentation, Connective and Relation Identification across Formalisms in conjunction with EMNLP2021 and the CODI2021 Workshop 🤠
Dec 16, 2020	Done with PhD Coursework ✌🏼
Jul 01, 2020	A journal paper on detecting signals of discourse relations by Amir and me is now published in Dialogue & Discourse 🕵🏻‍♀️
May 18, 2020	Started my first internship at Alexa AI @ Amazon as a Language Data Researcher Intern (VIRTUAL) 🔍
May 17, 2019	Happy Graduation 🎓 M.S. in Computational Linguistics, Georgetown University 🎉
Apr 15, 2019	Awarded a Spring 2019 GSAS Conference Travel Grant and a Spring 2019 GSAS-GradGov Research Project Award!

selected publications

SIGDIAL

What’s Hard in RST Parsing? Predictive Models for Error Analysis

Yang Janet Liu, Tatsuya Aoyama, and Amir Zeldes

In Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue, Sep 2023

Despite recent advances in Natural Language Processing (NLP), hierarchical discourse parsing in the framework of Rhetorical Structure Theory remains challenging, and our understanding of the reasons for this is as yet limited. In this paper, we examine and model some of the factors associated with parsing difficulties in previous work: the existence of implicit discourse relations, challenges in identifying long-distance relations, out-of-vocabulary items, and more. In order to assess the relative importance of these variables, we also release two annotated English test-sets with explicit correct and distracting discourse markers associated with gold standard RST relations. Our results show that as in shallow discourse parsing, the explicit/implicit distinction plays a role, but that long-distance dependencies are the main challenge, while lack of lexical overlap is less of a problem, at least for in-domain parsing. Our final model is able to predict where errors will occur with an accuracy of 76.3% for the bottom-up parser and 76.6% for the top-down parser.
ACL

GUMSum: Multi-Genre Data and Evaluation for English Abstractive Summarization

Yang Janet Liu and Amir Zeldes

In Findings of the Association for Computational Linguistics: ACL 2023, Jul 2023

Automatic summarization with pre-trained language models has led to impressively fluent results, but is prone to ‘hallucinations’, low performance on non-news genres, and outputs which are not exactly summaries. Targeting ACL 2023’s ‘Reality Check’ theme, we present GUMSum, a small but carefully crafted dataset of English summaries in 12 written and spoken genres for evaluation of abstractive summarization. Summaries are highly constrained, focusing on substitutive potential, factuality, and faithfulness. We present guidelines and evaluate human agreement as well as subjective judgments on recent system outputs, comparing general-domain untuned approaches, a fine-tuned one, and a prompt-based approach, to human performance. Results show that while GPT3 achieves impressive scores, it still underperforms humans, with varying quality across genres. Human judgments reveal different types of errors in supervised, prompted, and human-generated summaries, shedding light on the challenges of producing a good summary.
EACL

Why Can’t Discourse Parsing Generalize? A Thorough Investigation of the Impact of Data Diversity

Yang Janet Liu and Amir Zeldes

In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, May 2023

Recent advances in discourse parsing performance create the impression that, as in other NLP tasks, performance for high-resource languages such as English is finally becoming reliable. In this paper we demonstrate that this is not the case, and thoroughly investigate the impact of data diversity on RST parsing stability. We show that state-of-the-art architectures trained on the standard English newswire benchmark do not generalize well, even within the news domain. Using the two largest RST corpora of English with text from multiple genres, we quantify the impact of genre diversity in training data for achieving generalization to text types unseen during training. Our results show that a heterogeneous training regime is critical for stable and generalizable models, across parser architectures. We also provide error analyses of model outputs and out-of-domain performance. To our knowledge, this study is the first to fully evaluate cross-corpus RST parsing generalizability on complete trees, examine between-genre degradation within an RST corpus, and investigate the impact of genre diversity in training data composition.