LLM · NLP · Generative AI

Automated Metrics for Medical Multi-Document Summarization Disagree with Human Evaluations

In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Lucy Lu Wang, Yulia Otmakhova, Jay DeYoung, Thinh Hung Truong, Bailey Kuehl, Erin Bransom, Byron Wallace

Cite as

Lucy Lu Wang, Yulia Otmakhova, Jay DeYoung, Thinh Hung Truong, Bailey Kuehl, Erin Bransom, Byron Wallace. (2023). Automated Metrics for Medical Multi-Document Summarization Disagree with Human Evaluations. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). https://doi.org/10.18653/v1/2023.acl-long.549