diff --git a/2024.md b/2024.md
index c855d73..1884b94 100644
--- a/2024.md
+++ b/2024.md
@@ -1,5 +1,5 @@
-## 2024 (105 papers)
+## 2024 (109 papers)
 1. [A Computational Framework for Behavioral Assessment of LLM Therapists](https://arxiv.org/abs/2401.00820v1), Yu Ying Chiu,Ashish Sharma,Inna Wanyin Lin,Tim Althoff, 01-01-2024
 ### Categories
@@ -1013,6 +1013,21 @@
+1. [Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models](https://arxiv.org/abs/2402.14207v2), Yijia Shao,Yucheng Jiang,Theodore A. Kanell,Peter Xu,Omar Khattab,Monica S. Lam, 22-02-2024
+ ### Categories
+ Computation and Language, Artificial Intelligence
+ ### Abstract
+ For evaluation, we curate FreshWiki, a dataset of recent high-quality Wikipedia articles, and formulate outline assessments to evaluate the pre-writing stage. We further gather feedback from experienced Wikipedia editors. Compared to articles generated by an outline-driven retrieval-augmented baseline, more of STORM's articles are deemed to be organized (by a 25% absolute increase) and broad in coverage (by 10%). The expert feedback also helps identify new challenges for generating grounded long articles, such as source bias transfer and over-association of unrelated facts.
+ ### Bullet Points
+
+ * To evaluate the pre-writing stage of STORM's articles, we curate FreshWiki, formulate outline assessments, and gather feedback from experienced editors
+
+ * More articles are organized and broad in coverage compared to an outline-driven retrieval-augmented baseline
+
+ * Expert feedback also helps identify new challenges for generating grounded long articles, such as source bias transfer and over-association of unrelated facts.
+
+
+
 1. [INSTRUCTIR: A Benchmark for Instruction Following of Information Retrieval Models](https://arxiv.org/abs/2402.14334), Hanseok Oh,Hyunji Lee,Seonghyeon Ye,Haebin Shin,Hansol Jang,Changwook Jun,Minjoon Seo, 22-02-2024
 ### Categories
 Computation and Language
@@ -1538,5 +1553,15 @@
+1. [RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing](https://arxiv.org/abs/2404.19543v1), Yucheng Hu,Yuxing Lu, 30-04-2024
+ ### Categories
+ Computation and Language, Artificial Intelligence
+1. [Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models](https://arxiv.org/abs/2405.01535v1), Seungone Kim,Juyoung Suk,Shayne Longpre,Bill Yuchen Lin,Jamin Shin,Sean Welleck,Graham Neubig,Moontae Lee,Kyungjae Lee,Minjoon Seo, 02-05-2024
+ ### Categories
+ Computation and Language
+1. [RLHF Workflow: From Reward Modeling to Online RLHF](https://arxiv.org/abs/2405.07863v1), Hanze Dong,Wei Xiong,Bo Pang,Haoxiang Wang,Han Zhao,Yingbo Zhou,Nan Jiang,Doyen Sahoo,Caiming Xiong,Tong Zhang, 13-05-2024
+ ### Categories
+ Machine Learning, Artificial Intelligence, Computation and Language, Machine Learning
+ ### Abstract
diff --git a/Papers.md b/Papers.md
index ea24b6e..28acce4 100644
--- a/Papers.md
+++ b/Papers.md
@@ -6,4 +6,4 @@
 ### [2021](2021.md) (44 papers)
 ### [2022](2022.md) (49 papers)
 ### [2023](2023.md) (217 papers)
-### [2024](2024.md) (105 papers)
\ No newline at end of file
+### [2024](2024.md) (109 papers)
\ No newline at end of file