•
Rate, Explain and Cite (REC): Enhanced Explanation and Attribution in Automatic Evaluation by Large Language Models
Aliyah R. Hsu1,2, James Zhu1, Zhichao Wang1, Bin Bi1, Shubham Mehrotra1,
Shiva K. Pentyala1, Katherine Tan1, Xiang-Bo Mao1, Roshanak Omrani1, Sougata Chaudhuri1,
Regunathan Radhakrishnan1, Sitaram Asur1, Claire Na Cheng1, Bin Yu2
1Salesforce AI Platform, 2UC Berkeley
Correspondence: aliyahhsu@berkeley.edu, james.zhu@salesforce.com
Work done during an internship at Salesforce.
Abstract
LLMs have demonstrated impressive proficiency in generating coherent and high-quality text, making them valuable across a range of text-generation tasks. However, rigorous evaluation of this generated content is crucial, as ensuring its quality remains a significant challenge due to persistent issues such as factual inaccuracies and hallucination. This paper introduces three fine-tuned general-purpose LLM autoevaluators, REC-8B, REC-12B and REC-70B, s
•
WikiGenBench: Exploring Full-length Wikipedia Generation
under Real-World Scenario
Jiebin Zhang∗†, Eugene J. Yu∗†, Qinyu Chen†, Chenhao Xiong†, Dawei Zhu†,
Han Qian†, Mingbo Song†, Weimin Xiong†, Xiaoguang Li‡, Qun Liu‡, Sujian Li†
† National Key Laboratory for Multimedia Information Processing,
School of Computer Science, Peking University
{zhangjiebin,ejyu,chenqinyu,dwzhu,lisujian}@pku.edu.cn
‡ Huawei Technologies
{lixiaoguang11,qun.liu}@huawei.com
∗Equal contribution.
Abstract
Generating comprehensive and accurate Wikipedia articles for newly emerging events under a real-world scenario presents significant challenges. Existing attempts fall short either by focusing only on short snippets or by using metrics that are insufficient to evaluate real-world scenarios. In this paper, we construct WikiGenBench, a new benchmark consisting of 1,320 entries, designed to align with real-world scenarios in both generation and evaluation. For generation,
•
Large language models (LLMs) often hallucinate and lack the ability to provide attribution for their generations. Semi-parametric LMs, such as kNN-LM, approach these limitations by refining the output of an LM for a given prompt using its nearest neighbor matches in a non-parametric data store. However, these models often exhibit slow inference speeds and produce non-fluent texts. In this paper, we introduce Nearest Neighbor Speculative Decoding (NEST), a novel semi-parametric language modeling approach that is capable of incorporating real-world text spans of arbitrary length into the LM generations and providing attribution to their sources. NEST performs token-level retrieval at each inference step to compute a semi-parametric mixture distribution and identify promising span continuations in a corpus. It then uses an approximate speculative decoding procedure that accepts a prefix of the retrieved span or generates a new token. NEST significantly enhances the generation
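The two core mechanisms the NEST abstract describes — interpolating the LM's next-token distribution with a retrieval-based distribution, and accepting only a prefix of a retrieved span — can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the fixed mixture weight `lam`, and the simple per-token probability threshold used as the acceptance rule here are all illustrative assumptions (NEST's actual acceptance criterion is an approximate speculative-decoding procedure).

```python
import numpy as np

def knn_mixture(p_lm, p_knn, lam):
    """Interpolate the parametric LM distribution with a nearest-neighbor
    retrieval distribution (kNN-LM style). `lam` is an illustrative fixed
    mixture weight; semi-parametric LMs may compute it adaptively."""
    return lam * np.asarray(p_knn) + (1.0 - lam) * np.asarray(p_lm)

def accept_span_prefix(span_token_probs, threshold=0.3):
    """Accept the longest prefix of a retrieved span whose per-token
    probability under the mixture stays above a threshold. This is a
    simplified stand-in for NEST's speculative acceptance step: once a
    token falls below the bar, generation falls back to sampling a new
    token instead of copying the rest of the span."""
    accepted = 0
    for p in span_token_probs:
        if p < threshold:
            break
        accepted += 1
    return accepted

# Example: a 2-token vocabulary where retrieval disagrees with the LM.
mix = knn_mixture([0.7, 0.3], [0.2, 0.8], lam=0.5)   # -> [0.45, 0.55]
# A 4-token retrieved span where the third token is implausible: only
# the first two tokens are accepted.
n = accept_span_prefix([0.9, 0.5, 0.1, 0.8])          # -> 2
```

Copying a multi-token prefix in one step is what lets this family of methods attribute whole spans to a source document while keeping inference faster than token-by-token retrieval alone.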