Thank you for being here! Let's take a deep breath and dive into this week's best LLM papers!
1. Adversarial Preference Optimization
📑 Author(s): Pengyu Cheng, et al. from Tencent AI Lab
📅 Publication Date: Nov 14, 2023
✨ Key Insights:
What's New? They proposed an adversarial preference optimization (APO) framework, in which the LLM agent and the preference model are updated alternately via a min-max game.
Behind the New. In practice, continuously updating LLMs widens the distribution gap between model-generated samples and human-preferred responses, which hinders fine-tuning efficiency. APO mitigates this issue.
So, How can we use this? It was very exciting to see the principles of GANs, which I studied a long time ago, resurface in this paper! A lot of post-GAN research went into making adversarial training more stable; do you think APO could develop along the same lines? A minimal sketch of the alternating updates follows below.
🔗 Read Full Paper
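To make the min-max game concrete, here is a minimal, hypothetical sketch of the alternating updates, assuming a policy LLM and a learned preference model; the classes and their update methods are placeholders I made up to illustrate the structure, not the paper's actual code.

```python
"""Minimal, hypothetical sketch of APO-style alternating updates.

PolicyLLM and RewardModel are toy placeholders, not the paper's code;
they only illustrate the min-max structure described above.
"""
import random

class PolicyLLM:
    def generate(self, prompt: str) -> str:
        return prompt + " [model answer]"   # stand-in for sampling
    def rl_update(self, prompts, samples, rewards):
        pass                                # stand-in for a PPO-style step

class RewardModel:
    def score(self, prompt: str, response: str) -> float:
        return random.random()              # stand-in for a learned preference score
    def adversarial_update(self, positives, negatives):
        pass                                # push human answers above model samples

def apo_round(policy, rm, prompts, human_answers):
    # Max step: the policy chases higher reward-model scores.
    samples = [policy.generate(p) for p in prompts]
    rewards = [rm.score(p, s) for p, s in zip(prompts, samples)]
    policy.rl_update(prompts, samples, rewards)
    # Min step: the reward model re-separates human-preferred responses
    # from the policy's fresh samples, closing the distribution gap.
    rm.adversarial_update(
        positives=list(zip(prompts, human_answers)),
        negatives=list(zip(prompts, samples)),
    )

policy, rm = PolicyLLM(), RewardModel()
for _ in range(3):
    apo_round(policy, rm, ["Explain RLHF briefly."], ["RLHF is..."])
```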
2. SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models
📑 Author(s): Ziyi Lin, et al. from Shanghai AI Laboratory
📅 Publication Date: Nov 13, 2023
✨ Key Insights:
What's New? They presented SPHINX, a versatile multi-modal large language model (MLLM) with a joint mixing of model weights, tuning tasks, and visual embeddings.
Behind the New. For stronger vision-language alignment, they unfreeze the LLM during pre-training and introduce a weight-mixing strategy between LLMs trained on real-world and synthetic data.
So, How can we use this? They shed light on joint mixing for future MLLM research! A tiny weight-mix sketch follows below.
🔗 Read Full Paper, Explore Github Repo
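If you want to play with the weight-mix idea, the simplest reading is parameter-space interpolation between two checkpoints. Here is a tiny sketch under that assumption; the single uniform coefficient `alpha` is my simplification, not SPHINX's exact recipe.

```python
# Hypothetical weight-mix sketch: linearly interpolate two checkpoints of the
# SAME architecture. A single uniform `alpha` is my simplification; SPHINX's
# actual mixing strategy may differ. Works on PyTorch state_dicts.

def mix_weights(state_a: dict, state_b: dict, alpha: float = 0.5) -> dict:
    # Blend each parameter: alpha from model A, (1 - alpha) from model B.
    assert state_a.keys() == state_b.keys(), "architectures must match"
    return {k: alpha * state_a[k] + (1 - alpha) * state_b[k] for k in state_a}

# Usage (hypothetical checkpoints):
# mixed = mix_weights(model_real.state_dict(), model_synth.state_dict(), 0.5)
# model_real.load_state_dict(mixed)
```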
3. TrainerAgent: Customizable and Efficient Model Training through LLM-Powered Multi-Agent System
📑 Author(s): Haoyuan Li, et al. from Taotian Group
📅 Publication Date: Nov 11, 2023
✨ Key Insights:
What's New? Leveraging the powerful analytical, planning, and decision-making capabilities of LLMs, they proposed TrainerAgent, a multi-agent framework comprising Task, Data, Model, and Server agents.
Behind the New. Algorithm engineers often face a lengthy, iterative process to develop models tailored to specific business requirements, which is even harder for non-experts. High-quality, efficient model development, together with the emergence of Large Language Model (LLM) agents, has become a key focus in the industry.
So, How can we use this? TrainerAgent can understand the task, gather the data, train the model, and deploy it. What should we do now? Good luck, ML engineers. A rough orchestration sketch follows below.
🔗 Read Full Paper
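As a rough picture of how the four agents could hand off work, here is a hypothetical orchestration sketch; `llm` is a placeholder for any chat-completion call, and the prompts and hand-off logic are my illustration, not the paper's implementation.

```python
# Hypothetical sketch of a TrainerAgent-style pipeline: Task, Data, Model,
# and Server agents hand off to each other. `llm` is a placeholder; the
# prompts are my paraphrase of each agent's job, not the paper's code.

def llm(prompt: str) -> str:
    return f"[LLM response to: {prompt[:40]}...]"  # swap in a real API call

def task_agent(request: str) -> str:
    return llm(f"Analyze this business request and write an ML task spec:\n{request}")

def data_agent(spec: str) -> str:
    return llm(f"Plan data collection and preprocessing for this spec:\n{spec}")

def model_agent(spec: str, data_plan: str) -> str:
    return llm(f"Pick a model and training recipe.\nSpec: {spec}\nData: {data_plan}")

def server_agent(model_plan: str) -> str:
    return llm(f"Write a deployment plan for this trained model:\n{model_plan}")

def trainer_agent(request: str) -> str:
    spec = task_agent(request)
    data_plan = data_agent(spec)
    model_plan = model_agent(spec, data_plan)
    return server_agent(model_plan)

print(trainer_agent("Classify customer support tickets by urgency."))
```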
4. Prompt Engineering a Prompt Engineer
📑 Author(s): Qinyuan Ye, et al. from Microsoft
📅 Publication Date: Nov 09, 2023
✨ Key Insights:
What's New? LLMs can be meta-prompted to perform automatic prompt engineering. They constructed a meta-prompt that more effectively guides LLMs to perform automatic prompt engineering.
Behind the New. Their final method, named PE2, found a prompt that outperforms "Let's think step by step" on the MultiArith dataset. Their prompt was this: "Let's solve this problem by considering all the details. Pay attention to each piece of information, remember to add or subtract as needed, and perform the calculations step by step."
So, How can we use this? This paper is great for finding the best prompt for your task. Please try this method! A minimal refinement loop is sketched below.
🔗 Read Full Paper
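The core loop is easy to sketch: show the model its current prompt together with failing examples and ask it to write a better one. Below is a minimal, hypothetical version; `llm` is a placeholder and the meta-prompt wording is my paraphrase, not PE2's carefully constructed meta-prompt.

```python
# Hypothetical PE2-style refinement loop. `llm` is a placeholder; the
# meta-prompt text is my paraphrase, not the paper's actual meta-prompt.

def llm(prompt: str) -> str:
    return "[improved prompt]"  # swap in a real chat-completion call

def refine_prompt(task_prompt: str, failures: list) -> str:
    # Summarize dev-set failures so the model can diagnose them.
    report = "\n".join(
        f"Q: {q}\nModel answered: {got}\nExpected: {want}"
        for q, got, want in failures
    )
    meta_prompt = (
        "You are a careful prompt engineer.\n"
        f"Current prompt: {task_prompt}\n"
        f"It failed on these examples:\n{report}\n"
        "Diagnose the failures and output only an improved prompt."
    )
    return llm(meta_prompt)

# Iterate: evaluate on a dev set, collect failures, refine, repeat.
prompt = "Let's think step by step."
failures = [("John has 3 apples and buys 2 more. How many?", "6", "5")]
prompt = refine_prompt(prompt, failures)
```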
5. Ask One More Time: Self-Agreement Improves Reasoning of Language Models in (Almost) All Scenarios
📑 Author(s): Lei Lin, et al. from Kuaishou Technology
📅 Publication Date: Nov 14, 2023
✨ Key Insights:
What's New? They proposed self-agreement, a generalizable ensemble-optimization method that applies in almost all scenarios, whether or not the type of input question and the answer format of the reasoning paths are known.
Behind the New. Verifier- or re-ranker-based methods and post-processing-based methods can only solve questions that belong to a known task. Additionally, agent-based methods are complex, since the agents' debate process is iterative.
So, How can we use this? Please try their nice prompt from Table 5 in your service. The "Ask One More Time" method might be helpful! A two-stage sketch follows below.
🔗 Read Full Paper
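Here is a minimal, hypothetical sketch of the two stages: sample several reasoning paths, then ask the model one more time which answer they agree on. `llm` is a placeholder, and the selection prompt paraphrases the idea; the paper's actual prompt is in its Table 5.

```python
# Hypothetical self-agreement sketch: sample several reasoning paths, then
# "ask one more time" which answer they agree on. `llm` is a placeholder;
# the selection prompt paraphrases the idea (the real one is in Table 5).

def llm(prompt: str, temperature: float = 0.7) -> str:
    return "[sampled answer]"  # swap in a real chat-completion call

def self_agreement(question: str, k: int = 5) -> str:
    # Stage 1: k diverse reasoning paths at high temperature.
    paths = [llm(f"{question}\nLet's think step by step.") for _ in range(k)]
    # Stage 2: the model itself reads all paths and returns the most
    # agreed-upon answer (no format-specific answer parsing needed).
    numbered = "\n\n".join(f"Answer {i + 1}: {p}" for i, p in enumerate(paths))
    return llm(
        f"Question: {question}\nCandidate answers:\n{numbered}\n"
        "Read all candidates and output the single answer most of them agree on.",
        temperature=0.0,
    )
```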
6. Is "A Helpful Assistant" the Best Role for Large Language Models? A Systematic Evaluation of Social Roles in System Prompts
📑 Author(s): Mingqian Zheng, et al. from University of Michigan
📅 Publication Date: Nov 16, 2023
✨ Key Insights:
What's New? They presented a systematic evaluation of how social roles in system prompts affect model performance. They found that "mentor" is the proper role for flan-t5-xxl and llama-2-7b-chat.
Behind the New. ChatGPT uses "You are a helpful assistant" as part of its default system prompt. But is "a helpful assistant" the best role for LLMs?
So, How can we use this? Try to find your LLM's best role; "a helpful assistant" might not be it. A small role-sweep sketch follows below.
🔗 Read Full Paper, Explore Github Repo
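Sweeping roles is straightforward to script yourself. Here is a hypothetical sketch; `chat` stands in for any chat API that accepts a system prompt, and the roles and eval items are illustrative only.

```python
# Hypothetical role sweep: vary the social role in the system prompt and
# score a small eval set. `chat` is a placeholder for any chat API with a
# system prompt; the roles and eval items here are illustrative only.

def chat(system: str, user: str) -> str:
    return "[answer]"  # swap in a real chat-completion call

ROLES = ["a helpful assistant", "a mentor", "a teacher", "a partner"]
EVAL_SET = [("What is 17 * 3?", "51"), ("Capital of France?", "Paris")]

def accuracy(role: str) -> float:
    # Count answers that contain the expected string.
    hits = sum(expected in chat(f"You are {role}.", q) for q, expected in EVAL_SET)
    return hits / len(EVAL_SET)

best = max(ROLES, key=accuracy)
print(f"Best role on this eval set: {best}")
```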
7. Auto-ICL: In-Context Learning without Human Supervision
📑 Author(s): Jinghan Yang, et al. from Microsoft
📅 Publication Date: Nov 15, 2023
✨ Key Insights:
What's New? They presented Automatic In-Context Learning. Upon receiving a user's request, they ask the model to independently generate examples, including labels, instructions, or reasoning pathways.
Behind the New. Under this framework, the model first generates questions similar to the user's question and answers them itself, then performs Chain-of-Thought (CoT) reasoning with these pairs as in-context examples. This method outperformed Auto-CoT.
So, How can we use this? Vanilla CoT requires handmade few-shot examples, but Auto-ICL does not. Try it instead of Auto-CoT! A two-stage sketch follows below.
🔗 Read Full Paper
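Here is a minimal, hypothetical version of the two stages; `llm` is a placeholder and the prompt wording is my paraphrase of the idea, not the paper's exact prompts.

```python
# Hypothetical sketch of Auto-ICL: have the model create its own few-shot
# examples for a question, then answer with those as in-context demos.
# `llm` is a placeholder; the prompt wording is my paraphrase of the idea.

def llm(prompt: str) -> str:
    return "[response]"  # swap in a real chat-completion call

def auto_icl(question: str, n_demos: int = 3) -> str:
    # Stage 1: the model writes similar questions and solves them itself.
    demos = llm(
        f"Write {n_demos} questions similar to the one below, and answer "
        f"each of them step by step.\n\nQuestion: {question}"
    )
    # Stage 2: use those self-generated pairs as in-context examples.
    return llm(f"{demos}\n\nQuestion: {question}\nAnswer step by step:")
```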
8. Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models
📑 Author(s): Wenhao Yu, et al. from Tencent AI Lab
📅 Publication Date: Nov 15, 2023
✨ Key Insights:
What's New? They introduced Chain-of-Noting (CoN), an approach aimed at improving the robustness of retrieval-augmented language models (RALMs) when facing noisy, irrelevant documents and when handling unknown scenarios.
Behind the New. A RALM with CoN generates sequential reading notes for the retrieved documents, enabling a thorough evaluation of their relevance to the given question, and integrates this information to formulate the final answer.
So, How can we use this? CoN consistently improved LLaMA-2's QA performance! Try CoN in your QA task! A short sketch follows below.
🔗 Read Full Paper
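Here is a minimal, hypothetical sketch of CoN layered on a plain RAG pipeline; `llm` and `retrieve` are placeholders, and the prompts paraphrase the note-then-answer idea rather than reproducing the paper's prompts.

```python
# Hypothetical Chain-of-Noting sketch on top of a RAG pipeline: write a
# relevance note per retrieved document, then answer from the notes.
# `llm` and `retrieve` are placeholders; prompts paraphrase the idea.

def llm(prompt: str) -> str:
    return "[response]"  # swap in a real chat-completion call

def retrieve(question: str, k: int = 3) -> list:
    return ["[doc text]"] * k  # swap in your retriever

def chain_of_note(question: str) -> str:
    docs = retrieve(question)
    # Step 1: a sequential reading note per document, judging relevance.
    notes = [
        llm(f"Question: {question}\nDocument: {d}\n"
            "Write a short note: what does this document say that is "
            "relevant to the question, if anything?")
        for d in docs
    ]
    # Step 2: answer from the notes; if none are relevant, say "unknown".
    joined = "\n".join(f"Note {i + 1}: {n}" for i, n in enumerate(notes))
    return llm(
        f"Question: {question}\n{joined}\n"
        "Using only relevant notes, answer the question. "
        "If no note is relevant, answer 'unknown'."
    )
```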
Stay curious, and until next week!