Thank you for being here. Let's take a deep breath and dive into the best GenAI papers of this week!
1. MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
📝 Author(s): Ting Jiang from Beihang University
🗓 Publication Date: May 20, 2024
✨ Key Insights:
What's New? They proposed a new method called MoRA, which employs a square matrix to achieve high-rank updating while maintaining the same number of trainable parameters.
Behind the New. To achieve this, they introduced corresponding non-parameter operators to reduce the input dimension and increase the output dimension for the square matrix.
So, how can we use this? MoRA or LoRA?
👉 Read Full Paper
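Conceptually, MoRA swaps LoRA's pair of low-rank matrices for one r×r square matrix sandwiched between parameter-free compress/decompress operators. Here's a toy NumPy sketch under assumed dimensions; the paper explores several operator designs, and the group-sum/repeat operators below are just one illustrative choice:

```python
import numpy as np

d = 1024  # hidden dimension of the layer being fine-tuned (illustrative)
r = 128   # side length of MoRA's square matrix
# Same trainable-parameter budget as LoRA with rank 8 on this layer:
# r * r == 2 * d * 8 == 16384
M = np.zeros((r, r))  # trainable square matrix (zero-init: update starts as a no-op)

def compress(x):
    # non-parameter operator: fold the d-dim input into r groups and sum them
    return x.reshape(r, -1).sum(axis=1)

def decompress(h):
    # non-parameter operator: repeat entries to expand back to dimension d
    return np.repeat(h, d // r)

x = np.random.randn(d)
delta = decompress(M @ compress(x))  # MoRA's additive contribution to the layer output
```

Because M is full-rank-capable (up to rank r = 128) while the LoRA baseline with the same budget is capped at rank 8, the update is "high-rank" at equal parameter count.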
2. OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
📝 Author(s): Jian Hu, et al. from OpenLLMAI Team
🗓 Publication Date: May 20, 2024
✨ Key Insights:
What's New? They presented OpenRLHF, an open-source framework enabling efficient RLHF scaling, offering integration with Hugging Face models and out-of-the-box solutions with optimized algorithms and launch scripts for ease of use.
Behind the New. Unlike existing RLHF frameworks that co-locate four models on the same GPUs, OpenRLHF redesigns scheduling with Ray, vLLM, and DeepSpeed to handle models beyond 70B parameters, improving resource utilization and supporting diverse training approaches.
So, how can we use this? An easy way to conduct distributed RLHF!
👉 Read Full Paper
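As a refresher on what those four co-located models actually do in PPO-style RLHF, here is a tiny NumPy sketch of the standard reward-shaping step: the reward model scores the response, and the policy (actor) is penalized per token for drifting from the frozen reference model. All numbers below are made up; a real framework computes them from model forward passes:

```python
import numpy as np

# The four models RLHF coordinates, stubbed as fixed arrays/scalars:
actor_logprobs = np.array([-1.2, -0.8, -2.0])  # per-token log-probs from the policy (actor)
ref_logprobs   = np.array([-1.0, -1.0, -1.5])  # same tokens under the frozen reference model
reward_score   = 0.7                           # scalar score from the reward model
beta           = 0.1                           # KL penalty coefficient
# (the fourth model, the critic, would estimate values from these shaped rewards)

# Per-token KL penalty, with the scalar reward added at the final token:
kl = actor_logprobs - ref_logprobs
per_token_rewards = -beta * kl
per_token_rewards[-1] += reward_score
```

Keeping all four of these models resident on the same GPUs is what limits naive setups; OpenRLHF's contribution is scheduling them across resources instead.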
3. FIFO-Diffusion: Generating Infinite Videos from Text without Training
📝 Author(s): Jihwan Kim, et al. from Seoul National University
🗓 Publication Date: May 19, 2024
✨ Key Insights:
What's New? They proposed an inference technique called FIFO-Diffusion, built on a pretrained diffusion model for text-conditional video generation.
Behind the New. The proposed method can generate infinitely long videos without any training by iteratively performing diagonal denoising. They also introduced latent partitioning and lookahead denoising techniques.
So, how can we use this? Long video generation's quality and speed are catching up!
👉 Read Full Paper, Explore GitHub Repo
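The diagonal-denoising loop can be pictured as a plain FIFO queue: frames sit at increasing noise levels from head to tail, each step denoises all of them one level in parallel, then the (now clean) head frame is emitted and fresh noise is pushed at the tail. A toy sketch with 1-D "frames" and a trivial stand-in for the real denoiser (actual latents are 4-D tensors and the denoiser is the pretrained diffusion model):

```python
from collections import deque
import numpy as np

T = 4               # queue length = number of concurrent noise levels
frame_shape = (8,)  # toy 1-D stand-in for a latent video frame

def denoise_step(frame, level):
    # stand-in for one reverse-diffusion step at noise level `level`;
    # a real model would predict and subtract noise here
    return frame / (level + 2)

# Initialize the queue with frames whose noise level grows from head to tail.
queue = deque(np.random.randn(*frame_shape) for _ in range(T))

video = []
for _ in range(6):  # loop as long as you like: "infinitely long" generation
    # diagonal denoising: frame i in the queue is denoised at noise level i
    for i in range(T):
        queue[i] = denoise_step(queue[i], i)
    video.append(queue.popleft())               # head frame is fully denoised: emit it
    queue.append(np.random.randn(*frame_shape)) # push fresh noise at the tail
```

The key property: memory stays constant at T frames no matter how long the output video grows.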
4. Presentations are not always linear! GNN meets LLM for Document-to-Presentation Transformation with Attribution
📝 Author(s): Himanshu Maheshwari, et al. from Adobe Research
🗓 Publication Date: May 21, 2024
✨ Key Insights:
What's New? They proposed a novel graph-based solution that learns a graph from the input document and uses a combination of a graph neural network and an LLM to generate a presentation with content attribution for each slide.
Behind the New. Automatically generating a presentation from the text of a long document is a challenging and useful problem. In contrast to a flat summary, a presentation needs a better, non-linear narrative.
So, how can we use this? If only there were a world where you could write a thesis and it would automatically create PPT materials for you!
👉 Read Full Paper
5. DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis
📝 Author(s): Yao Teng, et al. from The University of Hong Kong
🗓 Publication Date: May 23, 2024
✨ Key Insights:
What's New? They proposed Diffusion Mamba (DiM), which combines the efficiency of Mamba, a sequence model based on State Space Models (SSM), with the expressive power of diffusion models for efficient high-resolution image synthesis.
Behind the New. The computational cost of Transformers is quadratic in the number of tokens. Since Mamba cannot directly generalize to 2D signals, they add learnable padding tokens at the end of each row and column.
So, how can we use this? Mamba's influence seems to be growing, doesn't it?
👉 Read Full Paper, Explore GitHub Repo
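A minimal NumPy sketch of that row-boundary idea: when the 2D token grid is flattened into the 1D sequence a state-space model expects, a padding token after each row marks where the row ends. Toy sizes throughout; the paper's padding tokens are learnable and are also placed at column ends for its multi-directional scans, while a fixed zero vector stands in here:

```python
import numpy as np

H, W, D = 3, 3, 4               # token-grid height, width, channel dim (toy sizes)
tokens = np.random.randn(H, W, D)
pad = np.zeros((1, D))          # stand-in for the learnable padding token

# Flatten the 2D grid row-wise, appending a padding token at the end of each
# row so the 1D sequence model (Mamba) sees explicit row boundaries.
seq = []
for row in tokens:
    seq.extend(row)
    seq.append(pad[0])
seq = np.stack(seq)             # shape: (H * (W + 1), D)
```

Without such boundary markers, the last token of one row and the first token of the next look adjacent to a purely sequential scan even though they are far apart in the image.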
6. Your Transformer is Secretly Linear
📝 Author(s): Anton Razzhigaev, et al. from AIRI
🗓 Publication Date: May 19, 2024
✨ Key Insights:
What's New? They analyzed embedding transformations between sequential layers in transformer decoders such as GPT and LLaMA, uncovering a near-perfect linear relationship. They also introduced a cosine-similarity-based regularization aimed at reducing layer linearity, which both improves performance and successfully decreases the linearity of the models.
Behind the New. Their experiments showed that removing or linearly approximating some of the most linear transformer blocks does not significantly affect the loss or model performance.
So, how can we use this? We worked hard to build up the layers, but it turns out they're almost linear, which is pretty shocking!
👉 Read Full Paper, Explore GitHub Repo
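To get intuition for "how linear is a layer", you can fit the best linear map from a layer's input embeddings to its output embeddings and ask how much variance it explains. This least-squares R² is a simplified stand-in for the paper's normalized similarity metric, with synthetic data playing the role of real embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 256, 32
X = rng.standard_normal((n, d))  # embeddings entering a layer (synthetic stand-in)
# A "nearly linear" layer: a linear map plus a tiny nonlinear perturbation
Y = X @ rng.standard_normal((d, d)) + 0.01 * rng.standard_normal((n, d))

# Fit the best linear map A by least squares and measure how much of Y it
# explains; a score near 1.0 means the layer behaves almost linearly.
A, *_ = np.linalg.lstsq(X, Y, rcond=None)
residual = Y - X @ A
score = 1.0 - (residual ** 2).sum() / ((Y - Y.mean(axis=0)) ** 2).sum()
```

The paper's striking finding is that scores computed this way between real adjacent decoder layers come out near-perfect, which is what makes the most linear blocks safe to prune or approximate.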
Stay curious, and until next week!