Thank you for being here. Let's take a deep breath and dive into the best GenAI papers of this week!
1. MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
📝 Author(s): Ting Jiang from Beihang University
🗓 Publication Date: May 20, 2024
✨ Key Insights:
What's New? They proposed a new method called MoRA, which employs a square matrix to achieve high-rank updating while maintaining the same number of trainable parameters.
Behind the New. To achieve this, they introduced corresponding non-parameter operators to reduce the input dimension and increase the output dimension for the square matrix.
So, how can we use this? MoRA or LoRA?
👉 Read Full Paper
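Conceptually, MoRA swaps LoRA's pair of low-rank matrices for one r×r square matrix sandwiched between parameter-free compress/decompress operators. Here's a toy NumPy sketch under assumed dimensions; the paper explores several operator designs, and the group-sum/repeat operators below are just one illustrative choice:

```python
import numpy as np

d = 1024  # hidden dimension of the layer being fine-tuned (illustrative)
r = 128   # side length of MoRA's square matrix
# Same trainable-parameter budget as LoRA with rank 8 on this layer:
# r * r == 2 * d * 8 == 16384
M = np.zeros((r, r))  # trainable square matrix (zero-init: update starts as a no-op)

def compress(x):
    # non-parameter operator: fold the d-dim input into r groups and sum them
    return x.reshape(r, -1).sum(axis=1)

def decompress(h):
    # non-parameter operator: repeat entries to expand back to dimension d
    return np.repeat(h, d // r)

x = np.random.randn(d)
delta = decompress(M @ compress(x))  # MoRA's additive contribution to the layer output
```

Because M is full-rank-capable (up to rank r = 128) while the LoRA baseline with the same budget is capped at rank 8, the update is "high-rank" at equal parameter count.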
2. OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
📝 Author(s): Jian Hu, et al. from OpenLLMAI Team
🗓 Publication Date: May 20, 2024
✨ Key Insights:
What's New? They presented OpenRLHF, an open-source framework enabling efficient RLHF scaling, offering integration with Hugging Face models and out-of-the-box solutions with optimized algorithms and launch scripts for ease of use.
Behind the New. Unlike existing RLHF frameworks that co-locate four models on the same GPUs, OpenRLHF redesigns scheduling with Ray, vLLM, and DeepSpeed to handle models beyond 70B parameters, improving resource utilization and supporting diverse training approaches.
So, how can we use this? An easy way to conduct distributed RLHF!
👉 Read Full Paper
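As a refresher on what those four co-located models actually do in PPO-style RLHF, here is a tiny NumPy sketch of the standard reward-shaping step: the reward model scores the response, and the policy (actor) is penalized per token for drifting from the frozen reference model. All numbers below are made up; a real framework computes them from model forward passes:

```python
import numpy as np

# The four models RLHF coordinates, stubbed as fixed arrays/scalars:
actor_logprobs = np.array([-1.2, -0.8, -2.0])  # per-token log-probs from the policy (actor)
ref_logprobs   = np.array([-1.0, -1.0, -1.5])  # same tokens under the frozen reference model
reward_score   = 0.7                           # scalar score from the reward model
beta           = 0.1                           # KL penalty coefficient
# (the fourth model, the critic, would estimate values from these shaped rewards)

# Per-token KL penalty, with the scalar reward added at the final token:
kl = actor_logprobs - ref_logprobs
per_token_rewards = -beta * kl
per_token_rewards[-1] += reward_score
```

Keeping all four of these models resident on the same GPUs is what limits naive setups; OpenRLHF's contribution is scheduling them across resources instead.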
3. FIFO-Diffusion: Generating Infinite Videos from Text without Training
📝 Author(s): Jihwan Kim, et al. from Seoul National University
🗓 Publication Date: May 19, 2024
✨ Key Insights:
What's New? They proposed an inference technique called FIFO-Diffusion, built on a pretrained diffusion model for text-conditional video generation.
Behind the New. The proposed method can generate infinitely long videos without any training by iteratively performing diagonal denoising. They also introduced latent partitioning and lookahead denoising techniques.
So, how can we use this? Long video generation's quality and speed are catching up!
👉 Read Full Paper, Explore GitHub Repo
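The diagonal-denoising loop can be pictured as a plain FIFO queue: frames sit at increasing noise levels from head to tail, each step denoises all of them one level in parallel, then the (now clean) head frame is emitted and fresh noise is pushed at the tail. A toy sketch with 1-D "frames" and a trivial stand-in for the real denoiser (actual latents are 4-D tensors and the denoiser is the pretrained diffusion model):

```python
from collections import deque
import numpy as np

T = 4               # queue length = number of concurrent noise levels
frame_shape = (8,)  # toy 1-D stand-in for a latent video frame

def denoise_step(frame, level):
    # stand-in for one reverse-diffusion step at noise level `level`;
    # a real model would predict and subtract noise here
    return frame / (level + 2)

# Initialize the queue with frames whose noise level grows from head to tail.
queue = deque(np.random.randn(*frame_shape) for _ in range(T))

video = []
for _ in range(6):  # loop as long as you like: "infinitely long" generation
    # diagonal denoising: frame i in the queue is denoised at noise level i
    for i in range(T):
        queue[i] = denoise_step(queue[i], i)
    video.append(queue.popleft())               # head frame is fully denoised: emit it
    queue.append(np.random.randn(*frame_shape)) # push fresh noise at the tail
```

The key property: memory stays constant at T frames no matter how long the output video grows.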
4. Presentations are not always linear! GNN meets LLM for Document-to-Presentation Transformation with Attribution
📝 Author(s): Himanshu Maheshwari, et al. from Adobe Research
🗓 Publication Date: May 21, 2024
✨ Key Insights:
What's New? They proposed a novel graph-based solution that learns a graph from the input document and uses a combination of a graph neural network and an LLM to generate a presentation with content attribution for each slide.
Behind the New. Automatically generating a presentation from the text of a long document is a challenging and useful problem. In contrast to a flat summary, a presentation needs a better, non-linear narrative.
So, how can we use this? If only there were a world where you could write a thesis and it would automatically create PPT materials for you!
👉 Read Full Paper
5. DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis
📝 Author(s): Yao Teng, et al. from The University of Hong Kong
🗓 Publication Date: May 23, 2024
✨ Key Insights:
What's New? They proposed Diffusion Mamba (DiM), which combines the efficiency of Mamba, a sequence model based on State Space Models (SSM), with the expressive power of diffusion models for efficient high-resolution image synthesis.
Behind the New. The computational cost of Transformers is quadratic in the number of tokens. Since Mamba cannot directly generalize to 2D signals, they add learnable padding tokens at the end of each row and column.
So, how can we use this? Mamba's influence seems to be growing, doesn't it?
👉 Read Full Paper, Explore GitHub Repo
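A minimal NumPy sketch of that row-boundary idea: when the 2D token grid is flattened into the 1D sequence a state-space model expects, a padding token after each row marks where the row ends. Toy sizes throughout; the paper's padding tokens are learnable and are also placed at column ends for its multi-directional scans, while a fixed zero vector stands in here:

```python
import numpy as np

H, W, D = 3, 3, 4               # token-grid height, width, channel dim (toy sizes)
tokens = np.random.randn(H, W, D)
pad = np.zeros((1, D))          # stand-in for the learnable padding token

# Flatten the 2D grid row-wise, appending a padding token at the end of each
# row so the 1D sequence model (Mamba) sees explicit row boundaries.
seq = []
for row in tokens:
    seq.extend(row)
    seq.append(pad[0])
seq = np.stack(seq)             # shape: (H * (W + 1), D)
```

Without such boundary markers, the last token of one row and the first token of the next look adjacent to a purely sequential scan even though they are far apart in the image.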
6. Your Transformer is Secretly Linear
📝 Author(s): Anton Razzhigaev, et al. from AIRI
🗓 Publication Date: May 19, 2024
✨ Key Insights:
What's New? They analyzed embedding transformations between sequential layers in transformer decoders such as GPT and LLaMA, uncovering a near-perfect linear relationship. They also introduced a cosine-similarity-based regularization aimed at reducing layer linearity, which both improves performance and successfully decreases the linearity of the models.
Behind the New. Their experiments showed that removing or linearly approximating some of the most linear transformer blocks does not significantly affect the loss or model performance.
So, how can we use this? We worked hard to build up the layers, but it turns out they're almost linear, which is pretty shocking!
👉 Read Full Paper, Explore GitHub Repo
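To get intuition for "how linear is a layer", you can fit the best linear map from a layer's input embeddings to its output embeddings and ask how much variance it explains. This least-squares R² is a simplified stand-in for the paper's normalized similarity metric, with synthetic data playing the role of real embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 256, 32
X = rng.standard_normal((n, d))  # embeddings entering a layer (synthetic stand-in)
# A "nearly linear" layer: a linear map plus a tiny nonlinear perturbation
Y = X @ rng.standard_normal((d, d)) + 0.01 * rng.standard_normal((n, d))

# Fit the best linear map A by least squares and measure how much of Y it
# explains; a score near 1.0 means the layer behaves almost linearly.
A, *_ = np.linalg.lstsq(X, Y, rcond=None)
residual = Y - X @ A
score = 1.0 - (residual ** 2).sum() / ((Y - Y.mean(axis=0)) ** 2).sum()
```

The paper's striking finding is that scores computed this way between real adjacent decoder layers come out near-perfect, which is what makes the most linear blocks safe to prune or approximate.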
Stay curious, and until next week!