Forwarded for publication.
---------- Forwarded message ---------
From: Youssef Hosni from To Data & Beyond <youssefh@substack.com>
Date: Tue, Mar 5, 2024 at 3:39 PM
Subject: Top Important LLM Papers for the Week from 19/02 to 25/02
To: <marielandryx@gmail.com>
Stay Updated with Recent Large Language Models Research
To Data & Beyond is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

Large language models (LLMs) have advanced rapidly in recent years. As new generations of models are developed, researchers and engineers need to stay informed on the latest progress. This article summarizes some of the most important LLM papers published during the first week of March 2024. The papers cover various topics shaping the next generation of language models, from model optimization and scaling to reasoning, benchmarking, and enhancing performance. Keeping up with novel LLM research across these domains will help guide continued progress toward models that are more capable, robust, and aligned with human values.

Table of Contents:
1. LLM Progress & Benchmarking
2. LLM Reasoning
3. LLM Training, Evaluation & Inference
4. LLM Fine-Tuning
5. Transformers & Attention Based Models

1. LLM Progress & Benchmarking
Beyond Language Models: Byte Models are Digital World Simulators
StarCoder 2 and The Stack v2: The Next Generation
Orca-Math: Unlocking the Potential of SLMs in Grade School Math
Humanoid Locomotion as Next Token Prediction
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Priority Sampling of Large Language Models for Compilers
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web
Nemotron-4 15B Technical Report
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs
FuseChat: Knowledge Fusion of Chat Models
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Genie: Generative Interactive Environments
Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
Watermarking Makes Language Models Radioactive
ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition

2. LLM Reasoning
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models

3. LLM Training, Evaluation & Inference
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
Evaluating Very Long-Term Conversational Memory of LLM Agents
Towards Optimal Learning of Language Models
Training-Free Long-Context Scaling of Large Language Models
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
Divide-or-Conquer? Which Part Should You Distill Your LLM?
GPTVQ: The Blessing of Dimensionality for LLM Quantization

4. LLM Fine-Tuning
DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Model
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method

5. Transformers & Attention Based Models
Simple linear attention language models balance the recall-throughput tradeoff