LLM Fine-Tuning Course – From Supervised FT to RLHF, LoRA, and Multimodal

Learn how to tailor massive models to specific tasks with this comprehensive deep dive into the modern LLM ecosystem. You will progress from the foundations of supervised fine-tuning to advanced alignment techniques like RLHF and DPO, so that your models are both capable and helpful. Through hands-on practice with the Hugging Face ecosystem and high-performance tools like Unsloth and Axolotl, you’ll learn to implement parameter-efficient strategies like LoRA and QLoRA.

Code: https://github.com/sunnysavita10/Complete-LLM-Finetuning

Course developed by @sunnysavita10

❤️ Support for this channel comes from our friends at Scrimba – the coding platform that's reinvented interactive learning: https://scrimba.com/freecodecamp

⭐️ Chapters ⭐️
- 00:00:00 Introduction & Course Syllabus
- 00:03:42 LLM Training Pipeline Overview
- 00:05:01 Parameter Level Fine-Tuning: Full vs. Partial
- 00:07:22 Partial Fine-Tuning: Old School vs. Advanced Methods
- 00:10:07 Parameter Efficient Fine-Tuning (PEFT): LoRA & QLoRA
- 00:13:01 Advanced PEFT Techniques: DoRA, IA3, & BitFit
- 00:17:34 Data Level Fine-Tuning: Instructional vs. Non-Instructional
- 00:19:55 Preference Based Learning: RLHF & DPO
- 00:24:25 Deep Dive: Unsupervised Pre-training (Self-Supervised Learning)
- 00:30:45 Deep Dive: Non-Instructional Fine-Tuning & Domain Adaptation
- 00:40:48 Data Preparation for Non-Instructional Fine-Tuning
- 00:42:51 Deep Dive: Instructional Fine-Tuning & Chatbot Creation
- 00:47:57 Deep Dive: Preference Alignment with Human Feedback
- 00:50:38 Family-wise LLM Breakdown: Llama, GPT, Gemini, & DeepSeek
- 00:55:23 Practical Setup: Essential Libraries & GPU Connection
- 01:08:56 Working with Pre-built vs. Custom Data Sets
- 01:21:02 Model Selection, Tokenization, & Padding Explained
- 01:26:11 Defining Training Arguments: Epochs, Learning Rate, & Batch Size
- 01:32:38 Executing Fine-Tuning with LoRA
- 01:42:35 Post-Training: Model Prediction & Inference
- 01:45:15 Part 2: Comprehensive Guide to Instructional Fine-Tuning
- 02:16:32 Loading & Unzipping Previous Training Checkpoints
- 02:30:13 Masking Labels for Improved Instructional Responses
- 02:40:02 Part 3: Preference Alignment & DPO Training
- 02:56:07 Preference Optimization Techniques: RLHF, RLAIF, & DPO
- 03:02:40 DPO Intuition: Understanding the Training Loss Formula
- 03:07:44 Practical DPO Implementation & Avoiding LoRA Stacking
- 03:37:30 Introduction to the Llama Factory Project
- 03:51:09 Installing & Setting Up Llama Factory via GitHub
- 04:03:19 Using Llama Factory Web UI: Selecting Models & Data
- 04:29:44 Training via CLI: Configuration via YAML Files
- 04:37:55 Unsloth Framework: Achieving 2x Faster Training
- 04:57:33 Inside Unsloth: Custom Kernels & Memory Efficiency
- 05:14:14 Practical Walkthrough: Fine-Tuning with Unsloth
- 05:32:08 Enterprise Fine-Tuning via OpenAI API
- 05:48:06 Preparing & Validating JSONL Data for OpenAI
- 06:21:55 Creating and Monitoring OpenAI Fine-Tuning Jobs
- 06:52:20 Google Cloud Vertex AI: Fine-Tuning Gemini Models
- 07:22:41 Data Management in Google Cloud Storage Buckets
- 08:31:01 Embedding Fine-Tuning Masterclass
- 08:38:40 Multimodal AI: Image, Video, & Audio Modalities
- 09:13:48 Vision Transformer (ViT) Architecture Deep Dive
- 09:58:48 Keyword Search vs. Semantic Similarity
- 11:24:45 Step-by-Step: The Modern Text Embedding Process

🎉 Thanks to our Champion and Sponsor supporters:
👾 @omerhattapoglu1158
👾 @goddardtan
👾 @akihayashi6629
👾 @kikilogsin
👾 @anthonycampbell2148
👾 @tobymiller7790
👾 @rajibdassharma497
👾 @CloudVirtualizationEnthusiast
👾 @adilsoncarlosvianacarlos
👾 @martinmacchia1564
👾 @ulisesmoralez4160
👾 @_Oscar_
👾 @jedi-or-sith2728
👾 @justinhual1290

--

Learn to code for free and get a developer job: https://www.freecodecamp.org

Read hundreds of articles on programming: https://freecodecamp.org/news
