Ideas April/25
Creation Time: 2025-04-21 00:27
ideas
A Collection of Various Thoughts from April 2025
17/04/2025
-
Even though I’ve read many papers on large language models (LLMs) and written code to fine-tune them or call their APIs, I still don’t fully understand how the base models work: what each part of the code does, how pre-training and post-training actually function, and why they are designed that way.
-
So, I want to write a blog post that explores the fundamentals of LLMs.
-
Fortunately, I discovered a project called MiniMind: a tiny LLM with only tens of millions of parameters in its smallest version, yet it covers pre-training, SFT, RLHF, LoRA, and even DPO. My plan is to read, run, and modify this project to gain a deeper understanding, and then document the process in a blog post about building an LLM from scratch.
-
Hopefully, this will be helpful for others too.
There is also a similar project called Build a Large Language Model (from Scratch).