Ideas April/25

Creation Time: 2025-04-21 00:27

ideas

A Collection of Various Thoughts from April 2025

17/04/2025

Even though I’ve read many papers on large language models (LLMs) and written code to fine-tune them or call their APIs, I still don’t fully understand how the base models work: what each part of the code does, how pre-training and post-training actually function, and why they are designed that way.
So, I want to write a blog post that explores the fundamentals of LLMs.
Fortunately, I discovered a project called MiniMind, a tiny LLM with just a few hundred million parameters that still includes components like pre-training, SFT, RLHF, LoRA, and even DPO. My plan is to read, run, and modify this project to gain a deeper understanding, and then document the process in a blog post about building an LLM from scratch.
Hopefully, this will be helpful for others too.

There is a similar project called Build a Large Language Model (from Scratch).