Day 0 - Setup: GPT from Scratch with MLX

Reference Material:


Day 0: Set up the Python environment and go over some basics to prepare for Day 1.
Setup Python Environments with MLX

  Install Conda -
  Choose your IDE - after going through this stream I want to use PyCharm from JetBrains, unless I can figure out how to configure this with
  Make a project in ~/workspace/GPTfromScratch
  From the terminal window in the IDE, run: conda create -n GPTfromScratch python=3.10
  Activate the environment: conda activate GPTfromScratch
  Install MLX: conda install -c conda-forge mlx
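
Once the steps above are done, a quick sanity check can confirm that the IDE terminal is really using the new environment. The snippet below is only a sketch: the helper name check_environment is made up for these notes, and it only looks up whether a package called mlx can be found by the active interpreter, without importing it.

```python
import importlib.util
import sys


def check_environment(package="mlx"):
    """Return (python_version, package_found) for the active interpreter.

    check_environment is a hypothetical helper for these notes,
    not part of MLX or Conda.
    """
    version = "{}.{}".format(sys.version_info.major, sys.version_info.minor)
    # find_spec returns None when the package is not installed here
    found = importlib.util.find_spec(package) is not None
    return version, found


version, found = check_environment()
print("Python version:", version)
print("mlx found:", found)
```

If the environment was created with python=3.10 and is active, the first line should report 3.10. If mlx is reported as not found, the most likely cause is that the terminal is still using a different environment.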

Preparing Data - LLM Tokenization Explained

Breaking down the following statement for a 5th grader:
The first step to training an LLM is collecting a large corpus of text data and then tokenizing it. Tokenization is the process of mapping text to integers, which can be fed into the LLM.
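
To make "mapping text to integers" concrete, here is a minimal sketch of the simplest possible tokenizer, where every unique character gets its own number. This is an assumption chosen for illustration: production LLMs typically use subword tokenizers rather than single characters, and the names vocab, stoi, itos, encode, and decode are made up for these notes.

```python
text = "hello world"

# Vocabulary: every unique character in the text, sorted for a stable order
vocab = sorted(set(text))

# Lookup tables: character -> integer, and integer -> character
stoi = {ch: i for i, ch in enumerate(vocab)}
itos = {i: ch for ch, i in stoi.items()}


def encode(s):
    """Turn a string into a list of integer token ids."""
    return [stoi[ch] for ch in s]


def decode(ids):
    """Turn a list of integer token ids back into a string."""
    return "".join(itos[i] for i in ids)


tokens = encode("hello")
print(tokens)          # [3, 2, 4, 4, 5]
print(decode(tokens))  # hello
```

The 5th-grader version: give every letter a number, swap the letters for their numbers, and the model only ever sees the numbers. Decoding swaps them back.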

To Do

Figure out environments using

Next up: Tackling data setup in depth

