Most of this week is focused on getting up to speed with fundamentals. I have already worked on most of these topics/concepts before; now I would like to train on different data/tasks and read seminal papers (Ilya u30).

Starting with small language models and various vision tasks (fast.ai fine-tuning tasks), and working on assignments from CS336.

GPU Poor :( how do I fix it?

Language models, AdamW, Learning rate, Tokenization, LayerNorm, BatchNorm, Dropout, Residual Connection, Backprop.

Arch: Transformer, Mixture of Experts, ResNet

How do I get a job as a research engineer, or build my own product? What type of product? Should I focus on the application layer or put more focus on research?

Things to read: AlphaGo Zero