Lecture

supervised ML

gradient descent
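A minimal sketch of gradient descent on a simple quadratic, to illustrate the update rule w ← w − η ∇f(w) (the learning rate `lr` and the toy objective f(w) = (w − 3)² are illustrative assumptions, not from the lecture):

```python
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient of the objective."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)  # update rule: w <- w - lr * grad(w)
    return w

# Toy objective f(w) = (w - 3)^2, whose gradient is 2*(w - 3);
# the minimizer is w* = 3.
w_star = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
```

With this step size the iterates contract toward the minimum geometrically, since each step multiplies the error (w − 3) by 0.8.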

multi-layer neural networks

for two layers

Universal approximation theorem (Cybenko '89, Hornik '89)
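A common statement of the theorem (paraphrased; σ denotes a sigmoidal activation, as in Cybenko '89):

```latex
% One hidden layer suffices for uniform approximation on the cube:
\[
\forall f \in C([0,1]^n),\ \forall \varepsilon > 0,\ \exists N,\ \alpha_i, b_i \in \mathbb{R},\ w_i \in \mathbb{R}^n :
\quad \sup_{x \in [0,1]^n} \Bigl| f(x) - \sum_{i=1}^{N} \alpha_i\, \sigma(w_i^\top x + b_i) \Bigr| < \varepsilon
\]
```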

very deep networks (CNN, Transformer, …)

attention, transformers, BERT

example

recent technical revolutions

residual NN (ResNet)

transformer and attention mechanism
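A sketch of the scaled dot-product attention at the core of the transformer, written in NumPy (the shapes and random inputs below are illustrative assumptions):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # (n_q, n_k)
    # Row-wise softmax (shifted by the max for numerical stability).
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights                          # (n_q, d_v)

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))   # 2 queries of dimension 4
K = rng.normal(size=(3, 4))   # 3 keys of dimension 4
V = rng.normal(size=(3, 5))   # 3 values of dimension 5
out, weights = scaled_dot_product_attention(Q, K, V)
```

Each output row is a convex combination of the value rows, with the softmax weights summing to 1 per query.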

theorem

assume there is no causality in the order of tokens, then

theorem

a synthetic, pseudo-code like view on transformers (for the exam)

encoder × decoder

instruction fine-tuning

teaching DeepSeek-R1 Zero to reason

Multimodal AI

convolution
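A sketch of a 2D convolution (cross-correlation, as used in CNNs) with no padding and stride 1, assuming a single-channel image:

```python
import numpy as np

def conv2d(img, kernel):
    """Slide the kernel over the image and sum elementwise products."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# A 2x2 all-ones kernel on a 3x3 all-ones image: every window sums to 4.
out = conv2d(np.ones((3, 3)), np.ones((2, 2)))
```

In a CNN the kernel entries are the learned parameters, which is the point of the card below.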

a CNN learns the filters that transform the images

VGG architecture

RNNs

Hooray, you're done! 🎉
If my cards helped you, you can buy me a beer.