UCSC · CSE 143 · Work in Progress

Natural Language Processing

Notes covering tokenisation, language models, sequence-to-sequence architectures, attention, and transformers.

Notes from CSE 143 at UCSC.

Topics Covered

  • Tokenisation and text preprocessing
  • N-gram language models
  • Recurrent neural networks (RNNs, LSTMs)
  • Attention mechanisms
  • Transformers and BERT

Useful Links

Textbook

Speech and Language Processing, by Jurafsky & Martin