credit
event as a set of basic outcomes
we can estimate the probability of an event by experiment
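A minimal sketch of estimating a probability by experiment, assuming a fair six-sided die and the event "the roll is even" (both chosen here purely for illustration). The event is a set of basic outcomes and the estimate is its relative frequency over many trials.

```python
import random

def estimate_probability(event, outcomes, trials=100_000, seed=0):
    """Relative frequency of `event` (a set of basic outcomes) over random trials."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    hits = sum(1 for _ in range(trials) if rng.choice(outcomes) in event)
    return hits / trials

die = [1, 2, 3, 4, 5, 6]   # basic outcomes (hypothetical example)
even = {2, 4, 6}           # the event as a set of basic outcomes
estimate = estimate_probability(even, die)
```

The estimate approaches the true value 0.5 as the number of trials grows.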
axioms
entropy
nothing can be more uncertain than the uniform distribution
perplexity
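A small sketch of entropy and perplexity for a discrete distribution, assuming base-2 logarithms (bits) and perplexity defined as 2 to the power of the entropy. The two example distributions are made up to show that the uniform one has the higher entropy.

```python
import math

def entropy(p):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def perplexity(p):
    """Perplexity as 2 ** entropy: the 'effective number' of equally likely outcomes."""
    return 2 ** entropy(p)

uniform = [0.25, 0.25, 0.25, 0.25]   # hypothetical example distributions
skewed = [0.7, 0.1, 0.1, 0.1]
```

For four outcomes the uniform distribution reaches the maximum of 2 bits (perplexity 4); any other distribution over the same outcomes scores lower.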
chain rule
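For reference, the chain rule in its usual forms, assuming the notes mean the chain rule for entropy (the first line) and, for language modelling, the chain rule for joint probabilities (the second line):

```latex
H(X, Y) = H(X) + H(Y \mid X), \qquad
H(X_1, \dots, X_n) = \sum_{i=1}^{n} H(X_i \mid X_1, \dots, X_{i-1})

P(w_1, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})
```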
coding interpretation
entropy … the least average number of bits needed to encode a message
mutual information
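A sketch of mutual information computed from a joint distribution, assuming the identity I(X;Y) = H(X) + H(Y) - H(X,Y) and a joint distribution stored as a dict from (x, y) pairs to probabilities. The two example distributions are invented for illustration.

```python
import math
from collections import defaultdict

def entropy(dist):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(joint, axis):
    """Sum the joint distribution over the other variable."""
    m = defaultdict(float)
    for pair, p in joint.items():
        m[pair[axis]] += p
    return dict(m)

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(marginal(joint, 0)) + entropy(marginal(joint, 1)) - entropy(joint)

# Independent variables share no information; a perfect copy shares everything.
independent = {(x, y): 0.25 for x in "ab" for y in "cd"}
copy = {("a", "a"): 0.5, ("b", "b"): 0.5}
```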
we try to recover the original input from a noisy output
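The recovery step is usually written as the noisy-channel decision rule below. This is the standard Bayes formulation, with x for the input and y for the observed output (symbols chosen here, not taken from the notes):

```latex
\hat{x} = \arg\max_{x} P(x \mid y)
        = \arg\max_{x} \frac{P(y \mid x)\, P(x)}{P(y)}
        = \arg\max_{x} P(y \mid x)\, P(x)
```

P(x) is the source (language) model and P(y | x) is the channel model; P(y) can be dropped because it does not depend on x.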
language modelling
smoothing
<unk>
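A minimal sketch of smoothing with an <unk> token, assuming a bigram model with add-one (Laplace) smoothing and a frequency threshold below which words are replaced by <unk>. The class name, the threshold, and the toy corpus are all hypothetical; other smoothing methods would fit the same outline.

```python
from collections import Counter

class AddOneBigramLM:
    def __init__(self, sentences, min_count=2):
        counts = Counter(w for s in sentences for w in s)
        # Words seen fewer than min_count times are mapped to <unk>.
        self.vocab = {w for w, c in counts.items() if c >= min_count}
        self.vocab |= {"<unk>", "<s>", "</s>"}
        self.bigrams = Counter()
        self.contexts = Counter()
        for s in sentences:
            tokens = ["<s>"] + [self._norm(w) for w in s] + ["</s>"]
            for prev, cur in zip(tokens, tokens[1:]):
                self.bigrams[prev, cur] += 1
                self.contexts[prev] += 1

    def _norm(self, word):
        return word if word in self.vocab else "<unk>"

    def prob(self, prev, cur):
        """P(cur | prev) with add-one smoothing over the closed vocabulary."""
        prev, cur = self._norm(prev), self._norm(cur)
        return (self.bigrams[prev, cur] + 1) / (self.contexts[prev] + len(self.vocab))

corpus = [["the", "cat", "sleeps"], ["the", "dog", "sleeps"], ["the", "cat", "eats"]]
lm = AddOneBigramLM(corpus)
```

Unseen bigrams get a small non-zero probability, and any word outside the vocabulary (for example "zebra") is scored as <unk>.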
homework
morphological annotation
POS tags
Czech positional tags of the PDT (Prague Dependency Treebank)
ancient Greek word classes
traditional parts of speech
open vs. closed classes, content vs. function words
morphological analysis
morphological analysis vs. tagging
finite-state morphology
lexicon is implemented as a FSA (trie)
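A short sketch of a lexicon stored as a trie, which behaves like a deterministic finite-state automaton: each node is a state, each character an arc, and a word is accepted if the walk ends in a final state. The class and the three-word lexicon are illustrative only.

```python
class Trie:
    def __init__(self):
        self.children = {}   # outgoing arcs: character -> next state
        self.final = False   # True if a lexicon word ends in this state

    def add(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.final = True

    def accepts(self, word):
        node = self
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False   # no arc for this character
        return node.final

lexicon = Trie()
for w in ["baby", "ball", "bat"]:
    lexicon.add(w)
```

Words with a common prefix share states, so "baby", "ball" and "bat" all pass through the same "ba" path.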
problem with phonology: baby+s → babies (not babys)
baby+0s
babi0es
finite-state transducer (Czech: převodník)
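The lexical-to-surface mapping above can be imitated by a plain function. This is not an actual finite-state transducer, only a sketch of the relation it would compute, assuming "+" marks the morpheme boundary, "0" is the empty symbol, and the y-to-i change applies only after a consonant (the usual English spelling rule). The function names are hypothetical.

```python
VOWELS = set("aeiou")

def to_surface(lexical):
    """Aligned surface string for a lexical string such as 'baby+0s'."""
    stem, sep, ending = lexical.partition("+")
    if ending == "0s" and len(stem) >= 2 and stem[-1] == "y" and stem[-2] not in VOWELS:
        # Symbol pairs y:i, +:0, 0:e, s:s
        return stem[:-1] + "i" + "0" + "es"
    # Default: the boundary maps to the empty symbol, everything else to itself.
    return stem + ("0" if sep else "") + ending

def strip_zeros(aligned):
    """Remove empty symbols to get the written word."""
    return aligned.replace("0", "")
```

With these definitions, "baby+0s" aligns with "babi0es" and is written "babies", while "boy+0s" and "cat+0s" keep their stems and come out as "boys" and "cats".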
another way of rule notation: two-level grammar
a:b <=> l:l _ r:r
lexical a must be realized as surface b in the context between l and r, and only in this context
the nej- prefix is legal only if there is a -ší suffix
syntactic annotation
surface syntax
syntactic parsers
boolean retrieval
and, or, not
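A minimal sketch of boolean retrieval over an inverted index, assuming whitespace tokenization and Python sets as posting lists. The three documents are invented; the operators and, or, not become set intersection, union and difference.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to the set of ids of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return dict(index)

docs = {
    1: "brutus killed caesar",
    2: "caesar praised calpurnia",
    3: "calpurnia warned brutus",
}
index = build_index(docs)
all_ids = set(docs)

brutus_and_caesar = index["brutus"] & index["caesar"]      # and
brutus_or_calpurnia = index["brutus"] | index["calpurnia"]  # or
not_caesar = all_ids - index["caesar"]                      # not
```

The result is an unranked set of matching documents: a document either satisfies the query or it does not.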
text processing
boolean retrieval: good for experts, good for applications, not good for the majority of users
tf-idf weighting
we want to somehow measure similarity between the query and the documents
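One common way to measure that similarity is cosine similarity between tf-idf vectors. The sketch below assumes raw term frequency, idf = log(N / df), and whitespace tokenization; other weighting variants exist, and the function names and documents are illustrative.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse tf-idf vector per document, plus the idf table."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(n / df[t]) for t in df}
    vectors = [{t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokenized]
    return vectors, idf

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rank(query, docs):
    """Document indices sorted from most to least similar to the query."""
    vectors, idf = tfidf_vectors(docs)
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(query.lower().split()).items()}
    scores = [cosine(q, v) for v in vectors]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)

docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stock markets fell sharply",
]
ranking = rank("cat mat", docs)
```

Unlike boolean retrieval, this gives a ranking: the first document matches both query terms, the second only one, and the third none.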
evaluation
representing words
representing sequences
Transformers
LM as sequence labeling
BLEU score: modified n-gram precision, with a brevity penalty standing in for recall
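A simplified sketch of sentence-level BLEU, assuming a single reference, n-grams up to bigrams, and no smoothing (the usual metric works on a whole corpus with n-grams up to 4). It computes clipped n-gram precision and multiplies the geometric mean by a brevity penalty, which penalizes candidates shorter than the reference.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=2):
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref[g]) for g, c in cand.items())
        if total == 0 or clipped == 0:
            return 0.0   # no smoothing: one empty precision zeroes the score
        precisions.append(clipped / total)
    # Brevity penalty: 1 for long candidates, exp(1 - r/c) otherwise.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

reference = "the cat is on the mat".split()
```

A candidate such as "the cat" has perfect precision but is heavily penalized for being too short, while "the the the" scores zero because none of its bigrams occur in the reference.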
phrase-based machine translation