
Exam

Basic notions from probability and information theory

What are the three basic properties of a probability function?
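
Answer sketch (one standard formulation, the Kolmogorov axioms; notation is my own):

    P(A) \ge 0 for every event A;  P(\Omega) = 1;
    P(A \cup B) = P(A) + P(B) whenever A \cap B = \emptyset (additivity).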

Basic notions from probability and information theory

When do we say that two events are (statistically) independent?
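
Answer sketch (assuming the standard definition):

    A and B are independent iff P(A \cap B) = P(A) \, P(B),
    equivalently P(A \mid B) = P(A) when P(B) > 0.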

Basic notions from probability and information theory

Show how Bayes' Theorem can be derived.
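
Answer sketch of the derivation, starting from the definition of conditional probability:

    P(A \mid B) = \frac{P(A, B)}{P(B)}  and  P(A, B) = P(B \mid A) \, P(A),
    hence  P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}.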

Basic notions from probability and information theory

Explain the Chain Rule.
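
Answer sketch, in the form usually used for word sequences:

    P(w_1, \ldots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \ldots, w_{i-1})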

Basic notions from probability and information theory

Explain the notion of Entropy (formula expected too).
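
Answer sketch (base-2 logarithm assumed, giving entropy in bits):

    H(X) = - \sum_{x} p(x) \log_2 p(x)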

Basic notions from probability and information theory

Explain Kullback-Leibler distance (formula expected too).
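
Answer sketch (a non-symmetric "distance" between distributions p and q; base-2 logarithm assumed):

    D(p \| q) = \sum_{x} p(x) \log_2 \frac{p(x)}{q(x)}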

Basic notions from probability and information theory

Explain Mutual Information (formula expected too).
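
Answer sketch (mutual information as the KL distance between the joint distribution and the product of the marginals):

    I(X; Y) = \sum_{x, y} p(x, y) \log_2 \frac{p(x, y)}{p(x) \, p(y)} = D\big(p(x, y) \,\|\, p(x) \, p(y)\big)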

Language models, the noisy channel model

Explain the notion of the Noisy Channel.

Language models, the noisy channel model

Explain the notion of the n-gram language model.

Language models, the noisy channel model

Describe how Maximum Likelihood estimate of a trigram language model is computed. (2 points)
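
Answer sketch (c(\cdot) denotes counts in the training data):

    p_{ML}(w_3 \mid w_1, w_2) = \frac{c(w_1, w_2, w_3)}{c(w_1, w_2)}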

Language models, the noisy channel model

Why do we need smoothing (in language modelling)?

Language models, the noisy channel model

Give at least two examples of smoothing methods. (2 points)

Morphological analysis

What is a morphological tag? List at least five features that are often encoded in morphological tag sets.

Morphological analysis

List the open and closed part-of-speech classes and explain the difference between open and closed classes.

Morphological analysis

Explain the difference between a finite-state automaton and a finite-state transducer. Describe the algorithm of using a finite-state transducer to transform a surface string to a lexical string (pseudocode or source code in your favorite programming language). (2 points)
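
A minimal Python sketch of the lookup algorithm (my own illustrative encoding of the transducer as a dict of arcs; a real analyzer would add weights, composition, cycle handling, etc.):

    def transduce(fst, start, finals, surface):
        """Return all lexical strings the transducer can produce for
        the given surface string.
        fst: dict state -> list of (surface_sym, lexical_sym, next_state);
        surface_sym == '' is an epsilon arc (emits without consuming).
        Assumes the transducer has no epsilon cycles."""
        results = []

        def walk(state, pos, out):
            if pos == len(surface) and state in finals:
                results.append(''.join(out))
            for s_sym, l_sym, nxt in fst.get(state, []):
                if s_sym == '':                        # epsilon: emit only
                    walk(nxt, pos, out + [l_sym])
                elif pos < len(surface) and s_sym == surface[pos]:
                    walk(nxt, pos + 1, out + [l_sym])  # consume one symbol

        walk(start, 0, [])
        return results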

Morphological analysis

Give an example of a phonological or an orthographical change caused by morphological inflection (any natural language). Describe the rule that would take care of the change during analysis or generation. It is not required that you draw a transducer, although drawing a transducer is one of the possible ways of describing the rule.

Morphological analysis

Give an example of a long-distance dependency in morphology (any natural language). How would you handle it in a morphological analyzer?

Syntactic analysis

Describe dependency trees, constituent trees, differences between them and phenomena that must be addressed when converting between them. (2 points)

Syntactic analysis

Give an example of a sentence (in any natural language) that has at least two plausible, semantically different syntactic analyses (readings). Draw the corresponding dependency trees and explain the difference in meaning. Are there other additional readings that are less probable but still grammatically acceptable? (2 points)

Syntactic analysis

What is coordination? Why is it difficult in dependency parsing? How would you capture coordination in a dependency structure? What are the advantages and disadvantages of your solution?

Syntactic analysis

What is ellipsis? Why is it difficult in parsing? Give examples of different kinds of ellipsis (any natural language).

Information retrieval

Explain the difference between information need and query.

Information retrieval

What is an inverted index and what are the optimal data structures for it?
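
A minimal Python sketch of the core mapping (illustrative only; real systems use sorted, compressed posting lists and dictionary structures such as hashes or B-trees):

    from collections import defaultdict

    def build_inverted_index(docs):
        """Map each term to a sorted list of IDs of the documents
        that contain it (the posting list)."""
        index = defaultdict(set)
        for doc_id, text in enumerate(docs):
            for term in text.lower().split():
                index[term].add(doc_id)
        return {term: sorted(ids) for term, ids in index.items()}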

Information retrieval

What is a stopword and what is it useful for?

Information retrieval

Explain the bag-of-words principle.

Information retrieval

What are the main advantage and disadvantage of the Boolean model?

Information retrieval

Explain the role of the two components in the TF-IDF weighting scheme.
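
Answer sketch (one common variant; tf_{t,d} is the frequency of term t in document d, df_t the number of documents containing t, N the collection size):

    w_{t,d} = \mathrm{tf}_{t,d} \cdot \log \frac{N}{\mathrm{df}_t}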

Information retrieval

Explain length normalization in the vector space model. What is it useful for?

Language data resources

Explain what a corpus is.

Language data resources

Explain what annotation is (in the context of language resources). What types of annotation do you know? (2 points)

Language data resources

What are the reasons for the variability of even basic types of annotation, such as the annotation of morphological categories (parts of speech etc.)?

Language data resources

Explain what a treebank is. Why are trees used? (2 points)

Language data resources

Explain what a parallel corpus is. What kind of alignments can we distinguish? (2 points)

Language data resources

What is a sentiment-annotated corpus? How can it be used?

Language data resources

What is a coreference-annotated corpus?

Language data resources

Explain how WordNet is structured.

Language data resources

Explain the difference between derivation and inflection.

Evaluation measures in NLP

Give at least two examples of situations in which measuring a percentage accuracy is not adequate.

Evaluation measures in NLP

Explain: precision, recall
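
Answer sketch in terms of true/false positives/negatives:

    P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}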

Evaluation measures in NLP

What is the F-measure and what is it useful for?
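
Answer sketch (weighted harmonic mean of precision and recall; \beta trades off recall against precision):

    F_\beta = \frac{(1 + \beta^2) \, P \, R}{\beta^2 P + R}, \qquad F_1 = \frac{2 P R}{P + R}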

Evaluation measures in NLP

What is k-fold cross-validation?
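
A minimal Python sketch of the splitting scheme (illustrative only; library implementations such as scikit-learn's KFold also offer shuffling and stratification):

    def k_fold_splits(n, k):
        """Yield (train_indices, test_indices) for k-fold cross-validation:
        each of the k folds serves once as the test set while the
        remaining k-1 folds form the training set."""
        folds = [list(range(i, n, k)) for i in range(k)]
        for i in range(k):
            test = folds[i]
            train = [j for f in folds[:i] + folds[i + 1:] for j in f]
            yield train, test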

Evaluation measures in NLP

Explain BLEU (the exact formula not needed, just the main principles).

Evaluation measures in NLP

Explain the purpose of brevity penalty in BLEU.

Evaluation measures in NLP

What is Labeled Attachment Score (in parsing)?
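
Answer sketch:

    \mathrm{LAS} = \frac{\#\,\text{tokens with both the correct head and the correct dependency label}}{\#\,\text{tokens}}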

Evaluation measures in NLP

What is Word Error Rate (in speech recognition)?
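
Answer sketch (S, D, I are the substitutions, deletions and insertions in the edit-distance alignment against a reference of N words; note that WER can exceed 100%):

    \mathrm{WER} = \frac{S + D + I}{N}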

Evaluation measures in NLP

What is inter-annotator agreement? How can it be measured?

Evaluation measures in NLP

What is Cohen's kappa?
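
Answer sketch (p_o is the observed agreement, p_e the agreement expected by chance):

    \kappa = \frac{p_o - p_e}{1 - p_e}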

Deep learning for NLP

Describe the two methods for training the Word2Vec model.

Deep learning for NLP

Use formulas to describe how Word2Vec is trained with negative sampling. (2 points)
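
Answer sketch (skip-gram with negative sampling: for a center word w with observed context word c and k negative samples c_i drawn from a noise distribution P_n, maximize):

    \log \sigma(v_c^\top v_w) + \sum_{i=1}^{k} \mathbb{E}_{c_i \sim P_n} \big[ \log \sigma(-v_{c_i}^\top v_w) \big]

where \sigma is the logistic sigmoid and v are the word/context embedding vectors.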

Deep learning for NLP

Explain the difference between Word2Vec and FastText embeddings.

Deep learning for NLP

Sketch the structure of the Transformer model. (2 points)

Deep learning for NLP

Why do we use positional encodings in the Transformer model?

Deep learning for NLP

What are residual connections in neural networks? Why do we use them?

Deep learning for NLP

Use formulas to express the loss function for training sequence labeling tasks.
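
Answer sketch (per-token cross-entropy over a sentence x_1, ..., x_n with gold labels y_1, ..., y_n):

    L(\theta) = - \sum_{i=1}^{n} \log p_\theta(y_i \mid x_1, \ldots, x_n)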

Deep learning for NLP

Explain the pre-training procedure of the BERT model. (2 points)

Deep learning for NLP

Explain the pre-train and fine-tune paradigm in NLP.

Deep learning for NLP

Describe the task of named entity recognition (NER). Explain the intuition behind CRF models compared to standard sequence labeling. (2 points)

Deep learning for NLP

Explain how self-attention differs in encoder-only and decoder-only models.

Machine translation fundamentals

Why is MT difficult from a linguistic point of view? Provide examples and explanations for at least three different phenomena. (2 points)

Machine translation fundamentals

Why is MT difficult from a computational point of view?

Machine translation fundamentals

Briefly describe at least three methods of manual MT evaluation. (1-2 points)

Machine translation fundamentals

Describe BLEU. 1 point for the core properties explained, 1 point for the (commented) formula.
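
Answer sketch (p_n are the modified/clipped n-gram precisions, typically with uniform weights w_n = 1/4 for n = 1..4; BP is the brevity penalty with candidate length c and reference length r):

    \mathrm{BLEU} = \mathrm{BP} \cdot \exp\Big( \sum_{n=1}^{4} w_n \log p_n \Big), \qquad \mathrm{BP} = \min\big(1, \, e^{1 - r/c}\big)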

Machine translation fundamentals

Describe IBM Model 1 for word alignment, highlighting the EM structure of the algorithm.
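
A minimal Python sketch of the EM loop (my own compact illustration; real implementations add convergence checks, smoothing and more careful NULL handling):

    from collections import defaultdict

    def ibm_model1(corpus, iterations=10):
        """corpus: list of (source_words, target_words) sentence pairs.
        Returns t[(f, e)], the translation probability of source word f
        given target word e (including a NULL target word)."""
        f_vocab = {f for fs, _ in corpus for f in fs}
        uniform = 1.0 / len(f_vocab)
        t = defaultdict(lambda: uniform)        # uniform initialization
        for _ in range(iterations):
            count = defaultdict(float)          # expected counts c(f, e)
            total = defaultdict(float)          # expected counts c(e)
            for fs, es in corpus:
                es = ['NULL'] + es              # allow alignment to nothing
                for f in fs:
                    # E-step: distribute one count for f over all e,
                    # proportionally to the current t(f | e)
                    z = sum(t[(f, e)] for e in es)
                    for e in es:
                        c = t[(f, e)] / z
                        count[(f, e)] += c
                        total[e] += c
            # M-step: re-normalize expected counts into probabilities
            for (f, e), c in count.items():
                t[(f, e)] = c / total[e]
        return t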

Machine translation fundamentals

Explain, using equations, the relation between the noisy channel model and the log-linear model for classical statistical MT. (2 points)
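
Answer sketch. The noisy channel decoder

    \hat{e} = \arg\max_e P(e \mid f) = \arg\max_e P(f \mid e) \, P(e)

is the special case of the log-linear model

    \hat{e} = \arg\max_e \sum_{m} \lambda_m h_m(e, f)

with two features h_1 = \log P(f \mid e), h_2 = \log P(e) and weights \lambda_1 = \lambda_2 = 1.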

Machine translation fundamentals

Describe the loop of weight optimization for the log-linear model as used in phrase-based MT.

Neural machine translation

Describe the critical limitation of PBMT that NMT solves. Provide example training data and example input where PBMT is very likely to introduce an error.

Neural machine translation

Use formulas to highlight the similarity of NMT and LMs.
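
Answer sketch. A language model factorizes

    P(e_1, \ldots, e_n) = \prod_{i=1}^{n} P(e_i \mid e_{<i})

and an NMT model is the same left-to-right factorization, additionally conditioned on the source sentence f:

    P(e \mid f) = \prod_{i=1}^{n} P(e_i \mid e_{<i}, f)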

Neural machine translation

Describe how words are fed to current NMT architectures and explain why this is beneficial over a one-hot representation.

Neural machine translation

Sketch the structure of an encoder-decoder architecture for neural MT; remember to describe the components in the picture. (2 points)

Neural machine translation

What is the difference in RNN decoder application at training time vs. at runtime?

Neural machine translation

What problem does attention in NMT address? Provide the key idea of the method.

Neural machine translation

What problem/task do both RNN and self-attention resolve and what is the main benefit of self-attention over RNN?

Neural machine translation

What are the three roles each state at a Transformer encoder layer takes in self-attention?
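
Answer sketch (each state acts as query, key and value; scaled dot-product attention with key dimension d_k):

    \mathrm{Attention}(Q, K, V) = \mathrm{softmax}\Big( \frac{Q K^\top}{\sqrt{d_k}} \Big) V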

Neural machine translation

What are the three uses of self-attention in the Transformer model?

Neural machine translation

Provide an example of NMT improvement that was assumed to come from additional linguistic information but occurred also for a simpler reason.

Neural machine translation

Summarize and compare the strategy of "classical statistical MT" vs. the strategy of neural approaches to MT.

Hooray, you're done! 🎉
If my flashcards helped you, you can buy me a beer.