Lecture

multi-core processor
- is it a distributed system? no
- the whole processor crashes (we cannot have one core working and other one crashed)
partial failure
- in a distributed system, the components may fail independently
- „distribuovaný systém je to tehdy, když můj počítač přestane fungovat, protože se rozbil počítač, o kterém jsem ani nevěděl, že existuje“
sharing information, taking a common decision
why do we want/have a distributed system?
- users are in different places → it is a distributed system by nature
- better performance, stability (if one server fails, we have another)

Ordering of Events

notation
- set of processes $P=\set{p_1,\dots,p_n}$
- they communicate through channels (channel from $p_i$ to $p_j$ … $C_{ij}$ )
- event $\# k$ $# k$ on process $p_i$ $p_{i}$ … $e_i^k$ $e_{i}^{k}$
  - event changes the state of the process
- states of the process $\sigma_i^0,\sigma_i^1,\dots$ $σ_{i}^{0}, σ_{i}^{1}, \dots$
  - event $e_i^1$ performs the transition from $\sigma^0_i$ to $\sigma_i^1$ on process $p_i$
- local history $h_i=e^1_i e^2_i e^3_i \dots$
- global history $H=h_1\cup h_2\cup h_3\dots$
happened_before ( $\to$ $\to$ ) relation between two events
- first case: $e_i^k\to e_i^\ell$ iff $k\lt\ell$
- second case: send( $m$ $m$ ) $\to$ $\to$ recv( $m$ $m$ )
  - we can receive a message $m$ only after we send it
- transitivity: $e\to e'\land e'\to e''\implies e\to e''$
- antisymmetry: $e\to e'\implies e'\not\to e$
- it is a partial ordering of events
- if there is no relation between $e$ $e$ and $e'$ $e^{'}$ , we say they are concurrent
  - $e||e'$
global state $(\sigma_1^{k_1},\sigma_2^{k_2},\sigma_3^{k_3},\dots)$ $(σ_{1}^{k_{1}}, σ_{2}^{k_{2}}, σ_{3}^{k_{3}}, \dots)$
- global state is composed of the local state of every process
cut
- cut $C$ … subset of the global history $H$ that includes a prefix of each local history
- consistent cut … the global state could have existed
  - cut $C$ is consistent iff $e'\in C\land e\to e'\implies e\in C$
Chandy-Lamport
- we assume FIFO channels
- to save a snapshot, we need an initiator process which is the first to save its state and broadcast the message SNAPSHOT
- when process $p_i$ receives the SNAPSHOT message, it saves its state and also broadcasts SNAPSHOT (there are no other events in between)
- the state of channel $c_{ji}$ corresponds to the messages that the process $p_i$ received from $p_j$ between broadcasting SNAPSHOT and receiving SNAPSHOT from $c_j$
- after $p_i$ receives SNAPSHOT from all processes, it knows that the computation of the snapshot is terminated

Time

how to put timestamps on events to have guarantees that one event happened before another event?
- I would like a timestamp function $TS$ such that $e\to e'\iff TS(e)\lt TS(e')$
- we are assuming an asynchronous system
  - no bound on message delays
  - no bound on relative speed of processes
Lamport Clock $LC$ $L C$
- algorithm
  - initial state … $LC_i\leftarrow 0$
  - for any internal event … $LC_i\leftarrow LC_i+1$
  - on $\mathrm{send}(m)$ $send (m)$ … $LC_i\leftarrow LC_i+1$ $L C_{i} \leftarrow L C_{i} + 1$
    - - attach $LC_i$ to $m$ , so that $ts(m)=LC_i(\mathrm{send}(m))$
  - when $p_j$ $p_{j}$ receives $m$ $m$
    - $LC_j\leftarrow\max(LC_j,ts(m))+1$
- what can we say about LC?
  - $e\to e'\implies LC(e)\lt LC(e')$ $e \to e^{'} ⟹ L C (e) < L C (e^{'})$
    - we don't have $\impliedby$
  - $LC_i(e)$ corresponds to the length of the longest chain of events leading to $e$
vector clocks
- definition
  - for $i=j:VC(e_i)[j]=$ number of events on $p_i$ up to and including $e_i$
  - for $i\neq j:VC(e_i)[j]=$ number of events on $p_j$ that happened_before $e_i$
- algorithm
  - if $e_i$ $e_{i}$ is internal or $\mathrm{send}(m)$ $send (m)$
    - $VC(e_i)[i]\leftarrow VC_i[i]+1$
    - $VC(e_i)[j]\leftarrow VC_i[j]$
  - if $e_i$ $e_{i}$ is $\mathrm{recv}(m)$ $recv (m)$
    - $VC(e_i)\leftarrow\max(VC_i,ts(m))$
    - $VC(e_i)[i]\leftarrow VC(e_i)[i]+1$