Algorithms for Inference and Training

Previously, we implemented a Hidden Markov Model (HMM) using HidTen. Now, let’s explore the different algorithms available for inference and training.

We’ll need some observations to work with:

import numpy as np

# generate some observations
# hidten expects inputs of shape (B, T, D), where B is the batch size,
# T is the sequence length, and D is the number of features
observations = np.random.randint(0, 4, size=(4, 10))
observations = np.eye(4)[observations]  # one-hot encode, shape (4, 10, 4)

Emission scores

The simplest kind of inference we can run on an HMM is emission_scores().

hmm.emission_scores(observations)

This tells us how likely the observation at each time step is under each state's emission distribution, independently of the other time steps.

The output depends only on the emitters we have added to the model. If no emitter was added, a hidten.tf.categorical.TFCategoricalEmitter is used automatically.
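hidten's internals aren't shown here, but for a categorical emitter the score at each time step is simply the probability of the observed symbol under each state's emission distribution. A minimal NumPy sketch with a toy emission matrix (all names below are illustrative assumptions, not hidten's API):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy categorical emitter: K hidden states, D observation symbols
K, D, T = 3, 4, 10
emission = rng.dirichlet(np.ones(D), size=K)  # shape (K, D), rows sum to 1

# one observation sequence as one-hot vectors, shape (T, D)
symbols = rng.integers(0, D, size=T)
obs = np.eye(D)[symbols]

# per-step, per-state score: probability of the observed symbol under
# each state's emission distribution, independent across time steps
scores = obs @ emission.T  # shape (T, K)
```

Because the observations are one-hot, the matrix product is just a lookup of each observed symbol's probability row by row.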

Likelihood

The likelihood of the observations given the model can be computed using the likelihood_log() method. This is useful for sequence classification tasks and unsupervised learning.

hmm.likelihood_log(observations)

This computes the log-probability of the observed sequence under the model, marginalizing over all possible hidden state sequences.
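Conceptually, this marginalization is the forward algorithm run in log space. A self-contained NumPy sketch with made-up toy parameters (pi, A, B are assumptions for illustration, not hidten objects), small enough to verify against brute-force enumeration over all hidden state paths:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)
K, D, T = 2, 3, 4
pi = rng.dirichlet(np.ones(K))           # initial state distribution
A = rng.dirichlet(np.ones(K), size=K)    # transition matrix, rows sum to 1
B = rng.dirichlet(np.ones(D), size=K)    # emission matrix
symbols = rng.integers(0, D, size=T)

# forward recursion in log space: log_alpha[k] = log p(x_1..t, z_t = k)
log_alpha = np.log(pi) + np.log(B[:, symbols[0]])
for t in range(1, T):
    log_alpha = np.log(B[:, symbols[t]]) + \
        np.logaddexp.reduce(log_alpha[:, None] + np.log(A), axis=0)

# log-likelihood: marginalize over the final hidden state
log_likelihood = np.logaddexp.reduce(log_alpha)

# brute force over all K**T state paths, for verification only
brute = 0.0
for path in product(range(K), repeat=T):
    p = pi[path[0]] * B[path[0], symbols[0]]
    for t in range(1, T):
        p *= A[path[t - 1], path[t]] * B[path[t], symbols[t]]
    brute += p
```

The recursion sums over K predecessors per step, so it costs O(T·K²) instead of the O(K^T) of explicit enumeration.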

The likelihood is a by-product of the more general forward algorithm, which computes forward scores.

This algorithm exists in two variants.

The first, forward_log(), computes the forward scores in log space:

hmm.forward_log(observations)

The second, forward_scaled(), computes scaled forward scores together with the per-step scaling factors:

hmm.forward_scaled(observations)
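The scaled variant avoids numerical underflow by normalizing the forward scores at every step; the log-likelihood can then be recovered as the sum of the log scaling factors. A NumPy sketch with toy parameters (again, the names are assumptions, not hidten's internals):

```python
import numpy as np

rng = np.random.default_rng(2)
K, D, T = 3, 4, 6
pi = rng.dirichlet(np.ones(K))
A = rng.dirichlet(np.ones(K), size=K)
B = rng.dirichlet(np.ones(D), size=K)
symbols = rng.integers(0, D, size=T)

# scaled forward: normalize alpha at each step, keep the scaling factors
alphas = np.empty((T, K))
scales = np.empty(T)
alpha = pi * B[:, symbols[0]]
for t in range(T):
    scales[t] = alpha.sum()
    alphas[t] = alpha / scales[t]
    if t + 1 < T:
        alpha = (alphas[t] @ A) * B[:, symbols[t + 1]]

# log-likelihood falls out as the sum of log scaling factors
log_likelihood = np.log(scales).sum()

# unscaled recursion for comparison (safe only for short sequences)
naive = pi * B[:, symbols[0]]
for t in range(1, T):
    naive = (naive @ A) * B[:, symbols[t]]
```

For long sequences the unscaled probabilities shrink geometrically toward zero, which is exactly why the scaled (or log-space) variant exists.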

Posterior

The posterior probabilities can be computed using the posterior() method. This is useful for sequence-to-sequence classification tasks and supervised learning.

hmm.posterior(observations)

This computes the posterior probabilities of the hidden states given the observed sequence.
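Under the hood this is the forward-backward algorithm: the forward and backward scores at each step multiply and normalize into per-step state posteriors. A hedged NumPy sketch with toy parameters (variable names are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
K, D, T = 2, 3, 5
pi = rng.dirichlet(np.ones(K))
A = rng.dirichlet(np.ones(K), size=K)
B = rng.dirichlet(np.ones(D), size=K)
symbols = rng.integers(0, D, size=T)

# forward pass: alpha[t, k] = p(x_1..t, z_t = k)
alpha = np.empty((T, K))
alpha[0] = pi * B[:, symbols[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, symbols[t]]

# backward pass: beta[t, k] = p(x_t+1..T | z_t = k)
beta = np.empty((T, K))
beta[-1] = 1.0
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, symbols[t + 1]] * beta[t + 1])

# posterior state probabilities: gamma[t, k] = p(z_t = k | x_1..T)
gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)
```

Each row of gamma sums to one, and the normalizer at every step is the same quantity: the likelihood of the full sequence.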

Viterbi

The Viterbi algorithm finds the most likely sequence of hidden states given the observed sequence. The method is viterbi().

hmm.viterbi(observations)

This is useful for decoding a single best state sequence from a model.
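The classic dynamic program behind this method replaces the forward algorithm's sum with a max and keeps backpointers. A NumPy sketch with toy parameters (the names are assumptions, not hidten's API), checked against brute-force enumeration:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(4)
K, D, T = 3, 4, 6
pi = rng.dirichlet(np.ones(K))
A = rng.dirichlet(np.ones(K), size=K)
B = rng.dirichlet(np.ones(D), size=K)
symbols = rng.integers(0, D, size=T)
log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)

# delta[t, k]: best log-probability of any path ending in state k at time t
delta = np.empty((T, K))
back = np.empty((T, K), dtype=int)
delta[0] = log_pi + log_B[:, symbols[0]]
for t in range(1, T):
    trans = delta[t - 1][:, None] + log_A  # (K, K): score from state i to j
    back[t] = trans.argmax(axis=0)
    delta[t] = trans.max(axis=0) + log_B[:, symbols[t]]

# backtrack to recover the most likely state sequence
path = np.empty(T, dtype=int)
path[-1] = delta[-1].argmax()
for t in range(T - 1, 0, -1):
    path[t - 1] = back[t, path[t]]

# brute-force best path over all K**T candidates, for verification only
best = max(product(range(K), repeat=T),
           key=lambda p: log_pi[p[0]] + log_B[p[0], symbols[0]] +
           sum(log_A[p[t - 1], p[t]] + log_B[p[t], symbols[t]]
               for t in range(1, T)))
```

Like the forward algorithm, Viterbi costs O(T·K²); only the backpointer table and final backtrack are extra.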

Maximum expected accuracy

Maximum expected accuracy (MEA) decoding chooses states so as to maximize the expected number of correctly labeled positions, rather than the probability of the whole sequence. The method is maximum_expected_accuracy().

hmm.maximum_expected_accuracy(observations)

This is a useful alternative to Viterbi decoding when per-position accuracy matters more than the joint probability of the full path.
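In its simplest form, MEA decoding picks the individually most probable state at each step from the forward-backward posteriors; the resulting sequence can differ from the Viterbi path (and may even contain transitions the model considers unlikely). A NumPy sketch under that assumption, with toy parameters not tied to hidten's API:

```python
import numpy as np

rng = np.random.default_rng(5)
K, D, T = 2, 3, 5
pi = rng.dirichlet(np.ones(K))
A = rng.dirichlet(np.ones(K), size=K)
B = rng.dirichlet(np.ones(D), size=K)
symbols = rng.integers(0, D, size=T)

# forward-backward to obtain posterior state probabilities
alpha = np.empty((T, K))
alpha[0] = pi * B[:, symbols[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, symbols[t]]
beta = np.empty((T, K))
beta[-1] = 1.0
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, symbols[t + 1]] * beta[t + 1])
gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)

# MEA decoding: take the most probable state at each step independently,
# maximizing the expected number of correctly labeled positions
mea_path = gamma.argmax(axis=1)
```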