Emitters

Understanding and using emission models in HidTen.

Emitters are components that compute emission probabilities or scores for observations given hidden states. They define how likely each observation is to be emitted from each state.

A hidten.hmm.HMM can have multiple emitters, each responsible for a different aspect of the observation space. New emitters can be added with add_emitter(). When using multiple emitters, one input track per emitter must be provided to the Algorithms for Inference and Training, in the same order as the emitters.
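To make this concrete: conceptually, each emitter scores its own input track per state, and the per-state scores of independent tracks combine multiplicatively. The following plain NumPy sketch illustrates that combination for a single time step; it is an illustration only, not HidTen's actual internals:

```python
import numpy as np

# Hypothetical per-state emission probabilities from two emitters for one
# time step of a 3-state head: emitter 1 scores track A, emitter 2 track B.
track_a_scores = np.array([0.5, 0.3, 0.2])  # P(obs_a | state) per state
track_b_scores = np.array([0.1, 0.6, 0.3])  # P(obs_b | state) per state

# For independent tracks, the combined emission score per state is the
# element-wise product of the per-emitter scores.
combined = track_a_scores * track_b_scores
print(combined)
```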

Categorical

hidten.tf.emitter.categorical.TFCategoricalEmitter handles discrete observations with categorical distributions. Like everything in HidTen, CategoricalEmitters support multiple heads, i.e. parallel models, each with its own set of parameters and states.

In the following example, a CategoricalEmitter is created for a discrete alphabet {0, 1, 2, 3} and 2 HMM heads with 3 states each:

from hidten.tf import TFHMM
from hidten.tf.emitter.categorical import TFCategoricalEmitter

hmm = TFHMM(states=[3, 3])

# Create categorical emitter for discrete alphabet {0, 1, 2, 3}
emitter = TFCategoricalEmitter()
hmm.add_emitter(emitter)
emitter.initializer = [
    # head 1
    0.25, 0.25, 0.25, 0.25,  # state 1
    0.4, 0.2, 0.2, 0.2,      # state 2
    0., 0.5, 0.5, 0.,        # state 3
    # head 2
    0.3, 0.3, 0.2, 0.2,      # state 1
    0.5, 0.1, 0.2, 0.2,      # state 2
    0., 0.4, 0.4, 0.2,       # state 3
]

hmm.build((None, None, 4))

print(hmm.emitter[0].matrix())
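Given such a matrix, the categorical emission score for a one-hot observation is simply a row lookup. The following plain NumPy sketch uses the head-1 values from the initializer above; it illustrates what the lookup computes, not HidTen's internal implementation:

```python
import numpy as np

# Head 1 emission matrix from the example: 3 states x 4 symbols.
matrix = np.array([
    [0.25, 0.25, 0.25, 0.25],
    [0.4,  0.2,  0.2,  0.2 ],
    [0.0,  0.5,  0.5,  0.0 ],
])

# One-hot encoding of symbol 1; the per-state emission probabilities
# reduce to a matrix-vector product.
obs = np.array([0.0, 1.0, 0.0, 0.0])
per_state = matrix @ obs
print(per_state)  # probability of emitting symbol 1 from each state
```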

Continuous

hidten.tf.emitter.multivariate_normal.TFMVNormalEmitter handles continuous observations with multivariate normal distributions.

from hidten.tf import TFHMM
from hidten.tf.emitter.multivariate_normal import TFMVNormalEmitter

hmm = TFHMM(states=[2, 2])

# Create multivariate normal emitter for 3-dimensional observations
emitter = TFMVNormalEmitter()
hmm.add_emitter(emitter)
emitter.initializer = [
    # head 1, state 1
    # means
    0.0, 0.0, 0.0,
    # variances
    0.9, 0.5, 1.0,
    # head 1, state 2
    # means
    0.5, 0.4, 0.6,
    # variances
    1.2, 0.5, 1.0,
    # head 2, state 1
    # means
    1.0, 2.0, 3.0,
    # variances
    0.9, 0.7, 1.0,
    # head 2, state 2
    # means
    0.7, 0.2, 0.3,
    # variances
    1.0, 1.0, 1.0,
]

hmm.build((None, None, 3))

print(hmm.emitter[0].matrix())
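For reference, the density a diagonal-covariance normal emitter evaluates can be written out directly. The means and variances below are the head-1, state-1 values from the example; this is a plain NumPy illustration of the formula, not HidTen's implementation:

```python
import numpy as np

def diag_normal_pdf(x, means, variances):
    """Density of a multivariate normal with diagonal covariance."""
    norm = np.prod(1.0 / np.sqrt(2.0 * np.pi * variances))
    expo = np.exp(-0.5 * np.sum((x - means) ** 2 / variances))
    return norm * expo

# Head 1, state 1 parameters from the example above.
means = np.array([0.0, 0.0, 0.0])
variances = np.array([0.9, 0.5, 1.0])

x = np.array([0.1, -0.2, 0.3])
print(diag_normal_pdf(x, means, variances))
```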

Padding

When the input sequences have variable length, they can be padded with zeros. In that case, a hidten.tf.emitter.base.TFPaddingEmitter must be added to the model; it expects the binary padding as a separate input track, where False or 0 marks a padding position. Padding positions are ignored in the Algorithms for Inference and Training.

Note: HidTen only supports padding sequences to the right.

The following code creates an HMM with 2 states and runs it on input sequences with padding.

import numpy as np

from hidten import HMMMode
from hidten.tf import TFHMM
from hidten.tf.emitter.categorical import TFCategoricalEmitter
from hidten.tf.emitter import TFPaddingEmitter

hmm = TFHMM(states=2)
hmm.add_emitter(TFCategoricalEmitter())
hmm.emitter[0].allow = [(0, 0), (1, 1)]
hmm.add_emitter(TFPaddingEmitter())
observations = np.array([
    [ [1., 0], [1, 0], [0, 1], [0, 0] ],
    [ [0., 1], [1, 0], [1, 0], [1, 0] ]
])
padding = np.array([[1, 1, 1, 0], [1, 1, 1, 1]])
posterior = hmm(observations, padding, mode=HMMMode.POSTERIOR)
<tf.Tensor: shape=(2, 4, 1, 3), dtype=float32, numpy=
array([[[[1., 0., 0.]],

        [[1., 0., 0.]],

        [[0., 1., 0.]],

        [[0., 0., 1.]]],


       [[[0., 1., 0.]],

        [[1., 0., 0.]],

        [[1., 0., 0.]],

        [[1., 0., 0.]]]], dtype=float32)>

Under the hood, the PaddingEmitter creates a new state (last channel) that will be used only for padding positions. Notice how the previously defined HMM has 2 states, but HMMMode.POSTERIOR returns shape […, 3].
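The effect on the posterior can be mimicked in plain NumPy: every position flagged 0 in the padding track ends up with all of its mass on the appended padding state. This is a conceptual sketch of the result, not the actual mechanism:

```python
import numpy as np

# Posterior over the 2 real states for one sequence of length 4.
posterior = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],  # value here is irrelevant: this position is padded
])
padding = np.array([1, 1, 1, 0])  # 0 marks a padding position

# Append a padding state and route all mass of padded positions to it.
mask = padding.astype(bool)
padded = np.zeros((posterior.shape[0], posterior.shape[1] + 1))
padded[mask, :-1] = posterior[mask]
padded[~mask, -1] = 1.0
print(padded)
```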

Allowing and sharing

Emissions of specific symbols at specific states can be allowed or disallowed using the hidten.emitter.Emitter.allow property.

With the hidten.emitter.Emitter.share property, ranges of consecutive emission parameters can be tied together so that they share a single value.

In the following example, an emitter is created that only allows specific symbols at specific states and shares parameters between some of them:

from hidten import HMMConfig
from hidten.tf.categorical import TFCategoricalEmitter

emitter = TFCategoricalEmitter()
emitter.hmm_config = HMMConfig(states=[3, 2])

emitter.allow = [
    # head 1
    (0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0), (0, 1, 2), (0, 2, 3),
    # head 2
    (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1), (1, 1, 2)
]

# this shares the parameters of the three emissions allowed from state 1
# in head 1 and of the two emissions allowed from state 2 in head 1
emitter.share = [(0, 3), (3, 5)]

emitter.initializer = [
    0.2, # shared 3 times, but will be rescaled to sum to one
    0.5, # shared 2 times
    1.,
    0.3, 0.7,
    0.2, 0.3, 0.5
]

emitter.build((None, None, 4))

print(emitter.matrix())
tf.Tensor(
[[[0.33333334 0.33333334 0.33333334 0.        ]
  [0.5        0.         0.5        0.        ]
  [0.         0.         0.         1.        ]]

 [[0.29999998 0.7        0.         0.        ]
  [0.2        0.3        0.5        0.        ]
  [0.         0.         0.         0.        ]]], shape=(2, 3, 4), dtype=float32)

Side note: This example demonstrates how to use an emitter without it being part of an HMM, which is done by providing a hidten.hmm.HMMConfig.

Creating Custom Emitters

To create custom emitters, inherit from hidten.tf.emitter.base.TFEmitter:

from hidten.tf.emitter import TFEmitter

class CustomEmitter(TFEmitter):
    ...
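Until this section is completed, the following plain NumPy sketch only illustrates the concept an emitter implements: mapping an observation to one score per state. The Poisson emitter here is entirely hypothetical and independent of the actual TFEmitter interface:

```python
import numpy as np
from math import factorial

# Hypothetical Poisson emitter: one rate per state (2 states).
rates = np.array([1.0, 5.0])

def poisson_scores(count):
    """P(count | state) for every state, under the Poisson distribution."""
    return np.exp(-rates) * rates ** count / factorial(count)

# Per-state emission probabilities for observing the count 3.
print(poisson_scores(3))
```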

TODO: This section is incomplete.