Emitters
Understanding and using emission models in HidTen.
Emitters are components that compute emission probabilities or scores for observations given hidden states. They define how likely each observation is to be emitted from each state.
A hidten.hmm.HMM can have multiple emitters, each responsible for a different aspect of the observation space.
New emitters can be added with add_emitter().
When using multiple emitters, one input track per emitter must be provided to the Algorithms for Inference and Training, in the same order as the emitters.
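To make this concrete, here is a minimal NumPy sketch (independent of HidTen; the names are illustrative) of what an emission model computes: a per-state score for each observation in a sequence.

```python
import numpy as np

# Emission matrix B: rows = hidden states, columns = observation symbols.
# B[s, o] is the probability that state s emits symbol o.
B = np.array([
    [0.7, 0.2, 0.1],  # state 0
    [0.1, 0.1, 0.8],  # state 1
])

observations = [0, 2, 2, 1]   # a sequence of discrete symbols
scores = B[:, observations]   # shape (states, time): per-state emission probabilities
print(scores)
```

These per-state scores are exactly what inference algorithms such as forward-backward consume at each time step.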
Categorical
hidten.tf.emitter.categorical.TFCategoricalEmitter handles discrete observations with categorical distributions.
Like everything in HidTen, categorical emitters support multiple heads, i.e. parallel models, each with its own set of parameters and states.
In the following example, a CategoricalEmitter is created for a discrete alphabet {0, 1, 2, 3} and 2 HMM heads with 3 states each:
from hidten.tf import TFHMM
from hidten.tf.categorical import TFCategoricalEmitter
hmm = TFHMM(states=[3, 3])
# Create categorical emitter for discrete alphabet {0, 1, 2, 3}
emitter = TFCategoricalEmitter()
hmm.add_emitter(emitter)
emitter.initializer = [
    # head 1
    0.25, 0.25, 0.25, 0.25,  # state 1
    0.4, 0.2, 0.2, 0.2,      # state 2
    0., 0.5, 0.5, 0.,        # state 3
    # head 2
    0.3, 0.3, 0.2, 0.2,      # state 1
    0.5, 0.1, 0.2, 0.2,      # state 2
    0., 0.4, 0.4, 0.2,       # state 3
]
hmm.build((None, None, 4))
print(hmm.emitter[0].matrix())
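The flat initializer above is interpreted as a (heads, states, symbols) tensor whose rows are normalized into probability distributions. A NumPy sketch of this interpretation (illustrative only, not HidTen code):

```python
import numpy as np

# Flat initializer for 2 heads x 3 states x 4 symbols, as in the example above.
init = [
    0.25, 0.25, 0.25, 0.25,
    0.4, 0.2, 0.2, 0.2,
    0., 0.5, 0.5, 0.,
    0.3, 0.3, 0.2, 0.2,
    0.5, 0.1, 0.2, 0.2,
    0., 0.4, 0.4, 0.2,
]
matrix = np.reshape(init, (2, 3, 4))                  # (heads, states, symbols)
matrix = matrix / matrix.sum(axis=-1, keepdims=True)  # each row sums to one
print(matrix.shape)  # (2, 3, 4)
```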
Continuous
hidten.tf.emitter.multivariate_normal.TFMVNormalEmitter handles continuous
observations with multivariate normal distributions.
from hidten.tf import TFHMM
from hidten.tf.multivariate_normal import TFMVNormalEmitter
hmm = TFHMM(states=[2, 2])
# Create multivariate normal emitter for 3-dimensional observations
emitter = TFMVNormalEmitter()
hmm.add_emitter(emitter)
emitter.initializer = [
    # head 1, state 1
    # means
    0.0, 0.0, 0.0,
    # variances
    0.9, 0.5, 1.0,
    # head 1, state 2
    # means
    0.5, 0.4, 0.6,
    # variances
    1.2, 0.5, 1.0,
    # head 2, state 1
    # means
    1.0, 2.0, 3.0,
    # variances
    0.9, 0.7, 1.0,
    # head 2, state 2
    # means
    0.7, 0.2, 0.3,
    # variances
    1.0, 1.0, 1.0,
]
hmm.build((None, None, 3))
print(hmm.emitter[0].matrix())
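Assuming each state is parameterized by per-dimension means and variances (i.e. a diagonal covariance, as the initializer layout above suggests), the log-density such an emitter evaluates can be sketched in plain NumPy:

```python
import numpy as np

def diag_normal_logpdf(x, mean, var):
    """Log-density of a multivariate normal with diagonal covariance."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

# Parameters of head 1, state 1 from the example above.
mean = np.array([0.0, 0.0, 0.0])
var = np.array([0.9, 0.5, 1.0])

x = np.array([0.1, -0.2, 0.3])
print(diag_normal_logpdf(x, mean, var))
```

The function name and signature are hypothetical; HidTen computes an equivalent quantity internally for every head, state, and time step.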
Padding
When input sequences have variable lengths, they can be padded with zeros.
A hidten.tf.emitter.base.TFPaddingEmitter must be added to the model. It expects the binary padding as a separate input track, where False or 0 marks a padding position. Padding positions are ignored in the Algorithms for Inference and Training.
Note: HidTen only supports padding sequences to the right.
The following code creates an HMM with 2 states and passes an input sequence with padding.
import numpy as np
from hidten import HMMMode
from hidten.tf import TFHMM
from hidten.tf.categorical import TFCategoricalEmitter
from hidten.tf.emitter import TFPaddingEmitter
hmm = TFHMM(states=2)
hmm.add_emitter(TFCategoricalEmitter())
hmm.emitter[0].allow = [(0, 0), (1, 1)]
hmm.add_emitter(TFPaddingEmitter())
observations = np.array([
    [[1., 0.], [1., 0.], [0., 1.], [0., 0.]],
    [[0., 1.], [1., 0.], [1., 0.], [1., 0.]],
])
padding = np.array([[1, 1, 1, 0], [1, 1, 1, 1]])
posterior = hmm(observations, padding, mode=HMMMode.POSTERIOR)
<tf.Tensor: shape=(2, 4, 1, 3), dtype=float32, numpy=
array([[[[1., 0., 0.]],
[[1., 0., 0.]],
[[0., 1., 0.]],
[[0., 0., 1.]]],
[[[0., 1., 0.]],
[[1., 0., 0.]],
[[1., 0., 0.]],
[[1., 0., 0.]]]], dtype=float32)>
Under the hood, the PaddingEmitter creates a new state (last channel) that will be used only for padding positions. Notice how the previously defined HMM has 2 states, but HMMMode.POSTERIOR returns shape […, 3].
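This mechanism can be illustrated with a small NumPy sketch (not HidTen's actual implementation): the real states get zero emission mass at padded positions, while an extra padding state absorbs all the mass there.

```python
import numpy as np

# Per-state emission scores for one sequence, shape (time, states).
scores = np.array([
    [0.7, 0.3],
    [0.6, 0.4],
    [0.2, 0.8],
    [0.5, 0.5],  # this position is padding
])
padding = np.array([1, 1, 1, 0])  # 0 marks a padded position

# Zero out real states at padded positions and append a dedicated
# padding state (last channel) that is active only there.
masked = scores * padding[:, None]
pad_state = (1 - padding)[:, None]
augmented = np.concatenate([masked, pad_state], axis=1)
print(augmented)
```

This matches the shape change seen above: a 2-state model yields posteriors over 3 channels, the last being the padding state.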
Allowing and sharing
Emissions of specific symbols at specific states can be allowed or disallowed
using the hidten.emitter.Emitter.allow property.
With the hidten.emitter.Emitter.share property, ranges of consecutive emission parameters can be tied so that they share the same value.
In the following example, an emitter is created that only allows specific symbols at specific states and shares parameters between some of them:
from hidten import HMMConfig
from hidten.tf.categorical import TFCategoricalEmitter
emitter = TFCategoricalEmitter()
emitter.hmm_config = HMMConfig(states=[3, 2])
emitter.allow = [
    # head 1
    (0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0), (0, 1, 2), (0, 2, 3),
    # head 2
    (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1), (1, 1, 2),
]
# this shares the parameters of the three edges exiting state 1 in head 1
# and the parameters of the two edges exiting state 2 in head 1
emitter.share = [(0, 3), (3, 5)]
emitter.initializer = [
    0.2,  # shared 3 times, but will be rescaled to sum to one
    0.5,  # shared 2 times
    1.,
    0.3, 0.7,
    0.2, 0.3, 0.5,
]
emitter.build((None, None, 4))
print(emitter.matrix())
tf.Tensor(
[[[0.33333334 0.33333334 0.33333334 0. ]
[0.5 0. 0.5 0. ]
[0. 0. 0. 1. ]]
[[0.29999998 0.7 0. 0. ]
[0.2 0.3 0.5 0. ]
[0. 0. 0. 0. ]]], shape=(2, 3, 4), dtype=float32)
Side note: this example demonstrates how to use an emitter on its own, without it being part of an HMM. This is possible by providing a hidten.hmm.HMMConfig.
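The combined effect of allow and share can be reproduced with a short NumPy sketch (illustrative only, not HidTen's implementation): allowed entries are filled with their, possibly shared, values and each non-empty row is rescaled to sum to one.

```python
import numpy as np

heads, states, symbols = 2, 3, 4
allow = [
    (0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0), (0, 1, 2), (0, 2, 3),
    (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1), (1, 1, 2),
]
# One value per allowed entry; the share ranges [0, 3) and [3, 5)
# expand a single initializer value to several entries.
values = [0.2] * 3 + [0.5] * 2 + [1., 0.3, 0.7, 0.2, 0.3, 0.5]

matrix = np.zeros((heads, states, symbols))
for (h, s, o), v in zip(allow, values):
    matrix[h, s, o] = v

# Renormalize each non-empty row so the allowed entries sum to one.
row_sums = matrix.sum(axis=-1, keepdims=True)
matrix = np.divide(matrix, row_sums, out=np.zeros_like(matrix), where=row_sums > 0)
print(np.round(matrix, 4))
```

This reproduces the printed matrix above: the three shared entries of head 1, state 1 become 1/3 each, and head 2 has only 2 states, so its third row stays all zeros.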
Creating Custom Emitters
To create custom emitters, inherit from hidten.tf.emitter.base.TFEmitter:
from hidten.tf.emitter import TFEmitter
class CustomEmitter(TFEmitter):
...
TODO: This section is incomplete.