Emitters
========

Understanding and using emission models in HidTen.

Emitters are components that compute emission probabilities or scores for
observations given hidden states. They define how likely each observation is
to be emitted from each state.

A :class:`hidten.hmm.HMM` can have multiple emitters, each responsible for a
different aspect of the observation space. New emitters can be added with
:meth:`~hidten.hmm.HMM.add_emitter`. When using multiple emitters, multiple
input *tracks* must be provided to the :doc:`algorithms`. The input tracks
must match the order of the emitters.

Categorical
-----------

:class:`hidten.tf.emitter.categorical.TFCategoricalEmitter` handles discrete
observations with categorical distributions. As with everything in HidTen,
categorical emitters support multiple *heads*, i.e. parallel models that use
their own set of parameters and states.

In the following example, a categorical emitter is created for a discrete
alphabet {0, 1, 2, 3} and 2 HMM heads with 3 states each:

.. code-block:: python

    from hidten.tf import TFHMM
    from hidten.tf.categorical import TFCategoricalEmitter

    hmm = TFHMM(states=[3, 3])

    # Create categorical emitter for discrete alphabet {0, 1, 2, 3}
    emitter = TFCategoricalEmitter()
    hmm.add_emitter(emitter)

    emitter.initializer = [
        # head 1
        0.25, 0.25, 0.25, 0.25,  # state 1
        0.4, 0.2, 0.2, 0.2,      # state 2
        0., 0.5, 0.5, 0.,        # state 3
        # head 2
        0.3, 0.3, 0.2, 0.2,      # state 1
        0.5, 0.1, 0.2, 0.2,      # state 2
        0., 0.4, 0.4, 0.2,       # state 3
    ]

    hmm.build((None, None, 4))
    print(hmm.emitter[0].matrix())

Continuous
----------

:class:`hidten.tf.emitter.multivariate_normal.TFMVNormalEmitter` handles
continuous observations with multivariate normal distributions.

.. code-block:: python

    from hidten.tf import TFHMM
    from hidten.tf.multivariate_normal import TFMVNormalEmitter

    hmm = TFHMM(states=[2, 2])

    # Create a multivariate normal emitter for 3-dimensional observations
    emitter = TFMVNormalEmitter()
    hmm.add_emitter(emitter)

    emitter.initializer = [
        # head 1, state 1
        # means
        0.0, 0.0, 0.0,
        # variances
        0.9, 0.5, 1.0,
        # head 1, state 2
        # means
        0.5, 0.4, 0.6,
        # variances
        1.2, 0.5, 1.0,
        # head 2, state 1
        # means
        1.0, 2.0, 3.0,
        # variances
        0.9, 0.7, 1.0,
        # head 2, state 2
        # means
        0.7, 0.2, 0.3,
        # variances
        1., 1., 1.,
    ]

    hmm.build((None, None, 3))
    print(hmm.emitter[0].matrix())

Padding
-------

When the input sequences have variable length, they can be padded with zeros.
A :class:`hidten.tf.emitter.base.TFPaddingEmitter` must be added to the
model. It expects the binary padding as a separate input track, where
``False`` or ``0`` indicates a padding position. Padding positions are
ignored in the :doc:`algorithms`. Note that HidTen only supports padding
sequences to the right.

The following code creates an HMM with 2 states and passes an input sequence
with padding:

.. code-block:: python

    import numpy as np

    from hidten import HMMMode
    from hidten.tf import TFHMM
    from hidten.tf.categorical import TFCategoricalEmitter
    from hidten.tf.emitter import TFPaddingEmitter

    hmm = TFHMM(states=2)
    hmm.add_emitter(TFCategoricalEmitter())
    hmm.emitter[0].allow = [(0, 0), (1, 1)]
    hmm.add_emitter(TFPaddingEmitter())

    observations = np.array([
        [[1., 0], [1, 0], [0, 1], [0, 0]],
        [[0., 1], [1, 0], [1, 0], [1, 0]],
    ])
    padding = np.array([[1, 1, 1, 0], [1, 1, 1, 1]])

    posterior = hmm(observations, padding, mode=HMMMode.POSTERIOR)

Under the hood, the padding emitter creates a new state (the last channel)
that is used only for padding positions. Notice how the previously defined
HMM has 2 states, but ``HMMMode.POSTERIOR`` returns shape ``[..., 3]``.
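The padding mechanics can be illustrated in plain NumPy. The following is a standalone sketch, not HidTen API: because HidTen only supports right-padding, the binary padding track determines each sequence's effective length, and masking padded positions keeps them from contributing to any per-position computation.

.. code-block:: python

    import numpy as np

    # Standalone sketch (plain NumPy, not HidTen API): a right-padded
    # batch and its binary padding track, as in the example above.
    padding = np.array([[1, 1, 1, 0],
                        [1, 1, 1, 1]])

    # With right-padding, effective sequence lengths follow directly
    # from the mask.
    lengths = padding.sum(axis=1)  # [3, 4]

    # Per-position log-scores for 2 states; setting padded positions
    # to -inf removes them from any max or logsumexp over time.
    scores = np.zeros((2, 4, 2))
    masked = np.where(padding[..., None] == 1, scores, -np.inf)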
Allowing and sharing
--------------------

Emissions of specific symbols at specific states can be allowed or disallowed
using the :attr:`hidten.emitter.Emitter.allow` property. With the
:attr:`hidten.emitter.Emitter.share` property, certain consecutive states can
be defined to share the same emission parameters.

In the following example, an emitter is created that only allows specific
symbols at specific states and shares parameters between some of them:

.. code-block:: python

    from hidten import HMMConfig
    from hidten.tf.categorical import TFCategoricalEmitter

    emitter = TFCategoricalEmitter()
    emitter.hmm_config = HMMConfig(states=[3, 2])

    emitter.allow = [
        # head 1
        (0, 0, 0), (0, 0, 1), (0, 0, 2),
        (0, 1, 0), (0, 1, 2),
        (0, 2, 3),
        # head 2
        (1, 0, 0), (1, 0, 1),
        (1, 1, 0), (1, 1, 1), (1, 1, 2),
    ]

    # This shares the parameters of the three allowed emissions of state 1
    # in head 1, and the parameters of the two allowed emissions of state 2
    # in head 1.
    emitter.share = [(0, 3), (3, 5)]

    emitter.initializer = [
        0.2,  # shared 3 times, but will be rescaled to sum to one
        0.5,  # shared 2 times
        1.,
        0.3, 0.7,
        0.2, 0.3, 0.5,
    ]

    emitter.build((None, None, 4))
    print(emitter.matrix())

.. code-block:: text

    tf.Tensor(
    [[[0.33333334 0.33333334 0.33333334 0.        ]
      [0.5        0.         0.5        0.        ]
      [0.         0.         0.         1.        ]]

     [[0.29999998 0.7        0.         0.        ]
      [0.2        0.3        0.5        0.        ]
      [0.         0.         0.         0.        ]]], shape=(2, 3, 4), dtype=float32)

*Side note*: In this example we have demonstrated how to use an emitter
without it being part of an HMM. This can be done by providing an
:class:`hidten.hmm.HMMConfig`.

Creating Custom Emitters
------------------------

To create custom emitters, inherit from
:class:`hidten.tf.emitter.base.TFEmitter`:

.. code-block:: python

    from hidten.tf.emitter import TFEmitter


    class CustomEmitter(TFEmitter):
        ...

*TODO: This section is incomplete.*
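The rescaling of shared emission parameters described in the allowing-and-sharing example can be reproduced in plain NumPy. This is a standalone sketch, not HidTen API: each state's allowed raw values are normalized to sum to one, which is why a single shared value of 0.2 becomes 1/3 for each of the three allowed symbols of head 1, state 1.

.. code-block:: python

    import numpy as np

    # Standalone NumPy sketch (not HidTen API) of the rescaling.
    # Head 1, state 1 allows symbols {0, 1, 2}; all three share the
    # raw parameter 0.2, and symbol 3 is disallowed (raw value 0).
    raw_state1 = np.array([0.2, 0.2, 0.2, 0.0])
    row_state1 = raw_state1 / raw_state1.sum()  # [1/3, 1/3, 1/3, 0]

    # Head 1, state 2 allows symbols {0, 2}, sharing the raw value 0.5.
    raw_state2 = np.array([0.5, 0.0, 0.5, 0.0])
    row_state2 = raw_state2 / raw_state2.sum()  # [0.5, 0, 0.5, 0]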