Advanced

Parallel algorithms

Algorithms for Inference and Training support an optional parallel argument. It allows input sequences to be processed in parallel along the time dimension by splitting them into chunks. This can lead to significant speedups, especially for long sequences, but at the cost of increased memory usage. The memory overhead grows with the number of hidden states, so parallel>1 is mainly recommended for small HMMs.
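The chunking idea can be sketched independently of hidten: the forward pass of an HMM reduces to a product of per-timestep matrices, and the matrix products within each chunk can be computed independently before a short sequential combine. Below is a minimal NumPy sketch under a toy model; all names are illustrative, not part of the hidten API, and for simplicity the chain is assumed to take one transition step before the first emission so that every timestep uses the same update-matrix shape:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, seq_len, chunk_len = 4, 100, 10

# Row-stochastic transition matrix and uniform initial distribution
A = rng.random((n_states, n_states))
A /= A.sum(axis=1, keepdims=True)
pi = np.full(n_states, 1.0 / n_states)

# b[t, j] = p(observation at time t | hidden state j)
b = rng.random((seq_len, n_states))

# One update matrix per timestep: M_t[i, j] = A[i, j] * b[t, j],
# so the likelihood is the sum over states of pi @ M_0 @ ... @ M_{T-1}
M = A[None, :, :] * b[:, None, :]

# Sequential reduction: one vector-matrix product per timestep
alpha = pi.copy()
for t in range(seq_len):
    alpha = alpha @ M[t]
lik_sequential = alpha.sum()

# Parallel-in-time reduction: each chunk's matrix product depends on
# no other chunk, so the inner reductions could run concurrently;
# only the final combination across chunks is sequential
chunks = M.reshape(seq_len // chunk_len, chunk_len, n_states, n_states)
partial = [np.linalg.multi_dot(list(c)) for c in chunks]
result = pi.copy()
for P in partial:
    result = result @ P
lik_parallel = result.sum()
```

Both reductions compute the same quantity up to floating-point rounding; the parallel variant trades extra memory (one n-by-n matrix per timestep instead of one length-n vector) for concurrency, which is why the cost scales with the number of hidden states.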

In the following example, sequences of length 10,000 are split into chunks of length 100, which are then processed in parallel:

import numpy as np

from hidten.tf import TFHMM
from hidten import HMMMode

hmm = TFHMM(states=21)

# Define transitioner and emitter
hmm.transitioner.initializer = "glorot_uniform"
hmm.emitter[0].initializer = "glorot_uniform"

# 32 random sequences of length 10,000 over 12 symbols, one-hot encoded
x = np.random.randint(0, 12, size=(32, 10_000))
x = np.eye(12)[x]

# Run the HMM in parallel mode
# The parallel value must evenly divide the sequence length
hmm(x, mode=HMMMode.POSTERIOR, parallel=100)

Note: the parallel value does not affect the results, apart from small numerical differences. All algorithms are designed to be robust against numerical underflow in both parallel and non-parallel modes.
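Underflow robustness matters because raw probability products shrink geometrically with sequence length. A minimal NumPy sketch of one standard technique, the scaled forward recursion, illustrates the problem and the fix; this is a generic textbook construction, not necessarily how hidten implements it internally:

```python
import numpy as np

rng = np.random.default_rng(1)
n_states, seq_len = 4, 10_000

A = rng.random((n_states, n_states))
A /= A.sum(axis=1, keepdims=True)   # row-stochastic transitions
pi = np.full(n_states, 1.0 / n_states)
b = rng.random((seq_len, n_states))  # per-step emission likelihoods

# Naive forward recursion: the probabilities shrink geometrically
# and underflow to exactly 0.0 long before t = 10,000
alpha = pi * b[0]
for t in range(1, seq_len):
    alpha = (alpha @ A) * b[t]
naive_lik = alpha.sum()
print(naive_lik)  # 0.0

# Scaled forward recursion: renormalize alpha at every step and
# accumulate the log of the scaling factors instead
alpha = pi * b[0]
log_lik = np.log(alpha.sum())
alpha /= alpha.sum()
for t in range(1, seq_len):
    alpha = (alpha @ A) * b[t]
    c = alpha.sum()
    log_lik += np.log(c)   # log-likelihood stays finite
    alpha /= c
print(log_lik)
```

The scaled recursion returns a finite log-likelihood where the naive one collapses to zero, which is why both the parallel and non-parallel code paths work in a normalized or log-space representation.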