Hidden Markov Model example in R

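   A Hidden Markov Model (HMM) is a statistical model in which a sequence of observations is generated by an underlying sequence of hidden states that cannot be observed directly. Given the observations, an HMM can be used to infer the most likely sequence of hidden states. HMMs are used in speech and pattern recognition, computational biology, and other areas of data modeling.
   In this post, I will briefly explain HMMs and their usage in R. The HMM package provides HMM-related functions in R.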

> library(HMM)
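   The transition and emission matrices are the main parameters needed to build an HMM.
  • The transition matrix holds the probabilities of switching from one state to another (or of staying in the same state). Each row corresponds to the current state and sums to 1.
  • The emission matrix holds, for each state, the probabilities of observing each element. Each row corresponds to a state and sums to 1.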

   Let's see an example. The test data has two possible states, "Target" and "Outlier". The probabilities of moving from each state to "Target" and to "Outlier" are as below.

> states <- c("Target","Outlier")
> targetProb <- c(0.4, 0.6)
> outlierProb <- c(0.6, 0.4)

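Based on those probabilities, we build the transition probability matrix. Each row holds the probabilities of moving from one state to each of the two states, so we fill the matrix by row.

> transProb <- matrix(c(targetProb, outlierProb), 2, byrow = TRUE)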
> transProb
     [,1] [,2]
[1,]  0.4  0.6
[2,]  0.6  0.4
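
A note on fill order: matrix() fills column by column unless byrow = TRUE is given. For the symmetric matrix above, both orders give the same result, but with asymmetric probabilities the rows and columns would be swapped. Here is a minimal sketch of the difference, assuming made-up asymmetric values (fromTarget and fromOutlier are hypothetical and not used elsewhere in this post):

```r
# Hypothetical asymmetric probabilities, for illustration only.
states <- c("Target", "Outlier")
fromTarget  <- c(0.7, 0.3)   # P(Target -> Target), P(Target -> Outlier)
fromOutlier <- c(0.2, 0.8)   # P(Outlier -> Target), P(Outlier -> Outlier)

# byrow = TRUE puts each vector in its own row: row = current state.
transByRow <- matrix(c(fromTarget, fromOutlier), nrow = 2, byrow = TRUE,
                     dimnames = list(from = states, to = states))

# The default column-wise fill yields the transpose instead.
transByCol <- matrix(c(fromTarget, fromOutlier), nrow = 2)

print(transByRow)
print(rowSums(transByRow))   # each row sums to 1
print(rowSums(transByCol))   # 0.9 and 1.1: rows no longer sum to 1
```

Checking rowSums() before building the model is a cheap way to catch a transposed matrix.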

   Each observed element can be "short", "normal", or "long". The probability of observing each element depends on the state. For the "Target" state, the element probabilities are Pshort=0.1, Pnormal=0.3, Plong=0.6, and for the "Outlier" state they are Pshort=0.6, Pnormal=0.3, Plong=0.1. We can define them as below.

> elements <- c("short","normal","long")
> targetStateProb <- c(0.1, 0.3, 0.6)
> outlierStateProb <- c(0.6, 0.3, 0.1)

We create the emission probability matrix, with one row per state and one column per element.

> emissProb <- matrix(c(targetStateProb, outlierStateProb), 2, byrow = TRUE)
> emissProb
     [,1] [,2] [,3]
[1,]  0.1  0.3  0.6
[2,]  0.6  0.3  0.1
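
Unlike the transition matrix, the emission matrix does not have to be square: it has one row per state and one column per element (2 x 3 here). Below is a small sketch, assuming we attach dimnames ourselves purely for readability (the printed model further down shows that initHMM applies its own labels):

```r
states   <- c("Target", "Outlier")
elements <- c("short", "normal", "long")

emissProb <- matrix(c(0.1, 0.3, 0.6,
                      0.6, 0.3, 0.1),
                    nrow = length(states), byrow = TRUE,
                    dimnames = list(states = states, symbols = elements))

# Look up a single emission probability by state and element name.
print(emissProb["Target", "long"])

# Sanity checks: one row per state, one column per element, rows sum to 1.
stopifnot(identical(dim(emissProb), c(length(states), length(elements))))
stopifnot(isTRUE(all.equal(unname(rowSums(emissProb)), c(1, 1))))
```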

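Now we can build a model with the above inputs. The initHMM function creates the HMM. Since we do not pass startProbs, both states get the same starting probability.

> hmm <- initHMM(States = states, 
                 Symbols = elements,
                 transProbs = transProb,
                 emissionProbs = emissProb)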

We can check the contents of the hmm model.

> print(hmm)
$States
[1] "Target"  "Outlier"

$Symbols
[1] "short"  "normal" "long"  

$startProbs
 Target Outlier 
    0.5     0.5 

$transProbs
         to
from      Target Outlier
  Target     0.4     0.6
  Outlier    0.6     0.4

$emissionProbs
         symbols
states    short normal long
  Target    0.1    0.3  0.6
  Outlier   0.6    0.3  0.1
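
To see how the three parts of the model work together, it helps to compute the probability of one particular state path and its observations by hand: the start probability, times one transition probability per step, times one emission probability per observation. This is a minimal base R sketch with the values printed above; jointProb is a hypothetical helper and not part of the HMM package.

```r
states    <- c("Target", "Outlier")
elements  <- c("short", "normal", "long")
startProb <- c(Target = 0.5, Outlier = 0.5)
transProb <- matrix(c(0.4, 0.6,
                      0.6, 0.4), 2, byrow = TRUE,
                    dimnames = list(states, states))
emissProb <- matrix(c(0.1, 0.3, 0.6,
                      0.6, 0.3, 0.1), 2, byrow = TRUE,
                    dimnames = list(states, elements))

# jointProb is a hypothetical helper: P(state path and observations).
jointProb <- function(path, obs) {
  # Start in path[1] and emit obs[1].
  p <- startProb[[path[1]]] * emissProb[path[1], obs[1]]
  # For every later step: move to the next state, then emit.
  for (t in seq_along(path)[-1]) {
    p <- p * transProb[path[t - 1], path[t]] * emissProb[path[t], obs[t]]
  }
  p
}

# Target emits "long", then Outlier emits "short":
# 0.5 * 0.6 * 0.6 * 0.6 = 0.108
print(jointProb(c("Target", "Outlier"), c("long", "short")))
```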

Next, we simulate 10 observations with the simHMM function using our hmm model.

> simhmm <- simHMM(hmm, 10)
> simulated <- data.frame(state=simhmm$states, element=simhmm$observation)

Printing the simulated data. The simulation is random, so your output will differ.

> print(simulated)
     state element
1   Target  normal
2  Outlier  normal
3   Target  normal
4   Target   short
5   Target    long
6  Outlier   short
7  Outlier   short
8  Outlier   short
9   Target    long
10 Outlier   short
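
Conceptually, the simulation draws a start state, then at each step emits an element from the current state's row of the emission matrix and moves on to the next state using the transition matrix. Below is a base R sketch of that idea. It is not the package's implementation; simulateChain is a hypothetical helper, and set.seed is there only to make the run repeatable.

```r
set.seed(1)  # only for reproducibility of this sketch

states    <- c("Target", "Outlier")
elements  <- c("short", "normal", "long")
startProb <- c(Target = 0.5, Outlier = 0.5)
transProb <- matrix(c(0.4, 0.6,
                      0.6, 0.4), 2, byrow = TRUE,
                    dimnames = list(states, states))
emissProb <- matrix(c(0.1, 0.3, 0.6,
                      0.6, 0.3, 0.1), 2, byrow = TRUE,
                    dimnames = list(states, elements))

# simulateChain is a hypothetical helper; it assumes n >= 1.
simulateChain <- function(n) {
  state   <- character(n)
  element <- character(n)
  # Draw the first state from the start probabilities.
  state[1] <- sample(states, 1, prob = startProb)
  for (t in seq_len(n)) {
    if (t > 1) {
      # Move from the previous state using its transition row.
      state[t] <- sample(states, 1, prob = transProb[state[t - 1], ])
    }
    # Emit an element using the current state's emission row.
    element[t] <- sample(elements, 1, prob = emissProb[state[t], ])
  }
  data.frame(state = state, element = element)
}

print(simulateChain(10))
```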

Predicting the states of test data

   Another interesting function in the HMM package is viterbi. The Viterbi algorithm computes the most probable sequence of hidden states for a given sequence of observations and a given HMM.
   We create test data and find the most probable states for those elements with the hmm model.

> testElements <- c("long","normal","normal","short",
                     "normal","normal","short","long")
> stateViterbi <- viterbi(hmm, testElements)

The result is listed below.

> predState <- data.frame(Element=testElements, State=stateViterbi)
> print(predState)
  Element   State
1    long  Target
2  normal Outlier
3  normal  Target
4   short Outlier
5  normal  Target
6  normal  Target
7   short Outlier
8    long  Target

Based on our hmm model, the most probable states of the test elements are predicted.
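
For readers curious about what happens inside, here is a minimal base R sketch of the Viterbi recursion, working in log space. It is an illustration under my own assumptions and not the package's code; viterbiSketch is a hypothetical helper. When several paths are equally probable, its tie-breaking may differ from the package's viterbi function, so the two can return different (equally likely) paths.

```r
states    <- c("Target", "Outlier")
elements  <- c("short", "normal", "long")
startProb <- c(Target = 0.5, Outlier = 0.5)
transProb <- matrix(c(0.4, 0.6,
                      0.6, 0.4), 2, byrow = TRUE,
                    dimnames = list(states, states))
emissProb <- matrix(c(0.1, 0.3, 0.6,
                      0.6, 0.3, 0.1), 2, byrow = TRUE,
                    dimnames = list(states, elements))

# viterbiSketch is a hypothetical helper, not the HMM package's viterbi().
viterbiSketch <- function(obs) {
  n <- length(obs)
  k <- length(states)
  # logV[s, t]: log-probability of the best path ending in state s at step t.
  logV <- matrix(-Inf, k, n)
  # back[s, t]: index of the best previous state for state s at step t.
  back <- matrix(NA_integer_, k, n)
  logV[, 1] <- log(startProb) + log(emissProb[, obs[1]])
  for (t in seq_len(n)[-1]) {
    for (s in seq_len(k)) {
      cand <- logV[, t - 1] + log(transProb[, s])
      back[s, t] <- which.max(cand)
      logV[s, t] <- max(cand) + log(emissProb[s, obs[t]])
    }
  }
  # Trace back from the best final state.
  path <- integer(n)
  path[n] <- which.max(logV[, n])
  for (t in rev(seq_len(n)[-1])) {
    path[t - 1] <- back[path[t], t]
  }
  states[path]
}

print(viterbiSketch(c("long", "short", "long")))
```

For "long", "short", "long" the best path is Target, Outlier, Target, because every emission and transition along it has the highest available probability (0.6).
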
Thank you for reading! I hope you have found it useful!


  1. Hey!
    I think there are some problems with the matrices in this post (maybe it was written against an earlier version of the HMM library?).

    The transProbs-matrix needs to be transposed, so that each of the rows sum to 1. In general, this matrix needs to have the same amount of rows and columns.

    The emissionProbs-matrix also needs to have the same amount of rows/columns.

    These conclusions I have drawn from the documentation of initHMM(..).

    1. Yes, you are right, each row must sum to 1.
      I updated the matrix values. Thanks!