Hidden Markov Model (HMM) is a method for representing most likely corresponding sequences of observation data. HMM is used in speech and pattern recognition, computational biology, and other areas of data modeling.

In this post, I will try to explain HMM, and its usage in R. HMM package provides HMM related functions in R.

> library(HMM)

` `

The transition and emission matrix are the main parameters to build HMM.- The transition matrix is a probability of switching from one state to another.
- Emission matrix is a selection probability of the element in a list.

Let's see an example. There are two possible states called "Target" and "Outlier" in a test data, and their selecting probabilities are as below,

> states <- c("Target","Outlier") > targetProb <- c(0.4, 0.6) > outlierProb <- c(0.6, 0.4)

Based on those selection probabilities, we build a transition probability matrix.

> transProb <- matrix(c(targetProb, outlierProb), 2)

> transProb [,1] [,2] [1,] 0.4 0.6 [2,] 0.6 0.4

A state element can be a "short" or "long" or "normal". In each state, the selection probability of an element is different. For a "Target" state, the elements probabilities are P

*short*=0.1, P

*normal*=0.3, P

*long*=0.6 and for an "Outlier" state probabilities are P

*short*=0.6, P

*normal*=0.3, P

*long*=0.1 percent. We can define them as below.

> elements <- c("short","normal","long") > targetStateProb <- c(0.1, 0.3, 0.6) > outlierStateProb <- c(0.6, 0.3, 0.1)

We create an emission probability matrix.

> emissProb <- matrix(c(targetStateProb,outlierStateProb), 2, byrow = T) > emissProb [,1] [,2] [,3] [1,] 0.1 0.3 0.6 [2,] 0.6 0.3 0.1

Now, we can build a model with the above inputs. The initHMM function creates HMM.

> hmm <- initHMM(States = states,

` Symbols = elements,`

` transProbs=transProb,`

emissionProbs = emissProb)

We can check the summary of an hmm model

> print(hmm) $States [1] "Target" "Outlier" $Symbols [1] "short" "normal" "long" $startProbs Target Outlier 0.5 0.5 $transProbs to from Target Outlier Target 0.4 0.6 Outlier 0.6 0.4 $emissionProbs symbols states short normal long Target 0.1 0.3 0.6 Outlier 0.6 0.3 0.1

Next, we simulate 10 observation elements with a simHMM function using our hmm model.

> simhmm <- simHMM(hmm, 10) > simulated <- data.frame(state=simhmm$states, element=simhmm$observation)

Printing the result of simulated data.

> print(simulated) state element 1 Target normal 2 Outlier normal 3 Target normal 4 Target short 5 Target long 6 Outlier short 7 Outlier short 8 Outlier short 9 Target long 10 Outlier short

**Predicting possible state of a test data**

Another interesting function in HMM package is a viterbi function. The Viterbi algorithm calculates the possible state for a sequence of observations for a given HMM.

We create test data and find out possible states for those elements with the hmm model.

> testElements <- c("long","normal","normal","short", "normal","normal","short","long") > stateViterbi <- viterbi(hmm, testElements)

The result is listed below.

> predState <- data.frame(Element=testElements, State=stateViterbi) > print(predState) Element State 1 long Target 2 normal Outlier 3 normal Target 4 short Outlier 5 normal Target 6 normal Target 7 short Outlier 8 long Target

Based on our hmm model, possible states of test elements are predicted.

Thank you for reading! I hope you have found it useful!

Hey!

ReplyDeleteI think there are some problems with the matrices in this post (maybe it was written against an earlier version of the HMM library?

The transProbs-matrix needs to be transposed, so that each of the rows sum to 1. In general, this matrix needs to have the same amount of rows and columns.

The emissionProbs-matrix also needs to have the same amount of rows/columns.

These conclusions I have drawn from the documentation of initHMM(..).

Yes, you are right, the rows sum must be equal to 1.

DeleteI updated matrix values. Thanks!