## Pages

### Hidden Markov Model example in R

Hidden Markov Model (HMM) is a method for representing most likely corresponding sequences of observation data. HMM is used in speech and pattern recognition, computational biology, and other areas of data modeling.
In this post, I will try to explain HMM, and its usage in R. HMM package provides HMM related functions in R.

`> library(HMM)`
` `
The transition and emission matrix are the main parameters to build HMM.
• The transition matrix is a probability of switching from one state to another.
• Emission matrix is a selection probability of the element in a list.

Let's see an example. There are two possible states called "Target" and "Outlier" in a test data, and their selecting probabilities are as below,

```> states <- c("Target","Outlier")
> targetProb <- c(0.4, 0.6)
> outlierProb <- c(0.6, 0.4)```

Based on those selection probabilities, we build a transition probability matrix.

`> transProb <- matrix(c(targetProb, outlierProb), 2)`
```> transProb
[,1] [,2]
[1,]  0.4  0.6
[2,]  0.6  0.4```

A state element can be a "short" or "long" or "normal". In each state, the selection probability of an element is different. For a "Target" state, the elements probabilities are Pshort=0.1, Pnormal=0.3, Plong=0.6 and for an "Outlier" state probabilities are Pshort=0.6, Pnormal=0.3, Plong=0.1 percent. We can define them as below.

```> elements <- c("short","normal","long")
> targetStateProb <- c(0.1, 0.3, 0.6)
> outlierStateProb <- c(0.6, 0.3, 0.1)```

We create an emission probability matrix.

```> emissProb <- matrix(c(targetStateProb,outlierStateProb), 2, byrow = T)
> emissProb
[,1] [,2] [,3]
[1,]  0.1  0.3  0.6
[2,]  0.6  0.3  0.1```

Now, we can build a model with the above inputs. The initHMM function creates HMM.

`> hmm <- initHMM(States = states, `
`                 Symbols = elements,`
`                 transProbs=transProb,`
`                 emissionProbs = emissProb)`

We can check the summary of an hmm model

```> print(hmm)
\$States
 "Target"  "Outlier"

\$Symbols
 "short"  "normal" "long"

\$startProbs
Target Outlier
0.5     0.5

\$transProbs
to
from      Target Outlier
Target     0.4     0.6
Outlier    0.6     0.4

\$emissionProbs
symbols
states    short normal long
Target    0.1    0.3  0.6
Outlier   0.6    0.3  0.1```

Next, we simulate 10 observation elements with a simHMM function using our hmm model.

```> simhmm <- simHMM(hmm, 10)
> simulated <- data.frame(state=simhmm\$states, element=simhmm\$observation)```

Printing the result of simulated data.

```> print(simulated)
state element
1   Target  normal
2  Outlier  normal
3   Target  normal
4   Target   short
5   Target    long
6  Outlier   short
7  Outlier   short
8  Outlier   short
9   Target    long
10 Outlier   short```

Predicting possible state of a test data

Another interesting function in HMM package is a viterbi function. The Viterbi algorithm calculates the possible state for a sequence of observations for a given HMM.
We create test data and find out possible states for those elements with the hmm model.

```> testElements <- c("long","normal","normal","short",
"normal","normal","short","long")
> stateViterbi <- viterbi(hmm, testElements)
```

The result is listed below.

```> predState <- data.frame(Element=testElements, State=stateViterbi)
> print(predState)
Element   State
1    long  Target
2  normal Outlier
3  normal  Target
4   short Outlier
5  normal  Target
6  normal  Target
7   short Outlier
8    long  Target```

Based on our hmm model, possible states of test elements are predicted.
Thank you for reading! I hope you have found it useful!

1. Hey!
I think there are some problems with the matrices in this post (maybe it was written against an earlier version of the HMM library?

The transProbs-matrix needs to be transposed, so that each of the rows sum to 1. In general, this matrix needs to have the same amount of rows and columns.

The emissionProbs-matrix also needs to have the same amount of rows/columns.

These conclusions I have drawn from the documentation of initHMM(..).

1. Yes, you are right, the rows sum must be equal to 1.
I updated matrix values. Thanks!