HMM Model: Principles and Practice

A Hidden Markov Model (HMM) is a statistical model that describes a Markov process with hidden, unknown parameters. The difficulty is to determine the hidden parameters of the process from the observable outputs. A hidden Markov model can be described by five elements, two sets and three probability matrices: 1. the hidden states S, 2. the observable states O, 3. the initial state probability vector π, 4. the hidden state transition probability matrix A, 5. the observation probability matrix B.
(Figure: a hidden Markov model, with a row of hidden states x and a row of observations o.)

Markov chain

In a Markov chain, the state at time n+1 depends only on the state at time n; it has no relationship with the states at times n-1, n-2, n-3, and so on.
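As a minimal illustrative sketch of this property (the states and numbers here are hypothetical, not from the original post), sampling the next state of a Markov chain needs only the current state:

import random

# Hypothetical two-state chain; each row gives P(next state | current state).
P = {'A': {'A': 0.9, 'B': 0.1},
     'B': {'A': 0.5, 'B': 0.5}}

def next_state(current):
    # The distribution of the next state depends only on `current`,
    # never on earlier history -- the Markov property.
    return random.choices(list(P[current]), list(P[current].values()))[0]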

Hidden Markov Models

In the figure there are two rows of data: the first row is the x row and the second is the o row. Each state in the x row depends only on the previous state, and each x state generates a corresponding point in the o row.

With respect to the figure above, the x row is usually called the state sequence and the o row the observation sequence. What exactly are state and observation sequences?
State sequence: the sequence of states randomly generated by the hidden Markov chain is called the state sequence.
Observation sequence: each state generates an observation, and the resulting random sequence of observations is called the observation sequence.
This gives the definition of the hidden Markov model: a probabilistic model for sequences, describing the process by which a hidden Markov chain randomly generates a state sequence, and each state in turn generates an observation, yielding the observation sequence. Fixed probabilities govern both the transitions between states and the relationship between states and observations.
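To make this generation process concrete, here is a small sketch (illustrative only; the function and parameter names are assumptions, mirroring the dict layout used later in this post). Each step first emits an observation from the current state, then moves the hidden chain:

import random

def generate(T, start_p, trans_p, emit_p):
    # Hidden chain: i1 ~ pi, i_{t+1} ~ A given i_t; each state emits o_t ~ B given i_t.
    draw = lambda d: random.choices(list(d), list(d.values()))[0]
    state = draw(start_p)
    states, obs = [], []
    for _ in range(T):
        states.append(state)
        obs.append(draw(emit_p[state]))  # observation generated by the current state
        state = draw(trans_p[state])     # next hidden state depends only on the current one
    return states, obs

Called with the clinic parameters defined later in this post, generate(3, start_probability, transition_probability, emission_probability) yields one random (states, observations) pair.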

HMM representation

Let Q be the set of all possible states and V the set of all possible observations:

Q = {q1, q2, ..., qN}, V = {v1, v2, ..., vM}

where N is the number of possible states and M is the number of possible observations.

I is a state sequence of length T, and O is the corresponding observation sequence:

I = (i1, i2, ..., iT), O = (o1, o2, ..., oT)

A is the state transition probability matrix, A = [a_ij], of size N×N, with i = 1, 2, ..., N and j = 1, 2, ..., N,
where a_ij is the probability of transitioning to state qj at time t+1, given state qi at time t:

a_ij = P(i_{t+1} = q_j | i_t = q_i)

B is the observation probability matrix, B = [b_j(k)], of size N×M, with k = 1, 2, ..., M and j = 1, 2, ..., N,
where b_j(k) is the probability of generating observation vk in state qj at time t:

b_j(k) = P(o_t = v_k | i_t = q_j)

π is the initial state probability vector, π = (π_i), where π_i = P(i_1 = q_i).

A hidden Markov model is determined by the initial state probability vector π, the state transition probability matrix A, and the observation probability matrix B. π and A determine the state sequence, and B determines the observation sequence. A hidden Markov model can therefore be written in triple notation as λ = (A, B, π); A, B, and π are called the three elements of the hidden Markov model.
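As an illustrative rendering of the triple λ = (A, B, π) in code (using numpy, which the original post does not use; the numbers happen to be those of the clinic example below, with N = 2 hidden states and M = 3 observations):

import numpy as np

A = np.array([[0.7, 0.3],        # a_ij = P(i_{t+1} = q_j | i_t = q_i)
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],   # b_j(k) = P(o_t = v_k | i_t = q_j)
              [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])        # pi_i = P(i_1 = q_i)

# Each row of A and B is a probability distribution, so every row sums to 1.
assert np.allclose(A.sum(axis=1), 1) and np.allclose(B.sum(axis=1), 1)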

Two assumptions

(1) The state of the hidden Markov chain at any time t depends only on the state at the previous time; it is independent of the states and observations at all other times. (Homogeneous Markov assumption)

(2) The observation at any time depends only on the state of the Markov chain at that time; it is independent of the other states and observations. (Observation independence assumption)
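These two assumptions are exactly what make the joint probability of a state sequence and its observation sequence factor into a chain of local terms. A minimal sketch (assumed names, using the dict layout of the example below):

def joint_prob(state_seq, obs_seq, start_p, trans_p, emit_p):
    # P(I, O | lambda) = pi_{i1} * b_{i1}(o1) * product over t of a_{i(t-1), it} * b_{it}(ot)
    p = start_p[state_seq[0]] * emit_p[state_seq[0]][obs_seq[0]]
    for t in range(1, len(state_seq)):
        p *= trans_p[state_seq[t - 1]][state_seq[t]]  # homogeneous Markov assumption
        p *= emit_p[state_seq[t]][obs_seq[t]]         # observation independence assumption
    return p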

Example

The following example, from Wikipedia, predicts whether a patient has a fever.
Imagine a rural clinic. The villagers are either healthy or have a fever, and they can only find out whether they have a fever by asking the clinic's doctor. The doctor diagnoses fever by asking patients how they feel. The villagers themselves only report feeling normal, dizzy, or cold.
Suppose a patient comes to the clinic every day and tells the doctor how he feels. Assume the patient's health condition is a discrete Markov chain. It has two states, healthy and fever, but the doctor cannot observe them directly; the states are invisible to the doctor.
Each day, the patient tells the doctor one of several feelings determined by his health state: normal, cold, or dizzy. These are the observations. The whole system is a hidden Markov model (HMM).
The doctor knows the villagers' overall health trends, and knows what symptoms patients with and without fever typically report. In other words, the doctor knows the parameters of the hidden Markov model.
From the information collected, the following data are obtained.
The patient's states, i.e. Q: ('Healthy', 'Fever')
The patient's feelings, i.e. the observations V: ('normal', 'cold', 'dizzy')
π, the initial state probability vector: {'Healthy': 0.6, 'Fever': 0.4}
State transition matrix:

transition_probability = {
   'Healthy' : {'Healthy': 0.7, 'Fever': 0.3},
   'Fever' : {'Healthy': 0.4, 'Fever': 0.6},
   }

Observation probability matrix:

emission_probability = {
   'Healthy' : {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
   'Fever' : {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6},
   }

start_probability represents the doctor's belief about which state the HMM is in when the patient first visits; all he knows is that patients tend to be healthy. The particular probability distribution used here is not the equilibrium one, which, given the transition probabilities, is approximately {'Healthy': 0.57, 'Fever': 0.43}. transition_probability represents how the health condition changes in the underlying Markov chain.
In this example, a patient who is healthy today has only a 30% chance of having a fever tomorrow. The emission probability emission_probability represents how the patient is likely to feel each day: if he is healthy, there is a 50% chance he feels normal; if he has a fever, there is a 60% chance he feels dizzy.
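The equilibrium figure can be checked by hand: the stationary probability h of being healthy satisfies h = 0.7h + 0.4(1-h), which gives h = 4/7 ≈ 0.57. A quick sketch (illustrative, using numpy and power iteration, neither of which is in the original post):

import numpy as np

# Stationary distribution of the health chain: iterate pi <- pi @ P to a fixed point.
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
pi = np.array([0.5, 0.5])
for _ in range(100):
    pi = pi @ P
print(pi)  # approximately [0.5714, 0.4286]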
(Figure: state diagram of the clinic HMM, annotated with the transition and emission probabilities.)

Three basic problems of HMM

1. Evaluation problem
Given the model λ = (A, B, π) and an observation sequence O = (o1, o2, ..., oT), compute the probability P(O | λ) of that observation sequence occurring under the model.
That is, with the HMM parameters A, B, π known: what is the probability that the patient shows a given series of symptoms? (A forward-algorithm sketch follows this list.)
2. Learning problem
Given an observation sequence O = (o1, o2, ..., oT), estimate the model parameters λ = (A, B, π) that maximize P(O | λ). The parameters are estimated by maximum likelihood using the EM idea (the Baum-Welch algorithm).
That is, from the series of symptoms the patient shows, estimate the model parameters that best explain how this series of symptoms arises.
3. Prediction (decoding) problem
Given the observation sequence O = (o1, o2, ..., oT) and the model parameters λ = (A, B, π), find the state sequence I = (i1, i2, ..., iT) that maximizes the conditional probability P(I | O); that is, given an observation sequence, find the most likely corresponding state sequence.
That is, from the observed chain of feelings V, predict the patient's hidden states over the past few days, such as whether he had a fever.
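Problem 3 is solved below with the Viterbi algorithm. For problem 1, here is a minimal forward-algorithm sketch (not part of the original post; it reuses the dict-based parameter layout of the Viterbi example):

def forward(obs, states, start_p, trans_p, emit_p):
    # alpha[y] = P(o1 ... ot, state y at time t); summing out the final
    # state gives P(O | lambda).
    alpha = {y: start_p[y] * emit_p[y][obs[0]] for y in states}
    for t in range(1, len(obs)):
        alpha = {y: sum(alpha[y0] * trans_p[y0][y] for y0 in states) * emit_p[y][obs[t]]
                 for y in states}
    return sum(alpha.values())

With the clinic parameters given below, forward(('normal', 'cold', 'dizzy'), states, start_probability, transition_probability, emission_probability) evaluates to about 0.0363.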

Viterbi algorithm

A patient visits the doctor three days in a row, and the doctor finds that he feels normal on the first day, cold on the second, and dizzy on the third. The doctor wants to know which sequence of health states best explains this series of observations. This is where the Viterbi algorithm comes in.
Viterbi algorithm code

def print_dptable(V):
    # Print the dynamic-programming table: columns are time steps,
    # rows are hidden states.
    print("     " + " ".join("%7d" % i for i in range(len(V))))
    for y in V[0]:
        print("%.5s: " % y + " ".join("%.7s" % ("%f" % V[t][y]) for t in range(len(V))))

def viterbi(obs, states, start_p, trans_p, emit_p):
    V = [{}]    # V[t][y]: highest probability of any path ending in state y at time t
    path = {}   # path[y]: best state sequence so far ending in state y

    # Initialize base cases (t == 0)
    for y in states:
        V[0][y] = start_p[y] * emit_p[y][obs[0]]
        path[y] = [y]

    # Run Viterbi for t > 0
    for t in range(1, len(obs)):
        V.append({})
        newpath = {}

        for y in states:
            # Choose the predecessor y0 that maximizes the probability of
            # reaching state y at time t.
            (prob, state) = max((V[t - 1][y0] * trans_p[y0][y] * emit_p[y][obs[t]], y0)
                                for y0 in states)
            V[t][y] = prob
            newpath[y] = path[state] + [y]

        # Don't need to remember the old paths
        path = newpath

    print_dptable(V)
    (prob, state) = max((V[len(obs) - 1][y], y) for y in states)
    return (prob, path[state])

The parameters of viterbi are explained as follows: obs is the observation sequence, e.g. ['normal', 'cold', 'dizzy']; states is the set of hidden states; start_p is the initial state probability; trans_p is the transition probability; and emit_p is the emission probability. To simplify the code, we assume that the observation sequence obs is non-empty and that trans_p[i][j] and emit_p[i][j] are defined for all states i, j.
Viterbi parameters

states = ('Healthy', 'Fever')
 
observations = ('normal', 'cold', 'dizzy')
 
start_probability = {'Healthy': 0.6, 'Fever': 0.4}
 
transition_probability = {
   'Healthy' : {'Healthy': 0.7, 'Fever': 0.3},
   'Fever' : {'Healthy': 0.4, 'Fever': 0.6},
   }
 
emission_probability = {
   'Healthy' : {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
   'Fever' : {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6},
   }

Solving

def example():
    return viterbi(observations,
                   states,
                   start_probability,
                   transition_probability,
                   emission_probability)
print(example())
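Run under Python 3, example() prints the dynamic-programming table and returns the most probable path. With the parameters above, the output should look roughly like this (the returned probability is 0.6·0.5 · 0.7·0.4 · 0.3·0.6 = 0.01512, which can be checked by hand):

             0       1       2
Healt:  0.30000 0.08400 0.00588
Fever:  0.04000 0.02700 0.01512
(0.01512, ['Healthy', 'Healthy', 'Fever'])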

Summary

The Viterbi algorithm reveals that the observations ['normal', 'cold', 'dizzy'] were most likely generated by the state sequence ['Healthy', 'Healthy', 'Fever']. In other words, given the observed feelings, the patient was healthy on the first day, still healthy (though feeling cold) on the second, and had a fever on the third.
The Viterbi path is essentially the most probable path through the trellis. The clinic example has the following trellis structure, with the Viterbi path in bold black:
(Figure: trellis of the clinic example; the bold black edges form the Viterbi path.)
