First, we make a copy of our data for a subsequent analysis. Next, we assign the
class Clickstreams to our list of clickstreams.
clstr2 <- clstr
class(clstr) <- "Clickstreams"
We fit a zero-order Markov chain, which describes the overall distribution of the states (pages) in our data, and a first-order Markov chain, which is described by a transition matrix.
mc <- fitMarkovChain(clickstreamList = clstr,
order = 0,
control = list(optimizer = "quadratic"))
mc1 <- fitMarkovChain(clickstreamList = clstr2,
order = 1,
control = list(optimizer = "quadratic"))
Error in fitMarkovChain(clickstreamList = clstr2, order = 1, control = list(optimizer = "quadratic")):
The order is too high for the specified clickstreams.
The latter call fails because a first-order Markov chain cannot be determined for clickstreams that contain only a single page view. Therefore, we keep only the users who have visited more than one page (this can be, for instance, multiple visits to the MMA page).
clstr2 <- clstr2[map_int(clstr2, length) > 1]
class(clstr2) <- "Clickstreams"
mc1 <- fitMarkovChain(clickstreamList = clstr2,
order = 1,
control = list(optimizer = "quadratic"))
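To make the two fits more concrete, here is a minimal base-R sketch of what a zero-order and a first-order estimate capture. The `sessions` list and its page names are hypothetical toy data, not the logs data, and the code does not reproduce how `fitMarkovChain` works internally. It also shows that, in base R, filtering a classed list with `[` returns a plain list, which is why the class is reassigned after the filtering step.

```r
# Hypothetical toy sessions: one character vector of page names per user.
sessions <- list(
  u1 = c("Home", "Projects", "Home"),
  u2 = "Home",                              # single click: no transition
  u3 = c("Projects", "Graduates", "Home")
)
class(sessions) <- "Clickstreams"

# Zero-order view: share of each page among all clicks.
clicks <- unlist(sessions, use.names = FALSE)
states <- sort(unique(clicks))
zero_order <- vapply(states, function(s) mean(clicks == s), numeric(1))

# First-order view needs at least one (from, to) pair per session,
# so single-page sessions are dropped first.
multi <- sessions[lengths(sessions) > 1]
dropped_class <- class(multi)               # subsetting lost the class
class(multi) <- "Clickstreams"              # so it is assigned again

# Count transitions between consecutive pages.
counts <- matrix(0, length(states), length(states),
                 dimnames = list(from = states, to = states))
for (s in multi) {
  for (i in seq_len(length(s) - 1)) {
    counts[s[i], s[i + 1]] <- counts[s[i], s[i + 1]] + 1
  }
}

# Row-normalize so that each row is a distribution over the next page.
first_order <- counts / rowSums(counts)

print(round(zero_order, 3))
print(first_order)
```

In this toy data every page has at least one outgoing transition, so no row sum is zero. With real data, rows without outgoing transitions would need separate handling.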
We fit a Markov chain of order 2, which yields two transition matrices, one for each lag.
clstr2 <- clstr2[map_int(clstr2, length) > 2]
class(clstr2) <- "Clickstreams"
mc2 <- fitMarkovChain(clickstreamList = clstr2,
order = 2,
control = list(optimizer = "quadratic"))
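The idea of one transition matrix per lag can be sketched in base R by counting pairs of pages that are one click apart and two clicks apart. The `sessions` list and the helper `lag_matrix` are hypothetical and only illustrate the counting. The sketch does not reproduce how `fitMarkovChain` weights or optimizes the lags.

```r
# Hypothetical toy sessions (not the logs data).
sessions <- list(
  u1 = c("Home", "Projects", "Graduates", "Home"),
  u2 = c("Projects", "Graduates", "Home"),
  u3 = c("Home", "Projects")                # too short for a lag-2 pair
)
states <- sort(unique(unlist(sessions, use.names = FALSE)))

# Hypothetical helper: row-normalized counts of pairs (page at t, page at t + lag).
lag_matrix <- function(sessions, states, lag) {
  counts <- matrix(0, length(states), length(states),
                   dimnames = list(from = states, to = states))
  for (s in sessions) {
    if (length(s) <= lag) next              # session has no pair at this lag
    for (i in seq_len(length(s) - lag)) {
      counts[s[i], s[i + lag]] <- counts[s[i], s[i + lag]] + 1
    }
  }
  totals <- rowSums(counts)
  totals[totals == 0] <- 1                  # keep all-zero rows as zeros, not NaN
  counts / totals
}

lag1 <- lag_matrix(sessions, states, 1)
lag2 <- lag_matrix(sessions, states, 2)

print(lag1)
print(lag2)
```

Session `u3` contributes to the lag-1 counts but not to the lag-2 counts, which mirrors why the clickstreams are filtered on length again before fitting the higher order.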
Analyze the results some more
plot(mc2, order = 2)

View(t(mc2@transitions[[1]]))
| | analytics/CI_start.htm | analytics/Graduates.htm | analytics/IT_backbone.htm | analytics/IT_frontend.htm | analytics/Keybenefits.htm | analytics/Projects.htm |
|---|---|---|---|---|---|---|
| analytics/CI_start.htm | 0.00000000 | 0.1964285714 | 0.00000000 | 0.00000000 | 0.00000000 | 0.0000000000 |
| analytics/Graduates.htm | 0.00000000 | 0.0000000000 | 0.01754386 | 0.00000000 | 0.26315789 | 0.0175438596 |
| analytics/IT_backbone.htm | 0.00000000 | 0.0204081633 | 0.00000000 | 0.57142857 | 0.00000000 | 0.0000000000 |
| analytics/IT_frontend.htm | 0.00000000 | 0.0000000000 | 0.18367347 | 0.00000000 | 0.00000000 | 0.0000000000 |
| analytics/Keybenefits.htm | 0.01886792 | 0.0377358491 | 0.00000000 | 0.00000000 | 0.00000000 | 0.5283018868 |
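One way to read such a matrix is to look up the largest entry in each row. The sketch below rebuilds the excerpt shown above as a plain matrix (page names shortened for readability) and reports the column with the largest displayed value per row. It assumes that the rows of the transposed matrix are the origin pages and the columns the destination pages, and that assumption should be checked against the package documentation. Because only part of the matrix is shown, the result covers the displayed columns only.

```r
# Shortened page names (the "analytics/" prefix and ".htm" suffix are dropped).
pages <- c("CI_start", "Graduates", "IT_backbone",
           "IT_frontend", "Keybenefits", "Projects")

# Values copied from the displayed excerpt; rows 1 to 5, columns 1 to 6.
excerpt <- matrix(
  c(0.00000000, 0.1964285714, 0.00000000, 0.00000000, 0.00000000, 0.0000000000,
    0.00000000, 0.0000000000, 0.01754386, 0.00000000, 0.26315789, 0.0175438596,
    0.00000000, 0.0204081633, 0.00000000, 0.57142857, 0.00000000, 0.0000000000,
    0.00000000, 0.0000000000, 0.18367347, 0.00000000, 0.00000000, 0.0000000000,
    0.01886792, 0.0377358491, 0.00000000, 0.00000000, 0.00000000, 0.5283018868),
  nrow = 5, byrow = TRUE,
  dimnames = list(pages[1:5], pages)
)

# For each row, the column holding the largest displayed value.
top <- data.frame(
  page = rownames(excerpt),
  largest_entry = colnames(excerpt)[apply(excerpt, 1, which.max)],
  value = apply(excerpt, 1, max),
  row.names = NULL,
  stringsAsFactors = FALSE
)
print(top)
```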
Create a zero-order, a first-order, and a second-order Markov chain for the logs data
and store them as mc, mc1, and mc2, respectively.
Note that you need to filter the clstr variable, which you created in the
previous exercise, to users who have visited more than one page to create mc1
and more than two pages to create mc2.
Note: Don’t forget to define the Clickstreams class.
To download the all_logs_ugent dataset click
here1.
To download the logs dataset click
here2.
Assume that:
The clstr variable that was calculated in the previous exercise is given.