First, we make a copy of our data for a subsequent analysis. Next, we assign the class `Clickstreams` to our list of clickstreams.
clstr2 <- clstr
class(clstr) <- "Clickstreams"
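To see what these two lines do, here is a minimal base R sketch on a hypothetical toy list (`sessions` and its page names are made up for illustration, not the course data). It shows that the copy made before the class assignment stays a plain list, and that default list subsetting in base R returns a plain list again unless a package supplies its own subsetting method.

```r
# Hypothetical toy sessions (not the course data)
sessions <- list(u1 = c("Home", "MMA"), u2 = "Home")

# Copy first, then assign the class: the copy keeps its original class
sessions_copy <- sessions
class(sessions) <- "Clickstreams"

class(sessions)       # "Clickstreams"
class(sessions_copy)  # "list"

# Default subsetting in base R drops the class attribute again
filtered <- sessions[lengths(sessions) > 1]
class(filtered)       # "list"
```

This is why it can be safer to re-assign the class after every filtering step.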
We fit a zero-order Markov chain, which is simply the observed distribution of pages in our data, and a first-order Markov chain, which is described by a transition matrix.
mc <- fitMarkovChain(clickstreamList = clstr,
order = 0,
control = list(optimizer = "quadratic"))
mc1 <- fitMarkovChain(clickstreamList = clstr2,
order = 1,
control = list(optimizer = "quadratic"))
Error in fitMarkovChain(clickstreamList = clstr2, order = 1, control = list(optimizer = "quadratic")) :
  The order is too high for the specified clickstreams.
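The second call fails because some clickstreams contain only a single click, so they hold no transition from which a first-order Markov chain could be estimated. Therefore, we keep only the clickstreams with more than one click (these can be repeated visits to the same page, for instance several visits to the MMA page).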
clstr2 <- clstr2[map_int(clstr2, length) > 1]
class(clstr2) <- "Clickstreams"
mc1 <- fitMarkovChain(clickstreamList = clstr2,
order = 1,
control = list(optimizer = "quadratic"))
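To make the difference between the two chains concrete, the following base R sketch estimates both by hand on hypothetical toy sessions. It is not how `fitMarkovChain` is implemented; it only illustrates the idea. The page names, the object names, and the convention that rows are the current page and columns the next page are assumptions made for this example.

```r
# Hypothetical toy sessions (not the course data)
toy <- list(
  s1 = c("Home", "MMA", "Projects"),
  s2 = c("Home", "MMA", "MMA"),
  s3 = c("Projects")  # single click: no transition to learn from
)

# Zero-order chain: relative frequency of each page over all clicks
clicks <- unlist(toy, use.names = FALSE)
p0 <- prop.table(table(clicks))

# A first-order chain needs at least one transition per session,
# so keep only sessions with more than one click
toy2 <- toy[lengths(toy) > 1]

# Count transitions from each page to the page that follows it
pages <- sort(unique(clicks))
from <- unlist(lapply(toy2, function(s) s[-length(s)]), use.names = FALSE)
to   <- unlist(lapply(toy2, function(s) s[-1]), use.names = FALSE)
counts <- table(factor(from, levels = pages), factor(to, levels = pages))

# Row-normalise so each row holds P(next page | current page);
# rows without outgoing transitions are left at zero
row_tot <- rowSums(counts)
P1 <- counts / ifelse(row_tot == 0, 1, row_tot)
```

Here `p0` plays the role of the zero-order chain and `P1` that of the first-order transition matrix. Session `s3` contributes to `p0` but cannot contribute to `P1`.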
Next, we fit a Markov chain of order 2, which has two transition matrices, one for each lag.
clstr2 <- clstr2[map_int(clstr2, length) > 2]
# re-assign the class after filtering, as in the first-order step
class(clstr2) <- "Clickstreams"
mc2 <- fitMarkovChain(clickstreamList = clstr2,
order = 2,
control = list(optimizer = "quadratic"))
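As a rough illustration of "one transition matrix per lag", the sketch below builds a lag-1 and a lag-2 matrix by hand in base R. It assumes that the lag-k matrix is estimated from pairs of clicks that are k steps apart, and it does not estimate the weights that combine the lags. The helper `lag_matrix` and the toy sessions are hypothetical and not part of the clickstream package.

```r
# Hypothetical toy sessions; each has more than two clicks,
# mirroring the length filter used for the second-order chain
toy <- list(
  c("A", "B", "C", "A"),
  c("B", "C", "A")
)
pages <- sort(unique(unlist(toy)))

# Transition probabilities between clicks that are `lag` steps apart
# (rows: earlier page, columns: later page)
lag_matrix <- function(sessions, lag, pages) {
  from <- unlist(lapply(sessions, function(s) head(s, -lag)))
  to   <- unlist(lapply(sessions, function(s) tail(s, -lag)))
  counts <- table(factor(from, levels = pages), factor(to, levels = pages))
  row_tot <- rowSums(counts)
  counts / ifelse(row_tot == 0, 1, row_tot)
}

Q1 <- lag_matrix(toy, 1, pages)  # page one step back -> current page
Q2 <- lag_matrix(toy, 2, pages)  # page two steps back -> current page
```

A session with only two clicks would add nothing to `Q2`, which is the intuition behind the stricter length filter for the second-order chain.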
We can now analyze the results in more detail.
plot(mc2, order = 2)
View(t(mc2@transitions[[1]]))
| | analytics/CI_start.htm | analytics/Graduates.htm | analytics/IT_backbone.htm | analytics/IT_frontend.htm | analytics/Keybenefits.htm | analytics/Projects.htm |
|---|---|---|---|---|---|---|
| analytics/CI_start.htm | 0.00000000 | 0.1964285714 | 0.00000000 | 0.00000000 | 0.00000000 | 0.0000000000 |
| analytics/Graduates.htm | 0.00000000 | 0.0000000000 | 0.01754386 | 0.00000000 | 0.26315789 | 0.0175438596 |
| analytics/IT_backbone.htm | 0.00000000 | 0.0204081633 | 0.00000000 | 0.57142857 | 0.00000000 | 0.0000000000 |
| analytics/IT_frontend.htm | 0.00000000 | 0.0000000000 | 0.18367347 | 0.00000000 | 0.00000000 | 0.0000000000 |
| analytics/Keybenefits.htm | 0.01886792 | 0.0377358491 | 0.00000000 | 0.00000000 | 0.00000000 | 0.5283018868 |
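
One way to read such a matrix is to look up the largest entry in each row. The sketch below does this in base R for the excerpt above, with values rounded to three decimals and the `analytics/` prefix and `.htm` suffix dropped for brevity. It assumes that, after the transpose, each row is the earlier page and each column the later page. Only the displayed rows and columns are used, so the rows do not sum to 1 and the result holds for this excerpt only.

```r
# Values copied (rounded) from the excerpt of the transposed lag-1 matrix
pages_to <- c("CI_start", "Graduates", "IT_backbone",
              "IT_frontend", "Keybenefits", "Projects")
pages_from <- pages_to[1:5]

P <- matrix(c(
  0.000, 0.196, 0.000, 0.000, 0.000, 0.000,
  0.000, 0.000, 0.018, 0.000, 0.263, 0.018,
  0.000, 0.020, 0.000, 0.571, 0.000, 0.000,
  0.000, 0.000, 0.184, 0.000, 0.000, 0.000,
  0.019, 0.038, 0.000, 0.000, 0.000, 0.528
), nrow = 5, byrow = TRUE,
   dimnames = list(from = pages_from, to = pages_to))

# Column with the highest probability in each row,
# i.e. the most likely next page among the displayed columns
best_next <- pages_to[apply(P, 1, which.max)]
names(best_next) <- pages_from
best_next
```

For example, within this excerpt the largest entry in the IT_backbone row is the one for IT_frontend (about 0.57).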
Create a zero-order, first-order, and second-order Markov chain for the `logs` data and store them as `mc`, `mc1`, and `mc2`, respectively.
Note that you need to filter the `clstr` variable, which you created in the previous exercise, to keep only users that have visited more than one page (for `mc1`) or more than two pages (for `mc2`).
Note: Don't forget to define the `Clickstreams` class.
To download the `all_logs_ugent` dataset, click here [1].
To download the `logs` dataset, click here [2].
Assume that the `clstr` variable that was calculated in the previous exercise is given.