Social Network Learning: Probabilistic Relational Neighbor Classifier

In this exercise, we will extend the relational neighbor classifier by incorporating a logistic regression model. This process is known as probabilistic relational neighbor classification.

Probabilistic Relational Neighbor Classifier

The probabilistic relational neighbor classifier uses the connections between nodes and a logistic regression model to predict a node’s attribute. In this case, we’re predicting whether a customer will churn or not.
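One way to write down the score computed in this exercise is shown below. The notation is an assumption made here for illustration: N_i denotes the set of neighbors of node i, and p_j the churn probability that the logistic regression assigns to neighbor j. The simplification on the right assumes unweighted edges and no missing probabilities.

```latex
P(\text{churn} \mid i)
  = \frac{\sum_{j \in N_i} p_j}{\sum_{j \in N_i} p_j + \sum_{j \in N_i} (1 - p_j)}
  = \frac{1}{|N_i|} \sum_{j \in N_i} p_j
```

Under these assumptions, the score of a node is simply the average churn probability of its neighbors.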

The churn probabilities resulting from the logistic regression are given below.

preds <- c(0.66, 0.85, 0.77, 0.89, 0.56, 0.61, 0.71, 0.42, 0.40, 0.30)

Then, the vertex labels can be updated.

V(g)$churn <- preds

The number of churning and non-churning neighbors of each node is calculated in the same way as in the previous exercise, except that the sums now run over the predicted probabilities instead of 0/1 labels.

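# One entry per node: expected number of churning / non-churning neighbors
churn_neighbors_upt <- numeric(length = ncol(adjacency))
nonchurn_neighbors_upt <- numeric(length = ncol(adjacency))
for (i in seq_len(ncol(adjacency))) {
  # Indices of the nodes connected to node i
  neighbors <- which(adjacency[, i] == 1)
  # Sum the neighbors' churn probabilities and their complements
  churn_neighbors_upt[i] <- sum(V(g)$churn[neighbors], na.rm = TRUE)
  nonchurn_neighbors_upt[i] <- sum(1 - V(g)$churn[neighbors], na.rm = TRUE)
}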
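To see what the loop computes, here is a minimal sketch on a hypothetical three-node path graph A - B - C. The matrix, the node names, and the probabilities in `probs` are made up for illustration and are not the network used in this exercise. Base R is used so that the snippet runs without igraph.

```r
# Hypothetical path graph A - B - C (illustration only)
adjacency <- matrix(c(0, 1, 0,
                      1, 0, 1,
                      0, 1, 0),
                    nrow = 3, byrow = TRUE,
                    dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Assumed churn probabilities for A, B, C
probs <- c(A = 0.9, B = 0.5, C = 0.3)

churn_sum <- numeric(ncol(adjacency))
nonchurn_sum <- numeric(ncol(adjacency))
for (i in seq_len(ncol(adjacency))) {
  neighbors <- which(adjacency[, i] == 1)
  churn_sum[i] <- sum(probs[neighbors])         # expected churning neighbors
  nonchurn_sum[i] <- sum(1 - probs[neighbors])  # expected non-churning neighbors
}

churn_sum                 # 0.5 1.2 0.5
churn_sum + nonchurn_sum  # 1 2 1, i.e. the number of neighbors of each node
```

Node B has neighbors A and C, so its expected number of churning neighbors is 0.9 + 0.3 = 1.2. The two sums always add up to the neighbor count, which is why the denominator in the next step acts as a simple normalization.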

The probability of churn for each node is then the expected number of churning neighbors divided by the total number of neighbors:

prob_churn_upt <- churn_neighbors_upt / (churn_neighbors_upt + nonchurn_neighbors_upt)
data.frame(Node = rownames(adjacency), PRN = prob_churn_upt)
0.7675000
0.5400000
0.7200000
0.5800000
0.6160000
0.8300000
0.4800000
0.6033333
0.4975000
0.6450000
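The loop and the division above can also be expressed as a single matrix product. The following is a sketch of that alternative, not the method used in the exercise; the four-node graph and the values in `probs` are hypothetical, and the shortcut assumes a 0/1 adjacency matrix with no missing probabilities.

```r
# Hypothetical undirected graph on 4 nodes (illustration only)
adjacency <- matrix(c(0, 1, 1, 0,
                      1, 0, 1, 0,
                      1, 1, 0, 1,
                      0, 0, 1, 0),
                    nrow = 4, byrow = TRUE)

# Assumed probabilities from some upstream model
probs <- c(0.9, 0.6, 0.3, 0.1)

# Number of neighbors per node (column-wise, matching adjacency[, i] in the loop)
degree <- colSums(adjacency)

# Entry i of probs %*% adjacency is the sum of the probabilities of node i's neighbors
prn <- as.vector(probs %*% adjacency) / degree

# Loop version for comparison
prn_loop <- sapply(seq_len(ncol(adjacency)), function(i) {
  neighbors <- which(adjacency[, i] == 1)
  sum(probs[neighbors]) / length(neighbors)
})

all.equal(prn, prn_loop)  # TRUE
```

Both versions return NaN for a node without neighbors, because the division is 0 / 0; such nodes need separate handling if they occur in the network.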

Exercise

Consider the PadelNetwork and its probabilities from the previous exercise. Now use the probabilistic relational neighbor classifier to calculate the probability of being a man (gender = 1) for every node. Display the output in a data frame with the columns nodes and prob_man, and store that data frame in df.

To download the graph from the dataframe click: here1

To download the PadelNetwork click: here2


Assume that: