💻 Tutorial 02: Introduction to VIMuRe in R

VIMuRe v0.1.3 (latest)

basics
R
If you have network data that is formatted as an edge list, or a set of igraph objects, you can apply VIMuRe model the latent structure of your network.
Note

If you use VIMuRe in your research, please cite (De Bacco et al. 2023).

TLDR: By the end of this tutorial, you will be able to:

Found an interesting use case for VIMuRe? Let us know! Open a discussion on our GitHub repository.

⚙️ Setup

Import packages

library(haven)
library(tidyr)
library(dplyr)
library(magrittr)

library(igraph)
library(vimure)

⚠️ Ensure you have installed the latest version of VIMuRe before running this tutorial. Follow the 📦 Installation instructions if you haven’t already.

📥 Step 1: Load the data

Here, we’ll use the VIMuRe package to model social networks data from a village in Karnataka, India, which were gathered as part of a project studying the adoption of microfinance (Banerjee et al. 2013b). The data files we use here are derived from those openly-available on the Harvard Dataverse page (Banerjee et al. 2013a) and raw data files available on Matthew O. Jackson’s website. See 💻 Tutorial 1 for more details on how to prepare data for VIMuRe.

We have selected a particular village to focus on. The dataset contains information on multiple types of social relationships. We are interested in relationships that were “double-sampled” in the original survey, meaning those elicited through two different prompts. For example, respondents were asked both “If you needed to borrow kerosene or rice, to whom would you go?” as well as “Who would come to you if he/she needed to borrow kerosene or rice?” These two different prompts should give us different perspectives on what could be the same relationship. If everyone in the network is being surveyed, then this should mean that we are eliciting information on a relationship twice (hence, “double sampling”): we’re asking ego if they go to alter for help, and also asking ego if alter comes to them for help.

Four different tie types (or “layers”) were double-sampled in this way (i.e., elicited with two prompts). Here, we will use all four layers to model the latent network structure. The four layers are:

  • money: lending and borrowing money
  • advice: giving and receiving advice
  • visit: visiting and being visited
  • kerorice: lending and borrowing of household items

We stored all the data in a single data frame, edgelist.

Click to see the code we used
# Here we are using the get_layer() function defined in Tutorial 01

library(haven)
library(tidyr)
library(dplyr)
library(magrittr)
library(igraph)

layer_money <- get_layer(1, "money", get_indivinfo(METADATA_FILE), raw_csv_folder=RAW_CSV_FOLDER)
layer_advice <- get_layer(1, "advice", get_indivinfo(METADATA_FILE), raw_csv_folder=RAW_CSV_FOLDER)
layer_kerorice <- get_layer(1, "kerorice", get_indivinfo(METADATA_FILE), raw_csv_folder=RAW_CSV_FOLDER)
layer_visit <- get_layer(1, "visit", get_indivinfo(METADATA_FILE), raw_csv_folder=RAW_CSV_FOLDER)

edgelist <- bind_rows(layer_money$edgelist, 
                      layer_advice$edgelist, 
                      layer_kerorice$edgelist, 
                      layer_visit$edgelist)
edgelist$ego <- as.character(edgelist$ego)
edgelist$alter <- as.character(edgelist$alter)
edgelist$reporter <- as.character(edgelist$reporter)

reporters <-  union(layer_money$reporters, layer_advice$reporters) %>% union(layer_kerorice$reporters) %>% union( layer_visit$reporters)
rm(layer_money, layer_advice, layer_kerorice, layer_visit)

library(vimure)
vm_config()


# The code below should help you confirm that all reporters appear in the edgelist:
setdiff(reporters, unique(edgelist$reporter)) # This should return an empty set

# Load in the network data that we processed in [Tutorial 1](/latest/tutorials/python/tutorial01-data-preparation.qmd) for Village 1
PROCESSED_CSV_FOLDER <- "docs/data/2010-0760_Data/Data/Raw_csv"

# Read in all of the csv's and combine them into one dataframe
edgelist <- do.call("rbind", list(
money = read.csv("village1_money.csv"),
layer_advice = read.csv("village1_advice.csv"),
layer_kerorice = read.csv("village1_kerorice.csv"),
layer_visit = read.csv("village1_visit.csv")
    ))

# Clean up the row names    
rownames(edgelist) <- NULL

After our preprocessing, the input data takes the form of the dataset given below.

  • ego: generally the requester, who is also referred to as \(i\) in (De Bacco et al. 2023) (required ✔️)
  • alter: generally the giver, also referred to as \(j\) in (De Bacco et al. 2023) (required ✔️)
  • reporter: the person reporting on the tie, as referred to as \(m\) in (De Bacco et al. 2023) (required ✔️)
  • tie_type: the prompt type that was given, e.g., borrow money or lend money (optional)
  • layer: the relationship type, e.g., money (optional)
  • weight: the weight of the edge, e.g., the actual amount of money transferred (optional)

Suppose we have a data frame edgelist of our village’s data that looks like this:

set.seed(100) # for reproducibility

edgelist %>% dplyr::sample_n(size = 10, replace = FALSE)
A typical data input for VIMuRe
ego alter reporter tie_type layer weight
110501 110601 110501 visitcome visit 1
108602 109502 109502 keroricego kerorice 1
107308 106302 107308 lendmoney money 1
114701 117501 114701 keroricecome kerorice 1
104001 102101 104001 visitcome visit 1
115503 115504 115504 helpdecision advice 1
110601 102901 110601 lendmoney money 1
112902 106802 106802 helpdecision advice 1
110402 116203 116203 borrowmoney money 1
100702 100801 100702 lendmoney money 1

(Ensure that column data types are strings, not numeric.)

Alternatively, you might have your data as a directed igraph object. In this case, you must ensure that the igraph object has the following edge attributes:

  • reporter: the person reporting on the tie, as referred to as \(m\) in (De Bacco et al. 2023) (required ✔️)
  • tie_type: the prompt type that was given (optional)
  • layer: the tie type (optional)
  • weight: the weight of the edge (optional)

Note that the graph_from_data_frame function assumes that the first two columns in the data frame comprise the edge list (i.e., the \(ego\) and \(alter\) columns and treat all subsequent columns as edge attributes).

G <- igraph::graph_from_data_frame(edgelist, directed=TRUE)

Suppose our village’s data is in the form of an igraph object G:

summary(G)

which gives us the following summary:

IGRAPH 46333aa DNW- 417 2690 -- 
+ attr: name (v/c), reporter (e/c), tie_type (e/c), layer (e/c), weight | (e/n)

📊 Step 2: Summary statistics

Let’s take a closer look at how many nodes and edges we have in our data. We can also look at how many reporters we have in our data.

2.1 Number of nodes

unique_nodes <- dplyr::union(edgelist$ego, edgelist$alter) %>% unique()

cat("Number of nodes:", length(unique_nodes), "\n")
cat("Number of edges:", nrow(edgelist), "\n")
Number of nodes: 417
Number of edges: 2690 
cat("Number of nodes:", igraph::vcount(G), "\n")
cat("Number of edges:", igraph::ecount(G), "\n")
Number of nodes: 417
Number of edges: 2690 

2.2 Number of reporters

reporters <- edgelist$reporter %>% unique()
cat("Number of reporters: ", length(reporters), "\n")
Number of reporters:  203 
reporters <- igraph::get.edge.attribute(G, "reporter") %>% unique()
cat("Number of reporters: ", length(reporters), "\n")
Number of reporters:  203 

2.3: Average number of ties per reporter

There are a couple of things to note about this dataset: reporters could name a maximum of four alters for each question, and reporters can only report on ties that they are involved in (e.g., if a reporter is involved, they cannot report on it). Because we are modelling double-sampled questions, each reporter can report a maximum of 8 ties.

Let’s create a plot to visualise the distribution of the number of ties per reporter:

# Create a plot_df to summarise the number of ties per reporter

plot_df <- edgelist %>%
  group_by(layer, tie_type, reporter) %>%
  summarize(n_ties = n(), .groups="keep") %>%
  ungroup() %>%
  group_by(n_ties, tie_type, layer) %>%
  summarize(n_reporters = n(), .groups="keep") %>%
  ungroup() %>%
  arrange(layer, n_ties, tie_type)
# Create a plot_df to summarise the number of ties per reporter
plot_df <- data.frame(get.edgelist(G))
colnames(plot_df) <- c("ego", "alter")

add_cols <- list.edge.attributes(G)
for (i in 1:length(add_cols)) {
  plot_df[, add_cols[i]] <- get.edge.attribute(G, add_cols[i])
}

plot_df <- plot_df %>%
  group_by(layer, tie_type, reporter) %>%
  summarize(n_ties = n(), .groups="keep") %>%
  ungroup() %>%
  group_by(n_ties, tie_type, layer) %>%
  summarize(n_reporters = n(), .groups="keep") %>%
  ungroup() %>%
  arrange(layer, n_ties, tie_type)

producing the following table:

n_ties tie_type layer n_reporters
1 giveadvice advice 115
1 helpdecision advice 104
2 giveadvice advice 45
2 helpdecision advice 73
3 giveadvice advice 6
3 helpdecision advice 13
4 giveadvice advice 1
4 helpdecision advice 2
n_ties tie_type layer n_reporters
1 keroricecome kerorice 41
1 keroricego kerorice 42
2 keroricecome kerorice 99
2 keroricego kerorice 100
3 keroricecome kerorice 45
3 keroricego kerorice 40
4 keroricecome kerorice 2
4 keroricego kerorice 4
n_ties tie_type layer n_reporters
1 borrowmoney money 57
1 lendmoney money 66
2 borrowmoney money 92
2 lendmoney money 86
3 borrowmoney money 31
3 lendmoney money 16
4 borrowmoney money 1
n_ties tie_type layer n_reporters
1 visitcome visit 45
1 visitgo visit 36
2 visitcome visit 84
2 visitgo visit 67
3 visitcome visit 39
3 visitgo visit 38
4 visitcome visit 11
4 visitgo visit 31

Alternatively as a bar plot:

Click to show plot code
library(ggplot2)

plot_df$tie_type <- 
  factor(plot_df$tie_type, 
         levels = c("visitgo", "visitcome",
                    "keroricego", "keroricecome", 
                    "borrowmoney", "lendmoney",
                    "helpdecision", "giveadvice"))

plot_df$layer <- 
  factor(plot_df$layer, 
         levels = c("advice", "money", "kerorice", "visit"))

g = (
    ggplot(plot_df, aes(x = tie_type, y=n_reporters, fill=layer)) +
    geom_col() +
    coord_flip() +

    facet_grid(n_ties ~ ., labeller=label_both) +

    theme_bw() +
    theme(legend.position = "bottom",
          legend.box.spacing = unit(0.5, "pt")) +
    labs(x = "Number of ties reported for this prompt", 
         y = "Number of reporters",
         fill = "Layer")
)

ggsave(plot=g, filename="tutorial02_fig01.png", width=6, height=8, dpi=300)

Important

Note that the plot above only represents the reports made by the 203 nodes that appear in the reporter column. Given that the total number of nodes in the network is 417, either some nodes did not report on any ties, or they were not interviewed at all.

VIMuRe will consider all nodes in the network, even if they are not present in the reporter column. If you want to restrict your analysis to include only the network of reporters, you must filter the edge list before proceeding.

2.4: Union vs. Intersection and the concept of Concordance

Concordance is the proportion of ties in a network that both reporters report. It measures the extent to which the two reporters agree about the tie (see (Ready and Power 2021) for a discussion of concordance in the Karnataka network data). It is calculated as follows:

\[ \text{Concordance} = \frac{\text{\# of ties reported by both reporters}}{\text{\# number of unique ties reported}} \]

# Take the intersection: keep only records where 
# both reporters report on the same tie in both tie_types
df_intersection <- 
  edgelist %>% 
  group_by(ego, alter, .drop=FALSE) %>% 
  filter(n() == 2) %>% 
  select(ego, alter) %>%
  distinct()

# Take the union: keep all ties reported 
# irrespective of tie_type and how many times they were reported
df_union <- edgelist %>% select(ego, alter) %>% distinct()

# Concordance
cat(paste0("Concordance is: ", nrow(df_intersection) / nrow(df_union)))
# Take the intersection: keep only records where
# both reporters report on the same tie in both tie_types
which_edges_intersection <- E(G)[count_multiple(G, E(G)) == 2]

G_intersection <- 
  igraph::subgraph.edges(G, which_edges_intersection) %>%
  igraph::simplify()

# Take the union: keep all ties reported
# irrespective of tie_type and how many times they were reported

G_union <- igraph::simplify(G)

# Concordance
cat(
  paste0(
    "Concordance is: ", 
    igraph::ecount(G_intersection) / igraph::ecount(G_union)
  )
)

producing:

Concordance is: 0.200226244343891

📦 Step 3: Use vimure package to run the model

Now that you have prepared your data, you can use vimure to run the model:

library(vimure)

# Run the model
model <- vimure(edgelist)
library(vimure)

# Run the model
model <- vimure(G)

(If you would like to see what the model is doing, you can set verbose=True in the fit() method)

After which, you might be interested to look at the posterior estimates of the model parameters:

# Returns a list with the posterior estimates of the model parameters
posterior_estimates <- model$get_posterior_estimates()

# Check which parameters were estimated
names(posterior_estimates)
"nu"     "theta"  "lambda" "rho" 

We will look at what these mean in the next section.

3.1: Posterior Estimates

The four values you get from the get_posterior_estimates() method are the geometric expectations of the distributions of the following parameters:

Parameter Symbol Description
nu \(\nu\) The inferred mutuality for the network (latent equivalent to \(\eta\))
theta \(\theta_{lm}\) The “reliability” of a reporter \(m\) on layer \(l\)
lambda \(\lambda_{lk}\) The latent contribution that \(Y_l=k\) has on \(\theta_{lm}\), where \(k \in \{1, \ldots, K\}\) represents the weight of the tie.
rho \(\rho_{lijk}\) The probability that a directed tie of weight \(k-1\) exists between nodes \(i\) and \(j\) on a particular layer \(l\)

Remember that the expected value of our model is:

\[ \mathbb{E} \left[X_{lijm}^{}|Y_{lij}^{} = k\right] = \theta_{lm}\lambda_{lk}^{}+\eta X_{ljim}^{} \]

where \(X_{lijm}^{}\) is the observed value of the tie between nodes \(i\) and \(j\) reported by reporter \(m\) on layer \(l\) and \(Y_{lij}^{}\) is the “ground truth” of the strength of the directed tie between nodes \(i\) and \(j\) on layer \(l\).

In the subsections below, we will provide an explanation of \(\nu\) and \(\theta_{lm}\) parameters and how to reason about the mutuality and “reliability” of reporters. \(\rho\) and \(\lambda\) are more technical and we will go into detail about them in a future tutorial.

A quick note about \(\rho\)

If you want to extract a quick point estimate of the inferred network, you can simply take the expected value of \(\rho\) for \(k=1\):

# Expected value of rho for k=1
rho = posterior_estimates[['rho']][1,,,2]

The code above will return a \(N \times N\) weighted adjacency matrix, which can be thought of as the probability of a tie between individuals \(i\) and \(j\) on layer \(l = 0\) (Tutorial 03 will go into more detail about how to interpret the \(\rho\) parameter.)

Now let’s look at parameters \(\nu\) and \(\theta_{lm}\):

3.1.1: Mutuality (\(\nu\) or \(\eta\))

In our case study:

posterior_estimates[['nu']]
0.6289576

This indicates a large mutuality in the network. That is, if \(i\) reports that \(j\) lends money to them, then \(i\) is likely to report that \(j\) also borrows money from them.

This is a network-wide parameter and encompasses all layers. If you have reasons to believe mutuality varies across layers, you have to re-run the model for each layer separately.

3.1.2: Under-reporting vs over-reporting: a combination of (\(\theta_{lm} \times \lambda_{lk}\))

First, let’s look at the shape of the \(\theta\) array:

dim(posterior_estimates[['theta']])
   4 417

We have four layers, so the first dimension is 4. The second dimension is the number of reporters in the network. In our case, this shows that VIMuRe infers that there are 417 reporters in the network.

This parameter, combined with \(\lambda_{lk}\) represents the “reliability” of a reporter \(m\) on layer \(l\):

dim(posterior_estimates[['lambda']])
 4 2

The first dimension is the number of layers \(l\), and the second dimension is the strength of the tie, which in this case can assume the values \(k \in \{0, 1\}\). We only have two possible values for \(k\) because that is the default setting for the VimureModel class. If you want to model more than two possible values for \(k\), you can set the K parameter when running the fit function: model.fit(..., K=3).

These two parameters combined indicate the probability that a reporter will tend to report ties, in general, for a given layer (recall the expected value of the VIMuRe model shown in Section 3.1: Posterior Estimates). Loosely speaking, we can think of this value as the “reliability” of a reporter on a given layer.

You might be wondering why there are so many reporters, 417. After all, in Section 2.2: Number of reporters, we calculated this number to be around 200! This is due to a technicality in the way the VIMuRe package handles the data. Our current implementation represents the total number of reporters, \(M\) as \(M = N\) (i.e. the number of nodes is equal to the number of reporters).

This is akin to saying each node has the potential to be a reporter. But to control for which nodes are actually reporters, we use a binary mask, \(R_{lijm}\), which is 1 if the node \(m\) is a reporter on layer \(l\) for the tie between nodes \(i\) and \(j\) and 0 otherwise. You can control how this mask is built by passing the R parameter to the fit() method. Read more about it in section 3.3: A note on the reporter mask (R).

Changing this is a priority for the next version of the package. But, as it is not a trivial change, we have decided to leave it as it is for now.

How are reporters’ “reliability” distributed across layers?

Let’s check which reporters have the highest “reliability” values, and plot the values for each layer:

Click to see the code
# Get the node mapping: which node corresponds to which index in the array
df_nodes <- model$nodeNames

# Which nodes are reporters?
# Note: this assumes you have a `reporters` list from Tutorial 01
id_reporters <- df_nodes %>% filter(name %in% reporters) %>% pull(id)

df_lambda_k1 <- data.frame(
  layer = model$layerNames,
  lambda_k1 = posterior_estimates$lambda[,2]
)

df_theta_reporters <- purrr::map_dfr(id_reporters, ~ {
  reporter_id <- .x
  purrr::map_dfr(seq_along(model$layerNames), ~ {
    l <- .x
    layer_name <- model$layerNames[l]
    theta <- posterior_estimates$theta[l, reporter_id] * df_lambda_k1$lambda_k1[l]
    reporter_name <- df_nodes$name[reporter_id]
    tibble(
      layer = layer_name,
      reporter_id = reporter_id,
      reporter_name = reporter_name,
      theta_scaled = theta
    )
  })
})


# Sort by reliability (scaled by lambda_k=1)
df_theta_reporters %>%
  pivot_wider(names_from = layer, values_from = theta_scaled, values_fn = list) %>%
  select(-1) %>%
  unnest(cols = everything(), names_repair = "unique") %>%
  arrange(desc(advice), desc(kerorice), desc(money), desc(visit)) %>%
  head()
Click to see the code
library(ggplot2)

COLORS =  c('#E69F25', '#5CB4E4', '#069F72', '#F1E545', '#0773B2', '#CC79A8')

g = (ggplot(df_theta_reporters,  aes(x=theta_scaled, fill=layer)) +
     geom_histogram(binwidth=0.1, color="#212121") +
     theme_bw() + 
     facet_grid(layer ~ ., scales="free_y") +
     scale_fill_manual(values=COLORS, guide="none") +
     theme(axis.text.x=element_text(size=10),
           axis.text.y=element_text(size=10),
           panel.grid.minor=element_blank(),
           panel.grid.major=element_blank(),           
           strip.background = element_blank(),
           strip.text.x = element_text(size=10),
           axis.title=element_text(size=16)) +
    scale_x_continuous(name = expression(paste(theta * lambda[k==1])), 
                      breaks = seq(0, 1.1, by = 0.1), 
                      limits = c(0, 1.1)) +
     scale_y_continuous(name="Count", breaks=seq(0, 60, by=10), limits=c(0, 50)) +
     ggtitle(expression(paste("Reliability distribution across layers (", theta[{l}], " \u00D7 ", lambda[k==1], ")"))))

ggsave(plot=g, filename="theta_reporters.png", width=7, height=9, dpi=300)

How should we interpret the values?

The interpretation depends on both the mutuality of the network and the survey design. If the mutuality is high (as indicated by \(\nu\) being closer to 1), reporters will naturally tend to reciprocate the ties they report, by default, thus reporters with high \(\theta\) values (relative to the \(\theta\) distribution) are likely “over-reporting” ties.

Let’s investigate this a bit further by checking the ties reported by a top reporter (113302) in the advice layer:

reporter_name advice kerorice money visit
113302 0.8970675 0.7397009 0.4591973 0.2099856
113901 0.8780586 0.8209907 0.8680566 0.6693385
109902 0.8504384 0.7928275 0.0000155 0.6415688
109505 0.8504182 0.0000130 0.8384146 0.8622429
105902 0.8504050 0.0000128 0.4580637 0.3737390
105402 0.8503861 0.5638047 0.4760844 0.7225281

Most reporters have reliability in the \([0.4, 0.6]\) interval, but 113302 has a “reliability” of \(\approx 0.9\) in the advice layer. Let’s check the ties reported by this reporter:

edgelist %>% filter(reporter == "113302", layer == "advice")
ego alter reporter tie_type layer weight
113302 110502 113302 giveadvice advice 1
113302 113102 113302 giveadvice advice 1
105701 113302 113302 helpdecision advice 1
110501 113302 113302 helpdecision advice 1
113601 113302 113302 helpdecision advice 1
113301 113302 113302 helpdecision advice 1

Node 113302 does not report any reciprocal ties, but they do report ties with six other nodes.

Are all nodes also reporters?

Two of these six nodes are not reporters:

nodes_mentioned <- c("110502", "113102", "105701", "110501", "113601", "113301")

not_reporters <- nodes_mentioned %>% setdiff(reporters)
not_reporters
 "113102" "105701"

We will never be able to confirm ties between 113302 and these nodes and this is already one of the reasons why 113302’s “reliability” is high. (Note that these two non-reporter nodes will have reliability values of \(\approx 0\)).

On top of that, when we look at the reports made by the remaining four reporters, only node 113301 confirms the helpdecision tie reported by 113302:

potential_friends = c("110502", "110501", "113601", "113301")

edgelist %>% filter(reporter %in% potential_friends & (ego == "113302" | alter == "113302"))
ego alter reporter tie_type layer weight
113301 113302 113301 giveadvice advice 1

That is why 113302’s \(\theta\lambda_{k=1}\) is high: they report a lot of ties, but only one of them is confirmed by other reporters.

Nodes with values in the average range of the \(\theta_{lm}\lambda_{lk}\) distribution is likely to be more “reliable” reporters.

3.2: Implicit assumptions

You might have noticed the following warning messages when you ran vimure():

UserWarning: The set of nodes was not informed, using ego and alter columns to infer nodes.

UserWarning: The set of reporters was not informed, assuming set(reporters) = set(nodes) and N = M.

UserWarning: Reporters Mask was not informed (parameter R). Parser will build it from reporter column, assuming a reporter can only report their own ties.

The messages are not errors; they are simply indications that the VIMuRe package has made some implicit assumptions when loading the data. If you would like to set the nodes, and the reporter mask explicitly, you can pass these values as arguments to the fit() method. For example:

vimure(edgelist, 
       nodes=all_nodes, 
       R=reporter_mask)

3.3: A note on the reporter mask (R)

While nodes and reporters are simple lists of strings, the reporter mask R is a multidimensional sparse logical array (a tensor) with dimensions \(L \times N \times N \times N\). Here, R[l, i, j, m] indicates that reporter \(m\) can report on the tie between \(i\) and \(j\) if the tie belongs to layer \(l\).

So, for example, if the survey design was such that reporters could only report on ties involving themselves, this implies:

\[\begin{cases} R_{lijm} = 1 & \text{if } i = m \text{ or } j = m \\ R_{lijm} = 0 & \text{otherwise}. \end{cases}\]

Depending on your survey design, you might want to construct R manually. For example, if all reporters were asked about all ties – regardless of whether they were involved in the ties or not – then R would be a multi-dimensional array of ones. We are planning a future tutorial on how to handle different survey designs.

It’s worth noting that currently, the last dimension has size \(N\) instead of \(M\). This is because the current implementation of the code assumes that all reporters are part of the network. The mathematical model, as described in (De Bacco et al. 2023), is more flexible and supports an arbitrary set of reporters, whether they are a subset of nodes or not. However, we have not yet implemented this functionality in the code. Our survey designs have so far assumed that all reporters are part of the network, so this has not been a problem. If this is a feature you would like to see sooner, let us know by adding a discussion on GitHub.

4. Next steps

Future tutorials will cover:

  • How to extract point estimates and visualise the resulting network using \(\rho\)
  • How to handle different survey designs
  • How to sample from the posterior distribution
  • (Advanced) Tweaking the optimization parameters
  • (Advanced) Understanding the convergence diagnostics

References

Banerjee, Abhijit, Arun G. Chandrasekhar, Esther Duflo, and Matthew O. Jackson. 2013a. “The Diffusion of Microfinance (Dataset).” Harvard Dataverse. https://doi.org/10.7910/DVN/U3BIHX.
———. 2013b. “The Diffusion of Microfinance.” Science 341 (6144): 1236498. https://doi.org/10.1126/science.1236498.
De Bacco, Caterina, Martina Contisciani, Jonathan Cardoso-Silva, Hadiseh Safdari, Gabriela Lima Borges, Diego Baptista, Tracy Sweet, et al. 2023. “Latent Network Models to Account for Noisy, Multiply Reported Social Network Data.” Journal of the Royal Statistical Society Series A: Statistics in Society, February, qnac004. https://doi.org/10.1093/jrsssa/qnac004.
Ready, Elspeth, and Eleanor A. Power. 2021. “Measuring Reciprocity: Double Sampling, Concordance, and Network Construction.” Network Science 9 (4): 387–402. https://doi.org/10.1017/nws.2021.18.