Lab 05 - Rcpp and C++
Learning goals
- Use the different data types in Rcpp.
- Learn some fundamentals about C++ optimization.
- Practice your GitHub skills.
Lab description
For this lab, we will create a function for propensity score matching. The goal is simple: write a C++ function with Rcpp and measure how much faster it is than the following R implementation:
ps_matchR <- function(x) {
  match_expected <- as.matrix(dist(x))
  diag(match_expected) <- .Machine$integer.max
  indices <- apply(match_expected, 1, which.min)
  list(
    match_id = as.integer(unname(indices)),
    match_x  = x[indices]
  )
}
Question 1: Create a simple function
Use the following pseudo-code template to get started:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
[output must be list] ps_match1(const NumericVector & x) {

  ...prepare the output (save space)...
  ...it should be an integer vector indicating the id of the match...
  ...and a numeric vector with the value of `x` for the match...

  for (...loop over i...) {
    for (...loop over j and check if it is the optimum...) {
      if (...the closest so far...) {
        ...update the optimum...
      }
    }
  }

  return [a list like the R function]
}
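The control flow of the template can be sketched in plain C++ before wiring it up to Rcpp. This is only an illustration under stated assumptions: `std::vector` stands in for `NumericVector` and `IntegerVector`, and the `PsMatch` struct and the `ps_match_naive` name are hypothetical stand-ins for the list the lab asks you to return. Ids are shifted by one so they line up with R's 1-based indexing, and the strict `<` keeps the first minimum on ties, as `which.min` does.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical result type standing in for the Rcpp List of the lab.
struct PsMatch {
    std::vector<int> match_id;    // 1-based id of the closest observation
    std::vector<double> match_x;  // value of x at that observation
};

// Hypothetical helper: full double loop, n * (n - 1) distance evaluations.
PsMatch ps_match_naive(const std::vector<double>& x) {
    const std::size_t n = x.size();
    PsMatch out{std::vector<int>(n, 0), std::vector<double>(n, 0.0)};

    for (std::size_t i = 0; i < n; ++i) {
        // Smallest distance found so far for observation i.
        double best = std::numeric_limits<double>::max();
        for (std::size_t j = 0; j < n; ++j) {
            if (i == j) continue;  // an observation cannot match itself
            const double d = std::abs(x[i] - x[j]);
            if (d < best) {        // strict '<' keeps the first minimum
                best = d;
                out.match_id[i] = static_cast<int>(j) + 1;  // R is 1-based
                out.match_x[i] = x[j];
            }
        }
    }
    return out;
}
```

For the input `{0.1, 0.5, 0.15, 0.9}` the matched ids would be 3, 3, 1, 2. In the Rcpp version, the two vectors would be allocated once up front and returned with `List::create`.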
Question 2: Things can be done faster
In the previous question, the double loop visits every ordered pair of observations, so each distance is computed twice. Rewrite the C++ function so that the number of distance computations falls below n^2 (hint: distance is symmetric).
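One way to read the hint, sketched in plain C++ rather than Rcpp: since the distance from i to j equals the distance from j to i, the inner loop can start at i + 1 and update the best match of both observations at once, which needs n(n - 1)/2 distance evaluations. The `PsMatch` struct and the `ps_match_symmetric` name are hypothetical, and `std::vector` again stands in for the Rcpp vector types; treat this as a sketch, not as the expected solution.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical result type standing in for the Rcpp List of the lab.
struct PsMatch {
    std::vector<int> match_id;    // 1-based id of the closest observation
    std::vector<double> match_x;  // value of x at that observation
};

// Hypothetical helper: each unordered pair (i, j) is visited only once.
PsMatch ps_match_symmetric(const std::vector<double>& x) {
    const std::size_t n = x.size();
    PsMatch out{std::vector<int>(n, 0), std::vector<double>(n, 0.0)};

    // best[k] holds the smallest distance seen so far for observation k.
    std::vector<double> best(n, std::numeric_limits<double>::max());

    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {  // upper triangle only
            const double d = std::abs(x[i] - x[j]);
            if (d < best[i]) {  // j is the closest to i so far
                best[i] = d;
                out.match_id[i] = static_cast<int>(j) + 1;
                out.match_x[i] = x[j];
            }
            if (d < best[j]) {  // by symmetry, i may be the closest to j
                best[j] = d;
                out.match_id[j] = static_cast<int>(i) + 1;
                out.match_x[j] = x[i];
            }
        }
    }
    return out;
}
```

The trade-off is one extra vector of length n to remember the running minimum for every observation, in exchange for roughly half the distance evaluations. The growth rate is still quadratic in n; only the constant changes.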