Hello world
PHS 7045: Advanced Programming
The University of Utah
2024-09-17
Learning objectives:
We will need a compiler:
The program
library() in R. This is part of the standard library.
cout function from std (standard library). Also, the code ends with a semicolon (;).
We can use g++ to compile the code (-std=c++14 is the C++14 standard):
Download the program here.
#include<iostream> // To print
#include<vector> // To use vectors

int main() {

    // Defining the data
    std::vector< double > dat = {1.0, 2.5, 4.4};

    // Making room for the output
    double ans = 0.0;

    // Looping through the data
    for (int i = 0; i < dat.size(); ++i)
        ans = ans + dat[i];

    ans = ans/dat.size();

    // Print out the value to the screen
    std::cout << "The mean of dat is " << ans << std::endl;

    // Returning
    return 0;
}
We can leverage modern C++ to make the code shorter with std::accumulate():
#include<cstdio> // To print with std::printf
#include<vector> // To use vectors
#include<numeric> // To use the accumulate function

int main() {

    // Defining the data
    std::vector< double > dat = {1.0, 2.5, 4.4};

    // Summing the data (the initial value 0.0 makes the sum a double)
    double ans = std::accumulate(
        dat.begin(), dat.end(), 0.0
    );

    ans /= dat.size();

    // Print out the value to the screen
    std::printf("The mean of dat is %.2f\n", ans);

    // Returning
    return 0;
}
Here are some differences between C++ and R:
Feature | C++ | R |
---|---|---|
Type | Compiled | Interpreted |
Type explicit? | Yes | No |
Index starts at | 0 | 1 |
for loop | for (int i = 0; i < n; ++i) | for (i in 1:n) |
Line ending | “;” | “\n” (implicit) |
Compiled: the code is translated to machine code before running, allowing for faster execution. Interpreted: the code is executed directly by an interpreter, without a separate compilation step, allowing interactive programming.
Type explicit: in C++, we always declare the type of the variables. In R, we don’t need to.
Adapted from W3 Schools:
int my_num = 5; // Integer (whole number)
float my_float_num = 5.99; // Floating point number
double my_double_num = 9.98; // Floating point number
char my_letter = 'D'; // Character
bool my_boolean = true; // Boolean
std::string my_text = "Hello"; // String
Vectors in C++ are similar to lists in R:
Vectors make life easier, avoiding the need to manage memory.
Vectors store their elements in contiguous memory, allowing for fast access.
Vectors have many methods to manipulate the data:
Looping through vectors can be done in different ways:
// Suppose we have this:
std::vector< int > my_vector = {1, 2, 3, 4, 5};

// Typical loop
for (int i = 0; i < my_vector.size(); ++i) {
    std::cout << my_vector[i] << std::endl;
}

// Using vector's iterators (begin and end)
// and the auto keyword
for (auto i = my_vector.begin(); i != my_vector.end(); ++i) {
    std::cout << *i << std::endl;
}

// Using range-based for loop (with the auto keyword)
for (auto i: my_vector) {
    std::cout << i << std::endl;
}
In the iterator loop, i acts like a pointer to the value, which is why it is dereferenced with *i.
Types can be accompanied by keywords:
const int x = 5;
double fun(int x);
double fun(const int x);
double fun(int & x);
double fun(const int & x);
double fun(int * x);
const: the value of x cannot be changed. Trying to modify it will result in a compilation error.
x is passed by copy (not ideal for large objects). It can be modified inside the function.
x is still a copy, but it cannot be modified.
&: passing by reference. Ideal for large objects. It can be modified.
const &: passing by reference, but cannot be modified.
*: passing by pointer. The value can be modified. NOT RECOMMENDED FOR C++.
The following code (pointers.cpp) illustrates how these keywords work:
#include <cstdio> // For the std version of printf

void set_x_copy(int x, int y) {x = y;}
void set_x(int * x, int y) {*x = y;}
void set_x_ref(int & x, int y) {x = y;}

int main() {
    int x = 0;

    set_x_copy(x, 3);
    std::printf("x = %d\n", x);

    set_x(&x, 2);
    std::printf("x = %d\n", x);

    set_x_ref(x, 1);
    std::printf("x = %d\n", x);

    return 0;
}
set_x_copy: x is passed by copy, and thus the value is not modified.
set_x: x is passed by pointer, and thus the value is modified. *x accesses the value at x, and &x is its memory address.
set_x_ref: x is passed by reference, and thus the value is modified. Passing by reference is the preferred way in C++.
Example class (you can download the file here):
#ifndef PERSON_HPP
#define PERSON_HPP

#include<string>
#include<iostream>

class Person {
private:
    std::string name;
    int age;
    double height;

public:
    // Constructor
    Person(std::string n, int a, double h) {
        name = n;
        age = a;
        height = h;
    }

    // Default constructor
    Person() : name("Unknown"), age(0), height(0.0) {}

    // Destructor
    ~Person() {
        std::cout <<
            this->name + " destroyed" <<
            std::endl;
    }

    // Getters and setters
    std::string get_name() { return name; }
    void set_name(std::string n) { name = n; }
};

#endif
#ifndef + #define + #endif is the include guard. It avoids multiple inclusions.
Members of the class can be accessed through this->.
Using the class (you can download the file here):
#include<string>
#include<iostream>
#include "person.hpp"

int main() {
    Person p1; // Default constructor
    Person p2("John", 30, 1.80); // Other constructor

    std::cout << p1.get_name() << std::endl;
    std::cout << p2.get_name() << std::endl;

    return 0;
}
Compiling and executing the program:
Notice that the destructor is called when p1 and p2 go out of scope (in reverse order of creation).
A good practice is to separate the declaration (bones) from the implementation (meat).
Looking at an extract of the class Person:
// ---------------------------------------
// Declarations: Arguments and data types
// ---------------------------------------
class Person {
private:
    std::string name;
    int age;
    double height;

public:
    // Constructor
    Person(std::string n, int a, double h);

    // Getters and setters
    std::string get_name();
};

// ---------------------------------------
// Implementation: Body of the functions
// ---------------------------------------
inline Person::Person(std::string n, int a, double h) {
    name = n;
    age = a;
    height = h;
}

inline std::string Person::get_name() {
    return name;
}
In C++, we can have multiple functions with the same name, but different arguments. This is called overloading.
The compiler will choose the correct function based on the arguments. Both of these functions are valid:
In C++, we can use templates to create functions or classes that can work with any data type.
This is useful when we want to create a function that works with int, double, float, etc.
Classes can also be templated (defined in template_class.cpp):
T can be any type.
When the class is used with int and double, the compiler will generate two classes during compilation.
One way to estimate \(\pi\) is to simulate points in a square and count how many are inside a circle.
The following is an optimized R function to do this:
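\(A = \pi r^2\), thus \(\pi = \frac{A}{r^2}\). With \(r = 1\), the circle has area \(\pi\) and the enclosing square \([-1, 1]^2\) has area 4, so the share of points inside the circle approximates \(\pi/4\); multiplying it by 4 gives the estimate.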
my_pi_sim <- function(n) {
  xy <- matrix(runif(n*2, min=-1, max=1), ncol = 2)
  message(
    sprintf(
      "pi approx to: %.4f",
      mean(sqrt(rowSums(xy^2)) <= 1) * 4
    )
  )
}

set.seed(331)
my_pi_sim(1e6)
pi approx to: 3.1393
Let’s see how we can do this in C++.
#include <cmath>  // For std::sqrt and std::pow
#include <cstdio> // For std::printf
#include <random> // For the random number engine and distributions

int main() {

    // Setting the seed
    std::mt19937 rng_engine;
    rng_engine.seed(123);
    std::uniform_real_distribution<double> unif(-1.0, 1.0);

    // Number of simulations
    std::size_t n_sims = 5000000u;

    // Counter of the points that fall inside the circle
    double pi_approx = 0.0;
    for (std::size_t i = 0u; i < n_sims; ++i)
    {
        // Generating a point in the square [-1, 1] x [-1, 1]
        double x = unif(rng_engine);
        double y = unif(rng_engine);

        // Distance from the point to the origin
        double dist = std::sqrt(
            std::pow(x, 2.0) + std::pow(y, 2.0)
        );

        // Checking if the point is inside the unit circle
        if (dist <= 1.0)
            pi_approx += 1.0;
    }

    std::printf("pi approx to %.4f\n", 4.0*pi_approx/n_sims);

    return 0;
}
std::pow is the power function.
The task is to write a class that computes the mean, standard deviation, minimum and maximum of a vector.
The class should be a template class so it can deal with double and int.
You can download the full C++ code here and the header file here:
#ifndef SUMMARY_HPP
#define SUMMARY_HPP

#include <vector>
#include <numeric>
#include <cmath>
#include <cstdio>

template<typename T>
class Summarizer {
private:
    const std::vector<T>* dat = nullptr;
    double n;

public:
    // Constructors
    Summarizer(const std::vector<T> & dat_);

    // Calculators
    double mean() const;
    double sd() const;
    T min() const;
    T max() const;

    // Printer
    void print() const;
};

template<typename T>
inline Summarizer<T>::Summarizer(const std::vector<T> & dat_) {
    dat = &dat_;
    n = dat->size();
}

template<typename T>
inline double Summarizer<T>::mean() const {
    return std::accumulate(
        dat->begin(), dat->end(), 0.0
    ) / n;
}

template<typename T>
inline double Summarizer<T>::sd() const {
    double m = mean();
    double sum = 0.0;
    for (auto & i: *dat)
        sum += std::pow(i - m, 2.0);
    return std::sqrt(sum / (dat->size() - 1));
}

template<typename T>
inline T Summarizer<T>::min() const {
    T min = (*dat)[0];
    for (std::size_t i = 1u; i < dat->size(); ++i)
        if ((*dat)[i] < min)
            min = (*dat)[i];
    return min;
}

template<typename T>
inline T Summarizer<T>::max() const {
    T max = (*dat)[0];
    for (std::size_t i = 1u; i < dat->size(); ++i)
        if ((*dat)[i] > max)
            max = (*dat)[i];
    return max;
}

template<>
inline void Summarizer<double>::print() const {
    std::printf("Summary for double data\n");
    std::printf("Mean : %.2f\n", mean());
    std::printf("SD : %.2f\n", sd());
    std::printf("Min : %.2f\n", min());
    std::printf("Max : %.2f\n", max());
}

template<>
inline void Summarizer<int>::print() const {
    std::printf("Summary for int data\n");
    std::printf("Mean : %.2f\n", mean());
    std::printf("SD : %.2f\n", sd());
    std::printf("Min : %d\n", min());
    std::printf("Max : %d\n", max());
}

#endif
template<typename T>
inline Summarizer<T>::Summarizer(const std::vector<T> & dat_) {
    dat = &dat_;
    n = dat->size();
}
The implementation of the constructor is done outside of the class.
The inline keyword suggests that the compiler insert the code in the place where the function is called (which can be more efficient). It also allows the definition to live in a header that is included from several source files.
Here, the data is passed by reference and then a pointer to it is stored, so no copy is made. The vector must therefore outlive the Summarizer object.
template<typename T>
inline double Summarizer<T>::mean() const {
    return std::accumulate(
        dat->begin(), dat->end(), 0.0
    ) / n;
}
The function is declared as const to tell the compiler that the function does not modify the object (the instance it is called on).
The mean function uses the std::accumulate function.
Since dat is a pointer to a vector, we access the members of dat via the -> operator (with an object or a reference, we would use the . operator instead).
#include "summary.hpp"

int main() {

    // Some data
    std::vector< double > dat = {1.0, 2.5, 4.4};
    std::vector< int > dat2 = {1, 2, 3, 4, 5};

    // Summarize the data
    Summarizer<double> s_double(dat);
    s_double.print();

    Summarizer<int> s_int(dat2);
    s_int.print();

    return 0;
}
Summary for double data
Mean : 2.63
SD : 1.70
Min : 1.00
Max : 4.40
Summary for int data
Mean : 3.00
SD : 1.58
Min : 1
Max : 5