Intro to C++

PHS 7045: Advanced Programming

George G. Vega Yon, Ph.D.

The University of Utah

2024-09-17

Introduction

Learning objectives:

  • Understand the basics of C++ programming: syntax, types, and classes.
  • Learn how to write a simple C++ program, compile it, and run it.
  • Understand the differences between C++ and R.

We will need a compiler:

  • Windows: You can download Rtools from here.

  • MacOS: It is a bit complicated… Here are some options:

    • CRAN’s manual to get the clang, clang++, and gfortran compilers here.

    • A great guide by the coatless professor here

  • If you don’t have compiler installed, you can join the class via posit.cloud.

Start!

Hello world

The program

1#include<iostream>

2int main() {
3  std::cout << "Hello world" << std::endl;
4  return 0;
}
1
The equivalent to library() in R. This is part of the standard library.
2
C++ is type-explicit, so we always declare what are the data types.
3
Like in R, we have namespaces. We access the cout function from std (standard library). Also, the code ends with semicolon (;).
4
Explicit return.

We can use g++ to compile the code (-std=c++14 is the C++14 standard):

g++ -std=c++14 hello-world.cpp -o hello-world
./hello-world
Hello world

Example: Computing the mean

Download the program here.

#include<iostream> // To print
#include<vector>   // To use vectors

int main() {

  // Defining the data
  std::vector< double > dat = {1.0, 2.5, 4.4};
  
  // Making room for the output
  double ans = 0.0;

  // Looping through the data
  for (int i = 0; i < dat.size(); ++i)
    ans = ans + dat[i];

  ans = ans/dat.size();

  // Print out the value to the screen
  std::cout << "The mean of dat is " << ans << std::endl;

  // Returning
  return 0;

}
  • Loading the libraries
  • Creating a vector double with three values.
  • A for-loop that starts from zero and goes to the size of the vector, incrementing by one (++i).

To compile it, run the following command:

g++ -std=c++14 means.cpp -o means
./means
The mean of dat is 2.63333

Example: Computing the mean (take 2)

We can leverage modern C++ to make the code shorter with std::accumulate()

#include<iostream> // To print
#include<vector>   // To use vectors
#include<numeric>  // To use the accumulate function

int main() {

  // Defining the data
  std::vector< double > dat = {1.0, 2.5, 4.4};
  
  // Making room for the output
  double ans = std::accumulate(
    dat.begin(), dat.end(), 0.0
    );
  ans /= dat.size();

  // Print out the value to the screen
  printf("The mean of dat is %.2f\n", ans);

  // Returning
  return 0;

}
  • Using the numeric library that has the accumulate function
  • The std::accumulate function summs the elements of the vector.

To compile it, run the following command:

#| eval: true
#| echo: true
g++ -std=c++14 means2.cpp -o means2
./means2

Language fundamentals

Differences with R

Here are some differences between C++ and R:

Feature C++ R
Type Compiled Interpreted
Type explicit? Yes No
Index starts at 0 1
for loop for (int i = 0; i < n; ++i) for (i in 1:n)
Line ending ; \n” (implicit)
  • Compiled: the code is translated to machine code before running, allowing for faster execution. Interpreted: the code is executed, allowing interactive programming.

  • Type explicit: in C++, we always declare the type of the variables. In R, we don’t need to.

Fundamental types

Adapted from W3 Schools:

int my_num           = 5;       // Integer (whole number)
float my_float_num   = 5.99;    // Floating point number
double my_double_num = 9.98;    // Floating point number
char my_letter       = 'D';     // Character
bool my_boolean      = true;    // Boolean
std::string my_text  = "Hello"; // String

Vectors in C++ are similar to lists in R:

std::vector< int > my_vector = {1, 2, 3, 4, 5};
std::vector< std::string > my_str_vector = {"a", "b", "c"};
std::vector< std::vector< int > > my_matrix = {{1, 2}, {3, 4}};

Vectors in C++

  • Vectors make life easier, avoiding the need to manage memory.

  • Vectors store contiguous memory, allowing for fast access.

  • Vectors have many methods to manipulate the data:

my_vector.push_back(6); // Add an element
my_vector.pop_back();   // Remove the last element
my_vector.size();       // Number of elements
my_vector[0];           // Access the first element
my_vector.at(0)         // Access the first element (safer)

Vectors in C++: Looping

Looping through vectors can be done in different ways:

// Suppose we have this:
std::vector< int > my_vector = {1, 2, 3, 4, 5};`

// Typical loop
for (int i = 0; i < my_vector.size(); ++i) {
  std::cout << my_vector[i] << std::endl;
}

// Using vector's iterators (begin and end)
// and the auto keyword
for (auto i = my_vector.begin(); i != my_vector.end(); ++i) {
  std::cout << *i << std::endl;
}

// Using range-based for loop (with the auto keyword)
for (auto i: my_vector) {
  std::cout << i << std::endl;
}
  • The typical loop, access elements by index.
  • Using iterators, a bit more complex, i becomes a pointer to the value.
  • The range-based for loop, simpler and cleaner.

Important keywords

Types can go accompained by keywords:

1const int x = 5;
2double fun(int x)
3double fun(const int x)
4double fun(int & x)
5double fun(const int & x)
6double fun(int * x)
1
const: the value of x cannot be changed. Trying to modify it will result in a compilation error.
2
x is passed by copy (not ideal for large objects). It can be modified inside the function.
3
x is still a copy, but it cannot be modified.
4
&: passing by reference. Ideal for large objects. It can be modified.
5
const &: passing by reference, but cannot be modified.
6
*: passing by pointer. The value can be modified. NOT RECOMENDED FOR C++

Important keywords: Example with pointers

The following code (pointers.cpp) illustrates how these keywords work:

#include <cstdio> // For the std version of printf

void set_x_copy(int x, int y) {x = y;}; 
void set_x(int * x, int y) {*x = y;};   
void set_x_ref(int & x, int y) {x = y;};

int main() {

    int x = 0;

    set_x_copy(x, 3);
    std::printf("x = %d\n", x);
    set_x(&x, 2);
    std::printf("x = %d\n", x);
    set_x_ref(x, 1);
    std::printf("x = %d\n", x);

    return 0;

}
  • set_x_copy: x is passed by copy, and thus the value is not modified.
  • set_x: x is passed by pointer, and thus the value is modified. *x access the value at x, and &x is it’s memory address.
  • set_x_ref: x is passed by reference, and thus the value is modified. Passed by reference is the preferred way in C++.

To compile and run the code:

g++ -std=c++14 pointers.cpp -o pointers
./pointers
x = 0
x = 2
x = 1

Classes in C++

Example class (you can download the file here):

1#ifndef PERSON_HPP
#define PERSON_HPP

#include<string>
#include<iostream>

class Person {
2private:
    std::string name;
    int age;
    double height;

3public:
  // Constructor
4  Person(std::string n, int a, double h) {
    name = n;
    age = a;
    height = h;
  };

  // Default constructor
  Person() : name("Unknown"), age(0), height(0.0) {};

  // Destructor
5  ~Person() {
    std::cout <<
6      this->name + " destroyed" <<
      std::endl;
  };

  // Getters and setters
7  std::string get_name() { return name; };
  void set_name(std::string n) { name = n; };
};

#endif
1
The #ifndef + #define + #endif is the include guard. Avoids multiple inclusions.
2
Private members: only accessible within the class.
3
Public members: accessible from outside the class.
4
Constructor: initializes the object.
5
Destructor: called when deleting the object.
6
Internal elements can be accessed with this->.
7
Access: methods to access and modify private members.

Classes in C++ (cont.)

Using the class (you can download the file here):

#include<string>
#include<iostream>
#include "person.hpp"

int main() {
  Person p1; // Default constructor
  Person p2("John", 30, 1.80); // Other constructor

  std::cout << p1.get_name() << std::endl;
  std::cout << p2.get_name() << std::endl;

  return 0;
}

Compiling and executing the program:

g++ -std=c++14 person.cpp -o person
./person
Unknown
John
John destroyed
Unknown destroyed

Notice that the destroyer is called when p1 and p2 go out of scope (in reverse order).

Classes in C++: Declaration and Implementation

  • A good practice is to separate the declaration (bones) from the implementation (meat).

  • Looking at an extract of the class Person:

// ---------------------------------------
// Declarations: Arguments and data types
// ---------------------------------------
class Person {
private:
    std::string name;
    int age;
    double height;

public:
  // Constructor
  Person(std::string n, int a, double h);

  // Getters and setters
  std::string get_name();
};

// ---------------------------------------
// Implementation: Body of the functions
// ---------------------------------------
inline Person::Person(std::string n, int a, double h) {
  name = n;
  age = a;
  height = h;
};

inline std::string Person::get_name() {
  return name;
};

Overloading

  • In C++, we can have multiple functions with the same name, but different arguments. This is called overloading.

  • The compiler will choose the correct function based on the arguments. Both of these functions are valid:

int add_int(int x, int y) {
  return x + y;
}

double add_double(double x, double y) {
  return x + y;
}

float add_float(float x, float y) {
  return x + y;
}
int add(int x, int y) {
  return x + y;
}

double add(double x, double y) {
  return x + y;
}

float add(float x, float y) {
  return x + y;
}

Templates

  • In C++, we can use templates to create functions or classes that can work with any data type.

  • This is useful when we want to create a function that works with int, double, float, etc.

int add(int x, int y) {
  return x + y;
}

double add(double x, double y) {
  return x + y;
}

float add(float x, float y) {
  return x + y;
}
1template<typename T>
T add(T x, T y) {
  return x + y;
}

2template<>
float add(float x, float y) {
  std::cout<< "This is a float!" << std::endl;
  return x + y;
}
1
Template declaration (the generic type is T).
2
Specialization for float.

Templates (cont.)

Classes can also be templated (defined in template_class.cpp):

#include<iostream>

template<typename T>
class MyAdder {
private:
  T x;
  T y;
public:
  MyAdder(T x, T y) : x(x), y(y) {};

  T add() {
    return x + y;
  };
};

int main() {
  MyAdder<int> a(1, 2);
  MyAdder<double> b(1.0, 2.0);

  std::cout << a.add() << std::endl;
  std::cout << b.add() << std::endl;

  return 0;
}
  • The class is templated. The value T can be any type.
  • The template can be used any time we specify the type.
  • The class is instantiated with int and double. The compiler with generate two classes during compilation.
g++ -std=c++14 template_class.cpp -o template_class
./template_class
3
3

Compared with R

Simulating pi

  • One way to estimate \(\pi\) is to simulate points in a square and count how many are inside a circle.

  • The following is an optimized R function to do this:

\(A = \pi r^2\), thus \(\pi = \frac{A}{r^2}\).

my_pi_sim <- function(n) {
  xy <- matrix(runif(n*2, min=-1, max=1), ncol = 2)
  message(
    sprintf(
      "pi approx to: %.4f",
      mean(sqrt(rowSums(xy^2)) <= 1) * 4
    )
  )
}

set.seed(331)
my_pi_sim(1e6)
pi approx to: 3.1393

Let’s see how we can do this in C++.

Simulating pi in C++

#include <vector>
1#include <random>
int main() {
  
  // Setting the seed
2  std::mt19937 rng_engine;
  rng_engine.seed(123);

3  std::uniform_real_distribution<double> dist(-1.0, 1.0);

  // Number of simulations
  size_t n_sims = 5e6;

  // Defining the data
  double pi_approx = 0.0;
  for (size_t i = 0u; i < n_sims; ++i)
  {

    // Generating a point in the unit square
    double x = dist(rng_engine);
    double y = dist(rng_engine);

    double dist = std::sqrt(
4        std::pow(x, 2.0) + std::pow(y, 2.0)
        );

    // Checking if the point is inside the unit circle 
    if (dist <= 1.0)
      pi_approx += 1.0;

  }

  printf("pi approx to %.4f\n", 4.0*pi_approx/n_sims);

  return 0;

}
1
Library for random rumbers and stats distributions.
2
Random number engine (used in comb. with the distributions).
3
Uniform distribution between -1 and 1.
4
std::pow is the power function.
g++ -std=c++14 pi.cpp -o pi
./pi
pi approx to 3.1420

Extended example: Writing a summary class

Writing a summary class

  • The task is to write a class that computes the mean, standard deviation, minimum and maximum of a vector.

  • The class should be a template class so it can deal with double and int.

Full program

You can download the full C++ code here and the header file here:

#ifndef SUMMARY_HPP
#define SUMMARY_HPP

#include <vector>
#include <numeric>
#include <cmath>
#include <cstdio>

template<typename T>
class Summarizer {
private:
    const std::vector<T>* dat = nullptr;
    double n;

public:
    // Constructors
    Summarizer(const std::vector<T> & dat_);

    // Calculators
    double mean() const;
    double sd() const;
    T min() const;
    T max() const;

    // Printer
    void print() const;
    
};

template<typename T>
inline Summarizer<T>::Summarizer(const std::vector<T> & dat_) {
    dat = &dat_;
    n = dat->size();
};

template<typename T>
inline double Summarizer<T>::mean() const {
    return std::accumulate(
        dat->begin(), dat->end(), 0.0
        ) / n;
};

template<typename T>
inline double Summarizer<T>::sd() const {
    double m = mean();
    double sum = 0.0;
    for (auto & i: *dat)
        sum += std::pow(i - m, 2.0);
    return std::sqrt(sum / (dat->size() - 1));
};

template<typename T>
inline T Summarizer<T>::min() const {
    T min = (*dat)[0];
    for (std::size_t i = 1u; i < dat->size(); ++i)
        if ((*dat)[i] < min)
            min = (*dat)[i];
    return min;
};

template<typename T>
inline T Summarizer<T>::max() const {
    T max = (*dat)[0];
    for (std::size_t i = 1u; i < dat->size(); ++i)
        if ((*dat)[i] > max)
            max = (*dat)[i];
    return max;
};

template<>
inline void Summarizer<double>::print() const {
    std::printf("Summary for double data\n");
    std::printf("Mean : %.2f\n", mean());
    std::printf("SD   : %.2f\n", sd());
    std::printf("Min  : %.2f\n", min());
    std::printf("Max  : %.2f\n", max());
};

template<>
inline void Summarizer<int>::print() const {
    std::printf("Summary for int data\n");
    std::printf("Mean : %.2f\n", mean());
    std::printf("SD   : %.2f\n", sd());
    std::printf("Min  : %d\n", min());
    std::printf("Max  : %d\n", max());
};

#endif

Details: Declaration of the class

template<typename T>
class Summarizer {
private:
    const std::vector<T>* dat = nullptr;
    double n;

public:
    // Constructors
    Summarizer(const std::vector<T> & dat_);

    // Calculators
    double mean() const;
    double sd() const;
    T min() const;
    T max() const;

    // Printer
    void print() const;
    
};

Details: The constructor

template<typename T>
inline Summarizer<T>::Summarizer(const std::vector<T> & dat_) {
    dat = &dat_;
    n = dat->size();
};
  • The implementation of the constructor is done outside of the function.

  • The inline keyword is used to tell the compiler to insert the code in the place where the function is called (more efficient).

  • Here, data is passed by reference and then the pointer is stored.

Details: The mean function

template<typename T>
inline double Summarizer<T>::mean() const {
    return std::accumulate(
        dat->begin(), dat->end(), 0.0
        ) / n;
};
  • The function is declared as const to tell the compiler that the function does not modify the object (the class itself).

  • The mean function uses the std::accumulate function.

  • Since dat is a pointer to a vector, we can access the members of dat via the -> operator (otherwise it would be using a . operator).

Running the example

#include "summary.hpp"

int main() {
    // Some data
    std::vector< double > dat = {1.0, 2.5, 4.4};
    std::vector< int > dat2 = {1, 2, 3, 4, 5};

    // Summarize the data
    Summarizer<double> s_double(dat);
    s_double.print();

    Summarizer<int> s_int(dat2);
    s_int.print();

    return 0;

}
g++ -std=c++14 summary.cpp -o summary
./summary
Summary for double data
Mean : 2.63
SD   : 1.70
Min  : 1.00
Max  : 4.40
Summary for int data
Mean : 3.00
SD   : 1.58
Min  : 1
Max  : 5