Building functions

Workshop info

  • When: November 25th, 12pm (PST, Vancouver, BC)
  • Where: Zoom
  • Requirements: Participants must have a laptop or desktop with a Mac, Linux, or Windows operating system. (Tablets and Chromebooks are not advised.) Please have the latest version of R and RStudio downloaded and running (free!).
  • Code of conduct: Everyone participating in The Carpentries workshop is required to conform to the Code of Conduct.

Illustrations by Allison Horst


This short tutorial will attempt to show:

  • What functions are and how they are useful
  • Why custom functions are necessary and when we might need them
  • How to build your own basic functions and how to source them

This tutorial on “Building functions” was built for a GrasPods workshop, which adapted works from The Carpentries licensed under CC BY 4.0.

What are functions?

Functions package a sequence of operations into one unit, preserving it for ongoing use so that you don’t have to repeat the steps each time. In R, a function is invoked by its name followed immediately by (). For example, you’ve probably already come across functions such as library() to load packages, print() to display values in the console, and paste() to join values together into a character string.
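As a quick illustration of invoking built-in functions by name, here is a minimal sketch. The strings and the variable name greeting are made up for the example.

```r
# Call paste() by name, passing two character strings as arguments.
# By default paste() separates its arguments with a single space.
greeting <- paste("Hello", "world")

# print() displays the value in the console.
print(greeting)
## [1] "Hello world"
```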

You have probably noticed that functions provide:

  • A single memorable name to invoke operations
  • A relief from the need to remember the individual operations
  • A defined set of inputs and expected outputs

Functions are the basic building blocks of most programming languages, and user-defined functions constitute what we call “programming”. If you have written a function, you are a computer programmer!

Why build custom functions?

Although base R and the packages on CRAN provide many useful functions, there are instances where you’ll need to repeat a series of operations that you wish could be summed up in a single function.

Defining functions

Here are the essential building blocks of a function:

  • meaningful_name <- function()
  • arguments are defined within ()
  • series of operations are included within {}
  • return() result
meaningful_name <- function(argument1, argument2, ...) {
  result <- operation_using(argument1, argument2)
  return(result)
}

Let’s build our first function together and try it out.

multiply <- function(a, b) {
  the_product <- a * b
  return(the_product)
}

multiply(3, 4)
## [1] 12

What did you observe?

What happened when you executed multiply <- function(a, b){...}?

  • Nothing was printed to the console
  • The function was added to the global environment
  • No intermediate variables (such as the_product) were created in the global environment
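These observations can be checked directly. The sketch below assumes a fresh R session (so no variable called the_product already exists) and uses base R's exists() to look each name up. The variable name answer is made up for the example.

```r
multiply <- function(a, b) {
  the_product <- a * b
  return(the_product)
}

answer <- multiply(3, 4)

# The function itself now lives in the global environment...
exists("multiply")
## [1] TRUE

# ...but the intermediate variable created inside it does not.
exists("the_product")
## [1] FALSE
```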

Challenge 1: (10 mins)

Using the building blocks I showed you above, fill in the blanks below to build a function that converts Fahrenheit to Kelvin. Here is the formula: (°F − 32) × 5/9 + 273.15 = K.

F_to_K <- _____(temp) {
  kelvin <- ________
  return(kelvin)
}

Let’s test it out! 32°F is the freezing point, which should correspond to 273.15K.

F_to_K(32)
## [1] 273.15

Challenge 2: (10 mins)

Now let’s build a function called K_to_C() that converts Kelvin to Celsius. The formula for the conversion is: K - 273.15 = Celsius.

Take the next 10 minutes to figure out which one of these gives us the right function. For the ones that don’t, why not? Feel free to try them out.

K_to_C() <- function(temp){
 celsius <- temp - 273.15
 return(celsius)
}

K_to_C <- function(temp){
 celsius <- temp - 273.15
 return(celsius)
}

K_to_C <- function(kelvin){
 celsius <- kelvin - 273.15
 return(celsius)
}

K_to_C <- function(kelvin){
 celsius <- temp - 273.15
 return(celsius)
}

Combining functions

The real power of programming comes when you can mix and match various functions. Once functions are defined, we can use them within other functions, or string multiple functions together to create a single, more powerful operation.
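As a small sketch of the idea, the multiply() function from earlier can be reused inside other functions. The helpers square() and circle_area() are hypothetical names introduced only for this illustration.

```r
multiply <- function(a, b) {
  the_product <- a * b
  return(the_product)
}

# Hypothetical helper: squares a number by reusing multiply().
square <- function(x) {
  return(multiply(x, x))
}

# Hypothetical helper: area of a circle, built from both functions above.
circle_area <- function(radius) {
  return(multiply(pi, square(radius)))
}

square(5)
## [1] 25
circle_area(1)
## [1] 3.141593
```

Each function stays short and does one job, and the more complex operation is simply a chain of calls.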

Challenge 3: (15 mins)

Build a new function F_to_C() that uses the functions we previously built, F_to_K() and K_to_C(), to convert Fahrenheit directly to Celsius.

Let me start you off:

F_to_C <- function(){
}

Sourcing functions

There are different ways to make custom functions available. As we have been doing, you can define your custom functions at the beginning of each script that uses them. However, if you use the same custom functions across different projects and scripts, this becomes repetitive and forces you to either remember each step or dig through previous scripts to copy and paste the code into your new script.

In programming, we want to reduce repetition where possible, and this is where sourcing functions comes in.

First, let’s open a new R script and paste in our custom functions from below. Next, save the script as my_func.R.

F_to_K <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

K_to_C <- function(kelvin){
 celsius <- kelvin - 273.15
 return(celsius)
}

F_to_C <- function(temp){
  K_result <- F_to_K(temp)
  celsius <- K_to_C(K_result)
  return(celsius)
}

Now let’s delete all the variables and functions we’ve created so far and “reset” our environment using rm(list = ls()).

To source our set of three functions, we pass the path of our my_func.R script to the source() function, for example source("my_func.R") if the script is saved in your working directory.
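Here is a minimal, self-contained sketch of what source() does. It assumes nothing about where my_func.R is saved: instead, it writes one function to a temporary file (the script_path variable and the use of tempfile() are choices made only for this example) and sources that file.

```r
# Write one small function to a temporary script file.
# (In the workshop you would save my_func.R yourself; tempfile() is used
# here only so that the example runs on its own.)
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "K_to_C <- function(kelvin) {",
  "  celsius <- kelvin - 273.15",
  "  return(celsius)",
  "}"
), script_path)

# source() runs the script; by default the function it defines
# ends up in the global environment.
source(script_path)

K_to_C(273.15)
## [1] 0
```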


Do you see three familiar functions in your global environment?

“Real world” applications

Gapminder Foundation is a non-profit venture registered in Stockholm, Sweden, that promotes sustainable global development and achievement of the United Nations Millennium Development Goals by increased use and understanding of statistics and other information about social, economic and environmental development at local, national and global levels. [1]

We’re going to use the gapminder data set to apply our knowledge of functions. We can download the data set using install.packages("gapminder") before loading it with the library() function. Let’s see what we’re working with by calling str(gapminder):


## tibble [1,704 × 6] (S3: tbl_df/tbl/data.frame)
##  $ country  : Factor w/ 142 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ continent: Factor w/ 5 levels "Africa","Americas",..: 3 3 3 3 3 3 3 3 3 3 ...
##  $ year     : int [1:1704] 1952 1957 1962 1967 1972 1977 1982 1987 1992 1997 ...
##  $ lifeExp  : num [1:1704] 28.8 30.3 32 34 36.1 ...
##  $ pop      : int [1:1704] 8425333 9240934 10267083 11537966 13079460 14880372 12881816 13867957 16317921 22227415 ...
##  $ gdpPercap: num [1:1704] 779 821 853 836 740 ...

Let’s calculate the total gross domestic product (population × GDP per capita) for each row of the data set.

calc_total_gdp <- function(data) {
  gdp <- data$pop * data$gdpPercap
  return(gdp)
}

Let’s make it more interesting! Let’s add options to select country or year.

calc_total_gdp <- function(data, year=NULL, country=NULL) {
  if(!is.null(year)) {
    data <- data[data$year %in% year, ]
  }
  if (!is.null(country)) {
    data <- data[data$country %in% country,]
  }
  gdp <- data$pop * data$gdpPercap

  data$gdp <- gdp
  return(data)
}

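Let’s talk about what is happening in year = NULL and country = NULL. These are default values: if the caller leaves an argument out, it is NULL inside the function, is.null() returns TRUE, and the corresponding filter is skipped.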
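To see the default arguments at work without installing anything, the sketch below runs the function on a tiny made-up data frame. The toy object and its numbers are invented for illustration and are not part of the gapminder data.

```r
# Made-up numbers for illustration only; this is not the gapminder data.
toy <- data.frame(
  country   = c("A", "A", "B", "B"),
  year      = c(2002, 2007, 2002, 2007),
  pop       = c(10, 20, 30, 40),
  gdpPercap = c(1, 2, 3, 4)
)

calc_total_gdp <- function(data, year = NULL, country = NULL) {
  # is.null(year) is TRUE when the caller did not supply a year,
  # so the filter below is skipped and all years are kept.
  if (!is.null(year)) {
    data <- data[data$year %in% year, ]
  }
  # The same logic applies to country.
  if (!is.null(country)) {
    data <- data[data$country %in% country, ]
  }
  data$gdp <- data$pop * data$gdpPercap
  return(data)
}

nrow(calc_total_gdp(toy))                              # no filters: 4 rows
nrow(calc_total_gdp(toy, year = 2007))                 # one year: 2 rows
nrow(calc_total_gdp(toy, year = 2007, country = "B"))  # both filters: 1 row
```

Because the filters use %in%, a vector such as year = c(2002, 2007) also works.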

Challenge 4: (10 mins)

Test out your new calc_total_gdp function to calculate the following:

  1. What was the gdp of Myanmar in 1977?
  2. What was the gdp of Togo in 1992?
  3. What was the gdp of Canada in 2000?

Mastering functions and advanced applications

Now that you’ve got a handle on creating custom functions, what’s next? One topic to think about is applying your function serially in an automated process (i.e., reusing your function without typing the call over and over again). Here are some concepts to look up as you move forward in becoming more comfortable with coding in R.
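As one possible illustration of serial application, base R’s sapply() calls a function once for every element of a vector. This is only a sketch of one approach, and the temperatures are made up.

```r
K_to_C <- function(kelvin) {
  celsius <- kelvin - 273.15
  return(celsius)
}

# Made-up temperatures in Kelvin.
temps_k <- c(273.15, 300, 373.15)

# sapply() calls K_to_C once per element and collects the results.
looped <- sapply(temps_k, K_to_C)

# Because the arithmetic inside K_to_C is vectorised, passing the whole
# vector in one call gives the same values.
vectorised <- K_to_C(temps_k)
all.equal(looped, vectorised)
## [1] TRUE
```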

Concluding thoughts

  • Congratulations! You are now a programmer, huzzah! 🥳 🎉
  • Custom functions can be useful for repetitive operations
  • Functions can be defined per script or called upon using source()

Illustrations by Allison Horst

I’m always looking out for topics our community is interested in learning. If you have any ideas, suggestions, or comments, please get in touch!