Example of a `targets` pipeline where modifying one function (`create_plot()`, on the left) has outdated two other steps. `targets` will rerun only those two steps (`hist` and `report`) instead of the whole analysis.
Use these notes and the slides to run through the example in RStudio. First make sure you have installed the `tidyverse`, `targets`, `tarchetypes`, and `biglm` packages. Open the `targets-minimal.Rproj` file.
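If you are not sure which of these packages are already on your machine, a quick check along the following lines may help. This is only a sketch: it reports what is missing and leaves the actual installation (the commented-out `install.packages()` call) up to you.

```r
# The four packages named above.
required <- c("tidyverse", "targets", "tarchetypes", "biglm")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install them
} else {
  message("All required packages are installed.")
}
```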
The `_targets.R` script:

A pipeline is made up of targets: think of these as the steps in your workflow.
The `_targets.R` file is the master script which contains all the other steps. Open this script. The first step is to load the `targets` and `tarchetypes` packages, which are needed to define the pipeline.
Packages needed by the steps themselves are declared with `tar_option_set()` (just as you would load packages using `library()` in a normal R session).
Each step is defined with the `tar_target()` function. This creates a target object for each step. These are stored in a folder of target objects (by default `_targets/`) and are automatically saved. A pipeline is essentially a list of (skippable) target objects.
The first argument of `tar_target()` is the name of the object to be created.
The second argument is the command used to create it (e.g. `read_csv(raw_data_file, col_types = cols())` to read in a .csv data file).

The `make.R` script:

The targets in the pipeline, and the commands that build them, can be listed with the `tar_manifest()` function.
`tar_visnetwork()` visualises the objects, any functions needed to create them, and the interdependencies between them. It will tell you if each object is outdated and needs to be updated. You can click on each object to easily see its dependencies.
Objects are created (built) using the `tar_make()` function.
`tar_read()` returns a stored object without adding it to your current session's environment. This can be useful if you have, for example, a very big object that you want to do something with without keeping it in your session, or for viewing a plot. To actually load an object into your current session under its own name, use the `tar_load()` function.

Note that pipelines should be built up gradually: add one or two target objects at a time, so you can easily debug. Each time you add a new object, `targets` can check how this affects other objects in the workflow. `tar_outdated()` tells you which objects are now outdated (including the new objects). The same information is shown by visualising the pipeline with `tar_visnetwork()`.
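To see how these functions fit together, here is a minimal sketch that builds a tiny pipeline in a temporary folder rather than in the example project. The file name `raw_data.csv`, the built-in `airquality` data, and the `mean_ozone` target are assumptions made purely for illustration, and base R's `read.csv()` stands in for `read_csv()` so that only `targets` is needed. The example project's own `_targets.R` will look different.

```r
library(targets)

# Work in a throwaway folder so the sketch does not touch a real project.
sketch_dir <- file.path(tempdir(), "targets-sketch")
dir.create(sketch_dir, showWarnings = FALSE)
old_wd <- setwd(sketch_dir)

# Hypothetical input file, written from a dataset that ships with R.
write.csv(airquality, "raw_data.csv", row.names = FALSE)

# Write a small _targets.R: options first, then a list of targets.
tar_script({
  tar_option_set(packages = "utils")  # packages the commands need
  list(
    tar_target(raw_data_file, "raw_data.csv", format = "file"),
    tar_target(raw_data, read.csv(raw_data_file)),
    tar_target(data, raw_data[!is.na(raw_data$Ozone), ]),
    tar_target(mean_ozone, mean(data$Ozone))
  )
}, ask = FALSE)

manifest <- tar_manifest()          # one row per target, with its command
outdated_before <- tar_outdated()   # nothing is built yet, so all are listed

tar_make()                          # build every outdated target

outdated_after <- tar_outdated()    # empty once everything is up to date
mean_ozone_value <- tar_read(mean_ozone)  # return the stored value
tar_load(data)                      # assign the stored value to `data`

setwd(old_wd)
```

Running `tar_visnetwork()` before and after `tar_make()` (from inside the project folder) gives the same before-and-after picture graphically.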
Running `tar_make()` again will update only the outdated objects; visualising again (or running `tar_outdated()`) shows that these objects have been updated.

What if we need to change a function?
In the `R` folder in the example, there is a script called `functions.R`.
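As a rough illustration of what happens when a function stored in such a script is edited, the sketch below sets up a throwaway project with its own `R/functions.R`. The function `create_summary()`, its contents, and the target names are hypothetical stand-ins, not the contents of the example's `functions.R`.

```r
library(targets)

# Throwaway project folder with an R/ subfolder, mirroring the layout above.
sketch_dir <- file.path(tempdir(), "targets-function-change")
dir.create(file.path(sketch_dir, "R"), recursive = TRUE, showWarnings = FALSE)
old_wd <- setwd(sketch_dir)

# Hypothetical helper function saved in R/functions.R.
writeLines("create_summary <- function(x) mean(x)", "R/functions.R")

tar_script({
  source("R/functions.R")
  list(
    tar_target(values, c(1, 2, 3, 10)),
    tar_target(doubled, values * 2),
    tar_target(result, create_summary(values))
  )
}, ask = FALSE)

tar_make()                          # first build: all three targets run
first_result <- tar_read(result)

# Edit the function: only targets that depend on it become outdated.
writeLines("create_summary <- function(x) median(x)", "R/functions.R")
outdated <- tar_outdated()

tar_make()                          # rebuilds `result`, skips the others
second_result <- tar_read(result)

setwd(old_wd)
```

Here `values` and `doubled` never call `create_summary()`, so they are skipped on the second run. This is the same behaviour as in the figure at the top, where changing `create_plot()` outdates only the steps downstream of it.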
Open the `_targets.R` script and change our data filtering step to `tar_target(data, na.omit(raw_data))`, then save the script. Then run `tar_make()`.

One of the good things about `targets` is that you can do it all via R Markdown. This means you can use it to produce an accessible, easy-to-follow workflow including explanations of each step, the code used to perform it, and the corresponding outputs. See the excellent `targets` package user manual for a great example of the type of document you can produce using R Markdown.
Investing a little time into organising your work into a targets pipeline at the start of a project will save you a lot of time in the long run. Happy coding!
Alain Danet