Current state of R packages for the design of experiments

Your analytical toolkit matters very little if the data are no good. Ideally you want to know how the data were collected before delving into the analysis; better yet, get involved before the data are collected and design the collection yourself. In this post I explore some of the most downloaded R packages for the design of experiments and the analysis of experimental data.

Monash University

February 3, 2021

Data collection

As many know, it doesn’t matter how good your analytical tools are if your data are rubbish. This sentiment is often captured in the expression “garbage in, garbage out”. It’s something we all seem to know, but there is still a tendency for many of us to place a greater focus on the analysis [1]. This is perhaps only natural, given that the potential for discovery is so much more exciting than ensuring the quality of the collected data.

So what counts as good quality data? A lack of error in the data? Data covering a wide enough range of variables and a large enough sample size for the downstream analysis? Giving an explicit definition of good quality data is a fraught exercise, but if you know how the data were collected then you can better perform the initial data analysis (Chatfield 1985) to weed out (or fix) potentially poor quality data. This step will likely get more value out of the data than fitting complex models to poor quality data.

Better than knowing how the data were collected: if you can design the collection of data so that it’s optimised for the purpose of the analysis [2], then you can potentially get even better value out of your data. Not all data collection starts with an explicit analytical plan though. Furthermore, you may have very little control over how the data are collected. This is often the case with observational data, or when making secondary use of experimental data. This article focuses on data collection for an experiment, where you have some control over the collection process.

Experimental data

All experiments are conducted with some objective in mind. A scientist may wish to test a hypothesis, a manufacturer may want to know which manufacturing process is better, or a researcher may want to understand some cause-and-effect relationship. A characteristic part of an experiment is that the experimenter has control over some explanatory variables. In a comparative experiment, the control is over the allocation of treatments to subjects. Designing an experiment in the statistics discipline usually focuses on this allocation, although it’s important to keep in mind that there are other decision factors in an experiment.

Data collected from experiments are what we refer to as experimental data. Because they were collected with some objective in mind and according to some data collection plan, experimental data are often thought to be of better quality than observational data. But then again, if you can’t quantify the quality of data, you can’t really tell. Certain scientific claims (e.g. causation, or that one treatment is better than another) can only be substantiated by experiments, and so experimental data are held to a higher standard in general.

Design and analysis of experiments

There are altogether 83 R packages in the CRAN Task View of Design of Experiments & Analysis of Experimental Data as of 2022-09-18 [3]. I’m going to refer to these packages as DoE packages, although some packages in the mix are more about the analysis of experimental data than the design of experiments, and some packages are missing from the list (e.g. DeclareDesign). The DoE packages make up about 0.4% of the 18,592 packages available on CRAN.

The DoE packages don’t include survey design. These instead belong to the CRAN Task View of Official Statistics & Survey Methodology which contains 122 packages. While some surveys are part of an experimental study, most often they generate observational data.

Below I present a number of different analyses of these DoE packages. If you push the button on the top right corner of this article, you can toggle the display of the code; alternatively, you can have a look at the source Rmd document.

Bigram of DoE package titles and descriptions

Table 1 shows the most common bigrams in the titles of the DoE packages. Perhaps unsurprisingly, “optimal design” and “experimental design” come out on top. The phrase “design of experiments” likely appears often too, but because a bigram is two consecutive words, it cannot show up here. You might then wonder why “design of” or “of experiments” don’t make an appearance; “of” is a stop word, and stop words are filtered out because otherwise uninformative bigrams would dominate the top of the list.

There are a couple of bigrams like “clinical trial” and “dose finding” that suggest applications in medical experiments, as well as “microarray experiment”, which suggests applications in bioinformatics.

Table 1: The bigrams of the R package titles as provided in the DESCRIPTION file on CRAN.

| Bigram | Count |
|---|---|
| experimental design | 6 |
| optimal design | 6 |
| clinical trial | 5 |
| dose finding | 5 |
| sequential design | 5 |
| block design | 4 |
| microarray experiment | 4 |

The title alone might be too succinct for text analysis, so I also had a look at the most common bigrams in the descriptions of the DoE packages, as shown in Table 2. The counts in Table 2 (and also Table 1) are across the DoE packages. To be clear, even if a bigram is mentioned multiple times within a description, it’s counted only once per package. This removes the inflation of the counts due to one package mentioning the same bigram over and over again.
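The once-per-package counting rule can be sketched in base R. The three descriptions below are made up for illustration, and the stop-word list is a tiny hypothetical stand-in for a real one; the analysis in this post may well have used a text-mining package and a different filtering order.

```r
# Toy descriptions standing in for DESCRIPTION fields (hypothetical data).
descriptions <- c(
  pkgA = "Optimal design of experiments and optimal design evaluation",
  pkgB = "Tools for optimal design in a clinical trial",
  pkgC = "Analysis of a clinical trial with a block design"
)

# A deliberately small stop-word list (assumption; real lists are much longer).
stop_words <- c("of", "and", "for", "in", "a", "with")

# Return the unique bigrams of one text, dropping any that contain a stop word.
bigrams_once <- function(text) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) < 2) return(character(0))
  first <- head(words, -1)
  second <- tail(words, -1)
  keep <- !(first %in% stop_words | second %in% stop_words)
  unique(paste(first[keep], second[keep]))
}

# Each package contributes a bigram at most once, so repeats within one
# description do not inflate the counts.
bigram_counts <- sort(table(unlist(lapply(descriptions, bigrams_once))),
                      decreasing = TRUE)
bigram_counts
```

Here pkgA mentions “optimal design” twice, yet it adds only one to that bigram's count.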

Again, not surprisingly, “experimental design” and “optimal design” come out on top in the DoE package descriptions. The bigrams “graphical user” and “user interface” imply that the trigram “graphical user interface” was probably common.

Table 2: The bigrams of the R package descriptions as provided in the DESCRIPTION file on CRAN.

| Bigram | Count |
|---|---|
| experimental design | 7 |
| optimal design | 7 |
| block design | 5 |
| clinical trial | 5 |
| factorial design | 5 |
| graphical user | 5 |
| microarray experiment | 5 |
| user interface | 5 |

Network of DoE package imports and dependencies

Figure 1 shows the imports and dependencies among the DoE packages. We can see that DoE.wrapper imports a fair number of DoE packages, which results in the major network cluster seen in Figure 1. AlgDesign and DoE.base are imported by four other DoE packages and so form an important base in the DoE world.

Figure 1: The network of imports and dependencies among the DoE packages alone. Each node represents a DoE package. DoE packages with no imports or dependencies on other DoE packages are excluded. Each arrow represents the relationship between two packages, such that the package at the tail is used by the package at the head of the arrow.

CRAN download logs

Figure 2 shows the distribution of the total download counts of the DoE packages over the last 5 years [4]. This graph doesn’t take into account that some DoE packages may have been on CRAN for less than 5 years, so the counts favour DoE packages that have been on CRAN longer.

Figure 2: Histogram of the total download counts of the DoE packages over the last 5 years.

Top 5 DoE packages

The top 5 downloaded DoE packages at the time of writing are AlgDesign, lhs, DiceDesign, DoE.base, and FrF2. You can see the download counts in Figure 3.

Figure 3: Barplot of the total downloads of the top 5 downloaded DoE packages over the period 2017-09-18 to 2022-09-16.

We can examine the top 5 DoE packages further by looking at the daily download counts, as shown in Figure 4. The download counts are the raw values, and these include downloads by CRAN mirrors and bots. There is a noticeable spike whenever a package is updated on CRAN. This is partly because, when a new version of a package is available and you install other packages that depend on or import it, R will prompt you to install the new version. This means that the download counts are inflated, and to some extent you can artificially boost them by making regular CRAN updates. The adjustedcranlogs package (Morgan-Wall 2017) makes a nice attempt to adjust the raw counts based on a certain heuristic. I didn’t use it since the adjustment is stochastic and I appear to have hit a bug.

Figure 4: Daily downloads of the top 5 downloaded DoE packages over the period 2017-09-18 to 2022-09-16. The vertical dotted lines mark the dates on which a new version of the corresponding package was released on CRAN.

Below we take a closer look at the functions of the top 5 downloaded DoE packages, ordered by their download counts.

  • AlgDesign (Wheeler 2019): Algorithmic Experimental Design. Originally written by Bob Wheeler; Jerome Braun has since taken over maintenance of the package.
  • agricolae (de Mendiburu 2020): Statistical Procedures for Agricultural Research. Written and maintained by Felipe de Mendiburu.
  • lhs (Carnell 2020): Latin Hypercube Samples. Written and maintained by Rob Carnell.
  • ez (Lawrence 2016): Easy Analysis and Visualization of Factorial Experiments. Written and maintained by Michael A. Lawrence.
  • DoE.base (Grömping 2018): Full Factorials, Orthogonal Arrays and Base Utilities for DoE Packages. Written and maintained by Ulrike Grömping.

Before we look at the packages, let’s set a seed so we can reproduce the results.
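As a minimal sketch of what setting a seed buys you: the seed value 2021 below is arbitrary and not necessarily the one used for this post.

```r
# The seed value is arbitrary (hypothetical); any fixed integer works.
set.seed(2021)
first_draw <- sample(1:7)

# Resetting the same seed reproduces the same random permutation.
set.seed(2021)
second_draw <- sample(1:7)

identical(first_draw, second_draw)
```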

To start off, we begin with the most downloaded DoE package, AlgDesign. The examples below are taken directly from the vignette of the AlgDesign package.

You can create a balanced incomplete block design using the optBlock function. It uses an optimal design framework in which the default criterion is the D criterion and the implied model is given by the first argument.
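A call along these lines produces such a design for 7 treatments in 7 blocks of size 3. This is a sketch that assumes the AlgDesign package is installed; the seed and object names are my own choices, and the exact call may differ from the one in the vignette.

```r
library(AlgDesign)

set.seed(1)  # arbitrary seed (assumption), for reproducibility

# Seven treatments as a one-column data frame of candidate points.
trt <- data.frame(trt = factor(1:7))

# The formula ~ . uses every variable in withinData as the implied model;
# blocksizes asks for 7 blocks of 3 plots each.
bib <- optBlock(~ ., withinData = trt, blocksizes = rep(3, 7))

bib$Blocks  # a list with one data frame per block
```

The returned list also holds the full design in `bib$design` and the selected candidate rows in `bib$rows`.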

AlgDesign also includes helper functions to generate a factorial structure.

This can be used as input to another function that specifies the design, say optFederov, which uses Fedorov’s exchange algorithm to generate the design.
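The two steps can be sketched together as follows, assuming AlgDesign is installed. The factor names, the quadratic model and the run size of 14 are illustrative choices rather than anything prescribed by the package.

```r
library(AlgDesign)

set.seed(1)  # arbitrary seed (assumption)

# Candidate set: the full 3^3 factorial in three factors.
cand <- gen.factorial(levels = 3, nVars = 3, varNames = c("A", "B", "C"))

# Pick 14 of the 27 candidate runs that are D-optimal for a full
# quadratic model; quad() is AlgDesign's shorthand for that model.
des <- optFederov(~ quad(A, B, C), data = cand, nTrials = 14)

des$design  # the selected runs
```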

If you want to further randomise within blocks, you can pass the above result to optBlock .
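A self-contained sketch of that hand-off, again assuming AlgDesign is installed: a 14-run design is generated and then arranged into two blocks of seven. The model and block sizes are illustrative assumptions.

```r
library(AlgDesign)

set.seed(1)  # arbitrary seed (assumption)

# Generate a 14-run D-optimal design from a 3^3 candidate set.
cand <- gen.factorial(levels = 3, nVars = 3, varNames = c("A", "B", "C"))
des <- optFederov(~ quad(A, B, C), data = cand, nTrials = 14)

# Pass the selected runs to optBlock to arrange them into 2 blocks of 7.
blocked <- optBlock(~ quad(A, B, C), withinData = des$design,
                    blocksizes = c(7, 7))

blocked$Blocks  # one data frame of runs per block
```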

agricolae is motivated by agricultural applications although the designs are applicable across a variety of fields.

The functions that create designs all begin with “design.”, and the rest of each function name is reminiscent of the name of the experimental design. E.g. design.rcbd generates a Randomised Complete Block Design and design.split generates a Split Plot Design.

Rather than going through each of the functions, I’ll just show one. The command below generates a balanced incomplete block design with 7 treatments and blocks of size 3. This is the same design structure as the first example for AlgDesign. What do you think of the input and output?
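A sketch of such a command, assuming agricolae is installed. The seed is arbitrary, and the element names of the output (`sketch`, `book`) follow my reading of the package documentation, so check them against the agricolae manual.

```r
library(agricolae)

# Balanced incomplete block design: 7 treatments, blocks of size 3.
# The seed value is an arbitrary choice (assumption).
out <- design.bib(trt = 1:7, k = 3, seed = 1)

out$sketch      # treatments laid out block by block
head(out$book)  # field book with one row per plot
```

Compared with optBlock, the input is the treatment vector and the block size, and the output is a list whose field book is ready for data entry.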

More examples are given in the agricolae tutorial.

The lhs package is completely different to the previous two packages. It implements methods for creating and augmenting Latin Hypercube Samples and Orthogonal Array Latin Hypercube Samples. The treatment variables here are the parameters, which are continuous. In the example below, there are 10 parameters, from which 30 samples will be drawn.
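A minimal sketch using randomLHS, assuming the lhs package is installed; the seed is arbitrary.

```r
library(lhs)

set.seed(1)  # arbitrary seed (assumption)

# 30 samples (rows) of 10 continuous parameters (columns) on [0, 1].
X <- randomLHS(n = 30, k = 10)

dim(X)

# Latin hypercube property: within each column, every one of the 30
# equal-width strata of [0, 1] contains exactly one sample.
strata <- apply(X, 2, function(col) sort(floor(col * 30)))
all(strata == 0:29)
```

The samples are on the unit scale; to use them for real parameters you would transform each column, for example with a quantile function such as qunif or qnorm.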

lhs provides a number of methods to find the optimal design each with their own criteria.
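Three of these methods are sketched below with their default settings, assuming the lhs package is installed. Which criterion suits a given problem is left open here.

```r
library(lhs)

set.seed(1)  # arbitrary seed (assumption)

# Each call returns a 30 x 10 Latin hypercube sample, built with a
# different search strategy.
X_maximin  <- maximinLHS(n = 30, k = 10)   # aims to maximise the minimum distance between points
X_improved <- improvedLHS(n = 30, k = 10)  # builds the sample point by point, spreading points out
X_optimum  <- optimumLHS(n = 30, k = 10)   # column-wise pairwise swaps to improve an optimality criterion

dim(X_optimum)
```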

The ez package is mainly focussed on the analysis of experimental data, but some functions such as ezDesign are useful for viewing the experimental structure.
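A sketch of ezDesign on the ANT2 example data that ships with ez, assuming the package is installed. The choice of variables for the axes and facets is illustrative.

```r
library(ez)

# ANT2 is an example data set bundled with ez; drop trials without a
# recorded response time before plotting.
data(ANT2)
ANT2 <- ANT2[!is.na(ANT2$rt), ]

# Plot which trial-by-subject cells contain data, split by block and group.
p <- ezDesign(data = ANT2, x = trial, y = subnum, row = block, col = group)
print(p)
```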


DoE.base provides utility functions for the special class design and, as seen in Figure 1, DoE.base is used by four other DoE packages that are also maintained by Prof. Dr. Ulrike Grömping.

DoE.base contains functions to generate factorial designs easily.
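For instance, a full factorial in three factors with 2, 3 and 2 levels can be sketched as below, assuming DoE.base is installed. The factor names and seed are arbitrary choices.

```r
library(DoE.base)

# Full factorial with 2 x 3 x 2 = 12 runs in randomised order.
# Factor names and the seed are illustrative (assumptions).
plan <- fac.design(nlevels = c(2, 3, 2),
                   factor.names = c("A", "B", "C"),
                   seed = 1)

plan
```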

It also contains functions to create orthogonal array designs.
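A sketch of an orthogonal array design for four three-level factors, assuming DoE.base is installed; the seed is arbitrary.

```r
library(DoE.base)

# Ask for an orthogonal array accommodating 4 factors at 3 levels each.
# oa.design picks a suitable array from its catalogue; for this request
# the smallest is a 9-run array.
oa <- oa.design(nfactors = 4, nlevels = 3, seed = 1)

oa
```

A full factorial for the same factors would need 3^4 = 81 runs.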

If you need to further randomise within a specified block, you can do this using rerandomize.design .

So those were the top 5 DoE packages. The APIs of the packages are quite distinct, and the objects they output vary from a matrix to a list. DoE might be a dull area for many, but it’s quite important for the downstream analysis. Perhaps if more of us talk about it, it may help invigorate the area!

[1] At least from my teaching experience, statistics subjects are primarily about the analysis, and most research grants I’ve seen are about an analytical method. The analytical focus is also reflected in the R packages; there are 1,907 R packages on CRAN with the word “analysis” in the title, as opposed to 287 R packages with the word “design” in the title.

[2] Keep in mind though that your analysis plan may change once you have actually collected the data. This is quite common in the analysis of plant breeding trials, since some spatial variation becomes apparent only after the data collection.

[3] I originally had a web scraping error where I didn’t remove duplicate entries, so the presentations at TokyoR and the SSA Webinar had the wrong numbers.

[4] As of 2022-09-18.

skpr is an open source design of experiments suite for generating and evaluating optimal designs in R. Here is a sampling of what skpr offers:

  • Generates and evaluates D, I, A, Alias, E, T, and G optimal designs, as well as user-defined custom optimality criteria.
  • Supports generation and evaluation of split/split-split/…/N-split plot designs.
  • Includes parametric and Monte Carlo power evaluation functions, and supports calculating power for censored responses.
  • Provides an extensible framework for the user to evaluate Monte Carlo power using their own libraries.
  • Includes a Shiny graphical user interface, skprGUI, that auto-generates the R code used to create and evaluate the design to improve ease-of-use and enhance reproducibility.

Installation

  • gen_design() generates optimal designs from a candidate set, given a model and the desired number of runs.
  • eval_design() evaluates power parametrically for linear models, for normal and split-plot designs.
  • eval_design_mc() evaluates power with a Monte Carlo simulation, for linear and generalized linear models. This function also supports calculating power for split-plot designs using REML.
  • eval_design_survival_mc() evaluates power with a Monte Carlo simulation, allowing the user to specify a point at which the data is censored.
  • eval_design_custom_mc() allows the user to import their own libraries and use the Monte Carlo framework provided by skpr to calculate power.
  • calculate_power_curves() provides an interface to automate the generation and evaluation of designs to create power versus sample size and effect size curves.
  • skprGUI() opens up the GUI in either RStudio or an external browser.
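The first two functions can be sketched together as follows, assuming skpr is installed. The factor names, levels, model and run size are made up for illustration.

```r
library(skpr)

set.seed(1)  # arbitrary seed (assumption)

# Candidate set: all combinations of two two-level numeric factors and
# one three-level categorical factor (hypothetical factors).
candidates <- expand.grid(temp = c(-1, 1),
                          pressure = c(-1, 1),
                          catalyst = c("a", "b", "c"))

# Generate a 12-run optimal design for a main-effects model
# (gen_design uses D-optimality unless told otherwise).
design <- gen_design(candidateset = candidates,
                     model = ~ temp + pressure + catalyst,
                     trials = 12)

# Evaluate power parametrically at a 5% significance level.
power <- eval_design(design = design,
                     model = ~ temp + pressure + catalyst,
                     alpha = 0.05)
power
```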

In addition, the package offers two functions to generate common plots related to designs:

  • plot_correlations() generates a color map of correlations between variables.
  • plot_fds() generates the fraction of design space plot for a given design.

skprGUI() provides a graphical user interface to access all of the main features of skpr. An interactive tutorial is provided to familiarize the user with the available functionality. Type skprGUI() to begin.

Experimental Design and Process Optimization with R

Gerhard Krennrich

1 Introduction

The present document is a short and elementary course on the Design of Experiments (DoE) and empirical process optimization with the open-source software R. The course is self-contained and does not assume any prior knowledge of statistics or mathematics beyond high school level. Statistical concepts will be introduced at an elementary level and made tangible with R code and R graphics based on simulated and real-world data.

So, then, what is DoE and why should the reader become familiar with the concepts of DoE? Very briefly, DoE is the science of varying many experimental parameters in a systematic way to gain insight into how to further improve and optimize these parameters. Chapter 2 will show how and why multidimensional DoE techniques are superior to the classical “one-dimensional” optimization approach. Chapter 6 will demonstrate why and how DoE can be combined with optimization. Finally, the use of DoE and optimization will be demonstrated in practice in chapter 7 for improving the performance of a catalytic system.

Historically, Experimental Design started as a branch of statistics in the early years of the 20th century and has meanwhile grown into a mature method with a plethora of applications in the experimental sciences. Consequently, there are many good and comprehensive books available about DoE, some of which we will make frequent reference to in the present text, namely (George E.P. Box, Norman R. Draper 1987), (D.C. Montgomery 2013) and (G.E.P. Box, W.G. Hunter, J.S. Hunter 2005). A more recent text with emphasis on the use of R in conjunction with DoE is (John Lawson 2015). Linear models are comprehensively covered, e.g., by the textbook (A. Sen, M. Srivastava 1990). A general, however fairly technical, text on linear and nonlinear statistical model building is the excellent book (T. Hastie, R. Tibshirani, J. Friedman 2009). (J.G. Kalbfleisch 1985) is a smooth introduction to statistics, probability and statistical inference.

The present text draws on these books and on many years of experience as a statistical consultant in the chemical industry. Most examples in this course are therefore taken from applications and optimization projects in the chemical sciences. The primary intended readers of this document are chemists and engineers entrusted with empirical optimization in research and development. However, the presented methods and concepts are fairly generic, and scientists working in other areas such as biology or the medical sciences might benefit from the text.

As to software, R, probably together with Python, is the only open-source software which combines the whole spectrum of DoE and optimization with the flexibility of a powerful scripting language that allows any kind of data pre- and postprocessing within one software environment. That makes, in my opinion, R superior to many commercial GUI-based tools, which often buy user-friendliness at the expense of flexibility.

1.1 How to install R

The R-software can be downloaded free of charge from the R repository CRAN

An IDE (Integrated Development Environment) is required for working smoothly with R. An IDE allows editing, running and debugging R code and managing program input and output. In principle any IDE can be used, but we recommend RStudio as the de facto standard.

The R introduction at CRAN is a concise introduction to the R language.

1.2 Some remarks on how to read the present text

This document is not an introduction to the R language; rather, it follows the philosophy of “learning by doing”. In this spirit, the above-mentioned R introduction is recommended as a first reference, together with the present R examples on DoE and optimization. As it is usually easier to modify existing code than to write code from scratch, it is hoped that the R examples in this course will help the reader learn both R and DoE more rapidly. The course is divided into seven chapters. There is, however, one stand-alone chapter, chapter 5, which can be skipped by readers not explicitly dealing with mixture problems. The final chapter 7 is a published, real-world example (Siebert M., Krennrich G., Seibicke M., Siegle A.F., Trapp O. 2019) combining many elements of DoE and optimization for improving the performance of a catalytic system. This application should encourage readers to use these powerful methods in their own projects.

A. Sen, M. Srivastava. 1990. Regression Analysis, Theory, Methods and Applications. 1st ed. Springer-Verlag, New York.

D.C. Montgomery. 2013. Design and Analysis of Experiments. 8th ed. John Wiley & Sons Inc.

G.E.P. Box, W.G. Hunter, J.S. Hunter. 2005. Statistics for Experimenters: Design, Innovation, and Discovery. 2nd ed. John Wiley & Sons, Hoboken.

George E.P. Box, Norman R. Draper. 1987. Empirical Model-Building and Response Surfaces. 1st ed. John Wiley & Sons.

J.G. Kalbfleisch. 1985. Probability and Statistical Inference, Vol 1&2. 2nd ed. Springer.

John Lawson. 2015. Design and Analysis of Experiments with R. 1st ed. Chapman & Hall.

Siebert M., Krennrich G., Seibicke M., Siegle A.F., Trapp O. 2019. “Identifying High-Performance Catalytic Conditions for Carbon Dioxide Reduction to Dimethoxymethane by Multivariate Modelling.” Chemical Science 10:45. https://pubs.rsc.org/en/content/articlelanding/2019/sc/c9sc04591k#!divAbstract

T. Hastie, R. Tibshirani, J. Friedman. 2009. The Elements of Statistical Learning. 2nd ed. Springer-Verlag.


Title: Current State and Prospects of R-Packages for the Design of Experiments

Abstract: Re-running an experiment is generally costly and, in some cases, impossible due to limited resources; therefore, the design of an experiment plays a critical role in increasing the quality of experimental data. In this paper, we describe the current state of R-packages for the design of experiments through an exploratory data analysis of package downloads, package metadata, and a comparison of characteristics with other topics. We observed that experimental designs in practice appear to be sufficiently manufactured by a small number of packages, and the development of experimental designs often occurs in silos. We also discuss the interface designs of widely utilized R packages in the field of experimental design and discuss their future prospects for advancing the field in practice.
Comments: 14 pages, 8 figures, 1 supplementary material
Subjects: Other Statistics (stat.OT); Computation (stat.CO)

FielDHub is an R package/shiny design of experiments (DOE) app that aids in the creation of traditional, un-replicated, augmented and partially-replicated designs applied to agriculture, plant breeding, forestry, animal and biological sciences.

For more details and examples of all the functions in the FielDHub package, please go to https://didiermurillof.github.io/FielDHub/reference/index.html .

This is a basic example which shows you how to launch the app:

Diagonal Arrangement Example

A project needs to test 280 genotypes in a field containing 16 rows and 20 columns of plots. In this example, these 280 genotypes are divided among three different experiments. In addition, four checks are included in a systematic diagonal arrangement across experiments to fill 40 plots representing 12.5% of the total number of experimental plots. An option to include filler plots is also available for fields where the number of experimental plots does not equal the number of available field plots.


The figure above shows a map of an experiment randomized along with multiple experiments (three) and checks on diagonals. Distinctively colored check plots are replicated throughout the field in a systematic diagonal arrangement.


The figure above shows the layout for the three experiments in the field.

Using the FielDHub function diagonal_arrangement()

To illustrate using FielDHub to build experimental designs through R code, the design produced in the R Shiny interface described above can also be created using the function diagonal_arrangement() in the R script below. Note that to obtain identical results, users must include the same random seed in the script as was used in the Shiny app. In this case, the random seed is 1249.

Users can print the returned values from diagonal_arrangement() as follows,

First 12 rows of the field book,

Users can plot the layout design from diagonal_arrangement() using the function plot() as follows,
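The steps above can be sketched as below, assuming FielDHub is installed. The argument names and the value kindExpt = "DBUDC" for multiple experiments follow my reading of the package documentation and should be checked against the function reference; the split of the 280 genotypes into experiments of 100, 100 and 80 is an assumption made for illustration.

```r
library(FielDHub)

# 280 genotypes in a 16 x 20 field, 4 checks on diagonals, split into
# three experiments (sizes assumed here), with the seed from the text.
diagonal <- diagonal_arrangement(
  nrows = 16,
  ncols = 20,
  lines = 280,
  checks = 4,
  kindExpt = "DBUDC",
  blocks = c(100, 100, 80),
  seed = 1249
)

print(diagonal)                  # summary of the design
head(diagonal$fieldBook, 12)     # first 12 rows of the field book
plot(diagonal)                   # field layout
```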


In the figure, salmon, green, and blue shade the blocks of unreplicated experiments, while distinctively colored check plots are replicated throughout the field in a systematic diagonal arrangement.

The main difference between using the FielDHub Shiny app and using the standalone function diagonal_arrangement() is that the standalone function will allocate filler only if it is necessary, while in Shiny App, users can customize the number of fillers if it is needed. In cases where users include fillers, either between or after experiments, the Shiny app is preferable for filling and visualizing all field plots.

To see more examples, go to https://didiermurillof.github.io/FielDHub/articles/diagonal_arrangement.html

Partially Replicated Design Example

Partially replicated designs are commonly employed in early generation field trials. This type of design is characterized by replication of a portion of the entries, with the remaining entries appearing only once in the experiment. As an example, consider a field trial with 288 plots containing 75 entries appearing two times each and 138 entries appearing only once. This field trial is arranged in a field of 16 rows by 18 columns.

In the figure above, green plots contain replicated entries, and the other plots contain entries that only appear once.

Using the FielDHub function partially_replicated()

Instead of using the Shiny FielDHub app, users can use the standalone FielDHub function partially_replicated() . The partially replicated layout described above can be produced through scripting as follows. As noted in the previous example, to obtain identical results between the script and the Shiny app, users need to use the same random seed, which, in this case, is 77.

Users can print returned values from partially_replicated() as follows,

Users can plot the layout design from partially_replicated() using the function plot() as follows,
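A sketch of these steps, assuming FielDHub is installed. The argument names repGens and repUnits follow my reading of the package documentation and should be checked against the function reference.

```r
library(FielDHub)

# 138 entries appearing once and 75 entries appearing twice fill the
# 16 x 18 = 288 plots; the seed is the one quoted in the text.
prep <- partially_replicated(
  nrows = 16,
  ncols = 18,
  repGens = c(138, 75),
  repUnits = c(1, 2),
  seed = 77
)

print(prep)                # summary of the design
head(prep$fieldBook, 12)   # first rows of the field book
plot(prep)                 # field layout
```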


To see more examples, please go to https://didiermurillof.github.io/FielDHub/articles/partially_replicated.html


R package that seeks better design of experiments

HAOYU-LI/UniDOE


Package: UniDOE. Authors: Aijun Zhang, Haoyu Li and Zebin Yang. Date: 22/08/2017.

R(>= 3.4.1), Rtools(>=34)

Rcpp (>= 0.12.12)

Linking To:

Introduction:

UniDOE is an R package that implements an efficient stochastic evolutionary (SE) algorithm to search for experimental designs. The computational procedures are mainly implemented in C++, which greatly boosts the calculation speed. Users can either download and install the binary package or install directly from GitHub using devtools; details are given below. This package is distributed in the hope that it will be useful, but without any warranty.

How to install:

First, make sure you are using R (>= 3.4.1). Typing 'version' at the R command line retrieves the relevant information, e.g.:

Output should show corresponding R version and architecture of your current platform:
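A small base R sketch of this check; it prints the version and architecture and compares the running version with the minimum stated above.

```r
# Version string and platform architecture of the running R session.
R.version.string
R.version$arch

# TRUE if the running R meets the minimum version required by UniDOE.
r_ok <- getRversion() >= "3.4.1"
r_ok
```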

(Update: Feb, 2018) Install from CRAN:

UniDOE is currently published on CRAN, so users can conveniently install it from the R command line:

This package will be modified and updated in CRAN directly. This github repository may not be the newest version.

Install from github:

Download and install Rcpp(>=0.12.12) package if you haven't installed or updated it to >=0.12.12 version.

In R command-line:

Git clone this repository to your local machine. After that, you can install UniDOE from local files:

Choose UniDOE_0.1.1.zip to install, or install it from the GUI.

It's more convenient to install UniDOE using devtools. At first, make sure you installed devtools.

Then install UniDOE from github:

Useful links:

  • Experimental design - Intro to design of experiments
  • License - License for this project

Design of Experiments with R

First Online: 01 January 2012

Emilio L. Cano, Javier M. Moguerza & Andrés Redchuk

Part of the book series: Use R! (USE R, volume 36)

Design of experiments (DoE) is one of the most important tools in the Six Sigma methodology. It is the essence of the Improve phase and the basis for the design of robust processes. An adequate use of DoE will lead to the improvement of a process, but a bad design can result in wrong conclusions and engender the opposite of the desired effect: inefficiencies, higher costs, and less competitiveness. In this chapter, we introduce the foundations of DoE and describe the essential functions in R to perform it and analyze its results. We will describe two-level factorial designs using a representative example of how DoE should be used to achieve the improvement of a process in a Six Sigma way. The chapter is not intended as a thorough review of DoE. The idea is to introduce a simple model in an intuitive way. For more technical or advanced training, a number of references are given at the end of the chapter.


The experts do not know the recipe used for each individual pizza.

Allen, T. T. (2010). Introduction to engineering statistics and lean Six Sigma—Statistical quality control and design of experiments and systems . New York: Springer.


Berger, P., & Maurer, R. (2002). Experimental design: With applications in management, engineering, and the sciences. Duxbury titles of related interest . CA: Duxbury/Thomson Learning.

Box, G., & Jones, S. (1992). Designing products that are robust to the environment. Total Quality Management , 3 (3), 265–285.

Box, G., Hunter, J., & Hunter, W. (2005). Statistics for experimenters: Design, innovation, and discovery. Wiley series in probability and statistics . New York: Wiley.

Grömping, U. (2011). Cran task view: Design of experiments (doe) & analysis of experimental data . http://cran.r-project.org/web/views/ExperimentalDesign.html . Retrieved 24.01.2012.

Grömping, U. (2012). Project: (industrial) doe in r . http://prof.beuth-hochschule.de/groemping/software/design-of-experiments/project-industrial-doe-in-r/ . Retrieved 24.01.2012.

Lalanne, C. (2006). R companion to montgomery’s design and analysis of experiments . http://www.aliquote.org/articles/tech/dae/ . Retrieved 19.01.2012.

Lopez-Fidalgo, J. (2009). A critical overview on optimal experimental designs. Boletin de Estadística e Investigación Operativa , 25 (1), 14–21. http://www.seio.es/BEIO/files/BEIOv25n1_ES_J.Lopez-Fidalgo.pdf . Retrieved 19.01.2012.

Mee, R. (2009). A comprehensive guide to factorial two-level experimentation . New York: Springer.

Montgomery, D. (2008). Design and analysis of experiments. Student solutions manual . New York: Wiley.

Myers, R., Montgomery, D., & Anderson-Cook, C. (2009). Response surface methodology: Process and product optimization using designed experiments. Wiley series in probability and statistics . New York: Wiley.

Pyzdek, T., & Keller, P. (2009). The Six Sigma handbook: A complete guide for green belts, black belts, and managers at all levels . New York: McGraw-Hill.

Rasch, D., Pilz, J., & Simecek, P. (2010). Optimal experimental design with R . London: Taylor & Francis.

Taguchi, G., Chowdhury, S., & Wu, Y. (2005). Taguchi’s quality engineering handbook . USA: Wiley.

Vikneswaran (2005). An r companion to “experimental design” . http://cran.r-project.org/doc/contrib/Vikneswaran-ED_companion.pdf . Retrieved 19.01.2012.

Author information

Authors and affiliations.

Department of Statistics and Operations Research, Rey Juan Carlos University, Madrid, Spain

Emilio L. Cano, Javier M. Moguerza & Andrés Redchuk

© 2012 Springer Science+Business Media New York

About this chapter

Cano, E.L., Moguerza, J.M., Redchuk, A. (2012). Design of Experiments with R. In: Six Sigma with R. Use R!, vol 36. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-3652-2_11

DOI : https://doi.org/10.1007/978-1-4614-3652-2_11

Published : 11 May 2012

Publisher Name : Springer, New York, NY

Print ISBN : 978-1-4614-3651-5

Online ISBN : 978-1-4614-3652-2

eBook Packages : Mathematics and Statistics Mathematics and Statistics (R0)

A First Course in Design and Analysis of Experiments

This book by Gary W. Oehlert was first published in 2000 by W. H. Freeman. As of summer 2010, it has gone out of print. Curiously, I still like this book and would prefer to continue using it in my teaching; some of my colleagues feel the same way. And since the copyright has reverted to me, we can do that.

  • You must properly attribute the work.
  • You may not use this work for commercial purposes.
  • You may not alter, transform, or build upon this work.

A complete description of the license may be found at the Creative Commons website.

Download First Edition

You may download A First Course in Design and Analysis of Experiments by clicking here (1.9 MB PDF).

Download Draft Second Edition

The designation "draft" is literal. All of the books are under active development, and you will be able to tell where I stopped revising and assembled this distribution file.

The PDF versions of the text include links to (some of) the material in the Examples supplement. The "local" version of the text tries to link to the local files in the RExamples subdirectory, and the "web" version of the text tries to link to files at www.stat.umn.edu/~gary/book/RExamples. The advantage of the local version is that you can use it without a network connection, but some platforms (iOS in particular) seem to make it very hard to use local files.

Whether these links work seems to be hit or miss. For example, I use a Mac, and the links work when the text is viewed using Adobe Reader, but they don't work when using Preview. On Linux, the local links didn't work using Xpdf but the web links did, while both worked under "Document Viewer". I really don't know what happens under iOS, Android, or Windows, but I'm guessing that some of it will work.

Download R Package (with data) for Second Edition

The R-package cfcdae (Companion to A First Course in Design and Analysis of Experiments) contains all of the data from the draft second edition of the book along with a handful of useful functions. (Many of these functions have analogs in other packages; they just do things the way I like.)

You may download cfcdae by clicking here

This package is in source format. Download the package and save the file into a place where R can find it (e.g., your home directory or the desktop). Start R, set the working directory to that location (e.g., with setwd()), and then use install.packages("oehlert_1..4-11.tar.gz",repos=NULL,type="source"). (The repos=NULL tells R not to look for the package online but in the local files; the type="source" tells R the file format.) Once the package is installed, you can do library(cfcdae) to load the library.
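The installation steps above can be sketched as a small helper. The function name install_local_source() and the file path in the usage line are hypothetical placeholders, not names from the book's site; substitute the file you actually downloaded.

```r
# Minimal sketch: install a package from a local source tarball.
# install_local_source() is a hypothetical helper, not part of any package.
install_local_source <- function(tarball) {
  if (!file.exists(tarball)) {
    message("File not found: ", tarball)
    return(invisible(FALSE))
  }
  # repos = NULL: use the local file; type = "source": the file is a source package
  install.packages(tarball, repos = NULL, type = "source")
  invisible(TRUE)
}

# Usage with a placeholder path (reports FALSE until the file exists)
ok <- install_local_source("path/to/downloaded_package.tar.gz")
print(ok)
```

Checking for the file first gives a clearer message than the error install.packages() raises when the working directory is wrong.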

Download Data from first edition

All of the individual text data sets accessible via the web as above are also available in a single zip archive

Russ Lenth at the University of Iowa has also provided an R package that includes the data sets from the book: oehlert_1.02.tar.gz. Download the package and save the file into a place where R can find it (e.g., your home directory or the desktop). Start R, set the working directory to that location (e.g., with setwd()), and then use install.packages("oehlert_1.02.tar.gz",repos=NULL,type="source"). (The repos=NULL tells R not to look for the package online but in the local files; the type="source" tells R the file format.) Once the package is installed, you can do library(oehlert) from within R to load all of the data. At that point, the command pr17.4 should give you problem 4 from chapter 17.

Note that the data set names, variable names, and variable codings in oehlert.Rdata and the direct-web-accessible data may not be the same.

oehlert-data.sas: an ASCII text file in SAS format. Oehlert-SAS.zip: a zip archive of individual SAS data files. Again, thanks to Russ Lenth for these files.

Single text file This is in the MacAnova matread format.

Both the SAS data format and the MacAnova matread format are plain text files. You can download either and then cut/paste if you need to put the data into another format.

  • © 2023 Regents of the University of Minnesota. All rights reserved.
  • Last modified on April 26, 2023

R-bloggers

Design of Experiments with Mixtures and their Analysis with R

Posted on June 30, 2022 by R in the Lab in R bloggers

Using R for design and analysis of results for experiments with mixtures.

“Mixtures are absolutely everywhere you look. Anything you can combine is a mixture.” (chem4kids.com)

All the code and data in this post are available in the repository: Design of Experiments with Mixtures and their Analysis with R

What are experimental designs with mixtures?

These are designs aimed at determining the effect of the proportion of different components of a mixture on one or more response variables.

We must emphasize that we are referring to the proportions of the different components in the mixture and not to their absolute amount. That is, it is the proportion that determines the effect.

This type of design has application in the formulation of many products such as beverages, foods, fuels, paints, etc.

In an experimental design of mixtures, the proportions of the q components sum to 1:

$$\sum_{i=1}^{q} x_i = x_1 + x_2 + \dots + x_q = 1$$

And the proportion of each component must be between 0 and 1:

$$0 \le x_i \le 1, \quad i = 1, 2, \dots, q$$

In a practical problem, calculating the proportions of each component is straightforward. For example, suppose that the sum of three components in a soda equals 2.5 g, and the respective amounts of each component are 1, 1, and 0.5 g. The proportion of each of the first and second ingredients is 1/2.5 = 0.4, and the proportion of the third ingredient is 0.5/2.5 = 0.2. Thus 0.4 + 0.4 + 0.2 = 1.

An experimental design of mixtures will help us determine the proportions of each component to produce the best flavor or to reduce some undesirable physical property in the liquid, for example.
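The soda arithmetic above takes only a couple of lines in R. The ingredient names are placeholders chosen for illustration.

```r
# Amounts in grams for the three soda components (names are hypothetical)
amounts <- c(ingredient_1 = 1, ingredient_2 = 1, ingredient_3 = 0.5)

# Proportions: each amount divided by the total of 2.5 g
proportions <- amounts / sum(amounts)
print(proportions)

# The mixture constraint: proportions must add up to 1
print(sum(proportions))
```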

Types of mixture designs and their generation in R

In this post I will focus on two types of mixture designs: simplex-lattice and simplex-centroid. Generating these designs is simple with the mixexp package.

Simplex-lattice design

The simplex-lattice design considers q components and allows fitting a model of order m to the experimental data. To generate a design with 3 components of order 3 we use the function SLD() as follows:
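The post's original code listing is not reproduced here, so the following is a minimal sketch rather than the author's code. It builds the {3, 3} simplex-lattice in base R with a hypothetical helper, simplex_lattice(), and then, only if mixexp happens to be installed, prints the result of SLD(3, 3) for comparison.

```r
# Simplex-lattice {q, m}: proportions are multiples of 1/m and each row sums to 1.
# simplex_lattice() is a hypothetical helper written for this sketch.
simplex_lattice <- function(q, m) {
  numerators <- 0:m
  grid <- expand.grid(rep(list(numerators), q))  # all combinations of numerators
  keep <- rowSums(grid) == m                     # rows whose proportions sum to 1
  design <- grid[keep, , drop = FALSE] / m
  names(design) <- paste0("x", seq_len(q))
  rownames(design) <- NULL
  design
}

des_sld <- simplex_lattice(q = 3, m = 3)
print(des_sld)

# Optional comparison with the mixexp function named in the post
if (requireNamespace("mixexp", quietly = TRUE)) {
  print(mixexp::SLD(3, 3))
}
```

For three components and order 3 this gives 10 design points: three vertices, six points on the edges and the overall centroid.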

Note that it is not necessary to specify the levels (proportions) of each component, as these are determined automatically by the order m:

$$x_i = 0, \frac{1}{m}, \frac{2}{m}, \dots, 1$$

If the proportions in each row are added together, the result is equal to 1. In addition, for three components the DesignPoints() function can be used to visualize the experimental region:

[Figure: simplex-lattice design points shown on the three-component simplex]

In this figure, the three vertices correspond to pure mixtures (formed by a single ingredient), and the three sides or edges represent binary mixtures that contain only two of the three components. The interior points of the triangle represent ternary mixtures, in which the proportions of all three ingredients are greater than zero.

Finally, the design can be exported to our working folder with the function write.csv() :
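A minimal sketch of the export step follows. The small design and the file name are made up for illustration, and the file goes to a temporary directory here; in practice you would point file.path() at your working folder.

```r
# A small hypothetical design: three vertices and one binary blend
design <- data.frame(
  x1 = c(1, 0, 0, 0.5),
  x2 = c(0, 1, 0, 0.5),
  x3 = c(0, 0, 1, 0)
)

# row.names = FALSE avoids an extra unnamed index column in the CSV
csv_path <- file.path(tempdir(), "mixture_design.csv")
write.csv(design, csv_path, row.names = FALSE)

# Read the file back to confirm what was written
reread <- read.csv(csv_path)
print(reread)
```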

Simplex-centroid design

If predictions are to be made within the experimental region, it is important to include points inside that region. The simplex-centroid design includes all intermediate mixtures between components, that is, every equal-proportion blend of two or more components up to the overall centroid. The SCD() function is used to generate it:
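As a sketch of what such a design contains (not the post's original listing), the base R helper below enumerates every non-empty subset of components and blends its members in equal proportions. The helper name simplex_centroid() is hypothetical; the optional call to SCD(3) runs only if mixexp is installed.

```r
# Simplex-centroid design: one point per non-empty subset of the q components,
# with the members of the subset mixed in equal proportions.
simplex_centroid <- function(q) {
  subsets <- unlist(
    lapply(seq_len(q), function(k) combn(q, k, simplify = FALSE)),
    recursive = FALSE
  )
  design <- t(sapply(subsets, function(s) {
    x <- numeric(q)
    x[s] <- 1 / length(s)
    x
  }))
  colnames(design) <- paste0("x", seq_len(q))
  as.data.frame(design)
}

des_scd <- simplex_centroid(3)
print(des_scd)

# Optional comparison with the mixexp function named in the post
if (requireNamespace("mixexp", quietly = TRUE)) {
  print(mixexp::SCD(3))
}
```

With q components the design has 2^q - 1 points, so 7 for three components.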

By visualizing the experimental region with three components, it becomes much clearer what we mean by intermediate mixtures:

[Figure: simplex-centroid design points shown on the three-component simplex]

Mixture designs with component constraints

Because of technical or economic constraints, for example, the proportion of one or more components is often restricted by lower and upper limits:

$$0 \le L_i \le x_i \le U_i \le 1$$

It is possible to generate designs considering the constraints for each component with the Xvert() function:
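To show what a constrained design is built from, here is a rough base R sketch that enumerates the extreme vertices of the constrained region: every component but one is set to a bound, the remaining one is solved from the sum-to-one constraint, and the point is kept if it respects its own bounds. The helper extreme_vertices() and the bounds are invented for illustration. The Xvert() call reflects my understanding of its arguments (nfac, lc, uc, ndm, plot) and runs only if mixexp is installed.

```r
# Hypothetical helper: extreme vertices of the region lc <= x <= uc, sum(x) = 1
extreme_vertices <- function(lc, uc) {
  q <- length(lc)
  out <- list()
  for (free in seq_len(q)) {
    others <- setdiff(seq_len(q), free)
    # every lower/upper combination for the other q - 1 components
    bounds <- expand.grid(lapply(others, function(j) c(lc[j], uc[j])))
    for (r in seq_len(nrow(bounds))) {
      x <- numeric(q)
      x[others] <- unlist(bounds[r, ])
      x[free] <- 1 - sum(x[others])       # solved from the mixture constraint
      if (x[free] >= lc[free] - 1e-9 && x[free] <= uc[free] + 1e-9) {
        out[[length(out) + 1]] <- round(x, 10)
      }
    }
  }
  v <- unique(do.call(rbind, out))        # drop vertices found more than once
  colnames(v) <- paste0("x", seq_len(q))
  as.data.frame(v)
}

# Made-up bounds for three components
lower <- c(0.2, 0.1, 0.1)
upper <- c(0.6, 0.5, 0.5)
verts <- extreme_vertices(lower, upper)
print(verts)

# Assumed mixexp call; ndm = 1 is understood to add edge and overall centroids
if (requireNamespace("mixexp", quietly = TRUE)) {
  print(mixexp::Xvert(nfac = 3, lc = lower, uc = upper, ndm = 1, plot = FALSE))
}
```

The sketch returns vertices only; a usable design normally adds centroids of the edges and of the whole region.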

It is also possible to visualize the experimental subregion:

[Figure: constrained experimental subregion within the three-component simplex]

Analysis of the results of a mixture design

For the example analysis, I will use the data published in Performance of reduced fat-reduced salt fermented sausage with added microcrystalline cellulose, resistant starch and oat fiber using the simplex design . In this study, the effect of the proportion of three ingredients on different characteristics of fermented sausages was determined. In this case I will only focus the analysis on one of the response variables (hardness).

Data import

As usual, the first step in the analysis is to import our data into R:

Model fitting

You can use the lm() function to fit a full model or, for the same purpose, the MixModel() function of the mixexp package:

Note that these models do not include the mean or intercept. Because the proportions sum to 1, the intercept column is an exact linear combination of the component columns, so the parameters of a model with an intercept are not unique. Dropping the intercept removes this dependence between the coefficients. As we will see later, the interpretation of each coefficient and the hypothesis tests related to each must be handled in a special way for this type of design.
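The sausage data from the cited study are not reproduced here, so the sketch below uses simulated responses on a simplex-centroid layout. The data frame dat, its column names and the "true" coefficients are all assumptions made for illustration. It fits a special cubic model without an intercept and then shows what lm() does when the intercept is left in. The MixModel() call assumes that model = 4 selects the special cubic model and runs only if mixexp is installed.

```r
# Hypothetical data: 7 simplex-centroid points, 2 replicates each
pts <- rbind(diag(3), c(0.5, 0.5, 0), c(0.5, 0, 0.5), c(0, 0.5, 0.5), rep(1/3, 3))
dat <- as.data.frame(pts[rep(1:7, each = 2), ])
names(dat) <- c("x1", "x2", "x3")
set.seed(42)
dat$y <- with(dat, 10 * x1 + 14 * x2 + 8 * x3 + 6 * x1 * x2 - 4 * x2 * x3) +
  rnorm(nrow(dat), sd = 0.3)

# Special cubic model; "-1" drops the intercept
full <- lm(y ~ -1 + x1 + x2 + x3 + x1:x2 + x1:x3 + x2:x3 + x1:x2:x3, data = dat)
print(coef(full))

# With an intercept, x1 + x2 + x3 duplicates the intercept column,
# so one coefficient cannot be estimated and is reported as NA
with_intercept <- lm(y ~ x1 + x2 + x3, data = dat)
print(coef(with_intercept))

# Assumed mixexp equivalent of the first fit
if (requireNamespace("mixexp", quietly = TRUE)) {
  mixexp::MixModel(dat, "y", mixcomps = c("x1", "x2", "x3"), model = 4)
}
```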

Model coefficients, their interpretation and coefficients of determination

The summary() function will display a complete report with the coefficients of the model we previously selected, as well as the coefficients of determination:

In general, the coefficients in this type of model are interpreted as follows:

  • Coefficients of individual components. They do not measure the overall effect of component x_i, but only estimate the value of the response at the corresponding vertex of the simplex. If these coefficients are not significant, it does not mean that the effect of the individual component is unimportant, so hypothesis tests on them are usually ignored.
  • Coefficients of double interactions. If the sign of this coefficient is positive, there is synergy between the two components; if it is negative, there is antagonism between them.
  • Triple interaction coefficients. They quantify the effect of the ternary mixture in the interior of the simplex.
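These interpretations can be checked numerically. The sketch below reuses simulated data (an assumption, not the study's data) and confirms two facts that follow from the model form: the coefficient of x1 equals the prediction at the pure-x1 vertex, and the x1:x2 coefficient divided by 4 equals the amount by which the 50:50 blend departs from the average of the two vertices.

```r
# Hypothetical data: 7 simplex-centroid points, 2 replicates each
pts <- rbind(diag(3), c(0.5, 0.5, 0), c(0.5, 0, 0.5), c(0, 0.5, 0.5), rep(1/3, 3))
dat <- as.data.frame(pts[rep(1:7, each = 2), ])
names(dat) <- c("x1", "x2", "x3")
set.seed(42)
dat$y <- with(dat, 10 * x1 + 14 * x2 + 8 * x3 + 6 * x1 * x2 - 4 * x2 * x3) +
  rnorm(nrow(dat), sd = 0.3)

fit <- lm(y ~ -1 + x1 + x2 + x3 + x1:x2 + x1:x3 + x2:x3 + x1:x2:x3, data = dat)
b <- coef(fit)
print(summary(fit))

# Pure-component coefficient = predicted response at that vertex
vertex <- data.frame(x1 = 1, x2 = 0, x3 = 0)
at_vertex <- unname(predict(fit, vertex))
print(c(prediction = at_vertex, coefficient = unname(b["x1"])))

# Binary coefficient = departure from linear blending, scaled by 4
blend <- data.frame(x1 = 0.5, x2 = 0.5, x3 = 0)
excess <- unname(predict(fit, blend)) - mean(b[c("x1", "x2")])
print(c(excess = excess, b12_over_4 = unname(b["x1:x2"]) / 4))
```

A positive excess means the blend responds more strongly than linear blending of the two pure components would predict, which is the synergy described in the second bullet.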

The result report can be easily saved with the capture.output() function:

step() function to improve the model

To simplify the model, non-significant terms are sometimes eliminated. This can be done somewhat subjectively by trial and error, eliminating one or more terms and then comparing the result with the full model. R also offers a systematic way to do this with the step() function, which iteratively drops or adds terms and keeps the model with the lowest Akaike information criterion (AIC).

The function can display a large number of results depending on the number of iterations it makes to simplify the model, so in this example I will directly save the results in a text file:
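A minimal sketch of this step, again on simulated data rather than the study's data: step() runs on the full model while capture.output() collects its iteration trace, which is then written to a text file. The file name is a placeholder and the file goes to a temporary directory here.

```r
# Hypothetical data: 7 simplex-centroid points, 2 replicates each
pts <- rbind(diag(3), c(0.5, 0.5, 0), c(0.5, 0, 0.5), c(0, 0.5, 0.5), rep(1/3, 3))
dat <- as.data.frame(pts[rep(1:7, each = 2), ])
names(dat) <- c("x1", "x2", "x3")
set.seed(42)
dat$y <- with(dat, 10 * x1 + 14 * x2 + 8 * x3 + 6 * x1 * x2 - 4 * x2 * x3) +
  rnorm(nrow(dat), sd = 0.3)

full <- lm(y ~ -1 + x1 + x2 + x3 + x1:x2 + x1:x3 + x2:x3 + x1:x2:x3, data = dat)

# Run backward selection and capture the printed trace instead of showing it
trace_lines <- capture.output(reduced <- step(full, direction = "backward"))

# Save the trace to a text file
log_file <- file.path(tempdir(), "step_trace.txt")
writeLines(trace_lines, log_file)

# The selected model
print(formula(reduced))
```

step() respects marginality, so a component term such as x1 is not dropped while an interaction containing it is still in the model.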

Subsequently, we only need to fit the simplified model:

By displaying a summary of results it can be seen that there is not a big difference between the coefficients of determination of this model and the full model. However, the smaller number of terms may have some practical advantage depending on the problem:

Lack-of-fit test

Another way to evaluate the quality of the model fit, if there is more than one repetition for any of the treatments, is by means of a lack-of-fit test . This can be done directly with the pureErrorAnova() function of the alr3 package:

For the simplified model:

In this test, if the p-value obtained for lack of fit is greater than 0.05 (or whatever significance level the experimenter has set), there is no evidence of lack of fit and the model can be considered adequate for the data. Note how with the full model we came close to rejecting the null hypothesis of no lack of fit, while with the simplified model the situation improved somewhat.
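The same idea can be sketched in base R without pureErrorAnova(): compare the fitted model with a model that estimates a separate mean for every distinct design point, whose residuals contain pure error only. The data are simulated, and a quadratic model is used so that degrees of freedom remain for lack of fit; both choices are assumptions for illustration.

```r
# Hypothetical data: 7 simplex-centroid points, 2 replicates each
pts <- rbind(diag(3), c(0.5, 0.5, 0), c(0.5, 0, 0.5), c(0, 0.5, 0.5), rep(1/3, 3))
dat <- as.data.frame(pts[rep(1:7, each = 2), ])
names(dat) <- c("x1", "x2", "x3")
set.seed(42)
dat$y <- with(dat, 10 * x1 + 14 * x2 + 8 * x3 + 6 * x1 * x2 - 4 * x2 * x3) +
  rnorm(nrow(dat), sd = 0.3)
dat$point <- factor(rep(1:7, each = 2))   # label for each distinct mixture

# Quadratic mixture model: 6 parameters for 7 distinct points
quad <- lm(y ~ -1 + x1 + x2 + x3 + x1:x2 + x1:x3 + x2:x3, data = dat)

# Cell-means model: its residual sum of squares is the pure error
cell <- lm(y ~ point, data = dat)

# The F test comparing the two models is the lack-of-fit test
lof <- anova(quad, cell)
print(lof)
```

The test needs replicates and more distinct design points than model parameters; otherwise no degrees of freedom remain for lack of fit.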

Visualization of the simplified model in two dimensions

It is possible to make a contour plot with the fitted model, only for the case of three components in the mixture:

[Figure: contour plot of the fitted model over the three-component simplex]

The ModelPlot() function is also included in the mixexp package.

The graph can be exported in png format, for example, as follows:

Mixture effect plot

Another way to plot the results is by using an effect plot for the components of the mixture. This two-dimensional plot can be useful if you have more than three components in the mixture. To do this we can use the ModelEff() function included in mixexp :

[Figure: effect plot for the three mixture components]

ModelEff() displays the components in the same order as specified in the fitted model, so x1 corresponds to MCC, x2 corresponds to RS and x3 corresponds to OF. This plot starts with a reference mixture (usually the center of the experimental region) and shows how the response changes as one of the components increases or decreases in the mixture; when one of the components changes, the rest increase or decrease proportionally. The disadvantage of ModelEff() is that only complete, not simplified, models can be used to make the plot.
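The idea behind such an effect plot can be sketched in base R. Starting from a reference mixture, one component is moved from 0 to 1 while the others keep their relative proportions, and the fitted response is traced along that path. This is only an illustration in the spirit of what is described above, not a reimplementation of ModelEff(); the data are simulated and cox_trace() is a hypothetical helper.

```r
# Hypothetical data: 7 simplex-centroid points, 2 replicates each
pts <- rbind(diag(3), c(0.5, 0.5, 0), c(0.5, 0, 0.5), c(0, 0.5, 0.5), rep(1/3, 3))
dat <- as.data.frame(pts[rep(1:7, each = 2), ])
names(dat) <- c("x1", "x2", "x3")
set.seed(42)
dat$y <- with(dat, 10 * x1 + 14 * x2 + 8 * x3 + 6 * x1 * x2 - 4 * x2 * x3) +
  rnorm(nrow(dat), sd = 0.3)
fit <- lm(y ~ -1 + x1 + x2 + x3 + x1:x2 + x1:x3 + x2:x3 + x1:x2:x3, data = dat)

# Predicted response as component i goes from 0 to 1 and the remaining
# components shrink or grow in proportion to their reference values
cox_trace <- function(fit, ref, i, t) {
  x <- sapply(seq_along(ref), function(j) {
    if (j == i) t else ref[j] * (1 - t) / (1 - ref[i])
  })
  x <- as.data.frame(x)
  names(x) <- names(ref)
  predict(fit, newdata = x)
}

ref <- c(x1 = 1/3, x2 = 1/3, x3 = 1/3)     # reference mixture: the centroid
t_grid <- seq(0, 1, length.out = 21)
traces <- sapply(seq_along(ref), function(i) cox_trace(fit, ref, i, t_grid))
colnames(traces) <- names(ref)
print(round(cbind(t = t_grid, traces), 2))

# Draw the traces when running interactively
if (interactive()) {
  matplot(t_grid, traces, type = "l", lty = 1,
          xlab = "Proportion of the component", ylab = "Predicted response")
  legend("topleft", legend = colnames(traces), col = 1:3, lty = 1)
}
```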

The effect graph can be exported in the same way as the contour graph:

If the reader would like to consult more examples of analysis with the mixexp package, please check the document at the following link: Mixture Experiments in R Using mixexp .

Very good! That’s all for this post, thank you very much for visiting this blog.

Juan Pablo Carreón Hidalgo 🤓

https://github.com/jpch26

This work is licensed under a Creative Commons Attribution 4.0 International License .

A subreddit for all things related to the R Project for Statistical Computing. Questions, news, and comments about R programming, R packages, RStudio, and more.

Design of Experiments packages

Hi! So I'm really into design of experiments now (I'm a pharmaceutical scientist) and I was wondering what the best packages out there are to apply DoE in a fashioned way. I have solid practical knowledge of these techniques as well as of the R language itself, so I shouldn't have any trouble with more complex packages.

My main interests beyond DoE are quality control packages/statistical control packages (control charts, process control indicators and stuff like that)

Could you guys help me with that? Thanks in advance!

  • Experimental Design in R
  • by Daniel Pinedo
  • Last updated over 3 years ago

COMMENTS

  1. CRAN Task View: Design of Experiments (DoE) & Analysis of Experimental Data

    This task view collects information on R packages for experimental design and analysis of data from experiments. Packages that focus on analysis only and do not make relevant contributions for design creation are not considered in the scope of this task view. Please feel free to suggest enhancements, and please send information on new packages or major package updates if you think they belong ...

  2. Current state of R packages for the design of experiments

    There are altogether 83 R-packages in the CRAN Task View of Design of Experiments & Analysis of Experimental Data as of 2022-09-18. I'm going to refer to these packages as DoE packages, although there are some packages in the mix that are more about the analysis of experimental data rather than the design of experiments and there are some packages that are missing in the list (e.g ...

  3. Design of Experiments Suite: Generate and Evaluate Optimal Designs

    skpr is an open source design of experiments suite for generating and evaluating optimal designs in R. Here is a sampling of what skpr offers: Generates and evaluates D, I, A, Alias, E, T, and G optimal designs, as well as user-defined custom optimality criteria. ... In addition, the package offers two functions to generate common plots related ...

  4. PDF Design of Experiments in R

    packages related to Design of Experiments Main purposes Pointer to existing functionality support synergies, avoid double work Maintainers need help (cf. also Fox 2009): please point out relevant packages or - perhaps occasionally - complain about packages in a task view that are not helpful

  5. Experimental Design and Process Optimization with R

    1 Introduction. The present document is a short and elementary course on the Design of Experiments (DoE) and empirical process optimization with the open-source Software R. The course is self-contained and does not assume any preknowledge in statistics or mathematics beyond high school level. Statistical concepts will be introduced on an ...

  6. PDF Package Package

    A Toolbox for Computing Efficient Designs of Experiments, version 1.0.1, by Radoslav Harman and Lenka Filova. Algorithms for D-, A-, I-, and c-optimal designs. Some of the functions in this package require the 'gurobi' software and its accompanying R package.

  7. PDF Design of Experiments in R

    R-packages AlgDesign: Algorithmic experimental designs. Bob Wheeler BsMD: Bayes Screening and Model Discrimination. Ernesto Barrios igraph: Routines for simple graphs, network analysis. Gabor Csardi lhs: Latin Hypercube Samples. Rob Carnell scatterplot3d: 3D Scatter Plot. Uwe Ligges sfsmisc: Utilities from Seminar für Statistik ETH Zürich.

  8. CRAN: Package experiment

    Provides various statistical methods for designing and analyzing randomized experiments. One functionality of the package is the implementation of randomized-block and matched-pair designs based on possibly multivariate pre-treatment covariates. The package also provides the tools to analyze various randomized experiments including cluster randomized experiments, two-stage randomized ...

  9. PDF Design of Experiments in R

    overview of what functionality is available in R for experimental design and analysis of experimental data in general (cf. also the Task View on Experimental Design and Analysis of Experimental Data, ... and then focuses on industrial experimentation (cf. e.g. Box, Hunter and Hunter 2005) and a series of R packages by the author (DoE.base, FrF2 ...

  10. Some Basic Concepts about Design of Experiments and How to ...

    Basic design of experiments in R for one factor and two factors designs. You can find all the code, data and results in the GitHub repository for this post: Basic design of experiments. There is no signal without noise It never hurts to go back to basics before tackling more complex things. The purpose of this post is to give a brief overview of the basics of design of experiments, their ...

  11. Current state and prospects of R-packages for the design of experiments

    In ExperimentalDesign task view, there are 114 R packages for experimental design and analysis of data from experiments. The sheer quantity and variation of experimental designs in the R-packages are arguably unmatched with any other programming languages, e.g. in Python, only a handful of packages

  12. Current state and prospects of R-packages for the design of experiments

    Re-running an experiment is generally costly and, in some cases, impossible due to limited resources; therefore, the design of an experiment plays a critical role in increasing the quality of experimental data. In this paper, we describe the current state of R-packages for the design of experiments through an exploratory data analysis of package downloads, package metadata, and a comparison of ...

  13. A Shiny App for Design of Experiments in Life Sciences

    Overview. FielDHub is an R package/shiny design of experiments (DOE) app that aids in the creation of traditional, un-replicated, augmented and partially-replicated designs applied to agriculture, plant breeding, forestry, animal and biological sciences. For more details and examples of all functions present in the FielDHub package.

  14. HAOYU-LI/UniDOE: R package that seeks better design of experiments

    UniDOE is an R package which implements an efficient stochastic evolutionary (SE) algorithm to search for designs of experiments. The computations are mainly done in C++, which greatly speeds up the calculation. Users can either download and install from binary source package or install from github directly using devtools, of ...

  15. PDF DiceDesign: Designs of Computer Experiments

    coverage(design) design. a matrix (or a data.frame) representing the design of experiments representing the design of experiments in the unit cube [0,1]d. If this last condition is not fulfilled, a transformation into [0,1]d is applied before the computation of the criteria. The coverage criterion is defined by.

  16. PDF daewr: Design and Analysis of Experiments with R

    Title Design and Analysis of Experiments with R Version 1.2-11 Date 2023-09-04 ... daewr-package Data frames and functions for Design and Analysis of Experiments with R Description This package contains the data sets and functions from the book Design and Analysis of Experiments with R published by CRC in 2013.

  17. PDF RcmdrPlugin.DoE: R Commander Plugin for (Industrial) Design of Experiments

    Description Provides a platform-independent GUI for design of experiments. The package is implemented as a plugin to the R-Commander, which is a more general graphical user interface for statistics in R based on tcl/tk. DoE functionality can be accessed through the menu Design that is added to the R-Commander menus. License GPL (>= 2)