library(tidyverse)
library(FSA)
library(knitr)
specimen
Specimen question
A chemical company has developed a new herbicide and is testing it against the leading brand. Field trials on plots are randomly assigned to one of three treatments: treated according to the manufacturer’s instructions for the leading brand (here called ‘brand’); treated by the advised protocol for the new chemical (here called ‘new’); or treated with a procedural control (‘control’). There were 30 plots for each treatment. Data are counts of numbers of weed species found in a quadrat six weeks after application of the treatment (one quadrat within each trial plot) and are provided as a text file in ‘specimen.txt’.
Your role is to choose and apply analyses of the provided data appropriately, produce suitable figure(s) and table(s), provide reproducible R code and a report of your findings.
Specimen quarto report answer
Methods
Load required packages
Read in the data file
dat <- read_table("data_raw/specimen.txt")
Group by treatment to look at how many counts per treatment
dat %>%
  group_by(Herbicide) %>%
  count()
# A tibble: 3 × 2
# Groups: Herbicide [3]
Herbicide n
<chr> <int>
1 brand 30
2 control 30
3 new 30
Summarise each treatment: mean, standard deviation, sample size, standard error, minimum, maximum and range. The minimum and maximum can also be obtained by calling summary() on the object.
dat_summary <- dat %>%
  group_by(Herbicide) %>%
  summarise(mean = mean(Weeds),
            std = sd(Weeds),
            n = length(Weeds),
            se = std/sqrt(n),      # standard error of the mean
            min = min(Weeds),
            max = max(Weeds),
            range = max - min)     # uses the min and max columns created above
Display this table using the kable function
kable(dat_summary)
Herbicide | mean | std | n | se | min | max | range |
---|---|---|---|---|---|---|---|
brand | 4.133333 | 1.870521 | 30 | 0.3415089 | 0 | 7 | 7 |
control | 8.666667 | 2.951719 | 30 | 0.5389077 | 3 | 15 | 12 |
new | 5.166667 | 2.085803 | 30 | 0.3808138 | 1 | 9 | 8 |
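As a quick check on the se column, the standard error can be recomputed by hand from the std and n columns. This is a minimal sketch that copies the rounded values from the table above rather than reading the data file; the object names std, n and se are chosen here for illustration.

```r
# Standard deviations copied from the summary table (rounded as displayed)
std <- c(brand = 1.870521, control = 2.951719, new = 2.085803)
n <- 30  # plots per treatment

# Standard error of the mean = standard deviation / sqrt(sample size)
se <- std / sqrt(n)
print(round(se, 4))
```

The results should agree with the se column to within rounding of the displayed standard deviations.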
Save the mean for each treatment group by selecting the row and column in the data frame, rounded to two decimal places
mean_brand <- round(dat_summary[1,"mean"],2)
mean_control <- round(dat_summary[2,"mean"],2)
mean_new <- round(dat_summary[3,"mean"],2)
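Selecting by row number depends on the row order of the summary table. One alternative is to look each mean up by treatment name. The sketch below assumes a small stand-in data frame, dat_summary_demo, built from the means in the table above; it is not the report's own object, and the name means is also introduced here for illustration.

```r
# Stand-in for the summary table, using the means displayed above
dat_summary_demo <- data.frame(
  Herbicide = c("brand", "control", "new"),
  mean = c(4.133333, 8.666667, 5.166667)
)

# Named numeric vector of rounded means, indexed by treatment name
means <- setNames(round(dat_summary_demo$mean, 2), dat_summary_demo$Herbicide)

mean_brand <- means[["brand"]]
mean_control <- means[["control"]]
mean_new <- means[["new"]]
print(means)
```

Each value is then a plain number rather than a one-cell table, which can be easier to use in inline text.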
One quick exploratory plot - this doesn’t need to be mentioned in results.
The data are counts, so a Kruskal-Wallis test was used rather than a one-way ANOVA
kruskal.test(data = dat, Weeds ~ Herbicide)
Kruskal-Wallis rank sum test
data: Weeds by Herbicide
Kruskal-Wallis chi-squared = 34.188, df = 2, p-value = 3.768e-08
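The reported p-value can be reproduced from the test statistic, since the Kruskal-Wallis statistic is referred to a chi-squared distribution with (number of groups - 1) degrees of freedom. A minimal sketch using the rounded statistic printed above; p_value is a name introduced here.

```r
# Upper-tail probability of a chi-squared distribution with 2 d.f.
# at the Kruskal-Wallis statistic reported in the output above
p_value <- pchisq(34.188, df = 2, lower.tail = FALSE)
print(signif(p_value, 4))
```

The result should match the printed p-value of 3.768e-08 up to rounding of the statistic.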
The result is significant, but the test does not show which of the three pairwise comparisons differ, so a post-hoc test is needed. The Dunn test was used as the post-hoc test for Kruskal-Wallis
dunnTest(data = dat, Weeds ~ Herbicide)
Warning: Herbicide was coerced to a factor.
Dunn (1964) Kruskal-Wallis multiple comparison
p-values adjusted with the Holm method.
Comparison Z P.unadj P.adj
1 brand - control -5.653706 1.570244e-08 4.710733e-08
2 brand - new -1.535359 1.246955e-01 1.246955e-01
3 control - new 4.118347 3.816001e-05 7.632003e-05
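The P.adj column can be checked against the P.unadj column using base R's p.adjust() with the Holm method. This sketch copies the unadjusted p-values from the output above; the names p_unadj and p_holm are introduced here for illustration.

```r
# Unadjusted p-values copied from the Dunn test output
p_unadj <- c("brand - control" = 1.570244e-08,
             "brand - new"     = 1.246955e-01,
             "control - new"   = 3.816001e-05)

# Holm adjustment: the smallest p-value is multiplied by 3, the next by 2,
# the largest by 1, then monotonicity is enforced and values are capped at 1
p_holm <- p.adjust(p_unadj, method = "holm")
print(signif(p_holm, 4))
```

The adjusted values should agree with the P.adj column to within rounding.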
The control differs significantly from both herbicides (adjusted p < 0.001 for brand - control and for control - new). Brand - new is not significant (adjusted p = 0.12)
This can be shown more clearly in a boxplot. It makes sense to put the control first rather than use the default alphabetical order, so the factor levels were reordered like this first
dat$Herbicide <- factor(dat$Herbicide, levels = c("control","new","brand"))
ggplot(dat, aes(x = Herbicide, y = Weeds)) +
geom_boxplot() +
geom_jitter(width=0.06, height=0.1, col="grey") +
xlab("Treatment applied") +
ylab("Weed species recorded") +
ylim(0,16) +
theme_classic()
Results
There were 30 valid weed species counts from each of the three treatments, with counts ranging from 0 to 15 weed species per quadrat. The mean number of weed species was highest for the control (8.67) and lowest for the leading brand (4.13), with the new chemical in between (5.17). Summary data are shown in Table 1 and visualised in Figure 1.
A Kruskal-Wallis test showed a significant effect of treatment on the number of weed species recorded (chi-squared = 34.19, d.f. = 2, p < 0.001). A post hoc Dunn test with Holm-adjusted p-values showed that the control had significantly more weed species recorded than either the leading brand or the new chemical (p < 0.001 for both comparisons). The new chemical had on average 1.03 more weed species per quadrat than the leading brand, but this difference was not significant in the Dunn test (p = 0.12).
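This trial indicates that the new chemical is an effective herbicide: it reduced the number of weed species relative to the control. Its mean count was slightly higher than that of the leading brand, but the difference was not statistically significant, so there is no evidence from this trial that its efficacy differs from the leading brand.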