Lately I have been working with a seabird diet time-series. I have learned that there are tens of different indices that can be used to describe the dataset. Best fitting index has to be chosen individually depending on the questions and data. Swanson & Krapu (1974) made a review of the issue. Here I am going to demonstrate how to calculate "frequency of occurrence", "aggregate mass", "aggregate percentage" and mean mass for a time-series.
Shortly, frequency of occurrence tells how frequently a diet item occurs in the dataset. The aggregate mass tells the percentage for a diet item of summed total volume of all samples. Finally, the mean of mass percentages, or aggregate percentage, tells the average over the dataset of how many percentages a diet item constitutes of a stomach sample. This is much better explained in Swanson & Krapu 1974.
There are of course thousands of ways to do this, but simplest I have found so far is by using "reshape" package. Here is the R code:
#Assume a dataset where "a", "b" and "c" are diet items, "x1", "x2" ... are diet samples from individual birds and "y1" and "y2" are different years. Values for diet items are given in grams:
a <- c(1, 1, 2, 0, 3, 0, 5, 6, 1, 2)
b <- c(2, 1, 3, 2, 5, 2, 3, 5, 1, 0)
c <- c(0, 0, 1, 2, 4, 0, 20, 0, 0, 0)
year <- c(rep("y1", 5), rep("y2", 5))
bird <- paste(rep("x", 10), seq(1,10,1), sep="")
data <- cbind.data.frame(bird, year, a, b, c)
library(reshape) # start the reshape package
data$total.mass <- rowSums(data[3:5]) # create total.mass column for later use
melt.data <- melt(data, id=1:2, measured=3:6) # melt the data so that we can cast it in desired format
# start from easiest, mean mass and standard deviation.
t(cast(melt.data, formula = year ~ variable, mean)) #Transposing t() because it is better to list diet items as rows when there are a lot of prey items
t(cast(melt.data, formula = year ~ variable, sd))
# aggregate mass needs column sum and total mass for each year
mass.sum <- cast(melt.data, formula = year ~ variable, sum)
aggregate.mass <- (mass.sum[2:5]/mass.sum$total.mass)*100
#if you want to print the dataframe, you can do following:
rownames(aggregate.mass) <- levels(mass.sum$year)
print(t(aggregate.mass), digits = 1)
#aggregate percentage
data.per <- cbind.data.frame(data[1:2], (data[3:6]/data$total.mass)*100) # calculate how big percentage each diet item consitutes in each sample
data.per <- data.per[-6] # drop off useless total.mass column
melt.data.per <- melt(data.per, id=1:2, measured=3:5)
aggregate.percentage <- cast(melt.data.per, formula = year ~ variable, mean)
#frequency of occurrence
data.na <- data
data.na[data.na==0] <- NA
melt.data.na <- melt(data.na, id=1:2, measured=3:6, na.rm = T)
occur <- cast(melt.data.na, formula = year ~ variable, length)
t(cbind.data.frame(occur[2:4]/occur$total.mass, row.names = levels(occur$year)))
#comparison between aggregate percentage and aggregate mass
rownames(aggregate.percentage) <- levels(aggregate.percentage$year)
aggregate.percentage <- aggregate.percentage[-1]
aggregate.mass <- aggregate.mass[-4]
aggregate.percentage-aggregate.mass
# differences are mainly because of 20 grams of item c in bird x7. The aggregate volume method gives equal weight to each unit of food consumed by any bird while the aggregate percent method gives equal weight in the analysis to each bird as Swanson & Krapu (1974) says.
Please tell, if there are mistakes in scripts.
References
Swanson, G. A., G. L. Krapu, et al. (1974). "Advantages in Mathematically Weighting Waterfowl Food-Habits Data." Journal of Wildlife Management 38(2): 302-307.
Hehe, nerdy indeed, but seems to be helpful!
ReplyDeleteI'm on my way into the world of R as well, SPSS is made for social "scientists" anyway ;)