Introduction
Every observed movement trajectory can be viewed as a sequence of step-lengths and turning angles. The basic idea of Hidden Markov Movement Models (HMMMs) is to decompose this sequence into distinct behavioral modes by parametrizing step-length and turning-angle distributions that depend on the animal’s unobserved behavioral state. Stated differently, the idea is to find step-length and turning-angle distributions for a given number of states such that the likelihood of observing the collected data is maximized. Once the maximum likelihood estimates have been found, the “Viterbi-Algorithm” can be applied to retrieve the animal’s inferred movement state at every point in time. In movement ecology, HMMMs are very popular because they allow to connect observed movements to an unobserved underlying behavioral state.
In this blog post, we will simulate some movement data with known parameters and
then fit an HMMM to see whether we can actually retrieve the correct simulation
parameters using the moveHMM
package.
Simulation Prerequisites
Let’s get started and set up our R-Session. First, we’ll load all packages that
we need for this little project. To simulate a virtual landscape, we will use
the brilliant NLMR
package, which unfortunately is not available from CRAN.
However, you can install it from github using:
# Install required r-packages
::install_github("cran/RandomFields") # Dependency of NLMR, but not on CRAN
devtools::install_github("ropensci/NLMR") # Not on CRAN devtools
# Load required packages
library(tidyverse) # For easier data handling
library(NLMR) # To simulate landscapes
library(raster) # To handle spatial data
library(moveHMM) # To analyse the simulated data
library(ggpubr) # To arrange multiple ggplot objects
Let’s generate a virtual landscape across which we can simulate some movement. Here, we’ll use a simple gaussian-field to represent a food-resource. It will determine the state-switching probabilities of our simulated animals, That is, depending on food-availability, our animals will be more or less likely to switch from one state to another (hence to change their movement mode).
# Simulate a resource (determines the likelihood of switching states)
raster(ncol = 500, nrow = 500, xmn = 0, xmx = 1000, ymn = 0, ymx = 1000)
food <- values(nlm_gaussianfield(ncol = 500, nrow = 500))
food[] <- scale(food)
food <-
# Visualize it
ggplot(as.data.frame(food, xy = T), aes(x = x, y = y, fill = layer)) +
geom_raster() +
scale_fill_viridis_c(name = "Food Availability") +
coord_equal() +
theme_minimal() +
theme(legend.position = "bottom") +
guides(fill = guide_colourbar(title.position="top", title.hjust = 0.5))
To now simulate movement across this landscape, we need three ingredients. (1) A function to sample step lengths, (2) a function to sample turning angles, and (3) a function to determine transition probabilities
Function to Sample Step Lengths
To sample step lengths we will simply use the gamma distribution. The gamma distribution has two parameters (shape and scale) and is often used in movement ecology because it matches observed movement patterns quite well. It’s mean and standard deviation are given by \[mean = shape * scale\] \[sd = shape^{0.5} * scale\]
Let’s take a look at some step lengths drawn from a gamma distribution for two hypothetical states (e.g. resting vs. moving).
# Simulate step lengths for two hypothetical states
tibble(
states <-State = rep(c("State 1", "State 2"), each = 500)
Shape = ifelse (State == "State 1", 2.5, 3.0)
, Scale = ifelse (State == "State 1", 0.5, 1.5)
, StepLength = NA
,
)for (i in 1:nrow(states)) {
$StepLength[i] <- rgamma(1, shape = states$Shape[i], scale = states$Scale[i])
states
}
# Visualize
ggplot(states, aes(x = StepLength)) +
geom_histogram(col = "white", bins = 20, fill = "cornflowerblue") +
theme_minimal() +
xlab("Step Length") +
facet_wrap(~ State)
Function to Sample Turning Angles
Aside from sampling step lengths, we also need to be able to sample turning angles. The von Mises distribution is perfectly suited for this, as it is bound between \(-\pi\) and \(\pi\) and allows to render a tendency to move directional. It has a concentration parameter kappa and a location parameter mu. Because R neither provides a density function nor a random number generator for the von Mises distribution, we have to write those ourselves.
# Function to determine the probability density of a von Mises distribution
function(x, kappa, mu, log = F) {
dvonmises <- exp(kappa * cos(x - mu)) / (2 * pi * besselI(kappa, nu = 0))
d <-if (log == T) {
log(d)
d <-
}return(d)
}
# Function to randomly sample values from a von Mises distribution
function(n, kappa, mu, by = 0.01) {
rvonmises <- seq(-pi, +pi, by = by)
x <- dvonmises(x, kappa = kappa, mu = mu)
probs <- sample(x, size = n, prob = probs, replace = T)
random <-return(random)
}
# Simulate turning angles for two hypothetical states
tibble(
states <-State = rep(c("State 1", "State 2"), each = 500)
Kappa = ifelse (State == "State 1", 0.2, 0.5)
, Mu = 0
, TurningAngle = NA
,
)for (i in 1:nrow(states)) {
$TurningAngle[i] <- rvonmises(1, kappa = states$Kappa[i], mu = states$Mu[i])
states
}
# Visualize
ggplot(states, aes(x = TurningAngle)) +
geom_histogram(col = "white", bins = 20, fill = "cornflowerblue") +
theme_minimal() +
xlab("Turning Angle") +
facet_wrap(~ State)
Function to Determine Transition-Probabilities
Finally, we need a function that governs the transition probabilities to switch from one state to another given a certain amount of food-availability. What we want is a function that takes a linear predictor and spits out a probability. A useful formula to turn a linear predictor (i.e. \(a + b * x\)) into a probability is the inverse of the logit. Hence, let’s write it down.
# Function to compute the transition probability, depending on food-availability
# and parameters a and b
function(food, a, b) {
transition <- a + b * food
linear_predictor <-1 / (1 + exp(-linear_predictor)) # Inverse logit
}
# Generate transition probability for different food-availability
tibble(
tr <-Food = seq(-4, 4, by = 0.01)
TransitionProbability = transition(Food, a = 1, b = 2)
,
)
# Function to sample a state
function(state, food) {
sample_state <- transition(food, params$T_a[state], params$T_b[state])
p <- rbinom(n = 1, size = 1, prob = p)
change <-if (state == 1 & change == 1) {
2
state <-else if (state == 2 & change == 1) {
} 1
state <-
}return(state)
}
# Visualize
ggplot(tr, aes(x = Food, y = TransitionProbability)) +
geom_line(col = "cornflowerblue", lwd = 1.2) +
ylab("Transition Probability") +
theme_minimal()
Simulation Parameters
We are now ready to put the different building blocks together and start our movement simulation. First, we define the different movement parameters that will be used in the simulation.
# Number of steps to simulate
100
n_steps <-
# Number of individuals to simulate
100
n_indivs <-
# Create tibble (a fancy dataframe) into which we will store the results
tibble(ID = 1:n_indivs)
sims <-
# Movement parameters
tibble(
params <-State = c("Resting", "Moving")
GammaMean = c(5, 15) # Mean of the gamma distribution
, GammaSD = c(5, 10) # Standard deviation of the gamma distribution
, MisesMean = c(0, 0) # Mean (location) of the von Mises distribution
, MisesCon = c(0.5, 1) # Concentration parameter of the von Mises distribution
, T_a = c(1, 2) # Intercept of the linear predictor for transition probs.
, T_b = c(-1, 4) # Slope of the linear predictor for transition probs.
, )
Given this set of parameters, we can visualize the step lengths, turning angles and transition probabilities that we will expect. This is simply to ensure that everything works fine and that we’re not generating unreasonable data.
# Compute probability density functions for step lengths, turning angles, and
# transition probabilities for the given simulation parameters
params %>%
example <- mutate(StepLengths = map2(GammaMean, GammaSD, function(x, y) {
tibble(
StepLength = seq(0, 60, length.out = 1000)
Density = dgamma(StepLength, shape = (x / y) ** 2, scale = y ** 2 / x)
,
)%>%
})) mutate(TurningAngles = map2(MisesCon, MisesMean, function(x, y) {
tibble(
TurningAngle = seq(-pi, +pi, length.out = 1000)
Density = dvonmises(TurningAngle, kappa = x, mu = y)
,
)%>%
})) mutate(Transitions = map2(T_a, T_b, function(x, y) {
tibble(
Food = seq(-4, +4, length.out = 1000)
Density = transition(Food, x, y)
,
)
}))
# Visualize
example %>%
p1 <- unnest(StepLengths) %>%
ggplot(aes(x = StepLength, y = Density)) +
geom_line(col = "cornflowerblue", lwd = 1.2) +
facet_wrap(~State) +
theme_minimal() +
xlab("Step Length")
example %>%
p2 <- unnest(TurningAngles) %>%
ggplot(aes(x = TurningAngle, y = Density)) +
geom_line(col = "cornflowerblue", lwd = 1.2) +
facet_wrap(~State) +
theme_minimal() +
xlab("Turning Angle")
example %>%
p3 <- unnest(Transitions) %>%
ggplot(aes(x = Food, y = Density)) +
geom_line(col = "cornflowerblue", lwd = 1.2) +
facet_wrap(~State) +
theme_minimal() +
ylab("Transition Probability")
# Put the plots together
ggarrange(p1, p2, p3, nrow = 3)
This looks good! When the animal is in the “Moving” state, the step lengths tend to be larger, turning angles tend to be narrower (closer to 0) and the transition probability reacts quite strongly to changes in food-availability. In contrast, when the animal is resting, step lengths tend to be short, turning angles less concentrated around 0 (still with a tendency to move forward though) and the transition probability reacts less strongly to changes in the food-availability.
With this we can finally run our movement simulation. The simulation works as follows:
An animal is released at the center of the map at \((x, y) = (500, 500)\) and is assumed to be oriented towards north
Food-availability at the animal’s current location is extracted and a new state is determined. The animal can either switch state or remain in the current state. Transition probabilities depend on the food availability.
A random step length and random turning angle are sampled from the distributions of the respective state.
The new position of the animal is calculated Steps 2 to 4 are then repeated until a total of 100 steps are simulated.
To run the simulation for all 100 individuals, we use the lapply()
function,
which is basically a “for-loop” that returns a list. For easier “bookkeeping” of
the list, we can make use of tibbles. Tibbles are incredibly powerful versions
of dataframes and make it fairly easy to keep track of our simulation outputs.
They allow us to store the simulated data of different individuals into a single
column titled “Simulations”. The rest of the code should be self-explanatory.
# Run simulations
$Simulations <- lapply(1:nrow(sims), function(x) {
sims
# Generate dataframe to keep track of coordinates and other data
data.frame(
df <-x = rep(NA, n_steps) # x-coordinate
y = rep(NA, n_steps) # y-coordinate
, stepl = rep(NA, n_steps) # Step length
, relta = rep(NA, n_steps) # Relative turning angle (heading / orientation)
, absta = rep(NA, n_steps) # Absolute turning angle
, state = rep(NA, n_steps) # Current state
, food = rep(NA, n_steps) # Food-availability
,
)
# Specify an initial orientation and state
0
absta_init <- 1
state_init <-
# Initiate first rows of the dataframe (animal is released at the center)
$x[1] <- 500
df$y[1] <- 500
df
# Generate the movement trajectory based on random step lengths and turning
# angles
for (i in 1:(nrow(df) - 1)) {
# Determine food availability
raster::extract(food, cbind(df$x[i], df$y[i]))
f <-
# Determine the animal's new state
sample_state(state_init, f)
s <-
# Sample step lengths from the state-specific gamma distribution
$stepl[i] <- rgamma(1
dfshape = (params$GammaMean[s] / params$GammaSD[s]) ** 2
, scale = params$GammaSD[s] ** 2 / params$GammaMean[s]
,
)
# Sample (relative) turning angles from the state-specific von mises
# distribution
$relta[i] <- rvonmises(1
dfmu = params$MisesMean[s]
, kappa = params$MisesCon[s]
,
)
# Compute the absolute turning angle
$absta[i] <- absta_init + df$relta[i]
df
# Compute the new location of the animal
$x[i + 1] <- df$x[i] + sin(df$absta[i]) * df$stepl[i]
df$y[i + 1] <- df$y[i] + cos(df$absta[i]) * df$stepl[i]
df
# Store other relevant information
$food[i] <- f
df$state[i] <- s
df df$absta[i]
absta_init <- df$state[i]
state_init <-
}
# Return the simulation
return(df)
})
The simulation shouldn’t take longer than a couple of seconds. Note that the final tibble is nested, i.e. all simulated data of single individual is contained within that specific individual’s row. Hence, we want to unnest the tibble to get a regular dataframe.
# Unnest the data
unnest(sims, Simulations)
sims <-
# Take a look at the unnested data
head(sims)
## # A tibble: 6 × 8
## ID x y stepl relta absta state food
## <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 500 500 2.60 0.828 0.828 1 1.18
## 2 1 502. 502. 19.6 0.148 0.977 2 0.739
## 3 1 518. 513. 20.8 -0.422 0.555 1 -0.606
## 4 1 529. 530. 28.9 2.80 3.35 2 -0.764
## 5 1 523. 502. 4.15 -1.09 2.26 2 -1.27
## 6 1 526. 499. 12.8 -0.892 1.37 2 -2.12
You can see that unnesting preserves the simulation ID which is super-convenient because we can use it to distinguish different individuals. We can now visualize the simulated trajectories on a map. Remember that we released all individuals at the center, which is why it will by quite crowded there!
# Visualize it
ggplot() +
geom_raster(data = as.data.frame(food, xy = T), aes(x = x, y = y, fill = layer)) +
scale_fill_viridis_c(name = "Food Availability") +
geom_point(data = sims, aes(x = x, y = y, group = as.factor(ID)), size = 0.5) +
geom_path(data = sims, aes(x = x, y = y, group = as.factor(ID)), linewidth = 0.5) +
coord_equal() +
theme_minimal() +
theme(legend.position = "bottom") +
guides(fill = guide_colourbar(title.position="top", title.hjust = 0.5))
Fitting the Model
With the simulated data we can finally go ahead and fit a hidden markov movement
model. If everything works correctly, we should be able to obtain exactly the
input parameters that we used to simulate movement. First, however, we need to
make sure the data is in the correct format. We can do this using the
prepData()
function.
# Prepare the data (ignore the warning)
prepData(as.data.frame(sims[, c("ID", "x", "y", "food")])
df_prep <-type = "UTM"
, coordNames = c("x", "y")
, )
For the optimizer to work, we will need to provide initial values. If you want to learn more about how to chose initial values, I would recommend you to read the following vignette but the basic idea is to plot histograms of the observed values and then to set plausible initial values accordingly. Note that for the gamma distribution the package does not take the shape and scale parameters, but requires us to provide a mean and standard deviation instead (the formulas to compute mean and standard deviation from the shape and scale of a gamma distribution are given above).
# Histogram of step lengths and turning angles
ggplot(df_prep, aes(x = step)) +
p1 <- geom_histogram(col = "white", fill = "cornflowerblue", bins = 20) +
theme_minimal() +
xlab("Step Length")
ggplot(df_prep, aes(x = angle)) +
p2 <- geom_histogram(col = "white", fill = "cornflowerblue", bins = 20) +
theme_minimal() +
xlab("Turning Angle")
ggarrange(p1, p2, nrow = 1)
# Define initial values for step length distribution(s)
c(5, 10) # Resting (5) and moving (10) parameters
mu0 <- c(5, 10) # Resting (5) and moving (10) parameters
sd0 <- c(mu0, sd0)
stepPar <-
# Define intial values for turning angle distribution(s)
c(0, 0) # Resting (0) and moving (0) parameters
anglemean0 <- c(0.5, 1) # Resting (0) and moving (0) parameters
anglecon0 <- c(anglemean0, anglecon0) anglePar <-
We can now go ahead and fit the model to estimate the parameters of interest. Note that we provide a model formula indicating that we believe that the transition probabilities depend on food-availability. We also need to tell the model for how many states it should try to assign. In most applications people use two states only (e.g. resting and moving).
# Fit the model
fitHMM(
mod <-data = df_prep
nbStates = 2
, stepPar0 = stepPar
, anglePar0 = anglePar
, formula = ~ food
, )
Let’s put the true parameters and the estimates from the model side by side so that we can better verify that the model has done a good job.
# Compare step-length simulation parameters to estimates from moveHMM
list(mod$mle$stepPar, t(params[, c("State", "GammaMean", "GammaSD")]))
## [[1]]
## state 1 state 2
## mean 5.080427 15.188335
## sd 5.102551 9.992704
##
## [[2]]
## [,1] [,2]
## State "Resting" "Moving"
## GammaMean " 5" "15"
## GammaSD " 5" "10"
# Compare turning-angle simulation parameters to estimates from moveHMM
list(mod$mle$anglePar, t(params[, c("State", "MisesMean", "MisesCon")]))
## [[1]]
## state 1 state 2
## mean 0.003855044 0.001301972
## concentration 0.490353060 1.015617926
##
## [[2]]
## [,1] [,2]
## State "Resting" "Moving"
## MisesMean "0" "0"
## MisesCon "0.5" "1.0"
# Compare transition-parameters to estimates from moveHMM
list(mod$mle$beta, t(params[, c("State", "T_a", "T_b")]))
## [[1]]
## 1 -> 2 2 -> 1
## intercept 0.9307734 2.033371
## food -0.9901982 4.096431
##
## [[2]]
## [,1] [,2]
## State "Resting" "Moving"
## T_a "1" "2"
## T_b "-1" " 4"
Impressive! The model almost perfectly approximated all of the true parameters. We might, however, also be interested in the derived states. For this, we can apply the Viterbi algorithm. We then compare the derived states to the true states.
# Compare states
viterbi(mod)
states <-cor(sims$state, states, use = "pairwise.complete.obs")
## [1] 0.7017736
# Confusion matrix
table(sims$state, states)
## states
## 1 2
## 1 4036 826
## 2 651 4387
We can even plot the transition probabilities and will find that the plot looks almost identical to the one we produced in the beginning.
# Plot transition probs
plotStationary(mod, plotCI = T)
Conclusion
In this blog post we simulated 100 individuals moving across a virtual
landscape. The individuals exhibited two distinct behavioral modes that
determined their movement behavior. Using HMMMs implemented in the moveHMM
package we successfully recovered all simulation parameters and could determine
the behavioral modes with high accuracy.
Further Reading
Vignettes
- Basic introduction to the
moveHMM
package (https://cran.r-project.org/web/packages/moveHMM/vignettes/moveHMM-guide.pdf) - How to find meaningful initial values (https://cran.r-project.org/web/packages/moveHMM/vignettes/moveHMM-starting-values.pdf)
Session Information
## R version 4.1.2 (2021-11-01)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Linux Mint 21.3
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=de_CH.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=de_CH.UTF-8 LC_COLLATE=de_CH.UTF-8
## [5] LC_MONETARY=de_CH.UTF-8 LC_MESSAGES=de_CH.UTF-8
## [7] LC_PAPER=de_CH.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=de_CH.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ggpubr_0.6.0 moveHMM_1.9 CircStats_0.2-6 boot_1.3-28
## [5] MASS_7.3-55 raster_3.6-20 sp_1.6-1 NLMR_1.1.1
## [9] lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1 dplyr_1.1.3
## [13] purrr_1.0.2 readr_2.1.4 tidyr_1.3.0 tibble_3.2.1
## [17] ggplot2_3.4.4 tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] httr_1.4.6 sass_0.4.6 jsonlite_1.8.5
## [4] viridisLite_0.4.2 carData_3.0-5 bslib_0.5.0
## [7] highr_0.10 yaml_2.3.7 numDeriv_2016.8-1.1
## [10] pillar_1.9.0 backports_1.4.1 lattice_0.20-45
## [13] glue_1.6.2 digest_0.6.31 ggsignif_0.6.4
## [16] checkmate_2.2.0 colorspace_2.1-0 cowplot_1.1.1
## [19] htmltools_0.5.5 plyr_1.8.8 pkgconfig_2.0.3
## [22] broom_1.0.5 bookdown_0.34 scales_1.2.1
## [25] terra_1.7-74 jpeg_0.1-10 tzdb_0.4.0
## [28] timechange_0.2.0 ggmap_3.0.2 farver_2.1.1
## [31] generics_0.1.3 car_3.1-2 cachem_1.0.8
## [34] withr_2.5.2 cli_3.6.1 magrittr_2.0.3
## [37] evaluate_0.21 fansi_1.0.5 rstatix_0.7.2
## [40] RandomFieldsUtils_0.5 blogdown_1.17 tools_4.1.2
## [43] hms_1.1.3 geosphere_1.5-18 RgoogleMaps_1.4.5.3
## [46] lifecycle_1.0.4 munsell_0.5.0 compiler_4.1.2
## [49] jquerylib_0.1.4 rlang_1.1.2 grid_4.1.2
## [52] RandomFields_3.3.1 labeling_0.4.2 bitops_1.0-7
## [55] rmarkdown_2.22 gtable_0.3.3 codetools_0.2-18
## [58] abind_1.4-5 R6_2.5.1 knitr_1.43
## [61] fastmap_1.1.1 utf8_1.2.4 stringi_1.8.1
## [64] Rcpp_1.0.11 vctrs_0.6.4 png_0.1-8
## [67] tidyselect_1.2.0 xfun_0.39