Appendix A — Troubleshooting secr

What could possibly go wrong when fitting spatially explicit capture–recapture models? Quite a lot. This appendix assembles a list of known difficulties with secr 5.2, with examples, and suggests some solutions. The scaling of covariates and coordinates was discussed earlier in relation to density surface models.

A.1 Bias limit exceeded

Suppose we omitted to specify the buffer argument for the first snowshoe hare model in Chapter 2:

fit <- secr.fit (hareCH6, trace = FALSE)
Warning: using default buffer width 100 m
Warning: predicted relative bias exceeds 0.01 with buffer = 100

The second warning message is clearly a consequence of the first: relying on the 100-m default buffer for an animal as mobile as the snowshoe hare is likely to cause problems. See the advice on choosing the buffer width in Chapter 12.

Density is overestimated when the buffer width for a detector array in continuous habitat is too small to encompass the centres of all animals potentially detected. A check for this ‘mask truncation bias’ is performed routinely by secr.fit after fitting any model that relies on the ‘buffer’ argument. The check may be skipped by setting biasLimit = NA or by providing a pre-computed habitat mask in the ‘mask’ argument.
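
Both escape routes are easy to write down. The following is a minimal sketch, assuming a 600-m buffer is generous enough for snowshoe hares (the choice of width is discussed in Chapter 12; hareMask is just an illustrative name):

# option 1: widen the buffer and skip the bias check
fit <- secr.fit(hareCH6, buffer = 600, biasLimit = NA, trace = FALSE)
# option 2: supply a pre-computed habitat mask; 'buffer' is then not used
hareMask <- make.mask(traps(hareCH6), buffer = 600, type = "trapbuffer")
fit <- secr.fit(hareCH6, mask = hareMask, trace = FALSE)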

A.2 Initial log likelihood NA

Maximization will fail completely if the likelihood cannot be evaluated at the starting values. This will be obvious with trace = TRUE. Otherwise, the first indication will be a premature end to fitting and a lot of NAs in the estimates.

Tip

It is common to see occasional NA values among the likelihoods evaluated during maximization. This is not a problem.

An earlier version of this section used the failure of the default starting values for the dataset infraCH (Oligosoma infrapunctatum skinks sampled with pitfall traps over two 3-occasion sessions labelled ‘6’ and ‘7’) as an example. Unfortunately, the defaults now seem to work, so we contrive an example by specifying a bad starting value for sigma:

fit <- secr.fit(infraCH, buffer = 25, start = list(sigma = 2), 
                trace = TRUE)
Checking data 
Preparing detection design matrices 
Preparing density design matrix 
Finding initial parameter values... 
Initial values  D = 261.98628, g0 = 0.12632, sigma = 2.87777 
Maximizing likelihood... 
Eval     Loglik        D       g0    sigma 
   1         NA   5.5683  -1.9339   0.6931 
   2         NA   5.5683  -1.9339   0.6931 
   3         NA   5.5683  -1.9339   0.6931 
   4         NA   5.5683  -1.9339   0.6931 
   5         NA   5.5683  -1.9339   0.6931 
   6         NA   5.5688  -1.9339   0.6931 
   7         NA   5.5683  -1.9338   0.6931 
   8         NA   5.5683  -1.9339   0.6932 
   9         NA   5.5694  -1.9339   0.6931 
  10         NA   5.5688  -1.9338   0.6931 
  11         NA   5.5688  -1.9339   0.6932 
  12         NA   5.5683  -1.9337   0.6931 
  13         NA   5.5683  -1.9338   0.6932 
  14         NA   5.5683  -1.9339   0.6933 
Completed in 4.18 seconds at 17:14:24 17 Mar 2025 

Unsurprisingly, the problem can be addressed by manually providing a better starting value for sigma:

fit1 <- secr.fit(infraCH, buffer = 25, start = list(sigma = 5), 
                 trace = FALSE)

The original problem seems to have been due to a discrepancy between the two sessions (try RPSV(infraCH, CC = TRUE)). The default sigma was suitable for the first session and not the second, whereas a larger sigma suits both. Rather than manually providing the starting value we could have directed secr.fit to use the second session for determining starting values:

fit2 <- secr.fit(infraCH, buffer = 25, details = list(autoini = 2),
                 trace = FALSE)

A.3 Variance calculation failed

If warnings had not been suppressed in the preceding example we would have seen

Warning in secr.fit(infraCH, buffer = 25, details = list(autoini = 2), trace =
FALSE): at least one variance calculation failed

Examination of the output would reveal missing values for SE, lcl and ucl in both the coefficients and predicted values for g0.

Failure to compute the variance can be a sign that the model is inherently non-identifiable or that the particular dataset is inadequate (e.g., Gimenez et al., 2004). However, here it is due to a weakness in the default algorithm. Call secr.refit with method = "none" to recompute the variance-covariance matrix using a more sophisticated approach:

fit2r <- secr.refit(fit2, method = "none")
Checking data 
Preparing detection design matrices 
Preparing density design matrix 
Computing Hessian with fdHess in nlme 
   1  -2260.348   5.5120  -2.2342   1.4915 
   2  -2260.354   5.5175  -2.2342   1.4915 
   3  -2260.348   5.5120  -2.2319   1.4915 
   4  -2260.350   5.5120  -2.2342   1.4929 
   5  -2260.355   5.5065  -2.2342   1.4915 
   6  -2260.349   5.5120  -2.2364   1.4915 
   7  -2260.350   5.5120  -2.2342   1.4900 
   8  -2260.357   5.5175  -2.2319   1.4915 
   9  -2260.361   5.5175  -2.2342   1.4929 
  10  -2260.353   5.5120  -2.2319   1.4929 
  11  -2260.348   5.5120  -2.2342   1.4915 
Completed in 3.26 seconds at 17:15:54 17 Mar 2025 
predict(fit2r)
$`session = 6`
       link   estimate SE.estimate        lcl       ucl
D       log 247.640709  13.9294531 221.809895 276.47964
g0    logit   0.096724   0.0082055   0.081792   0.11404
sigma   log   4.443569   0.1579074   4.144699   4.76399

$`session = 7`
       link   estimate SE.estimate        lcl       ucl
D       log 247.640709  13.9294531 221.809895 276.47964
g0    logit   0.096724   0.0082055   0.081792   0.11404
sigma   log   4.443569   0.1579074   4.144699   4.76399

The trapping sessions were only 4 weeks apart in spring 1995. We can further investigate session differences by fitting a session-specific model. The fastest way to fit fully session-specific models is to fit each session separately; lapply here applies secr.fit separately to each component of infraCH:

fits3 <- lapply(infraCH, secr.fit, buffer = 25, trace = FALSE)
Warning in FUN(X[[i]], ...): possible maximization error: nlm returned code 3. See
?nlm
class(fits3) <- "secrlist"  # ensure secr will recognise the fitted models
predict(fits3)
$`6`
       link  estimate SE.estimate       lcl       ucl
D       log 255.75796    28.36892 205.92143 317.65579
g0    logit   0.17127     0.02989   0.12032   0.23796
sigma   log   2.51903     0.18140   2.18784   2.90035

$`7`
       link  estimate SE.estimate        lcl       ucl
D       log 278.02163   18.848980 243.464690 317.48352
g0    logit   0.09894    0.010483   0.080208   0.12147
sigma   log   4.95115    0.210887   4.554778   5.38202

Notice that there is no issue with starting values when the sessions are treated separately. The skinks appeared to enlarge their home ranges as the weather warmed; they may also have become more active overall1. It is plausible that density did not change: the estimate increased slightly, but there is substantial overlap of confidence intervals.

A.4 Log likelihood becomes NA or improbably large after a few evaluations

The default maximization method (Newton-Raphson in function nlm) takes a large step away from the initial values at evaluation np + 3, where np is the number of estimated coefficients. This often results in a very negative or NA log likelihood, from which the algorithm quickly returns to a reasonable part of the parameter space. However, for some problems it does not return and estimation fails, possibly with a weird message about infinite density. Two solutions are suggested:

  • change to the more robust Nelder-Mead maximization algorithm
secr.fit(CH, method = "Nelder-Mead", ...)
  • vary the scaling of each parameter in nlm by passing the typsize (typical size) argument. The default is typsize = rep(1, np). Suppose your model has four coefficients and it is the second one that appears to be behaving wildly:
secr.fit(CH, typsize = c(1, 0.1, 1, 1), ...)

In these examples CH is your capthist object and ... indicates other arguments of secr.fit.

A.5 Possible maximization error: nlm code 3

The default algorithm for numerical maximization of the likelihood is nlm in the base R stats package. That uses a Newton-Raphson algorithm and numerical derivatives. It was chosen because it is significantly faster than the alternatives. However, it sometimes returns estimates with the ambiguous result code 3, which means that the optimization process terminated because “[the] last global step failed to locate a point lower than estimate. Either estimate is an approximate local minimum of the function or steptol is too small.”

Here is an example:

fit3 <- secr.fit(infraCH, buffer = 25, model = list(g0~session, 
    sigma~session), details = list(autoini = 2), trace = FALSE)

The results usually seem to be reliable even when this warning is given. If you are nervous, try a different algorithm in secr.fit; “Nelder-Mead” is recommended. We can derive starting values from the earlier fit:

fit3nm <- secr.fit(infraCH, buffer = 25, model = list(g0~session, 
    sigma~session), method = "Nelder-Mead", start = fit3, trace = FALSE)

There is no warning. Comparing the density estimates we see trivial differences in the SE and confidence limits and almost none in the estimates themselves:

collate(fit3, fit3nm)[1,,,'D']
       estimate SE.estimate    lcl    ucl
fit3     271.95      15.849 242.62 304.83
fit3nm   271.93      15.810 242.66 304.72

This suggests that only the variance-covariance estimates were in doubt, and it would have been much quicker merely to check them with method = "none" as in the previous section.
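
That quicker check would look something like this (fit3chk is just an illustrative name):

# recompute only the variance-covariance matrix from the existing fit
fit3chk <- secr.refit(fit3, method = "none")
predict(fit3chk)   # confirm SE, lcl and ucl are no longer missing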

A.6 Possible maximization error: nlm code 4

The nlm Newton-Raphson algorithm may also finish with the result code 4, which means that the optimization process terminated when the maximum number of iterations was reached (“iteration limit exceeded”). The maximum is set by the argument iterlim which defaults to 100 (each ‘iteration’ uses several likelihood evaluations to numerically determine the gradient for the Newton-Raphson algorithm).

The number of iterations can be checked retrospectively by examining the nlm output saved in the ‘fit’ component of the fitted model. Ordinarily nlm uses fewer than 50 iterations (for example, fit3$fit$iterations = 26).
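
Assuming the nlm output is stored as described, a quick retrospective check is:

fit3$fit$iterations   # number of nlm iterations (26 here)
fit3$fit$code         # nlm result code; 1 or 2 indicate normal termination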

A ‘brute force’ solution is to increase iterlim (secr.fit() passes iterlim directly to nlm()) or to re-fit the model starting at the previous solution (start = oldfit). This is not guaranteed to work. Alternative algorithms such as method = 'Nelder-Mead' are worth trying, but they may also struggle.
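
In code, and using CH and oldfit as placeholders in the style of the earlier examples, these options look roughly like this (the value 300 for iterlim is arbitrary):

# raise the nlm iteration cap (iterlim is passed through to nlm)
fit <- secr.fit(CH, iterlim = 300, trace = FALSE)
# or restart from the provisional fit
fit <- secr.fit(CH, start = oldfit, trace = FALSE)
# or switch to a more robust algorithm
fit <- secr.fit(CH, method = "Nelder-Mead", start = oldfit, trace = FALSE)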

There does not appear to be a universal solution. Slow or poor fitting seems more common when there are many beta parameters, and when one or more parameters are very imprecise, at a boundary, or simply non-identifiable. It is suggested that you examine the coefficients of the provisional result with coef(fit) and seek to eliminate those that are not identifiable.

Tricks include:

  • combining levels of poorly supported factor covariates
  • fixing the value of non-identifiable beta parameters with details argument fixedbeta
  • ensuring that all levels of a factor x factor interaction are represented in the data (possibly by defining a single factor with valid levels)
  • changing the coding of factor covariates with details argument contrasts.

The following code demonstrates fixing a beta parameter, although it is neither needed nor recommended in this case.

# review the fitted values
coef(fit3)
                   beta  SE.beta      lcl      ucl
D               5.60561 0.058232  5.49148  5.71975
g0             -1.62740 0.197335 -2.01417 -1.24063
g0.session7    -0.57474 0.223044 -1.01190 -0.13759
sigma           0.91992 0.071424  0.77994  1.05991
sigma.session7  0.68185 0.082613  0.51993  0.84377
# extract the coefficients
betafix <- coef(fit3)$beta
# set first 4 values to NA as we want to estimate these
betafix[1:4] <- NA
betafix
[1]      NA      NA      NA      NA 0.68185
# refit, holding last coefficient constant
fit3a <- secr.fit(infraCH, buffer = 25, model = list(g0~session, sigma~session),
         details = list(autoini = 2, fixedbeta = betafix), trace = FALSE)
Warning in secr.fit(infraCH, buffer = 25, model = list(g0 ~ session, sigma ~ :
possible maximization error: nlm returned code 3. See ?nlm
coef(fit3a)
                beta  SE.beta      lcl      ucl
D            5.60558 0.058378  5.49116  5.72000
g0          -1.62733 0.143073 -1.90775 -1.34692
g0.session7 -0.57506 0.131847 -0.83347 -0.31664
sigma        0.91998 0.037586  0.84632  0.99365

Note that the estimated coefficients (‘beta’) have barely changed, but the estimated ‘SE.beta’ of each detection parameter has dropped, a result of our spurious claim to know the true value of ‘sigma.session7’.

There is no direct mechanism for holding the beta parameters for different levels of a factor (e.g., session) at a single value. The effect can be achieved by defining a new factor covariate with combined levels.
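
One hedged sketch of this approach uses a session covariate; here CH3 stands for a hypothetical three-session capthist in which sessions 2 and 3 are to share a single sigma level (scov and sgrp are illustrative names):

# session covariate pooling sessions 2 and 3
scov <- data.frame(sgrp = factor(c("s1", "s23", "s23")))
fit <- secr.fit(CH3, model = sigma ~ sgrp, sessioncov = scov, trace = FALSE)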

A.7 secr.fit requests more memory than is available

In secr 5.2 the memory required by the external C code is at least 32 \times C \times M \times K bytes, where C is the number of distinct sets of values for the detection parameters (across all individuals, occasions, detectors and finite mixture classes), M is the number of points in the habitat mask and K is the number of detectors. Each distinct set of values appears as a row in a lookup table2 whose columns correspond to real parameters; a separate parameter index array (PIA) has entries that point to rows in the lookup table. Four arrays with dimension C \times M \times K are pre-filled with, for example, the double-precision (8-byte) probability that an animal in mask cell m is caught in detector k under the c-th set of parameter values.

The number of distinct parameter sets C can become large when any real parameter (g0, lambda0, sigma) is modelled as a function of continuous covariates, because each unit (individual, detector, occasion) potentially has a unique level of the parameter. A rough calculation gives the maximum size of C for a given amount of available RAM. Given, say, 6 GB of available RAM, K = 200 traps, and M = 4000 mask cells, C should be substantially less than 6e9 / 200 / 4000 / 32 \approx 234. Allowance must be made for other memory allocations; this is simply the largest.
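
The arithmetic is easily scripted; this sketch merely restates the formula above:

# total bytes for the four C x M x K arrays of 8-byte doubles
membytes <- function(C, M, K) 32 * C * M * K
membytes(234, 4000, 200) / 1e9   # just under 6 GB
6e9 / (32 * 4000 * 200)          # maximum C, approximately 234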

There is a different lookup table for each session; the limiting C is for the most complex session. The memory constraint concerns detection parameters only.

Most analyses can be configured to bring the memory request down to a reasonable number.

  1. C may be reduced by replacing each continuous covariate with one using a small number of discrete levels (e.g. the mid-points of weight classes). For example, weightclass <- 10 * trunc(weight/10) + 5 for midpoints of 10-g classes.
  2. M can be reduced by building a habitat mask with an appropriate spacing (see the vignette secr-habitatmasks.pdf and the sketch after this list).
  3. K might seem to be fixed by the design, but in extreme cases it may be appropriate to combine data from adjacent detectors (see Collapsing detectors).
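
As an illustration of point 2, a coarser mask might be built along these lines (the 600-m buffer and 200-m spacing are arbitrary; CH is your capthist object and coarsemask an illustrative name):

# fewer mask points: coarser spacing directly reduces M
coarsemask <- make.mask(traps(CH), buffer = 600, spacing = 200)
fit <- secr.fit(CH, mask = coarsemask, trace = FALSE)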

The mash function (see Mashing) may be used to reduce the number of detectors when the design uses many identical and independent clusters. Otherwise, apply your ingenuity to simplify your model, e.g., by casting ‘groups’ as ‘sessions’. Memory is less often an issue on 64-bit systems (see also ?"Memory-limits").
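
A bare-bones sketch of mashing, assuming clusteredCH is a capthist from a design of many identical, independent clusters (mashedCH is an illustrative name and the 600-m buffer is arbitrary):

# overlay identical clusters on a single representative cluster (reduces K)
mashedCH <- mash(clusteredCH)
fit <- secr.fit(mashedCH, buffer = 600, trace = FALSE)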

A.8 Covariate factor levels differ between sessions

This is fairly explicit; secr.fit will stop if a model includes any covariate whose factor levels differ among sessions, and verify will warn if it finds such a covariate. This commonly occurs in multi-session datasets with ‘sex’ as an individual covariate when only males or only females are detected in one session. Use the function shareFactorLevels to force covariates to use the same superset of levels in all sessions.
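
For example, applied to a hypothetical multi-session capthist CH before fitting:

# force covariates to share one superset of factor levels across sessions
CH <- shareFactorLevels(CH)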

A.9 Estimates depend on starting values

Instability of the estimates can result when the likelihood surface has more than one local maximum, i.e., when it is ‘multimodal’. Numerical maximization may then fail to find the global maximum from a given starting point. Nelder-Mead is more robust than other methods.
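
One hedged way to probe for multimodality is to refit from dispersed starting values and compare the maximised log likelihoods (CH again stands for your capthist; the sigma values and fit names are arbitrary):

fitA <- secr.fit(CH, start = list(sigma = 3), trace = FALSE)
fitB <- secr.fit(CH, start = list(sigma = 10), trace = FALSE)
c(logLik(fitA), logLik(fitB))   # clearly different maxima suggest multimodality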

Finite mixture models have known problems due to multimodality of the likelihood, as discussed separately (Chapter 15). See Dawson & Efford (2009) and the vignette secr-sound.pdf for another example of a multimodal likelihood in SECR.


  1. Home-range area increased about 4-fold; g0 showed some compensatory decrease, but compensation was incomplete, implying increased total activity (treating g_0 as an approximation to \lambda_0; see Efford and Mowat 2014).

  2. You can see this table by running secr.fit with details = list(debug = 3) and typing Xrealparval at the browser prompt (type Q to exit).