Generic Model-Based Bycatch Estimation Procedure

BycatchEstimator uses both model-based and design-based procedures to estimate total annual bycatch by expanding a sample, such as an observer database, to the total effort from logbooks or landings records. The model framework can also be used to estimate an annual index of abundance, calculated only from the observer data. See the User’s Guide under articles at https://ebabcock.github.io/BycatchEstimator/ for details. The source code is at https://github.com/ebabcock/BycatchEstimator. For a development version of a Shiny app that runs the data checks described below, go to https://natureanalytics.shinyapps.io/BycatchEstimator/

Installation

The code runs best in RStudio. Before running the code for the first time, install the latest versions of R and RStudio. The output figures and tables can be printed to an html or pdf file using R Markdown and the knitr library. Pdf output is rendered through LaTeX, so to get results in pdf format you must have a LaTeX distribution installed, such as TinyTeX (https://yihui.org/tinytex/). By default, results are printed to an html file, from which you can extract figures and tables.
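Before requesting pdf reports, it can help to confirm that a LaTeX engine is available. The following is a minimal sketch using only base R. The variable report_type is a hypothetical helper, and looking for pdflatex on the PATH is an assumption about how LaTeX is installed on your system.

```r
# Look for a LaTeX engine on the PATH (assumption: pdflatex is the engine used).
has_latex <- nzchar(Sys.which("pdflatex"))

# Fall back to html output when no LaTeX engine is found.
report_type <- if (has_latex) "pdf" else "html"
print(report_type)
```

The resulting value could then be passed as the reportType argument in the function calls shown later in this document.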

You can install the development version of BycatchEstimator from GitHub with:

# install.packages("devtools")
devtools::install_github("ebabcock/BycatchEstimator")

For more help with installation, see the Installation Guide at https://ebabcock.github.io/BycatchEstimator/. Also, see the video tutorial at https://miami.zoom.us/rec/share/ec4dqzeZ4s_fuoVM8wb6B-a5npAwZZfV9tNciZpGaUMcQAYVLrJiWXQo5yXWjfVl.9Sl_3METqgD8O5l0?startTime=1751276745000 Passcode: gZ!tK8!n

If you have used older versions of this tool, see the article on how to adjust your code to use the new version under articles at https://ebabcock.github.io/BycatchEstimator/

Getting started

Due to a complication in coding, the user must load library(MuMIn) in addition to library(BycatchEstimator).

library(BycatchEstimator)
library(MuMIn)

LLSIM Example

To demonstrate its use, example data sets are included in this R package. In this example, we will use data sets from LLSIM (Goodyear 2021).

Logbook data

Simulation of longline fleets with LLSIM was conducted using three idealized fleets as described by Goodyear (2021): a USA-like fleet (fleet 1), a Japan-like fleet (fleet 2) and a Brazil-like fleet (fleet 3). The simulated data span 1990 to 2018 to reflect the approximate period for which observer coverage has been established. The species distribution model (SDM) and longline simulator generate a 3-dimensional distribution of blue marlin and swordfish throughout the Atlantic Ocean based on the habitat preferences of the species. Simulated longline sets are then generated by distributing hooks throughout the habitat of the species, consistent with the distribution, gear, hooks between floats, use of lightsticks and other characteristics of historical longline fishing fleets. While LLSIM initially produces set-level catches, both the logbook and observer databases were allocated to trips (Babcock and Goodyear 2021). Sets were allocated to the same trip if they shared the same gear, month and spatial area (5 x 5 squares). For trips with more than 100 sets, the sets were randomly reallocated to different trips so that the median trip had about 20 sets.

LLSIM_BUM_Example_logbook
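The first step of the trip allocation described above (grouping sets that share gear, month and area) can be sketched in base R as follows. This is an illustration on a hypothetical toy data set, not the code used to build the LLSIM example data. The column names other than BUM and hooks are assumptions, and the random splitting of trips with more than 100 sets is not shown.

```r
set.seed(42)

# Hypothetical set-level data: one row per longline set.
sets <- data.frame(
  gear  = sample(c("A", "B"), 200, replace = TRUE),
  month = sample(1:12, 200, replace = TRUE),
  area  = sample(c("N", "S"), 200, replace = TRUE),
  hooks = 1000,
  BUM   = rpois(200, 0.5)
)

# Sets sharing gear, month and area receive the same trip identifier.
sets$trip <- as.integer(interaction(sets$gear, sets$month, sets$area, drop = TRUE))

# Sum effort and catch within each trip to get trip-level records.
trips <- aggregate(cbind(hooks, BUM) ~ trip + gear + month + area,
                   data = sets, FUN = sum)
head(trips)
```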

Observer program data

Observer program data are generated by passing the trip-level logbook data to an observer program sub-model. The observer sub-model assumes that observer coverage is randomly assigned to trips, with 5% coverage of trips. The entire trip is assumed to be observed.

LLSIM_BUM_Example_observer
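The random assignment of observers to 5% of trips can be illustrated with a short base R sketch. The toy logbook and its column names are hypothetical, and this is not the actual LLSIM observer sub-model, only the sampling idea it is described as using.

```r
set.seed(1)

# Hypothetical trip-level logbook: one row per trip.
logbook <- data.frame(
  trip  = 1:1000,
  Year  = sample(2011:2018, 1000, replace = TRUE),
  hooks = round(runif(1000, 5000, 50000))
)

# Randomly select 5% of trips; every set in a selected trip counts as observed.
n_obs <- round(0.05 * nrow(logbook))
observed_trips <- sample(logbook$trip, n_obs)
observer <- logbook[logbook$trip %in% observed_trips, ]

# Realized coverage as a fraction of trips.
coverage <- nrow(observer) / nrow(logbook)
print(coverage)
```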

Data setup

The first step in bycatch estimation is to set up the input file and review the data. The value returned by bycatchSetup is assigned to an object that will be used in subsequent steps. This first step also produces an html output (default) that is saved to the working directory for the user to review. This html output contains summary figures and tables showing the sample size and presence/absence of the bycatch species across levels of predictor variables, observer coverage levels, and raw trends in CPUE. bycatchSetup also flags missing data or NAs, with warning messages to the console and printed messages in the html output.

setupObj<-bycatchSetup(
  obsdat = droplevels(LLSIM_BUM_Example_observer[LLSIM_BUM_Example_observer$Year>2010 &LLSIM_BUM_Example_observer$fleet==2,]),
  logdat = droplevels(LLSIM_BUM_Example_logbook[LLSIM_BUM_Example_logbook$Year>2010 & LLSIM_BUM_Example_logbook$fleet==2,]),
  yearVar = "Year",
  obsEffort = "hooks",
  logEffort = "hooks",
  obsCatch = c("SWO","BUM")[2], # selecting Blue marlin catch
  catchUnit = "number",
  catchType = "catch",
  logNum = NA,
  sampleUnit = "trips",
  factorVariables = c("Year","area","season"),
  numericVariables = NA,
  EstimateBycatch = TRUE,
  baseDir = getwd(),
  runName = "LLSIMBUMtripExample",
  runDescription = "LLSIMBUM by trip",
  common = c("Swordfish","Blue marlin")[2], # selecting Blue marlin
  sp = c("Xiphias gladius","Makaira nigricans")[2], # selecting Blue marlin
  reportType = "html"
)
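The droplevels calls in the example above matter when a filtering variable is stored as a factor, because subsetting a data frame keeps factor levels that no longer occur in the data. The sketch below shows this on a hypothetical toy data frame; whether fleet is a factor in the LLSIM example data is an assumption.

```r
# Hypothetical observer data with fleet stored as a factor.
obs <- data.frame(
  Year  = 2009:2014,
  fleet = factor(c(1, 2, 2, 3, 2, 2)),
  BUM   = c(0, 1, 0, 2, 1, 0)
)

# Same style of filter as in the bycatchSetup call.
sub <- obs[obs$Year > 2010 & obs$fleet == "2", ]
levels_before <- levels(sub$fleet)   # unused levels are still present

# droplevels removes levels with no remaining rows.
sub <- droplevels(sub)
levels_after <- levels(sub$fleet)

print(levels_before)
print(levels_after)
```

Dropping the empty levels avoids carrying empty categories into summary tables or model terms.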

Design-based estimators

Estimation of bycatch using design-based estimators is done with bycatchDesign, which requires the output of bycatchSetup and other inputs. It uses a stratified ratio estimator or a stratified delta-lognormal estimator, with stratification variables defined by the user. To deal with strata that have no observations or few observations, the user may request pooling and specify the minimum number of sample units needed to avoid pooling. bycatchDesign also produces an html or pdf output with results in the specified directory.

designObj <- bycatchDesign(
  setupObj = setupObj,
  designScenario = "noPool",
  designMethods = c("Ratio", "Delta"),
  designVars = c("Year","area","season"),
  designPooling = FALSE,
  poolTypes=c("adjacent","all","none"),
  pooledVar=c(NA,NA),
  adjacentNum=c(1,NA),
  minStrataUnit = 1,
  reportType="html"
)
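To make the idea of a stratified ratio estimator concrete, here is a minimal base R sketch on hypothetical data. It is the generic textbook form (observed catch per unit effort in each stratum, multiplied by total logbook effort in that stratum), not necessarily the exact calculation inside bycatchDesign, and it omits variance estimation and pooling.

```r
# Hypothetical observed trips, by stratum.
obs <- data.frame(
  stratum = c("A", "A", "B", "B"),
  catch   = c(2, 0, 5, 3),
  hooks   = c(1000, 1000, 2000, 2000)
)

# Hypothetical total logbook effort, by stratum.
log_effort <- data.frame(
  stratum = c("A", "B"),
  hooks   = c(20000, 40000)
)

# Observed catch and effort totals per stratum.
obs_tot <- aggregate(cbind(catch, hooks) ~ stratum, data = obs, FUN = sum)

# Ratio estimator: stratum CPUE times total stratum effort.
est <- merge(obs_tot, log_effort, by = "stratum", suffixes = c("_obs", "_log"))
est$cpue    <- est$catch / est$hooks_obs
est$bycatch <- est$cpue * est$hooks_log

total_bycatch <- sum(est$bycatch)
print(est)
print(total_bycatch)
```

A stratum with no observed trips would have no CPUE estimate in this sketch, which is the situation that pooling is meant to address.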

Model-based estimators

Estimation of bycatch and/or an index of abundance is carried out with the function bycatchFit, which requires an object produced by bycatchSetup. Bycatch is estimated by fitting regression models with user-defined observation error distributions (e.g., delta-lognormal and negative binomial) and predictor variables (e.g., year, season, depth). A best approximating model is identified through a semi-automated model selection process based on the user’s choice of information criterion (AICc, AIC or BIC) and, optionally, cross validation. Once a best approximating model is identified, this standardized CPUE model is used to predict bycatch in every logbook trip, and the predictions are summed across trips to give total bycatch. The user defines the most complex and the simplest model configurations, and the function fits all models in between. Random effects can also be specified, and cross validation is run only if the user opts in. The default method for variance calculation is the simulation method, but other options are available. bycatchFit also produces an html or pdf output with results in the specified directory.

bycatchFit(
  setupObj = setupObj,
  modelScenario = "s1",
  complexModel = formula(y~Year+area),
  simpleModel = formula(y~Year),
  indexModel = formula(y~Year),
  modelTry = c("TMBnbinom1","TMBtweedie"),
  randomEffects = NULL,
  randomEffects2 = NULL,
  selectCriteria = "BIC",
  DoCrossValidation = TRUE,
  CIval = 0.05,
  VarCalc = "Simulate",
  useParallel = TRUE,
  nSims = 100,
  plotValidation = FALSE,
  trueVals = NULL,
  trueCols = NULL,
  reportType = "html"
)
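The workflow described above (fit candidate models to observer data, pick one by an information criterion, predict for every logbook trip, sum by year) can be sketched with base R alone. This is an illustration under stated assumptions, not the package's implementation: it swaps in a Poisson GLM from stats::glm in place of the negative binomial and Tweedie models, uses log(hooks) as an offset, and runs on hypothetical simulated data.

```r
set.seed(7)

# Hypothetical logbook: one row per trip, with effort in hooks.
n_log <- 400
logbook <- data.frame(
  Year  = factor(sample(2011:2014, n_log, replace = TRUE)),
  area  = factor(sample(c("N", "S"), n_log, replace = TRUE)),
  hooks = round(runif(n_log, 5000, 20000))
)

# Hypothetical observer sample: 80 trips with simulated bycatch counts.
observer <- logbook[sample(n_log, 80), ]
rate <- ifelse(observer$area == "N", 2e-4, 5e-4)
observer$y <- rpois(nrow(observer), rate * observer$hooks)

# Simplest and most complex candidate models, with effort as an offset.
fits <- list(
  simple  = glm(y ~ Year + offset(log(hooks)), family = poisson, data = observer),
  complex = glm(y ~ Year + area + offset(log(hooks)), family = poisson, data = observer)
)

# Select the candidate with the lowest BIC.
bic  <- sapply(fits, BIC)
best <- fits[[which.min(bic)]]

# Predict bycatch for every logbook trip, then sum within year.
logbook$pred <- predict(best, newdata = logbook, type = "response")
annual <- aggregate(pred ~ Year, data = logbook, FUN = sum)
print(bic)
print(annual)
```

The sketch gives point estimates only. Uncertainty intervals, cross validation and random effects are left out.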

Optionally reload model results for plotting

The function loadOutputs reads in all the R objects from the analysis, as well as a data frame called allYearEstimates appropriate for plotting with ggplot.

allResults<-loadOutputs(baseDir = getwd(),
                      runName= "LLSIMBUMtripExample",
                      runDate =  Sys.Date(),
                      designScenarios = "noPool",
                      modelScenarios = "s1"
)
library(ggplot2) # needed for ggplot() and the geoms below
# Plot all scenarios together
ggplot(allResults$allYearEstimates,aes(x=Year,y=Total,
                                       ymin=TotalLCI,ymax=TotalUCI,
                      fill=Source,color=Source))+
  geom_line()+
  geom_ribbon(alpha=0.4)+
  theme_bw()

References

Goodyear, C. P. 2021. Development of new model fisheries for simulating longline catch data with LLSIM. ICCAT Collective Volume of Scientific Papers 78(5): 53-62.

Babcock, E. A., and C. P. Goodyear. 2021. Testing a bycatch estimation tool using simulated blue marlin longline data. ICCAT Collective Volume of Scientific Papers 78(5): 179-189.

Babcock, E. A., W. J. Harford, T. Gedamke, D. Soto, and C. P. Goodyear. 2022. Efficacy of a bycatch estimation tool. ICCAT Collective Volume of Scientific Papers 79(5): 304-339.

Babcock, E. A., W. J. Harford, T. Gedamke, S. Anderson, and C. P. Goodyear. 2023. Simulation-Testing Model-Based and Design-Based Bycatch Estimators. ICCAT Collective Volume of Scientific Papers 80(6): 51-79.