Converts bomb radiocarbon data into a standardized list object for model estimation with 'cmdstanr'

data_prep(
  df_ref,
  df_unk,
  knot.min = 10,
  knot.adj = 4,
  fixed.knot,
  spline.degree = 3,
  pad.spline = 0.01,
  ll_wt = 5,
  pred.by = list(min.by = 1940, max.by = 2020, inc.by = 1)
)

Arguments

df_ref

a data.frame with two named columns for the reference series data; 1) BY - vector of known formation (birth) years of the reference series; 2) C14 - vector of ∆14C values of the reference series

df_unk

a data.frame with two named columns for samples with unknown true birth year; 1) BY - vector of estimated formation (birth) years; 2) C14 - vector of ∆14C values of the samples

knot.min

minimum number of knots, default is 10.

knot.adj

divisor applied to the number of reference observations to set the number of knots, default is 4, giving number of knots = nrow(df_ref)/knot.adj. Increasing this value decreases the number of knots (more smoothing in the spline).

fixed.knot

(optional) fixed number of knots that overrides number of knots = nrow(df_ref)/knot.adj. Must be an integer written with the L suffix (for example 12L) to be used.

spline.degree

degree of the polynomial spline, default is 3. Must be 0 or greater.

pad.spline

number of years to pad the spline knot locations, default is 0.01.

ll_wt

weighting value of the reference series relative to the sample values for the integrated method, default is 5 (tested to be sufficient unless there are many samples).

pred.by

a vector of formation years, or a named list object, giving the years at which to predict the reference series ∆14C. The named list needs: min.by = start year of prediction sequence, max.by = end year of prediction sequence, inc.by = year increment of the prediction sequence.
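
The knot arguments (knot.min, knot.adj, fixed.knot) interact, and a small sketch may make the precedence easier to see. The helper below is hypothetical and not part of the package: it assumes the ratio nrow(df_ref)/knot.adj is rounded up, that knot.min acts as a floor, and that an unset fixed.knot can be represented as NULL. The actual rounding and checks inside data_prep may differ.

```r
# Hypothetical helper (not in the package): one plausible reading of how
# knot.min, knot.adj and fixed.knot combine into a knot count.
n_knots_sketch <- function(n_ref, knot.min = 10, knot.adj = 4, fixed.knot = NULL) {
  # An integer fixed.knot (written with the L suffix) overrides the ratio
  if (!is.null(fixed.knot) && is.integer(fixed.knot)) {
    return(fixed.knot)
  }
  # Assumption: the ratio is rounded up and never falls below knot.min
  max(knot.min, ceiling(n_ref / knot.adj))
}

n_knots_sketch(100)                    # 25
n_knots_sketch(20)                     # 10, knot.min applies
n_knots_sketch(100, knot.adj = 8)      # 13, larger divisor gives fewer knots
n_knots_sketch(100, fixed.knot = 12L)  # 12, fixed.knot overrides the ratio
```

Under these assumptions a non-integer value such as fixed.knot = 12 is ignored, which is consistent with the requirement that the value be written with the L suffix.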

Value

A named list object containing:

  • flag - character string indicating model type

  • data - a named list object that matches the data block of the Stan model

Examples

#default BY_pred, reference only
df <- data_prep(sim_ref)
#> No values provided in df_unk, skipping validation and estimating reference series only

#default BY_pred, integrated model
df <- data_prep(sim_ref, sim_unk)

#custom BY_pred
df <- data_prep(sim_ref, sim_unk, pred.by = c(1956, 1959, 1988, 1991))

#custom BY_pred with sequence function
df <- data_prep(sim_ref, sim_unk, pred.by = list(min.by=1900, max.by=2020, inc.by=0.5))
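
The two forms of pred.by used in the examples above can be compared with base R alone. The sketch below assumes the named list is expanded with seq(); expand_pred_by is a hypothetical helper written for illustration, not a function exported by the package, and the internal handling in data_prep may differ.

```r
# Hypothetical helper (not in the package): turn pred.by into a vector of years.
expand_pred_by <- function(pred.by) {
  if (is.list(pred.by)) {
    # Named list form: build a regular sequence of prediction years
    seq(from = pred.by$min.by, to = pred.by$max.by, by = pred.by$inc.by)
  } else {
    # Vector form: use the supplied years as given
    pred.by
  }
}

by_default <- expand_pred_by(list(min.by = 1940, max.by = 2020, inc.by = 1))
length(by_default)  # 81 yearly values, 1940 through 2020

by_half <- expand_pred_by(list(min.by = 1900, max.by = 2020, inc.by = 0.5))
length(by_half)     # 241 half-year values, 1900 through 2020

expand_pred_by(c(1956, 1959, 1988, 1991))  # returned unchanged
```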