Establish a DuckDB Connection to DATRAS Datasets
This function creates a DuckDB connection to a specified DATRAS dataset type, facilitating access to trawl survey data stored in Parquet format. The dataset type determines which data is loaded from the remote source.
Usage
dr_con(
  type = NULL,
  trim = TRUE,
  url = "https://heima.hafro.is/~einarhj/datras",
  quiet = TRUE
)
Arguments
- type
A character string specifying the dataset type. Available values (tables):
"HH": Haul-level data.
"HL": Catch-at-length data (filterable via the trim option).
"CA": Catch-at-age data (filterable via the trim option).
"species": Species dataset derived from ICES SpecWoRMS.
- trim
Logical. For "HL" or "CA", if TRUE (default), non-essential fields are excluded. Ignored for other datasets.
- url
URL to the Parquet file directory, currently defaulting to "https://heima.hafro.is/~einarhj/datras".
- quiet
Logical. If TRUE (default), suppresses connection warnings and messages.
Dataset Types
This function operates on the following dataset types:
"HH" (Haul-Level Data): Contains information related to individual haul events.
"HL" (Catch-at-Length Data): Records catches categorized by length class.
"CA" (Catch-at-Age Data): Includes age-based biological data (e.g., liver weight, length).
"species" (Species List): Derived from the ICES vocabulary 'SpecWoRMS' and includes species names and related metadata.
Dataset Paths
The dataset is accessed via HTTP/HTTPS paths at a user-defined or default URL
location. The file names are inferred from the provided type parameter
(e.g., a Parquet file named "HH.parquet" for "HH" type data).
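The naming convention described above can be sketched in a few lines of base R. This is only an illustration, not the function's actual internals: the helper name dr_parquet_url is hypothetical, and the sketch assumes the file name is simply the type followed by ".parquet", as in the "HH" example. How the trim option affects the file chosen for "HL" and "CA" is not covered here.

```r
# Hypothetical helper (not part of the package): builds the remote
# Parquet path from a dataset type and a base URL.
dr_parquet_url <- function(type,
                           url = "https://heima.hafro.is/~einarhj/datras") {
  # Drop any trailing slash so exactly one separator is used
  url <- sub("/+$", "", url)
  # Assumed convention: "<url>/<type>.parquet"
  paste0(url, "/", type, ".parquet")
}

hh_path <- dr_parquet_url("HH")
print(hh_path)
```

Under these assumptions, a user-defined url pointing at a mirror of the same directory layout would resolve file names in the same way.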
Unique Identifier (.id)
For dataset types "HH", "HL", and "CA", a unique identifier column (.id)
represents the concatenation of the fields Survey, Year, Quarter, Country,
Platform, Gear, StationName and HaulNumber, separated by ":" (see dr_add_id).
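As a rough illustration of how such an identifier is composed, the sketch below pastes the eight fields together with ":" in base R. It is not the implementation of dr_add_id, and the haul records are made-up values used only to show the resulting format.

```r
# Made-up haul records for illustration only (not real DATRAS data)
hauls <- data.frame(
  Survey      = c("NS-IBTS", "NS-IBTS"),
  Year        = c(2020L, 2020L),
  Quarter     = c(1L, 1L),
  Country     = c("DK", "DK"),
  Platform    = c("26D4", "26D4"),
  Gear        = c("GOV", "GOV"),
  StationName = c("12", "13"),
  HaulNumber  = c(5L, 6L)
)

# Fields that make up the identifier, in the documented order
id_cols <- c("Survey", "Year", "Quarter", "Country",
             "Platform", "Gear", "StationName", "HaulNumber")

# Paste the fields row by row, separated by ":"
hauls$.id <- do.call(paste, c(unname(as.list(hauls[id_cols])), sep = ":"))

print(hauls$.id)
```

Because the identifier is built from the same fields in "HH", "HL", and "CA", it can serve as a join key between those tables.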
Examples
if (FALSE) { # \dontrun{
# Establish connections
dr_con("HH") # Connect to haul-level data.
dr_con("HL", trim = FALSE) # Include all fields for catch-at-length data.
species_data <- dr_con("species")
# Inspect species data
dplyr::glimpse(species_data)
} # }