Type: Package
Title: Easily Download Data and Metadata from 'DataONE'
Version: 0.3.1
Date: 2024-08-06
Maintainer: Julien Brun <julien.brun@alumni.duke.edu>
Description: A set of tools to foster the development of reproducible analytical workflows by simplifying the download of data and metadata from 'DataONE' (https://www.dataone.org) and easily importing this information into R.
License: Apache License (== 2.0)
Encoding: UTF-8
Language: en-US
RoxygenNote: 7.3.2
SystemRequirements: Mac OSX: redland (>= 1.0.14) ; Linux: librdf0 (>= 1.0.14), librdf0-dev (>= 1.0.14)
URL: https://nceas.github.io/metajam/, https://github.com/NCEAS/metajam
BugReports: https://github.com/NCEAS/metajam/issues
Depends: R (≥ 3.6.0)
Imports: dataone, dplyr, EML, emld, lubridate, purrr, readr, stats, stringr, tibble, tidyr, XML
Suggests: knitr, rmarkdown, testthat
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2024-08-16 17:28:05 UTC; jb160-local
Author: Julien Brun [cre, aut], Irene Steves [aut] (https://github.com/isteves), Mitchell Maier [aut], Kristen Peach [aut], Nicholas Lyon [aut] (https://njlyon0.github.io/), Nathan Hwangbo [ctb], Derek Strong [ctb], Colin Smith [ctb], Regents of the University of California [cph]
Repository: CRAN
Date/Publication: 2024-08-16 17:50:02 UTC

metajam: Easily Download Data and Metadata from 'DataONE'

Description

A set of tools to foster the development of reproducible analytical workflows by simplifying the download of data and metadata from 'DataONE' (https://www.dataone.org) and easily importing this information into R.

Author(s)

Maintainer: Julien Brun julien.brun@alumni.duke.edu (ORCID)

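Authors:

Irene Steves, Mitchell Maier, Kristen Peach, Nicholas Lyon

Other contributors:

Nathan Hwangbo [contributor], Derek Strong [contributor], Colin Smith [contributor], Regents of the University of California [copyright holder]

See Also

Useful links:

https://nceas.github.io/metajam/, https://github.com/NCEAS/metajam

Report bugs at https://github.com/NCEAS/metajam/issues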


Check PID version

Description

This function takes an identifier and checks to see if it has been obsoleted.

Usage

check_version(pid, formatType = NULL)

Arguments

pid

(character) The persistent identifier of a data, metadata, or resource map object on a DataONE member node.

formatType

(character) Optional. The format type to return (one of 'data', 'metadata', or 'resource').

Value

(data.frame) A data frame of object version PIDs and related information.

Examples

## Not run: 
# Most data URLs and identifiers work
check_version("https://cn.dataone.org/cn/v2/resolve/urn:uuid:a2834e3e-f453-4c2b-8343-99477662b570")
check_version("doi:10.18739/A2ZF6M")

# Specify a formatType (data, metadata, or resource)
check_version("doi:10.18739/A2ZF6M", formatType = "metadata")

# Returns a warning if the identifier has been obsoleted
check_version("doi:10.18739/A2HF7Z", formatType = "metadata")

# Returns an error if no matching identifiers are found
check_version("a_test_pid")

# Returns a warning if several identifiers are returned
check_version("10.18739/A2057CR99")

## End(Not run)
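
Because check_version() reports obsoleted or ambiguous identifiers through warnings, a script may want to record those warnings rather than let them scroll past. The following is a minimal sketch in base R. collect_warnings() is a hypothetical helper, not part of metajam, and the commented-out call assumes network access to DataONE.

```r
# Hypothetical helper (not part of metajam): run a function, capture any
# warnings it raises, and return them alongside the result.
collect_warnings <- function(f) {
  msgs <- character()
  value <- withCallingHandlers(
    f(),
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))  # record the message
      invokeRestart("muffleWarning")          # then resume execution
    }
  )
  list(value = value, warnings = msgs)
}

# Offline stand-in for a call that warns about an obsoleted identifier.
demo <- collect_warnings(function() {
  warning("identifier has been obsoleted")
  data.frame(pid = "example_pid")
})

# With metajam installed and network access, the same wrapper could be used as:
# res <- collect_warnings(function() {
#   metajam::check_version("doi:10.18739/A2HF7Z", formatType = "metadata")
# })
```

The returned list keeps the data frame in value and the warning text in warnings, so a workflow can decide later whether an obsoleted identifier should stop the analysis.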

Download data and metadata from a dataset that uses EML metadata.

Description

This is an internal function called by download_d1_data(). It is not exported.

Usage

download_EML_data(data_url, meta_obj, meta_id, data_id, metadata_nodes, path)

Arguments

data_url

(character) An identifier or URL for a DataONE object to download.

meta_obj

(character) A metadata object produced by download_d1_data. Its format differs from that of the metadata object required by the analogous ISO function.

meta_id

(character) A metadata identifier produced by download_d1_data.

data_id

(character) A data identifier produced by download_d1_data.

metadata_nodes

(character) The member nodes where this metadata is stored, produced by download_d1_data.

path

(character) Path to a directory to download data to.


Download data and metadata from a dataset that uses ISO metadata.

Description

This is an internal function called by download_d1_data(). It is not exported.

Usage

download_ISO_data(meta_raw, meta_obj, meta_id, data_id, metadata_nodes, path)

Arguments

meta_raw

(character) A raw metadata object produced by download_d1_data.

meta_obj

(character) A metadata object produced by download_d1_data.

meta_id

(character) A metadata identifier produced by download_d1_data.

data_id

(character) A data identifier produced by download_d1_data.

metadata_nodes

(character) The member nodes where this metadata is stored, produced by download_d1_data.

path

(character) Path to a directory to download data to.


Download data and metadata from DataONE

Description

Downloads a data object from DataONE along with metadata.

Usage

download_d1_data(data_url, path)

Arguments

data_url

(character) An identifier or URL for a DataONE object to download.

path

(character) Path to a directory to download data to.

Value

(character) Path where data is downloaded to.

See Also

read_d1_files(), download_d1_data_pkg()

Examples

## Not run: 
download_d1_data("urn:uuid:a2834e3e-f453-4c2b-8343-99477662b570", path = file.path("."))
download_d1_data(
  "https://cn.dataone.org/cn/v2/resolve/urn:uuid:a2834e3e-f453-4c2b-8343-99477662b570",
  path = file.path(".")
)

## End(Not run)
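
Since download_d1_data() returns the path it downloaded to, that value can be passed straight to read_d1_files(). Here is a minimal sketch, assuming metajam is installed and DataONE is reachable. download_and_read() and its dir argument are illustrative names, not part of the package.

```r
# Hypothetical convenience wrapper (not part of metajam).
download_and_read <- function(data_url, dir = file.path(tempdir(), "d1_data")) {
  # Make sure the target directory exists before downloading into it.
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  # download_d1_data() returns the path where the files were written ...
  folder <- metajam::download_d1_data(data_url, path = dir)
  # ... which read_d1_files() takes as its folder_path argument.
  metajam::read_d1_files(folder)
}

# Not run here because it needs network access:
# soil <- download_and_read("urn:uuid:a2834e3e-f453-4c2b-8343-99477662b570")
```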

Download all data and metadata of a data package from DataONE

Description

Downloads all the data objects of a data package from DataONE along with metadata.

Usage

download_d1_data_pkg(meta_obj, path)

Arguments

meta_obj

(character) A DOI or metadata object PID for a DataONE package to download.

path

(character) Path to a directory to download data to.

Value

(list) Paths where data are downloaded to.

See Also

read_d1_files(), download_d1_data()

Examples

## Not run: 
download_d1_data_pkg("doi:10.18739/A2028W", ".")
download_d1_data_pkg("https://doi.org/10.18739/A2028W", ".")

## End(Not run)
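
The list of paths returned by download_d1_data_pkg() lends itself to reading every object in one pass. The sketch below is only an illustration. read_d1_package() is a hypothetical helper, and its downloader and reader arguments exist only so the logic can be tried offline with stand-in functions. It also assumes each list element is a single folder path.

```r
# Hypothetical helper (not part of metajam). The default arguments are only
# evaluated when used, so metajam is needed only if no stand-ins are supplied.
read_d1_package <- function(meta_obj, dir = tempdir(),
                            downloader = metajam::download_d1_data_pkg,
                            reader = metajam::read_d1_files) {
  folders <- downloader(meta_obj, dir)  # list of download paths
  lapply(folders, reader)               # one result per downloaded folder
}

# Offline demonstration with stand-in functions.
fake_download <- function(meta_obj, path) {
  list(file.path(path, "obj_a"), file.path(path, "obj_b"))
}
fake_read <- function(folder_path) paste("read", basename(folder_path))

demo_pkg <- read_d1_package("example_doi",
                            downloader = fake_download,
                            reader = fake_read)

# With metajam installed and network access:
# pkg <- read_d1_package("doi:10.18739/A2028W", ".")
```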

Read data and metadata based on 'download_d1_data()' file structure

Description

Reads data along with metadata into your R environment based on the file structure created by download_d1_data().

Usage

read_d1_files(folder_path, fnc = "read_csv", ...)

Arguments

folder_path

(character) Path to a directory where data and metadata are located.

fnc

(character) Name of the function to be used to read the data (default is readr::read_csv()).

...

Parameters to pass into the function specified in 'fnc'.

Value

(list) Named list containing data and metadata as data frames.

See Also

download_d1_data(), download_d1_data_pkg()

Examples

data_folder <- system.file(file.path("extdata", "test_data"), package = "metajam")
soil_moist_data <- read_d1_files(data_folder)

# You can specify the function you would like to use to read the file and pass parameters
soil_moist_data_skipped <- read_d1_files(data_folder, "read.csv",
                                         skip = 8, stringsAsFactors = FALSE)
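
One way to picture how a function name passed as a string in fnc, together with extra arguments in ..., can reach the reader is base R's match.fun(). This is a sketch under that assumption, not metajam's actual implementation. read_with() is a hypothetical name, and it defaults to read.csv so that it runs without readr.

```r
# Hypothetical stand-in for the dispatch idea (not metajam's code).
read_with <- function(file, fnc = "read.csv", ...) {
  reader <- match.fun(fnc)  # resolve the function from its name
  reader(file, ...)         # forward the extra arguments unchanged
}

# Build a small file with one leading line that has to be skipped.
tmp <- tempfile(fileext = ".csv")
writeLines(c("# comment line", "site,moisture", "A,0.21", "B,0.34"), tmp)

# skip and stringsAsFactors travel through ... to read.csv().
soil <- read_with(tmp, "read.csv", skip = 1, stringsAsFactors = FALSE)
```

Passing the reader by name keeps the call signature simple while still letting the caller swap in any function that accepts a file path as its first argument.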

Get tabular metadata

Description

This function takes EML metadata (an emld object, the path to an EML (.xml) file, or a raw EML object) and returns a data frame.

Usage

tabularize_eml(eml, full = FALSE)

Arguments

eml

An emld class object, the path to an EML (.xml) metadata file, or a raw EML object.

full

(logical) Returns the most commonly used metadata fields by default. If full = TRUE is specified, the full set of metadata fields is returned.

Value

(data.frame) A data frame of selected EML values.

Examples

eml <- system.file("extdata", "test_data", "SoilMois2012_2017__full_metadata.xml",
                   package = "metajam")
tabularize_eml(eml)