convert_input.Rd
convert_input is a relatively generic function that applies the function fcn and inserts a record of the result into the database. It is primarily designed for converting meteorological data between formats and can be used on observed data, forecasts, and ensembles of forecasts. To minimize downloading and storing duplicate data, it first checks whether a given file is already in the database before applying fcn.
convert_input(
input.id,
outfolder,
formatname,
mimetype,
site.id,
start_date,
end_date,
pkg,
fcn,
con = con,
host,
write = TRUE,
format.vars,
overwrite = FALSE,
exact.dates = FALSE,
allow.conflicting.dates = TRUE,
insert.new.file = FALSE,
pattern = NULL,
forecast = FALSE,
ensemble = FALSE,
ensemble_name = NULL,
dbparms = NULL,
...
)
input.id: The database id of the input file of the parent of the file being processed here. The parent will have the same data, but in a different format.
outfolder: The directory where files generated by functions called by convert_input will be placed.
formatname: Data-product-specific format name.
mimetype: Data-product-specific file format.
site.id: The id of the site.
start_date: Start date of the data being requested or processed.
end_date: End date of the data being requested or processed.
pkg: The package that the function being executed is in (as a string).
fcn: The function to be executed if records of the output file aren't found in the database (as a string).
con: Database connection object.
host: Named list identifying the machine where conversion should be performed. Currently only host$name and host$Rbinary are used by convert_input, but the whole list is passed to other functions.
write: Write new file records to the database? (logical)
format.vars: Passed on as arguments to fcn.
overwrite: If a file already exists, create a fresh copy? Passed along to fcn. (logical)
exact.dates: Ignore time-span appending and enforce exact start and end dates on the database input file? (logical)
allow.conflicting.dates: Should overlapping years ignore time-span appending and exist as separate input files? (logical)
insert.new.file: Force creation of a new database record even if one already exists? (logical)
pattern: A regular expression, passed to dbfile.input.check, used to match the name of the input file.
forecast: Is the data product a forecast? (logical)
ensemble: An integer representing the number of ensembles, or FALSE if the data product is not an ensemble.
ensemble_name: If convert_input is being called iteratively for each ensemble, ensemble_name contains the identifying name/number for that ensemble.
dbparms: List of parameters to use for opening a database connection.
...: Additional arguments, passed unchanged to fcn.
A list of two BETY IDs (input.id, dbfile.id) identifying a pre-existing file if one was available, or a newly created file if not. Each id may be a vector of ids if the function is processing an entire ensemble at once.
convert_input executes the function fcn in package pkg via PEcAn.remote::remote.execute.R. All additional arguments passed to convert_input (...) are in turn passed along to fcn as arguments. In addition, several named arguments to convert_input are passed along to fcn. The command to execute fcn is built as a string.
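As a rough illustration of what "built as a string" can mean, the sketch below assembles the text of a pkg::fcn(...) call from a named list of arguments and then evaluates it locally. The helper build_call_string is hypothetical and is not part of PEcAn; the real function sends its command to PEcAn.remote::remote.execute.R rather than calling eval, and base::paste stands in for a real conversion function.

```r
# Hypothetical helper (not a PEcAn function): build the text of a call
# such as pkg::fcn(arg1 = value1, arg2 = value2).
build_call_string <- function(pkg, fcn, args) {
  # deparse() turns each R value into the source text that recreates it
  arg_text <- vapply(
    args,
    function(a) paste(deparse(a), collapse = ""),
    character(1)
  )
  arg_strings <- paste0(names(args), " = ", arg_text)
  paste0(pkg, "::", fcn, "(", paste(arg_strings, collapse = ", "), ")")
}

# base::paste is only a stand-in for a conversion function
cmd <- build_call_string(
  pkg = "base",
  fcn = "paste",
  args = list(x = "met", y = 2004L, sep = "_")
)

# Evaluate the string locally; convert_input would hand it to a remote host
result <- eval(parse(text = cmd))
```

Building the call as text is what allows the same command to be run in a separate R process on another machine, at the cost that every argument must survive the round trip through deparse.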
There are two kinds of database records (in different tables) that represent a given data file in the file system. An input file contains information about the contents of the data file. A dbfile contains machine-specific information for a given input file, such as the file path. Because duplicates of data files for a given input can be on multiple different machines, there can be more than one dbfile for a given input file.
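The one-to-many relationship between input and dbfile records can be pictured with two small data frames. This is a simplified sketch: the column names, ids, machine names, and paths are invented for illustration and do not reproduce the actual BETY schema.

```r
# Illustrative stand-ins for the two tables; columns are assumptions
inputs <- data.frame(
  id = c(1, 2),
  site_id = c(772, 772),
  format = c("CF Meteorology", "model-specific"),
  stringsAsFactors = FALSE
)

dbfiles <- data.frame(
  id = c(10, 11, 12),
  input_id = c(1, 1, 2),           # which input this file is a copy of
  machine = c("hostA", "hostB", "hostA"),
  file_path = c("/data/hostA/met", "/scratch/hostB/met", "/data/hostA/model_met"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: is there a copy of this input on this machine?
find_local_copy <- function(input_id, machine) {
  dbfiles[dbfiles$input_id == input_id & dbfiles$machine == machine, ]
}

copies_of_input_1 <- dbfiles[dbfiles$input_id == 1, ]  # one row per machine
local_copy <- find_local_copy(1, "hostB")
```

Here input 1 has two dbfile rows because the same data sit on two machines, while a lookup for input 2 on hostB returns no rows.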
By default, convert_input tries to optimize the download of most data products by downloading only the years of data not present on the current machine. (For example, if files for 2004-2008 for a given data product exist on this machine and the user requests 2006-2010, the function will only download data for 2009 and 2010.) In year-long data files, each year exists as a separate file. The database input file contains records of the bounds of the range stored by those years. The data optimization can be turned off by overriding the default values for exact.dates and allow.conflicting.dates.
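The 2004-2008 versus 2006-2010 example can be worked through with a few lines of base R. This is only a sketch of the idea using whole years; missing_years is a hypothetical helper, and the real function works from dates and database records rather than bare integers.

```r
# Hypothetical helper: which requested years are not already on this machine?
missing_years <- function(existing_start, existing_end,
                          requested_start, requested_end) {
  requested <- seq(requested_start, requested_end)
  existing <- seq(existing_start, existing_end)
  setdiff(requested, existing)
}

# Files for 2004-2008 exist locally; the user requests 2006-2010
to_download <- missing_years(2004, 2008, 2006, 2010)

# After appending, the input record would span the union of both ranges
merged_range <- range(c(2004:2008, 2006:2010))
```

In this sketch to_download holds the two years that still need fetching, and merged_range gives the bounds that the appended input record would cover.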
If the flag forecast is TRUE, convert_input treats data as if it were forecast data. Forecast data do not undergo time span appending.
convert_input has the capability to handle ensembles of met data. If ensemble is an integer greater than 1, convert_input checks the database for records of all ensemble members, and calls fcn if at least one is missing. convert_input assumes that fcn will return records for all ensembles. convert_input can also be called iteratively for each ensemble member. In this case ensemble_name contains the unique identifying name/number of the ensemble to be processed.
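The all-or-nothing ensemble check described above might look roughly like the following. The names convert_ensemble and fake_fcn are hypothetical, the existing members are passed in as a plain vector instead of being looked up in the database, and fake_fcn simply returns member numbers where a real fcn would produce files.

```r
# Hypothetical sketch: run fcn only if some ensemble member has no record
convert_ensemble <- function(n_members, existing_members, fcn) {
  wanted <- seq_len(n_members)
  missing <- setdiff(wanted, existing_members)
  if (length(missing) == 0) {
    # Every member is already recorded, so nothing needs to be generated
    return(list(ran_fcn = FALSE, members = wanted))
  }
  # At least one member is missing: fcn is assumed to return all members
  list(ran_fcn = TRUE, members = fcn(n_members))
}

# Stand-in for a conversion function that produces every member
fake_fcn <- function(n) seq_len(n)

complete <- convert_ensemble(3, c(1, 2, 3), fake_fcn)  # fcn is skipped
partial <- convert_ensemble(3, c(1, 3), fake_fcn)      # member 2 missing
```

Note that a single missing member triggers regeneration of the whole ensemble, which matches the assumption that fcn returns records for all members at once.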