[Esip-documentation] definitive data set identification
Nan Galbraith
ngalbraith at whoi.edu
Thu Jan 21 09:27:28 EST 2021
Hi all -
The OceanSITES data management team is hoping to solve a problem
with identifying duplicate or secondary instances of data sets on our
servers. We work with in situ observational data sets, which are often
used by modelers and remote sensing systems. If these users unknowingly
access duplicate copies of data, the duplicates may skew their results by
giving those data points too much weight.
We originally tried to ensure that we had only one copy of any given
data point on our server, but that hasn't proved to be practical. Certain
kinds of computed data sets, like pCO2 and surface fluxes, are more
useful to end users if the files contain copies of the component observed
data variables used in their calculations. These copies may start out at a
different sampling rate from the originals, being gridded or averaged to match the
time base of the related data, or, over time, the original data may change
slightly, as calibrations, algorithms, or clock adjustments are updated.
My question to the documentation cluster is whether you know of
any community standards that identify a given data variable as the
authoritative or 'original' copy. I haven't encountered any kind of
standard for this, but I may not be looking in the right places. I feel
that there may be a solution related to DOIs, but ... it would only be
meaningful if our data users knew about it and were prepared
to use it, and if we acquired a DOI for each observed variable in a
data set.
Any ideas on this would be very welcome; we try, whenever possible, to
adopt existing standards instead of inventing our own one-off solutions.
Thanks in advance -
Nan Galbraith
--
*******************************************************
* Nan Galbraith Information Systems Specialist *
* Upper Ocean Processes Group Mail Stop 29 *
* Woods Hole Oceanographic Institution *
* Woods Hole, MA 02543 (508) 289-2444 *
*******************************************************