[Esip-preserve] Fwd: [Infusion] Suggestion for tech infusion activity vis a vis MEaSUREs

Ruth Duerr rduerr at nsidc.org
Wed Apr 14 11:52:32 EDT 2010


Forgot to select reply-all...

Begin forwarded message:

> From: Ruth Duerr <rduerr at nsidc.org>
> Date: April 14, 2010 9:51:31 AM MDT
> To: Curt Tilmes <Curt.Tilmes at nasa.gov>
> Subject: Re: [Esip-preserve] [Infusion] Suggestion for tech infusion activity vis a vis MEaSUREs
> 
> Hi Curt, 
> 
> I think we need to start with the definitions.  I've added some updates and comments on the wiki to the set you drafted.  The biggest concern I have with your definitions is that NSIDC currently uses conflicting definitions.  What you call a Data Type we call a Data Set, where our Data Sets have Version numbers.
> 
> As for using PURLs or ARKs or Handles, well... I think it will be hard to justify any particular standard...
> 
> - Ruth
> 
> On Apr 14, 2010, at 8:04 AM, Curt Tilmes wrote:
> 
>> On 03/23/2010 02:35 PM, Wilson, Brian D (335G) wrote:
>>> We will need to formulate this consensus recommendation quickly.
>>> 
>>> I suggest two features:
>>> 
>>> 1) Publish the MEaSUREs datasets as a dataset paper in an appropriate
>>> journal so the *dataset* has a reference-able DOI.
>> 
>> In the ESIP Preservation cluster identifiers group, we've begun to
>> distinguish the concept of "Data Type" (what EOS calls an ESDT) from
>> "Dataset", which is a specific version (in EOS parlance, a
>> 'Collection') of that Data Type.
>> 
>> I put some strawman terms and definitions here: (up for discussion!)
>> http://wiki.esipfed.org/index.php/Interagency_Data_Stewardship/Identifiers#Definitions
>> 
>> I think each of those concepts needs a referenceable identifier from
>> which we can construct data citations.
>> 
>> For example, consider ESDT FOO.  It is archived in DAAC MyOrg
>> (CrossRef DOI Org 10.12345), which has archived data from ESDT FOO for
>> collection 1 (a "Closed Data Set") and is currently archiving
>> collection 2 (an "Open Data Set" still being processed from current
>> data).
>> 
>> We need a citation for the general data type:
>> 
>> Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO.
>> 
>> and a citation for each data set (each version of the data type).
>> Rather than registering a new DOI for each new version (collection),
>> I'm inclined to advise reusing the data type DOI:
>> 
>> Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO,
>> Collection 1.
>> 
>> This "datatype DOI" could also be the 'published paper describing the
>> dataset' DOI, but I guess I'd be inclined to have separate DOIs, one
>> for the paper, and one for the datatype.  Then a paper could reference
>> either or both as appropriate to the nature of the use.
>> 
>> 
>> Alternatively, we could register distinct DOIs for each new version:
>> 
>> Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO.1,
>> Collection 1.
>> 
>> For the "Open Data Set" case, I think we must precisely qualify the
>> citation to reference the specific granule membership of the dataset.
>> There are a few ways to do this, but I think the cleanest is a
>> date/time stamp:
>> 
>> Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO,
>> Collection 2, 2010-04-01T14:00:00.
>> 
>>> 2) Serve the dataset granules from permanent (as far as possible) URLs
>>> from the origin sites and the receiving DAACs.  The grabbed real
>>> estate, the root of the URL, should reference MEaSUREs and the
>>> institution, and not contain the name of a computer (or something
>>> else that is dumb).
>>> 
>>> 3) As for truly permanent URIs, I don't know what to say.  I
>>> don't think the handle system, XRIs, or any other system has
>>> gotten traction (a large market share).  This is mostly the fault of
>>> the W3C, which thinks the entire problem has been solved by existing
>>> URLs and URNs.  Hogwash.
>> 
>> I like including both identifiers, datatype and dataset.  I'm leaning
>> toward using DOIs for the datatype and PURLs for the precise data
>> specification and locator:
>> 
>> Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO,
>> Collection 2, http://purl.org/NET/MyOrg/data/FOO/2/2010-04-01T14:00:00.
>> 
>> (Though, as Ruth points out, ARKs are nice too and have their own
>> benefits.)
>> 
>> Curt
>> _______________________________________________
>> Esip-preserve mailing list
>> Esip-preserve at lists.esipfed.org
>> http://www.lists.esipfed.org/mailman/listinfo/esip-preserve
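
The citation variants Curt proposes above (data type DOI alone, data type DOI plus collection, a distinct DOI per collection, a time-stamped open data set, and the DOI plus PURL form) can be sketched as one small formatter. This is only an illustration under stated assumptions, not something from the thread: the class name DataTypeCitation and all of its field and parameter names are hypothetical, and the values (FOO, 10.12345, the purl.org path) are the placeholders from Curt's examples, not registered identifiers.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class DataTypeCitation:
    """Hypothetical helper that formats the citation styles discussed above."""
    author: str
    title: str
    short_name: str   # data type (ESDT) name, e.g. "FOO"
    doi_prefix: str   # registrant prefix, e.g. the placeholder "10.12345"
    purl_base: str = "http://purl.org/NET/MyOrg/data"  # placeholder from the thread

    def doi(self, collection: Optional[int] = None, versioned: bool = False) -> str:
        # Either reuse the data type DOI, or append the collection number
        # to get a distinct DOI per version (e.g. 10.12345/FOO.1).
        suffix = self.short_name
        if versioned and collection is not None:
            suffix = f"{suffix}.{collection}"
        return f"{self.doi_prefix}/{suffix}"

    def cite(self, collection: Optional[int] = None,
             as_of: Optional[datetime] = None,
             versioned_doi: bool = False,
             use_purl: bool = False) -> str:
        fields = [
            f'{self.author}. "{self.title}"',
            self.short_name,
            f"DOI: {self.doi(collection, versioned_doi)}",
        ]
        if collection is not None:
            fields.append(f"Collection {collection}")
            if as_of is not None:
                # The time stamp pins down granule membership of an open data set.
                stamp = as_of.isoformat(timespec="seconds")
                if use_purl:
                    fields.append(
                        f"{self.purl_base}/{self.short_name}/{collection}/{stamp}")
                else:
                    fields.append(stamp)
        return ", ".join(fields) + "."


foo = DataTypeCitation("Smith, John", "Some Earth Science Data", "FOO", "10.12345")
snapshot = datetime(2010, 4, 1, 14, 0, 0)

print(foo.cite())                                  # data type only
print(foo.cite(collection=1))                      # closed data set, shared DOI
print(foo.cite(collection=1, versioned_doi=True))  # closed data set, own DOI
print(foo.cite(collection=2, as_of=snapshot))      # open data set, time-stamped
print(foo.cite(collection=2, as_of=snapshot, use_purl=True))  # DOI plus PURL
```

Keeping the collection and time stamp as separate arguments means the same data type record can produce any of the forms, so the choice between reusing the DOI and registering one per version stays a formatting decision rather than a data-model one.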
