[Esip-preserve] [Esip-citationguidelines] Distributed Web, identifiers, and stewardship

Parsons, Mark parsom3 at rpi.edu
Thu Oct 10 15:32:14 EDT 2019


Kelsey should chime in, but here is my take.

It’s not a scaling issue. Key validation is only necessary in formal situations. So yes, it would be used in repo-to-repo transfers, legal cases, and things like that.

I think it would also be used by researchers when they want to be sure. My limited ethnographic observation suggests that scientists can be pretty fast and loose with their data analysis UNTIL they get to final publication. Then they make sure they get the best, most-authoritative data for their study. Today that means going to a trusted URL. Tomorrow it could mean using digital signatures.

Good scientists focus on the details. They may not cite their data, but they do make an effort to make sure they have the best possible data. So when they do finally cite their data, we can be sure they cite the authoritative data :-)

cheers,

-m.


On 10 Oct 2019, at 12:58, Matthew Mayernik <mayernik at UCAR.EDU> wrote:

Thanks Mark for posting the slides. I'm sorry to have missed the presentation. I have a question related to a specific use case: A random university researcher/student downloads data either from a data repository site, or from a peer-to-peer system as described in Kelsey & Rob's slides.

My comment/question for this case is that I'm not sure how the key-based validation approach would scale beyond repository-to-repository data exchanges. In other words, I can imagine that repositories with appropriate knowledge and technology could use this approach to ensure validity of data exchanges, but I don't think we can expect the typical data user to do anything complex (or even not complex). It's been an ongoing struggle to get people to cite data via DOIs, which is a process (citation) that users are familiar with.

I guess my point is that for the specific use case given above, I think any algorithmic/computational approach to validation will have to be something that just happens for users. I don't see how we can expect people to actively do a validation step themselves. I could be off base either with my comment or my use case, so other input is appreciated.

Matt

On Thu, Oct 10, 2019 at 11:54 AM Parsons, Mark via Esip-citationguidelines <esip-citationguidelines at lists.esipfed.org> wrote:
hi all,

Sorry for the cross-posting but I thought this relevant to both stewardship and the citation cluster.

Kelsey Breseman, the new Stewardship Committee co-chair, ably supported by Rob Brackett, gave an excellent webinar with an overview of how the Distributed Web works and some of the remaining challenges, many of which are stewardship related.

Slides are here: https://docs.google.com/presentation/d/1R4OXvaMYCG_pGlxRoxJ228OADgV0ZhgKcYEXFFP9Ulk/edit?usp=sharing
and the presentation was recorded.

The issue of validity stood out to me. I think this is a concern that we need to consider more closely in the citation cluster. It has long been suggested that one should use both authority-based and content-based identifiers to specifically reference an object (Altman 2007), and we have seen some of the work that the folks at INREA have done using content-based IDs for some software citation concerns and authority-based IDs for others. We might want to consider developing some recommended practice in this area. For example, I think key-based addressing could be a useful way to help with provenance tracking and even the impossible issue of “scientifically equivalent”. It would also offer a technically authoritative layer to human identifiers like ORCID.
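
To make the pairing of identifiers concrete, here is a minimal sketch in Python, offered as an illustration rather than a recommended practice. It assumes a SHA-256 digest serves as the content-based identifier; the DOI, the sample bytes, and the function names are hypothetical placeholders and are not taken from the slides or from any of the distributed web protocols.

```python
import hashlib
import hmac


def content_id(data: bytes) -> str:
    """Return a content-based identifier: the SHA-256 digest of the bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def matches_citation(data: bytes, cited_content_id: str) -> bool:
    """Check that the bytes in hand are the bytes that were cited."""
    return hmac.compare_digest(content_id(data), cited_content_id)


# Hypothetical dataset and citation record, for illustration only.
dataset = b"station,temp_c\nA,12.5\nB,13.1\n"
citation = {
    "authority_id": "doi:10.0000/example",  # placeholder, not a real DOI
    "content_id": content_id(dataset),      # derived from the bytes themselves
}

# A copy fetched from any source (repository or peer) can be checked
# against the citation without trusting where it came from.
print(matches_citation(dataset, citation["content_id"]))
print(matches_citation(dataset + b"C,14.0\n", citation["content_id"]))
```

The authority-based identifier says who vouches for the object and where to find it; the content-based identifier says exactly which bytes were meant. A check like this only confirms byte-for-byte identity, so it does not address the “scientifically equivalent” question on its own.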

There are also issues like how to handle data deprecation, granularity, redirects, high-volume and high-latency data, etc. Apparently the different systems have different approaches to these issues, and it is still early days. Ultimately, however, it is reasonable to assume that at least one of the distributed web protocols will be added to the stack of protocols we already use. It would be good to make sure Earth and environmental science stewardship issues are considered in addressing all these issues.

cheers,

-m.


_______________________________________________
Esip-citationguidelines mailing list
Esip-citationguidelines at lists.esipfed.org
https://lists.esipfed.org/mailman/listinfo/esip-citationguidelines
