Follow-up from Jan 7 soil meeting: Data dictionary/library hosting, schema publishing

gbathree · January 18, 2020, 11:10pm

@dornawcox @mstenta @sudokita @plawrence <-- there’s more but here’s some folks who may be interested…

Hi all, I wanted to follow up regarding one sub-topic from Jan 7th which has since come up again following Friday’s meeting about backend stuff (Regen, OADA, etc.): shared ontologies and background information. OADA has space to create such things (which is great!), but it’s specifically in the context of their API tool.

In the case of soils, there are several needs in this space:

Shared definitions - of terms
Standard conversions - of units, and also translations of error from one measurement method to another
Shared ontologies - lists of items to choose from, which can be subgrouped (e.g. tillage types, with subgroups by machine used or by farm type)
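
To make the three needs concrete, here is a rough sketch of what one record of each kind might look like once it is machine readable. The field names, the `convert` helper, and the example tillage entries are all illustrative assumptions, not an agreed schema; only the unit factors come from the exact definitions of the pound and the acre.

```python
# Illustrative record shapes only; every field name here is an assumption.
KG_PER_LB = 0.45359237        # exact definition of the international pound
HA_PER_ACRE = 0.40468564224   # exact definition of the international acre

# A shared definition: one term, one agreed meaning.
definition = {
    "term": "bulk density",
    "definition": "Dry mass of soil per unit bulk volume.",
}

# Standard conversions: multiply a value in the "from" unit by the factor.
conversions = {
    ("kg/ha", "lb/ac"): HA_PER_ACRE / KG_PER_LB,
    ("lb/ac", "kg/ha"): KG_PER_LB / HA_PER_ACRE,
}

# A shared ontology: a pick list whose items carry a subgroup.
ontology = {
    "name": "tillage types",
    "group_by": "machine used",
    "items": [
        {"label": "moldboard plowing", "group": "moldboard plow"},
        {"label": "chisel plowing", "group": "chisel plow"},
    ],
}


def convert(value, from_unit, to_unit, table=conversions):
    """Convert a value between two units listed in the conversion table."""
    if from_unit == to_unit:
        return value
    try:
        return value * table[(from_unit, to_unit)]
    except KeyError:
        raise ValueError(f"no conversion from {from_unit} to {to_unit}") from None


print(round(convert(100, "kg/ha", "lb/ac"), 2))
```

Translating error from one measurement method to another would need more than a single factor, so it is left out of this sketch.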

Some of the soil requirements overlap with other applications - like farm management software, or survey software, or modelling software.

Based on our discussions, I’m going to throw out a plan and you all tear it to bits, or propose another (I like that better than hand-waving and hoping someone will jump in).

so how’s about this -->

Ontologies, conversions, and definitions lists should be maintained by self-organizing ‘experts’, in the style of wikipedia.
Experts are trusted to improve and version the information based on the reasonable state of the art. They may continually improve/discuss, but versioning comes with serious and significant thought.
These domain experts are not tech experts… so having them fill in JSON text in gitlab probably won’t work… so…
We use a standard wiki, which:
- has versioning
- has discussion integrated
- has a history and structure people are used to
- is human readable and editable (by normal humans, not programmers)
- is old tech and not likely to be replaced or go unsupported
BUT we enforce structure in the wiki so it is machine scrapable --> So the wiki generates the JSON which is then the object we all call when we want to perform an operation (get an ontology list, or generate a conversion, or get a definition).
This avoids building an entire website to provide both machine-readable output and a pretty UI just to crowd-source or expert-source these diverse lists and functions (for example https://openfarm.cc/ does this for plants and it’s awesome, but it’s a lot of work).
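
A minimal sketch of the "wiki generates the JSON" step, assuming a made-up page layout (a heading, a version line, then one bulleted item per line with its subgroup). The layout, the `parse_page` name, and the sample page are hypothetical; a real wiki would need its own agreed template, and the page text would come from the wiki's export or API rather than a string.

```python
import json
import re

# Hypothetical page layout (an assumption, not an existing wiki convention):
#   == <list name> ==
#   version: <version string>
#   * <item> | group: <subgroup>
PAGE = """\
== Tillage types ==
version: 1.0
* no-till | group: conservation
* strip-till | group: conservation
* moldboard plow | group: conventional
"""

HEADING = re.compile(r"^==\s*(.+?)\s*==$")
VERSION = re.compile(r"^version:\s*(\S+)$")
ITEM = re.compile(r"^\*\s*(.+?)\s*\|\s*group:\s*(.+?)\s*$")


def parse_page(text):
    """Turn one structured wiki page into a dict; reject lines that break the structure."""
    result = {"name": None, "version": None, "items": []}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are allowed anywhere
        if m := HEADING.match(line):
            result["name"] = m.group(1)
        elif m := VERSION.match(line):
            result["version"] = m.group(1)
        elif m := ITEM.match(line):
            result["items"].append({"label": m.group(1), "group": m.group(2)})
        else:
            # Enforcing structure: anything unrecognized is an error, not silently skipped.
            raise ValueError(f"line {number} does not follow the page structure: {line!r}")
    return result


ontology = parse_page(PAGE)
print(json.dumps(ontology, indent=2))
```

Rejecting unrecognized lines is what keeps the page scrapable: an expert's edit either fits the template or gets flagged before a new version of the JSON is published.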

Any other ideas? Maybe just use OADA? Maybe something simpler?

And remember, it’s easy to do this once (a JSON file in gitlab), but we need something that’s actually sustainable and will get investment from the actual users (us and others).