Human and machine readable methods

@dorn @sudokita @mstenta

Hey guys, I’m starting to develop methods standards for the Real Food Campaign. My goal is that each method should be:

  1. human readable
  2. machine readable (like… JSON format)
  3. fully available (i.e., if you can make the method an ODK form, you should, so that it can be used directly without extra lookup or work)
  4. flexible (so if you want to add additional information, variables, etc. you can do so without breaking the basic structure).

Here’s my first shot. It depends on using a Markdown --> JSON converter so that it’s human and machine readable. This is just a rough outline for a soils method application, but the structure is what matters.
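To make the Markdown --> JSON idea concrete, here is a minimal sketch using only the Python standard library. The Markdown layout it expects is an assumption for illustration, not the actual draft method format: `#` headings open nested sections and `- key: value` bullets become fields. The function name `markdown_to_dict` and the sample field names are hypothetical.

```python
import json
import re


def markdown_to_dict(text):
    """Convert a small Markdown outline into nested dicts.

    Assumed (hypothetical) layout: '#' headings open nested sections,
    and '- key: value' bullets become fields of the current section.
    """
    root = {}
    stack = [(0, root)]  # (heading level, dict for that section)
    for line in text.splitlines():
        heading = re.match(r"^(#+)\s+(.*)$", line)
        bullet = re.match(r"^\s*-\s+([^:]+):\s*(.*)$", line)
        if heading:
            level = len(heading.group(1))
            # Close any open sections at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            section = {}
            stack[-1][1][heading.group(2).strip()] = section
            stack.append((level, section))
        elif bullet:
            # Attach the field to the most recently opened section.
            stack[-1][1][bullet.group(1).strip()] = bullet.group(2).strip()
    return root


# Made-up sample method, only to exercise the converter.
example = """\
# Soil sampling
- depth_cm: 15
## Equipment
- tool: soil probe
## Steps
- step_1: Collect 10 cores per field
"""

method = markdown_to_dict(example)
print(json.dumps(method, indent=2))
```

All values come out as strings in this sketch; a real converter would also need rules for numbers, lists, and free-text paragraphs.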

What do you think? Can you break it (find methods that won’t work, won’t fit, won’t extend using this structure)? Is it too hard to read? … … …

Greg

Hey Greg - I recall a brief conversation about this when you and I were walking somewhere in Tampa, specifically about JSON and human-readability I believe.

I recommended using YAML. It’s been gaining a lot of adoption for config files - it’s become the standard method of shipping config in Drupal 8, and it’s also used in a lot of other major software platforms like Docker.

The two biggest benefits of YAML over JSON, in my opinion, are that it doesn’t use braces, and it can have comments.

Also, YAML (as of version 1.2) is technically a superset of JSON, so a YAML parser can read JSON (but not vice versa).

I am planning to use YAML when I build out farmOS’s crop/variety database features (for defining properties of different crops/varieties, and sharing them between sites).

Here’s a good breakdown of the difference between JSON and YAML: https://stackoverflow.com/questions/1726802/what-is-the-difference-between-yaml-and-json-when-to-prefer-one-over-the-other

We also had a good discussion about this in the OpenFarm issue queue a year ago: https://github.com/openfarmcc/Crops/issues/6

And if you’re really curious, here’s one of the threads on Drupal.org discussing the pros/cons of different formats which eventually led to the selection of YAML for config management in Drupal 8: https://groups.drupal.org/node/159044

Another thing you might want to look at is the Modus Project, which is used by a lot of the big soil and water testing labs in the Midwest. I’m looking at it as part of a grant I got with the Vermont NRCS to explore standardized data formats for sharing soil test data electronically.

We came across the Modus Project after getting the grant - and are just starting to review it. They’ve done a lot of work outlining the different types of tests that exist. It’s all in a big spreadsheet.

The data format itself is XML - which I don’t love, but they’ve done a lot of work and it’s actively being used.

Not sure if that helps at all, but something to be aware of… I’ll dig up some more specific links for you… one sec…

I find that Bitbucket site very hard to navigate - it always takes me a while to find what I’m looking for. :)

Here’s the list of spreadsheet files they provide that outline various things: https://bitbucket.org/modus/analysis-nomenclature/src

Take a look at Soil Analysis Nomenclature Modus.xlsx

Ugh… ODK is also XML. XML is not great, but there are converters to more human readable things out there I suppose.

Yeah, adding comments is a huge win, I love that.

I’m totally game for YAML, so long as we can automatically convert Markdown to YAML… it seems like there are a few converters, so we should be OK (Mike, can you confirm? It seems there is some stuff in npm for converting YAML).

Checked out Modus… sheesh, not really clear. No readme, no directions. Information is stored in xlsx tables. And the depth of the methods is lacking. Organizationally I’m not super impressed, but it does seem to be used as there are commits as recent as 2017. I can see how this would be useful as an inter-lab standard among labs who only do soil testing.

I’m hoping for something that’s more vertically integrated. I really want to include the ODK file, which walks through the method piece by piece, including required data inputs and calculations. I feel like the key is having everything in one place, fully coherent, clear and usable, without having to follow links and hit paywalls or confusing scientific jargon.

I’m sure there are, but I’ve never worked with any of them.

Why do you need to convert from Markdown in the first place? I think I missed that part of this. I thought you were just using MD for human-readability. YAML would meet that need I think, so you could avoid the need for conversion. Or did you need MD for something else?

Right, good point :) Just need something human readable, so maybe YAML just solves the problem. Drrr…

Cool. After I wrote that I thought, “Hmm it might be nice to have a YAML to Markdown converter though”…

That would allow you to easily publish the YAML files to a website, perhaps?
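As a rough sketch of what a YAML to Markdown converter could look like, the Python below renders an already-parsed method (nested dicts) as Markdown headings and bullets. It assumes the YAML file has been loaded into a dict by some YAML parser first; that step is left out so the example runs with the standard library alone. The function name `dict_to_markdown` and the sample data are hypothetical.

```python
def dict_to_markdown(data, level=1):
    """Render nested dicts as Markdown lines.

    Nested dicts become headings (one '#' per nesting level);
    everything else becomes a '- key: value' bullet.
    """
    lines = []
    # Emit plain fields first so they sit directly under their heading.
    for key, value in data.items():
        if not isinstance(value, dict):
            lines.append(f"- {key}: {value}")
    # Then emit each nested section under a deeper heading.
    for key, value in data.items():
        if isinstance(value, dict):
            lines.append("")
            lines.append("#" * level + f" {key}")
            lines.extend(dict_to_markdown(value, level + 1))
    return lines


# Made-up method data, standing in for a parsed YAML file.
method = {
    "Soil sampling": {
        "depth_cm": 15,
        "Equipment": {"tool": "soil probe"},
    }
}

markdown = "\n".join(dict_to_markdown(method)).strip()
print(markdown)
```

The output could then be pasted into a forum post or published on a website, while the YAML file stays the source of truth.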

Not an immediate need, but might be a cool by-product. :)

Yeah, actually, now that I’m fooling with it, it does kind of matter. YAML is more human readable, but it’s still not particularly pretty. Markdown is pretty right out of the box. My goal is to use a sticky wiki right in the forum to manage the actual original method file. The sticky wiki will track changes, so people can see changes over time and even revert. Obviously we need to use GitHub for long-term storage, but a lot of people aren’t familiar enough with it to use it, so it’s a tech barrier.

So my thought is to use the sticky wiki version as the in-process version that everyone can easily touch, and GitHub for versions (major versions and revisions), which I’ll probably manage pushes to.

As for using YAML, the problem is that you can’t use tabs in the wiki :\ That makes YAML harder to use, and the hierarchies become impossible to see (you’d need to type 4 spaces manually, but in Markdown that doesn’t display anyway)… I don’t know. It’s also bad to have an extra transition (Markdown to YAML), since that adds complexity.

Anyway, that’s the thought…