Soil standards comparison

Hi all, we’ve been looking into both schemas and reference lists (or ontologies) which are standardized to use when storing soil lab test (or in-field test) data. This post is just a summary of investigation for posterity’s sake and to see if folks in the know (@jherrick @DanT @kanedan29 @mstenta @TomWatson many others) have any feedback.

Application details

  • Support tools to send samples to a lab, return data from a lab, or even ingest a spreadsheet from a lab into a standard format
  • Support ability to distinguish between comparable test methods (include the exact lab method (LOI carbon), as well as measurable outputs (g carbon / kg soil)).
  • Include a clear schema for standard creation of entries and storage across services and labs.
  • Example applications: storage of soil data in farm information management systems (like farmOS), movement of data to / from labs, use of data by end-use application like benchmarking, modeling, etc.
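To make the first two requirements concrete, here is a minimal sketch of ingesting a lab spreadsheet into records that keep the method and unit alongside each value. Everything in it is an assumption for illustration: the `SoilResult` fields, the column headers, and the `COLUMN_MAP` entries are hypothetical and are not taken from MODUS or any other spec.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class SoilResult:
    sample_id: str
    analyte: str   # what was measured, e.g. "carbon"
    method: str    # how it was measured, e.g. "LOI"
    value: float
    unit: str      # e.g. "g/kg"


# Hypothetical mapping from one lab's column headers to (analyte, method, unit).
# A real mapping would point at entries in a shared, standardized term list.
COLUMN_MAP = {
    "C_LOI_g_kg": ("carbon", "LOI", "g/kg"),
    "C_DC_g_kg": ("carbon", "dry combustion", "g/kg"),
}


def ingest(csv_text: str) -> list[SoilResult]:
    """Turn a lab spreadsheet export into one record per measured value."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, (analyte, method, unit) in COLUMN_MAP.items():
            raw = (row.get(column) or "").strip()
            if not raw:
                continue  # the lab did not run this test for this sample
            results.append(
                SoilResult(row["sample_id"], analyte, method, float(raw), unit)
            )
    return results


# Made-up spreadsheet content: two samples, each tested with a different method.
sheet = "sample_id,C_LOI_g_kg,C_DC_g_kg\nA1,21.5,\nA2,,18.2\n"
records = ingest(sheet)
```

The point of the sketch is that the method travels with the value, so a downstream tool never has to guess whether a carbon number came from LOI or dry combustion.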

So far, MODUS is the only spec really designed to handle intake/export for labs.

Left to do

  • follow up with FarmLab in Australia who may know of other standards
  • follow up with Vic to see if NEON has a more standard schema buried in there
  • follow up with Jason who maintains MODUS to get more details on the spec and its use.

Review of sources

MODUS
Link: https://bitbucket.org/modus/

Modus has a schema (XML) and a list of terms. Overall, both are very practical and are still kept up to date. Based on discussions with Jason (who maintains the current lists), a large number of large-scale labs use the Modus spec to transfer bulk data in and out, primarily with companies who do large-scale sampling.

Pluses:

  • very nicely curated and well documented list of terms that clearly separates entries by comparability (soil carbon has several entries based on the method, for example - this is critical if we want to benchmark like against like).
  • simple and well designed schema for “Result” (returning data) and “Submit” (pushing data).
  • Is still actively maintained
  • Is easy to push updates to (send to Jason, and he updates the bitbucket repo!)
  • Overall, a good design fit for use by things we’re interested in (soilstack push/pull to labs, farmOS ingest lab data, coffee shop benchmark lab data, etc.).

Minuses:

  • US specific - unclear if it’s used internationally
  • XML isn’t the greatest modern format
  • Not maintained by a community (just 1 person really at this point)
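As an illustration of why the method-specific term entries matter for benchmarking like against like, here is a small sketch that only summarizes results sharing the same analyte, method, and unit. The tuples and values are made up for the example, and the grouping key is my own assumption rather than anything defined by Modus.

```python
from collections import defaultdict
from statistics import median

# Hypothetical results as (analyte, method, unit, value) tuples.
results = [
    ("carbon", "LOI", "g/kg", 21.5),
    ("carbon", "LOI", "g/kg", 24.0),
    ("carbon", "dry combustion", "g/kg", 18.2),
    ("carbon", "dry combustion", "g/kg", 17.0),
    ("carbon", "dry combustion", "g/kg", 19.1),
]


def benchmark(rows):
    """Median per (analyte, method, unit), so different methods are never pooled."""
    groups = defaultdict(list)
    for analyte, method, unit, value in rows:
        groups[(analyte, method, unit)].append(value)
    return {key: median(values) for key, values in groups.items()}


medians = benchmark(results)
```

If the term list had a single "soil carbon" entry, all five values would land in one group and the benchmark would mix two methods that are not directly comparable.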

NEON Soil Archive

Link: https://data.neonscience.org/documents

Suggested by Tom H., this is more of a soil archive. They do use standard formats, but those formats aren’t really published in an accessible way (there’s no clear JSON/XML/other schema that can be easily used). Their formats are quite general (they collect a lot of different types of data), and they don’t seem to have a list of lab test types the same way Modus does (which is really central for our application).

Overall, they have a great soil library and accessible data, but the formatting doesn’t seem to be a fit.

European Soils Database

Links:

Again - this is mostly a database of soils, but it does contain its own schema… however, again, the schema doesn’t seem to have been designed with updating or expansion in mind. It does contain more detailed lab measurements, but doesn’t reference specific methods (extraction methods, DOIs, etc.), which are required to determine comparability between samples (so, distinguishing LOI versus dry combustion).

Pluses

  • has an existing large dataset which uses this format

Minuses

  • the format is not well updated (or clearly updatable)
  • there is no actual schema - it’s just a database with a published list of categories (in XLS and DOC format).

“Support ability to distinguish between comparable test methods (include the exact lab method (LOI carbon), as well as measurable outputs (g carbon / kg soil))”. I would just add “and sample preparation methods”. This could include pre-sieving, grinding, drying, storage and pre-treatment (e.g. carbonate removal is particularly important for carbon).
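One way to picture this: a hedged sketch of a preparation record that could sit next to the lab method, with a helper that lists where two samples were prepared differently. The `Preparation` class and its field names are hypothetical and not part of any existing standard.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Preparation:
    sieved_mm: Optional[float] = None  # sieve mesh size, None if not sieved
    ground: bool = False
    dried: str = "unknown"             # e.g. "air" or "oven"
    carbonate_removed: bool = False


def prep_differences(a: Preparation, b: Preparation) -> list[str]:
    """Names of the preparation fields on which two samples differ."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


# Two made-up samples that differ only in carbonate removal.
treated = Preparation(sieved_mm=2.0, dried="air", carbonate_removed=True)
untreated = Preparation(sieved_mm=2.0, dried="air")
diff = prep_differences(treated, untreated)
```

An empty list would mean the two samples were prepared the same way; anything else is a flag to check before comparing their carbon values.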

Yes - he does include extraction method, though not other details (so soil C nuance the way you described may not be included). However, he seems pretty open to adjustment, so we can offer suggestions.

Our team is going to connect on this to decide this week, but in some form or another we’ll be using Modus. I’m excited about the farmOS integration in particular; I think that’ll be really helpful. Modus is not as widely used for putting in requests to labs, but is often used for pulling results back. I’ll push some notes here after we meet!

Hey, just wanted to follow up: we did some more background research and discussion on this through OpenTEAM / GOAT and also posted info in the AgStack forum.

Hey also just wanted to post a follow up here as it relates to the use of this in the SurveyStack -> FarmOS -> Coffee Shop integration.

  • We talked to Jason Ellsworth directly - it’s still very actively used, both in the US and internationally.
  • Primarily, the export schema (ie sending data from a lab to a user) is very commonly used and the schema is well respected (not lots of people breaking it)
  • the import schema (ie sending data from a user to the lab) is not often used. Usually people are still just sending CSVs or through each lab’s custom process on their website.
  • Overall, after evaluating it, it’s a really really nice standard. Specifically I think they did an excellent job in creating truly comparable methods in their nomenclature docs…
  • We are going to contribute to this and expand it for use with produce quality which is exciting.
  • We’re trying to move towards having a farmOS integration so you can push a MODUS schema into farmOS and it’ll display it effectively and be able to kick it back out in MODUS format also.

Hope this is useful for others also!