Summary
This was a hybrid unconference session held on Thursday (11/7/24) in the stage / patio. Topics discussed included:
- tools & mechanisms for leveraging AI to turn unstructured data (e.g. natural language / plain text) into structured data (when explicit schemas for structured datasets already exist)
- general discussion around use cases & value proposition of using linked data & RDF as a tool for data interchange & interoperability between GOAT projects
- evaluating community need for a meta-registry of ontologies used within the GOAT / open ag tech communities
Action Items
We discussed using the existing weekly AI meetings organized by Our-Sci as an immediate container for these conversations. If there is enough interest in continuing, this work will likely warrant a separate working group or set of regular calls, but for now the preference was to use the existing calls until there is critical mass for a dedicated community call.
Raw Notes
- Rob: When I had talked about ontology and linked data — a reason I had wanted to talk about it is that it’s an interesting way to standardize things without necessarily agreeing with people
- Let’s say you have “Corn”; in the UK you have “Maize”. There can be multiple labels, and these labels can be relational
- When shipping wheat, the manifests are always pieces of paper. Sometimes someone says “that’s not right”, crosses it out, and writes the correct label.
- Robert: it’s important to know when things are most definitely not the same thing
- Cory: it’s important to distinguish between an ontology and a JSON Schema
- ontologies are used to map concepts and vocabularies together in a formal, but extensible and open way
- tools like JSON Schema are used for data validation, to ensure that a given dataset has the required fields, and that those fields match certain criteria. Given that a JSON dataset adheres to a given schema, it can then be inferred how that dataset fits within a larger ontological structure, and what other data it is related to
- How do we bring this closer to the farmer? Many of these conversations at GOAT have been about bringing stuff closer to the farmer
- Open Futures Coalition: We have multiple working groups having these conversations, and have spent a lot of time thinking about how we can standardize
- Much of what we have realized is that it’s really best to build stuff in as usable a way as possible for end users, and then on the backend try to standardize the systems
- Rob: The problem with local slang is that it can be interpreted very differently in different contexts
- There’s still an educational component where folks are coming up with their master list for the first time, what kind of workflows can we use to help people better navigate this space?
- Greg: The solution to a lot of these things is actually well known. If you have many different systems interacting with different data, you need some commonality in how they structure data
- Even if you have a common standard, there’s still a lot of pairwise API integrations that would need to get built
- Maybe, AI can dramatically reduce the effort involved in creating those connections
- We could actually maybe get on the same page
- Many folks that I know use AirTable. What if you could have something smart enough to store something and ask questions to connect the dots, engaging the ontology development world with the AirTable world?
- AI can also be useful for staying agile and supporting the integration work itself
- Cory: and additionally, AI could also be used to build ontologies
- An AI could be really good at “losing weight”, as AI systems are really just a large graph of words, and an ontology is just a much simpler graph of words
- Rob: If we took everybody here and asked them to come up with an ontology for Wheat, the results would be both completely reasonable and completely incompatible with each other
- I think an LLM would actually have too many ways to screw it up
- I wrote a wheat ontology to describe malts, and someone got really upset at me for it
- The one thing that LLMs are really good at is giving descriptions to things
- This week, Open Futures Coalition has a platform which has the ability for folks to create articles & upload files, which can all be vectorized, and an AI chat assistant that uses RAG to help projects engage with this
- We have multiple parallel instances with different master lists, but we’re asking ourselves to what extent we index this explicitly, and how we do that?
- One of our devs suggested recently that we remove our tagging system because the AI is so good at being able to navigate all the articles given a search query
- So what started as a technical conversation ended up becoming a usability conversation
- Let’s take weeds…
- Supposing I send something out into a field for it to do a scan of the ground, and identify as many weeds as possible-
- Cory: I’d love to have a reference
- There are 3 different components here:
- Structural compatibility between our systems
- How are we aligning on vocabularies & ontologies (and contents) — is there a metadirectory that is useful across tools?
- How can we support our communities through UX, curricular, or other approaches so that specific communities can navigate their own ontologies and potentially build their own
- There’s a weekly AI meeting where we’ve been sharing stuff that we have been working on, and our processes
- It might not be a bad idea for an ontology group to meet
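
Illustrative sketch (not from the session): the distinction Cory drew in the notes above, between a schema that validates a dataset and an ontology that relates concepts and labels, can be shown in a few lines of Python. Everything here is a hypothetical stand-in: the field names, the `example.org` IRIs, and the hand-rolled `validate` function, which only mimics the "required fields and types" checks that a real JSON Schema validator would perform. It also reuses the “Corn” / “Maize” example to show two labels resolving to one concept.

```python
# Hypothetical sketch: names, fields, and IRIs are invented for illustration.

# 1. Validation layer: the kind of rule a JSON Schema would express.
CROP_RECORD_SCHEMA = {
    "required": ["crop", "area_ha"],
    "types": {"crop": str, "area_ha": (int, float)},
}


def validate(record, schema):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field in schema["required"]:
        if field not in record:
            problems.append(f"missing required field: {field}")
    for field, expected in schema["types"].items():
        if field in record and not isinstance(record[field], expected):
            problems.append(f"wrong type for field: {field}")
    return problems


# 2. Ontology layer: one concept can carry several labels.
CONCEPTS = {
    "https://example.org/crops/maize": {"labels": {"corn", "maize"}},
    "https://example.org/crops/wheat": {"labels": {"wheat"}},
}


def resolve_label(label):
    """Map a free-text label to a concept IRI, or None if it is unknown."""
    wanted = label.strip().lower()
    for iri, concept in CONCEPTS.items():
        if wanted in concept["labels"]:
            return iri
    return None


def annotate(record):
    """Validate first, then attach the concept IRI for the crop label."""
    problems = validate(record, CROP_RECORD_SCHEMA)
    if problems:
        raise ValueError("; ".join(problems))
    return {**record, "crop_concept": resolve_label(record["crop"])}


us_record = annotate({"crop": "Corn", "area_ha": 12.5})
uk_record = annotate({"crop": "Maize", "area_ha": 3})

# Different labels, same underlying concept.
same_concept = us_record["crop_concept"] == uk_record["crop_concept"]
```

The two layers are kept separate on purpose: the schema says nothing about what “Corn” means, and the concept table says nothing about which fields a record must have. An unknown label resolves to `None` rather than a guess, in line with Robert's point that it matters to know when things are not the same.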