RDF and knowledge graphs
Thank you @jtwood for facilitating!!
Resources
- Schema.org: Schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond.
Interests/Whys (popcorn around group)
- Using RDF to semantically describe the API coming out of farmOS, going beyond JSON:API
- Higher level conventions that bubble up lower-level data into semantic representations
- Can RDF be used to help make collaborative standard definitions?
- Data schemas and how they relate to methodologies, tagging, understanding potential
- Boils down to shared meaning: we should share meaning along with the data → “semantics”
- Relationship aspect is interesting, connections, relationships: ability to visualize relationships between datasets
- How can it help reach a social consensus on what is happening in the real world?
- Talking about the “shape” of data, typically you write software to change/transform the shape of data → can RDF help us make a consistent shape for data? Not worry about transforming shape of data, only focus on transmission of data.
- Pay the cost of interoperability up front
Deeper dive on why
- Global data aggregation, ecological state analysis, help come to novel conclusions
- Global scientific database, e.g. aggregate data about two different grazing systems
- Fewer data silos, more decentralized data stores
- Data is useless → conclusions are useful :: Syntax is useless, semantics is useful
- Usually sit at desk using data, without understanding how hard it is to collect all that data
- Data for purpose
- Be able to collect the bare minimum data for what you are doing
- Talk about how much we want to use data & interoperate; shared meaning will make this useful
- Example: here is a landmass, what data do I know about this space. Actionable things are all associated with this landmass
- Visualization, storytelling, statistical modeling
- There are different whys for different businesses/communities
What
- RDF:
- Example: “Who is the King of England?” … Wikidata vs Wikipedia. Wikipedia is human-readable, but it is hard for a computer to “use” Wikipedia. Wikidata, by contrast, exposes its data as RDF, so computers can traverse the knowledge graph to answer the question.
- Way of taking database schemas, socially aligning the metadata
- Failures of coordination have prevented RDF from becoming more used
- Concrete example (Mike farmOS example):
- Log: I seeded my lettuce here, the lettuce is now associated in this location
- Schema.org has a “thing”, and an “action” is a “thing”, and a “movement” is an “action”
- Could we represent this log as a schema.org “movement”?
- Conventions for a “soil test” → could this be described in a semantic way?
- Rothamsted convention, etc, let people build their own conventions
- … if you had this in an RDF format, still, who would use this? You could have done this
- We could have, but didn’t need to build integrations/share meaning at that time
- Coffee shop demo that used farmOS….
- A good example of custom “conventions” for data coming in & going out
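A minimal sketch of the lettuce log above written as subject-predicate-object triples, using plain Python tuples rather than an RDF library. `schema:MoveAction`, `schema:object`, and `schema:toLocation` are existing Schema.org terms, but every `ex:` identifier is hypothetical, and mapping a farmOS log to `MoveAction` is an assumption for illustration (whether that is the right type is the open question in these notes). It shows how a program can walk the “movement is an action is a thing” chain and answer “where is the lettuce now?” by following links.

```python
# Triples are (subject, predicate, object) tuples.
# All "ex:" names are hypothetical; the log-to-MoveAction mapping is an assumption.
TRIPLES = {
    ("ex:log-1", "rdf:type", "schema:MoveAction"),
    ("ex:log-1", "schema:object", "ex:lettuce-planting"),
    ("ex:log-1", "schema:toLocation", "ex:field-a"),
    ("schema:MoveAction", "rdfs:subClassOf", "schema:Action"),
    ("schema:Action", "rdfs:subClassOf", "schema:Thing"),
}


def objects(subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def superclasses(cls):
    """Walk rdfs:subClassOf links upward from cls, nearest ancestor first."""
    chain = []
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for parent in sorted(objects(current, "rdfs:subClassOf")):
            if parent not in chain:
                chain.append(parent)
                frontier.append(parent)
    return chain


def locations_of(asset):
    """Find where an asset was moved to by following the actions that reference it."""
    actions = {s for s, p, o in TRIPLES if p == "schema:object" and o == asset}
    return {loc for a in actions for loc in objects(a, "schema:toLocation")}


print(superclasses("schema:MoveAction"))
print(locations_of("ex:lettuce-planting"))
```

The point of the sketch is that neither function knows anything about lettuce or farmOS: the meaning lives in the shared predicate names, which is what a common vocabulary would buy.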
Barriers
- Aaron's core point: we want alignment among the standards. But incentives are not aligned for this to be happening
- What are some actionable things to make this happen?
- Framework has been around for a long time, but the standards/schemas have not
- Bigger issues of justice:
- We’re inclined to figure out the barriers, doing this as technologists in a small group
- Need to make the space for other unrepresented peoples to come in, in a cooperative way
- Need funding to make something happen, but may drive it in a different direction, not always good
- Everything on the web should be semantic, every web document should return things like this
- Top down planning/boiling the ocean
- Let's try bottom up. Here is a very specific thing I need to describe…
- Business models are often structured with how you extract money from data
- A lot more effort to do this, labor costs
- Example: want to define ground truth of carbon, derived from models
- Open Ag Data Alliance: tried to put standards together in XML or something, but it fizzled out
- Semantic consensus in data standardization - this is core to huge business models. If we can make this less expensive, does this screw with our people?
Benefits
- SurveyStack use case:
- Asks questions that define what a tillage event is, then later pulls that data back out. Tightly controlled input and output: if I enter my tillage event in a different way, it will not work in the coffee shop demo
- If the tillage event has a semantic standard/meaning, then it is easier to save & share data in this format
- RDF being a machine-readable format
Enablers
- Ability to selectively share only some pieces of a dataset (schema is a prerequisite for this, but definitely requires business models to implement this capability)
- We need better software libraries and tooling to help use RDF
- Need an argument for why this is the best solution
- Graph element
- Are there other solutions that are better for interoperability?
- The graph part is a good part
- Tooling isn’t as good
- At an early stage it seems like overkill, but adding it on later makes it complex
- Methods for consensus setting
- Top down deciding to bottom up… Marcus example
- Funding specific tools for developers
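The selective-sharing point at the top of this list can be sketched as a filter over triples: once every fact carries a named predicate, “share only some pieces” becomes an allow-list of predicates. This is a rough illustration with made-up data, not a description of any existing tool; all `ex:` names are hypothetical, and real access control would need much more than this.

```python
# Hypothetical soil-test record as (subject, predicate, object) triples.
# Every "ex:" name here is illustrative only.
RECORD = [
    ("ex:soil-test-7", "ex:organicMatterPercent", "4.2"),
    ("ex:soil-test-7", "ex:sampledAt", "ex:field-a"),
    ("ex:soil-test-7", "ex:collectedBy", "ex:farmer-1"),
]


def share(triples, allowed_predicates):
    """Keep only the triples whose predicate is on the allow-list."""
    return [t for t in triples if t[1] in allowed_predicates]


# Share the measurement and its location, but not who collected it.
public = share(RECORD, {"ex:organicMatterPercent", "ex:sampledAt"})
print(public)
```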
Final questions:
- Is access control critical to this?
- Is Regen using RDF? In parts of the stack. You can use RDF dataset tooling to canonicalize data, get a hash, and save it on-chain
- Are there connections between what we need to build?
- YES
- Methodology developer that has data in farmOS, we would use that data in Regen blockchain data module
- MODUS lab test data
- Are there other groups outside of ag that are using RDF successfully?
- Maybe science/research?
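The “canonicalize data, get a hash” step mentioned above can be sketched as follows. This is a simplified stand-in, not Regen's implementation: it assumes the graph contains only IRIs (no blank nodes or literals), so sorting the serialized triples is enough to get a stable form. Full RDF dataset canonicalization also has to label blank nodes deterministically, which this sketch ignores. The example.org IRIs are placeholders.

```python
import hashlib


def to_line(triple):
    """Serialize one triple of IRIs as an N-Triples-style line."""
    s, p, o = triple
    return f"<{s}> <{p}> <{o}> ."


def graph_hash(triples):
    """De-duplicate and sort the serialized triples so input order does not matter, then SHA-256."""
    lines = sorted({to_line(t) for t in triples})
    canonical = "\n".join(lines) + "\n"
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The same two facts in two different orders (placeholder IRIs).
a = [
    ("https://example.org/log-1", "https://schema.org/object", "https://example.org/lettuce"),
    ("https://example.org/log-1", "https://schema.org/toLocation", "https://example.org/field-a"),
]
b = list(reversed(a))

print(graph_hash(a) == graph_hash(b))
```

Two parties holding the same facts in a different order get the same hash, so the hash (rather than the data itself) is what would be saved on-chain.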
Follow up on this with OpenTEAM tech working group on Tuesday calls…