Session: Data Content and Availability

GOAT Session Notes Wednesday, 2:20-3:30

Data Content: What is out there? How to organize it?

Facilitator and Convener: @MarsBenzle

Outline of ag data:
Government Data
Agro operation data
Agro knowledge base management

Carl (@carllippert)

  • Production data would be useful to know. Kinda private, but no reason not to know/share.
  • DHIA - Self regulated USDA-like - 50 years production data for all cows. (How much milk, proportion of fat/sugar.) Big set of data inaccessible.
  • Multiple dairy herd data processing centers.

Bar (@barbieri) - Soil data being collected. Could be more fully used.

Aldyen - Soil cores are sitting in various places. Had success in Canada locating and retrieving them to analyze. Held by individual landowners. 60-year-old cores are still useful.

There are groups working on preserving physical samples.

Ian (@ircwaves) - Gobs of remote sensing data.

Reginald - “Supposed to be gov data.” What do you mean you want access? Do you understand rules/regulations about access? (Licensed data from commercial enterprise.) Landsat is from NASA - you are free to use it.

Lee - Bio-geo-chemical data. Static information and flow information. Collected now by researchers. Anything comprehensive?

Kita (@sudokita) - Farm operations data. What/how/when. Applying fertilizer, paying water bill, paying a worker. Currently data is stored in shoeboxes, notebooks, custom software.

Bar - Ag census data exists, but output is summary in PDF - difficult to extract.

Weston (@weston)

  • LocalOrbit/OpenFoodNetwork - Looking at publishing historical market data. Looking at ingesting farm operations and inventory. Useful for crop planning and profitability projection.
  • “Tend” is a FarmOS-like system. Operations, planning, sales. How is data accessible? How to ingest sensor data? Wants to start as wide as possible being able to collect whatever, then find what is important.

Greg - (Regen Network) Interested in data showing why ecosystems are changing. Looking to create contracts around ecosystem functions. Inherently useful for farmers for planning/optimization.

Remote sensing, Army Corps of Engineers, and related public data need to be correlated. Greg feels that is easy to get. Tricky part is getting private data.

Bar/Carl - Farms are required to do 4 year soil sampling. Results held by sample labs and state. Do farmers own that data? Uncertain.
Output is summary or interpretation of test results.
Testing labs test samples of samples and not necessarily for properties everyone wants.

Jack (@jackofchiu) - University weather stations. 10 Iowa stations he maintained. Arkansas has 4.

Kita - If you collect data, you should provide access to that data to farmers.

Bar - Opportunity for aggregation is not happening. Private labs perform test and then discard. State University (UVM) has 15 years of data, though spotty and needs cleanup.

Farm Service Agency - Identity holder of all farm info. Crop insurance. FSA Tract numbers, FSA Farm numbers.

Reginald - Researchers reluctant to release data prior to publication. USDA and other gov agencies collect data specifically to fulfill agency mission. NSDA does soil mapping.

Bar - We have not resolved legal status of data - who owns it, who can access it.

Aldyen - Soil samples are going to be valuable in 100 years.

Jack - NOAA uses funny db format but can explain how to download sample period. Brandon notes: However, they delete after 2 days.

Jack - Local plant boards. Funded by produce sales. Money for research - for instance, Soybean/Rice/etc plant board. Can get access to this data. Weather station data. Crop improvement organization private/public. Mandatory belonging. Should be archived. Jack sends them a hard drive and they send back data for all time since the weather station got turned on.

Brandon (@bpwong) - Could we set up a place in the forum as a database inventory?

Aldyen - Entities don’t share because of the liability of uncertain ownership.

Carl - Can we capture data at point of data collection before it goes into a walled garden?

Reginald - May not have comprehensive data because voluntary submission. NASS National Agricultural Statistics Service. Yield data. Tillage, irrigation, land use, fertilizer, hog inventory. Updated every year. Jack - Broken down by state and crop. Bar - Most trusted source for statistical data, but expensive. Actually, adding supplemental survey data to NASS is the pricey part.

Carl - NASS is a great starting point for an ag database.

Jack - NRCS - Should be able to get a record of how much land is in production. Need to ask specifically for acreage without any attached PII.

Reginald - Data comes back as a big table. Need to figure out what columns are and what you need. If you can write scripts you can filter to what you need.
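To illustrate Reginald's point about scripting, here is a minimal sketch of looking at the columns of a big table and filtering it down to what you need. The sample data and column names (state, crop, acres_harvested, operator_name) are made up for illustration and are not taken from any real agency download.

```python
import csv
import io


def filter_table(rows, keep_columns, predicate):
    """Yield rows that satisfy predicate, reduced to keep_columns only."""
    for row in rows:
        if predicate(row):
            yield {col: row[col] for col in keep_columns}


# Hypothetical stand-in for a downloaded table; a real file would be
# opened with open(path, newline="") instead of io.StringIO.
SAMPLE = """state,crop,year,acres_harvested,operator_name
IA,corn,2016,1200,Farm A
IA,soybeans,2016,800,Farm B
AR,rice,2016,500,Farm C
"""

reader = csv.DictReader(io.StringIO(SAMPLE))

# Step 1: figure out what the columns are.
columns = list(reader.fieldnames)
print(columns)

# Step 2: keep only the columns and rows of interest (here: Iowa rows,
# dropping the hypothetical operator_name column).
iowa = list(
    filter_table(
        reader,
        keep_columns=["state", "crop", "acres_harvested"],
        predicate=lambda r: r["state"] == "IA",
    )
)
for row in iowa:
    print(row)
```

Values come back as strings, so numeric columns would need converting (for example with int or float) before summing or averaging.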

LTAR and LTER Long Term Ag Research + Long Term Ecological Research - Can contact groups and because all govt funded they should make data available. Aldyen notes: Land owner supplies data when requested. No effort to confirm accuracy.

GRACEnet - Greenhouse gas Reduction through Agricultural Carbon Enhancement network - Gas flux and carbon sequestration measurements.


  • LI-COR runs hosting for data. If you turn on data upload, data from gas emissions (methane + CO2) gets hosted, because the necessary post-processing is complex.
  • Will send raw AND QC data - the second set has sensor spikes removed.
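As a rough idea of what "sensor spikes removed" can mean, here is a minimal sketch that flags readings far from a rolling median. This is an assumed, simplified technique for illustration only, not the actual post-processing behind the hosted data, which the notes describe as complex. The function name, window size, threshold, and readings are all hypothetical.

```python
from statistics import median


def remove_spikes(values, window=5, max_dev=50.0):
    """Return a copy of values with spikes replaced by None.

    A reading counts as a spike when it differs from the median of the
    surrounding window by more than max_dev (same units as the data).
    """
    half = window // 2
    cleaned = []
    for i, v in enumerate(values):
        # Window is truncated at the start and end of the series.
        neighbours = values[max(0, i - half): i + half + 1]
        if abs(v - median(neighbours)) > max_dev:
            cleaned.append(None)  # mark as missing rather than guessing a value
        else:
            cleaned.append(v)
    return cleaned


# Hypothetical raw readings with one obvious spike at index 3.
raw = [400.0, 401.0, 399.5, 900.0, 400.5, 401.5, 400.0]
qc = remove_spikes(raw)
print(qc)
```

Keeping the raw series alongside the cleaned one, as the notes describe, means the threshold can be revisited later without losing the original readings.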

(Session was one of two. The other discussed what data we would like to have if we could get it.)

A lot of the LTER data is available for download online. I put the link in the open data and tech tools spreadsheet.

Hey @MarsBenzle, is this the link?

Yes, that’s it. There are several tabs.