Join the conversation #BigDatainAg
ISSUE #11 / August 2021 NEWSLETTER
Greetings colleagues!

Welcome to the August 2021 Module 1 Newsletter.

We hope you and your loved ones are well and enjoying a relaxing summer. Recent months have seen a lot of activity and progress on a number of fronts. Some examples: the collaboration with FAO on AGROVOC has been strengthened through the creation of a task force; FAIRscribe, which enables the FAIRification of data assets, is open for testing and comment; and the PII checker is being improved. As always, several webinars have been organized to share CoP knowledge.

Happy reading!

Medha and Céline

* You are receiving this email because you are a valued collaborator of Module 1 Organize of the CGIAR Platform for Big Data in Agriculture. We use MailChimp to provide updates on Module 1 activities, and per the EU GDPR, we would like to remind you that you can at any time update your preferences or unsubscribe from this list by following the links included at the bottom of the Newsletter.


FAIRscribe, the FAIR Data Workflow tool of GARDIAN

To ensure there are avenues for input as the FAIR workflow takes shape, we have put in place a consultative process to guide the development of a more intuitive FAIRscribe.

FAIRscribe builds on, and will replace, the FAIR workflow currently in GARDIAN. Several user tests with you indicated ways in which that workflow needed to be more user-friendly. The new workflow allows users to set up teams and collections if and as desired, to annotate data assets with the CG Core metadata elements, and to annotate variables within datasets with ontology terms. It includes an automatic check for PII, and it extracts keywords and geolocations (place names and coordinates) from the resource being annotated; users can modify these, and the tool ties them to ontologies and controlled vocabularies (including AGROVOC and the ICASA variables). There are several other user-friendly features, including a license wizard and a FAIR report that helps users improve FAIR scores by homing in on particular indicators.

A group of testers from the CoP and the EiA initiative saw demos of the tool in June and July, and improvements are being made to the tool following their feedback. A demo will be offered to the CoP once the tool is ready for wider testing after the summer holidays.

This work is moving forward in collaboration with SCiO, building on tools developed by two of our long-term collaborators: the COPO team at the Earlham Institute and the Agricultural Model Intercomparison and Improvement Project (AgMIP) at the University of Florida.

AGROVOC task force 

This year the IDM CoP facilitated a task force co-chaired by the International Institute of Tropical Agriculture (IITA) and the Alliance of Bioversity International and CIAT to define a CGIAR collaboration framework to enrich AGROVOC.

Two blog posts published by the FAO and the CGIAR describe the objectives of this task force and its members. Click on the pictures below to access the articles.

PII and OpenSAFELY @ CGIAR
The Platform for Big Data in Agriculture has been learning from the biomedical domain about how it leverages sensitive data for analysis while keeping PII well secured. The SCiO team won an EU grant to look into this with three collaborating teams, and will use the OpenSAFELY model to help CGIAR enable researchers to run analyses over its sensitive data without exposing it. OpenSAFELY is a highly secure, transparent, open-source software platform for analysis of electronic health records data. All platform activity is publicly logged, and code for data management and analysis is shared, automatically and openly, for scientific review and efficient reuse.

The overall idea is to keep the original sensitive data secure, but allow analytical scripts to be developed using "synthetic data" that mimics the original. The scripts can then run over the original data where they reside, and only the analytical results, not the original data, are made available to the researcher for insight and actionable outcomes. We will keep you in the loop on this important work as it progresses.
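
For those who like to see an idea in code, here is a minimal Python sketch of the pattern described above. It is an illustration only, not OpenSAFELY's actual interface: the function names, the field names ("respondent", "yield_t_ha") and the sample values are all hypothetical. The script is written and tested against synthetic rows, then run unchanged over the sensitive rows, and only the aggregate result is returned.

```python
import random
import statistics


def make_synthetic_rows(n, yield_range, seed=0):
    """Generate stand-in rows from a declared schema, never from real records."""
    rng = random.Random(seed)
    low, high = yield_range
    return [
        {"respondent": f"synthetic-{i}", "yield_t_ha": round(rng.uniform(low, high), 2)}
        for i in range(n)
    ]


def mean_yield(rows):
    """The researcher's analysis script: returns aggregates only, no row-level data."""
    values = [row["yield_t_ha"] for row in rows]
    return {"n": len(values), "mean_yield_t_ha": round(statistics.mean(values), 2)}


def run_where_data_reside(script, sensitive_rows):
    """Hypothetical stand-in for the secure environment: only the script's output leaves."""
    return script(sensitive_rows)


# 1. Develop and debug the script against synthetic data.
synthetic = make_synthetic_rows(n=5, yield_range=(1.0, 6.0))
draft = mean_yield(synthetic)

# 2. The same script then runs over the real data inside the secure environment.
#    These rows are made up for the example; the researcher never sees them.
sensitive = [
    {"respondent": "real person A", "yield_t_ha": 2.0},
    {"respondent": "real person B", "yield_t_ha": 3.0},
    {"respondent": "real person C", "yield_t_ha": 4.0},
]
result = run_where_data_reside(mean_yield, sensitive)
print(result)
```

In this sketch the synthetic rows are generated from an assumed value range rather than from the real records, so nothing sensitive is needed to start writing the analysis.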
Qualitative Data Management Task Force     

A task force has been set up to help CGIAR researchers deal with qualitative data. Bosun Obileye (IITA) has offered to lead it, with close collaboration from Nilam Prasai (IFPRI), Indira Yerramareddy (IFPRI), Medha Devare (IFPRI/Big Data Platform) and Gundula Fisher (IITA). 

Qualitative data is non-numerical and non-statistical in nature and is typically unstructured or semi-structured. It can be generated through texts, audio and video recordings, interview transcripts, observations and notes. One of the challenges researchers face is managing this type of data with regard to openness and FAIRness. There are also questions about how to capture, store and share it responsibly without compromising confidentiality, integrity and availability (CIA).

The goal of this task force is to assess the type of qualitative data CGIAR generates and develop best practices and guidelines to facilitate the institutionalization of qualitative data management in OneCGIAR.

The task force is currently reviewing its objective and will shortly develop a questionnaire to be shared with all stakeholders to chart the path for further work. Please contact Bosun if you are interested in learning more or being part of this task force.

CGIAR Open and FAIR Data Assets Policy 
The latest version of the Open and FAIR Data Assets Policy has been approved by the System Management Board with effect from April 16. This Policy supersedes and replaces the 2013 Open Access and Data Management Policy. It addresses funder and publisher requirements and the recommendations of a 2018 external assessment undertaken at the recommendation of the then CGIAR Independent Evaluation Arrangement. It also responds to the 2020 Data Management Maturity Assessment that many of you participated in.

The full policy is available here.

The Information and Data Management CoP (IDM CoP) includes the Metadata, Repository, Ontology, Open Access, and Globus Working Groups. Working Groups (WGs) are intended to be a forum for cooperation and participation, with members sharing information, knowledge and expertise to help resolve common issues for the benefit of all.
If you would like to actively participate in these groups, please sign up to engage with us!
Metadata Working Group
This year, the group is reviewing proposed changes to the CG Core metadata schema. The latest discussions are available here and a permanent link has been created.

Over this year, the group will set up a technical support team, produce educational material to explain why and how to use the CG Core metadata schema, work on consolidating controlled lists for repositories, and clarify how the CG Core metadata schema is implemented within GARDIAN. Members of the group are already engaged in the development of the FAIR workflow.
See here for more details on the Metadata WG and the people involved.
Repository Working Group
The Repository WG is collecting terms that CGIAR centers use to define the metadata field “kind of data.” The group will work with the Metadata Working Group to arrive at a controlled list for defining this field.

The working group has conducted a survey among its members, who suggested activities to move forward, including the development of guidelines. The list of activities will be posted on the working group website. The working group will soon prepare presentations on best practices.

At the end of each meeting, members will discuss the next meeting's agenda and upload it to the WG website.

See here for more details on the Repository WG and the people involved.

Ontology Working Group
A task group has been created to define a collaboration framework with the FAO regarding contribution to and use of AGROVOC. It brings together 10 CGIAR representatives, with Bosun Obileye as chair (see blog post).

Three webinars have been organised: one on advances in the CO website and the graph database for graph visualisation, a second on how to search for and request terms in ontologies, and a third on GOMO, the Governance Operational Model for Ontologies from BASF.

The working group is collaborating with WorldFish to develop the small-scale fisheries and aquaculture ontology.  

A paper on SEONT, the socioeconomic ontology, is being written for publication in a special edition of Frontiers. The aim of the paper is to promote the outputs from the machine learning extraction exercise carried out in collaboration with the University of Sheffield.

See here for details on Ontology WG efforts and the people involved.
Open Access Working Group
Since January, the Open Access WG has continued sharing information on topics related to open access, including news updates on the expanding uptake of the Copyright Clearance Center RightsLink service by major publishers, and the MIT Press open access e-book payment model.
The group also continues to explore experiences with read-and-publish agreements at universities, and was involved in a demonstration of MARLO's capabilities for extracting information from publishers using DOIs.

More details on Open Access WG efforts are available here.
IDM CoP webinars
These webinars are intended to facilitate exchange among our members and enhance capacity. Please forward widely as relevant to communities/researchers at your Center.
If you have suggestions for topics of interest, or would like to present on something you’re doing at your Center that may be of interest to all of us, please fill in this poll. We would love to hear from you!

Future webinars

  • September 7: Ontologies CoP webinar: Neo4j as a backend database for WebProtégé with Matthew Horridge and Mark Musen from Stanford University. Register:
  • September 21: Ontologies CoP webinar: Enterprise Knowledge Graph with Alexander Garcia from BASF. Register:

Past webinars

Information on past webinars is available in the EVENTS section of the CoP webpage.

Open Access & Open Data Support Pack

Questions/Concerns? Here's how you can get in touch:
Medha Devare:
Céline Aubert:
Copyright © 2021 CGIAR Platform for Big Data in Agriculture, All rights reserved.

Want to change how you receive these emails?
You can update your preferences or unsubscribe from this list.
