Join the conversation #BigDatainAg
ISSUE #9  /  December 2020                                                                             NEWSLETTER

Greetings colleagues!

Welcome to the December 2020 Module 1 Newsletter.

You are receiving this email because you are a valued collaborator of Module 1 Organize of the CGIAR Platform for Big Data in Agriculture. We use MailChimp to provide updates on Module 1 activities, and per the EU GDPR, I would like to remind you that you can at any time update your preferences or unsubscribe from this list by following the links included at the bottom of the Newsletter.

This has been a difficult year, but I have been humbled by the can-do spirit I've seen in our CoP and across CGIAR. A very big thank you - for your energy and professionalism under challenging circumstances, often while juggling family needs and work commitments. Although 2021 promises to be a bit unsettled as One CGIAR takes shape, I believe it will allow us to seize new opportunities. The year is likely to bring stronger support for the good work you do, in the form of a new Open and FAIR data policy (that I am still hoping to be able to share again soon, once I get the go-ahead from the System Office), and a new CGIAR digital strategy. I look forward to a more stable, peaceful, and saner 2021 for all, when we are once again able to be near those we care about while all staying healthy - and to have a better work-life balance.

Best wishes for a happy holiday season and/or relaxing break,
Medha and Michelle

GARDIAN wins Elastic Search Award

The BIG DATA Platform is pleased to announce that, together with SCiO, we've been awarded a 2020 Elastic Search Cause Award for GARDIAN, CGIAR's flagship data harvester, which uses the Elastic Stack towards making a positive impact in the world.

The Elastic Search Awards honorees were announced during a virtual, three-day ElasticON Global event, 13-15 October. 

"The originality demonstrated by the Elastic Search Awards nominees never fails to impress. It's always a challenge to select the honorees, and this year's applicants really put us to the test with an array of innovative contributions and Elasticsearch use cases," said Madison Bahmer, the chief technology officer of IST Research and member of the Elastic Search Awards judging panel.

GARDIAN, developed with SCiO, enables the discovery of publications and datasets from more than thirty institutional publication and data repositories across CGIAR, and from several more data sources managed by CGIAR's key stakeholders, to enable value addition and innovation via data reuse.

Click here to watch the GARDIAN team's Elastic Search Award acceptance video.

GARDIAN has evolved some more in 2020! We now have a new interface that makes everything on offer much more visible - including the data management toolkit. Some highlights of the 2020 GARDIAN data ecosystem offerings:

  • The FAIR data workflow is now part of the toolkit, and currently includes an auto-check for PII, with the results sent to the user's email. The workflow also allows users to annotate data assets with the CG Core metadata elements, and data variables within datasets with ontology terms - try it! The workflow is up and working, but we are planning to make a completely re-engineered, much more user-friendly and intuitive version available early next year.
  • GARDIAN's Data Exploration module now offers several very large datasets that can be visualized, spatially queried, and downloaded - at country or lower administrative level, or for a polygon of your choice. Datasets include the global crop production estimates for 2000, 2005, 2010, and 2017; two climate forecast model datasets covering 2030-2090; and global soil data from ISRIC.
  • Collaborative GARDIAN Labs (CG Labs) was launched (undergirded by Globus for secure data exchange), and has been used extensively for high-compute crop suitability and yield potential analyses, drawing on GARDIAN's climate forecast model and global crop production datasets as input.
2021 plans include UI improvements to GARDIAN and the FAIR data workflow, improved search, and even more flexibility to deploy CG Labs in the cloud, on an institutional server, or on a user's laptop.

CGIAR Expert Finder
I talked briefly about progress with a prototype VIVO for CGIAR (or CGIAR Expert Finder) during the IDM CoP annual meeting in November. Expert Finder is a semantic web application based on ontologies. It showcases research expertise across the System, using publications to enable flexible discovery of researchers and their outputs across CGIAR – by expertise area, geography, organization and more. Many of you had expressed interest in it during past DMTF and OAWG meetings, and the Big Data Platform put some resources behind modifying the VIVO ontology to reflect CGIAR's structure, and behind consuming, cleaning, and presenting publicly available data. There is interest in leveraging Expert Finder, and there are 2-3 efforts underway across the System to showcase researcher expertise using the tool. The Platform will be enhancing several features in 2021, including self-editing capabilities for researchers to enrich their profiles and request corrections of data pulls from authoritative sources. Please let me know if your Center is interested in learning more about implementing Expert Finder.

Data Management Maturity Assessment (DMMA)
As you know, the Internal Audit Function of the CGIAR System contracted Accenture Development Partnerships to perform a Data Management Maturity Assessment (DMMA), which resulted in a DMMA report. A response to the DMMA was requested from us, and I drafted this using your input during our brainstorming session in October. A slightly modified management response by Programs in the System Office was approved by the One CGIAR Executive Management Team, to be implemented starting in 2021.

CoP changes...
I want to end by letting you know that Michelle will not be working with our CoP starting in January. Michelle has been with us for a few years, working 2 days a week on CoP business this year. She will be missed, not just by me but by many of you, I know. Michelle has supported me as well as the Working Groups over the last few years, and has played an indispensable role in orchestrating our annual meetings. I'd like to acknowledge her hard work and quiet but reliable assistance. Many, many thanks, Michelle - and all the very best as you transition to working full time at the CGIAR System Office. Céline Aubert, who currently helps with communications for the Ontologies CoP and works with me on the AgroFIMS project, will take over some of Michelle's responsibilities, and I will carry the rest.

Working Groups (WGs) are intended to be a forum for cooperation and participation, with members sharing their information, knowledge and expertise to help sort out common issues for the benefit of all. Your active participation ensures that we move ahead cooperatively and efficiently, so a big THANK YOU for your time and dedication!
IDM CoP - WG Updates

Metadata Working Group (MWG)

The MWG’s main tasks for the year focused on two objectives:
  • Understanding gaps in identifying gender research to further improve findability of CGIAR resources via discussions with the CGIAR Gender Platform. This has made clear the need for a list of controlled vocabularies or concepts that could help better describe gender outputs, and capacity building to help researchers understand the value of enhanced metadata. In the coming year, we hope to finalize this work with the CGIAR Gender Platform.
  • Sharing experience on the CG Core Metadata Schema implementation across CGIAR repositories, with a focused discussion on usage of the web services that CG Core recommends. In 2021 we will continue building capacity related to the implementation of CG Core, in collaboration with the Repository WG and beyond CGIAR.
See here for more detail on the MWG. 
Repository Working Group (RWG)
  • The RWG is collecting terms that CGIAR centers use to define the metadata field “kind of data.” The group will work with the MWG to arrive at a controlled list for defining this field.
  • The group’s lead, Nilam Prasai, submitted an abstract, “Toward Fair and Open Data: CGIAR Centers’ Data Repositories”, which was accepted for presentation at the American Geophysical Union Fall Meeting, Dec 1-17.
See here for more detail on the RWG.
Ontology Working Group (OWG)
  • During the Big Data 2020 Convention, colleagues of the OWG presented the progress of the sub-groups they lead and discussed plans for 2021. Videos of the sessions are available here, along with session summaries.
  • Members worked collaboratively with partners within and beyond CGIAR on a number of ontology development and enhancement efforts. These included work on an ontology for small fisheries and aquaculture, connections between SEONT and OIMS, the Agronomy Ontology, and the Crop Ontology. The latter also included work on a new prototype website using Neo4j as the backend for the graph database to enable dynamic visualization of ontologies and semantic relationships.
  • The OWG meetings also included presentations for knowledge sharing and cross-learning, including one focused on integrating IITA’s Cassavabase and CKAN. Data files extracted from the breedbases and available in CKAN are already annotated with Crop Ontology term IDs, and the OWG will explore how these IDs might be extracted to populate keyword fields in CG Core.
See here for details on OWG efforts, and the people involved.
Open Access Working Group (OAWG)
  • The Open Access Working Group worked together to develop educational material for use at centers, on topics such as predatory publishing. Part or all of such materials can be used as handouts, web content, or in another form. Other guides to developments in publishing in general, picking Creative Commons licenses, and reporting on ISI journal status were also created.
  • In 2021, the OAWG will engage with Open Access communities beyond CGIAR to understand how similar institutions cope with challenges in scholarly publishing. The group is also working to document the CGIAR experience with open access, comparing citation rates between OA and restricted articles and augmenting this with researcher interviews to capture how OA is used, and what is most valuable for both research staff and the audience we serve.
More detail on OAWG efforts is available here.

The Information and Data Management Community of Practice (IDM CoP) includes the Metadata, Repository, Ontology, Open Access, and Globus Working Groups. If you would like to actively participate in those groups, please sign up to engage with us!

IDM CoP webinar series - stay tuned!

These webinars are intended to facilitate exchange among our members and enhance capacity. Please forward widely as relevant to communities/researchers at your Center.

Upcoming webinars
Please email Michelle if you have suggestions for topics of interest, or would like to present on something you’re doing at your Center that may be of interest to all of us – in which case a tentative topic/title and rough date would be helpful. We would love to hear from you!

Past webinars

Click on the links below to access recordings and presentations. Please note that some events are facilitated through the CGIAR Big Data course Platform and are internal to CGIAR, to allow open learning and conversation within CGIAR specifically. You will need to register to access those.

Feb 24 & 26: COPO User Workshops - Implementation of CG Core
March 26: Publishing Scientific Datasets - the Alliance Bioversity-CIAT Example
April 7: Ontology CoP webinar: The Agronomy Ontology and AgroFIMS - Harmonizing Agronomic Data Collection and Annotation 
April 16: Globus - Simplifying Research Data Management via SaaS
April 27: Ontology CoP webinar: Use of Ontologies and Knowledge Graphs by BASF and KWS
May 19: Ontology CoP webinar: The Food Ontology, and its application for text mining (LexMapr), and food production description (PO2)
June 2: Ontology CoP webinar: Strengthening AGROVOC through engagement with expert communities
August 6: CGLabs - The Collaborative Gardian Labs
Sept 22: CGLabs: How we use Globus in a Collaborative Analytical Environment 
Nov 3-4: 2020 IDM COP Annual Meeting
Dec 2: Enabling data-driven transformation of agriculture

Information on past webinars is available via Meetings and Workshops in the OA-OD Support Pack.

Useful resources

Questions/Concerns? Here's how you can get in touch:
Medha Devare
Michelle Fotsy
Copyright © 2020 CGIAR Platform for Big Data in Agriculture, All rights reserved.

Want to change how you receive these emails?
You can update your preferences or unsubscribe from this list.
