Join the conversation #BigDatainAg
NEWSLETTER  /  ISSUE #6  /  30 March 2020

Warm greetings!

Welcome to the Spring 2020 Module 1 Newsletter.

You are receiving this email because you are a valued collaborator of Module 1 (Organize) of the CGIAR Platform for Big Data in Agriculture. Michelle and I use MailChimp to provide updates on Module 1 activities. In light of the EU GDPR, we would like to remind you that you can update your preferences or unsubscribe from this list at any time by following the links included at the bottom of the Newsletter.

We, along with the Big Data Platform team, hope you and your families are well and finding ways to stay upbeat during the COVID-19 crisis. Our best wishes come to you with hopes that life will get back towards (a new and better?) normal soon.


Warm wishes,
Medha Devare

About our CoP

As you know, the CGIAR data and information managers self-selected into the Data Management Task Force (DMTF) and the Open Access Working Group (OAWG) around 2014/15. In November 2018, members of these groups voted to consolidate into the Data and Information Management Community of Practice; however, CoP members agreed during the CoP’s annual meeting in Hyderabad in October 2019 that the acronym could be better. CoP members also supported the idea that Working Group chairs work together towards Terms of Reference for the CoP, as a first step towards formalization within the Platform (meeting report and summary for reference). Your colleagues, the WG Chairs, have worked with me to develop ToRs for the tentatively named Information and Data Management CoP, currently comprising about 70 members. Huge thanks to WG Chairs Elizabeth, Leroy, Marie, Nilam, and Ryan! Please feel free to have a look at these ToRs and provide comments. If you can think of a catchier name for the CoP (IDA -- Info and Data Assets?!), don’t hesitate to suggest it -- your colleagues will be in your debt.

Big Data Platform Management Retreat

The Platform’s annual Management Team retreat was held at CIMMYT from December 16 to 18 to build the team, assess 2019 progress, and brainstorm around the 2020 timeline of activities and associated deliverables.

Digital Strategy

The Big Data Platform was tasked with helping to develop a Digital Strategy for CGIAR, to help identify CGIAR’s comparative advantages in digitizing the agricultural development enterprise. Three on-site workshops at Centers in the Americas, Africa-Middle East-Europe, and Asia regions were planned from March onward; however, COVID-19 intervened, and we are now planning brief online workshops. These will be coupled with a web-based questionnaire of multiple-choice questions aimed at individuals, and with follow-up remote interviews structured around a series of open questions. The exercise is designed to find out what digital and agricultural experts think about four inter-related questions on the current status and future of digital agricultural technology:
  1. Trends: Which digital trends from agribusiness, life sciences, and related fields are relevant to agricultural research for development and could transform the sector by 2030?
  2. Roles: What roles should CGIAR or public goods organizations claim in this evolving landscape?
  3. Capabilities: What organizational capabilities do organizations need to navigate this evolving space?
  4. Investments: How should investments be prioritized in light of the above?
Our aim is to gain a cross-cutting view of digital trends in the agricultural industry as they affect digital strategies for agricultural research for development. Participants in the workshops and interviews therefore include members of CGIAR, agribusiness, food companies, research organizations, funding and financing organizations, digital agriculture startups, and other relevant industries.


Data Management Course  

A course entitled “Best Practices for FAIR, Ethical, and Open Data” was created with content support from Tom Hazekamp and admin support from Sandra Perez. IDM CoP colleagues at the Alliance Bioversity-CIAT, CIMMYT, and IFPRI also provided feedback.
Oriented toward researchers and data/information managers, this course focuses on the “how do I…?” and is intended to strengthen capacity in the CGIAR System and beyond to create, manage, and share research and development data assets that are open (i.e., discoverable and downloadable), easily interpretable, interoperable, and reusable, and responsibly managed. Guidelines on operationalizing the FAIR Principles to make resources Findable, Accessible, Interoperable, and Reusable (FAIR) are included, along with tools to help researchers and data managers adhere to ethical practices in managing data. These modules were created after discussion with FAO, CTA, Wageningen UR, and GODAN, who developed the GODAN Online Course in Open Data Management in Agriculture and Nutrition, designed primarily for the newcomer. The course will be available as an interactive resource this spring on the Big Data Platform’s TechChange space. Feedback for further improvements will be welcome -- stay tuned for its release!

VIVO (aka CGIAR Expert Finder) 

Many of you were interested in VIVO to enable CGIAR to showcase its research and allow funders, researchers and others to easily find experts and collaborators.

It has taken a while to appropriately modify the ontology underlying this semantic web tool and to clean up the data (currently from CGIAR’s Active Directory, Web of Science publications, and GARDIAN), but a very early prototype of the CGIAR Expert Finder is now up. Here is an overview of the CGIAR Expert Finder that I presented at the 2019 Annual Big Data Convention in Hyderabad. The tool could be improved, particularly if Centers were interested in a local install that enabled individuals to edit profiles to add keywords, project abstracts, geographic locations of work, and other non-authoritative information. Centers could provide a Center-specific and a CGIAR view (via “search <Center name>” and “search CGIAR” buttons, say) and enable additional data pulls from authoritative databases with information appropriate for public viewing. Please let me or Michelle know if you are interested in more detail/support.

Collaborative GARDIAN Labs  

Collaborative GARDIAN Labs (CG Labs) is the latest offering in the GARDIAN data ecosystem.

CG Labs has a built-in collaboration platform that allows a user to create private or public virtual spaces, invite members, receive notifications, and collaborate remotely and asynchronously. Access is handled by Single Sign-On (SSO) functionality via Globus secure data software. CG Labs is still being improved, but currently offers three modules with specific, interlinked functionalities, allowing the user to:

  1. Find data: Search, download, and save datasets from GARDIAN in CG Labs;
  2. Securely share data: Published data, or unpublished but sensitive data, can be exchanged securely via a Globus implementation; and
  3. Analyze data: Collaboratively write scripts and run analyses via Jupyter, which has been extended to support smooth data file exchange via the CG Labs Globus Server. Analytical pipelines to commonly used crop models are being built.
A webinar on CG Labs is anticipated in the first half of this year; keep an eye out for that!

Agronomy Field Information Management System (AgroFIMS)

AgroFIMS enables the collection of agronomic data harmonized with interoperability standards.

The Agronomy Field Information Management System (AgroFIMS) v.1.0 is an Organize Module product released in 2019. It is based on the Agronomy Ontology and other ontologies, and has been developed by CIP, the Alliance Bioversity-CIAT, and IFPRI, with input from several other CGIAR and non-CGIAR entities. AgroFIMS data variables are mapped to ontology concepts and already associated with URIs, enabling FAIR data at the collection stage. AgroFIMS also allows easy creation of field books, digital data collection via the KDSmart app, checks on data quality, and generation of statistical analyses and reports. Plans for 2020 include enabling data collection from large, multi-locational, survey-type agronomic trials; the associated ontology modification; an improved code base and UI; and the ability to collect data using the Fieldbook app and Open Data Kit (ODK).
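
For the technically curious, here is a minimal Python sketch of what "variables mapped to ontology concepts and associated with URIs" can look like in practice. It is an illustration only and is not AgroFIMS code: the column names, term labels, and example.org URIs are hypothetical placeholders, and the function name annotate_columns is our own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyTerm:
    """A concept label paired with the URI that identifies it."""
    label: str
    uri: str


# Hypothetical lookup table: field book column name -> ontology term.
# The URIs are example.org placeholders, not real ontology identifiers.
VARIABLE_TERMS = {
    "plant_height_cm": OntologyTerm(
        "plant height", "http://example.org/ontology/TERM_0001"
    ),
    "grain_yield_kg_ha": OntologyTerm(
        "grain yield", "http://example.org/ontology/TERM_0002"
    ),
}


def annotate_columns(columns):
    """Split column names into those with a known term URI and those without."""
    annotated = {}
    unmapped = []
    for name in columns:
        term = VARIABLE_TERMS.get(name)
        if term is None:
            unmapped.append(name)
        else:
            annotated[name] = term.uri
    return annotated, unmapped


annotated, unmapped = annotate_columns(
    ["plant_height_cm", "grain_yield_kg_ha", "notes"]
)
print(annotated)
print(unmapped)
```

Attaching the URI when the field book is designed, rather than after the data come back from the field, is what lets a dataset carry its meaning with it; columns that end up in the unmapped list are the ones a data manager would need to review.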

For more information, attend the Ontology CoP webinar on April 7 at 15:00 CEST: The Agronomy Ontology and AgroFIMS - Harmonizing Agronomic Data Collection and Annotation (GoToMeeting Link)

Big Data Platform Shared Services

Empowering CGIAR and its partners to deliver on the potential of big data to yield solutions for the challenges faced in the agriculture for development sector.

The Platform’s Convene Module develops ambitious partnerships to facilitate collaboration and ideation, provide access to data-rich tools and services, and build CGIAR capacity on big data approaches. These shared services are available to you and your Center researchers, with more information available on the Big Data Platform’s Shared Service webpage. Please let your researchers know, and if you (or they) would like to know more about any of these services, please contact Jawoo Koo at j.koo@cgiar.org.

WORKING GROUPS
Working Groups (WGs) are a forum for cooperation and participation, with members sharing their information, knowledge, and expertise to help resolve common issues for the benefit of all. Your active participation ensures that we move ahead cooperatively and efficiently, so a big THANK YOU for your time and dedication!
The Information and Data Management Community of Practice (IDM CoP) includes the Metadata, Dataverse, Ontology and Open Access Working Groups, and a new one -- the Globus Working Group. If you would like to actively participate in those groups, please do not hesitate to request to join. 
 
From Ryan Miller, chairing the Open Access Working Group (OAWG)
The OAWG has developed draft educational materials related to predatory publishing for all Centers. The group has also developed the following workplan to address its 2020 scope of work:
  • Continue knowledge sharing/education on predatory publishing and on developments in scholarly publishing, along with guides for authors to navigate scholarly publishing choices 
  • Support group members and researchers with reporting on impact measures
  • Advocate for continued funding for OA by agreeing on consistent ways to fund article processing fees, developing a “best practices” guide, gathering evidence on the utility of OA and reporting on citation rates comparing OA and restricted articles available via Web of Science 
  • Serve as a discussion forum to address issues related to the group’s efforts

From Elizabeth Arnaud, chairing the Ontology WG & CoP
Collaboration with FAO AIMS:
Elizabeth Arnaud and Enrico Bonaiuti attended FAO’s AGROVOC retreat in January, which focused on current developments and on collaboration between CGIAR and FAO to contribute terms used in DSpace and Dataverse that are missing from AGROVOC. We also explored matching the Agronomy Ontology to AGROVOC with Marie-Angélique Laporte. FAO AIMS will soon release a MOOC on the use of semantic tools for annotating publications, including AGROVOC and curation tools. CGIAR data managers are invited to use it once it is online. The FAO team will organize an online 2-day meeting in June, at which members of the Ontology WG are invited to speak and/or attend; an agenda will be sent soon.

Progress of the Socioeconomic Ontology WG (led by Soonho Kim and Berta Miro)
A virtual workshop was held on 24 and 25 March on Socioeconomic Ontology content, with a view to publishing a v.1.0 by the end of June, and on workflows for annotating IRRI code books on historical survey data. The ontology is now called SEONT. Supported by the Socioeconomic Development CoP, the WG is collaborating with Xingyi Song at the University of Sheffield to extract concepts from survey questionnaires using a machine learning algorithm.

Plant Phenotype Ontology WG
A Sunflower Ontology was submitted to the Crop Ontology by the University of British Columbia, Canada. A February webinar addressed ontology collaborations for harmonizing breeding data as part of the Ontologies CoP series (recording available here). Speakers: Elizabeth Arnaud, Alliance Bioversity-CIAT; Corina de Luna Habito, IBP; and Lukas Mueller, Boyce Thompson Institute.

The 2020 work plan of the Ontology Working Group includes the following: 
  • Collaborate with FAO AIMS for integrating CGIAR concepts into AGROVOC
  • Create an interest group on ontologies for unmanned aerial vehicles (UAVs)
  • Create an ontology for household surveys
  • Increase the use of quality ontologies for data annotation in CGIAR repositories
  • Organize PhenoHarmonIS 2020 at ICRISAT, India, in November
  • Promote the Ontologies CoP to the Excellence in Breeding Platform 
  • Organize sessions at the (likely) virtual Big Data Platform Convention 2020

From Marie-Angélique Laporte, chairing the Metadata WG
Changes were made to CG Core v.2.0 in December-January in response to requests from the CGSpace community. The 2020 work plan of the Metadata Working Group was developed, and includes activities to:
  • Define CG Core extensions to be used on specific types of information products, to further enhance the interoperability of systems
  • Create a sub-group to build capacities and share experiences on the implementation of the CG Core metadata schema in the different institutional repositories
  • Create a sub-group to coordinate with the CLARISA team (https://clarisa.cgiar.org/) to publish controlled lists of terms relevant to CGIAR research that can be used to standardize metadata elements. This group will also help identify gaps in information needed during the reporting phase and add them to the CG Core as recommendations. 
 
From Nilam Prasai and Leroy Mwanzia, co-chairing the Dataverse WG
The 2020 work plan of the Dataverse Working Group includes efforts to:
  • Address issues and gaps in implementing the CG Core Metadata Schema in data repositories 
  • Coordinate with repository software providers to implement file-level metadata, an important component for interoperability across datasets
  • Work with repository software providers to implement the ontology look-up service
  • Coordinate with repository providers to develop features requested by members 
  • Address emerging common issues regarding repository and data management

From Leroy Mwanzia, chairing the (new!) Globus WG
This working group seeks to build a community that uses the APIs, tools, and services provided by Globus across the research life cycle for more effective and efficient data management. Globus is a non-profit service managed by the University of Chicago that provides unified access to research data across all systems (high-performance computing cluster, laptop, in-cloud or on-premise storage) using any existing identity. Globus allows researchers to efficiently, securely, and reliably transfer data directly between systems separated by an office wall or an ocean. Please contact Leroy (l.mwanzia@cgiar.org) if you would like to be part of this group.

The 2020 work plan of the Globus Working Group includes activities to:

  • Conduct trainings and webinars on Globus for new and seasoned users and technical implementers (server-side maintenance, API and SDK integrators)
  • Share lessons and address issues in deploying and integrating Globus with research applications
  • Discuss and recommend options to use Globus to transfer and share data with personally identifiable information

What's coming up
Upcoming webinars - stay tuned!

These webinars are intended to facilitate exchange among our members and enhance capacity. Please forward widely as relevant to communities/researchers at your Center.

April 7: Ontology CoP webinar: The Agronomy Ontology and AgroFIMS - Harmonizing Agronomic Data Collection and Annotation (15:00 CEST; GoToMeeting Link)
April 16: Globus: Simplifying Research Data Management via SaaS (15:00 CEST)
May 19: Overview of Collaborative GARDIAN (CG) Labs (15:00 CEST)

Please email Michelle (m.fotsy@cgiar.org) if you have suggestions for topics of interest, or would like to present on something you’re doing at your Center that may be of interest to all of us – in which case a tentative topic/title and rough date would be helpful. We would love to hear from you!


Past webinars

Click on the links below to access recordings and presentations. Please note that some events are hosted on the CGIAR Big Data course platform and are internal to CGIAR, to allow open learning and conversation within CGIAR specifically. To access those, you will need to register at https://bigdata-cgiar.course.tc.

Feb 24 & 26: COPO User Workshops - Implementation of CG Core
March 26: Publishing Scientific Datasets - the Alliance Bioversity-CIAT Example

Information on past webinars is available via Meetings and Workshops in the OA-OD Support Pack.

 

Useful resources

Questions/Concerns? Here's how you can get in touch:
Medha Devare m.devare@cgiar.org
Michelle Fotsy m.fotsy@cgiar.org
Copyright © 2020 CGIAR Platform for Big Data in Agriculture, All rights reserved.


Want to change how you receive these emails?
You can update your preferences or unsubscribe from this list.
