Newly Released Report Explores FAIR and Workflows

In July 2021, Oak Ridge National Laboratory hosted the second annual Computational and Autonomous Workflows (CAW) workshop. Held virtually over two half-days, the workshop brought together 62 ORNL scientists and engineers to discuss how the FAIR principles (Findable, Accessible, Interoperable, Reusable) apply to computational and autonomous workflows. Topics included FAIR metrics, workflow lifecycles, interoperability, reusability, and how streaming workflows fit into FAIR.

The entire report can be found here. Some highlights of the report’s findings are as follows:

In the discussion of FAIR metrics, participants remarked that “metrics” sounded to them more like a grade than a measure of the degree to which something is FAIR. They suggested a “FAIR fingerprint” instead: a visual representation of how similar or different objects are with respect to various FAIR characteristics.
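One way to picture the fingerprint idea is as a per-characteristic profile that is compared dimension by dimension, rather than collapsed into a single grade. The sketch below is only an illustration of that reading: the 0 to 10 scale, the example scores, and the function names are assumptions made here, not something defined in the report.

```python
# Illustrative sketch only: the scale, scores, and names are hypothetical.
PRINCIPLES = ("findable", "accessible", "interoperable", "reusable")


def render_fingerprint(scores):
    """Draw one text bar per FAIR characteristic (assumed 0-10 scale)."""
    return "\n".join(f"{p:<14}{'#' * scores[p]}" for p in PRINCIPLES)


def fingerprint_difference(a, b):
    """Per-characteristic difference between two objects, not a single grade."""
    return {p: abs(a[p] - b[p]) for p in PRINCIPLES}


# Made-up scores for two hypothetical workflows.
workflow_a = {"findable": 8, "accessible": 6, "interoperable": 3, "reusable": 5}
workflow_b = {"findable": 8, "accessible": 4, "interoperable": 7, "reusable": 5}

print(render_fingerprint(workflow_a))
print(fingerprint_difference(workflow_a, workflow_b))
```

The comparison keeps each characteristic separate, so two objects can be seen to match on findability and reusability while differing on interoperability, which a single overall score would hide.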

Reusability was a hot topic of discussion, as workflows are defined by how they are used. Dataset reuse is often reduced to questions about licensing and allowable or auditable reuse, but, by the very nature of workflows, this is insufficient for the intentions of FAIR. A workflow connects input datasets, computational components, and potentially large numbers of intermediate datasets and routing decisions, so more than a license is needed to understand what “reuse” means for a workflow.

Streaming was also a recurring topic, both in terms of what actually constitutes a streaming workflow (the nature of streams varies considerably) and in terms of how the qualitative and quantitative results of a streaming system are tied to the raw input and to the specific ordering and time of insertion.
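The dependence on ordering can be seen in a small example. The sketch below is a minimal illustration, not taken from the report: it assumes a hypothetical stream of three readings and uses an exponential moving average as a stand-in for any stateful streaming computation.

```python
# Minimal sketch: the readings and the choice of an exponential moving
# average are assumptions used to illustrate order dependence.
def running_ema(stream, alpha=0.5):
    """Exponential moving average over a stream; later items weigh more."""
    avg = None
    for x in stream:
        avg = x if avg is None else alpha * x + (1 - alpha) * avg
    return avg


readings = [1.0, 2.0, 4.0]

in_order = running_ema(readings)            # items arrive as 1.0, 2.0, 4.0
reordered = running_ema(reversed(readings)) # same items arrive as 4.0, 2.0, 1.0

print(in_order, reordered)  # prints: 2.75 2.0
```

The same three values produce different results depending on arrival order, which is why recording the input dataset alone may not be enough to reproduce a streaming workflow's output.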

Finally, the “R” in FAIR generated a lot of confusion for attendees, and the overall consensus was that it needs to be clarified for workflows. Ideally, a workflow should be reusable with different datasets, but that may not always be possible: some workflows change with different datasets, while others remain unchanged. The community also remarked that a predefined set of workflow patterns (single-process runs, parallel MPI runs, ensemble executions, and so on) would be useful, along with common execution backends that can provide implementations of those patterns and are usable across different domains.

Overall, the community pointed to a need for common abstractions over technologies, so that workflows are independent of any specific technology. The next CAW meeting will take place at ORNL later this year, in September.
