EOSC – European Open Science Cloud
This proposal has a two-fold objective: first, to develop base software, and second, to provide a suite of Comparative Genomics software. For the first objective, we will address two issues: automatic and efficient scaling of resources in the cloud infrastructure, and high-throughput data transfer to and from remote infrastructures. This base software will simplify the transformation of a simple local application into a dynamically scalable cloud application. Data transfer throughput will be improved by providing specific network protocols and technologies. Our previous work on this base software has been demonstrated in the framework of the Mr.Symbiomath FP7 project (grant agreement 324554) using an OpenStack cloud, the Galaxy workflow management system and bioinformatics software. Although it has so far been used for bioinformatics use cases, it is applicable to other scientific domains.
For the second objective, this proposal will provide Scientific Demonstrators for heavy computational tasks in the field of Comparative Genomics: software for pairwise and multiple genome comparisons, and for metagenomics studies. Such applications perform a high number of I/O operations, work with large data sets, and are composed of a collection of processes, and are therefore suitable use cases for an auto-scalable infrastructure. At present, Comparative Genomics is in the spotlight, with thousands of different organisms already available and a great number of ongoing sequencing projects. Computationally efficient techniques are required to study relationships within this large amount of data.
The field of metagenomics is gaining popularity. One of the aims of metagenomics studies is to determine the species present in an environmental community and to identify changes in species abundance under different conditions. Metagenomic analysis software faces bottlenecks due to the high computational load required to analyse complex samples, which typically requires the use of HPC resources.
- Krieger, M. T., Torreno, O., Trelles, O., & Kranzlmüller, D. (2016). Building an open source cloud environment with auto-scaling resources for executing bioinformatics and biomedical workflows. Future Generation Computer Systems.
- Torreño, Ó., & Trelles, O. (2015). Breaking computational barriers in pairwise genome comparison. BMC Bioinformatics, 16, 250. DOI: 10.1186/s12859-015-0679-9.
- Pérez-Wohlfeil, E., Arjona-Medina, J. A., Torreno, O., Ulzurrun, E., & Trelles, O. (2016). Computational workflow for the fine-grained analysis of metagenomic samples. BMC Genomics, 17(8), 351.