Please use this identifier to cite or link to this item: http://hdl.handle.net/10261/145832



Analysis of plant pan-genomes and transcriptomes with GET_HOMOLOGUES-EST, a clustering solution for sequences of the same species

Authors: Contreras-Moreira, Bruno; Pérez Cantalapiedra, Carlos; García-Pereira, María J.; Gordon, Sean P.; Vogel, John P.; Igartua Arregui, Ernesto; Casas Cendoya, Ana María; Vinuesa, Pablo
Keywords: comparative genomics; accessory genome; Arabidopsis thaliana
Publication date: February 2017
Publisher: Frontiers Media
Citation: Contreras-Moreira B, Cantalapiedra CP, García-Pereira MJ, Gordon SP, Vogel JP, Igartua E, Casas AM, Vinuesa P. Analysis of plant pan-genomes and transcriptomes with GET_HOMOLOGUES-EST, a clustering solution for sequences of the same species. Frontiers in Plant Science 8: 184 (2017)
Abstract: The pan-genome of a species is defined as the union of all the genes and non-coding sequences found in all its individuals. However, constructing a pan-genome for plants with large genomes is daunting both in sequencing cost and the scale of the required computational analysis. A more affordable alternative is to focus on the genic repertoire by using transcriptomic data. Here, the software GET_HOMOLOGUES-EST was benchmarked with genomic and RNA-seq data of 19 Arabidopsis thaliana ecotypes and then applied to the analysis of transcripts from 16 Hordeum vulgare genotypes. The goal was to sample their pan-genomes and classify sequences as core, if detected in all accessions, or accessory, when absent in some of them. The resulting sequence clusters were used to simulate pan-genome growth, and to compile Average Nucleotide Identity matrices that summarize intra-species variation. Although transcripts were found to under-estimate pan-genome size by at least 10%, we concluded that clusters of expressed sequences can recapitulate phylogeny and reproduce two properties observed in A. thaliana gene models: accessory loci show lower expression and higher non-synonymous substitution rates than core genes. Finally, accessory sequences were observed to preferentially encode transposon components in both species, plus disease resistance genes in cultivated barleys, and a variety of protein domains from other families that appear frequently associated with presence/absence variation in the literature. These results demonstrate that pan-genome analyses are useful to explore germplasm diversity.
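The core/accessory classification and the pan-genome growth simulation described in the abstract can be illustrated with a minimal Python sketch. It does not use GET_HOMOLOGUES-EST or reproduce its algorithm; the cluster names, accession names, and both function names are hypothetical, and the input is assumed to be a simple mapping from each sequence cluster to the set of accessions in which it was detected.

```python
def classify_clusters(presence, accessions):
    """Split clusters into core (detected in all accessions) and accessory.

    presence: dict mapping a cluster id to the set of accessions where it
    was detected (toy input, not a GET_HOMOLOGUES-EST output format).
    """
    all_accessions = set(accessions)
    core = {c for c, found in presence.items() if found >= all_accessions}
    accessory = set(presence) - core
    return core, accessory


def pan_genome_growth(presence, order):
    """Pan-genome and core sizes as accessions are added in the given order."""
    pan_sizes, core_sizes = [], []
    seen = set()
    for accession in order:
        seen.add(accession)
        # Pan-genome: clusters detected in at least one accession seen so far.
        pan_sizes.append(sum(1 for found in presence.values() if found & seen))
        # Core: clusters detected in every accession seen so far.
        core_sizes.append(sum(1 for found in presence.values() if seen <= found))
    return pan_sizes, core_sizes


# Hypothetical toy data: four clusters across three accessions.
presence = {
    "cluster1": {"A", "B", "C"},
    "cluster2": {"A", "B"},
    "cluster3": {"C"},
    "cluster4": {"A", "B", "C"},
}
accessions = ["A", "B", "C"]

core, accessory = classify_clusters(presence, accessions)
pan_sizes, core_sizes = pan_genome_growth(presence, accessions)
print(sorted(core), sorted(accessory))
print(pan_sizes, core_sizes)
```

In practice such a growth curve depends on the order in which accessions are added, so a simulation would typically average over many random permutations of the order; the sketch evaluates a single order only.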
Description: 16 pages, 4 tables, 5 figures. Supplementary material is available online. This document is protected by copyright and was first published by Frontiers. All rights reserved. It is reproduced with permission.
Publisher's version: https://doi.org/10.3389/fpls.2017.00184
Appears in collections: (EEAD) Artículos
Files in this item:
File: Contreras-MoreiraB_FrontPlantSci_2017.pdf; Size: 2.67 MB; Format: Adobe PDF

NOTE: Items in Digital.CSIC are protected by copyright, with all rights reserved, unless otherwise indicated.