Please use this identifier to cite or link to this item: http://hdl.handle.net/10261/162254
Title

Unmasking new intra-species diversity through K-mer count analysis

Authors: Pérez Cantalapiedra, Carlos (CSIC); Contreras-Moreira, Bruno (CSIC); Casas Cendoya, Ana María (CSIC); Igartua Arregui, Ernesto (CSIC)
Keywords: Copy Number Variations (CNV)
Gene families
Genotyping
Barley
Presence-Absence Variation
Sequencing Plant Genomics
NBS-LRR
K-mer Analysis
Pentatricopeptide
Pangenomics
Exome Capture
Publication date: Mar-2018
Citation: EUCARPIA Cereal Section / IWW2 Meetings (Polydome, Clermont-Ferrand, France, 19-22 March 2018)
Abstract: High-throughput sequencing is often used to examine intra-species diversity. Most studies focus on calling and genotyping SNPs. Other kinds of genomic variation, such as copy-number variation (CNV), are exploited more rarely, despite literature reports linking them to phenotypic differences. For some loci, it is difficult to identify reliable SNPs. For instance, reads from closely related sequences (e.g. paralogous genes) will often map stacked onto the same location if some of those loci are absent from the reference sequence. Such piled-up mappings produce abundant spurious heterozygous SNPs, and have therefore been called apparent heterozygous mappings (AHMs). To avoid wrong conclusions from false-positive calls, SNPs from AHMs are often discarded, either early (e.g. in samples expected to be homozygous) or in downstream steps of the analysis (e.g. when incoherent haplotype blocks are identified). This leads to information loss at certain loci. AHMs can be seen as a kind of CNV that is specific to non-identical copies. Unmasking such variation could help to i) assess the completeness of a genome or pan-genome reference, ii) confirm results from other CNV genotyping methods when the copies originate from non-identical loci, iii) provide hints about the history and behavior of duplicating DNA loci, and iv) reveal novel intra-species genetic diversity. Here we present a software pipeline, kmeleon, available at https://github.com/eead-csic-compbio/kmeleon, designed to identify regions harboring AHMs. kmeleon is based on mappings, and thus can be used for both homozygous and heterozygous samples. First, the different k-mers (sequences of length k) mapping to a single locus are identified and counted. Then, loci are classified based on the presence or absence of AHMs. From the resulting intervals, it is straightforward to compare genotypes or to transfer existing annotation to the regions with AHMs.
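The two steps described above (counting the distinct k-mers that map to each locus, then classifying loci) can be illustrated with a minimal Python sketch. This is not kmeleon's implementation. It assumes ungapped alignments supplied as hypothetical `(start, sequence)` pairs, and the names `distinct_kmers_per_position`, `flag_intervals`, `min_support`, and `max_expected` are introduced here for illustration only. The assumed classification rule is that a position is flagged when it carries more distinct, well-supported k-mers than the sample's zygosity would predict (one for a homozygous sample).

```python
from collections import defaultdict


def distinct_kmers_per_position(reads, k, min_support=1):
    """Count distinct k-mers starting at each reference position.

    reads: iterable of (start, seq) pairs, assumed to be ungapped
           alignments on one reference sequence (illustrative input).
    min_support: a k-mer must be seen at least this many times at a
           position to be counted (assumed filter for sequencing errors).
    """
    seen = defaultdict(lambda: defaultdict(int))
    for start, seq in reads:
        for offset in range(len(seq) - k + 1):
            seen[start + offset][seq[offset:offset + k]] += 1
    return {
        pos: sum(1 for n in kmers.values() if n >= min_support)
        for pos, kmers in seen.items()
    }


def flag_intervals(kmer_counts, max_expected=1):
    """Merge consecutive positions whose distinct k-mer count exceeds
    max_expected into closed (start, end) intervals."""
    flagged = sorted(p for p, n in kmer_counts.items() if n > max_expected)
    intervals = []
    for pos in flagged:
        if intervals and pos == intervals[-1][1] + 1:
            intervals[-1][1] = pos
        else:
            intervals.append([pos, pos])
    return [tuple(iv) for iv in intervals]


# Toy input: two read types stacked on the same locus, differing at one base.
toy_reads = [(0, "ACGTAC"), (0, "ACGTAC"), (0, "ACTTAC"), (0, "ACTTAC")]
toy_counts = distinct_kmers_per_position(toy_reads, k=3, min_support=2)
toy_intervals = flag_intervals(toy_counts, max_expected=1)
```

In this toy input, the three k-mer start positions that overlap the differing base each carry two distinct k-mers, so they merge into a single flagged interval. For a heterozygous diploid sample, `max_expected=2` would be the analogous assumption.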
We used exome capture data to detect AHMs in a set of barley accessions. We included the cultivar Morex, the genotype of the reference genome, as a control sample. As expected, it had the lowest number of AHMs, although some were still detectable. For all accessions, AHMs were found in both intergenic and intragenic loci. Enrichment analysis showed that NBS-LRR proteins were overrepresented at AHMs, whereas PPR proteins were depleted. We will also show that AHMs can be used to infer phylogenetic trees that are congruent with those produced by SNP-based approaches, supporting the value of this hidden variability for describing genetic relationships.
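The abstract does not say how trees are inferred from AHMs. One plausible route, sketched below purely as an assumption, is to treat each accession as a set of loci with AHMs and compute pairwise Jaccard distances, which could then be passed to any distance-based tree method. The accession labels and locus identifiers are hypothetical placeholders, not data from the study.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two sets; 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(ahm_loci):
    """ahm_loci: dict mapping accession name -> set of locus identifiers
    where an AHM was detected. Returns {(name1, name2): distance} for
    every pair, with names in sorted order."""
    return {
        (x, y): jaccard_distance(ahm_loci[x], ahm_loci[y])
        for x, y in combinations(sorted(ahm_loci), 2)
    }


# Hypothetical presence/absence calls for three placeholder accessions.
toy_ahm_loci = {
    "acc_A": {"locus1", "locus2", "locus3"},
    "acc_B": {"locus2", "locus3", "locus4"},
    "acc_C": {"locus9"},
}
toy_dist = pairwise_distances(toy_ahm_loci)
```

Here `acc_A` and `acc_B` share two of four loci and are therefore closer to each other than either is to `acc_C`, which shares none.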
Description: 1 .pdf copy (3 figs.) of the authors' original poster. Creative Commons License Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
URI: http://hdl.handle.net/10261/162254
Appears in collections: (EEAD) Comunicaciones congresos




Files in this item:
File | Description | Size | Format
CantalapiedraCP_EUCARPIA-Post_2018.pdf | | 1.5 MB | Adobe PDF

This item is licensed under a Creative Commons License.