Please use this identifier to cite or link to this item:
http://hdl.handle.net/10261/3458
Title: | Discovering semantic features in the literature: a foundation for building functional associations |
Authors: | Chagoyen, Mónica; Carmona-Sáez, Pedro; Shatkay, Hagit; Carazo, José M.; Pascual-Montano, Alberto |
Publication date: | 26 Jan 2006 |
Publisher: | BioMed Central |
Citation: | BMC Bioinformatics 2006, 7:41 |
Abstract: | [Background] Experimental techniques such as DNA microarray, serial analysis of gene expression
(SAGE) and mass spectrometry proteomics, among others, are generating large amounts of data
related to genes and proteins at different levels. As in any other experimental approach, it is
necessary to analyze these data in the context of previously known information about the biological
entities under study. The literature is a particularly valuable source of information for experiment
validation and interpretation. Therefore, the development of automated text mining tools to assist
in such interpretation is one of the main challenges in current bioinformatics research.
[Results] We present a method to create literature profiles for large sets of genes or proteins based on common semantic features extracted from a corpus of relevant documents. These profiles can be used to establish pair-wise similarities among genes, used in gene/protein classification, or even combined with experimental measurements. Researchers can use the semantic features to better understand the commonalities indicated by experimental results. Our approach is based on non-negative matrix factorization (NMF), a machine-learning algorithm for data analysis that is capable of identifying local patterns characterizing a subset of the data. The literature is thus used to establish putative relationships among subsets of genes or proteins and to provide coherent justification for this clustering into subsets. We demonstrate the utility of the method by applying it to two independent and vastly different sets of genes.
[Conclusion] The presented method can create literature profiles from documents relevant to sets of genes. The representation of genes as additive linear combinations of semantic features allows for the exploration of functional associations as well as for clustering, suggesting a valuable methodology for the validation and interpretation of high-throughput experimental data. |
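The idea summarized in the abstract, genes represented as additive combinations of non-negative semantic features, can be illustrated with a small sketch. This is not the authors' implementation: the term-by-gene matrix, the term and gene names, and the choice of Lee and Seung multiplicative updates for the Euclidean objective are all assumptions made for illustration, and the paper may use a different NMF variant and preprocessing.

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate non-negative V (terms x genes) as W @ H.

    Uses multiplicative updates for the squared-error objective
    (an assumed variant, not necessarily the one used in the paper).
    """
    rng = np.random.default_rng(seed)
    n_terms, n_genes = V.shape
    W = rng.random((n_terms, k)) + eps   # terms x semantic features
    H = rng.random((k, n_genes)) + eps   # semantic features x genes
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cosine_similarity(H):
    """Pairwise cosine similarity between gene profiles (columns of H)."""
    norms = np.linalg.norm(H, axis=0, keepdims=True)
    unit = H / np.maximum(norms, 1e-12)
    return unit.T @ unit

# Hypothetical toy data: term weights per gene, invented for this example.
terms = ["kinase", "phosphorylation", "ribosome", "translation"]
genes = ["geneA", "geneB", "geneC", "geneD"]
V = np.array([
    [5.0, 4.0, 0.0, 0.0],
    [4.0, 5.0, 0.0, 1.0],
    [0.0, 0.0, 6.0, 5.0],
    [0.0, 1.0, 5.0, 6.0],
])

W, H = nmf(V, k=2)          # each column of H is a gene's literature profile
S = cosine_similarity(H)    # gene-by-gene similarity from the profiles

for j, gene in enumerate(genes):
    print(gene, np.round(H[:, j], 2))
```

In this sketch each column of `W` plays the role of a semantic feature (a weighted group of terms), and each column of `H` gives the non-negative weights with which those features combine to describe one gene. The similarity matrix `S` is one possible way to compare genes through their profiles.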
Description: | This article is available from: http://www.biomedcentral.com/1471-2105/7/41 |
URI: | http://hdl.handle.net/10261/3458 |
DOI: | 10.1186/1471-2105-7-41 |
ISSN: | 1471-2105 |
Appears in collections: | (CNB) Articles |
Files in this item:
File | Description | Size | Format |
---|---|---|---|
1471-2105-7-41.pdf | Main file | 1.32 MB | Adobe PDF |
1471-2105-7-41-s1.pdf | Additional file 1 | 576.59 kB | Adobe PDF |
1471-2105-7-41-s2.xls | Additional file 2 | 1.08 MB | Microsoft Excel |
1471-2105-7-41-s3.pdf | Additional file 3 | 31.12 kB | Adobe PDF |
1471-2105-7-41-s4.pdf | Additional file 4 | 23.5 kB | Adobe PDF |
PubMed Central citations: 25 (checked on 11 May 2024)
Scopus citations: 65 (checked on 11 May 2024)
Web of Science citations: 58 (checked on 25 Feb 2024)
Page views: 456 (checked on 19 May 2024)
Downloads: 524 (checked on 19 May 2024)
NOTE: Items in Digital.CSIC are protected by copyright, with all rights reserved, unless otherwise indicated.