################################################################################
#
# Title: DisGeNET-RDF data dump
#
# DisGeNET Version: 3.0
#
# RDF Version: 3.0.0
#
# Date: 08/19/2015
#
################################################################################

DisGeNET-RDF v3.0.0 (2015) - Release notes and bugs
--------------------------------------------------------------------------------

The DisGeNET-RDF dataset is the formal semantic representation of the DisGeNET
database, which integrates gene-disease associations from several public
sources and the literature. DisGeNET-RDF v3.0.0 is the first release of the RDF
distribution of DisGeNET v3.0, and it is available for download as of
August 19, 2015.

DisGeNET-RDF v3.0.0 has new annotations and new linksets:

- More gene-disease associations (GDAs), comprising 17 000 genes and more than
  14 000 diseases, as linked data in the Semantic Web.
- New source: CLINVAR (curated).
- New text-mined gene-disease associations from MEDLINE (BeFree source). The
  new BeFree dataset has more association types classified (GeneticVariation,
  AlteredExpression, PostTranslationalModification) and SNPs identified.
- All linksets updated, i.e. all ontologies updated.
- All gene-disease associations annotated to the Homo sapiens (Human) taxon.
- New Phenotype annotation: disease-phenotype associations from the Human
  Phenotype Ontology project (HPO).
- New disease linksets:
    * NCI
    * Orphanet Rare Disease Ontology (ORDO)
    * DECIPHER
- RDF enhanced:
    * New 303 URIs for DisGeNET gene-disease associations and PANTHER class
      entities.
    * New labels.
    * Primary source evidence better described with the Evidence Code Ontology
      and new preproperties.
    * Name descriptions: predicate replaced by .
- RDF bugs fixed:
    * Fixed formal description of the DisGeNET Score: the Score is described as
      an object property, and not as a datatype property.
    * Fixed formal description of the gene-disease association type 'label'
      from the original source attribute: the Label is described as a datatype
      property by a new predicate: replaced by .
- Interlinking: more mappings to the Linked Open Data cloud.
- A new full metadata description of the dataset, compliant with the W3C HCLS
  and Open PHACTS specifications.

The full description of DisGeNET-RDF v3.0.0 is available in RDF for download as
the 'void.ttl' file, which also contains release statistics.

Additional information regarding DisGeNET is available on the DisGeNET homepage
at http://www.disgenet.org/.

--------------------------------------------------------------------------------
DisGeNET-RDF v3.0.0 (2015) - data dump
--------------------------------------------------------------------------------

The DisGeNET-RDF dataset v3.0.0 and its metadata description are distributed in
fourteen files:

- geneDiseaseAssociation.ttl.tgz: all gene-disease association triples and
  related annotated objects. It is a gzipped tar archive that contains 11
  RDF/Turtle files.
- disease.ttl.gz: all disease triples and related annotated objects.
- gene.ttl.gz: all gene triples and related annotated objects.
- diseaseClass.ttl.gz: all MeSH disease class triples.
- umlsSTY.ttl.gz: all UMLS Semantic Type category triples.
- hpoAnnotation.ttl.gz: HPO disease-phenotype annotation triples.
- geneSymbol.ttl.gz: all HGNC symbol triples.
- protein.ttl.gz: all protein triples.
- pantherClass.ttl.gz: all PANTHER class triples.
- pathway.ttl.gz: all pathway triples.
- phenotype.ttl.gz: all phenotype triples.
- pubmed.ttl.gz: all PubMed article triples.
- snp.ttl.gz: all SNP triples.
- void.ttl.gz: metadata description triples of the DisGeNET-RDF dataset, which
  is W3C HCLS compliant.
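
The files listed above come in two container types: single gzip-compressed
Turtle files (.ttl.gz) and one gzipped tar archive (.ttl.tgz). The following is
a minimal sketch, using only the Python standard library, of how either kind
might be read line by line once downloaded. The function names
iter_turtle_lines and collect_prefixes are hypothetical helpers introduced here
for illustration and are not part of the DisGeNET distribution. The '@prefix'
scan is a simplification that assumes one declaration per line; it is not a
Turtle parser, and a proper RDF library would be needed to load the triples.

```python
import gzip
import tarfile
from typing import Dict, Iterator, Tuple


def iter_turtle_lines(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (source_name, line) pairs from a .ttl.gz file or a .tgz archive."""
    if path.endswith(".tgz"):
        # Gzipped tar archive, e.g. geneDiseaseAssociation.ttl.tgz:
        # walk every regular file stored inside it.
        with tarfile.open(path, mode="r:gz") as archive:
            for member in archive:
                if not member.isfile():
                    continue
                handle = archive.extractfile(member)
                if handle is None:
                    continue
                with handle:
                    for raw in handle:
                        yield member.name, raw.decode("utf-8").rstrip("\n")
    else:
        # Single gzip-compressed Turtle file, e.g. gene.ttl.gz.
        with gzip.open(path, mode="rt", encoding="utf-8") as handle:
            for line in handle:
                yield path, line.rstrip("\n")


def collect_prefixes(path: str) -> Dict[str, str]:
    """Collect '@prefix name: <iri> .' declarations (assumed one per line)."""
    prefixes: Dict[str, str] = {}
    for _, line in iter_turtle_lines(path):
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "@prefix":
            prefixes[parts[1].rstrip(":")] = parts[2].strip("<>")
    return prefixes
```

For example, collect_prefixes("gene.ttl.gz") would return the namespace
prefixes declared in that file, assuming it has been downloaded to the current
working directory.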

Alternatively, the entire dump can be downloaded at once:

- disgenetv3.0-rdf-v3.0.0-dump.tgz

The dump dataset is serialized in RDF/Turtle format. The linksets are embedded
in these files and are also provided as independent files named
'id1-id2-rdflink-ls.ttl.gz'.

This dataset is the RDF representation of DisGeNET version 3.0.

For more information about the RDF dataset, please visit the Web site at:
http://rdf.disgenet.org/

--------------------------------------------------------------------------------
DisGeNET Nanopublications v3.0.0.0 (2015) - data dump
--------------------------------------------------------------------------------

The DisGeNET Nanopublications linked dataset is the nanopublication
representation of the DisGeNET database. The DisGeNET Nanopublications dataset
v3.0.0.0 is the distribution of DisGeNET v3.0 and is distributed as a single
file:

- nanopublications_v3.0.0.0.trig.gz: all nanopublications.

The dump dataset is serialized in RDF/TriG format.

This dataset is the nanopublication representation of DisGeNET version 3.0.

For more information about the nanopublication dataset, please visit the Web
site at: http://rdf.disgenet.org/

--------------------------------------------------------------------------------
Attribution
--------------------------------------------------------------------------------

If you use DisGeNET, you are requested to cite the source articles:

Janet Piñero, Núria Queralt-Rosinach, Àlex Bravo, Jordi Deu-Pons,
Anna Bauer-Mehren, Martin Baron, Ferran Sanz, Laura I Furlong. DisGeNET: a
discovery platform for the dynamical exploration of human diseases and their
genes. Database (2015) Vol. 2015: article ID bav028;
doi:10.1093/database/bav028

Anna Bauer-Mehren, Markus Bundschus, Michael Rautschka, Miguel A. Mayer,
Ferran Sanz, Laura I. Furlong. Gene-disease network analysis reveals functional
modules in mendelian, complex and environmental diseases. PLoS ONE 2011 6(6):
e20284. doi:10.1371/journal.pone.0020284.

Bauer-Mehren A, Rautschka M, Sanz F, Furlong LI. DisGeNET: a Cytoscape plugin
to visualize, integrate, search and analyze gene-disease networks.
Bioinformatics. 2010 Nov 15;26(22):2924-6. Epub 2010 Sep 21.

To cite specific data:

Gene-disease association data were retrieved from the DisGeNET Database,
GRIB/IMIM/UPF Integrative Biomedical Informatics Group, Barcelona.
(http://www.disgenet.org/). [Month, year of data retrieval].

--------------------------------------------------------------------------------
License information
--------------------------------------------------------------------------------

The DisGeNET database is made available under the Open Database License, whose
full text can be found at http://opendatacommons.org/licenses/odbl/1.0/.

Any rights in individual contents of the database are licensed under the
Database Contents License, whose text can be found at
http://opendatacommons.org/licenses/dbcl/1.0/.

If DisGeNET is incorporated into other works, we ask that the DisGeNET IDs are
preserved, and that the release number of DisGeNET is clearly displayed.

For more information on legal notices, see
http://www.disgenet.org/ds/DisGeNET/html/legal.html

--------------------------------------------------------------------------------
Contact us
--------------------------------------------------------------------------------

Integrative Biomedical Informatics Group
Research Unit on Biomedical Informatics - GRIB
Barcelona Biomedical Research Park - PRBB
Dr. Aiguader 88
08003 Barcelona

email: lfurlong(at)imim(dot)es
phone: +34 93 316 0521
fax: +34 93 316 0550
web: http://ibi.imim.es/

--------------------------------------------------------------------------------
Help and troubleshooting
--------------------------------------------------------------------------------

If you have any suggestions, questions or comments about the DisGeNET-RDF
datasets, please do not hesitate to contact us:

Support team
Email: support(at)disgenet(dot)org

© 2010-2015, Integrative Biomedical Informatics Group