AI-Assisted Drug Discovery DB: Details
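This DB provides information on the original data from the paper in question and on preprocessed data from other papers that have used it.
Rights to the original work remain with the respective researchers and institutions.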
Harmful cyanobacterial blooms, which frequently contain toxic secondary metabolites, are reported in aquatic environments around the world. More than two thousand cyanobacterial secondary metabolites have been reported from diverse sources over the past fifty years. A comprehensive, publicly accessible database detailing these secondary metabolites would facilitate research into their occurrence, functions and toxicological risks. To address this need, we created CyanoMetDB, a highly curated, flat-file, openly accessible database of cyanobacterial secondary metabolites collated from 850 peer-reviewed articles published between 1967 and 2020. CyanoMetDB contains 2010 cyanobacterial metabolites and 99 structurally related compounds. This nearly doubles the number of entries with complete literature metadata and structural composition information compared to previously available open-access databases. The dataset includes microcystins, cyanopeptolins, other depsipeptides, anabaenopeptins, microginins, aeruginosins, cyclamides, cryptophycins, saxitoxins, spumigins, microviridins, and anatoxins, among other metabolite classes. A comprehensive database dedicated to cyanobacterial secondary metabolites facilitates: (1) the detection and dereplication of known cyanobacterial toxins and secondary metabolites; (2) the identification of novel natural products from cyanobacteria; (3) research on the biosynthesis of cyanobacterial secondary metabolites, including substructure searches; and (4) the investigation of their abundance, persistence, and toxicity in natural environments.
Open Access List | Entries | Microcystins | Mol. formulae & primary references | Structural codes (e.g., SMILES) |
---|---|---|---|---|
CyanoMetMass1 | 852 | 35 | 499 | 0 |
The Natural Products Atlas2 | 1006 | 27 | 1006 | 1006 |
Microcystins_Miles3 | 296 | 286 | 286 | 0 |
CyanoMetDB4 | 2010 | 310 | 2004 | 2009 (1691 isomeric SMILES) |
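
The counts in the table can be used to check the coverage statements made in the abstract. The following is a minimal sketch rather than part of the original work: the dictionary layout and field names are illustrative, the numbers are copied from the table, and the trailing digits on the database names are assumed to be footnote markers and are dropped.

```python
# Counts copied from the comparison table; the field names are illustrative.
databases = {
    "CyanoMetMass": {"entries": 852, "formula_and_reference": 499, "structural_codes": 0},
    "The Natural Products Atlas": {"entries": 1006, "formula_and_reference": 1006, "structural_codes": 1006},
    "Microcystins_Miles": {"entries": 296, "formula_and_reference": 286, "structural_codes": 0},
    "CyanoMetDB": {"entries": 2010, "formula_and_reference": 2004, "structural_codes": 2009},
}


def coverage(name, field):
    """Fraction of a database's entries that carry the given annotation."""
    row = databases[name]
    return row[field] / row["entries"]


def ratio_to_best_other(field, target="CyanoMetDB"):
    """Ratio of the target's count to the largest count among the other databases."""
    best_other = max(row[field] for name, row in databases.items() if name != target)
    return databases[target][field] / best_other


for name in databases:
    print(f"{name}: {coverage(name, 'structural_codes'):.1%} of entries have structural codes")
print(f"CyanoMetDB vs. best other source: {ratio_to_best_other('formula_and_reference'):.2f}x")
```

The last line compares 2004 annotated CyanoMetDB entries with the 1006 in The Natural Products Atlas, which is the comparison that "nearly doubles" appears to refer to.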
CyanoMetDB was established by consolidating multiple disparate sources of information on cyanobacterial secondary metabolites, including in-house libraries of the CyanoMetDB curation-team members and various open-access databases (Table 1). CyanoMetDB was then extended to include additional cyanobacterial secondary metabolites reported in the scientific literature. For each compound, the primary literature metadata were manually verified and, where required, corrected. The sample type from which a compound was extracted and identified (e.g., genus/species/strain of cyanobacterium, or field sample) was recorded, as was whether nuclear magnetic resonance spectroscopy was used for its structure elucidation.
Furthermore, the 2D chemical structure of each compound was manually drawn (ChemDraw, ChemDraw Professional, ACD/ChemSketch) based on the information provided in the primary references. Structural identifiers were then generated from these drawings: the simplified molecular input line entry system (SMILES) string, the International Union of Pure and Applied Chemistry (IUPAC) International Chemical Identifier (InChI), the InChIKey, and the IUPAC name. In some instances, this led to the correction of structures originally reported in one of the consolidated data sources. For example, aeruginosin 101 and aeruginosin 98C both contain a D-allo-Ile in their structure but were previously misreported as L-allo-Ile derivatives in the primary literature (Ishida et al., 1999).
Entries extracted from PubChem were carefully checked against the structure given in the primary literature. Where discrepancies were observed between structures reported in the primary literature and those found in PubChem, we report the information from the primary literature in CyanoMetDB. In the example above, PubChem (accessed 8 September 2020) misreported aeruginosin 101 as its L-allo-Ile derivative but reported aeruginosin 98C correctly.
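The reconciliation rule described above (flag disagreements with an external source, keep the curated value) can be sketched as follows. This is an illustration under stated assumptions, not the curation team's actual tooling: the function name `reconcile` and the use of a plain residue label as the compared descriptor are hypothetical, and the two example records restate only what the text reports about the aeruginosins.

```python
def reconcile(curated, external):
    """Return (resolved records, list of discrepancies).

    Both arguments map a compound name to a descriptor string. When the two
    sources disagree, the curated value is kept and the disagreement is logged.
    """
    resolved = {}
    discrepancies = []
    for name, curated_value in curated.items():
        external_value = external.get(name)
        if external_value is not None and external_value != curated_value:
            discrepancies.append((name, external_value, curated_value))
        resolved[name] = curated_value
    return resolved, discrepancies


# Ile configuration as described in the text; the dictionary layout is hypothetical.
curated_records = {"aeruginosin 101": "D-allo-Ile", "aeruginosin 98C": "D-allo-Ile"}
pubchem_records = {"aeruginosin 101": "L-allo-Ile", "aeruginosin 98C": "D-allo-Ile"}

resolved, flagged = reconcile(curated_records, pubchem_records)
for name, external_value, kept_value in flagged:
    print(f"{name}: external source has {external_value}, keeping {kept_value}")
```

Plain string comparison is adequate for short labels like these. It would not be reliable for SMILES strings, because different strings can encode the same structure; comparing structures would require canonicalization with a cheminformatics toolkit first.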
Some issues emerged during this process. For example, olefin stereochemistry was not always represented correctly when a compound's structure was drawn in ChemDraw from an InChI code. For this reason, we recommend using the SMILES codes from CyanoMetDB for an accurate representation of a compound's 2D structure.
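In SMILES, double-bond geometry is written with the directional bond symbols `/` and `\`, and tetrahedral centres with `@` or `@@`. A quick text-level check can therefore show whether a given SMILES string carries stereo information at all. The sketch below is a heuristic under the assumption that the input is already valid SMILES; it is not a parser, it ignores isotope labels, and the function names are hypothetical.

```python
def stereo_features(smiles):
    """Report which kinds of stereo notation appear in a SMILES string.

    Text-level heuristic only: assumes the input is a valid SMILES string,
    in which '@' marks tetrahedral chirality and '/' or '\\' mark the
    arrangement of substituents around a double bond.
    """
    return {
        "tetrahedral": "@" in smiles,
        "double_bond": "/" in smiles or "\\" in smiles,
    }


def has_stereo(smiles):
    """True if the string carries any stereo notation."""
    return any(stereo_features(smiles).values())


# Generic textbook molecules, not entries taken from CyanoMetDB.
examples = {
    "but-2-ene, geometry unspecified": "CC=CC",
    "(E)-but-2-ene": "C/C=C/C",
    "(Z)-but-2-ene": "C/C=C\\C",
}
for label, smiles in examples.items():
    print(f"{label}: {smiles} -> {stereo_features(smiles)}")
```

A check like this only shows that olefin geometry is recorded, not that it is correct; confirming the geometry still means comparing the structure with the primary reference.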
After the initial compilation and expansion of CyanoMetDB, the accuracy of the data was confirmed through multiple rounds of database integrity checks. In each round, members of the CyanoMetDB curation team received subsets of CyanoMetDB compounds, which they evaluated and, if necessary, corrected with respect to the assignment of the primary (and secondary) literature sources and the chemical structural descriptors (SMILES, InChI, InChIKey and IUPAC name).
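Part of such a review round can be supported by simple automated checks that run before manual evaluation. The sketch below is only an assumption about how this might look, not the procedure the curation team used: the field names, the round-robin split, and the helper names are hypothetical. The InChIKey test relies on the identifier's fixed layout of 14, 10 and 1 uppercase letters separated by hyphens.

```python
import re

# An InChIKey has three hyphen-separated blocks of 14, 10 and 1 uppercase letters.
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")
# Hypothetical column names for a flat-file record.
REQUIRED_FIELDS = ("name", "smiles", "inchi", "inchikey", "reference")


def check_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not (record.get(field) or "").strip():
            problems.append(f"missing {field}")
    inchi = record.get("inchi") or ""
    if inchi and not inchi.startswith("InChI="):
        problems.append("InChI does not start with 'InChI='")
    key = record.get("inchikey") or ""
    if key and not INCHIKEY_PATTERN.fullmatch(key):
        problems.append("InChIKey is not in the 14-10-1 block format")
    return problems


def split_into_subsets(records, n_curators):
    """Deal records out round-robin so each curator gets a similar share."""
    return [records[i::n_curators] for i in range(n_curators)]


# Ethanol is used as a generic placeholder; it is not a CyanoMetDB entry.
good = {
    "name": "ethanol",
    "smiles": "CCO",
    "inchi": "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
    "inchikey": "LFQSCWFLJHTTHZ-UHFFFAOYSA-N",
    "reference": "placeholder",
}
truncated = dict(good, inchikey="LFQSCWFLJHTTHZ-UHFFFAOYSA")

for record in (good, truncated):
    print(record["inchikey"], check_record(record))
```

Format checks of this kind catch truncated or mistyped identifiers but say nothing about whether the structure itself matches the literature, which is why the manual rounds described above remain the core of the process.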