Welcome to Chemistry Toolkit Rosetta Wiki
The Chemistry Toolkit Rosetta is a wiki for sharing how to use different chemistry toolkits for the same set of common tasks. The main focus is on chemical informatics, with toolkits that handle molecular structures, depiction, databases, property analysis, nomenclature, and the like. This includes 3D structure visualization especially as applied to docking and small molecule conformation generation, but the visualization should support bond types. In the future we may add other chemistry tasks, like MMFF energy evaluation, molecular dynamics, spectra analysis, x-ray crystallography, quantum mechanics and more, but don't hold your breath unless you want to help define and evaluate the tasks!
What's it all about?
Chemical informatics toolkits share many core functions, but often use different approaches. The goal of this project is to help people compare and contrast toolkit functionality. You might know how to use RDKit and want to see how CDK does something, or you have the choice between OpenBabel and OEChem and you want to see which is easier to use or better fits your needs. Perhaps you are a Pipeline Pilot user and want to show how to implement a task via a graphical workflow rather than a text program.
You should feel free to contribute code solutions using any toolkit, or set of toolkits, that solves the problem. Please include comments if a given entry is designed for simplicity, performance, maintainability, or some other criterion.
This is not meant as a cookbook site for any one toolkit. For example, a page on how to generate OpenBabel FP3 fingerprints isn't going to be that useful to users of other toolkits, or that fair of a comparison since it's not a widely used fingerprint and OpenBabel is the only one which has it as a built-in feature.
The inspiration for this wiki comes from Rosetta Code and PLEAC, and the term "Rosetta" in turn comes from the Rosetta Stone. Many of the initial examples and code are based on Noel O'Boyle's work, such as where he shows [different ways to manipulate SD files with Open Source tools] and his [Cinfony package] which "presents a common API to ... OpenBabel, RDKit and the CDK."
The Code
- Read an SD file and list the heavy atom counts
- Read a SMILES file and list the number of rings
- Convert a SMILES string to canonical SMILES
- Working with SD tag data
- Detect and report SMILES and SDF parsing errors
- Report how many SD file records are within a certain molecular weight range
- Convert SMILES file to SD file
- Report the similarity between two structures
- Find the 10 nearest neighbors in a data set
- Depict a compound as an image
- Highlight a substructure in the depiction
- Align the depiction using a fixed substructure
- Unique SMARTS matches against a SMILES string
- Calculate TPSA
- Find the graph diameter
- Break rotatable bonds and report the fragments
- Perform a substructure search on SDF file and report the number of false positives
- Change stereochemistry of certain atoms in SMILES file
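To give a feel for the size of these tasks, here is a minimal toolkit-free sketch of the first one, listing heavy atom counts from an SD file. It is not taken from any of the task pages. It assumes V2000 records, treats every atom whose symbol is not H, D or T as heavy, and does no error handling. The function names and the `make_record` helper that builds the demo data are hypothetical, introduced only for illustration; a real entry would use a toolkit's own reader.

```python
def iter_sd_records(lines):
    """Yield each SD record as a list of lines, split on the '$$$$' delimiter."""
    record = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.strip() == "$$$$":
            yield record
            record = []
        else:
            record.append(line)


def heavy_atom_count(record):
    """Count non-hydrogen atoms in one V2000 record (assumption: no V3000)."""
    num_atoms = int(record[3][0:3])        # counts line: atom count in columns 1-3
    atom_block = record[4:4 + num_atoms]   # atom block follows the counts line
    symbols = (line[31:34].strip() for line in atom_block)  # symbol in columns 32-34
    return sum(1 for symbol in symbols if symbol not in ("H", "D", "T"))


def heavy_atom_counts(lines):
    """Return a list of (title, heavy atom count) pairs, one per record."""
    return [(record[0].strip(), heavy_atom_count(record))
            for record in iter_sd_records(lines)]


def make_record(title, symbols, bonds):
    """Hypothetical helper: build a minimal V2000 record with dummy coordinates."""
    lines = [title, "  demo", "",
             f"{len(symbols):3d}{len(bonds):3d}  0  0  0  0  0  0  0  0999 V2000"]
    lines += [f"{0.0:10.4f}{0.0:10.4f}{0.0:10.4f} {s:<3} 0  0  0  0  0  0  0  0  0  0  0  0"
              for s in symbols]
    lines += [f"{a:3d}{b:3d}{order:3d}  0" for a, b, order in bonds]
    lines += ["M  END", "$$$$"]
    return lines


# Demo data: methanol and water with explicit hydrogens.
SAMPLE = (
    make_record("methanol", ["C", "O", "H", "H", "H", "H"],
                [(1, 2, 1), (1, 3, 1), (1, 4, 1), (1, 5, 1), (2, 6, 1)])
    + make_record("water", ["O", "H", "H"], [(1, 2, 1), (1, 3, 1)])
)

for title, count in heavy_atom_counts(SAMPLE):
    print(title, count)
```

To run it on a real file, pass an open file handle instead of `SAMPLE`, for example `heavy_atom_counts(open("input.sdf"))`, where `input.sdf` is a placeholder name. Toolkit-based entries will usually be shorter and will also cope with format variations that this sketch ignores.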