Revision as of 02:00, 9 May 2011
Ertl, Rohde, and Selzer (J. Med. Chem., 43:3714-3717, 2000) published an algorithm for fast calculation of molecular polar surface area (PSA). Part of it involves summing partial surface values based on fragment contributions. Each fragment corresponds to a SMARTS match.
The goal of this task is to get an idea of how to do a set of SMARTS matches when the data comes in from an external table. In this case it's a data table from TJ O'Donnell's CHORD chemistry extension for PostgreSQL, listed at http://www.gnova.com/book/tpsa.tab and available for use here with permission. Each line in the file contains three tab-separated fields. The first line is the header. Each of the other lines defines a fragment contribution. The first field is the partial surface area contribution for each match of the SMARTS pattern given in the second field. The last field is a comment. Note that the first SMARTS definition contains a typo: it should be "[N+0;H0;D1;v3]" instead of "[N0;H0;D1;v3]".
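As a minimal sketch of reading a table in this layout, the following splits each line on tabs, skips the header, and applies the SMARTS correction mentioned above. The names parse_tpsa_table, SAMPLE_TABLE and SMARTS_FIXES are hypothetical, and the header names and numeric values in the sample are placeholders rather than real entries from tpsa.tab; only the mistyped SMARTS and its correction come from the text.

```python
# Illustrative stand-in for the contents of tpsa.tab. The header names and
# the numeric values are placeholders, NOT the real table entries.
SAMPLE_TABLE = (
    "psa\tsmarts\tcomment\n"
    "1.5\t[N0;H0;D1;v3]\tplaceholder row with the typo\n"
    "\n"
    "2.0\t[OH]\tplaceholder row\n"
)

# Correction described in the text for the first SMARTS definition.
SMARTS_FIXES = {"[N0;H0;D1;v3]": "[N+0;H0;D1;v3]"}


def parse_tpsa_table(text):
    """Return a list of (value, smarts, comment) tuples, skipping the header."""
    records = []
    for line in text.splitlines()[1:]:
        if not line.strip():
            continue  # tolerate blank lines
        value, smarts, comment = line.split("\t")
        smarts = SMARTS_FIXES.get(smarts, smarts)
        records.append((float(value), smarts, comment))
    return records


records = parse_tpsa_table(SAMPLE_TABLE)
```

Each record can then be handed to whichever toolkit compiles the SMARTS string into a substructure query.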
To compute the topological polar surface area (for purposes of this task) of a given structure, take the sum over all fragment contributions, each weighted by the number of times its fragment matches.
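The weighted sum itself does not depend on any toolkit, so it can be sketched with plain dictionaries. The function name tpsa_from_counts, the fragment labels, and all numbers below are hypothetical placeholders chosen for easy arithmetic; real values come from the table and real counts from a SMARTS matcher.

```python
def tpsa_from_counts(contributions, match_counts):
    """Sum each fragment's contribution times the number of times it matched.

    contributions: maps a fragment label to its partial surface area value.
    match_counts:  maps a fragment label to its match count in one molecule.
    Fragments missing from match_counts are treated as zero matches.
    """
    return sum(value * match_counts.get(fragment, 0)
               for fragment, value in contributions.items())


# Placeholder data, NOT real TPSA contributions.
contributions = {"fragment_a": 10.0, "fragment_b": 2.5, "fragment_c": 7.0}
match_counts = {"fragment_a": 2, "fragment_b": 1}

total = tpsa_from_counts(contributions, match_counts)
```

Here fragment_a contributes twice, fragment_b once, and fragment_c not at all, which mirrors how a pattern with no matches drops out of the sum in the implementations.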
Implementation
Indigo/Python
import sys
import collections
import indigo
indigo = indigo.Indigo()

# Some place to store the pattern definitions
Pattern = collections.namedtuple("Pattern", ["value", "subsearch"])
patterns = []

# Get the patterns from the tpsa.tab file, ignoring the header line
for line in open("tpsa.tab").readlines()[1:]:
    # Extract the fields
    value, smarts, comment = line.split("\t")
    # Use the SMARTS to define a query object
    subsearch = indigo.loadSmarts(smarts)
    # Store for later use
    patterns.append(Pattern(float(value), subsearch))

# Helper function to count how many times a substructure matches
def count_matches(subsearch, mol):
    return indigo.countSubstructureMatches(subsearch, mol)

def TPSA(mol):
    "Compute the topological polar surface area of a molecule"
    return sum(count_matches(pattern.subsearch, mol)*pattern.value
               for pattern in patterns)

# Test it with the reference structure
mol = indigo.loadMolecule("CN2C(=O)N(C)C(=O)C1=C2N=CN1C")
print(TPSA(mol))
OpenEye/Python
from openeye.oechem import *
import collections

# Some place to store the pattern definitions
Pattern = collections.namedtuple("Pattern", ["value", "subsearch"])
patterns = []

# Get the patterns from the tpsa.tab file, ignoring the header line
for line in open("tpsa.tab").readlines()[1:]:
    # Extract the fields
    value, smarts, comment = line.split("\t")
    # Use the SMARTS to define a subsearch object
    subsearch = OESubSearch(smarts)
    # Store for later use
    patterns.append(Pattern(float(value), subsearch))

# Helper function to count how many times a substructure matches
def count_matches(subsearch, mol):
    return sum(1 for match in subsearch.Match(mol))

def TPSA(mol):
    "Compute the topological polar surface area of a molecule"
    return sum(count_matches(pattern.subsearch, mol)*pattern.value
               for pattern in patterns)

# Test it with the reference structure
mol = OEGraphMol()
OEParseSmiles(mol, "CN2C(=O)N(C)C(=O)C1=C2N=CN1C")
print(TPSA(mol))
RDKit/Python
from rdkit import Chem
import collections

# Some place to store the pattern definitions
Pattern = collections.namedtuple("Pattern", ["value", "subsearch"])
patterns = []

# Get the patterns from the tpsa.tab file, ignoring the header line
for line in open("tpsa.tab").readlines()[1:]:
    # Extract the fields
    value, smarts, comment = line.split("\t")
    # Use the SMARTS to define a subsearch object
    subsearch = Chem.MolFromSmarts(smarts)
    # Store for later use
    patterns.append(Pattern(float(value), subsearch))

# Helper function to count how many times a substructure matches
def count_matches(subsearch, mol):
    return len(mol.GetSubstructMatches(subsearch))

def TPSA(mol):
    "Compute the topological polar surface area of a molecule"
    return sum(count_matches(pattern.subsearch, mol)*pattern.value
               for pattern in patterns)

# Test it with the reference structure
mol = Chem.MolFromSmiles("CN2C(=O)N(C)C(=O)C1=C2N=CN1C")
print(TPSA(mol))