Chemistry Toolkit Rosetta Wiki

Revision as of 02:00, 9 May 2011

Ertl, Rohde, and Selzer (J. Med. Chem., 43:3714-3717, 2000) published an algorithm for fast calculation of molecular polar surface area (PSA). Part of it involves summing partial surface values based on fragment contributions. Each fragment is defined by a SMARTS pattern.


The goal of this task is to get an idea of how to do a set of SMARTS matches when the data comes in from an external table. In this case it's a data table from TJ O'Donnell's CHORD chemistry extension for PostgreSQL, listed at http://www.gnova.com/book/tpsa.tab and available for use here with permission. Each line in the file contains three tab-separated fields. The first line is the header. Each of the other lines defines a fragment contribution. The first field is the partial surface area contribution for each match of the SMARTS pattern in the second field. The last field is a comment. Note that the first SMARTS definition contains a typo: it should be "[N+0;H0;D1;v3]" instead of "[N0;H0;D1;v3]".
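The file layout described above can be read without any chemistry toolkit. The following is a minimal sketch, assuming the three-field layout and the typo correction noted in the text. The helper name `parse_tpsa_table`, the `Fragment` record, and the sample rows are illustrative assumptions, not the actual contents of tpsa.tab.

```python
import collections

# Hypothetical record type for one row of the table
Fragment = collections.namedtuple("Fragment", ["value", "smarts", "comment"])

# Correction described in the text: the first SMARTS in the table has a typo
SMARTS_FIXES = {"[N0;H0;D1;v3]": "[N+0;H0;D1;v3]"}

def parse_tpsa_table(lines):
    "Parse a header line followed by tab-separated (value, SMARTS, comment) lines"
    fragments = []
    for line in list(lines)[1:]:  # skip the header line
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        value, smarts, comment = line.split("\t")
        smarts = SMARTS_FIXES.get(smarts, smarts)
        fragments.append(Fragment(float(value), smarts, comment))
    return fragments

# Illustrative sample rows (assumed for this sketch, not copied from tpsa.tab)
sample = [
    "psa\tsmarts\tcomment\n",
    "23.79\t[N0;H0;D1;v3]\tN#\n",
    "20.23\t[O;H1]\t-OH\n",
]
fragments = parse_tpsa_table(sample)
for fragment in fragments:
    print(fragment)
```

Keeping the correction in a small lookup table leaves the downloaded file untouched while still giving the toolkit a valid pattern.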

To compute the topological polar surface area (for purposes of this task) of a given structure, take the sum over all fragment contributions, weighted by the number of times that fragment matches.
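The weighted sum itself is independent of the toolkit. As a small worked sketch, assuming the match counts have already been obtained from some substructure search, the function below multiplies each contribution by its count and adds the results. The name `weighted_tpsa`, the fragment keys, and the numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def weighted_tpsa(contributions, match_counts):
    """Sum value * count over all fragments.

    contributions: dict mapping a fragment key to its partial surface value
    match_counts:  dict mapping a fragment key to its number of matches;
                   a fragment that is absent counts as zero matches
    """
    return sum(value * match_counts.get(key, 0)
               for key, value in contributions.items())

# Hypothetical fragments and counts, for illustration only
contributions = {"frag_a": 10.0, "frag_b": 2.5, "frag_c": 4.0}
match_counts = {"frag_a": 2, "frag_b": 3}

total = weighted_tpsa(contributions, match_counts)
print(total)  # 10.0*2 + 2.5*3 + 4.0*0 = 27.5
```

In the toolkit implementations, the match count for each fragment comes from that toolkit's substructure search rather than from a prepared dictionary.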

Implementation


Indigo/Python

import sys
import collections
import indigo
 
indigo = indigo.Indigo()

# Some place to store the pattern definitions
Pattern = collections.namedtuple("Pattern", ["value", "subsearch"])
patterns = []
 
# Get the patterns from the tpsa.tab file, ignoring the header line
for line in open("tpsa.tab").readlines()[1:]:
    # Extract the fields
    value, smarts, comment = line.split("\t")
 
    subsearch = indigo.loadSmarts(smarts)
 
    # Store for later use
    patterns.append( Pattern(float(value), subsearch) )
 
# Helper function to count how many times a substructure matches
def count_matches(subsearch, mol):
    return indigo.countSubstructureMatches(subsearch, mol)
 
def TPSA(mol):
    "Compute the topological polar surface area of a molecule"
    return sum(count_matches(pattern.subsearch, mol)*pattern.value
                   for pattern in patterns)
 
# Test it with the reference structure
mol = indigo.loadMolecule("CN2C(=O)N(C)C(=O)C1=C2N=CN1C")
print(TPSA(mol))

OpenEye/Python

from openeye.oechem import *
import collections

# Some place to store the pattern definitions
Pattern = collections.namedtuple("Pattern", ["value", "subsearch"])
patterns = []

# Get the patterns from the tpsa.tab file, ignoring the header line
for line in open("tpsa.tab").readlines()[1:]:
    # Extract the fields
    value, smarts, comment = line.split("\t")

    # Use the SMARTS to define a subsearch object
    subsearch = OESubSearch(smarts)

    # Store for later use
    patterns.append( Pattern(float(value), subsearch) )

# Helper function to count how many times a substructure matches
def count_matches(subsearch, mol):
    return sum(1 for match in subsearch.Match(mol))

def TPSA(mol):
    "Compute the topological polar surface area of a molecule"
    return sum(count_matches(pattern.subsearch, mol)*pattern.value
                   for pattern in patterns)

# Test it with the reference structure
mol = OEGraphMol()
OEParseSmiles(mol, "CN2C(=O)N(C)C(=O)C1=C2N=CN1C")
print(TPSA(mol))


RDKit/Python

from rdkit import Chem
import collections

# Some place to store the pattern definitions
Pattern = collections.namedtuple("Pattern", ["value", "subsearch"])
patterns = []

# Get the patterns from the tpsa.tab file, ignoring the header line
for line in open("tpsa.tab").readlines()[1:]:
    # Extract the fields
    value, smarts, comment = line.split("\t")

    # Use the SMARTS to define a subsearch object
    subsearch = Chem.MolFromSmarts(smarts)

    # Store for later use
    patterns.append( Pattern(float(value), subsearch) )

# Helper function to count how many times a substructure matches
def count_matches(subsearch, mol):
    return len(mol.GetSubstructMatches(subsearch))

def TPSA(mol):
    "Compute the topological polar surface area of a molecule"
    return sum(count_matches(pattern.subsearch, mol)*pattern.value
                   for pattern in patterns)

# Test it with the reference structure
mol = Chem.MolFromSmiles("CN2C(=O)N(C)C(=O)C1=C2N=CN1C")
print(TPSA(mol))
