Chemistry Toolkit Rosetta Wiki
 

Latest revision as of 18:16, 4 October 2013

Parse the strings "Q" and "C1C" as if they were SMILES and store any error or warning messages in a file. Neither string should be accepted as a valid SMILES: "Q" is not an element symbol, and "C1C" opens a ring bond that is never closed.

There needs to be a similar test for SD files.

Even better would be to parse a SMILES file and an SD file that contain errors, to see whether the respective readers can skip the bad records and continue.

The idea here is that an application (including a web app) may want to report that an input structure was incorrect, and give some information about what was wrong.
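The reporting pattern itself does not depend on any toolkit, so here is a minimal Python sketch of it. Everything in it is an assumption made for illustration: `toy_check` is a hypothetical stand-in that only recognises a few organic-subset symbols and ring-closure digits, and it is not a real SMILES parser. In practice the toolkit's own parser takes its place, as in the examples below.

```python
import io

ORGANIC_SUBSET = {"B", "C", "N", "O", "P", "S", "F", "Cl", "Br", "I"}


def toy_check(smiles):
    """Raise ValueError for two simple problems. NOT a real SMILES parser."""
    open_rings = set()
    i = 0
    while i < len(smiles):
        two = smiles[i:i + 2]
        ch = smiles[i]
        if two in ORGANIC_SUBSET:        # two-letter symbols such as Cl, Br
            i += 2
        elif ch in ORGANIC_SUBSET:       # one-letter symbols
            i += 1
        elif ch.isdigit():               # a digit opens or closes a ring bond
            open_rings ^= {ch}
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i + 1}")
    if open_rings:
        raise ValueError(f"unclosed ring bond {sorted(open_rings)[0]}")


def report_errors(strings, check, log):
    """Run check() on each string and write one line per failure to log."""
    for s in strings:
        try:
            check(s)
        except ValueError as exc:
            log.write(f"{s}: {exc}\n")


log = io.StringIO()                      # any writable text stream works
report_errors(["Q", "C1C", "CCO"], toy_check, log)
print(log.getvalue(), end="")
```

Passing an open file instead of the `io.StringIO` buffer gives the log file the task asks for.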

OpenBabel/Rubabel

require 'rubabel'
File.open("log.txt", 'w') do |out|
     %w(Q C1C).each do |smile|
          Rubabel[smile] rescue out.puts "bad smiles #{smile}"
     end
end

Cactvs/Tcl

In Tcl:

set fh [open log.txt w]
foreach smiles [list "Q" "C1C"] {
   if {[catch {ens create $smiles} msg]} {
      puts $fh $msg
   }
}
close $fh

The message is "Error: ens create failed: Failed to decode structure data specification"

For file input, you can do something like this:

set fh [molfile open "dubious.smi"]
while 1 {
  if {[catch {molfile read $fh} eh]} {
     if {[molfile get $fh eof]} break
     puts $eh
     continue
  }
  ens delete $eh
}
molfile close $fh

The logged messages about corrupted records are typically something like "Data file syntax error in line 99 record 6".

All I/O modules in the toolkit have the capability to re-sync the file (trivial for SMILES, not so simple for SDF). In the read loop, this happens automatically.
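To illustrate what re-syncing means for SD files, here is a toolkit-independent Python sketch. It is not the Cactvs implementation. It relies only on the SD file convention that every record ends with a line containing `$$$$`, so after a failed record the reader can resume at the next delimiter. `toy_parse` is a hypothetical stand-in for a real record parser and merely checks that the fourth line looks like a V2000 counts line.

```python
def split_sd_records(text):
    """Yield (record_number, lines) for each '$$$$'-terminated record."""
    record, number = [], 1
    for line in text.splitlines():
        if line.strip() == "$$$$":
            yield number, record
            record, number = [], number + 1
        else:
            record.append(line)
    if any(l.strip() for l in record):
        yield number, record             # trailing record without a delimiter


def toy_parse(lines):
    """Hypothetical parser: return the name line or raise ValueError."""
    if len(lines) < 4 or "V2000" not in lines[3]:
        raise ValueError("missing counts line")
    return lines[0]


def read_all(text, parse_record):
    """Parse every record; a bad record is logged and then skipped."""
    good, errors = [], []
    for number, lines in split_sd_records(text):
        try:
            good.append(parse_record(lines))
        except ValueError as exc:
            errors.append(f"record {number}: {exc}")
    return good, errors


counts = "  0  0  0  0  0  0  0  0  0  0999 V2000"
demo = "\n".join([
    "good1", "  prog", "", counts, "M  END", "$$$$",
    "broken", "$$$$",
    "good2", "  prog", "", counts, "M  END", "$$$$",
])
good, errors = read_all(demo, toy_parse)
print(good, errors)
```

For a SMILES file the same idea reduces to one record per line, which is why re-syncing is trivial there. A real SD reader also has to cope with a damaged or missing `$$$$` line, which this sketch does not attempt.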

Cactvs/Python

Here is essentially the same code in Python:

f=open('log.txt','w')
for smiles in ['Q','C1C']:
    try:
        e=Ens(smiles)
    except Exception as x:
        f.write(x.args[0]+"\n")
f.close()

and, for file input:

f = Molfile('dubious.smi')
while True:
    try:
        e = f.read()
    except Exception as x:
        # corrupted record: report it and keep reading
        print(x.args[0])
    else:
        # None signals end of file
        if e == None: break
        e.delete()
f.close()

Note that there is a subtle difference between the Tcl and Python implementations of the structure file input command: in Tcl, hitting EOF raises an error, so the loop has to check the eof attribute inside the error handler, while in Python the special None object is returned.
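One way to hide the Python convention behind an ordinary loop is a small generator. The sketch below assumes only what is stated above: `read()` returns None at end of file and raises an exception for a corrupted record. `FakeReader` is a hypothetical stand-in used to make the example runnable without the toolkit; it is not part of Cactvs.

```python
class FakeReader:
    """Hypothetical reader: returns items, raises exceptions, None at EOF."""

    def __init__(self, items):
        self._items = iter(items)

    def read(self):
        item = next(self._items, None)   # None once the items run out
        if isinstance(item, Exception):
            raise item
        return item


def iter_records(reader, on_error):
    """Yield records until read() returns None; report errors and go on."""
    while True:
        try:
            record = reader.read()
        except Exception as exc:
            on_error(exc)
            continue
        if record is None:               # end of file
            return
        yield record


errors = []
reader = FakeReader(["CCO", ValueError("bad record 2"), "CC"])
records = list(iter_records(reader, errors.append))
print(records, [str(e) for e in errors])
```

With the real toolkit, a `Molfile` object would be passed in place of `FakeReader`, provided its `read()` behaves as described in the note above.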