Convert a SMILES file (yet to be determined) into an SD file. The conversion must do its best to use the MDL conventions for the SD file, including aromaticity perception. Note that the use of aromatic bond types in CTABs is only allowed for queries, so aromatic structures must be written in a Kekule form.

(In some tools the conversion to a Kekule form is automatic, in other tools it must be done explicitly, while in still others there's only a single chemistry model for every format.)
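One way to check that a toolkit really wrote a Kekule form is to scan the bond block of the output for bond type 4, which the CTAB format uses for aromatic bonds. The following is a minimal pure-Python sketch, assuming a V2000 molfile block with the counts line on the fourth line and fixed-width fields; `aromatic_bond_lines` and `make_ring_block` are hypothetical helpers written for this illustration, not part of any toolkit.

```python
def aromatic_bond_lines(molblock):
    """Return the V2000 bond lines whose bond type is 4 (aromatic)."""
    lines = molblock.splitlines()
    counts = lines[3]                    # 4th line is the counts line
    n_atoms = int(counts[0:3])
    n_bonds = int(counts[3:6])
    first_bond = 4 + n_atoms             # bond block follows the atom block
    bond_lines = lines[first_bond:first_bond + n_bonds]
    # Columns 7-9 of each bond line hold the bond type.
    return [b for b in bond_lines if int(b[6:9]) == 4]


def make_ring_block(bond_types):
    """Build a six-carbon ring block with the given bond types (test data only)."""
    lines = ["ring", "  sketch", ""]
    lines.append("%3d%3d  0  0  0  0  0  0  0  0999 V2000" % (6, len(bond_types)))
    for _ in range(6):
        # Coordinates are irrelevant for this check, so they are all zero.
        lines.append("%10.4f%10.4f%10.4f %-3s 0  0  0  0  0  0  0  0  0  0  0  0"
                     % (0.0, 0.0, 0.0, "C"))
    for i, bond_type in enumerate(bond_types):
        lines.append("%3d%3d%3d  0  0  0  0" % (i + 1, (i + 1) % 6 + 1, bond_type))
    lines.append("M  END")
    return "\n".join(lines)


kekule = make_ring_block([2, 1, 2, 1, 2, 1])   # alternating double/single bonds
aromatic = make_ring_block([4] * 6)            # query-style aromatic bonds
```

Here `aromatic_bond_lines(kekule)` is empty, while `aromatic_bond_lines(aromatic)` returns all six bond lines, which is the case an SD writer should avoid for non-query structures.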

Because the stereochemistry of molecules in SD files is defined solely by the arrangement of atoms, it is necessary to assign either 2D or 3D coordinates to the molecule before generating output. The coordinates do not have to be reasonable (i.e. it's ok if they would make a chemist scream in horror), so long as the resulting structure is chemically correct.
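To make the role of the coordinates concrete, here is a minimal pure-Python sketch that writes one SD record by hand, placing the atoms on a unit circle. The layout is deliberately crude, and the sketch writes no wedge bonds, so it is only suitable for molecules without stereochemistry to encode. The function name `sd_record` and its argument layout are assumptions made for this illustration; the toolkits below do this work for you.

```python
import math


def sd_record(name, symbols, bonds):
    """Return one SD record: a V2000 molfile block plus the $$$$ terminator.

    symbols: list of element symbols, one per atom.
    bonds:   list of (begin, end, order) with 1-based atom indices and
             Kekule bond orders only (1, 2 or 3).
    """
    n_atoms = len(symbols)
    lines = [name, "  sketch", ""]       # the three header lines
    lines.append("%3d%3d  0  0  0  0  0  0  0  0999 V2000" % (n_atoms, len(bonds)))
    for i, symbol in enumerate(symbols):
        # Arbitrary 2D placement: spread the atoms evenly around a circle.
        angle = 2.0 * math.pi * i / n_atoms
        lines.append("%10.4f%10.4f%10.4f %-3s 0  0  0  0  0  0  0  0  0  0  0  0"
                     % (math.cos(angle), math.sin(angle), 0.0, symbol))
    for begin, end, order in bonds:
        lines.append("%3d%3d%3d  0  0  0  0" % (begin, end, order))
    lines.append("M  END")
    lines.append("$$$$")                 # SD record terminator
    return "\n".join(lines) + "\n"


# Benzene written in a Kekule form with alternating double and single bonds.
benzene = sd_record(
    "benzene",
    ["C"] * 6,
    [(1, 2, 2), (2, 3, 1), (3, 4, 2), (4, 5, 1), (5, 6, 2), (6, 1, 1)],
)
```

Writing `benzene` to a file gives a one-record SD file whose coordinates are ugly but whose connectivity and bond orders are correct.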

OpenBabel/Rubabel

require 'rubabel'
File.open("benzodiazepine.smi.sdf", 'w') do |out|
  Rubabel.foreach("benzodiazepine.smi.gz") do |mol|
    out.print mol.write_string(:sdf)
  end
end

Indigo/C++

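#include <stdio.h>

#include "base_cpp/scanner.h"
#include "base_cpp/output.h"
#include "molecule/molecule.h"
#include "molecule/smiles_loader.h"
#include "molecule/molfile_saver.h"
#include "molecule/molecule_dearom.h"
#include "layout/molecule_layout.h"

int main (int argc, char *argv[])
{
   if (argc < 3)
   {
      fprintf(stderr, "Usage: smi_sdf_convert infile.smi outfile.sdf\n");
      return -1;
   }

   try
   {
      FileScanner scanner(argv[1]);
      FileOutput output(argv[2]);
      Array<char> smiles;
      Molecule mol;
      int cnt = 0;

      while (!scanner.isEOF())
      {
         scanner.readString(smiles, false);

         // skip blank lines
         if (smiles.size() < 1)
            continue;

         printf("saving molecule #%d... ", ++cnt);

         BufferScanner smiles_s(smiles);
         SmilesLoader smiles_l(smiles_s);

         smiles_l.loadMolecule(mol, false);

         // Convert aromatic bonds to a Kekule form.
         {
            DearomatizationsStorage dst;
            Dearomatizer dearom(mol);

            // The calls marked "reconstructed" were missing from this listing;
            // check them against your Indigo version.
            dearom.enumerateDearomatizations(dst); // reconstructed
            MoleculeDearomatizer mdearom(mol, dst);
            for (int i = 0; i < dst.getGroupsCount(); i++)
               mdearom.dearomatizeGroup(i, 0);
         }

         // Generate 2D coordinates.
         {
            MoleculeLayout ml(mol);

            ml.make();                       // reconstructed
            mol.stereocenters.markBonds();   // reconstructed
         }

         MolfileSaver saver(output);

         // saver.v2000 = true;
         saver.saveMolecule(mol);            // reconstructed
         output.writeStringCR("$$$$");       // reconstructed: SD record terminator
         printf("done\n");
      }
   }
   catch (Exception &e)
   {
      fprintf(stderr, "error: %s\n", e.message());
      return -1;
   }
   return 0;
}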


  1. Unpack 'graph', 'molecule', and 'layout' projects into some folder
  2. Create 'utils' folder nearby
  3. Paste the above code into utils/smi_sdf_convert.cpp file
  4. Compile the file using the following commands:
    $ cd graph; make CONF=Release32; cd ..
    $ cd molecule; make CONF=Release32; cd ..
    $ cd layout; make CONF=Release32; cd ..
    $ cd utils
    $ gcc smi_sdf_convert.cpp -o smi_sdf_convert -O3 -m32 -I.. -I../common ../layout/dist/Release32/GNU-Linux-x86/liblayout.a ../molecule/dist/Release32/GNU-Linux-x86/libmolecule.a ../graph/dist/Release32/GNU-Linux-x86/libgraph.a -lpthread -lstdc++
  5. Run the program like this:
    $ ./smi_sdf_convert infile.smi outfile.sdf
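
After a run, a quick sanity check is to compare the number of input structures with the number of SD records written, since each record in an SD file ends with a `$$$$` line. The following is a small Python sketch, assuming the SMILES file holds one structure per non-blank line and has no title line; both function names are hypothetical helpers for this illustration.

```python
def count_smiles(path):
    """Count non-blank lines in a SMILES file (one structure per line assumed)."""
    with open(path) as handle:
        return sum(1 for line in handle if line.strip())


def count_sd_records(path):
    """Count SD records by counting the '$$$$' terminator lines."""
    with open(path) as handle:
        return sum(1 for line in handle if line.rstrip() == "$$$$")
```

If `count_smiles("infile.smi")` and `count_sd_records("outfile.sdf")` differ, the converter probably stopped early on a structure it could not process.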

RDKit/Python

from rdkit import Chem
from rdkit.Chem import AllChem

suppl = Chem.SmilesMolSupplier('benzodiazepine.smi')
w = Chem.SDWriter('bz.out.sdf')
for mol in suppl:
    # skip molecules that RDKit could not read:
    if mol is None:
        continue
    # add 2D coordinates so we get a correct mol block:
    AllChem.Compute2DCoords(mol)
    w.write(mol)
w.close()


Cactvs/Tcl

molfile write "benzodiazepine.sdf" [molfile open "benzodiazepine.smi"]

Yup, it's a one-liner. 2D coordinates and wedges are automatically generated, and aromaticity resolution is also implicit. The format of the output file is, in the absence of a specific format setting, decoded from the suffix of that file.

Cactvs/Python

A one-liner as well:
