PURY is a database of empirical geometric restraints for the refinement of heteromolecules in complexes with macromolecular structures. It is based on the Cambridge Structural Database (CSD). After the PURY server was developed (Andrejasic et al., 2008), it became clear that it could be improved in several respects. The first limitation was the concept of atom classes, which were assigned using extensive prior chemical knowledge arising from the connectivity and planarity of the structures. This approach resulted in over two thousand atom classes defining tens to hundreds of thousands of geometric restraints. Analysis showed that about half were statistically underrepresented and that many atom classes and terms were redundant, as they describe essentially the same distributions. The second consideration was public availability: to make the resulting force field more widely accessible, the new generation of parameters will include, besides the version derived from the proprietary CSD, a version based on the Crystallography Open Database (COD). The new COD-based set of geometric parameters will be in the public domain, and we expect that it will soon become as accurate as the set derived from the CSD. As of March 2013 the COD included over 220,000 entries, and it is growing steadily. A set of Python programs was written to access the crystallographic entries, organize the structural information in a database-like structure, and filter out problematic files and molecular structures. Using the filtered data, an iterative process for atom class definition was developed. It is based only on the topology of the molecules and the distribution of interatomic distances among classes. The scheme starts by identifying each atom class with the plain atom name and then increases the class complexity by gradually including information from neighboring atoms, but only for those classes whose distance distributions are multimodal or have a high standard deviation.
This process leads to a set of atom classes of uneven complexity, with atoms in hybridized structures such as rings reaching the highest level of complexity. The current state of the new generation of geometric restraint parameters will be presented.
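The iterative class definition described above can be illustrated with a minimal Python sketch. It is not the PURY implementation: the input layout (dictionaries with `elements` and `bonds`), the function names, the 0.02 Å threshold and the `max_level` cap are all assumptions made for illustration, the element symbol stands in for the plain atom name, and only the standard deviation criterion is checked, without a separate multimodality test.

```python
from collections import defaultdict
from statistics import pstdev


def atom_label(adj, elements, i, level):
    """Class label of atom i: its element at level 0, otherwise the element
    plus the sorted labels of its bonded neighbors one level lower."""
    if level == 0:
        return elements[i]
    neighbors = sorted(atom_label(adj, elements, j, level - 1) for j in adj[i])
    return f"{elements[i]}({','.join(neighbors)})"


def refine_atom_classes(molecules, max_sd=0.02, max_level=3):
    """Refine atom classes until every bonded-distance distribution is narrow.

    molecules: list of dicts with 'elements' (list of str) and
    'bonds' (list of (i, j, distance) tuples). Hypothetical layout.
    """
    adjacency = []
    for mol in molecules:
        adj = defaultdict(list)
        for i, j, _ in mol["bonds"]:
            adj[i].append(j)
            adj[j].append(i)
        adjacency.append(adj)

    # Every atom starts at level 0, i.e. its class is just the element.
    levels = [[0] * len(mol["elements"]) for mol in molecules]

    while True:
        labels = [
            [atom_label(adj, mol["elements"], i, lv[i]) for i in range(len(lv))]
            for mol, adj, lv in zip(molecules, adjacency, levels)
        ]

        # Collect bonded distances per unordered pair of class labels.
        distances = defaultdict(list)
        for mol, lab in zip(molecules, labels):
            for i, j, d in mol["bonds"]:
                distances[tuple(sorted((lab[i], lab[j])))].append(d)

        # Classes taking part in a broad distribution need more detail.
        broad = set()
        for pair, values in distances.items():
            if len(values) > 1 and pstdev(values) > max_sd:
                broad.update(pair)

        changed = False
        for lab, lv in zip(labels, levels):
            for i in range(len(lv)):
                if lab[i] in broad and lv[i] < max_level:
                    lv[i] += 1
                    changed = True
        if not changed:
            return labels, dict(distances)


# Toy input with approximate textbook bond lengths in angstroms.
methanol = {
    "elements": ["C", "O", "H", "H", "H", "H"],
    "bonds": [(0, 1, 1.43), (0, 2, 1.09), (0, 3, 1.09), (0, 4, 1.09), (1, 5, 0.96)],
}
formaldehyde = {
    "elements": ["C", "O", "H", "H"],
    "bonds": [(0, 1, 1.21), (0, 2, 1.10), (0, 3, 1.10)],
}

labels, distances = refine_atom_classes([methanol, formaldehyde])
print(labels[0])
print(labels[1])
```

In this toy input the single and double C-O bonds initially share one broad C-O distribution, so only the carbon and oxygen classes are extended with neighbor information, while hydrogen keeps its plain class. This mirrors the uneven class complexity mentioned in the abstract.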
B.06 Other
COBISS.SI-ID: 27630887