Progress in microbiology and immunology is closely tied to structural studies of the molecules that carry biological activity, including glycopolymers. Lipopolysaccharide (LPS) is one of the main components of the outer membrane of bacterial cells: it is responsible for normal membrane functioning and determines the immune response in higher organisms. A significant part of LPS research is the structure elucidation of its O-antigen (the regular glycopolymeric part), because this makes it possible to interpret the biological activity of LPS at the molecular level.
In recent decades NMR spectroscopy has become the main approach to biopolymer structure elucidation. 13C NMR is especially important because a large experimental database has been accumulated for it. However, no algorithm has been devised to determine the whole structure from 13C NMR data alone, and structural research therefore requires many NMR experiments and comprehensive analysis.
This encouraged us to create a fully automatic computer tool, BIOPSEL, for predicting the structure of biopolymers, mainly glycopolymers, based on the 13C NMR spectrum alone. The abbreviation BIOPSEL stands for BIOpolymers Primary Structure Elucidation.
Existing programs for structure prediction fall into two groups: those that deal with individual atoms and their spectral properties, and those that deal with molecular fragments having known structural and spectral properties. In the former case, generating all possible structures and their theoretical 13C NMR spectra requires significant computer resources, so the performance of such software is insufficient for predicting the structure of large molecules. The latter approach therefore remains the only option for biopolymers.
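To make the fragment-based idea concrete, here is a minimal sketch of how a residue's 13C spectrum might be assembled from a stored fragment table plus substitution corrections. It is not BIOPSEL's algorithm, which is not described here. The names `BASE_SHIFTS`, `ALPHA_EFFECT`, `BETA_EFFECT` and `simulate_residue` are hypothetical, and all numbers are illustrative placeholders rather than reference data.

```python
from typing import Dict

# Placeholder chemical shifts (ppm) for one hypothetical hexopyranose residue.
# These values are illustrative only and are NOT measured reference data.
BASE_SHIFTS: Dict[str, float] = {
    "C1": 100.0, "C2": 72.0, "C3": 74.0,
    "C4": 70.0, "C5": 73.0, "C6": 62.0,
}

# Hypothetical substitution corrections (ppm): the substituted carbon is
# shifted by ALPHA_EFFECT and its direct neighbours by BETA_EFFECT.
ALPHA_EFFECT = 7.0
BETA_EFFECT = -1.0


def simulate_residue(base: Dict[str, float], substituted_at: str) -> Dict[str, float]:
    """Return the fragment's shifts after applying substitution corrections."""
    carbons = list(base)                  # carbon labels in chain order
    idx = carbons.index(substituted_at)   # raises ValueError for unknown labels
    shifts = dict(base)                   # copy, so the table is not mutated
    shifts[substituted_at] += ALPHA_EFFECT
    for j in (idx - 1, idx + 1):          # neighbouring carbons, if present
        if 0 <= j < len(carbons):
            shifts[carbons[j]] += BETA_EFFECT
    return shifts


if __name__ == "__main__":
    print(simulate_residue(BASE_SHIFTS, "C3"))
```

Because each residue is handled as a whole fragment with a small number of corrections, the cost per candidate structure stays low, which is the practical advantage of the fragment-based group over atom-by-atom generation.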
Several similar programs have been described, but all of them require knowledge of the repeating unit topology and the monomeric composition, and often of the residue sequence as well. The most valuable result of these calculations is the substitution pattern (or the residue sequence, if the substitution pattern is supplied as input). However, if the residue sequence is known, the substitution pattern is usually known too, since it is determined at an earlier stage of research. A significant limitation of the existing programs is the fixed set of possible monomers, whose structural properties are built into the program and cannot be modified.
BIOPSEL is a computer predictor of the structure of biopolymers built of residues linked by glycosidic, amide and phosphodiester bridges. The input data are the experimental 13C NMR spectrum of the regular polymer and its monomeric composition. Optionally, the absolute and anomeric configurations of residues and substitution constraints may also be input to improve accuracy and performance. The output is a list of possible structures in which the repeating unit topology, residue sequence, substitution pattern and unknown configurations are determined.
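The step from an experimental spectrum to a ranked list of candidate structures can be sketched as follows. This is an assumption-laden illustration, not BIOPSEL's scoring method: the functions `spectral_deviation` and `rank_candidates`, the candidate labels, and the choice of mean absolute deviation between sorted peak lists are all hypothetical.

```python
from typing import Dict, List, Tuple


def spectral_deviation(simulated: List[float], experimental: List[float]) -> float:
    """Mean absolute difference (ppm) between two peak lists paired in sorted order."""
    if len(simulated) != len(experimental):
        raise ValueError("peak lists must contain the same number of signals")
    pairs = zip(sorted(simulated), sorted(experimental))
    return sum(abs(s - e) for s, e in pairs) / len(experimental)


def rank_candidates(
    candidates: Dict[str, List[float]], experimental: List[float]
) -> List[Tuple[str, float]]:
    """Return (structure label, deviation) pairs, best match first."""
    scored = [
        (label, spectral_deviation(shifts, experimental))
        for label, shifts in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Placeholder peak lists (ppm); the labels "A" and "B" are hypothetical.
    experimental = [100.0, 80.0, 62.0]
    candidates = {
        "A": [62.0, 80.5, 100.0],
        "B": [62.0, 75.0, 100.0],
    }
    for label, deviation in rank_candidates(candidates, experimental):
        print(label, round(deviation, 3))
```

Pairing peaks in sorted order avoids the need for signal assignments, which fits the setting where only the 13C spectrum and the composition are known. Optional constraints such as known anomeric configurations would act earlier, by reducing the set of candidates passed to the ranking step.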
Good agreement between the predictions and independently obtained structural data was observed for about 70 polysaccharides and their derivatives containing up to three non-pyranose residues (alditols, furanoses, amino acids, phosphate groups, etc.) per repeating unit of up to nine residues. The predictions were used in structural studies of several medically important bacterial O-antigens.