bioworld.c File Reference

Detailed Description

Implementation of the bridge between world structures and biological information.

See Also
bioworld.h

Definition in file bioworld.c.

Functions

void ReadResidueList (char *fname, unsigned int *nr, char *ch, unsigned int **r)
 Read the list of residues to be considered flexible.

void ReadRigidsAndHinges (char *fname, unsigned int **r, unsigned int *nh, unsigned int **h1, unsigned int **h2, boolean **added, TBioWorld *bw)
 Read the list of rigids and hinges of a molecule.

void GetMoleculeBasicInfo (TBioWorld *bw)
 Collects the basic information about the molecule.

void GetAtomPositions (char *fname, double *pos, TBioWorld *bw)
 Initializes the atom positions in a BioWorld.

void DetectLinksAndJoints (TBioWorld *bw)
 Determines the rigid groups of atoms and the connections between them.

void DetectLinksAndJointsFromResidues (unsigned int nr, char ch, unsigned int *rID, TBioWorld *bw)
 Determines the rigid groups of atoms and the connections between them.

void DetectLinksAndJointsFromRigidsAndHinges (unsigned int *rg, unsigned int nh, unsigned int *h1, unsigned int *h2, TBioWorld *bw)
 Determines the rigid groups of atoms and the connections between them.

void Atoms2Transforms (Tparameters *p, double *atoms, THTransform *t, TBioWorld *bw)
 Generates a transform from a global frame to a local frame for each link.

void InitWorldFromMolecule (Tparameters *p, double **conformation, unsigned int maxAtomsLink, TBioWorld *bw)
 Defines a world from molecular information.

void InitBioWorld (Tparameters *p, char *filename, unsigned int maxAtomsLink, double **conformation, TBioWorld *bw)
 Initializes a world from a biomolecule.

void AdjustBioWorldGeometry (Tparameters *p, TBioWorld *bw)
 Enforces all bond lengths and angles to be the same.

Tworld * BioWorldWorld (TBioWorld *bw)
 Returns a pointer to the world generated from the bio-information.

unsigned int BioWorldNAtoms (TBioWorld *bw)
 Number of atoms in the molecule.

unsigned int BioWorldConformationSize (TBioWorld *bw)
 Number of variables used to represent a conformation.

void BioWordGetAtomPositionsFromConformation (Tparameters *p, boolean simp, double *conformation, double *pos, TBioWorld *bw)
 Computes the position of the atoms.

void BioWordSetAtomPositionsFromConformation (Tparameters *p, boolean simp, double *conformation, TBioWorld *bw)
 Changes the position of the atoms.

double BioWorldRMSE (double *pos, TBioWorld *bw)
 Computes the RMSE.

unsigned int BioWordConformationFromAtomPositions (Tparameters *p, double *atoms, double **conformation, TBioWorld *bw)
 Produces the internal coordinates from the atom positions.

double BioWorldEnergy (Tparameters *p, boolean simp, double *conformation, void *bw)
 Computes the energy of a given configuration.

void SaveBioWorldBioInfo (Tparameters *p, char *fname, boolean simp, double *conformation, TBioWorld *bw)
 Stores the BioWorld information in a molecular format (e.g. pdb).

void PrintBioWorld (Tparameters *p, char *fname, int argc, char **arg, TBioWorld *bw)
 Prints the BioWorld information into a file.

void DeleteBioWorld (TBioWorld *bw)
 Destructor.
 

Function Documentation

void ReadResidueList ( char *  fname,
unsigned int *  nr,
char *  ch,
unsigned int **  r 
)

Attempts to read a list of residues whose internal degrees of freedom are to be considered flexible. If the file with the list is not available, the degrees of freedom are deduced from the bond type: double bonds are rigid and single bonds define rotations.

This is intended to be used to model loops in proteins where only a small part of the protein must move.

Note that only the backbone rotations between the N-Calpha and Calpha-C atoms are considered. Lateral chains are fixed.

Parameters
fname  The problem name used to define the name of the file with the residues.
nr  Output. The number of residues read. 0 if none.
ch  Output. The chain including the residues (all are in the same chain).
r  Output. The array with the residues. NULL if none is read.

Definition at line 231 of file bioworld.c.

References CreateFileName(), DeleteFileName(), Error(), GetFileFullName(), MEM_DUP, NEW, and RES_EXT.

Referenced by InitBioWorld().

void ReadRigidsAndHinges ( char *  fname,
unsigned int **  r,
unsigned int *  nh,
unsigned int **  h1,
unsigned int **  h2,
boolean **  added,
TBioWorld *  bw 
)

Attempts to read a list of rigids and hinges of a molecule. This was implemented to test the examples provided by Adnan Sljoka.

The list of rigids is just a list of all atoms in the molecule with the associated rigid. RigidID 0 is used for atoms not assigned to any rigid. The unassigned atoms not involved in the hinges (see below) are not modelled.

The hinges are given as a list of pairs of atoms defining rotation joints. These joints can correspond to actual bonds between atoms or to hydrogen bonds (not explicit).

Typically we will have two rigid bodies and several chains connecting them. The chains are composed of revolute joints only. In the chains, we use the normal policy of considering as fixed all the bonds not included in the list of bonds defining the hinges. Thus, several small rigids might appear in the chains.

Note that the read fails if the rigids or the hinges cannot be fully read.

This function can only be used after calling GetMoleculeBasicInfo and after setting the atom positions.

Parameters
fname  The problem name used to define the name of the files with the rigids and hinges.
r  Output. The rigid for each atom. NULL if the read fails.
nh  Output. Number of flexible bonds in the hinges. 0 if the read fails.
h1  Output. Indexes of the first atom defining the bonds in the hinges. NULL if the read fails.
h2  Output. Indexes of the second atom defining the bonds in the hinges. NULL if the read fails.
added  TRUE if we had to add an artificial bond between atoms h1[] and h2[] in order to define the joint. These artificial bonds are needed in InitWorldFromMolecule but are later removed.
bw  The BioWorld to update.

Definition at line 277 of file bioworld.c.

References AddBond(), CreateFileName(), DeleteFileName(), Error(), FALSE, GetFileFullName(), HasBond(), HINGE_EXT, TBioWorld::m, MEM_DUP, MEM_EXPAND, TBioWorld::na, TBioWorld::nb, TBioWorld::nbAtom, NEW, RIGID_EXT, and TRUE.

Referenced by InitBioWorld().

void GetMoleculeBasicInfo ( TBioWorld *  bw )

Collects the basic information about the molecule, such as the number of atoms, the number of bonds per atom, and the total number of bonds.

This is part of the BioWorld initialization.

Parameters
bw  The BioWorld to update.

Definition at line 398 of file bioworld.c.

References DeleteBondIterator(), Error(), GetFirstNeighbour(), GetNextNeighbour(), TBioWorld::m, TBioWorld::na, nAtoms(), TBioWorld::nb, TBioWorld::nba, TBioWorld::nbAtom, NEW, NO_UINT, and VdWRadius().

Referenced by InitBioWorld().

void GetAtomPositions ( char *  fname,
double *  pos,
TBioWorld *  bw 
)

Initializes the atom positions in a BioWorld using an external file of atoms or, if the file is not available, from the molecular information.

This is part of the BioWorld initialization.

Parameters
fname  Name for the file with the atom positions.
pos  Array where to store the atom positions.
bw  The partially initialized BioWorld.

Definition at line 437 of file bioworld.c.

References ATOM_EXT, CreateFileName(), DeleteFileName(), GetAtomCoordinates(), GetFileFullName(), TBioWorld::m, and TBioWorld::na.

Referenced by InitBioWorld().

void DetectLinksAndJoints ( TBioWorld *  bw )

Determines the rigid groups of atoms (links) and the connections between them (joints).

This function can only be used after calling GetMoleculeBasicInfo and after setting the atom positions.
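
One way to picture the link detection is as a connected-components computation: atoms joined by bonds that cannot rotate fall into the same link, and each rotatable bond between two different links becomes a joint. The sketch below illustrates this idea with a small union-find. The Bond struct, its rotatable flag, and the detect_links function are hypothetical names introduced for illustration only; they are not the data structures or the algorithm used in bioworld.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical bond record: two atom indices and whether the bond can rotate. */
typedef struct {
  unsigned int a1, a2;
  int rotatable;
} Bond;

/* Find the representative of atom i (union-find with path halving). */
static unsigned int find_root(unsigned int *parent, unsigned int i)
{
  while (parent[i] != i) {
    parent[i] = parent[parent[i]];
    i = parent[i];
  }
  return i;
}

/* Assigns a link identifier to every atom (linkID[i] in 0..nl-1) and
   returns the number of links nl. Atoms connected through rigid
   (non-rotatable) bonds share a link. *nj receives the number of
   rotatable bonds that connect two different links (the joints). */
unsigned int detect_links(unsigned int na, unsigned int nb, const Bond *b,
                          unsigned int *linkID, unsigned int *nj)
{
  unsigned int i, nl = 0;
  unsigned int *parent, *label;

  assert(na > 0);
  parent = malloc(na * sizeof *parent);
  label = malloc(na * sizeof *label);
  assert(parent != NULL && label != NULL);

  for (i = 0; i < na; i++) parent[i] = i;

  /* Merge the two ends of every rigid bond. */
  for (i = 0; i < nb; i++)
    if (!b[i].rotatable) {
      unsigned int r1 = find_root(parent, b[i].a1);
      unsigned int r2 = find_root(parent, b[i].a2);
      if (r1 != r2) parent[r2] = r1;
    }

  /* Number the roots consecutively and label every atom. */
  for (i = 0; i < na; i++) label[i] = na; /* na means "not labelled yet" */
  for (i = 0; i < na; i++) {
    unsigned int r = find_root(parent, i);
    if (label[r] == na) label[r] = nl++;
    linkID[i] = label[r];
  }

  /* A rotatable bond counts as a joint only if it separates two links. */
  *nj = 0;
  for (i = 0; i < nb; i++)
    if (b[i].rotatable && linkID[b[i].a1] != linkID[b[i].a2]) (*nj)++;

  free(parent);
  free(label);
  return nl;
}
```

For a five-atom chain whose second and fourth bonds are rotatable, this grouping yields three links connected by two joints.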

Parameters
bw  The BioWorld to initialize.

Definition at line 472 of file bioworld.c.

References CopyID(), TBioWorld::cut, DeleteBondIterator(), DeleteID(), FALSE, GetFirstNeighbour(), GetNextNeighbour(), InitVector(), TBioWorld::joint1, TBioWorld::joint2, TBioWorld::linkID, TBioWorld::linkList, TBioWorld::links, TBioWorld::m, TBioWorld::na, TBioWorld::nbAtom, NEW, NewVectorElement(), TBioWorld::nj, TBioWorld::nl, and NO_UINT.

Referenced by InitBioWorld().

void DetectLinksAndJointsFromResidues ( unsigned int  nr,
char  ch,
unsigned int *  rID,
TBioWorld *  bw 
)

Determines the rigid groups of atoms (links) and the connections between them (joints) for proteins by fixing all the degrees of freedom but those for a list of residues. The residue information is read from file using ReadResidueList.

This function can only be used after calling GetMoleculeBasicInfo and after setting the atom positions.

Parameters
nr  The number of residues.
ch  The chain including the flexible residues.
rID  The identifiers of the residues to free.
bw  The BioWorld to initialize.

Definition at line 553 of file bioworld.c.

References CopyID(), TBioWorld::cut, TBioWorld::cutB, DeleteBondIterator(), DeleteID(), Error(), ExpandBox(), FALSE, GetAtomChain(), GetAtomicNumber(), GetAtomResidue(), GetFirstNeighbour(), GetNextNeighbour(), InitBoxFromPoint(), InitVector(), IsAtomInProline(), TBioWorld::joint1, TBioWorld::joint2, TBioWorld::linkID, TBioWorld::linkList, TBioWorld::links, TBioWorld::m, TBioWorld::na, TBioWorld::nbAtom, NEW, NewVectorElement(), TBioWorld::nj, TBioWorld::nl, NO_UINT, TBioWorld::pos, and TRUE.

Referenced by InitBioWorld().

void DetectLinksAndJointsFromRigidsAndHinges ( unsigned int *  rg,
unsigned int  nh,
unsigned int *  h1,
unsigned int *  h2,
TBioWorld *  bw 
)

Determines the rigid groups of atoms (links) and the connections between them (joints) for proteins taking into account the provided rigid/hinge information. This information is read from file using ReadRigidsAndHinges.

This function can only be used after calling GetMoleculeBasicInfo and after setting the atom positions.

Parameters
rg  The rigid for each atom (0 if the atom is not assigned to a rigid).
nh  Number of bonds defining the hinges.
h1  The first atom in the bonds defining the hinges.
h2  The second atom in the bonds defining the hinges.
bw  The BioWorld to initialize.

Definition at line 766 of file bioworld.c.

References CopyID(), TBioWorld::cut, DeleteBondIterator(), DeleteID(), Error(), FALSE, GetFirstNeighbour(), GetNextNeighbour(), GetVectorElement(), InitVector(), TBioWorld::joint1, TBioWorld::joint2, TBioWorld::linkID, TBioWorld::linkList, TBioWorld::links, TBioWorld::m, TBioWorld::na, TBioWorld::nbAtom, NEW, NewVectorElement(), TBioWorld::nj, TBioWorld::nl, NO_UINT, and TRUE.

Referenced by InitBioWorld().

void Atoms2Transforms ( Tparameters *  p,
double *  atoms,
THTransform *  t,
TBioWorld *  bw 
)

Generates a transform from a global frame to a local frame for each link. The global frame is attached to the first link (i.e., its transform is the identity).

The local frame attached to each link (link = rigid group of atoms) is defined from the first atom in the link with at least two bonds. The first bond defines the X axis, the plane including the two bonds defines the X-Y plane, and Z is just a vector orthogonal to this plane.

By defining these local frames, we can readily obtain the configuration for a given conformation from the position of the atoms, irrespective of the frame where these positions are defined.

Note that this function does not require a fully defined BioWorld since it only uses the list of atoms for each link (i.e., the world is not used).

Parameters
p  The set of parameters.
atoms  Array with the position of the atoms.
t  Output array with the "local to global" transform for each link. The space for this array must be allocated/released externally.
bw  The bioworld with the link information.

Definition at line 962 of file bioworld.c.

References CrossProduct(), DeleteBondIterator(), DifferenceVector(), Error(), FALSE, GetAtomicNumber(), GetAtomResidue(), GetFirstNeighbour(), GetNextNeighbour(), GetVectorElement(), HasBond(), HTransformFromVectors(), HTransformPrint(), TBioWorld::linkList, TBioWorld::links, TBioWorld::m, TBioWorld::nbAtom, TBioWorld::nl, NO_UINT, and Normalize().

Referenced by BioWordConformationFromAtomPositions(), and InitWorldFromMolecule().

void InitWorldFromMolecule ( Tparameters *  p,
double **  conformation,
unsigned int  maxAtomsLink,
TBioWorld *  bw 
)

Defines a mechanical structure (a world) from molecular information.

Parameters
p  The set of parameters.
conformation  Conformation given by the atom positions.
maxAtomsLink  Links with more atoms than this limit are represented in wireframe. This is faster for visualization but it does not allow collisions with the link to be detected, so it is to be used only for visualization. Use NO_UINT for no limit (all in facy).
bw  The bioworld to initialize.

Definition at line 1130 of file bioworld.c.

References AddBody2Link(), AddJoint2World(), AddLink2World(), Atoms2Transforms(), CheckAllCollisions(), CT_REPRESENTATION, CT_VDW_RATIO, TBioWorld::cut, TBioWorld::cutB, DECOR_SHAPE, DeleteBondIterator(), DeleteColor(), DeleteJoint(), DeleteLink(), DeletePolyhedron(), DifferenceVector(), Error(), FALSE, TBioWorld::g2l, GenerateWorldEquations(), GetAtomicNumber(), GetBoxDiagonal(), GetFirstNeighbour(), GetNextNeighbour(), GetParameter(), GetSolutionPointFromLinkTransforms(), GetVectorElement(), GetWorldLink(), HIDDEN_SHAPE, HTransformApply(), HTransformDelete(), HTransformInverse(), InitLink(), InitWorld(), TBioWorld::joint1, TBioWorld::joint2, TBioWorld::linkID, TBioWorld::linkList, TBioWorld::links, TBioWorld::localPos, TBioWorld::m, MEM_DUP, MEM_EXPAND, TBioWorld::na, NEW, NewColor(), NewCylinder(), NewRevoluteJoint(), NewSegments(), NewSphere(), TBioWorld::nj, TBioWorld::nl, NO_UINT, NORMAL_SHAPE, PointInBox(), TBioWorld::pos, SumVectorScale(), VdWRadius(), and TBioWorld::w.

Referenced by InitBioWorld().

void InitBioWorld ( Tparameters *  p,
char *  filename,
unsigned int  maxAtomsLink,
double **  conformation,
TBioWorld *  bw 
)

Initializes a world structure from biological information.

Parameters
p  The set of parameters.
filename  The name for the file with the biological information. All formats available in babel can be used.
maxAtomsLink  Links with more atoms than this limit are represented in wireframe. This is faster for visualization but it does not allow collisions with the link to be detected, so it is to be used only for visualization. Use NO_UINT for no limit (all in facy).
conformation  Configuration of the molecule (space allocated internally).
bw  The BioWorld to initialize.

Definition at line 1394 of file bioworld.c.

References BioWordConformationFromAtomPositions(), BioWordSetAtomPositionsFromConformation(), DetectLinksAndJoints(), DetectLinksAndJointsFromResidues(), DetectLinksAndJointsFromRigidsAndHinges(), FALSE, GetAtomPositions(), GetMoleculeBasicInfo(), InitWorldFromMolecule(), TBioWorld::m, TBioWorld::na, TBioWorld::nb, TBioWorld::nbAtom, NEW, TBioWorld::pos, ReadMolecule(), ReadResidueList(), ReadRigidsAndHinges(), RemoveBond(), and SetAtomCoordinates().

Referenced by main().

void AdjustBioWorldGeometry ( Tparameters *  p,
TBioWorld *  bw 
)

Enforces all bond lengths and angles to be the same for each pair or triplet of atoms of different type.

We set up a set of constraints and find a point that satisfies them, starting from the current conformation and using a local procedure. The new point is used to change the atom positions in the given bioworld.

Parameters
p  The set of parameters.
bw  The BioWorld to initialize.

Definition at line 1470 of file bioworld.c.

References AddCt2Monomial(), AddEquation2CS(), AddMonomial(), AddVariable2CS(), AddVariable2Monomial(), DeleteBondIterator(), DeleteBox(), DeleteCuikSystem(), DeleteEquation(), DeleteMonomial(), DeleteVariable(), DifferenceVector(), Distance(), DotProduct(), DUMMY_EQ, DUMMY_VAR, EQU, Error(), GenerateDotProductEquation(), GenerateNormEquation(), GetAtomicNumber(), GetCSSystemVars(), GetCSVariableID(), GetCSVariableNames(), GetFirstNeighbour(), GetNextNeighbour(), InitBoxFromPoint(), InitCuikSystem(), InitEquation(), InitMonomial(), TBioWorld::m, TBioWorld::na, TBioWorld::nb, TBioWorld::nba, NEW, NewInterval(), NewVariable(), NFUN, NO_UINT, TBioWorld::pos, PrintBoxSubset(), PrintCuikSystem(), ResetMonomial(), SetEquationCmp(), SetEquationType(), SetEquationValue(), SetVariableInterval(), SYSTEM_EQ, and SYSTEM_VAR.

Tworld* BioWorldWorld ( TBioWorld *  bw )

Returns a pointer to the world generated from the bio-information.

This function is provided for convenience but caution must be taken not to modify this internal structure.

Parameters
bw  The BioWorld to query.
Returns
A pointer to the internal world structure.

Definition at line 1802 of file bioworld.c.

References TBioWorld::w.

Referenced by main().

unsigned int BioWorldNAtoms ( TBioWorld *  bw )

Number of atoms in the molecule.

Parameters
bw  The BioWorld to query.
Returns
The number of atoms in the molecule.

Definition at line 1807 of file bioworld.c.

References TBioWorld::na.

Referenced by main().

unsigned int BioWorldConformationSize ( TBioWorld *  bw )

Number of variables used to represent a conformation. This changes for different representations (see CT_REPRESENTATION). In any case, this function returns the number of system variables used to represent a conformation. This is the number of variables appearing in the output files. Internally, though, a different number of variables might be used (resulting from the simplification and dummification of the equations, if applied).

Parameters
bw  The BioWorld to query.
Returns
The number of variables representing a conformation.

Definition at line 1812 of file bioworld.c.

References GetWorldNumSystemVariables(), and TBioWorld::w.

Referenced by main().

void BioWordGetAtomPositionsFromConformation ( Tparameters *  p,
boolean  simp,
double *  conformation,
double *  pos,
TBioWorld *  bw 
)

Computes the position of the atoms in a molecule from the conformation encoded in the conformation vector. The contents of this vector change according to the representation used for the equations (see CT_REPRESENTATION).
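
Conceptually, once a conformation has been converted into one rigid transform per link, each atom position follows by applying the transform of its link to the atom's position in the link's local frame. The sketch below shows only that last step. The RigidTransform struct and the atoms_from_link_transforms function are hypothetical stand-ins introduced for illustration; the library has its own transform type and routines, and the conversion from conformation to transforms is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rigid transform: rotation matrix R and translation t. */
typedef struct {
  double R[3][3];
  double t[3];
} RigidTransform;

/* For every atom i, writes into pos[3*i .. 3*i+2] the result of applying
   the transform of its link, T[linkID[i]], to its local position
   localPos[3*i .. 3*i+2]. */
void atoms_from_link_transforms(unsigned int na, const unsigned int *linkID,
                                const RigidTransform *T,
                                const double *localPos, double *pos)
{
  unsigned int i;
  int r, c;

  assert(linkID != NULL && T != NULL && localPos != NULL && pos != NULL);

  for (i = 0; i < na; i++) {
    const RigidTransform *Ti = &T[linkID[i]];
    const double *lp = &localPos[3 * i];

    for (r = 0; r < 3; r++) {
      double v = Ti->t[r];
      for (c = 0; c < 3; c++) v += Ti->R[r][c] * lp[c];
      pos[3 * i + r] = v;
    }
  }
}
```

The output layout matches the "3 entries per atom" convention used for the pos array.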

Parameters
p  The set of parameters.
simp  TRUE if the sample is given in the simplified system.
conformation  The configuration.
pos  Array where to store the new positions (3 entries per atom).
bw  The BioWorld information.

Definition at line 1817 of file bioworld.c.

References DeleteLinkTransforms(), GetLinkTransformsFromSolutionPoint(), GetVectorElement(), HTransformApply(), TBioWorld::linkList, TBioWorld::links, TBioWorld::localPos, TBioWorld::nl, NO_UINT, and TBioWorld::w.

Referenced by BioWordSetAtomPositionsFromConformation(), and main().

void BioWordSetAtomPositionsFromConformation ( Tparameters *  p,
boolean  simp,
double *  conformation,
TBioWorld *  bw 
)

Changes the position of the atoms in a molecule. The new position is computed from the conformation encoded in the conformation vector. The contents of this vector change according to the representation used for the equations (see CT_REPRESENTATION).

Parameters
p  The set of parameters.
simp  TRUE if the sample is given in the simplified system.
conformation  The configuration.
bw  The BioWorld information.

Definition at line 1846 of file bioworld.c.

References BioWordGetAtomPositionsFromConformation(), TBioWorld::m, TBioWorld::pos, and SetAtomCoordinates().

Referenced by BioWorldEnergy(), InitBioWorld(), and SaveBioWorldBioInfo().

double BioWorldRMSE ( double *  pos,
TBioWorld *  bw 
)

Computes the Root Mean Square Error (RMSE) between the current atom positions stored in the BioWorld structure and the atom positions corresponding to another conformation. These atom positions can be computed from conformations using BioWordGetAtomPositionsFromConformation.

If the flexibility of the molecule is reduced to only a few residues, only those are considered for the error.

Parameters
pos  The atom positions for the new conformation.
bw  The BioWorld to be used as a reference.
Returns
The RMSE.

Definition at line 1854 of file bioworld.c.

References Distance(), TBioWorld::linkID, TBioWorld::na, and TBioWorld::pos.

Referenced by main().

unsigned int BioWordConformationFromAtomPositions ( Tparameters *  p,
double *  atoms,
double **  conformation,
TBioWorld *  bw 
)

Generates a conformation (internal coordinates) from the atom positions. The exact form of the internal coordinates depends on the CT_REPRESENTATION value in the file of parameters. If DOF is used, the conformation is represented in the traditional internal coordinates used in molecular modeling (dihedral angles). Other representations are not typical in bio-engineering but are common in Robotics.

Parameters
p  The set of parameters.
atoms  The 3d positions of the atoms.
conformation  The output conformation. Space is allocated internally.
bw  The BioWorld to query.
Returns
The size of the conformation.

Definition at line 1876 of file bioworld.c.

References Atoms2Transforms(), GetSolutionPointFromLinkTransforms(), HTransformDelete(), NEW, TBioWorld::nl, and TBioWorld::w.

Referenced by InitBioWorld(), and main().

double BioWorldEnergy ( Tparameters *  p,
boolean  simp,
double *  conformation,
void *  bw 
)

Computes the energy of a given configuration. This is used to implement T-RRT like algorithms on biomolecules. The signature of this function is generic for all cost functions (see EvaluateCSCost).
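
The untyped last argument is what makes the signature generic: a planner can hold a single function-pointer type and pass along opaque data that only the cost function knows how to interpret. The sketch below illustrates the pattern in isolation. All names in it (CostFunction, TargetInfo, squared_distance_cost, evaluate_cost) are hypothetical, and the parameter set and boolean flag are replaced by void * and int stand-ins so that it compiles without the library headers.

```c
#include <assert.h>
#include <stddef.h>

/* Generic cost signature modelled on the one documented here, with
   stand-in types for the parameter set (void *) and the flag (int). */
typedef double (*CostFunction)(void *p, int simp, double *conformation,
                               void *userData);

/* Hypothetical user data: a target conformation of known size. */
typedef struct {
  unsigned int n;
  const double *target;
} TargetInfo;

/* Example cost: squared distance to the target conformation. */
double squared_distance_cost(void *p, int simp, double *conformation,
                             void *userData)
{
  const TargetInfo *info = (const TargetInfo *)userData;
  double cost = 0.0;
  unsigned int i;

  (void)p; /* unused in this toy cost */
  (void)simp;
  assert(info != NULL && conformation != NULL);

  for (i = 0; i < info->n; i++) {
    double d = conformation[i] - info->target[i];
    cost += d * d;
  }
  return cost;
}

/* A caller only needs the function pointer and the opaque data. */
double evaluate_cost(CostFunction f, void *p, int simp, double *conformation,
                     void *userData)
{
  return f(p, simp, conformation, userData);
}
```

In the same spirit, a molecular energy function receives its BioWorld through the untyped pointer and casts it back internally.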

Parameters
p  The set of parameters.
simp  TRUE if the conformation is given in the simplified system.
conformation  Array giving the conformation.
bw  The BioWorld information.

Definition at line 1896 of file bioworld.c.

References BioWordSetAtomPositionsFromConformation(), and ComputeEnergy().

Referenced by main().

void SaveBioWorldBioInfo ( Tparameters *  p,
char *  fname,
boolean  simp,
double *  conformation,
TBioWorld *  bw 
)

Changes the atom positions and stores the resulting conformation into a molecular format.

Parameters
p  The set of parameters.
fname  The name for the output file. The format of the file is inferred from the extension of the filename. All the formats available in OpenBabel can be used.
simp  TRUE if the conformation is given in the simplified system.
conformation  Array giving the conformation.
bw  The BioWorld information store.

Definition at line 1905 of file bioworld.c.

References BioWordSetAtomPositionsFromConformation(), TBioWorld::m, and WriteMolecule().

Referenced by main().

void PrintBioWorld ( Tparameters *  p,
char *  fname,
int  argc,
char **  arg,
TBioWorld *  bw 
)

Store the world information (stored into the BioWorld structure) into a file so that it can be read as

Moreover, we generate a conformation deduced from the atom positions. This can be given if it is already available to the caller (for instance as an output of InitBioWorld).

The world and the conformation file can be used to generate the atlas or to plan.

Parameters
p  The set of parameters.
fname  Name of the file where to store the information (different extensions are used for the world and the sample files).
argc  Number of strings to be added to the world file header as comments.
arg  Strings to be added to the world file as comments. Right now, this is used to store the command line used to create it.
bw  The BioWorld with the information to save.

Definition at line 1913 of file bioworld.c.

References PrintWorld(), and TBioWorld::w.

Referenced by main().

void DeleteBioWorld ( TBioWorld *  bw )