SDA (SDA flex)  7.2 Simulation of Diffusional Association
SDA 7 ( SDA_Flex )
Date
2016

# Introduction: SDA 7 Main Features

The functionalities from SDA 6 and SDAMM have been unified into a single codebase, making it easier to develop new functionalities for either or both codes.

The following calculations are possible for two-solute simulations only (comparable to SDA 6 simulations):

• Association rates
• Docking
• Electron transfer

For multiple solute simulations up to high concentrations: (formerly in SDAMM)

• Translational and rotational diffusion coefficients
• Transient oligomerization and binding

For solute diffusion to surfaces:

• Docking
• Diffusion of many solutes in the presence of a surface
• Spherical or cubic geometries -with or without Periodic Boundary Conditions (PBC) - can be simulated

The following interaction types are implemented:

• Electrostatic interaction, including the method of image charges if a metal surface is present
• Electrostatic desolvation
• Non-polar desolvation
• Repulsive soft-core (repulsive Lennard-Jones) interaction for use with multiple solutes (formerly in SDAMM)
• Hydrodynamic interactions for simulating multiple solutes at high concentration (formerly in SDAMM)
• Lennard-Jones interactions for simulating protein-surface interactions (parametrized for the ProMetCS force-field)
• Metal desolvation for simulating protein-surface interactions (parametrized for the ProMetCS force-field)
• Debye-Hückel approximation of the electrostatic interaction at large solute separations

New features :

• Solutes can be represented by more than one structure to model solute flexibility or different protonation states.
• Each solute structure is associated to a set of interaction grids and charges.
• OpenMP parallelisation. This is efficient for multiple and two-solute simulations.
• Additional statistics can be computed during the simulations: residence time; first passage time; radial distribution function.
• Extended set of tools for pre- and post-analysis of simulations.
Workflow for SDA 7

# Architecture of the Solute-Grid relationship

SDA 7 is written in Fortran 90. The additional features of Fortran 90 allow for a more "object oriented design" for the software than was possible with Fortran 77 which was used for previous versions of SDA.

• Dynamic allocation and use of pointers allow for more efficient design and better memory usage.
• Modules and types are used as classes: allocation (constructor) functions, deallocation (destructor) functions, and member functions.
• Each module in the code (files which begins with mod_*.f90 ) include one main "class" and functionalities, they are all included in the library libsda.so.
• The main program sda_flex and all tools included in the software are linked to this library. This allows reusability of the code.

The main new feature of SDA 7 is the introduction of 'flexibility' for one or all of the solutes. Flexibility is introduced by considering several rigid body representations of a protein taken from an NMR ensemble, molecular dynamics simulation, normal mode analysis, or another relevant conformational sampling technique. SDA 7 can then swap between these representations based on e.g. a Monte Carlo criterion. Thus multiple conformations can be treated within a single trajectory.

The diagram below: "Diagram of Solutes, Grids and their relationship", shows the relationships between key modules and how multiple solutes and flexibility are implemented:

• Each grid ( electrostatic, exclusion, etc... ) is of the type grid.
• A specific conformation or structure of a solute, including atom positions, charges, electrostatic grid, a set of reaction criteria, etc. is stored in an object of type setofgrid.
• A solute with multiple structure is represented by a set of conformation, regrouped in the flexible objects. Flexible objects ensure than only one conformation is used at a given time, with a "current_pointer" associated to the corresponding setofgrid.
• Solutes, denoted equivalently protein in the code although they can be any type of molecule, * are defined by their position, orientation, and a pointer to setofgrid or to an instance of flexible.
• All solutes are stored in an array array_protein.

In the diagram solute 1 is flexible, and has two possible conformations. During the trajectory, solute 1 changes conformation, from conformation 1 to conformation 2 (algorithms are discussed below). The change in conformation is performed by changing the association of the pointer (p_current_node) of the flexible object to the setofgrid for the conformation

Solutes 2 and 3 are identical solutes and are not flexible. As they are not flexible, flexible objects do not need to be created.

The solutes are identical, therefore they are directly associated with a single setofgrid, resulting in efficient use of memory. Only the positions and orientations of solute 2 and 3 must be stored independently.

It is straightforward to specify many identical flexible solutes: All solutes are associated to their own flexible object, associated to their current conformation. They all share the same possible conformations, defined by N setofgrid.

Diagram of solutes (referred to as proteins in the code), grids and their relationship

The design presented is efficient for the memory usage, but has one important limitation:

• Everything related to one conformation is stored in one setofgrid, including the position of the atoms, charges and reaction criteria. But the positions evolve during the trajectory unless the solute is fixed. (We refer to them as the "movable atoms"). It would be possible to retrieve the instantaneous position of these "movable atoms" from the position and orientation of the solute stored for each protein (by rigid-body transformation). However, it is not efficient in practice. In particular, rotations require a matrix multiplication which is computationally expensive.

The compromise made in SDA (since its origin in 1998) is to apply these rotations at every step, when the Brownian Dynamics update is applied in make_bd_move. Only the rotation is applied, not the translation. The translation is indeed recomputed each time that the instantaneous position of a movable atom is needed. The full transformations (rotation and translation) are required at several stages, e.g. for computing the interactions, checking overlap with the exclusion grid, checking reaction criteria, etc., and many other functionalities. There were many reasons to use this implementation in SDA 6 (and previous versions): Only two solutes were simulated and the larger solute, solute 1, is kept fixed throughout the simulation, so this transformation was needed only for the smaller, second solute.

# Parallelisation in SDA 7

## Multiple molecule simulation (SDAMM) parallelisation

Different parallelisation schemes are used in SDA 7. To help understand the parallelisation schemes, we briefly discuss each simulation type, and then the strategy used for parallisation with some pseudo codes.

In SDAMM simulations, only one trajectory is generated and the trajectory has a predefined number of steps. The most computationally expensive subroutines are :

• Force calculations: can be easily more than 95 % of the time, algorithm of O(N^2) with N being the number of solutes
• Brownian Dynamics move: linear scaling with N
• Energy calculations: only performed before writing the configuration in the trajectory output file

The method of parallelisation is a straightforward application of OpenMP, with the main loop split between threads in these specific subroutines. For the force and energy calculations:

• The forces and energes are computed by a double loop over all pairs of solutes, the sum of the interactions of each solute is added to an array "total_interaction".
      do s1 = 1, n
do s2 = s1 + 1, n
interaction12, interaction21 = compute_interaction_between_solute1_and_solute2()
total_interaction ( s1 ) = interaction ( s1 ) + interaction12
total_interaction ( s2 ) = interaction ( s2 ) + interaction21
end do
end do
• The first loop is split over different threads, the REDUCTION keyword assures the correction synchronization * when updating the "total_interaction" array.
     !$OMP PARALLEL !$OMP REDUCTION(+:all_interaction)
!$OMP DO SCHEDULE( DYNAMIC ) do s1 = 1, n ! only the first loop is split over the threads do s2 = s1 + 1, n interaction12, interaction21 = compute_interaction_between_solute1_and_solute2() ! secure because of REDUCTION total_interaction ( s1 ) = interaction ( s1 ) + interaction12 total_interaction ( s2 ) = interaction ( s2 ) + interaction21 end do end do !$OMP END DO
!$OMP END PARALLEL • For the BD update : there is only one loop over all solutes. ## Parallel Performance • The scaling of parallel calculations can be heavily system dependent, so you should test your system before carrying out large calculations, especially at high levels of parallelisation. You can expect that larger systems will scale better on more cores. We tested a system of 128 lysozyme molecules on a shared memory AMD Magny Cours machine with 4 CPUs of 12 cores for a total of 48 cores. We present the following results as a guideline. • Force and energies scale relatively well up to 24 CPUs, the BD update does not scale after approx. 10 CPUs. • Without accounting for the overhad of thread creation and synchronization, the calculation rapidly becomes limited by the Amdahl's law, .i.e. the serial section in the loop becomes the bottleneck to scaling. • With larger systems (512 lisozymes), we obtain a scaling of 80 % with 24 CPUs. ## Two solute simulation (sda_2proteins) parallelisation The implementation of the parallelisation with two solutes is very different from that for SDAMM: • For association rate calculations or docking, we want to perform many trajectories to achieve good statistics. • We do not know the time required for a trajectory to complete (timemax or escape) • There is no clear "most expensive function", it depends on the simulation performed. • Force and energy cannot be parallelised in the same way as for SDAMM, the parallelized loops should be inside the interaction calculations The approach chosen for parallelisation is to split the total number of trajectories across the different threads. Each thread is responsible for running a full trajectory. When one trajectory is finished, the thread will start calculating a new trajectory, until no more trajectories need to be conducted. The difficulties with this approach are: • An independent copy of the protein object is needed for each thread because the solute's position, orientation and movable atoms cannot be shared. However, the grids are fixed and only used for reading, so only one copy of the setofgrid is required, no matter how many threads are used. • The synchronization of shared objects. Every function must be thread-safe and the update of most of the shared arrays (e.g. complexes) must be protected from concurrent writes. • The following choices were made for a thread safe implementation for two-solute calculations: • An independent copy of Solute 2 (the moving solute) is assigned to each thread. It includes a copy of the movable arrays even if the solute is not flexible, and a copy of the associated flexible object in this case. • A local copy of the list of complexes is also assigned to each thread. They are merged into a global list of complexes at user-defined intervals (e.g. every 100 trajectories). • The insertion of a complex into the list of complexes is not thread-safe. • The copy and merging into a global list of complexes is probably the best compromise. The local list of complexes are typically smaller than the global one. Additionally, checking and inserting a new complex into the local list is faster. • Merging local into global lists is protected by a CRITICAL section, but it is infrequent (every 50-100 trajectories done by the thread). After a few trajectories have been performed, the desynchronisation between threads becomes an advantage (attractive feature). • For the update of global arrays (association rates, residence time...), a temporary and local array is owned by each thread throughout an entire trajectory. The update of the shared array is only done at the end of the trajectory. The update of global arrays are usually simple updates adapted for the use of the ATOMIC section, which tends to have minimal performance overheads.  !$OMP ATOMIC
global( i ) = global ( i ) + local ( i )
• Some arrays, like those used for the bootstrap or to store the number of steps per trajectory, are designed to store a unique result on a per trajectory basis. Therefore there is no need to protect them against concurrent writes. A schematic outline of MainLoop_sda_2proteins_omp.f90 (variables assignment as SHARED or PRIVATE are omitted) is given below:
• All objects have been previously initialized in read_input_file.f90
    MainLoop( tab_protein_unique )
!$OMP PARALLEL ! allocate local array, OpenMP or not allocate ( local_escape, local_sumtime,...) ! if OpenMP make a copy of the Solutes !$  if ( tid > 0 ) then
!$call copy_array_protein ( tab_protein_unique, ptab_protein, type_calc ) ! normal if not OpenMP or master !$  else
! write (*,*) 'Associate ptab_protein to tab_protein_unique'
ptab_protein => tab_protein_unique
!$end if ! if OpenMP make a local copy of the list of complexes ( here empty, only copy data members ) !$  if ( tid >= 0 )  then
!$call copy_record ( o_complexes, p_local_complexe, .false., size_thread_complexe ) ! else normal if not openMP !$  else
! associate the global pointer to o_complexe if not openmp
if ( associated ( o_complexes % pcomplexe ) ) then
p_local_complexe => o_complexes
end if
!$end if ! not necessary but convenient, simpler to refer to solute 1 and 2 protein1 => ptab_protein % array ( 1 ) protein2 => ptab_protein % array ( 2 ) !$OMP BARRIER
!$OMP DO SCHEDULE( DYNAMIC ) LOOP_RUN: & do run = 1, total_traj !$ if need to merge complexes every 100 runs
!$OMP CRITICAL ( crit_merge_p_complexe ) !$      call merge_complexe ( o_complexes, p_local_complexe )
!$OMP END CRITICAL ( crit_merge_p_complexe ) !$ endif
call init_position_protein()
! begin BD loop
LOOP_STEP: &
do
call force_epedhdlj_2proteins_fast( protein1, protein2 )
call make_bd_move_2prot( protein1, protein2 )
call check_reaction_pairs( protein1, protein2 )
! add to the list of complexes
if ( complexe_to_record .eqv. .true. )
call energy_epedhdlj_2proteins_fast( protein1, protein2, energy )
call add_record ( p_local_complexe, ptab_protein % array ( 2:2 ), energy )
end if
! use atomic inside
call update_residence_time ( resid_time, protein2 % position, dist, dtnow )
end do LOOP_STEP
! only at the end of the trajectory
call update_escape()
end do LOOP_RUN
! deallocate Solutes for all threads, except the MASTER
nullify ( protein1, protein2 )
!$if ( tid > 0 ) then !option true for NOT deleting the grid !$     call delete_array_protein ( ptab_protein, .true. )
! now need explicit deallocation of the pointer
!$deallocate ( ptab_protein, stat = ierr ) !$     nullify ( ptab_protein )
!\$ end if
! same deallocation for local_complexes

The use of OpenMP pragma allows a compilation with or without OpenMP (without loss of performance). With this implementation a high level of parallelization is achieved because there is no sequential code in the main loop. The merging of the lists of complexes is still the main limitation, but their frequency can be tuned for a specific system and computer architecture.

### Run completion (end effect) issues

A demonstration simulation with 6 trajectories and 3 available threads can be represented by the scheme below

End Effect : threads are waiting at the end of the simulation

Each thread runs calculations for one trajectory. When a trajectory is completed, the thread will run the next trajectory. At the very end of the calculation there are no more trajectories to perform and N-1 CPUs are in a waiting state. This may be a problem for association rate calculations where trajectories can, in principle, last for an infinite amount of time. In practice this effect is usually minor.

It is difficult to estimate the end effect during runs to test the performance of SDA 7 (relatively short runs with X trajectories, where X is generally small ~ 100), so the scaling is generally underestimated compared to that expected for a real simulation (> 10,000 trajectories).

In the most favorable case, association rates calculations (with no complexes recorded) have scaling reaching 80 % on 48 CPUs.

# Computation of forces and energy in SDA 7

The implementation of computations of forces and of energies are similar in sda_2protein and SDAMM, but they are nevertheless implemented in different functions for optimization. The differences are the double-loop over all pairs of solutes in the case of SDAMM (as seen in the parallel section) and the fact than the first solute is always kept fixed in sda_2proteins.

In the following sections, we use the example of the computation of the energy of the charges on solute 1 in the potential grid of solute 2.

## Charges and Grids

The following five types of interaction are described as "charges" interacting with a potential grid.

• Coulombic electrostatic interaction : The effective charges of solute 1 and the potential grid (minus the derivative of the potential for the force) of solute 2.
• Electrostatic desolvation interaction : The square of the effective charges and the electrostatic desolvation grid.
• Non-polar desolvation : The surface accessibility of atoms and the non-polar grid.
• Repulsive soft-core (Lennard-Jones): The surface accessibility of atoms and the repulsive soft-core Lennard-Jones grid. In this case, no "charges" are needed in the computation, only the position of atoms. But to have an identical algorithm, we introduced a dummy_array which contains only values of 1 for the charges.
• Lennard-Jones for surface: The same dummy_array is used for the charges, and a LJ grid is generated for each type of atoms on the surface.
• Image-charge: Use both effective charges and the electrostatic grid of the same solute

The other energy terms: metal desolvation, and the correction to the image-charge have a different implementation and will not be considered in this section.

All energies are computed as :

$E_{12} = \sum_{i=1}^n q_1^i(x,y,z) \, U_2(x,y,z)$

where $$E_{12}$$ is the energy of the effective charges from solute 1, $$q_1$$, in the potential of solute 2, $$U_2$$.

With the exception of the electrostatic energy, the total energy is evaluated as the sum of those components, $$E = E_{12} + E_{21}$$.

For the electrostatic interaction energy, the definition is slightly different. The total electrostatic energy should satisfy:

$E_{el} = E_{12} = E_{21}$

But due to approximations in the electrostatic potential grid and effective charges computations, the results of $$E_{12}$$ do not exactly equal $$E_{21}$$. In order to use the same algorithm for each type of interaction, we estimate the total electrostatic energy as the average of the two components:

$E_{el} = \frac{1}{2} ( E_{12} + E_{21} )$

Where the factor $$\frac{1}{2}$$ is applied on the electrostatic potential grids when they are read into memory. It can be adjusted in the SDA7 input file with the parameter epfct (default 0.5)

The same applies for the forces, with the difference that we need the derivatives of the energy:

$\vec{f_{12}} = \sum_{i=1}^n - \, q_1^i(x,y,z) \, \nabla U_2(x,y,z)$

The values of the energy and the derivatives are extrapolated from the grids (see functions mod_force_energy::potential_atom_uhbd and mod_force_energy::force_atom_uhbd ). Note that pointers to the grids are used, rather than the original grid. The use of pointers allows for these functions to be used on any 3-dimensional grid of real*8. It is therefore relatively straightforward to implement calculations of new energy and force terms.

## Choice of the Grids

All grids and charges have a type associated with them (enumerated in mod_gridtype), all grids derive from the same class mod_grid. The types are used to select the correct association between charges and grids. This is performed by the function mod_protein::select_charge. Given the grid_number, the mod_protein::select_charge function associates pointers to the charges and the grids. Pointers are then used for computing the interactions by the same subroutine. A simple and general method for evaluating the forces is summarized here schematically. It is simplified pseudo-code, from mod_compute_force::force_epedhdlj_2proteins, which does not take into account asymmetric interactions.

   subroutine energy_epedhdlj_2proteins ( prot1, prot2,...)
! loop over grid, sure to be associated because a pre-sorting is done in param_force_energy
GRID:  do i = 1, param_force_energy % nb_grid
! extract the grid number, identical to accessing the sogrid directly
grid_number =  param_force_energy % array_grid_calc ( i, 1 )
call compute_force_sda_2proteins ( prot1, prot2, grid_number,..)
end do
end subroutine
   subroutine compute_energy_sda_2proteins ( prot1, prot2, grid_number,..)
! assign correctly the pointers from the grid_number
call select_charge ( grid_number, prot1, prot2, nq1, nq2, p_charge1, p_charge2, &
p_charge_position1, p_charge_position2, dummy_array )
! compute force, discussed in detail the next chapter
Energy 12 = Sum of p_charge1 * U ( p_charge_position1, prot1 % p_sogrid % pset ( grid_number ) )
Energy 21 = Sum of p_charge2 * U ( p_charge_position2, prot2 % p_sogrid % pset ( grid_number ) )
end subroutine

## Simple Algorithm for computing forces and energies

There are several possible methods for calculating the force (and energy), here we present the simplest one. Its advantage is, as its name suggests, simplicity, and in addition generality. This method should be used as the first when a new type of interaction is being added to the code, since it greatly reduces the risk of introducing errors. The principal difficulty is dealing with the coordinate transformation between frames of reference, because the grids are fixed and never rotated.

Computation of forces

The figure "Computation of forces" shows the general computation of $$f_{21}$$, the force applied on solute 2, when both solutes can rotate (e.g. the case of SDAMM).

The solute 1 grid shown with the dashed line represents the grid orientation in the fixed frame of reference R', and is stored in memory. The solid lines represents the grid orientation corresponding to the instantaneous orientation of solute 1 (frame R).

The simple algorithm is a loop over the charges of solute 2:

• The position of the charge $$i$$ in solute 2 is computed in the fixed frame of reference of the grid (R') for the rotated and translated dashed "solute 2".
• The force is computed in this reference frame as $$f'= - q.\nabla U$$.
• f' is transformed back into R, it is shown as f.
• The torque, t, is computed as the cross product of the position and the force : $$\tau = r \wedge f$$ * (not shown in the picture).
• f and t are added to the total force and total torque on solute 2.
• Below is an extract from the SDA 7 code (mod_compute_force_sdamm::force_epedhdlj_sdamm) for the computation of the force and the torque on solute 2.
 subroutine compute_force_sdamm_2proteins( prot1, prot2, pos_relat,..)
! get the orientation of the first solute
ex = prot1 % orientation_x
ey = prot1 % orientation_y
ez = prot1 % orientation_z
! inverse the matrix, it is a unity matrix : transposed equal the inverse
xe(1) = ex(1)
xe(2) = ey(1)
xe(3) = ez(1)
ye(1) = ex(2)
ye(2) = ey(2)
ye(3) = ez(2)
ze(1) = ex(3)
ze(2) = ey(3)
ze(3) = ez(3)
! loop over the charges of solute 2
FORCE2: do j=1,nq2
! pos_relat is the relative position of the center of geometry of the solutes, and p_charge_position2 have the correct orientation
p2 = p_charge_position2 (:,j)
! relative position of the charge in the frame of reference R
rto = p2 + pos_relat
! make the trnasformation between the frames of reference, rtn is the relative position in the frame R'
call tr ( rto, rtn, ex, ey, ez )
! get the derivative of the potential at the position rtn
call force_atom_uhbd ( rtn, force, prot1 % p_sogrid % pset_grid ( grid_number) )
! transform back into R
call tr ( force, force_o, xe, ye, ze )
! multiply by the charge to get f in R. Back transformation and multiplication can be inverted here
qf = force_o * p_charge2(j)
! call the cross product to get the torque, directly in R
call cross (tmp2, p2, qf )
! add to the total torque and force
tt2 = tt2 + tmp2
ff2 = ff2 + qf
end do FORCE2

With the code presented, we could compute all interactions, both for sda_2proteins and SDAMM. There are nevertheless different implementations for optimizing the calculations. In sda_2proteins, solute 1 is always fixed, so that some computations are simpler: the charges of solute 2 remain in the reference frame.

## Fast Algorithm

Two improvements of the simple algorithm are made:

• In the simple algorithm, for each charge on solute 2, the position of the charge is rotated into the frame of reference of solute 1, the force acting on solute 2 due to this charge is calculated, and the resulting force vector is rotated back into the frame of reference of solute 2. The total force on solute 2 is then calculated as the sum of the forces that have been rotated back into its own frame of reference. This approach is inefficient. The total force acting on solute 2 can be summed in the frame of reference of solute 1, then only a single force vector needs to be rotated back into the frame of solute 2. This is the approach used in the fast algorithm, which reduces the required number of calls to the tr() and cross() functions by approximately a factor of two.
• Both electrostatic potential and electrostatic desolvation calculations use the effective charge positions. * In the former case with the effective charges and in the latter with the square of the effective charges. * This implies that exactly the same geometric transformation is performed twice. Therefore, we can compute the electrostatic and the desolvation terms at the same time. Again, we can divide the number of calls to tr() by two.
• We can use the same logic as above for the case of the non-polar desolvation and repulsive soft core (Lennard-Jones), which both use the positions of the surface atom.

The functions that implement these improvements have an equivalent name to the functions implementing the simple algorithm, with the "fast" suffix added. The module mod_force_energy stores alot of information about the required calculations. In particular, where beneficial, it uses the optimized (i.e. fast) functions.

The fast algorithm is more difficult to maintain than the simple algorithm, but in the best case (SDAMM with all four interactions) there is a 50% gain in performance.

# Record complexes and trajectories files

SDA 7 has three output file types which are used to store the positions and orientations of solutes, along with energy information. Complex files are used to store the docked positions of the mobile solute, relative to the fixed central solute, obtained from docking simulations. Trajectory files may be generated from either two solute or many solute simulations. In two solute simulations, there are used to store trajectories of the mobile solute relative to the fixed solute, while in many molecule simulations they show the trajectories of all solutes, in the frame of the simulation box. Finally, restart files can be used to restart many molecule simulations from the end of a previously run simulation. They have the same form as a single frame from a trajectory file.

The the form of these output files has been changed, compared to SDA6. The specifications for the new design were :

• Reuse of the code: All simulation routines in SDA 7 (sda_2proteins, SDAMM, sda_koff,..) use the same output formats. Furthermore, many tools use complexes or trajectories files as input and/or output. The common tasks, such as file opening, reading, and copying should be easy to implement in the tools
• Easy to maintain: A change in the format of a file, like adding new information (new types of interaction energy, new descriptor for a solute...) should not make it necessary to modify all existing tools
• Consise and optimized in size: As interactions or new properties are added, the number of fields present in a line of a file may increase to accomodate this. In previous versions of SDA, a docking simulation with only electrostatic interactions was producing a docked complex line with more than 25 fields, half of them filled with zeros, as the interactions defined by these fields were not used in the simulation. In SDA 7, the software varies the number of fields per line, so that only those required are present in the file. The option to produce and read binary files format has also been included.

A versioning has also been included to maintain compatibility with previously generated files, should new formats be added.

## Design of the module record

The requirements cited above apply to both complex and trajectory files. Their functions are indeed very similar: they record information about the solutes (solute number, conformation number...), the solute's position and orientation, the energy interactions terms and additional statistics dependent on the type of simulation. However their use differs. New data for trajectory files is always appended to the file each time it is written, to give a chronological order to the file. On an other hand, complex files are ordered by the energy values of the recorded complexes. The best complexes (with the lowest interaction energy) are at the top of the file, while those with the largest interaction energies not stored, this gives the maximum number of relevant complexes, which can be analysed later with external tools, such as clustering algorithms.

The implementation of the complex and trajectory files in SDA 7 groups the similarities into a main class record. All the functions/tools that use complexes and trajectories objects, should call functions from record.

Moreover, each "line" of these files is defined in the class one_complex (it's called 'complex' but it refers to one solute).

The main difference between complexes and trajectories files is in the implementation of intermediate types: ListComplexes and Trajectory. There are different algorithms for collecting the objects of type one_complex into complexes and trajectories:

• An array of one_complex is used for storing trajectory data in memory. It is filled sequentially and written to disk when full.
• A list is implemented for storing the one_complex of the complexes file. The insertion or deletion of a one_complex at any position is then optimized.

The scheme below shows a schematic design of those modules: The information stored in each one complex can vary but it always follows the same order:

• Integers array int_ocompl: stores the description of the trajectory and solute. Any of those fields can be printed or not, but their order is fixed:
• trajectory number
• step number
• solute number
• conformation number
• Position and orientation of the solute: 9 fields of real*8 (3 for translation and 6 for the orientation of the inner x and y axis) are mandatory
• Total energy: the sum of the different energy terms, this field is always present
• Real array real_ocompl: stores the energy interaction terms. * Just like the integer array, these are variable fields but their order is fixed (following the numbering in mod_gridtype.f90 )
• Coulombic electrostatic (ES)
• Desolvation electrostatic (ED)
• Nonpolar/hydrophobic desolvation (HD)
• Repulsive soft-core(Lennard-Jones) (rLJ)
• Full Lennard-Jones (LJ)
• Electrostatic image charge (ImgChg)
• Metal desolvation (MetDes)
• One more real array other_real_ocompl: stores additional data, depending on the type of calculation
• In the complexes file, 3 additional fields: the occurence of each complex, its average total energy and the standard deviation of the total energy
• In the trajectories files: the time in ps
• Three integers representing the information about the box: when PBC are used in SDAMM, these fields are used to write in which copy of the box the solute resides currently.

The information about what is written to disk in each file is coded in the header of the complexes/trajectories file (defined in description_column_str), if these are in ascii format. The header contains all information to read the following lines. It is constructed in header_str or directly printed from write_header_bin, and is composed of:

• The version number: currently, the newest is "#version0.2"
• The type of calculation: either the enumeration code from record_type_calc or its string version
• The type of record: complex or trajectory, written also either with its enumeration value or with a string
• The total number of solutes: as given by the user in the input for SDAMM, fixed at 2 for sda_2proteins
• The number of types of solutes: used to split the energy terms between the types of solutes in SDAMM
• Five integers: they define all parametrizable options for printing the complexes. These are described in the next paragraph
• Optionally, for sda_2proteins, the centers of geometry of the solutes are written in the next 2 lines

## Coding the input options

The five mysterious integer values present in the header describe the following options:

• The value of bit_integer: indicates which of the integer variable fields of the one_complex object are present
• The value of bit_energy: indicates which of the energy terms were computed
• The value of nb_other_real: how many fields are used for the additional information. Currently, the complexes file expects nb_other_real=3 and trajectories nb_other_real=1
• The boolean value opt_sum: if true (1), the energy terms representing interactions between different types of solutes are summed up, otherwise, they are output separately
• The boolean value box_info: if the box information should be output (in SDAMM, if PBC is set)

The first 2 integers are read and written as bits. The idea was to store the information about the presence of a term or not and avoid repetitive numbers.

The implemented solution uses the binary representation of those integers as logical values. As an example of bit_energy, the value of 5 in binary representation is written as:

       5
00000101 <- we have to read from the right to the left

This value codes the following information (remember that the order of the terms is fixed):

• first term 1: electrostatic energy was computed
• second term 0: electrostatic desolvation is not present in the file
• third term 1: apolar desolvation was computed
• fourth term 0: repulsive soft-core (Lennard-Jones) was not computed
• fith term 0: Lennard-Jones was not computed
• (all other fields are 0 and are not currently used)
• Another example for a bit_integer of 14 - its binary format is:
      14
00001110

Which means that:

Which means that:

• first term 0: No trajectory number is present
• second term 1: Step number is present
• third term 1: Solute numbers are present
• fourth term 1: Conformation numbers are present (all other fields are 0 and are not currently used)

In principle, all the classes using the complexes/trajectories, should use the interface functions implemented in mod_record and mod_onecomplexe, and not deal directly with the BITS functions themselves!

Below, we show a very simple exemplary code. Real examples can be found in tools (specifically in read_record.f90 and in trajectory2dcd.f90).

The code reads an existing record (complex or trajectory), extracts useful information and writes the new complexes or trajectory file to the disk.

    ! simple initialization, nullify pointers. clean for allocated objects
call init_record ( o_record, opt_iocmx = io_record )
! set all internal fields and bits from the header
! if optional nb_line, count the number of lines and assign it to nb_line
! finalize a correct initialization, optional .true. do not write the header
call initialize_record ( o_record, tab_prot, .true. )
! can read all data at once
! for traj can proceed by steps or multiple of steps
! reset position, complexe or trajectory
call reset_position ( out_record )
! can loop over each one_complexe's
! p_tmp_onecomplexe is a convenient shortcut (for o_record % p_onecompl),
! will be updated with the getNext optional argument
p_tmp_onecomplexe => o_record % p_onecompl
! make a loop over all entries
do while ( ( ierr == 0 ) )
! if correctly set_up applies for sda 2proteins and sdamm
do i = 1, nb_prot_in
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! Here make whatever you want
! e.g., get the conformation number, method 1 with indice_nconf data member of record
if ( iflex ) then
tmp_conf = p_tmp_onecomplexe % int_ocompl ( o_record % indice_nconf )
end if
! get integer and energy information from the current one_complexe
! order is natural : 4 integer, as many as necessary real
call return_compatible_array_wrap ( o_record, array_int, array_real, other_real )
! e.g., get position and orientation
!
! Here need to declare tab_prot explicitely before, see example in trajectory2dcd.f90
tab_prot % array ( tmp_prot ) % position = p_tmp_onecomplexe % position
tab_prot % array ( tmp_prot ) % orientation_x = p_tmp_onecomplexe % orient_x
tab_prot % array ( tmp_prot ) % orientation_y = p_tmp_onecomplexe % orient_y
!!! Write output
! allocate onecomplexe, with different fileds (coming from record_comp)
allocate ( p_new_onecomplexe, stat = ierr )
call new_onecomplexe ( p_new_onecomplexe, record_comp % nb_int, record_comp % nb_real, record_comp % nb_other_real )
call set_onecomplexe_wrap ( record_comp, p_tmp_onecomplexe % position, p_tmp_onecomplexe % box, &
p_tmp_onecomplexe % orient_x, p_tmp_onecomplexe % orient_y, &
array_int, array_real, other_real, opt_p_onecomplexe = p_new_onecomplexe )
! always insert onecomplexe at the end
call insert_new_node ( record_comp % pcomplexe % p_last, p_new_onecomplexe )
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! go to the next one_complexe
! optionaly increment p_tmp_onecomplexe to the next element
call getNext(  record_traj, p_tmp_onecomplexe, ierr )
end do ! loop over solutes
! or write the data here in case of sdamm, after having read all solutes
! increment the loop
nloop = nb_line + 1
end do ! loop over nb_line
! clean memory from the record
call delete_record ( o_record, .true. )
! write output into disk and clean the memory
call write_record ( record_comp, .false. )
call delete_record ( record_comp, .true. )

# Compiling SDA 7

We have tested SDA 7 compilation on Linux with:

• gnu fortran : 4.4 to 4.8.3
• intel fortran: 11 and 14
• Portland fortran: version 12 and 13.4 (only partly tested!)
• XL Fortran: even less tested, but compile

We had errors during the link stage with a MacOS (maverick XOS with gfortran 4.8). In the provided Makefile, the default options for the intel compiler (-warn all) are very strict. The declaration of interfaces (for all the functions not contained in a given module) is mandatory in this case. Fortran does not use header files, so without those interfaces it is impossible for the compiler to notice a wrong or even missing argument of a function. It is convenient to check the integrity of the code, and may save (a lot by experience) debugging time.

# Memory leak tools

SDA 7 uses extensive memory allocation. It requires explicit allocation of and deletion of objects from the heap memory. Valgrind ( http://valgrind.org/ ) has been used during the development of the software. It is invaluable for finding mistakes (forget to free memory) or bugs (writing outside an array) in the code.

In 90 % of the cases, it answers the question (and gives the origin of the error if the code was compiled with "debug=yes" option): Why does my program crash on the cluster? It was working fine on my workstation!!

We recommend to use this tool when modifying the SDA7 code or developing a new module/tool to avoid accumulation of mistakes and errors. The released version gives 0 errors and 0 memory leaks in the serial version, when compiled with gfortran (as far as tested), however there are a few errors related to the use of OpenMP when using Valgrind.

When compiling with the intel compiler, more errors are found by Valgrind, many related to the use of SSE instructions.

Imprint/Privacy