Understanding the Density object obtained from MDAnalysis DensityAnalysis

Question:

I am having trouble understanding what exactly the Density object produces from the DensityAnalysis class. The documentation is found here.

Running the code is not the problem, but understanding what exactly the Density object produces and how to interpret the information.

What is meant by the "(113, 113, 113) bins"?
I’ve seen examples in the MDAnalysis User Guide, but I still can’t understand what this is or how to interpret it.

    from MDAnalysis.analysis.density import Density
    from MDAnalysis.analysis.density import DensityAnalysis
    from MDAnalysis import Universe
    import numpy as np

    PDB = '/Users/joveyosagie/Desktop/1vmd6lu7.pdb'
    DCD = '/Users/joveyosagie/Desktop/1vmd6lu7.dcd'
    u = Universe(PDB, DCD)
    protein = u.select_atoms('protein')

    OH2 = u.select_atoms('name OH2')    # select water oxygen atoms
    D = DensityAnalysis(OH2, delta=1.0)  # each bin in the histogram has a size of 1 Angstrom
    D.run()

    D.density

[Out]Density density with (113, 113, 113) bins
Asked By: Jovey Osagie


Answers:

The MDAnalysis.analysis.density.Density object holds the 3D histogram that was generated with DensityAnalysis. A density is generated by counting how often a particle shows up in one small region of space, called a volume element, voxel, or "bin" (an orthorhombic box with, e.g., 1 Å edge length; Density.delta tells you the exact voxel dimensions). We average over all frames of a trajectory and normalize the count to get a density (particles per volume). The raw NumPy array with shape (num_bins_x, num_bins_y, num_bins_z) is accessible as Density.grid. In your output, "(113, 113, 113) bins" is this shape: 113 voxels along each of x, y, and z.
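The counting-and-normalizing idea can be sketched with plain NumPy. This is only an illustration of the principle, not the code that DensityAnalysis runs; the random "trajectory", the 10 Å box, and all variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trajectory: 50 frames of 200 particles
# with coordinates inside a 10 Å cube.
n_frames, n_particles, box = 50, 200, 10.0
trajectory = rng.uniform(0.0, box, size=(n_frames, n_particles, 3))

delta = 1.0  # voxel edge length in Å
edges = [np.arange(0.0, box + delta, delta)] * 3  # bin edges along x, y, z

# Count how often a particle falls into each voxel, over all frames.
counts = np.zeros((10, 10, 10))
for positions in trajectory:
    frame_counts, _ = np.histogramdd(positions, bins=edges)
    counts += frame_counts

# Average over frames, then divide by the voxel volume: particles per Å^3.
density = counts / n_frames / delta**3

print(density.shape)             # (10, 10, 10)
print(density.sum() * delta**3)  # average particles per frame, about 200
```

Summing density times voxel volume over the whole grid gives back the average number of particles per frame, which is a handy sanity check on any density grid.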

The density is associated with the coordinate system of the original simulation. Thus, we also need to know where the origin of the grid of voxels is (Density.origin holds this information). With origin, delta, and the shape of the array we can now calculate where each bin is located in space. The Density.edges attribute provides the values of the bin edges along the x, y, and z axes. For instance, edges = [np.array([-2.5, -0.5, 1.5, 3.5]), np.array([0., 1., 2.]), np.array([2.5, 4.5])] belongs to a grid with shape (3, 2, 1) and delta = np.array([2.0, 1.0, 2.0]). The bin in the lower left-hand front corner is at origin (-1.5, 0.5, 3.5) (the origin is at the bin’s center) and contains points with coordinates -2.5 ≤ x < -0.5, 0 ≤ y < 1, and 2.5 ≤ z < 4.5.
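The relationship between origin, delta, shape, and edges in this example can be checked with a few lines of NumPy. This is a sketch of the arithmetic only; bin_index is a hypothetical helper written for this illustration, not a method of Density.

```python
import numpy as np

origin = np.array([-1.5, 0.5, 3.5])  # center of the first bin
delta = np.array([2.0, 1.0, 2.0])    # voxel size along x, y, z
shape = (3, 2, 1)                    # number of bins along x, y, z

# Bin edges: the first edge lies half a voxel below the origin.
edges = [origin[i] - 0.5 * delta[i] + delta[i] * np.arange(shape[i] + 1)
         for i in range(3)]

# Bin centers: one per bin, starting at the origin.
centers = [origin[i] + delta[i] * np.arange(shape[i]) for i in range(3)]


def bin_index(point):
    """Return the (ix, iy, iz) index of the voxel that contains point."""
    lower = origin - 0.5 * delta
    idx = np.floor((np.asarray(point) - lower) / delta).astype(int)
    return tuple(int(i) for i in idx)


for axis_edges in edges:
    print(axis_edges)

print(bin_index([0.0, 1.2, 3.0]))  # (1, 1, 0)
```

The point (0.0, 1.2, 3.0) lands in the second bin along x, the second along y, and the only bin along z, so its value would be read from grid[1, 1, 0].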

The class contains methods to change in which units the density is stored, namely Density.convert_density(). This method changes the underlying data by multiplying the values stored in grid with an appropriate factor.
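What "multiplying with an appropriate factor" means can be shown on a toy array. The sketch below assumes a reference of bulk water at 1.0 g/cm³ with a molar mass of 18.0154 g/mol; the exact factor MDAnalysis uses internally may differ slightly, and the grid values are invented.

```python
import numpy as np

# Assumed reference: bulk water at 1.0 g/cm^3, molar mass 18.0154 g/mol.
AVOGADRO = 6.02214076e23       # 1/mol
ANGSTROM3_PER_CM3 = 1e24       # 1 cm = 1e8 Å
water_number_density = 1.0 * AVOGADRO / 18.0154 / ANGSTROM3_PER_CM3  # Å^-3

# Hypothetical density grid in particles per Å^3.
grid = np.array([[[0.0334, 0.0668], [0.0, 0.0167]]])

# Converting units rescales every voxel by the same constant factor.
factor = 1.0 / water_number_density
grid_relative = grid * factor

print(round(water_number_density, 4))  # 0.0334
print(grid_relative)
```

In the rescaled grid a value near 1 means "as dense as bulk water", a value near 2 means twice as dense, and so on, which is usually easier to read than raw Å⁻³ numbers.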

Other methods are inherited from the gridData.core.Grid class that forms the basis for Density. See the GridDataFormats documentation for what else one can do with these classes. For instance, one can treat two Density objects as numpy arrays and perform arithmetic on them such as subtracting them to get a difference density.

Example: comparing water densities

You can subtract densities if they were generated on the same coordinate system (i.e., have the same edges).
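A quick way to test this condition before subtracting is to compare the bin edges of both grids. The helper below is a hypothetical convenience function written for this answer, not part of MDAnalysis or GridDataFormats; it would be called with the edges attributes of the two densities.

```python
import numpy as np


def same_grid(edges_a, edges_b, tol=1e-6):
    """Return True if two lists of bin-edge arrays describe the same grid."""
    if len(edges_a) != len(edges_b):
        return False
    pairs = zip(map(np.asarray, edges_a), map(np.asarray, edges_b))
    return all(a.shape == b.shape and np.allclose(a, b, atol=tol)
               for a, b in pairs)


# Made-up edges for illustration.
edges_1 = [np.array([-2.5, -0.5, 1.5, 3.5]), np.array([0.0, 1.0, 2.0]),
           np.array([2.5, 4.5])]
edges_2 = [e.copy() for e in edges_1]
edges_shifted = [e + 0.5 for e in edges_1]

print(same_grid(edges_1, edges_2))        # True
print(same_grid(edges_1, edges_shifted))  # False
```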

Let’s compare the water density for two simulations to see what the ligand does to the water:

  • apo (no ligand): u_apo Universe
  • holo (with ligand): u_holo Universe

First superimpose trajectories on a common reference structure so that the proteins are in the same coordinate system. You can use MDAnalysis.analysis.align.AlignTraj as described under Aligning a trajectory with AlignTraj or see the more elaborate instructions in the User Guide on density analysis under Centering, aligning, and making molecules whole with on-the-fly transformations.

We then need to make sure that our densities are analyzed in the same coordinate system.

  • Find the common reference center.
  • Set the same number of bins in both densities. We assume that a 30 Å x 30 Å x 30 Å cube is sufficient but you’ll have to figure out the correct dimensions for an actual analysis.

    from MDAnalysis.analysis.density import DensityAnalysis

    # u_apo and u_holo are the (already superimposed) Universes described above
    protein_apo = u_apo.select_atoms("protein")
    gridcenter = protein_apo.center_of_mass()  # should be the same as for holo

    # select water oxygens for density
    water_apo = u_apo.select_atoms("resname SOL and name OW")  # adjust for your simulation
    water_holo = u_holo.select_atoms("resname SOL and name OW")

    # perform the analysis on identical grids (same center, same dimensions)
    D_apo = DensityAnalysis(water_apo, gridcenter=gridcenter, xdim=30, ydim=30, zdim=30).run(verbose=True)
    D_holo = DensityAnalysis(water_holo, gridcenter=gridcenter, xdim=30, ydim=30, zdim=30).run(verbose=True)

    # work with the density instances
    density_apo = D_apo.results.density
    density_holo = D_holo.results.density

    # convert to units in which water at standard conditions is 1
    # (see Density.convert_density() docs for more choices)
    density_apo.convert_density("water")
    density_holo.convert_density("water")

    # compare densities
    delta_density = density_holo - density_apo

    print("max difference", delta_density.grid.max())
    print("min difference", delta_density.grid.min())

    # export to DX file for visualization
    delta_density.export("delta_holo_apo.dx")
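
To see where in space the largest changes occur, the index of the extreme voxel can be mapped back to coordinates through the bin edges. The sketch below uses a small made-up array and edge list in place of delta_density.grid and delta_density.edges; voxel_center is a hypothetical helper, not an MDAnalysis function.

```python
import numpy as np

# Made-up difference grid and bin edges standing in for
# delta_density.grid and delta_density.edges.
grid = np.zeros((3, 2, 1))
grid[2, 1, 0] = 0.8    # water gained in the holo simulation
grid[0, 0, 0] = -0.5   # water lost in the holo simulation
edges = [np.array([-2.5, -0.5, 1.5, 3.5]), np.array([0.0, 1.0, 2.0]),
         np.array([2.5, 4.5])]


def voxel_center(grid, edges, flat_index):
    """Map a flat array index to its (ix, iy, iz) index and voxel center."""
    idx = tuple(int(i) for i in np.unravel_index(flat_index, grid.shape))
    center = tuple(float(0.5 * (e[i] + e[i + 1])) for e, i in zip(edges, idx))
    return idx, center


idx_max, center_max = voxel_center(grid, edges, grid.argmax())
idx_min, center_min = voxel_center(grid, edges, grid.argmin())

print("largest gain at voxel", idx_max, "centered at", center_max)
print("largest loss at voxel", idx_min, "centered at", center_min)
```

The resulting coordinates are in the same frame as the aligned trajectory, so they can be compared directly with atom positions of the protein or ligand.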

More help?

If you have more questions, please have a look at how to participate in the MDAnalysis community: we have a Discord server and mailing lists where people (users and developers) are happy to help and discuss.

Answered By: orbeckst