1. gridData – Handling grids of data

1.1. Overview

This module contains classes that allow importing and exporting of simple gridded data. A grid is an N-dimensional array that represents a discrete mesh over a region of space. The array axes are taken to be parallel to the Cartesian axes of this space. Together with this array we also store the edges, which are (essentially) the Cartesian coordinates of the intersections of the grid (mesh) lines on the axes. In this way the grid is anchored in space.

The Grid object can be resampled at arbitrary resolution (by interpolating the data). Standard algebraic operations are defined for grids on a point-wise basis (same as for numpy.ndarray).

1.2. Description

The package reads grid data from files, makes them available as a Grid object, and allows one to write out the data again.

A Grid consists of a rectangular, regular, N-dimensional array of data. It contains

  1. The position of the array cell edges.
  2. The array data itself.

This is equivalent to knowing

  1. The origin of the coordinate system (i.e. which data cell corresponds to (0,0,...,0)).
  2. The spacing of the grid in each dimension.
  3. The data on a grid.
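The equivalence between the two descriptions can be sketched with plain numpy. This is only an illustration, not the package's own code: the variable names are made up for the example, and it assumes that the origin refers to the centre of cell (0,0,...,0) and that the grid is regular, so a single spacing per dimension suffices.

```python
import numpy as np

# Edges of a regular 2-D grid: 4 cells along x, 2 cells along y.
edges = [np.linspace(0.0, 2.0, 5), np.linspace(-1.0, 1.0, 3)]

# Spacing in each dimension (regular grid: the first difference is enough).
delta = np.array([e[1] - e[0] for e in edges])

# Assumption: the origin is the centre of cell (0, 0, ..., 0).
origin = np.array([e[0] for e in edges]) + 0.5 * delta

# Number of cells per dimension is one less than the number of edges.
shape = tuple(len(e) - 1 for e in edges)

# Going back: rebuild the edges from origin, delta and the array shape.
rebuilt = [o - 0.5 * d + d * np.arange(n + 1)
           for o, d, n in zip(origin, delta, shape)]

print(delta)    # [0.5 1. ]
print(origin)   # [ 0.25 -0.5 ]
```

Either set of numbers (edges, or origin plus spacing plus array shape) pins the array to the same region of space.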

Grid objects have some convenient properties:

  • The data is represented as a numpy.ndarray, so the usual numpy array operations (indexing, slicing, reductions) are available on it.
  • They can be manipulated arithmetically, e.g. one can simply add or subtract two of them and get another one, or multiply by a constant. Note that all operations are defined point-wise (see the numpy documentation for details) and that only grids defined on the same cell edges can be combined.
  • A Grid object can also be created from within Python code, e.g. from the output of the numpy.histogramdd() function.
  • The representation of the data is abstracted from the format that the files are saved in. This makes it straightforward to add additional readers for new formats.
  • The data can be written out again in formats that are understood by other programs such as VMD or PyMOL.
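The point-wise arithmetic and the "same cell edges" requirement from the list above can be illustrated on bare numpy arrays. This is a sketch under stated assumptions: same_edges() is a hypothetical helper written for this example, not a function of the package, and the arrays stand in for the data held by two Grid objects.

```python
import numpy as np

rng = np.random.default_rng(0)

# Both histograms are built on the same explicit bin edges.
bins = [np.linspace(-3.0, 3.0, 7)] * 3

H_a, edges_a = np.histogramdd(rng.normal(size=(200, 3)), bins=bins)
H_b, edges_b = np.histogramdd(rng.normal(size=(200, 3)), bins=bins)


def same_edges(e1, e2):
    """Hypothetical helper: True if two edge lists describe the same mesh."""
    return len(e1) == len(e2) and all(
        x.shape == y.shape and np.allclose(x, y) for x, y in zip(e1, e2)
    )


# Only data defined on identical edges can be combined cell by cell.
if not same_edges(edges_a, edges_b):
    raise ValueError("grids are defined on different cell edges")

diff = H_b - H_a      # point-wise difference, same shape as the inputs
scaled = 2.0 * H_a    # multiplication by a constant
```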

1.3. Reading grid data files

Some formats can be read directly from a file on disk:

g = Grid(filename)

filename could be, for instance, “density.dx”.

1.4. Constructing a Grid

Data from an n-dimensional array can be packaged as a Grid for convenient handling (especially export to other formats). The Grid class acts as a universal constructor:

g = Grid(ndarray, edges=edges)                 # from histogramdd
g = Grid(ndarray, origin=origin, delta=delta)  # from arbitrary data

g.export(filename, format)   # export to the desired format

See the docstring of Grid for details.

1.5. Formats

The following formats are available (see Supported file formats):

OpenDX
    IBM’s Data Explorer, http://www.opendx.org/
gOpenMol
    http://www.csc.fi/gopenmol/
pickle
    Python pickle file (pickle)

2. Examples

In most cases, only one class is important, the Grid, so we just load this right away:

from gridData import Grid

2.1. Loading data

From an OpenDX file:

g = Grid("density.dx")

From a gOpenMol PLT file:

g = Grid("density.plt")

From the output of numpy.histogramdd():

import numpy as np
# 100 random points in 3D, histogrammed into 5 x 8 x 4 bins
r = np.random.randn(100, 3)
H, edges = np.histogramdd(r, bins=(5, 8, 4))
g = Grid(H, edges=edges)

For other ways to load data, see the docs for Grid.
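For the histogram case it can help to look at what numpy.histogramdd() actually returns before it is handed to Grid. The following numpy-only sketch (the seeded random generator is an arbitrary choice for reproducibility) shows that each dimension has one more edge than it has bins:

```python
import numpy as np

rng = np.random.default_rng(42)
r = rng.standard_normal((100, 3))           # 100 points in 3D

H, edges = np.histogramdd(r, bins=(5, 8, 4))

print(H.shape)                  # (5, 8, 4): one count per cell
print([len(e) for e in edges])  # [6, 9, 5]: n bins need n + 1 edges
print(H.sum())                  # 100.0: every point falls into some cell
```

The edges array of each dimension runs from the smallest to the largest coordinate in the data, which is what anchors the histogram in space.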

2.2. Subtracting two densities

Assuming one has two densities that were generated on the same grid positions, stored in files A.dx and B.dx, one first reads the data into two Grid objects:

A = Grid('A.dx')
B = Grid('B.dx')

Subtract A from B:

C = B - A

and write out as a dx file:

C.export('C.dx')

The resulting file C.dx can be visualized with any OpenDX-capable viewer, or read in again later.

2.3. Resampling

Load data:

A = Grid('A.dx')

Interpolate with a cubic spline to twice the sample density:

A2 = A.resample_factor(2)

Downsample to half of the bins in each dimension:

Ahalf = A.resample_factor(0.5)
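What a resampling factor means for the mesh can be sketched without the package. The helper below is hypothetical and only models the geometry: it assumes that the outermost edges stay fixed and that the number of cells in each dimension is multiplied by the factor. The interpolation of the data values onto the new mesh is not shown.

```python
import numpy as np


def rescaled_edges(edges, factor):
    """Hypothetical helper: edges of a mesh with `factor` times as many cells."""
    new_edges = []
    for e in edges:
        n_cells = len(e) - 1
        new_cells = int(round(n_cells * factor))
        # Keep the first and last edge, redistribute the rest evenly.
        new_edges.append(np.linspace(e[0], e[-1], new_cells + 1))
    return new_edges


edges = [np.linspace(0.0, 4.0, 5), np.linspace(0.0, 1.0, 9)]   # 4 x 8 cells

fine = rescaled_edges(edges, 2)      # 8 x 16 cells
coarse = rescaled_edges(edges, 0.5)  # 2 x 4 cells

print([len(e) - 1 for e in fine])    # [8, 16]
print([len(e) - 1 for e in coarse])  # [2, 4]
```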

Resample to the grid of another density, B:

B = Grid('B.dx')
A_on_B = A.resample(B.edges)

or, even simpler:

A_on_B = A.resample(B)

Note

The cubic spline can generate regions with values that did not occur in the original data; in particular, if the original data’s lowest value was 0, the spline interpolation will probably produce some values < 0 near regions where the density changes abruptly.
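The overshoot is a general property of cubic interpolation and can be reproduced in one dimension. The sketch below does not use the package's spline; it swaps in Catmull-Rom cubic interpolation, evaluated at interval midpoints, as a simple stand-in, and the function name is made up for the example.

```python
import numpy as np


def catmull_rom_midpoints(y):
    """Catmull-Rom cubic interpolation at the midpoint of each interior interval."""
    y = np.asarray(y, dtype=float)
    # Four consecutive samples around each interior interval [p1, p2].
    p0, p1, p2, p3 = y[:-3], y[1:-2], y[2:-1], y[3:]
    return (-p0 + 9.0 * p1 + 9.0 * p2 - p3) / 16.0


# A density that jumps abruptly from 0 to 1.
density = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

mid = catmull_rom_midpoints(density)
print(mid)   # [-0.0625  0.5     1.0625]
```

The interpolated values just before and after the jump fall below 0 and above 1, although the original data never leaves the range [0, 1].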