pycdf - Python interface to CDF files

This package provides a Python interface to the Common Data Format (CDF) library used for many NASA missions, available at http://cdf.gsfc.nasa.gov/. It is targeted at Python 2.6+ and should work without change on either Python 2 or Python 3.

The interface is intended to be ‘pythonic’ rather than reproducing the C interface. To open or close a CDF and access its variables, see the CDF class. Accessing data within the variables is via the Var class. The lib object provides access to some routines that affect the functionality of the library in general. The const module contains constants useful for accessing the underlying library.

The CDF C library must be properly installed in order to use this package. The CDF distribution provides scripts meant to be called in a user’s login scripts, definitions.B for bash and definitions.C for C-shell derivatives. (See the installation instructions which come with the CDF library.) These will set environment variables specifying the location of the library; pycdf will respect these variables if they are set. Otherwise it will search the standard system library path and the default installation locations for the CDF library.

If pycdf has trouble finding the library, try setting CDF_LIB before importing the module, e.g. if the library is in CDF/lib in the user’s home directory:

>>> import os
>>> os.environ["CDF_LIB"] = os.path.expanduser("~/CDF/lib")
>>> from spacepy import pycdf

If this works, make the environment setting permanent. Note that on OSX, using plists to set the environment may not carry over to Python terminal sessions; use .cshrc or .bashrc instead.
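
The same step can be wrapped in a small helper. This is a minimal sketch, not part of pycdf: the name point_to_cdf_lib is hypothetical, and the default directory is only the example location used above. It assigns to os.environ, which updates both the process environment and Python's own view of it, and it expands the leading ~ explicitly because Python does not do so on its own.

```python
import os


def point_to_cdf_lib(directory="~/CDF/lib"):
    """Set CDF_LIB to an absolute path and return it (hypothetical helper)."""
    path = os.path.abspath(os.path.expanduser(directory))
    os.environ["CDF_LIB"] = path
    return path


cdf_lib = point_to_cdf_lib()
status = "exists" if os.path.isdir(cdf_lib) else "does not exist"
print("CDF_LIB=%s (%s)" % (cdf_lib, status))

# Import pycdf only after the variable is set:
# from spacepy import pycdf
```

Checking that the directory exists before importing can help avoid the restart described in the note below.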

Note

If the CDF library cannot be found, pycdf will be left in a “half-imported” state. You will need to restart your Python interpreter before trying the fix above.

Authors: Jon Niehof

Institution: Los Alamos National Laboratory

Contact: jniehof@lanl.gov

Copyright 2010-2013 Los Alamos National Security, LLC.

Quickstart

Create a CDF

This example presents the entire sequence of creating a CDF and populating it with some data; the parts are explained individually below.

>>> from spacepy import pycdf
>>> import datetime
>>> time = [datetime.datetime(2000, 10, 1, 1, val) for val in range(60)]
>>> import numpy as np
>>> data = np.random.random_sample(len(time))
>>> cdf = pycdf.CDF('MyCDF.cdf', '')
>>> cdf['Epoch'] = time
>>> cdf['data'] = data
>>> cdf.attrs['Author'] = 'John Doe'
>>> cdf.attrs['CreateDate'] = datetime.datetime.now()
>>> cdf['data'].attrs['units'] = 'MeV'
>>> cdf.close()

Import the pycdf module.

>>> from spacepy import pycdf

Make a data set of datetime objects. These will be converted to the CDF_EPOCH type.

>>> import datetime
>>> # make a dataset every minute for an hour
>>> time = [datetime.datetime(2000, 10, 1, 1, val) for val in range(60)]

Warning

If you create a CDF in backwards compatibility mode (default), then datetime objects are degraded to CDF_EPOCH (millisecond resolution), not CDF_EPOCH16 (microsecond resolution).
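
To see what millisecond resolution means for a timestamp, the sketch below drops the sub-millisecond digits of a datetime using only the standard library. It does not call pycdf. Whether the CDF library truncates or rounds is not stated here, so truncation is an assumption, and the helper name truncate_to_milliseconds is hypothetical.

```python
import datetime


def truncate_to_milliseconds(dt):
    """Drop sub-millisecond digits from a datetime (illustrative only)."""
    return dt.replace(microsecond=(dt.microsecond // 1000) * 1000)


original = datetime.datetime(2000, 10, 1, 1, 0, 0, 123456)
stored = truncate_to_milliseconds(original)

# The last three microsecond digits are lost: 123456 becomes 123000.
print("%d -> %d" % (original.microsecond, stored.microsecond))
```

Timestamps that differ only below the millisecond level would become indistinguishable after such a conversion.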

Create some random data.

>>> import numpy as np
>>> data = np.random.random_sample(len(time))

Create a new empty CDF. The empty string, ‘’, is the name of the CDF to use as a master; given an empty string, an empty CDF will be created, rather than copying from a master CDF. If a master is used, data in the master will be copied to the new CDF.

>>> cdf = pycdf.CDF('MyCDF.cdf', '')

Note

You cannot create a new CDF with a name that already exists on disk; attempting to do so will throw a NameError.
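
Since creation fails when the target file already exists, a script that regenerates its output may want to clear the path first. The following is a minimal sketch using only the standard library. The helper remove_if_exists is hypothetical, and deleting the old file is an assumption about what the script wants; renaming or aborting are equally valid choices.

```python
import os
import tempfile


def remove_if_exists(path):
    """Delete path if it is an existing file; return True if a file was removed."""
    if os.path.isfile(path):
        os.remove(path)
        return True
    return False


# Demonstrate on a scratch file standing in for a stale CDF.
scratch_dir = tempfile.mkdtemp()
target = os.path.join(scratch_dir, "MyCDF.cdf")
open(target, "w").close()

removed_first = remove_if_exists(target)   # a file was present and is deleted
removed_again = remove_if_exists(target)   # nothing left to delete
os.rmdir(scratch_dir)

# In real use, the call would precede creation:
# remove_if_exists('MyCDF.cdf')
# cdf = pycdf.CDF('MyCDF.cdf', '')
```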

To put data into a CDF, assign it directly to an element of the CDF. CDF objects behave like Python dictionaries.

>>> # put time into CDF variable Epoch
>>> cdf['Epoch'] = time
>>> # and the same with data (the smallest data type that fits the data is used by default)
>>> cdf['data'] = data

Adding attributes is done similarly. CDF attributes are also treated as dictionaries.

>>> # add some attributes to the CDF and the data
>>> cdf.attrs['Author'] = 'John Doe'
>>> cdf.attrs['CreateDate'] = datetime.datetime.now()
>>> cdf['data'].attrs['units'] = 'MeV'

Closing the CDF ensures the new data are written to disk:

>>> cdf.close()

CDF files, like standard Python files, act as context managers:

>>> with pycdf.CDF('filename.cdf', '') as cdf_file:
...     pass  # do brilliant things with cdf_file
>>> # cdf_file is automatically closed here
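
The practical benefit of a context manager is that the file is closed even if the body raises an exception. The sketch below illustrates this with a hypothetical stand-in class, StandInFile, so that it runs without the CDF library. It assumes that a CDF object's exit handler calls close(), as the text above implies.

```python
class StandInFile(object):
    """Hypothetical stand-in with the same close/with shape as a CDF object."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions from the body


handle = StandInFile()
try:
    with handle as f:
        raise ValueError("something went wrong mid-write")
except ValueError:
    pass

# close() ran on the way out of the with block, despite the exception.
print(handle.closed)
```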

Read a CDF

Reading a CDF is very similar: the CDF object behaves like a dictionary. The file is only accessed when data are requested. A full example using the above CDF:

>>> from spacepy import pycdf
>>> cdf = pycdf.CDF('MyCDF.cdf')
>>> print(cdf)
    Epoch: CDF_EPOCH [60]
    data: CDF_FLOAT [60]
>>> cdf['data'][4]
    0.8609974384307861
>>> data = cdf['data'][...] # don't forget the [...]
>>> cdf_dat = cdf.copy()
>>> cdf_dat.keys()
    ['Epoch', 'data']
>>> cdf.close()

Again, import the pycdf module:

>>> from spacepy import pycdf

Then open the CDF. This looks the same as creation, but without mention of a master CDF.

>>> cdf = pycdf.CDF('MyCDF.cdf')

The default __str__() and __repr__() behavior explains the contents, type, and size but not the data.

>>> print(cdf)
    Epoch: CDF_EPOCH [60]
    data: CDF_FLOAT [60]

To access the data one has to request specific elements of the variable, similar to a Python list.

>>> cdf['data'][4]
    0.8609974384307861
>>> data = cdf['data'][...] # don't forget the [...]

CDF.copy() will return the entire contents of a CDF, including attributes, as a SpaceData object:

>>> cdf_dat = cdf.copy()

Since CDF objects behave like dictionaries, they have a keys() method, and iterating over a CDF yields the names in keys():

>>> cdf_dat.keys()
    ['Epoch', 'data']

Close the CDF when finished:

>>> cdf.close()

Modify a CDF

An example modifying the CDF created above:

>>> from spacepy import pycdf
>>> cdf = pycdf.CDF('MyCDF.cdf')
>>> cdf.readonly(False)
    False
>>> cdf['newVar'] = [1.0, 2.0]
>>> print(cdf)
    Epoch: CDF_EPOCH [60]
    data: CDF_FLOAT [60]
    newVar: CDF_FLOAT [2]
>>> cdf.close()

As before, each step in this example will now be individually explained. Existing CDF files are opened in read-only mode and must be set to read-write before modification:

>>> cdf.readonly(False)
    False

Then new variables can be added:

>>> cdf['newVar'] = [1.0, 2.0]

Or the contents of existing variables can be changed:

>>> cdf['data'][0] = 8675309

The new variables appear immediately:

>>> print(cdf)
    Epoch: CDF_EPOCH [60]
    data: CDF_FLOAT [60]
    newVar: CDF_FLOAT [2]

Closing the CDF ensures changes are written to disk:

>>> cdf.close()

Non record-varying

Non record-varying (NRV) variables are usually used for data that do not vary with time, such as the energy channels for an instrument.

NRV variables need to be created with CDF.new(), specifying the keyword ‘recVary’ as False.

>>> from spacepy import pycdf
>>> cdf = pycdf.CDF('MyCDF2.cdf', '')
>>> cdf.new('data2', [1], recVary=False)
    <Var:
    CDF_BYTE [1] NRV
    >
>>> cdf['data2'][...]
    [1]

Slicing and indexing

Subsets of data in a variable can be easily referenced with Python’s slicing and indexing notation.

This example uses bisect to read a subset of the data from the one-hour, one-minute-cadence data file created in earlier examples.

>>> import bisect
>>> import datetime
>>> from spacepy import pycdf
>>> cdf = pycdf.CDF('MyCDF.cdf')
>>> start = datetime.datetime(2000, 10, 1, 1, 9)
>>> stop = datetime.datetime(2000, 10, 1, 1, 35)
>>> start_ind = bisect.bisect_left(cdf['Epoch'], start)
>>> stop_ind = bisect.bisect_left(cdf['Epoch'], stop)
>>> # then grab the data we want
>>> time = cdf['Epoch'][start_ind:stop_ind]
>>> data = cdf['data'][start_ind:stop_ind]
>>> cdf.close()

The Var documentation has several additional examples.
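
The index arithmetic in the bisect example can be checked without a CDF file. The sketch below applies the same calls to a plain list of datetimes built the same way as the Epoch variable in the Quickstart. It shows that using bisect_left for both ends gives a half-open range that excludes the stop time. The bisect_right variant for an inclusive range is an addition for illustration, not something the example above uses.

```python
import bisect
import datetime

# One timestamp per minute for an hour, as in the Quickstart.
time = [datetime.datetime(2000, 10, 1, 1, minute) for minute in range(60)]
start = datetime.datetime(2000, 10, 1, 1, 9)
stop = datetime.datetime(2000, 10, 1, 1, 35)

start_ind = bisect.bisect_left(time, start)
stop_ind = bisect.bisect_left(time, stop)
subset = time[start_ind:stop_ind]  # 01:09 through 01:34; stop is excluded

# bisect_right places the index after any element equal to stop,
# so the slice then includes 01:35.
stop_ind_inclusive = bisect.bisect_right(time, stop)
subset_inclusive = time[start_ind:stop_ind_inclusive]

print("%d %d" % (len(subset), len(subset_inclusive)))  # 26 27
```

Both searches assume the timestamps are sorted in increasing order, which bisect requires.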

Access to CDF constants and the C library

Constants defined in cdf.h and occasionally useful in accessing CDFs are available in the const module.

The underlying C library is represented by the lib variable.

Class reference

CDF(pathname[, masterpath]) Python object representing a CDF file.
Var(cdf_file, var_name, *args) A CDF variable.
gAttrList(cdf_file[, special_entry]) Object representing all the gAttributes in a CDF.
zAttrList(zvar) Object representing all the zAttributes in a zVariable.
zAttr(*args, **kwargs) zAttribute for zVariables within a CDF.
gAttr(*args, **kwargs) Global Attribute for a CDF
AttrList(cdf_file[, special_entry]) Object representing a list of attributes.
Attr(cdf_file, attr_name[, create]) An attribute, g or z, for a CDF
Library() Abstraction of the base CDF C library and its state.
CDFCopy(cdf) A dictionary-like copy of all data and attributes in a CDF
VarCopy A list-like copy of the data and attributes in a Var
CDFError(status) Raised for an error in the CDF library.
CDFException(status) Base class for errors or warnings in the CDF library.
CDFWarning(status) Used for a warning in the CDF library.
EpochError Used for errors in epoch routines