Development

Basic instructions

The default cclib files distributed with a release, as described in the tutorial, do not include the unit tests or the logfiles needed to run them. This section covers how to download the full source along with all test data and scripts, and how to use these for development and testing.

Cloning cclib from GitHub

cclib is hosted by the fantastic people at GitHub (previously at SourceForge) in a git repository. You can download a zipped archive of the current development version (called master) for installation and testing, or browse the available releases. To contribute any changes, however, you will need to create a local copy of the repository:

git clone https://github.com/cclib/cclib.git cclib

Guidelines

We follow a typical GitHub collaborative model, relying on forks and pull requests. In short, the development process consists of:

Here are some general guidelines for developers who are contributing code:

  • Run and review the unit tests (see below) before submitting a pull request
  • Your changes should normally not increase the number of failing tests
  • For larger changes or features that take some time to implement, using branches is recommended

Releasing a new version

The release cycle of cclib is irregular, with new versions being created as deemed necessary after significant changes or new features. We roughly follow semantic versioning with respect to the parsed attributes.

When creating a new release on GitHub, the typical procedure might include the following steps:

  • Update the CHANGELOG, ANNOUNCE and any other files that might change content with the new version
  • Make sure that setup.py has the right version number, as well as __version__ in __init__.py and any other relevant files
  • Update the download and install instructions in the documentation, if appropriate
  • Create a branch for the release, so that development can continue
  • Run all tests for a final time and fix any remaining issues
  • Tag the release (make sure to create an annotated tag using git tag -a) and upload it (git push --tags)
  • Run manifest.py to update the MANIFEST file
  • Create the source distributions (python setup.py sdist --formats=gztar,zip) and Windows binary installers (python setup.py bdist_wininst)
  • Create a release on GitHub using the created tag (see Creating releases) and upload the source distributions and Windows binaries
  • Email the users and developers mailing list with the message in ANNOUNCE
  • Update the Python package index (https://pypi.python.org/pypi/cclib), normally done by python setup.py register
  • For significant releases, if appropriate, send an email to the CCL list and any mailing lists for computational chemistry packages supported by cclib
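
The version check in the steps above can be partly automated. The following is a minimal sketch, assuming the version is written as version="..." in setup.py and as __version__ = "..." in __init__.py. Both patterns and the helper names find_version and versions_agree are illustrative assumptions, not part of cclib.

```python
import re

# Hypothetical patterns; adjust them to how the version is actually written.
SETUP_PATTERN = re.compile(r"""version\s*=\s*["']([^"']+)["']""")
INIT_PATTERN = re.compile(r"""__version__\s*=\s*["']([^"']+)["']""")


def find_version(text, pattern):
    """Return the first version string matched by pattern, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None


def versions_agree(setup_text, init_text):
    """True only when both texts declare the same, non-missing version."""
    setup_version = find_version(setup_text, SETUP_PATTERN)
    init_version = find_version(init_text, INIT_PATTERN)
    return setup_version is not None and setup_version == init_version


if __name__ == "__main__":
    # Inline stand-ins for the real file contents, so the sketch runs anywhere.
    setup_text = 'setup(name="cclib", version="1.5")'
    init_text = '__version__ = "1.5"'
    print(versions_agree(setup_text, init_text))
```

In practice the two strings would be read from the real files before tagging, and a mismatch would stop the release.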

Testing

The test directory, which is not included in the default download, contains the test scripts that keep cclib reliable, and keep the developers sane. With any new commit or pull request to cclib on GitHub the tests are triggered and run with Travis CI, for both the current production version 1.5 (travis_prod) as well as master (travis_master).

The input files for tests, which are logfiles from computational chemistry programs, are located in the data directory. These are a central part of cclib, and any progress should always be supported by corresponding tests. When a user opens an issue or reports a bug, it is prudent to write a test that reproduces the bug as well as fixing it. This ensures it will remain fixed in the future. Likewise, extending the coverage of data attributes to more programs should proceed in parallel with the growth of unit tests.

Unit tests

Unit tests check that the parsers work correctly for typical calculation types on small molecules, usually water or 1,4-divinylbenzene (dvb) with C2h symmetry. The corresponding logfiles stored in folders like data/NWChem/basicNWChem6.0 are intended to test logfiles for an approximate major version of a program, and are standardized for all supported programs to the extent possible. They are located alongside the code in the repository, but are not normally distributed with the source. Two different recent versions are often used in the unit tests, with older versions being moved to the regression suite (see below). Attributes are considered supported only if they are checked by at least one test, and the table of attribute coverage is generated automatically using this criterion.

The job types currently included as unit tests:

  • restricted and unrestricted single point energies for dvb (RHF/STO-3G and B3LYP/STO-3G)
  • geometry optimization and scan for dvb (RHF/STO-3G and/or B3LYP/STO-3G)
  • frequency calculation with IR and Raman intensities for dvb (RHF/STO-3G or B3LYP/STO-3G)
  • single point energy for carbon atom using a large basis set such as aug-cc-pCVQZ
  • Møller–Plesset and coupled cluster energies for water (STO-3G or 6-31G basis set)
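
A unit test for one of these job types usually comes down to assertions on parsed attributes. The sketch below shows the general shape using the standard unittest module and the cclib attribute names natom and atomnos. To stay runnable without any test data, it uses a stand-in object instead of a parsed logfile. The class name, the helper make_fake_dvb_data and the overall layout are hypothetical and are not taken from the cclib test directory.

```python
import unittest
from types import SimpleNamespace


def make_fake_dvb_data():
    """Stand-in for parsed data; a real test would parse a logfile instead."""
    # 1,4-divinylbenzene is C10H10: ten carbons (Z=6) and ten hydrogens (Z=1).
    return SimpleNamespace(natom=20, atomnos=[6] * 10 + [1] * 10)


class DvbSinglePointSketch(unittest.TestCase):
    """Hypothetical test case, not the layout used in the cclib tests."""

    def setUp(self):
        self.data = make_fake_dvb_data()

    def test_natom(self):
        # dvb has 20 atoms in total.
        self.assertEqual(self.data.natom, 20)

    def test_atomnos_matches_formula(self):
        # One atomic number per atom, and only carbon and hydrogen present.
        self.assertEqual(len(self.data.atomnos), self.data.natom)
        self.assertEqual(sorted(set(self.data.atomnos)), [1, 6])
```

Saved to a file, this can be run with python -m unittest followed by the file name.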

Regression tests

Regression tests ensure that bugs, once fixed, stay fixed. These are real-life files that at some point broke a cclib parser, and are stored in folders like data/regression/Jaguar/Jaguar6.4. The files associated with regression tests are not stored together with the source code, as they are often quite large. A separate repository on GitHub, cclib-data, is used to track these files, and we do not distribute them with any releases.

For every bug found in the parsers, there should be a corresponding regression test that verifies the bug stays fixed. The process is automated by run_regressions.py, which runs through all of our test data (both the basic data and the regression files), opens each file, tries to parse it, and runs any relevant regression tests defined for that file. New regression tests are added by creating a function testMyFileName_out according to the examples at the start of run_regressions.py.

Doctests

Doctests are a useful Python feature for unit testing individual functions. To run the doctests in a particular file, you need to run the script. For example, python gaussianparser.py runs the doctests contained in gaussianparser.py. To run all of the doctests at once, you need to install a testing tool such as nose, and then use the following command (note that many errors may be due to missing libraries like biopython):

$ nosetests cclib --with-doctest -e test* -v
ERROR
ERROR
Doctest: cclib.bridge.cclib2openbabel.makeopenbabel ... ok
ERROR
ERROR
Doctest: cclib.parser.adfparser.ADF.normalisesym ... ok
Doctest: cclib.parser.gamessparser.GAMESS.normalise_aonames ... ok
Doctest: cclib.parser.gamessparser.GAMESS.normalisesym ... ok
Doctest: cclib.parser.gamessukparser.GAMESSUK.normalisesym ... ok
Doctest: cclib.parser.gaussianparser.Gaussian.normalisesym ... ok
Doctest: cclib.parser.jaguarparser.Jaguar.normalisesym ... ok
Doctest: cclib.parser.logfileparser.Logfile.float ... ok
Doctest: cclib.parser.utils.PeriodicTable ... ok
Doctest: cclib.parser.utils.convertor ... ok
ERROR
ERROR
......
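
For readers unfamiliar with doctests, the sketch below shows how one is embedded in a docstring and run when the file is executed as a script. The function normalise_label is a hypothetical helper written for this illustration and is not part of cclib.

```python
def normalise_label(label):
    """Return a symmetry label with only its first letter capitalised.

    Hypothetical helper for illustration; it is not a cclib function.

    >>> normalise_label("AG")
    'Ag'
    >>> normalise_label("bu")
    'Bu'
    """
    return label[:1].upper() + label[1:].lower()


if __name__ == "__main__":
    import doctest

    # Runs every docstring example in this module; silent when all pass.
    doctest.testmod()
```

Running the file with python prints nothing when every example passes, and adding -v to the command line lists each example as it is checked.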

Developers

Besides input from a number of people listed in the repository, the following developers have contributed code to cclib (in alphabetical order):