Development

Basic instructions

The default cclib files distributed with a release, as described in How to install, do not include the unit tests or the logfiles needed to run them. This section covers how to download the full source along with all test data and scripts, and how to use these for development and testing.

Cloning cclib from GitHub

cclib is hosted by the fantastic people at GitHub (previously at Sourceforge) in a git repository. You can download a zipped archive of the current development version (called master) for installation and testing or browse the available releases. In order to contribute any changes, however, you will need to create a local copy of the repository:

git clone https://github.com/cclib/cclib.git

Guidelines

We follow a typical GitHub collaborative model, relying on forks and pull requests: changes are developed in a fork and contributed through pull requests.

Here are some general guidelines for developers who are contributing code:

  • All contributions should be reviewed by at least one core developer

  • Contributions from a core developer need to be reviewed by another core developer

  • Run and review the unit tests (see below) before submitting a pull request.

  • Your changes should normally not increase the number of failing tests.

  • For larger changes or features that take some time to implement, using branches is recommended.

Releasing a new version

The release cycle of cclib is irregular, with new versions being created as deemed necessary after significant changes or new features. We roughly follow semantic versioning with respect to the parsed attributes.

When creating a new release on GitHub, the typical procedure might include the following steps:

  • Update the CHANGELOG, ANNOUNCE and any other files that might change content with the new version

  • Make sure that setup.py has the right version number, as well as __version__ in __init__.py and any other relevant files

  • Update the download and install instructions in the documentation, if appropriate

  • Create a branch for the release, so that development can continue

  • Run all tests for a final time and fix any remaining issues

  • Tag the release (make sure to create an annotated tag using git tag -a) and upload it (git push --tags)

  • Run manifest.py to update the MANIFEST file

  • Create the source distributions (python setup.py sdist --formats=gztar,zip) and Windows binary installers (python setup.py bdist_wininst)

  • Create a release on GitHub using the created tag (see Creating releases) and upload the source distributions and Windows binaries

  • Email the users and developers mailing list with the message in ANNOUNCE

  • Update the Python package index, normally done by python setup.py register

  • For significant releases, if appropriate, send an email to the CCL list and any mailing lists for computational chemistry packages supported by cclib
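The version check mentioned in the steps above lends itself to a small script. The following is a minimal sketch, not part of cclib: the helper names extract_version and versions_agree are hypothetical, the version number is made up, and the file contents are inlined as strings where a real check would read setup.py and __init__.py from disk.

```python
import re

# Hypothetical pattern: matches `version="..."` or `__version__ = "..."`.
VERSION_PATTERN = re.compile(r"""(?:__version__|version)\s*=\s*["']([^"']+)["']""")


def extract_version(text):
    """Return the first version string assigned in the text, or None."""
    match = VERSION_PATTERN.search(text)
    return match.group(1) if match else None


def versions_agree(*texts):
    """Return True if every text declares the same, non-missing version."""
    found = {extract_version(text) for text in texts}
    return len(found) == 1 and None not in found


# Illustrative contents only; the version number here is invented.
setup_py = 'setup(name="cclib", version="1.2.3")'
init_py = '__version__ = "1.2.3"'

print(versions_agree(setup_py, init_py))
```

Running such a check before tagging would catch a version number that was updated in one file but not the other.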

Testing

The test directory, which is not included in the default download, contains the test scripts that keep cclib reliable and keep the developers sane. Every new commit or pull request to cclib on GitHub triggers these tests, which are run with GitHub Actions.

The input files for tests, which are logfiles from computational chemistry programs, are located in the data directory. These are a central part of cclib, and any progress should always be supported by corresponding tests. When a user opens an issue or reports a bug, it is prudent to write a test that reproduces the bug in addition to fixing it. This ensures the bug will remain fixed in the future. Likewise, extending the coverage of data attributes to more programs should proceed in parallel with the growth of unit tests.

Unit tests

Unit tests check that the parsers work correctly for typical calculation types on small molecules, usually water or 1,4-divinylbenzene (dvb) with \(C_{\mathrm{2h}}\) symmetry. The corresponding logfiles stored in folders like data/NWChem/basicNWChem6.0 are intended to test logfiles for an approximate major version of a program, and are standardized for all supported programs to the extent possible. They are located alongside the code in the repository, but are not normally distributed with the source. Attributes are considered supported only if they are checked by at least one test, and the table of attribute coverage is generated automatically using this criterion.

The job types currently included as unit tests:

  • restricted and unrestricted single point energies for dvb (RHF/STO-3G and B3LYP/STO-3G)

  • geometry optimization and scan for dvb (RHF/STO-3G and/or B3LYP/STO-3G)

  • frequency calculation with IR intensities and Raman activities for dvb (RHF/STO-3G or B3LYP/STO-3G)

  • single point energy for carbon atom using a large basis set such as aug-cc-pCVQZ

  • Møller–Plesset and coupled cluster energies for water (STO-3G basis set)

  • static polarizabilities for tryptophan (RHF/STO-3G)
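To illustrate the kind of check a unit test makes, here is a minimal sketch for a dvb single point calculation. It is an assumption-laden illustration and is not taken from the cclib test suite: the class name is hypothetical, and the parsed data is replaced by a stand-in object so the example runs without cclib or any logfile. The attribute names natom and atomnos follow cclib's parsed attributes, and dvb (C10H10) has 20 atoms.

```python
import unittest
from types import SimpleNamespace

# Stand-in for parsed data. In a real test this object would come from
# parsing a logfile stored in a folder like data/NWChem/basicNWChem6.0.
DVB_DATA = SimpleNamespace(
    natom=20,
    atomnos=[6] * 10 + [1] * 10,  # ten carbons and ten hydrogens
)


class DvbSinglePointTest(unittest.TestCase):
    """Hypothetical test case checking basic attributes for dvb."""

    data = DVB_DATA

    def test_natom(self):
        self.assertEqual(self.data.natom, 20)

    def test_atomnos_matches_natom(self):
        self.assertEqual(len(self.data.atomnos), self.data.natom)

    def test_composition(self):
        self.assertEqual(self.data.atomnos.count(6), 10)
        self.assertEqual(self.data.atomnos.count(1), 10)


# Run the test case without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DvbSinglePointTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the molecule and basis set are standardized across programs, the same assertions can in principle be reused for every supported program.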

Adding a new program version

There are a few conventions when adding a new supported program version to the unit tests:

  • Two different recent versions are typically used in the unit tests. If there already are two, move the older version(s) to the regression suite (see below).

  • When adding files for the new version, first copy the corresponding files for the last version already in cclib. Afterwards, check in files from the new program version as changes to the copied files. This procedure makes it easy to look at the differences introduced with the new version in git clients.
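The copy step of this convention can be sketched as follows. This is only an illustration under stated assumptions: the helper seed_new_version is hypothetical, the new folder name basicNWChem6.5 and the file name dvb_sp.out are invented for the example, and everything runs inside a temporary directory instead of a real checkout.

```python
import shutil
import tempfile
from pathlib import Path


def seed_new_version(program_dir, last_version, new_version):
    """Copy the last version's folder so new logfiles can replace the copies."""
    source = Path(program_dir) / last_version
    target = Path(program_dir) / new_version
    shutil.copytree(source, target)  # raises FileExistsError if target exists
    return sorted(path.name for path in target.iterdir())


# Demonstration in a throwaway directory; all names below are illustrative.
with tempfile.TemporaryDirectory() as tmp:
    old_dir = Path(tmp) / "NWChem" / "basicNWChem6.0"
    old_dir.mkdir(parents=True)
    (old_dir / "dvb_sp.out").write_text("logfile from the last version\n")

    copied = seed_new_version(
        Path(tmp) / "NWChem", "basicNWChem6.0", "basicNWChem6.5"
    )
    print(copied)
```

Committing the copies first and then overwriting them with logfiles from the new program version makes the second commit show only what changed between versions.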

Regression tests

Regression tests ensure that bugs, once fixed, stay fixed. These are real-life files that at some point broke a cclib parser and are stored in folders like data/regression/Jaguar/Jaguar6.4. The files associated with regression tests are not stored together with the source code as they are often quite large. A separate repository on GitHub, cclib-data, is used to track these files, and we do not distribute them with any releases.

For every bug found in the parsers, there should be a corresponding regression test that checks that the bug stays fixed. The process is automated by regression.py, which runs through all of our test data (both the basic data and the regression files), opens each file, tries to parse it, and runs any relevant regression tests defined for that file. New regression tests are added by creating a function testMyFileName_out according to the examples at the start of regression.py.

Using both the unit and regression tests, the line-by-line test coverage shows which parts of cclib are touched by at least one test. When adding new features and tests, the GitHub Actions testing script can be run locally to generate the HTML coverage pages and ensure that the tests exercise the feature code.

Developers

Besides input from a number of people listed in the repository, the following are core developers (in alphabetical order):