User Guide


md_harmonize was created to harmonize compounds and reactions across public metabolic databases. It is a command-line tool that directly supports harmonization of KEGG, MetaCyc, and HMDB, and it provides an API for extending support to other metabolic databases.


The md_harmonize package runs under Python 3.7+. Use pip to install.
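To confirm that the interpreter meets the stated Python 3.7+ requirement before installing, a check along these lines may help. This is a minimal sketch; the names `MINIMUM_PYTHON` and `meets_minimum` are hypothetical helpers and not part of md_harmonize.

```python
import sys
from typing import Sequence, Tuple

# Minimum version stated in this guide (Python 3.7+).
MINIMUM_PYTHON: Tuple[int, int] = (3, 7)


def meets_minimum(version: Sequence[int],
                  minimum: Tuple[int, int] = MINIMUM_PYTHON) -> bool:
    """Return True if the (major, minor) part of `version` is at least `minimum`."""
    return tuple(version[:2]) >= tuple(minimum)


def check_interpreter() -> None:
    """Print whether the running interpreter satisfies the requirement."""
    major, minor = sys.version_info[:2]
    if meets_minimum(sys.version_info):
        print(f"Python {major}.{minor} is supported.")
    else:
        print(f"Python {major}.{minor} is older than the required 3.7.")
```

Calling `check_interpreter()` in the target environment reports the result for the interpreter that `python3 -m pip` would use.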

Install on Linux, Mac OS X

python3 -m pip install md-harmonize

Get the source code

Code is available on GitHub:

You can clone the public repository:



md_harmonize requires the following Python libraries:

  • docopt for creating the command-line interface.

  • jsonpickle for saving Python objects in a JSON serializable form and outputting to a file.

  • numpy and cython for speeding up the optimization.

  • ctfile for parsing compound molfile representation.

  • indigo for detecting aromatic atoms in the compound.

  • pebble for multiprocessing of the cythonized calculations.
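After installation, the dependencies listed above can be checked for importability. The following is a minimal sketch using only the standard library; the import names in `REQUIRED_MODULES` are assumptions derived from the library names in the list (in particular `Cython` and `indigo` may be exposed under different module names depending on the installed distribution), and `missing_modules` is a hypothetical helper.

```python
import importlib.util
from typing import Iterable, List

# Assumed top-level import names for the libraries listed in this guide.
# Adjust them if your installed distributions expose different module names.
REQUIRED_MODULES = [
    "docopt",
    "jsonpickle",
    "numpy",
    "Cython",
    "ctfile",
    "indigo",
    "pebble",
]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def report() -> None:
    """Print a short summary of which required modules are missing."""
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Running `report()` only looks the modules up; it does not import them, so it has no side effects on the environment.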


The raw data from KEGG and MetaCyc databases can be accessed from this URL:

Basic usage

md_harmonize provides functions for harmonizing compounds and reactions across public metabolic databases. Usage details are given in The md_harmonize Tutorial.

   md_harmonize -h | --help
   md_harmonize --version
   md_harmonize download <database_names> <working_directory>
   md_harmonize standardize <database_names> <working_directory>
   md_harmonize aromatize <database_names> <working_directory> <save_file> [--pickle] [--aromatic_manager=<aromatic_manager_file>]
   md_harmonize initialize_compound <database_names> <working_directory> <aromatic_manager_file> [--parse_kegg_atom] [--pickle] [--split=k]
   md_harmonize initialize_reaction <database_names> <working_directory> [--pickle]
   md_harmonize harmonize_compound <database_names> <working_directory> [--pickle]
   md_harmonize harmonize_reaction <database_names> <working_directory> [--pickle]

   -h, --help                                   Show this screen.
   --version                                    Show version.
   --aromatic_manager=<aromatic_manager_file>   Use a pre-constructed aromatic manager file.
   --pickle                                     Use pickle to save the results; otherwise use jsonpickle.
   --parse_kegg_atom                            Parse KEGG atom mappings between compounds.
   --split=k                                    Split compounds to speed up construction.
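The subcommands above can be chained from a script. The sketch below builds one argument list per subcommand and optionally runs them with `subprocess`. It rests on several assumptions that this guide does not confirm: that the steps are run in the order the usage text lists them, that the `<save_file>` written by `aromatize` is the file later passed as `<aromatic_manager_file>` to `initialize_compound`, and that `<database_names>` can be passed through as a single string in whatever format the tool expects. `build_commands` and `run_pipeline` are hypothetical helpers, not part of md_harmonize.

```python
import shlex
import subprocess
from typing import List


def build_commands(database_names: str, working_directory: str,
                   aromatic_manager_file: str,
                   use_pickle: bool = False) -> List[List[str]]:
    """Return one argv list per subcommand, in the order of the usage text.

    Assumption: the file saved by `aromatize` is reused as the aromatic
    manager file for `initialize_compound`.
    """
    common = [database_names, working_directory]
    pickle_flag = ["--pickle"] if use_pickle else []
    return [
        ["md_harmonize", "download", *common],
        ["md_harmonize", "standardize", *common],
        ["md_harmonize", "aromatize", *common, aromatic_manager_file, *pickle_flag],
        ["md_harmonize", "initialize_compound", *common, aromatic_manager_file, *pickle_flag],
        ["md_harmonize", "initialize_reaction", *common, *pickle_flag],
        ["md_harmonize", "harmonize_compound", *common, *pickle_flag],
        ["md_harmonize", "harmonize_reaction", *common, *pickle_flag],
    ]


def run_pipeline(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; when dry_run is False, also execute it.

    check=True stops the pipeline at the first failing step.
    """
    for argv in commands:
        print(" ".join(shlex.quote(arg) for arg in argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

For example, `run_pipeline(build_commands("KEGG", "work", "work/aromatic_manager.json"))` prints the seven commands without executing them; the argument values here are placeholders. Passing `dry_run=False` executes each step and requires md_harmonize to be installed and on the `PATH`.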