API

Note: Many KEGG entry IDs contain colons and kegg_pull saves KEGG entry files with their ID in the file name. When running on Windows, all file names with colons will have their colons replaced with underscores.
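
As a rough illustration of the renaming described in the note above, the sketch below replaces colons with underscores. The helper name windows_safe_name is hypothetical and is not part of kegg_pull; the library's own logic may differ in detail.

```python
def windows_safe_name(entry_id: str) -> str:
    """Return the entry ID with every colon replaced by an underscore."""
    return entry_id.replace(":", "_")


print(windows_safe_name("cpd:C00001"))  # cpd_C00001
```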

This package has the following modules:

pull

entry_ids

map

pathway_organizer

rest

kegg_url

Pulling, Parsing, and Saving KEGG Entries

Provides API functionality for pulling KEGG entries from the KEGG REST API, parsing them, and saving the entries as files.

class kegg_pull.pull.PullResult[source]

The collections of entry IDs, each of which resulted in a particular KEGG Response status after a pull.

property successful_entry_ids: tuple[str, ...]

The IDs of entries successfully pulled.

property failed_entry_ids: tuple[str, ...]

The IDs of entries that failed to be pulled.

property timed_out_entry_ids: tuple[str, ...]

The IDs of entries that timed out before being pulled.
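
The three properties can be combined into a quick report after a pull. The sketch below assumes only the property names documented above; summarize is a hypothetical helper, and the SimpleNamespace object is a stand-in used so that the example runs without contacting KEGG.

```python
from types import SimpleNamespace


def summarize(pull_result) -> str:
    """Build a one-line summary from the three PullResult properties."""
    counts = {
        "successful": len(pull_result.successful_entry_ids),
        "failed": len(pull_result.failed_entry_ids),
        "timed out": len(pull_result.timed_out_entry_ids),
    }
    return ", ".join(f"{n} {label}" for label, n in counts.items())


# Stand-in object for illustration only; a real PullResult comes from a pull.
example = SimpleNamespace(
    successful_entry_ids=("cpd:C00001", "cpd:C00002"),
    failed_entry_ids=("cpd:C99999",),
    timed_out_entry_ids=(),
)
print(summarize(example))  # 2 successful, 1 failed, 0 timed out
```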

class kegg_pull.pull.SinglePull(kegg_rest: KEGGrest | None = None, multiprocess_lock_save: bool = False)[source]

Class capable of performing a single request to the KEGG REST API for pulling up to a maximum number of entries.

Parameters:
  • kegg_rest (KEGGrest | None) – Optional KEGGrest object used to make the requests to the KEGG REST API (a KEGGrest object with the default settings is created if one is not provided).

  • multiprocess_lock_save (bool) – Whether to block the code that saves KEGG entries in order to be multiprocess safe. Should not be needed unless pulling across multiple processes.

pull(entry_ids: list[str], output: str, entry_field: str | None = None) PullResult[source]

Makes a single request to the KEGG REST API to pull one or more entries and save them as files.

Parameters:
  • entry_ids (list[str]) – The IDs of the entries to pull and save.

  • output (str) – Path to the location where entries are saved. Treated as a ZIP file if the path ends in “.zip”; otherwise treated as a directory, which is created if it doesn’t already exist.

  • entry_field (str | None) – An optional field of the entries to pull.

Returns:

The pull result.

Return type:

PullResult

pull_dict(entry_ids: list[str], entry_field: str | None = None) tuple[PullResult, dict[str, str | bytes]][source]

Rather than saving the KEGG entries to the file system, stores them in-memory as a mapping from the ID to the corresponding entry.

Parameters:
  • entry_ids (list[str]) – The IDs of the entries to pull and include in the mapping.

  • entry_field (str | None) – An optional field of the entries to pull.

Returns:

The pull result and the mapping from entry IDs to KEGG entries as strings (or bytes if the entries are in binary format).

Return type:

tuple[PullResult, dict[str, str | bytes]]

class kegg_pull.pull.AbstractMultiplePull(single_pull: SinglePull, unsuccessful_threshold: float | None = None)[source]

Abstract class that makes multiple requests to the KEGG REST API to pull and save an arbitrary number of entries.

Parameters:
  • single_pull (SinglePull) – The SinglePull object used for each pull.

  • unsuccessful_threshold (float | None) – If set, the ratio of unsuccessful entry IDs to total entry IDs at which execution stops. Details of the aborted process are logged.

ABORTED_PULL_RESULTS_PATH = 'aborted-pull-results.json'

pull(entry_ids: list[str], output: str, entry_field: str | None = None, force_single_entry: bool = False) PullResult[source]

Makes multiple requests to the KEGG REST API for an arbitrary number of entry IDs.

Parameters:
  • entry_ids (list[str]) – The IDs that are split into multiple pulls, the entries of which are saved to the file system.

  • output (str) – Path to the location where entries are saved. Treated as a ZIP file if the path ends in “.zip”; otherwise treated as a directory, which is created if it doesn’t already exist.

  • entry_field (str | None) – An optional field of the entries to pull.

  • force_single_entry (bool) – Whether to pull only one entry at a time regardless of the entry field specified. Recommended if there are Brite entry IDs.

Returns:

The pull result.

Return type:

PullResult
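
The idea of splitting entry IDs across multiple pulls can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: chunk_entry_ids is a hypothetical helper, the group size of 10 is taken from the MAX_ENTRY_IDS_PER_URL value documented for GetKEGGurl, and the sketch ignores the effect that entry_field may have on group size.

```python
MAX_ENTRY_IDS_PER_URL = 10  # value documented for GetKEGGurl


def chunk_entry_ids(entry_ids: list, force_single_entry: bool = False) -> list:
    """Split entry IDs into groups small enough for one request each."""
    size = 1 if force_single_entry else MAX_ENTRY_IDS_PER_URL
    return [entry_ids[i:i + size] for i in range(0, len(entry_ids), size)]


ids = [f"cpd:C{n:05d}" for n in range(1, 24)]  # 23 made-up IDs
print([len(chunk) for chunk in chunk_entry_ids(ids)])  # [10, 10, 3]
```

With force_single_entry=True, the same 23 IDs would produce 23 groups of one ID each.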

pull_dict(entry_ids: list[str], entry_field: str | None = None, force_single_entry: bool = False) tuple[PullResult, dict[str, str | bytes]][source]

Rather than saving the KEGG entries to the file system, stores them in-memory as a mapping from the ID to the corresponding entry.

Parameters:
  • entry_ids (list[str]) – The IDs that are split into multiple pulls, the entries of which are stored in the mapping.

  • entry_field (str | None) – An optional field of the entries to pull.

  • force_single_entry (bool) – Whether to pull only one entry at a time regardless of the entry field specified. Recommended if there are Brite entry IDs.

Returns:

The pull result and the mapping from entry IDs to KEGG entries as strings (or bytes if the entries are in binary format).

Return type:

tuple[PullResult, dict[str, str | bytes]]

class kegg_pull.pull.SingleProcessMultiplePull(kegg_rest: KEGGrest | None = None, unsuccessful_threshold: float | None = None)[source]

Class that makes multiple requests to the KEGG REST API to pull entries within a single process.

Parameters:
  • kegg_rest (KEGGrest | None) – Optional KEGGrest object used to make the requests to the KEGG REST API (a KEGGrest object with the default settings is created if one is not provided).

  • unsuccessful_threshold (float | None) – If set, the ratio of unsuccessful entry IDs to total entry IDs at which execution stops. Details of the aborted process are logged.
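
To make the meaning of unsuccessful_threshold concrete, the check below compares the ratio of unsuccessful IDs to total IDs against the threshold. The helper should_abort is hypothetical, and the choice of ">=" rather than ">" is an assumption; the documentation above only says execution stops "at" the ratio.

```python
from typing import Optional


def should_abort(n_unsuccessful: int, n_total: int,
                 unsuccessful_threshold: Optional[float]) -> bool:
    """Return True once the unsuccessful ratio reaches the threshold."""
    if unsuccessful_threshold is None or n_total == 0:
        return False  # no threshold set, or nothing to compare against
    return n_unsuccessful / n_total >= unsuccessful_threshold


print(should_abort(5, 10, 0.5))   # True
print(should_abort(4, 10, 0.5))   # False
print(should_abort(10, 10, None)) # False
```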

class kegg_pull.pull.MultiProcessMultiplePull(kegg_rest: KEGGrest | None = None, unsuccessful_threshold: float | None = None, n_workers: int | None = None)[source]

Class that makes multiple requests to the KEGG REST API to pull entries within multiple processes.

Parameters:
  • kegg_rest (KEGGrest | None) – Optional KEGGrest object used to make the requests to the KEGG REST API (a KEGGrest object with the default settings is created if one is not provided).

  • unsuccessful_threshold (float | None) – If set, the ratio of unsuccessful entry IDs to total entry IDs at which execution stops. Details of the aborted process are logged.

  • n_workers (int | None) – The number of processes to use. If None, defaults to the number of cores available.

Pulling Lists of KEGG Entry IDs

Provides API functionality for pulling lists of KEGG entry IDs from the KEGG REST API.

kegg_pull.entry_ids.from_database(database: str, kegg_rest: KEGGrest | None = None) list[str][source]

Pulls the KEGG entry IDs of a given database.

Parameters:
  • database (str) – The KEGG database to pull the entry IDs from. If equal to “brite”, the “br:” prefix is prepended to each entry ID so that the IDs succeed when later used with the KEGG “get” operation (e.g. in the “pull” API module or CLI subcommand).

  • kegg_rest (KEGGrest | None) – The KEGGrest object to request the entry IDs. If None, one is created with the default parameters.

Returns:

The list of resulting entry IDs.

Raises:

RuntimeError – Raised if the request to the KEGG REST API fails or times out.

Return type:

list[str]

kegg_pull.entry_ids.from_file(file_path: str) list[str][source]

Loads KEGG entry IDs that are listed in a file with one entry ID on each line.

Parameters:

file_path (str) – The path to the file containing the entry IDs.

Returns:

The list of entry IDs.

Raises:

ValueError – Raised if the file is empty.

Return type:

list[str]
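
The expected file layout is one entry ID per line. The sketch below reads such a file and raises ValueError when no IDs are found, mirroring the behavior described above. read_entry_ids is a hypothetical stand-in, and skipping blank lines is an assumption rather than documented behavior.

```python
import os
import tempfile


def read_entry_ids(file_path: str) -> list:
    """Read one entry ID per line; raise ValueError if nothing is found."""
    with open(file_path, encoding="utf-8") as file:
        entry_ids = [line.strip() for line in file if line.strip()]
    if not entry_ids:
        raise ValueError(f"No entry IDs found in {file_path}")
    return entry_ids


# Write a small example file to a temporary directory, then read it back.
with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "entry-ids.txt")
    with open(path, "w", encoding="utf-8") as file:
        file.write("cpd:C00001\ncpd:C00002\n")
    loaded = read_entry_ids(path)

print(loaded)  # ['cpd:C00001', 'cpd:C00002']
```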

kegg_pull.entry_ids.from_keywords(database: str, keywords: list[str], kegg_rest: KEGGrest | None = None) list[str][source]

Pulls entry IDs from a KEGG database based on keywords searched in the entries.

Parameters:
  • database (str) – The name of the database to pull entry IDs from.

  • keywords (list[str]) – The keywords to search entries in the database with.

  • kegg_rest (KEGGrest | None) – The KEGGrest object to request the entry IDs. If None, one is created with the default parameters.

Returns:

The list of entry IDs.

Raises:

RuntimeError – Raised if the request to the KEGG REST API fails or times out.

Return type:

list[str]

kegg_pull.entry_ids.from_molecular_attribute(database: str, formula: str | None = None, exact_mass: float | tuple[float, float] | None = None, molecular_weight: int | tuple[int, int] | None = None, kegg_rest: KEGGrest | None = None) list[str][source]

Pulls entry IDs from a KEGG database containing chemical entries based on one (and only one) of three molecular attributes of the entries.

Parameters:
  • database (str) – The name of the database containing chemical entries.

  • formula (str | None) – The chemical formula to search for.

  • exact_mass (float | tuple[float, float] | None) – The exact mass of the compound to search for (a single value or a range).

  • molecular_weight (int | tuple[int, int] | None) – The molecular weight of the compound to search for (a single value or a range).

  • kegg_rest (KEGGrest | None) – The KEGGrest object to request the entry IDs. If None, one is created with the default parameters.

Returns:

The list of entry IDs.

Raises:

RuntimeError – Raised if the request to the KEGG REST API fails or times out.

Return type:

list[str]

Converts the output of the KEGG “link” operation (of the form that maps the entry IDs of one database to the entry IDs of another) into a dictionary along with other helpful optional functionality.

Parameters:
  • source_database (str) – The name of the database with entry IDs mapped to the target database.

  • target_database (str) – The name of the database with entry IDs mapped from the source database.

  • deduplicate (bool) – Some mappings including “pathway” entry IDs result in half beginning with the normal “path:map” prefix but the other half with a different prefix. If True, removes the IDs corresponding to entries that are identical but with a different prefix.

  • add_glycans (bool) – Whether to add the corresponding compound IDs of equivalent glycan entries. Logs a warning if neither the source nor the target database is “compound”.

  • add_drugs (bool) – Whether to add the corresponding compound IDs of equivalent drug entries. Logs a warning if neither the source nor the target database is “compound”.

  • kegg_rest (KEGGrest | None) – The KEGGrest object to perform the “link” operation. If None, one is created with the default parameters.

Returns:

The dictionary.

Raises:
  • RuntimeError – Raised if the request to the KEGG REST API fails or times out.

  • ValueError – Raised if deduplicate is True but neither source_database nor target_database is “pathway”.

Return type:

dict[str, set[str]]

kegg_pull.map.database_conv(kegg_database: str, outside_database: str, reverse: bool = False, kegg_rest: KEGGrest | None = None) dict[str, set[str]][source]

Converts the output of the KEGG “conv” operation (of the form that maps the entry IDs of one database to the entry IDs of another) into a dictionary.

Parameters:
  • kegg_database (str) – The name of the KEGG database with entry IDs mapped to the outside database.

  • outside_database (str) – The name of the outside database with entry IDs mapped from the KEGG database.

  • reverse (bool) – Reverses the mapping with the target becoming the source and the source becoming the target. Equivalent to calling the reverse() function of this module.

  • kegg_rest (KEGGrest | None) – The KEGGrest object to perform the “conv” operation. If None, one is created with the default parameters.

Returns:

The dictionary.

Raises:

RuntimeError – Raised if the request to the KEGG REST API fails or times out.

Return type:

dict[str, set[str]]

Converts the output of the KEGG “link” operation (of the form that maps specific provided entry IDs to the IDs of a target database) to a dictionary.

Parameters:
  • entry_ids (list[str]) – The IDs of the entries to map to entries in the target database.

  • target_database (str) – The name of the database with entry IDs mapped to from the provided entry IDs.

  • reverse (bool) – Reverses the mapping with the target becoming the source and the source becoming the target. Equivalent to calling the reverse() function of this module.

  • kegg_rest (KEGGrest | None) – The KEGGrest object to perform the “link” operation. If None, one is created with the default parameters.

Returns:

The dictionary.

Raises:

RuntimeError – Raised if the request to the KEGG REST API fails or times out.

Return type:

dict[str, set[str]]

kegg_pull.map.entries_conv(entry_ids: list[str], target_database: str, reverse: bool = False, kegg_rest: KEGGrest | None = None) dict[str, set[str]][source]

Converts the output of the KEGG “conv” operation (of the form that maps specific provided entry IDs to the IDs of a target database) to a dictionary.

Parameters:
  • entry_ids (list[str]) – The IDs of the entries to map to entries in the target database.

  • target_database (str) – The name of the database with entry IDs mapped to from the provided entry IDs.

  • reverse (bool) – Reverses the mapping with the target becoming the source and the source becoming the target. Equivalent to calling the reverse() function of this module.

  • kegg_rest (KEGGrest | None) – The KEGGrest object to perform the “conv” operation. If None, one is created with the default parameters.

Returns:

The dictionary.

Raises:

RuntimeError – Raised if the request to the KEGG REST API fails or times out.

Return type:

dict[str, set[str]]

Creates a dictionary that maps the entry IDs of a source database to those of a target database using an intermediate database (“link” operation) e.g. ko-to-compound where the intermediate is reaction (connecting cross-references of ko-to-reaction and reaction-to-compound).

Parameters:
  • source_database (str) – The name of the database with entry IDs to map to the target database.

  • intermediate_database (str) – The name of the database with which two mappings are made i.e. source-to-intermediate and intermediate-to-target, both of which are merged to create source-to-target.

  • target_database (str) – The name of the database with entry IDs to which those of the source database are mapped.

  • deduplicate (bool) – Some mappings including “pathway” entry IDs result in half beginning with the normal “path:map” prefix but the other half with a different prefix. If True, removes the IDs corresponding to entries that are identical but with a different prefix.

  • add_glycans (bool) – Whether to add the corresponding compound IDs of equivalent glycan entries. Logs a warning if neither the source nor the target database is “compound”.

  • add_drugs (bool) – Whether to add the corresponding compound IDs of equivalent drug entries. Logs a warning if neither the source nor the target database is “compound”.

  • kegg_rest (KEGGrest | None) – The KEGGrest object to perform the “link” operations. If None, one is created with the default parameters.

Returns:

The dictionary.

Raises:
  • RuntimeError – Raised if the request to the KEGG REST API fails or times out.

  • ValueError – Raised if deduplicate is True but neither source_database nor target_database is “pathway”.

Return type:

dict[str, set[str]]

kegg_pull.map.combine_mappings(mapping1: dict[str, set[str]], mapping2: dict[str, set[str]]) dict[str, set[str]][source]

Combines two mappings together. If a key in mapping 2 is already in mapping 1, their values are merged in the combined mapping e.g. X -> {A,B} and X -> {B,C} becomes X -> {A,B,C}.

Parameters:
  • mapping1 (dict[str, set[str]]) – The first mapping to combine.

  • mapping2 (dict[str, set[str]]) – The second mapping to combine.

Returns:

The combined mapping.

Return type:

dict[str, set[str]]
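
The merge semantics described above can be sketched in plain Python. The function combine below is a hypothetical re-statement of the documented behavior, not the library's code.

```python
def combine(mapping1: dict, mapping2: dict) -> dict:
    """Merge two ID mappings, taking the union of values for shared keys."""
    combined = {key: set(values) for key, values in mapping1.items()}
    for key, values in mapping2.items():
        combined.setdefault(key, set()).update(values)
    return combined


combined = combine({"X": {"A", "B"}}, {"X": {"B", "C"}, "Y": {"D"}})
print(combined == {"X": {"A", "B", "C"}, "Y": {"D"}})  # True
```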

kegg_pull.map.reverse(mapping: dict[str, set[str]]) dict[str, set[str]][source]

Reverses the dictionary (mapping entry IDs of one database to IDs of related entries) turning keys into values and values into keys.

Parameters:

mapping (dict[str, set[str]]) – The dictionary (of entry IDs (strings) to sets of entry IDs) to reverse.

Returns:

The reversed mapping.

Return type:

dict[str, set[str]]
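
Because each key maps to a set of values, reversing means every value becomes a key that collects all of the original keys pointing to it. A minimal sketch of that idea, using the hypothetical name reverse_mapping:

```python
def reverse_mapping(mapping: dict) -> dict:
    """Turn key -> {values} into value -> {keys}."""
    reversed_mapping = {}
    for key, values in mapping.items():
        for value in values:
            reversed_mapping.setdefault(value, set()).add(key)
    return reversed_mapping


result = reverse_mapping({"X": {"A", "B"}, "Y": {"B"}})
print(result == {"A": {"X"}, "B": {"X", "Y"}})  # True
```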

kegg_pull.map.to_json_string(mapping: dict[str, set[str]]) str[source]

Converts a mapping of entry IDs (dictionary created with this map module) to a JSON string.

Parameters:

mapping (dict[str, set[str]]) – The dictionary to convert.

Returns:

The JSON string.

Raises:

ValidationError – Raised if the mapping does not follow the correct JSON schema. Should follow the correct schema if the dictionary was created with this map module.

Return type:

str
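
Python sets are not directly JSON serializable, so any conversion has to represent each set as a JSON array. The sketch below shows one way to do that and to reverse it. The helper names are hypothetical, sorting the values is an assumption made for stable output, and the schema validation mentioned above is not reproduced.

```python
import json


def mapping_to_json(mapping: dict) -> str:
    """Serialize key -> set as key -> sorted list."""
    return json.dumps({key: sorted(values) for key, values in mapping.items()})


def mapping_from_json(text: str) -> dict:
    """Rebuild key -> set from the JSON produced above."""
    return {key: set(values) for key, values in json.loads(text).items()}


original = {"X": {"B", "A"}}
text = mapping_to_json(original)
print(text)  # {"X": ["A", "B"]}
print(mapping_from_json(text) == original)  # True
```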

kegg_pull.map.save_to_json(mapping: dict[str, set[str]], file_path: str) None[source]

Saves a mapping of entry IDs (dictionary created with this map module) to a JSON file, either in a regular directory or ZIP archive.

Parameters:
  • mapping (dict[str, set[str]]) – The mapping to save.

  • file_path (str) – The path to the JSON file. If in a ZIP archive, the file path must be in the following format: /path/to/zip-archive.zip:/path/to/file (e.g. ./archive.zip:mapping.json).

Raises:

ValidationError – Raised if the mapping does not follow the correct JSON schema. Should follow the correct schema if the dictionary was created with this map module.

Return type:

None

kegg_pull.map.load_from_json(file_path: str) dict[str, set[str]][source]

Loads a mapping of entry IDs (dictionary created with this map module) from a JSON file, either in a regular directory or ZIP archive.

Parameters:

file_path (str) – The path to the JSON file. If in a ZIP archive, the file path must be in the following format: /path/to/zip-archive.zip:/path/to/file (e.g. ./archive.zip:mapping.json).

Returns:

The mapping.

Raises:

ValidationError – Raised if the mapping does not follow the correct JSON schema. Should follow the correct schema if the dictionary was created with this map module.

Return type:

dict[str, set[str]]
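
The /path/to/zip-archive.zip:/path/to/file format combines an archive path and a path inside the archive. The sketch below shows one way such a path could be interpreted using the standard zipfile module. split_zip_path and read_json are hypothetical helpers, and splitting on “.zip:” is an assumption about how the format is parsed.

```python
import json
import os
import tempfile
import zipfile


def split_zip_path(file_path: str):
    """Split 'archive.zip:inner/file' into (archive path, inner path)."""
    marker = ".zip:"
    if marker in file_path:
        archive, inner = file_path.split(marker, 1)
        return archive + ".zip", inner
    return None, file_path  # a regular file, not inside an archive


def read_json(file_path: str):
    """Read JSON from a regular file or from a file inside a ZIP archive."""
    archive, inner = split_zip_path(file_path)
    if archive is None:
        with open(inner, encoding="utf-8") as file:
            return json.load(file)
    with zipfile.ZipFile(archive) as zip_file:
        return json.loads(zip_file.read(inner))


# Build a small archive in a temporary directory, then read from it.
with tempfile.TemporaryDirectory() as directory:
    archive_path = os.path.join(directory, "archive.zip")
    with zipfile.ZipFile(archive_path, "w") as zip_file:
        zip_file.writestr("mapping.json", json.dumps({"X": ["A", "B"]}))
    loaded = read_json(archive_path + ":mapping.json")

print(loaded)  # {'X': ['A', 'B']}
```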

Flattening A Pathways Brite Hierarchy

Provides API functionality for flattening a pathways Brite hierarchy (ID: ‘br:br08901’) into a collection of its nodes, mapping a node ID to information about it, enabling combinations with other KEGG data.

class kegg_pull.pathway_organizer.HierarchyNode[source]

A dictionary with the following keys:

name: str

The name of the node obtained directly from the Brite hierarchy.

level: int

The level that the node appears in the hierarchy.

parent: str | None

The key (not the name) of the parent node (None if top level node).

children: list[str] | None

The keys (not the names) of the node’s children (None if leaf node).

entry_id: str | None

The entry ID of the node (None if the node does not correspond to a KEGG entry).
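
To show how these keys fit together, the sketch below declares an equivalent TypedDict and walks from a node up to the top level via the parent key. The class name Node, the helper path_to_root, and the node keys "k1" to "k3" are made up for illustration; real keys come from the Brite hierarchy.

```python
from typing import Dict, List, Optional, TypedDict


class Node(TypedDict):
    name: str
    level: int
    parent: Optional[str]
    children: Optional[List[str]]
    entry_id: Optional[str]


# Hand-written example nodes with made-up keys.
nodes: Dict[str, Node] = {
    "k1": {"name": "Metabolism", "level": 1, "parent": None,
           "children": ["k2"], "entry_id": None},
    "k2": {"name": "Carbohydrate metabolism", "level": 2, "parent": "k1",
           "children": ["k3"], "entry_id": None},
    "k3": {"name": "Glycolysis / Gluconeogenesis", "level": 3, "parent": "k2",
           "children": None, "entry_id": "path:map00010"},
}


def path_to_root(key: str, hierarchy_nodes: Dict[str, Node]) -> List[str]:
    """Return node names from the top level down to the given node."""
    names = []
    current: Optional[str] = key
    while current is not None:
        names.append(hierarchy_nodes[current]["name"])
        current = hierarchy_nodes[current]["parent"]
    return names[::-1]


print(path_to_root("k3", nodes))
```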

class kegg_pull.pathway_organizer.PathwayOrganizer[source]

Contains methods for managing a mapping of node keys to node information, where the nodes come from a pathways Brite hierarchy. A PathwayOrganizer object must be obtained from either PathwayOrganizer.load_from_kegg or PathwayOrganizer.load_from_json; the __init__ is not meant to be called directly. The __str__ method returns a JSON string of hierarchy_nodes.

Variables:

hierarchy_nodes (dict[str, HierarchyNode]) – The mapping of node keys to node information managed by the PathwayOrganizer.

static load_from_kegg(top_level_nodes: set[str] | None = None, filter_nodes: set[str] | None = None, kegg_rest: KEGGrest | None = None) PathwayOrganizer[source]

Pulls the Brite hierarchy from the KEGG REST API and converts it to the hierarchy_nodes mapping.

Parameters:
  • top_level_nodes (set[str] | None) – Node names in the highest level of the hierarchy to select from. If None, all top level nodes are traversed to create the hierarchy_nodes.

  • filter_nodes (set[str] | None) – Names (not keys) of nodes to exclude from the hierarchy_nodes mapping. Neither these nodes nor any of their children will be included.

  • kegg_rest (KEGGrest | None) – Optional KEGGrest object for obtaining the Brite hierarchy. A new KEGGrest object is created by default.

Returns:

The resulting PathwayOrganizer object.

Return type:

PathwayOrganizer

static load_from_json(file_path: str) PathwayOrganizer[source]

Loads the hierarchy_nodes mapping that was cached in a JSON file using load_from_kegg followed by save_to_json.

Parameters:

file_path (str) – Path to the JSON file. If reading from a ZIP archive, the file path must be in the following format: /path/to/zip-archive.zip:/path/to/file (e.g. ./archive.zip:hierarchy-nodes.json).

Returns:

The resulting PathwayOrganizer object.

Raises:

ValidationError – Raised if the JSON file does not follow the correct JSON schema. Should follow the correct schema if hierarchy_nodes was cached using load_from_kegg followed by save_to_json and without any additional alteration.

Return type:

PathwayOrganizer

save_to_json(file_path: str) None[source]

Saves the hierarchy_nodes mapping to a JSON file to cache it.

Parameters:

file_path (str) – The path to the JSON file to save the hierarchy_nodes mapping. If saving in a ZIP archive, the file path must be in the following format: /path/to/zip-archive.zip:/path/to/file (e.g. ./archive.zip:hierarchy-nodes.json).

Return type:

None

KEGG REST API Operations

Provides wrapper methods for the KEGG REST API including all its operations.

class kegg_pull.rest.KEGGresponse(status: Status, kegg_url: AbstractKEGGurl, text_body: str = None, binary_body: bytes = None)[source]

Class containing details of a response from the KEGG REST API.

Variables:
  • status (Status) – The status of the KEGG response.

  • kegg_url (AbstractKEGGurl) – The URL used in the request to the KEGG REST API that resulted in the KEGG response.

  • text_body (str) – The text version of the response body.

  • binary_body (bytes) – The binary version of the response body.

Parameters:
  • status (Status) – The status of the KEGG response.

  • kegg_url (AbstractKEGGurl) – The URL used in the request to the KEGG REST API that resulted in the KEGG response.

  • text_body (str) – The text version of the response body.

  • binary_body (bytes) – The binary version of the response body.

Raises:

ValueError – Raised if the status is SUCCESS but a response body is not provided.

class Status(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

The status of a KEGG response.

SUCCESS = 1
FAILED = 2
TIMEOUT = 3

class kegg_pull.rest.KEGGrest(n_tries: int | None = 3, time_out: int | None = 60, sleep_time: float | None = 5.0)[source]

Class containing methods for making requests to the KEGG REST API, including all the KEGG REST API operations.

Parameters:
  • n_tries (int | None) – The maximum number of times to attempt a request. A request may succeed on any attempt, or on none of them.

  • time_out (int | None) – The number of seconds to wait for a request until marking it as timed out.

  • sleep_time (float | None) – The number of seconds to wait in between timed out requests or blacklisted requests.
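
The interplay of n_tries and sleep_time can be pictured as a retry loop. The following is a generic sketch under stated assumptions, not the library's implementation: request_with_retries and flaky_fetch are hypothetical, the time_out handling is left to the fetch callable, and the sketch raises TimeoutError where the library instead reports a status on a KEGGresponse.

```python
import time
from typing import Callable


def request_with_retries(fetch: Callable[[], str], n_tries: int = 3,
                         sleep_time: float = 5.0) -> str:
    """Call fetch up to n_tries times, sleeping between timed-out attempts."""
    for attempt in range(1, n_tries + 1):
        try:
            return fetch()
        except TimeoutError:
            if attempt < n_tries:
                time.sleep(sleep_time)
    raise TimeoutError(f"No response after {n_tries} tries")


# Fake fetch that times out twice, then succeeds.
calls = {"count": 0}


def flaky_fetch() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError
    return "ok"


print(request_with_retries(flaky_fetch, n_tries=3, sleep_time=0))  # ok
```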

request(KEGGurl: type[AbstractKEGGurl] = None, kegg_url: AbstractKEGGurl = None, **kwargs) KEGGresponse[source]

General KEGG request function based on a given KEGG URL (either a class that is instantiated or an already instantiated KEGG URL object).

Parameters:
  • KEGGurl (type[AbstractKEGGurl]) – Optional KEGG URL class (extended from AbstractKEGGurl) that’s instantiated with provided keyword arguments.

  • kegg_url (AbstractKEGGurl) – Optional KEGGurl object that’s already instantiated (used if KEGGurl class is not provided).

  • kwargs – The keyword arguments used to instantiate the KEGGurl class, if provided.

Returns:

The KEGG response.

Return type:

KEGGresponse

test(KEGGurl: type[AbstractKEGGurl] | None = None, kegg_url: AbstractKEGGurl | None = None, **kwargs) bool[source]

Tests if a KEGGurl will succeed upon being used in a request to the KEGG REST API.

Parameters:
  • KEGGurl (type[AbstractKEGGurl] | None) – Optional KEGGurl class used to instantiate a KEGGurl object given keyword arguments.

  • kegg_url (AbstractKEGGurl | None) – KEGGurl object that’s already instantiated (used if a KEGGurl class is not provided).

  • kwargs – The keyword arguments used to instantiate the KEGGurl object from the KEGGurl class, if provided.

Returns:

True if the URL would succeed, False if it would fail or time out.

Return type:

bool

list(database: str) KEGGresponse[source]

Executes the “list” KEGG API operation, pulling the entry IDs of the provided database.

Parameters:

database (str) – The database from which to pull entry IDs.

Returns:

The KEGG response.

Return type:

KEGGresponse

get(entry_ids: List[str], entry_field: str | None = None) KEGGresponse[source]

Executes the “get” KEGG API operation, pulling the entries of the provided entry IDs.

Parameters:
  • entry_ids (List[str]) – The IDs of entries to pull.

  • entry_field (str | None) – Optional field to extract from the entries.

Returns:

The KEGG response.

Return type:

KEGGresponse

info(database: str) KEGGresponse[source]

Executes the “info” KEGG API operation, pulling information about a KEGG database.

Parameters:

database (str) – The database to pull information about.

Returns:

The KEGG response.

Return type:

KEGGresponse

keywords_find(database: str, keywords: List[str]) KEGGresponse[source]

Executes the “find” KEGG API operation, finding entry IDs based on keywords to search in entries.

Parameters:
  • database (str) – The name of the database containing entries to search for.

  • keywords (List[str]) – The keywords to search in entries.

Returns:

The KEGG response.

Return type:

KEGGresponse

molecular_find(database: str, formula: str | None = None, exact_mass: float | tuple[float, float] | None = None, molecular_weight: int | tuple[int, int] | None = None) KEGGresponse[source]

Executes the “find” KEGG API operation, finding entry IDs in chemical databases based on one (and only one) choice of three molecular attributes of the entries.

Parameters:
  • database (str) – The name of the chemical database to search for entries in.

  • formula (str | None) – The chemical formula (one of three choices) of chemical entries to search for.

  • exact_mass (float | tuple[float, float] | None) – The exact mass (one of three choices) of chemical entries to search for (single value or range).

  • molecular_weight (int | tuple[int, int] | None) – The molecular weight (one of three choices) of chemical entries to search for (single value or range).

Returns:

The KEGG response.

Return type:

KEGGresponse

database_conv(kegg_database: str, outside_database: str) KEGGresponse[source]

Executes the “conv” KEGG API operation, converting the entry IDs of a KEGG database to those of an outside database.

Parameters:
  • kegg_database (str) – The name of the KEGG database to pull converted entry IDs from.

  • outside_database (str) – The name of the outside database to pull converted entry IDs from.

Returns:

The KEGG response.

Return type:

KEGGresponse

entries_conv(target_database: str, entry_ids: List[str]) KEGGresponse[source]

Executes the “conv” KEGG API operation, converting provided entry IDs from one database to the form of a target database.

Parameters:
  • target_database (str) – The name of the database to get converted entry IDs from.

  • entry_ids (List[str]) – The entry IDs to convert to the form of the target database.

Returns:

The KEGG response.

Return type:

KEGGresponse

Executes the “link” KEGG API operation, showing the IDs of entries in one KEGG database that are connected/related to entries of another KEGG database.

Parameters:
  • target_database (str) – One of the two KEGG databases to pull linked entries from.

  • source_database (str) – The other KEGG database to link entries from the target database.

Returns:

The KEGG response.

Return type:

KEGGresponse

Executes the “link” KEGG API operation, showing the IDs of entries that are connected/related to entries of a provided database.

Parameters:
  • target_database (str) – The KEGG database to find links to the provided entries.

  • entry_ids (List[str]) – The IDs of the entries to link to entries in the target database.

Returns:

The KEGG response.

Return type:

KEGGresponse

ddi(drug_entry_ids: List[str]) KEGGresponse[source]

Executes the “ddi” KEGG API operation, searching for drug-drug interactions. Providing one entry ID reports all known interactions, while providing multiple checks whether any drug pair in the given set is CI (contraindication) or P (precaution). If providing multiple, all entries must belong to the same database.

Parameters:

drug_entry_ids (List[str]) – The IDs of the drug entries within which to search for drug interactions.

Returns:

The KEGG response.

Return type:

KEGGresponse

kegg_pull.rest.request_and_check_error(kegg_rest: KEGGrest | None = None, KEGGurl: type[AbstractKEGGurl] | None = None, kegg_url: AbstractKEGGurl = None, **kwargs) KEGGresponse[source]

Makes a general request to the KEGG REST API using a KEGGrest object. Creates the KEGGrest object if one is not provided. Additionally, raises an exception if the request is not successful, specifying the URL that was unsuccessful.

Parameters:
  • kegg_rest (KEGGrest | None) – The KEGGrest object to perform the request. If None, one is created with the default parameters.

  • KEGGurl (type[AbstractKEGGurl] | None) – Optional KEGG URL class (extended from AbstractKEGGurl) that’s instantiated with provided keyword arguments.

  • kegg_url (AbstractKEGGurl) – Optional KEGGurl object that’s already instantiated (used if KEGGurl class is not provided).

  • kwargs – The keyword arguments used to instantiate the KEGGurl class, if provided.

Returns:

The KEGG response.

Raises:

RuntimeError – Raised if the request fails or times out.

Return type:

KEGGresponse

Constructing URLs for the KEGG REST API

Classes for creating and validating KEGG REST API URLs.

class kegg_pull.kegg_url.AbstractKEGGurl(rest_operation: str, base_url: str = 'https://rest.kegg.jp', **kwargs)[source]

Abstract class which validates and constructs URLs for accessing the KEGG REST API and contains the base data and functionality for all KEGG URL classes.

Variables:

url (str) – The constructed and validated KEGG URL.

Parameters:
  • rest_operation (str) – The KEGG REST API operation in the URL.

  • base_url (str) – The base URL for accessing the KEGG web API.

  • kwargs – The arguments used to construct the REST options after they are validated.

Raises:

ValueError – Raised if the given arguments cannot construct a valid KEGG URL.

class kegg_pull.kegg_url.ListKEGGurl(database: str)[source]

Contains URL construction and validation functionality of the KEGG API list operation.

Parameters:

database (str) – The database option for the KEGG list URL.

Raises:

ValueError – Raised if the provided database is not valid.

class kegg_pull.kegg_url.InfoKEGGurl(database: str)[source]

Contains URL construction and validation functionality of the KEGG API info operation.

Parameters:

database (str) – The database option for the KEGG info URL.

Raises:

ValueError – Raised if the provided database is not valid.

class kegg_pull.kegg_url.GetKEGGurl(entry_ids: list[str], entry_field: str | None = None)[source]

Contains URL construction and validation functionality for the KEGG API get operation.

Variables:
  • MAX_ENTRY_IDS_PER_URL (int) – The maximum number of entry IDs allowed in a single get KEGG URL.

  • entry_ids (list) – The entry IDs of the get KEGG URL.

Parameters:
  • entry_ids (list[str]) – Specifies which entry IDs go in the first option of the URL.

  • entry_field (str | None) – Specifies which entry field goes in the second option.

Raises:

ValueError – Raised if the entry IDs or entry field is not valid.

MAX_ENTRY_IDS_PER_URL = 10

property multiple_entry_ids: bool

Determines whether the get KEGG URL has more than one entry ID.

static only_one_entry(entry_field: str | None) bool[source]

Determines whether a KEGG entry field can only be pulled one entry at a time with the KEGG get API operation.

Parameters:

entry_field (str | None) – The KEGG entry field to check.

Return type:

bool

static is_binary(entry_field: str | None) bool[source]

Determines whether the entry field produces a binary response.

Parameters:

entry_field (str | None) – The KEGG entry field to check.

Return type:

bool
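Because a single get URL holds at most MAX_ENTRY_IDS_PER_URL entry IDs, a caller with a longer list has to split it across several URLs. A minimal sketch of that splitting follows. It assumes the publicly documented KEGG REST form `/get/<entries joined by +>[/<field>]`; the helper name `build_get_urls` is hypothetical and is not part of kegg_pull.

```python
from typing import Optional

MAX_ENTRY_IDS_PER_URL = 10  # the limit documented for GetKEGGurl


def build_get_urls(entry_ids: list, entry_field: Optional[str] = None,
                   base_url: str = "https://rest.kegg.jp") -> list:
    """Hypothetical helper: one get URL per group of at most 10 entry IDs."""
    if not entry_ids:
        raise ValueError("At least one entry ID is required")
    urls = []
    for start in range(0, len(entry_ids), MAX_ENTRY_IDS_PER_URL):
        chunk = entry_ids[start:start + MAX_ENTRY_IDS_PER_URL]
        # Entry IDs form the first option, joined with '+'.
        url = f"{base_url}/get/{'+'.join(chunk)}"
        # The optional entry field forms the second option.
        if entry_field is not None:
            url += f"/{entry_field}"
        urls.append(url)
    return urls


# Twelve IDs are split into one URL with ten IDs and one with two.
example_ids = [f"cpd:C{n:05d}" for n in range(1, 13)]
example_urls = build_get_urls(example_ids)
```

Entry fields that can only be pulled one entry at a time would need a group size of 1 instead; the sketch leaves that case out.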

class kegg_pull.kegg_url.KeywordsFindKEGGurl(database: str, keywords: list[str])[source]

Contains the URL construction and validation functionality for the KEGG API find operation based on the URL form that searches entries by keywords.

Parameters:
  • database (str) – The database name option for the first part of the URL.

  • keywords (list[str]) – The keyword options for the second part of the URL.

Raises:

ValueError – Raised if the database name is invalid or keywords are not provided.

class kegg_pull.kegg_url.MolecularFindKEGGurl(database: str, formula: str | None = None, exact_mass: float | tuple[float, float] | None = None, molecular_weight: int | tuple[int, int] | None = None)[source]

Contains the URL construction and validation functionality for the KEGG API find operation based on the URL form that uses chemical / molecular attributes of compounds.

Parameters:
  • database (str) – The database name option for the first part of the URL.

  • formula (str | None) – The chemical formula option that can go in the second part of the URL.

  • exact_mass (float | tuple[float, float] | None) – The exact molecular mass option that can go in the second part of the URL.

  • molecular_weight (int | tuple[int, int] | None) – The molecular weight option that can go in the second part of the URL.

Raises:

ValueError – Raised if the provided database name or molecular attribute is invalid.
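As an illustration of how the molecular attributes could map onto a find URL, here is a small sketch. It assumes the KEGG REST form `/find/<database>/<query>/<option>` with the options `formula`, `exact_mass`, and `mol_weight`, and assumes a tuple denotes a range written as `low-high`. The function name is hypothetical, and requiring exactly one attribute is a simplification of this sketch, not documented behavior of MolecularFindKEGGurl.

```python
def build_molecular_find_url(database, formula=None, exact_mass=None,
                             molecular_weight=None,
                             base_url="https://rest.kegg.jp"):
    """Hypothetical helper: build a find URL from one molecular attribute."""
    candidates = {
        "formula": formula,
        "exact_mass": exact_mass,
        "mol_weight": molecular_weight,
    }
    chosen = [(option, value) for option, value in candidates.items() if value is not None]
    # Assumption of this sketch: exactly one attribute must be given.
    if len(chosen) != 1:
        raise ValueError("Provide exactly one of formula, exact_mass, molecular_weight")
    option, value = chosen[0]
    if isinstance(value, tuple):
        low, high = value
        if low >= high:
            raise ValueError("The first value of a range must be less than the second")
        query = f"{low}-{high}"  # a range is written as low-high
    else:
        query = str(value)
    return f"{base_url}/find/{database}/{query}/{option}"
```

For example, `build_molecular_find_url("compound", molecular_weight=(300, 310))` gives `https://rest.kegg.jp/find/compound/300-310/mol_weight`.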

class kegg_pull.kegg_url.AbstractConvKEGGurl(**kwargs)[source]

Abstract class containing data shared by the KEGG URL classes that validate and construct URLs for the conv KEGG REST API operation.

Parameters:

kwargs – Arguments for the URL validation and construction.

Raises:

ValueError – Raised if the provided arguments cannot construct a valid conv KEGG URL.

class kegg_pull.kegg_url.DatabaseConvKEGGurl(kegg_database: str, outside_database: str)[source]

Contains the URL construction and validation functionality of the KEGG API conv operation based on the URL form that uses a KEGG database and an outside database.

Parameters:
  • kegg_database (str) – The name of the KEGG database.

  • outside_database (str) – The name of the outside database.

Raises:

ValueError – Raised if the database names are not valid or are not of the same type.
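The "same type" requirement can be pictured with the sketch below: a molecular KEGG database pairs only with a molecular outside database, and a gene database pairs only with a gene-oriented outside database. The database name sets follow the KEGG REST documentation as best recalled and may be incomplete; the function and constant names are hypothetical, and the real validation in DatabaseConvKEGGurl may differ.

```python
# Assumed name sets, taken from the public KEGG REST documentation.
KEGG_MOLECULAR_DATABASES = {"compound", "glycan", "drug"}
OUTSIDE_MOLECULAR_DATABASES = {"pubchem", "chebi"}
OUTSIDE_GENE_DATABASES = {"ncbi-geneid", "ncbi-proteinid", "uniprot"}


def build_database_conv_url(kegg_database, outside_database,
                            base_url="https://rest.kegg.jp"):
    """Hypothetical helper: check the database types match, then build the URL."""
    if kegg_database in KEGG_MOLECULAR_DATABASES:
        allowed = OUTSIDE_MOLECULAR_DATABASES
    else:
        # Simplification: any other name is treated as an organism code,
        # that is, a gene database.
        allowed = OUTSIDE_GENE_DATABASES
    if outside_database not in allowed:
        raise ValueError(
            f"'{outside_database}' is not a valid outside database for '{kegg_database}'"
        )
    return f"{base_url}/conv/{kegg_database}/{outside_database}"
```

Here `build_database_conv_url("eco", "ncbi-geneid")` yields `https://rest.kegg.jp/conv/eco/ncbi-geneid`, while pairing `compound` with `uniprot` raises ValueError because the two databases are of different types.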

class kegg_pull.kegg_url.EntriesConvKEGGurl(target_database: str, entry_ids: list[str])[source]

Contains the URL construction and validation functionality for the KEGG API conv operation based on the URL form that uses a target database and entry IDs.

Parameters:
  • target_database (str) – The target database option.

  • entry_ids (list[str]) – The entry IDs options.

Raises:

ValueError – Raised if the target database is invalid or entry IDs are not provided.

class kegg_pull.kegg_url.AbstractLinkKEGGurl(**kwargs)[source]

Abstract class containing the shared data for the link KEGG URLs.

Parameters:

kwargs – The arguments to validate and construct the URL.

Raises:

ValueError – Raised if the provided arguments are invalid.

class kegg_pull.kegg_url.DatabaseLinkKEGGurl(target_database: str, source_database: str)[source]

Contains the URL construction and validation functionality for the link KEGG REST API operation of the form that uses a target database and a source database.

Parameters:
  • target_database (str) – The name of the target database option.

  • source_database (str) – The name of the source database option.

Raises:

ValueError – Raised if the databases are invalid.

class kegg_pull.kegg_url.EntriesLinkKEGGurl(target_database: str, entry_ids: list[str])[source]

Contains the URL construction and validation functionality for the link KEGG REST API operation of the form that uses a target database and entry IDs.

Parameters:
  • target_database (str) – The name of the target database option.

  • entry_ids (list[str]) – The entry IDs options.

Raises:

ValueError – Raised if the target database is invalid or entry IDs are not provided.
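The two link URL forms differ only in what follows the target database: a source database name, or entry IDs joined with '+'. A compact sketch of both forms, assuming the KEGG REST layout `/link/<target_db>/<source_db or entries>`, is shown below. The single dispatching function `build_link_url` is hypothetical; the package models the two forms as separate classes.

```python
from typing import Union


def build_link_url(target_database: str, source: Union[str, list],
                   base_url: str = "https://rest.kegg.jp") -> str:
    """Hypothetical helper covering both the database and the entries form."""
    if not target_database:
        raise ValueError("A target database is required")
    if not source:
        raise ValueError("A source database or at least one entry ID is required")
    if isinstance(source, str):
        # Database form: /link/<target_db>/<source_db>
        source_option = source
    else:
        # Entries form: /link/<target_db>/<id1>+<id2>+...
        source_option = "+".join(source)
    return f"{base_url}/link/{target_database}/{source_option}"
```

For instance, `build_link_url("pathway", "hsa")` produces the database form, and `build_link_url("pathway", ["hsa:10458", "ece:Z5100"])` produces the entries form.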

class kegg_pull.kegg_url.DdiKEGGurl(drug_entry_ids: list[str])[source]

Contains the URL construction and validation functionality for the ddi KEGG REST operation.

Parameters:

drug_entry_ids (list[str]) – The entry IDs for a drug database.

Raises:

ValueError – Raised if the drug entry IDs are not provided.