The icikt API Reference

Python Information-Content-Informed Kendall Tau Correlation (ICIKT)

The icikt package provides a Python tool for calculating an information-content-informed Kendall Tau correlation coefficient between arrays while handling missing values and values that need to be removed.

icikt.methods.get_global_data(shmName, shape, dtype)

Retrieve global data from shared memory.

Parameters:
  • shmName – Name of global data

  • shape – Shape of global data

  • dtype – dtype of global data

Returns:

data ndarray

icikt.methods.icikt(x: ndarray, y: ndarray, perspective: str = 'global') -> tuple

Finds missing values in both arrays and replaces them with a value slightly smaller than the minimum across both arrays, then calculates the ICI-Kendall-Tau correlation.

Parameters:
  • x – First array of data

  • y – Second array of data

  • perspective – Either ‘local’ or ‘global’. Default is ‘global’. Global includes (NA,NA) pairs in the calculation, while local does not.

Returns:

tuple with correlation, pvalue, and tauMax values
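The missing-value handling described above can be sketched in plain Python. This is an illustration only, not the package's implementation: the helper name replace_missing, the use of plain lists instead of ndarrays, and the offset of 0.1 below the minimum are all assumptions made for the example.

```python
import math


def replace_missing(x, y, perspective="global", offset=0.1):
    """Hypothetical helper: fill NaN with a value just below the shared minimum."""
    if perspective not in ("global", "local"):
        raise ValueError("perspective must be 'global' or 'local'")
    pairs = list(zip(x, y))
    if perspective == "local":
        # Local perspective: drop pairs in which both values are missing.
        pairs = [(a, b) for a, b in pairs
                 if not (math.isnan(a) and math.isnan(b))]
    # Smallest observed value across both arrays.
    observed = [v for pair in pairs for v in pair if not math.isnan(v)]
    fill = min(observed) - offset  # the offset size is an assumption
    new_x = [fill if math.isnan(a) else a for a, _ in pairs]
    new_y = [fill if math.isnan(b) else b for _, b in pairs]
    return new_x, new_y


x = [1.0, math.nan, 3.0, math.nan]
y = [2.0, 4.0, math.nan, math.nan]

gx, gy = replace_missing(x, y, "global")  # keeps the (NA, NA) pair
lx, ly = replace_missing(x, y, "local")   # drops the (NA, NA) pair
```

In this sketch the global perspective keeps all four pairs, while the local perspective keeps three. Every filled value ranks below all observed values, so a missing entry is treated as the lowest rank.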

icikt.methods.iciktArray(dataArray: ndarray, globalNA: List[float] = [nan, inf, 0.0], perspective: str = 'global', scaleMax: bool = True, diagGood: bool = True, chunkSize: int = 1, includeOnly: tuple = None) -> tuple

Calls icikt to calculate ICI-Kendall-Tau between every combination of columns in the input 2D array, dataArray. Also replaces any instance of the globalNA values in the array with np.nan.

Parameters:
  • dataArray – 2d array with columns of data to analyze

  • globalNA – Optional list of values to be considered “missing”. Default is NaN, Inf, and 0.

  • perspective – Either ‘local’ or ‘global’. Default is ‘global’. Global includes (NA,NA) pairs in the calculation, while local does not.

  • scaleMax – Whether to scale the correlations relative to the maximum correlation. Default is True.

  • diagGood – Whether the diagonal entries should reflect how many entries in the sample were “good”. Default is True.

  • chunkSize – Size of the chunks used for multiprocessing. Default is 1.

  • includeOnly – Restricts the calculation to the specified columns or column combinations. Default is None.

Returns:

tuple of 2D arrays: the output correlations, raw correlations, p-values, and max tau

Future parameters: featureNA, sampleNA
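The two preparation steps described for iciktArray, marking globalNA values as missing and pairing up columns, might look roughly like the following. This is a sketch under assumptions: mark_missing and column_pairs are hypothetical names, the data is held as a list of columns rather than an ndarray, and whether the package also evaluates each column against itself is not shown here.

```python
import math
from itertools import combinations


def mark_missing(columns, global_na=(math.nan, math.inf, 0.0)):
    """Hypothetical helper: replace every value listed in global_na with NaN."""
    # NaN never compares equal to anything, so existing NaN values fall
    # through the check unchanged and remain NaN.
    return [[math.nan if any(v == na for na in global_na) else v for v in col]
            for col in columns]


def column_pairs(n_columns):
    """Hypothetical helper: index pairs for every combination of two columns."""
    return list(combinations(range(n_columns), 2))


columns = [
    [1.0, 0.0, 3.0],
    [math.inf, 2.0, 5.0],
    [4.0, math.nan, 6.0],
]

cleaned = mark_missing(columns)
pairs = column_pairs(len(cleaned))
```

Each index pair would then be handed to a pairwise correlation call. The chunkSize parameter suggests that these pairs are distributed to worker processes in batches.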

icikt.methods.icikt_mp_wrapper(pairwiseIndices: ndarray, perspective: str, shm: SharedMemory, shape: tuple, dtype: dtype) -> tuple

Wrapper function passed to multiprocessing. It calls the icikt method using the indices of pairwise combinations and the perspective.

Parameters:
  • pairwiseIndices – Indices of pairwise combinations

  • perspective – Either ‘local’ or ‘global’. Global includes (NA,NA) pairs in the calculation, while local does not.

  • dtype – dtype in globalShm array

  • shape – shape of global array

  • shm – shared memory of global data

Returns:

tuple result of the icikt method

icikt.methods.initialize_global_data(data)

Initialize the global data array using shared memory.

Parameters:

data – data input

Returns:

global variable data
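The shared-memory pattern behind initialize_global_data and get_global_data can be illustrated with the standard library alone. The sketch below is not the package's code. It assumes a flat list of float64 values and uses a struct format string where the package takes shape and dtype and returns an ndarray.

```python
import struct
from multiprocessing import shared_memory

values = [1.5, 2.5, 3.5, 4.5]
fmt = f"{len(values)}d"  # four float64 values; stands in for shape and dtype

# "Initialize": create a shared block and copy the data into it.
shm = shared_memory.SharedMemory(create=True, size=struct.calcsize(fmt))
try:
    struct.pack_into(fmt, shm.buf, 0, *values)

    # "Retrieve": attach to the block by name, as a worker process would.
    attached = shared_memory.SharedMemory(name=shm.name)
    try:
        restored = list(struct.unpack_from(fmt, attached.buf, 0))
    finally:
        attached.close()
finally:
    # The creating side closes and then unlinks the block.
    shm.close()
    shm.unlink()
```

Passing only the block name, shape, and dtype to each worker avoids copying the full data array for every task.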

icikt.kendall_dis.kendall_dis(x, y)

Returns the Kendall tau distance between two arrays.

Parameters:
  • x – First array of data

  • y – Second array of data
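For reference, the Kendall tau distance in its usual definition is the number of discordant pairs, meaning pairs of positions that the two arrays rank in opposite order. The sketch below counts them directly in O(n²) time. It is a plain illustration of that definition with a hypothetical function name, not the algorithm used by kendall_dis, and it makes no assumption about how the package treats ties or pre-sorted input.

```python
from itertools import combinations


def kendall_tau_distance(x, y):
    """Count position pairs (i, j) that x and y order in opposite directions."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(
        1
        for i, j in combinations(range(len(x)), 2)
        # A negative product means the pair is discordant; ties give zero.
        if (x[i] - x[j]) * (y[i] - y[j]) < 0
    )


same = kendall_tau_distance([1, 2, 3, 4], [1, 2, 3, 4])
reversed_order = kendall_tau_distance([1, 2, 3, 4], [4, 3, 2, 1])
one_swap = kendall_tau_distance([1, 2, 3, 4], [1, 3, 2, 4])
```

Identical orderings give a distance of 0, a fully reversed ordering of four items gives 6 (every pair is discordant), and a single adjacent swap gives 1.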