mosaicmpi.integration.Integration

class mosaicmpi.integration.Integration(datasets: dict[str, mosaicmpi.dataset.Dataset], corr_method: str = 'pearson', max_median_corr: float = 0, negative_corr_quantile: float = 0.95, k_subset: Collection[int] | Dict[str, Collection[int]] = (2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60))

Integrate multiple datasets together.

Parameters:

datasets (dict[str, Dataset]) – dictionary of name: Dataset pairs.
corr_method (str, optional) – Correlation method: “pearson”, “spearman”, or “kendall”. Defaults to “pearson”
max_median_corr (float, optional) – Threshold for rank reduction procedure, relevant only for datasets where programs tend to be highly correlated. This procedure reduces the maximum rank included for a dataset until the median of the correlation distribution is below the threshold. Defaults to 0
negative_corr_quantile (float, optional) – Threshold for network-based integration, between 0 and 1. Values closer to 1 result in fewer edges in the network. Defaults to 0.95
k_subset (Union[Collection[int], Dict[str, Collection[int]]], optional) – k-values to use for integration. Either a Collection of integers, or a dict specifying k-values separately for each dataset. Defaults to (2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60)
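
The rank-reduction procedure described for max_median_corr can be illustrated with a small standard-library sketch. It is not mosaicmpi's implementation and rests on assumptions: programs are held in a hypothetical dict keyed by (k, program index), the "correlation distribution" is taken to be all pairwise Pearson correlations among the included programs, and the helper names pearson and reduce_max_rank are made up for this example.

```python
from itertools import combinations
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def reduce_max_rank(programs, max_median_corr=0.0):
    """Lower the maximum rank until the median pairwise correlation
    of the included programs is below max_median_corr.

    programs: hypothetical dict mapping (k, program_index) -> feature weights.
    Returns the largest rank kept, or None if no rank meets the threshold.
    """
    ranks = sorted({k for k, _ in programs})
    while ranks:
        kept = [v for (k, _), v in programs.items() if k <= ranks[-1]]
        corrs = [pearson(a, b) for a, b in combinations(kept, 2)]
        if corrs and median(corrs) < max_median_corr:
            return ranks[-1]
        ranks.pop()  # median still too high: exclude the highest remaining rank
    return None


# Toy data (assumed): two anti-correlated programs at k=2,
# three highly similar programs at k=3.
programs = {
    (2, 0): [1, 2, 3, 4],
    (2, 1): [4, 3, 2, 1],
    (3, 0): [1, 2, 3, 5],
    (3, 1): [1, 2, 4, 5],
    (3, 2): [2, 2, 3, 4],
}
max_rank = reduce_max_rank(programs, max_median_corr=0.0)
```

In this toy input, including the k=3 programs pushes the median correlation above 0, so the sketch falls back to a maximum rank of 2.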

Attributes

n_datasets

Get the number of datasets in the integration

sample_to_patient

Returns a Series indexed by dataset and sample ID. Values are the patient from which each sample/observation was derived.

selected_k

Gets the values of k selected for integration.

Methods

`compute_corr`([method, cpus])	Computes correlation matrix of all programs in the integration from all datasets.
`compute_pairwise_thresholds`([...])	Compute thresholds for each dataset and dataset pair based on the correlation distribution of programs.
`filter_programs_rank_reduction`([max_median_corr])	Filter programs using the rank-reduction procedure, relevant only for datasets where programs tend to be highly correlated.
`get_category_overrepresentation`(layer[, ...])	Calculate Pearson residual of chi-squared test, associating programs for each rank (k) to categories of samples/observations.
`get_corr_matrix_lowertriangle`([...])	Get the lower triangular correlation matrix for building the correlation network.
`get_features_overlap_table`()
`get_hvf_overlap_table`()
`get_metadata_correlation`(layer[, ...])	Calculate correlation of program usage with numerical metadata across samples/observations.
`get_metadata_df`([include_categorical, ...])	Get sample/observation metadata for all datasets.
`get_node_table`()	Get node counts before and after various node and edge filters.
`get_programs`([type])	Get programs.
`get_usages`([discretize, normalize])	Calculate usage of each program in each dataset and sample/observation.
`select_k_values`([k_subset, ...])	Select k-values for integration.
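
The summary for get_category_overrepresentation refers to the Pearson residual of a chi-squared test. For a contingency table cell this is (observed - expected) / sqrt(expected), with expected = row total × column total / grand total. The sketch below computes it with the standard library only. The function name pearson_residuals and the toy table are hypothetical, and the code does not reproduce how mosaicmpi builds its tables or calls the method.

```python
from math import sqrt


def pearson_residuals(observed):
    """Pearson residuals (O - E) / sqrt(E) for a contingency table
    given as a list of rows of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    residuals = []
    for i, row in enumerate(observed):
        res_row = []
        for j, obs in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            res_row.append((obs - expected) / sqrt(expected))
        residuals.append(res_row)
    return residuals


# Toy table (assumed): rows are programs, columns are sample categories.
table = [[30, 10],
         [10, 30]]
residuals = pearson_residuals(table)
```

A positive residual means the category is observed more often than expected under independence, and a negative residual means less often.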
