Correlations from reactions samples

Prerequisites for calculating pairwise correlations

In some cases, the forward direction of reactions (as defined in the model) does not align with the observed network topology. As a result, pairs of reactions may show identical correlation magnitudes but with opposite signs than the expected result.

To deal with this, we split all reversible reactions with both positive and negative flux to separate reactions. Thus, one of the 2 (forward or reverse) has the expected correlations with reactions from the rest network.

The split_forward_reverse function splits all bidirectional reactions having at least 1 positive and 1 negative flux value in a sampling array into separate forward and reverse reactions. This step is useful to avoid losing information, when computing correlations from a reactions set that includes reversible reactions.

  • steady_states is the default flux Sampling array with only forward reactions

  • reactions is a list with reactions of interest to be split and result in separate forward and reverse reactions. This list includes reactions from a pathway of interest and thus not all reactions from the list will be split (only those that have both positive and negative flux)

(subset_extended_steady_states_100,
 subset_extended_reactions_100) = split_forward_reverse(
    steady_states = subset_pathways_optgp_condition_100, 
    reactions = reactions_in_pathways_ordered)

(subset_extended_steady_states_0,
 subset_extended_reactions_0) = split_forward_reverse(
    steady_states = subset_pathways_optgp_condition_0, 
    reactions = reactions_in_pathways_ordered)

The find_reactants_products function identifies and saves in separate lists the reactants, products and directionality status of each reaction in a given metabolic model. Cofactors do not count as reactants/products.

  • cobra_model is a metabolic model in cobra format

  • reactions_ids is a list of the reactions of interest

(reversibility_list_all_reactions_100, 
 reactants_list_all_reactions_100,
 products_list_all_reactions_100) = find_reactants_products(
    cobra_model = ec_cobra_model,
    reactions_ids = subset_extended_reactions_100)

(reversibility_list_all_reactions_0, 
 reactants_list_all_reactions_0,
 products_list_all_reactions_0) = find_reactants_products(
    cobra_model = ec_cobra_model,
    reactions_ids = subset_extended_reactions_0)

The sharing_metabolites_square_matrix function works given the lists with reactants, products and reversibility information, creates a square boolean matrix with True values representing reactions sharing a common metabolite, as implemented in sharing_metabolites helper function.

  • reactions_ids is a list containing IDs from a set of reactions of interest

  • reversibility_list_all_reactions is a list containing the reversibility status from a set of reactions (reactions must be the same and in accordance with their order in reactions_ids list. This is ensured with the sort_reactions_by_model_order function)

  • reactants_list_all_reactions is a list containing the reactants from a set of reactions (reactions must be the same and in accordance with their order in reactions_ids list. This is ensured with the sort_reactions_by_model_order function)

  • products_list_all_reactions is a list containing the products from a set of reactions (reactions must be the same and in accordance with their order in reactions_ids list. This is ensured with the sort_reactions_by_model_order function)

subset_boolean_sharing_metabolites_matrix_100 = sharing_metabolites_square_matrix(
    reactions_ids = subset_extended_reactions_100, 
    reversibility_list_all_reactions = reversibility_list_all_reactions_100,
    reactants_list_all_reactions = reactants_list_all_reactions_100,
    products_list_all_reactions = products_list_all_reactions_100)

subset_boolean_sharing_metabolites_matrix_0 = sharing_metabolites_square_matrix(
    reactions_ids = subset_extended_reactions_0, 
    reversibility_list_all_reactions = reversibility_list_all_reactions_0,
    reactants_list_all_reactions = reactants_list_all_reactions_0,
    products_list_all_reactions = products_list_all_reactions_0)

Calculation of pairwise linear correlations and non-linear copula dependencies

The correlated_reactions function calculates pairwise linear correlations and non-linear copula dependencies given a flux sampling array. Users can choose the preffered coefficient to calculate pairwise correlations (pearson or spearman). Moreover, users can filter correlations not meeting a cutoff value and choose whether to proceed with calculation of non-linear dependencies using copulas. Correlations and dependencies from reactions not sharing metabolites can also be removed.

  • steady_states is a numpy array of the generated steady states fluxes

  • boolean_sharing_metabolites_matrix is a boolean symmetric numpy 2D array with True / False based on the presense of shared metabolites between reactions. This matrix, which is optionally provided, is calculated with the sharing_metabolites_square_matrix function

  • reactions is a list with the reactions IDs (must be in accordance with the rows of the steady states)

  • linear_coeff is a string declaring the linear coefficient to be selected. Available options: “pearson”, “spearman”

  • linear_corr_cutoff is a cutoff to filter (remove) linear correlations (not greater than the cutoff) based on the pearson or spearmanr coefficient

  • indicator_cutoff is a cutoff to classify non-linear correlations as positive, negative or non-significant

  • jensenshannon_cutoff is a cutoff to filter (remove) non-linear correlations (not greater than the cutoff) based on the Jensen-Shannon metric

  • std_cutoff is a cutoff to avoid computing the copula between 2 fluxes with almost fixed values

  • include_non_linear is a boolean variable that if True, takes into account and calculates non-linear copula dependencies

  • cells is the number of cells to compute the copula

  • cop_coeff is a value that narrows or widens the width of the copula’s diagonal (use lower values to capture extreme tail dependences)

  • lower_triangle is a boolean variable that if True returns only the lower triangular matrix (useful for heatmap visualization)

  • verbose is a boolean variable that if True additional information is printed as an output to the user.

If user requests calculation of non-linear copula dependencies, 3 numpy arrays and 1 dictionary are returned, as shown below.

(subset_linear_correlation_matrix_100,
subset_non_linear_correlation_matrix_100,
subset_mixed_correlation_matrix_100,
subset_correlations_dictionary_100) = correlated_reactions(
        steady_states = subset_extended_steady_states_100,
        boolean_sharing_metabolites_matrix = None,
        reactions = subset_extended_reactions_100,
        linear_coeff = "pearson",
        linear_corr_cutoff = 0.3, 
        indicator_cutoff = 1.2,
        jensenshannon_cutoff = 0.05,
        std_cutoff = 1e-2,
        include_non_linear = True, 
        cells = 5, 
        cop_coeff = 0.2, 
        lower_triangle = False, 
        verbose = True
)

(subset_linear_correlation_matrix_0, 
subset_non_linear_correlation_matrix_0, 
subset_mixed_correlation_matrix_0, 
subset_correlations_dictionary_0) = correlated_reactions(
        steady_states = subset_extended_steady_states_0,
        boolean_sharing_metabolites_matrix = None,
        reactions = subset_extended_reactions_0,
        linear_coeff = "pearson",
        linear_corr_cutoff = 0.3, 
        indicator_cutoff = 1.2,
        jensenshannon_cutoff = 0.05,
        std_cutoff = 1e-2,
        include_non_linear = True, 
        cells = 5, 
        cop_coeff = 0.2, 
        lower_triangle = False, 
        verbose = True
)

If user does not request calculation of non-linear copula dependencies, 1 numpy array and 1 dictionary are returned, as shown below.

(subset_linear_correlation_matrix_100,
subset_correlations_dictionary_100) = correlated_reactions(
        steady_states = subset_extended_steady_states_100,
        boolean_sharing_metabolites_matrix = None,
        reactions = subset_extended_reactions_100,
        linear_coeff = "pearson",
        linear_corr_cutoff = 0.3, 
        indicator_cutoff = 1.2,
        jensenshannon_cutoff = 0.05,
        std_cutoff = 1e-2,
        include_non_linear = False, 
        cells = 5, 
        cop_coeff = 0.2, 
        lower_triangle = False, 
        verbose = True
)

(subset_linear_correlation_matrix_0, 
subset_correlations_dictionary_0) = correlated_reactions(
        steady_states = subset_extended_steady_states_0,
        boolean_sharing_metabolites_matrix = None,
        reactions = subset_extended_reactions_0,
        linear_coeff = "pearson",
        linear_corr_cutoff = 0.3, 
        indicator_cutoff = 1.2,
        jensenshannon_cutoff = 0.05,
        std_cutoff = 1e-2,
        include_non_linear = False, 
        cells = 5, 
        cop_coeff = 0.2, 
        lower_triangle = False, 
        verbose = True
)

The dictionary returned from this function is independent and not filtered from the boolean_sharing_metabolites_matrix. It contains information on pairwise reactions linear coefficient values, the copula’s jensen shannon distance value, the copula’s indicator value and a copula classification value.

print(subset_correlations_dictionary_100.get("PYK~PGK"))
print(subset_correlations_dictionary_0.get("PYK~PGK"))
{'pearson': 0, 'jensenshannon': 0.11517866713970133, 'indicator': 1.6059379192742054, 'classification': 'positive_upper_lower_tail'}

{'pearson': 0, 'jensenshannon': -0.10089110481651221, 'indicator': 0.7628205134721049, 'classification': 'negative_upper_lower_tail'}

The plot_correlation_matrix function plots a correlation matrix created with the correlated_reactions function.

  • correlation_matrix is the correlation matrix to plot.

  • reactions is an optional list of reaction labels to be plot as axes in the plot.

  • label_font_size is the font size for the axes labels

  • width is the width of the plot

  • height is the height of the plot

plot_correlation_matrix(
    correlation_matrix = subset_mixed_correlation_matrix_100, 
    reactions = subset_extended_reactions_100, 
    label_font_size = 10,
    width = 900,
    height = 900)

heatmap_correlation_matrix