Differential Flux Analysis

Compare reaction flux distributions across conditions with a Kolmogorov–Smirnov test

The significantly_altered_reactions function takes as input at least 2 flux sampling conditions to compare and identifies significantly altered reactions. It performs a Kolmogorov-Smirnov (KS) non-parametric test and corrects p-value for multiple comparisons. It additionally calculates a fold change that together with the p-value classifies reactions as significantly altered or not. It returns 2 lists one with the significantly changed (significant_diff_reactions) and one with the not significantly changed (not_significant_diff_reactions) reactions. Also, 2 dicitonaries mapping reaction IDs to corrected p_values (pval_dict) and fold change values (fold_change_dict)

  • conditions is a list of different flux sampling arrays

  • selected_comparisons is a list showing which conditions to compare (useful when comparing more than 2 sampling arrays)

  • cobra_model is a cobra model object

  • fold_change_cutoff is a cutoff for fold-change to consider 2 distributions significantly different

  • std_cutoff is a cutoff to ensure distributions that are compared are not fixed to a certain value

conditions = [samples_optgp_condition_100, samples_optgp_condition_0]

selected_comparisons = [(0, 1)]

(significant_diff_reactions,
 not_significant_diff_reactions,
 pval_dict,
 fold_change_dict) = significantly_altered_reactions(
    conditions = conditions, 
    selected_comparisons = selected_comparisons, 
    cobra_model = ec_cobra_model,
    fold_change_cutoff = 0.6,
    std_cutoff = 1e-3)

The significantly_altered_reactions function creates a volcano plot to show results from differential flux analysis. Volcano plot has Fold Change on the x-axis and -log10(p_value) on the y-axis. Users can provide a reactions list in the annotate parameter, to show reaction IDs on the plot. Also, lines showing the significance cutoffs may optionally be added, when providing p_value_cutoff, fold_change_cutoff and show_cutoff_lines.

  • pval_dict is a dictionary mapping reaction ID to corrected p-value.

  • fold_change_dict is a dictionary mapping reaction ID to fold-change value.

  • p_value_cutoff is a significance threshold for p-value (to appear in the plot).

  • fold_change_cutoff is a threshold for fold-change (to appear in the plot).

  • annotate is a list of reaction IDs to annotate on the plot.

  • width is the width of the figure in pixels.

  • height is the height of the figure in pixels.

  • title is the title of the plot.

  • show_cutoff_lines is a boolean variable for whether to draw p-value and fold-change cutoff lines.

reactions_to_annotate = ["NH4t"]

plot_volcano(
    pval_dict = pval_dict,
    fold_change_dict = fold_change_dict,
    p_value_cutoff = 0.05,
    fold_change_cutoff = 0.6,
    annotate = reactions_to_annotate,
    width = 800,
    height = 600,
    title = "",
    show_cutoff_lines = True)

volcano_plot

Hypergeometric test for pathway enrichment

The dictionary_reaction_to_all_pathways function is used to create a dictionary mapping reaction IDs to all pathways they belong.

  • df_kegg_pathways is a pandas DataFrame with reactions and pathway names, created from the get_kegg_pathways_from_reaction_ids function

  • reaction_col is the column name for reaction identifiers, as expected to appear in the df_kegg_pathways dataframe

  • pathway_col is the column name for list of pathways, as expected to appear in the df_kegg_pathways dataframe

reaction_to_pathways = dictionary_reaction_to_all_pathways(
    df_kegg_pathways = df_kegg_pathways,
    reaction_col = 'model_reaction', 
    pathway_col = 'pathway_names')

print(reaction_to_pathways.get("PGI"))
['Starch and sucrose metabolism', 'Metabolic pathways', 'Biosynthesis of secondary metabolites']

The hypergeometric_test_pathway_enrichment function performs a hypergeometric test to find significantly affected pathways between our sampling conditions. It also calculated fold enrichment and p_values useful for filtering significant changes

  • significant_reactions is a list of reaction IDs with altered flux.

  • model_reactions is a list of all reaction IDs considered in the analysis.

  • reaction_to_pathways is a dictinary mapping reaction ID -> list of pathway names.

hypergeometric_pathway_enrichment_df = hypergeometric_test_pathway_enrichment(
    significant_reactions = significant_diff_reactions, 
    model_reactions = ec_cobra_reaction_ids, 
    reaction_to_pathways = reaction_to_pathways)

The plot_pathway_enrichment function takes as input the hypergeometric_pathway_enrichment_df dataframe created with the hypergeometric_test_pathway_enrichment function and plots the enrichment results in a bubble plot. Bubble size stands for pathway size (number of reactions) and reaction colour stands for -log10(p-value)

  • hypergeometric_enrichment_df is a dataframe from the hypergeometric_test_pathway_enrichment function.

  • pval_threshold is a significance p value threshold for filtering.

  • use_fdr is a boolean for whether to use ‘fdr’ or ‘pval’ for filtering.

  • font_size is the font size for plot labels and title.

  • width is the width of the figure in pixels.

  • height is the height of the figure in pixels.

plot_pathway_enrichment(
    hypergeometric_enrichment_df = hypergeometric_pathway_enrichment_df, 
    pval_threshold = 2,
    use_fdr = True,
    font_size = 14,
    width = 900,
    height = 600)

pathway_enrichment