easyLSEA provides a complete pipeline for Lipid Set Enrichment Analysis (LSEA) in R. Starting from a table of differential lipid abundances, it annotates lipids into biological groups, tests whether those groups are systematically shifted between conditions, and produces publication-ready bubble plots, distribution plots, and fatty acid chain visualizations.
The package runs two complementary enrichment engines, a Kolmogorov-Smirnov (KS) test and fgsea.
Running both engines and comparing their results (convergence analysis) gives a more complete picture of lipid remodeling than either method alone.
Install the stable version from CRAN:
Or the development version from GitHub:
To enable the fgsea engine, install the optional Bioconductor dependency:
easyLSEA expects a data.frame with at least:
| Column | Description |
|---|---|
| LipidName | Lipid identifier in standard shorthand notation (e.g. PC 36:4, TG 54:3) |
| logFC | log2 fold-change (case vs reference) |
| P.Value | Raw (unadjusted) p-value, used for the fgsea pi-value rank metric |
Additional columns are used when present:
| Column | Used for |
|---|---|
| adj.P.Val | Counting significantly changed lipids |
| Confidence_rank | Annotation confidence filtering |
| Shorthand | Alternative lipid name fallback |
Column names are configurable via lipid_col,
fc_col, and pval_col arguments — only the
defaults are shown above.
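As a minimal sketch of the expected input, the table below is built by hand in base R. The lipid names follow the shorthand shown above, but the numeric values are invented for illustration, and computing adj.P.Val with a Benjamini-Hochberg correction is an assumption about how that optional column is usually obtained.

```r
# Toy input table; the values are hypothetical and for illustration only.
my_lipid_data <- data.frame(
  LipidName = c("PC 36:4", "PC 34:1", "TG 54:3", "TG 52:2"),
  logFC     = c(0.8, 0.5, -1.2, -0.9),
  P.Value   = c(0.004, 0.03, 0.0007, 0.02),
  stringsAsFactors = FALSE
)

# Optional column (assumption: BH-adjusted p-values).
my_lipid_data$adj.P.Val <- p.adjust(my_lipid_data$P.Value, method = "BH")

# Check that the three required default columns are present.
required <- c("LipidName", "logFC", "P.Value")
missing_cols <- setdiff(required, names(my_lipid_data))
if (length(missing_cols) > 0) {
  stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
}
```

If your columns use different names, the check above would need the same names you pass to lipid_col, fc_col, and pval_col.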
The entire pipeline runs in a single call to
easyLSEA():
library(easyLSEA)
result <- easyLSEA(
data = my_lipid_data,
lipid_col = "LipidName",
fc_col = "logFC",
pval_col = "P.Value",
case_lbl = "NASH",
ref_lbl = "Control",
engine = "both", # run KS and fgsea
min_rank = "E" # include all confidence ranks except P and NA (default)
)
This returns an easyLSEA_result object with five slots
described in the next section.
result$meta
A named list with run metadata: date, comparison labels, engine used, number of lipids, and the original function call.
result$lsea
Contains the enrichment statistics. The key sub-elements are:
# KS results — one row per lipid set per grouping level
head(result$lsea$ks)
# fgsea results
head(result$lsea$fgsea)
# Combined table with Convergence column
head(result$lsea$combined)
Key columns in the KS results:
| Column | Description |
|---|---|
| Group | Lipid set name (e.g. PC, Glycerolipids) |
| Level | Grouping level (LipidClass, LipidCategory_LMAPS, LipidCategory_functional) |
| N_group | Number of lipids in the set |
| DirectionalScore | Standardized mean difference (Cohen’s d analog). Positive = up in case. |
| KS_pval | Two-sided KS p-value |
| FDR_LSEA | BH-adjusted FDR |
| DS_perm_pval | Permutation p-value for the DirectionalScore |
| ContributingLipids_KS | Lipids on the enriched side of the CDF divergence point |
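To make the KS_pval and FDR_LSEA columns concrete, here is a rough base-R sketch of a two-sided two-sample KS test per set followed by BH adjustment. It assumes each set is compared against a separate background vector of logFC values; how easyLSEA defines the background internally is not stated here, and the toy vectors are invented.

```r
# Toy logFC values (hypothetical): a background and two lipid sets.
background <- seq(-1, 1, length.out = 50)
sets <- list(
  shifted  = seq(1.1, 2, length.out = 10),     # clearly shifted upward
  centered = seq(-0.95, 0.95, length.out = 10) # same range as the background
)

# Two-sided two-sample KS test of each set against the background.
ks_pval <- sapply(sets, function(x) ks.test(x, background)$p.value)

# Benjamini-Hochberg adjustment across sets.
fdr <- p.adjust(ks_pval, method = "BH")

print(data.frame(Group = names(sets), KS_pval = ks_pval, FDR_LSEA = fdr))
```

The shifted set yields a very small p-value, while the centered set does not, which is the behavior the KS columns summarize for real lipid sets.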
Key columns in the fgsea results:
| Column | Description |
|---|---|
| NES | Normalized Enrichment Score. Positive = enriched toward top of ranked list (up in case). |
| FDR_fgsea | BH-adjusted FDR from fgsea |
| N_leading | Leading edge size |
| LeadingEdge | Lipids in the leading edge |
| rank_metric | Rank metric used (pi_value, logFC, or t_stat) |
The Convergence column in the combined
table classifies each set as:
| Value | Meaning |
|---|---|
| KS+fgsea [strongest] | Significant by both engines, highest confidence |
| KS only [distributed effect] | Moderate shift across many lipids |
| fgsea only [extreme-driven] | A few strongly regulated lipids drive the signal |
| Neither | Not significant by either engine |
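The classification in this table can be sketched as a small helper. This is an illustration only: classify_convergence is a hypothetical function, and the 0.05 cutoff with a strict comparison is an assumption rather than the package's documented rule.

```r
# Hypothetical helper: label each set from its two FDR values.
# Assumption: a set is "significant" when FDR < fdr_thresh (default 0.05).
classify_convergence <- function(fdr_ks, fdr_fgsea, fdr_thresh = 0.05) {
  sig_ks <- !is.na(fdr_ks) & fdr_ks < fdr_thresh
  sig_fg <- !is.na(fdr_fgsea) & fdr_fgsea < fdr_thresh
  ifelse(sig_ks & sig_fg, "KS+fgsea [strongest]",
    ifelse(sig_ks, "KS only [distributed effect]",
      ifelse(sig_fg, "fgsea only [extreme-driven]", "Neither")))
}

# Four toy sets, one for each possible outcome.
conv <- classify_convergence(
  fdr_ks    = c(0.01, 0.02, 0.40, 0.60),
  fdr_fgsea = c(0.03, 0.30, 0.01, 0.90)
)
print(conv)
```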
result$chains
Fatty acid chain analysis results, available when
run_chains = TRUE:
# Long format — one row per acyl chain per lipid
head(result$chains$parsed)
# Parsing status — one row per lipid
head(result$chains$summary)
# Wide format — one row per lipid with sn positions and totals
head(result$chains$wide)
The wide table is the most convenient for reporting.
Each row is one lipid, with columns sn1, sn2,
sn3, sn4 containing the individual acyl chain
positions (e.g. "18:1"), and total_carbons /
total_unsat with the summed totals. The
chain_type column clarifies how to interpret the sn
columns:
| chain_type | sn1 | sn2 | sn3 | sn4 |
|---|---|---|---|---|
| sn2 (PC, PE, PS…) | sn-1 chain | sn-2 chain | NA | NA |
| nacyl (Cer, SM…) | sphingoid base | N-acyl chain | NA | NA |
| long_format (TG) | chain 1 | chain 2 | chain 3 | NA |
| long_format (CL) | chain 1 | chain 2 | chain 3 | chain 4 |
| single (CAR, LPC) | the chain | NA | NA | NA |
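To illustrate how total_carbons and total_unsat relate to the sn columns, the sketch below sums carbons and double bonds over the non-missing chains of one lipid. chain_totals is a hypothetical helper and not part of easyLSEA. It assumes plain "carbons:double bonds" strings, so prefixed sphingoid bases such as "d18:1" would need extra handling.

```r
# Hypothetical helper: sum carbons and double bonds across sn positions.
# Assumption: each chain is a plain "C:DB" string such as "18:1".
chain_totals <- function(chains) {
  chains <- chains[!is.na(chains)]               # drop unused sn positions
  parts  <- strsplit(chains, ":", fixed = TRUE)
  carbons <- as.integer(vapply(parts, `[`, character(1), 1))
  unsat   <- as.integer(vapply(parts, `[`, character(1), 2))
  c(total_carbons = sum(carbons), total_unsat = sum(unsat))
}

# A PC with 16:0 at sn1 and 20:4 at sn2 corresponds to PC 36:4.
pc_totals <- chain_totals(c("16:0", "20:4", NA, NA))
print(pc_totals)
```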
result$plots
A named list of ggplot2 objects, available when
plots = TRUE:
Plot naming convention for LSEA bubble plots:
| Name pattern | Description |
|---|---|
| bubble_ks_01_Class | KS bubble plot, lipid class level |
| bubble_ks_sig_01_Class | KS bubble plot, significant sets only |
| bubble_fgsea_01_Class | fgsea bubble plot, lipid class level |
| bubble_fgsea_sig_01_Class | fgsea bubble plot, significant sets only |
| dist_01_Class | Distribution (boxplot), lipid class level |
Levels: 01_Class (lipid class), 02_LMAPS
(LIPID MAPS category), 03_Functional (functional
category).
Individual plots can be displayed directly:
# KS bubble plot — all lipid classes
result$plots$lsea$bubble_ks_01_Class
# fgsea bubble plot — significant sets only
result$plots$lsea$bubble_fgsea_sig_01_Class
# Distribution plot — lipid class level
result$plots$lsea$dist_01_Class
To customize bubble labels:
# Regenerate plots showing only FDR and n
plots <- plot_lsea(
result$lsea,
case_lbl = "NASH",
ref_lbl = "Control",
bubble_label = c("FDR", "n")
)
export_lsea() saves all results to a timestamped folder.
Supply the output directory explicitly via dir (here a
temporary directory; for a real analysis use a folder of your
choice):
This creates:
easyLSEA_NASH_vs_Control_2024-01-15_1430/
  tables/
    lsea_results_ks.csv
    lsea_results_fgsea.csv
    lsea_combined.csv
    chain_results.csv
  plots/
    lsea/
      01_Class/
        bubble_ks_01_Class.pdf
        bubble_ks_sig_01_Class.pdf
        bubble_fgsea_01_Class.pdf
        bubble_fgsea_sig_01_Class.pdf
        dist_01_Class.pdf
      02_LMAPS/ ...
      03_Functional/ ...
    chains/
      tile/ ...
      trend/ ...
  results.xlsx
# Step 1: annotate
annotated <- annotate_lipids(my_lipid_data, lipid_col = "LipidName")
# Step 2: run enrichment
lsea_res <- run_lsea(
data = annotated,
fc_col = "logFC",
engine = "both",
case_lbl = "NASH",
ref_lbl = "Control"
)
# Step 3: generate plots manually
plots <- plot_lsea(
lsea_res,
case_lbl = "NASH",
ref_lbl = "Control",
fdr_thresh = 0.05,
bubble_label = c("FDR", "DS", "NES", "n")
)
# Step 4: distribution plot for a specific level
p_dist <- plot_distribution(
data = annotated,
lsea_result = lsea_res,
group_col = "LipidClass",
case_lbl = "NASH",
ref_lbl = "Control"
)
# Default: pi-value = sign(logFC) * -log10(P.Value)
# Combines effect size and statistical evidence
# Alternative: logFC only
result_fc <- easyLSEA(
data = my_lipid_data,
engine = "fgsea",
fgsea_rank = "logFC"
)
# Alternative: LIMMA t-statistic (requires a 't' column)
result_t <- easyLSEA(
data = my_lipid_data,
engine = "fgsea",
fgsea_rank = "t_stat"
)
When lipid annotations include a confidence rank (A > B > C > D > E > P), min_rank controls which lipids enter the chain analysis:
# Default: include all except P and NA
result_all <- easyLSEA(data = my_lipid_data, min_rank = "E")
# Strict: include only high-confidence annotations (A and B)
result_strict <- easyLSEA(data = my_lipid_data, min_rank = "B")
# Or apply directly to parse_lipid_chains
chains_strict <- parse_lipid_chains(annotated, min_rank = "B")
table(chains_strict$summary$status)
Lipids excluded by min_rank appear in result$chains$summary with status
excluded_rank_below_X (where X is the chosen threshold),
making it easy to audit which lipids were filtered.
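The ordering A > B > C > D > E > P can be sketched as a position lookup. passes_min_rank is a hypothetical helper written for this illustration; it reproduces the behavior described above (min_rank = "E" drops only P and NA, min_rank = "B" keeps only A and B) but is not the package's internal code.

```r
# Confidence ranks from best to worst, as listed above.
rank_levels <- c("A", "B", "C", "D", "E", "P")

# Hypothetical helper: TRUE when a lipid's rank is at or above min_rank.
# Missing or unrecognized ranks never pass.
passes_min_rank <- function(conf_rank, min_rank = "E") {
  pos <- match(conf_rank, rank_levels)
  !is.na(pos) & pos <= match(min_rank, rank_levels)
}

ranks <- c("A", "B", "C", "E", "P", NA)
keep_default <- passes_min_rank(ranks, "E")  # keeps A, B, C, E
keep_strict  <- passes_min_rank(ranks, "B")  # keeps A, B only
print(data.frame(rank = ranks, keep_default, keep_strict))
```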
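Both engines test the same null hypothesis (no systematic enrichment) but are sensitive to different signal patterns: the KS engine picks up moderate shifts distributed across many lipids in a set, while fgsea picks up signals driven by a few strongly regulated lipids.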
A significant KS result with DirectionalScore ≈ 0 suggests a variance or shape difference rather than a net directional shift; interpret with caution.
Convergent results (significant by both) provide the strongest evidence for coordinated lipid class remodeling.
The DirectionalScore is a standardized mean difference
(Cohen’s d analog):
DS = (mean_logFC_set - mean_logFC_background) / SD_pooled
It quantifies the direction and magnitude of the shift, independent of the KS p-value. A set can have a significant KS p-value (distributional difference) with a near-zero DirectionalScore (no net direction) — this pattern suggests heterogeneous within-set regulation.
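As a worked example of the formula, the sketch below computes a DirectionalScore for one toy set. Two points are assumptions, since the text does not define them: the background is taken to be all lipids outside the set, and SD_pooled is the classical pooled standard deviation used for Cohen's d. directional_score is a hypothetical helper.

```r
# Hypothetical helper implementing
# DS = (mean_logFC_set - mean_logFC_background) / SD_pooled
# Assumptions: background = lipids not in the set; SD_pooled = pooled SD.
directional_score <- function(logfc, in_set) {
  x <- logfc[in_set]    # logFC values inside the set
  y <- logfc[!in_set]   # logFC values of the background
  n1 <- length(x)
  n2 <- length(y)
  sd_pooled <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
  (mean(x) - mean(y)) / sd_pooled
}

# Toy data: three set members shifted up, five background lipids around zero.
logfc  <- c(1.0, 1.2, 0.8, -0.1, 0.1, 0.0, -0.2, 0.2)
in_set <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)
ds <- directional_score(logfc, in_set)
print(ds)
```

Here the set mean is 1.0, the background mean is 0, and the pooled variance is 0.03, so the score is positive (up in case) and large.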
By default, fgsea ranks lipids by:
pi-value = sign(logFC) × −log10(P.Value)
This combines the direction of change with statistical confidence, giving more weight to lipids that are both strongly and significantly regulated. It is preferred over logFC alone when p-values are available.
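The pi-value formula can be reproduced directly in base R. The sketch below uses an invented four-lipid table and a hypothetical helper pi_value; it only illustrates the ranking logic and does not call fgsea.

```r
# Hypothetical helper: pi-value = sign(logFC) * -log10(P.Value)
pi_value <- function(logfc, pval) sign(logfc) * -log10(pval)

# Toy table; the values are invented for illustration.
toy <- data.frame(
  LipidName = c("PC 36:4", "PC 34:1", "TG 54:3", "TG 52:2"),
  logFC     = c(2.0, 0.3, -1.5, -0.2),
  P.Value   = c(0.001, 0.5, 0.01, 0.0001),
  stringsAsFactors = FALSE
)
toy$pi <- pi_value(toy$logFC, toy$P.Value)

# Rank from most up-regulated to most down-regulated.
ranked <- toy[order(toy$pi, decreasing = TRUE), ]
print(ranked)
```

Note that TG 52:2 lands at the bottom of the list despite its small logFC, because its p-value is the smallest. Ranking by logFC alone would place TG 54:3 there instead.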
sessionInfo()
#> R version 4.5.1 (2025-06-13)
#> Platform: aarch64-apple-darwin20
#> Running under: macOS Tahoe 26.2
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
#>
#> locale:
#> [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> time zone: America/New_York
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.39 R6_2.6.1 fastmap_1.2.0 xfun_0.57
#> [5] cachem_1.1.0 knitr_1.51 htmltools_0.5.9 rmarkdown_2.31
#> [9] lifecycle_1.0.5 cli_3.6.6 sass_0.4.10 jquerylib_0.1.4
#> [13] compiler_4.5.1 rstudioapi_0.18.0 tools_4.5.1 evaluate_1.0.5
#> [17] bslib_0.11.0 yaml_2.3.12 otel_0.2.0 jsonlite_2.0.0
#> [21] rlang_1.2.0