| Title: | Lipid Set Enrichment Analysis with Dual KS and 'fgsea' Engines |
| Version: | 0.2.1 |
| Description: | Provides biology-aware lipid set enrichment analysis (LSEA) for lipidomics data using dual engines: the Kolmogorov-Smirnov test and the fast gene set enrichment algorithm from the 'fgsea' package. Annotates lipids into biological groups at three levels (lipid class, LIPID MAPS category, functional category) and tests for coordinated directional shifts between conditions. Includes fatty acid chain analysis with trend plots weighted by lipid abundance (Spearman rank correlation, configurable smoothing), wide-format chain position output (sn-1, sn-2, sn-3, sn-4), annotation confidence filtering, and export utilities for reproducible reporting in CSV, 'Excel', and PDF formats. Vignettes are available in English and Spanish. Methods are based on Subramanian et al. (2005) <doi:10.1073/pnas.0506580102> and Korotkevich et al. (2021) <doi:10.1101/060012>. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| Depends: | R (≥ 4.1.0) |
| RoxygenNote: | 8.0.0 |
| Imports: | ggplot2, withr |
| Suggests: | fgsea, knitr, openxlsx, rmarkdown, spelling, testthat (≥ 3.0.0) |
| URL: | https://github.com/DavidGO464/easyLSEA |
| BugReports: | https://github.com/DavidGO464/easyLSEA/issues |
| LazyData: | true |
| Config/testthat/edition: | 3 |
| VignetteBuilder: | knitr |
| Language: | en-US |
| NeedsCompilation: | no |
| Packaged: | 2026-06-08 17:35:46 UTC; david_md |
| Author: | David Guardamino Ojeda |
| Maintainer: | David Guardamino Ojeda <david.guardamino@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-06-16 19:40:02 UTC |
Annotate lipid names with LIPID MAPS classification
Description
Parses lipid names in any format used by lipidomics software (LipidSearch, MS-DIAL, LipidView) and returns a structured data frame with LIPID MAPS canonical classification, chain-level metadata, and optional shorthand notation per Liebisch et al. (2020).
Usage
annotate_lipid(
  molecules,
  detail = c("compact", "standard", "full"),
  shorthand = FALSE,
  sn_confirmed = FALSE,
  lyso_explicit = FALSE,
  no_match = c("warn", "remove", "ignore"),
  sphingoid_default = "d"
)
Arguments
molecules |
Character vector of lipid names to parse. |
detail |
Level of detail in the output table: one of "compact", "standard", or "full". |
shorthand |
Logical. If |
sn_confirmed |
Logical. If |
lyso_explicit |
Logical. If |
no_match |
How to handle unparsed names: one of "warn", "remove", or "ignore". |
sphingoid_default |
Default sphingoid base prefix for sphingolipids
without explicit prefix. Default: "d". |
Value
A data frame with one row per unique lipid name. Key columns include
Class, lm_category, lm_class_id, annotation_level,
is_ether, is_plasmalogen, is_istd, sphingoid_prefix,
total_cl, total_cs, and optionally shorthand_lm.
References
Liebisch G et al. Update on LIPID MAPS classification, nomenclature, and shorthand notation for MS-derived lipid structures. J Lipid Res. 2020;61(12):1539-1555. doi:10.1194/jlr.S120001025
Conroy MJ et al. LIPID MAPS: update to databases and tools for the lipidomics community. Nucleic Acids Res. 2024;52(D1):D1677-D1682. doi:10.1093/nar/gkad896
Examples
lipids <- c("PC 16:0/18:1", "PC O-18:1/20:4", "Cer d18:1/16:0",
            "TG(16:0/18:1/18:1)", "Lyso PE 18:1(d7)",
            "plasmenylPE (16:0/18:1)", "Sa1P d 18:0",
            "WE 16:0/18:1", "CoA 16:0",
            "15-HETE", "PGE2", "LTB4", "Resolvin D1", "12(13)-EpOME")
annotate_lipid(lipids)
annotate_lipid(lipids, detail = "standard")
annotate_lipid(lipids, detail = "full", shorthand = TRUE)
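As a rough illustration of the chain-level totals a name parser has to derive, the sketch below sums carbons and double bonds over every chain token in a lipid name. sum_chains() is a hypothetical base-R helper, not part of easyLSEA, and it ignores everything else annotate_lipid() handles (ether and plasmalogen prefixes, sphingoid bases, internal standards, oxylipin names).

```r
# Hypothetical helper (not an easyLSEA function): sum carbons and double
# bonds over every "carbons:double_bonds" token found in a lipid name.
sum_chains <- function(name) {
  tokens <- regmatches(name, gregexpr("[0-9]+:[0-9]+", name))[[1]]
  if (length(tokens) == 0L) {
    # No chain token at all, e.g. an oxylipin trivial name.
    return(c(carbons = NA_integer_, double_bonds = NA_integer_))
  }
  parts <- do.call(rbind, strsplit(tokens, ":", fixed = TRUE))
  c(carbons      = sum(as.integer(parts[, 1])),
    double_bonds = sum(as.integer(parts[, 2])))
}

sum_chains("PC 16:0/18:1")
sum_chains("TG(16:0/18:1/18:1)")
sum_chains("PGE2")
```

Names without any chain token return NA for both totals, which is one simple way to flag species that need a dedicated rule.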
Annotate lipid names with class and category information
Description
Assigns lipid class (e.g. PC, TG, Cer), full class name, LIPID MAPS
structural category, and functional category to each lipid in data.
Returns the input data.frame with annotation columns appended, ready for
use in run_lsea and parse_lipid_chains.
Usage
annotate_lipids(
  data,
  lipid_col = "LipidName",
  shorthand_col = "Shorthand",
  method = c("internal", "lipidAnnotator"),
  verbose = TRUE
)
Arguments
data |
A |
lipid_col |
Character(1). Name of the column containing lipid
identifiers. Default: "LipidName". |
shorthand_col |
Character(1) or |
method |
Character(1). Annotation method:
"internal" or "lipidAnnotator". |
verbose |
Logical(1). Print annotation summary (class distribution
and count of unclassified lipids). Default: TRUE. |
Value
The input data.frame with five columns appended:
- LipidClass
Abbreviated class (e.g. "PC", "TG", "Cer").
- LipidClass_Full
Descriptive class name (e.g. "Ceramide", "Ether-PC").
- LipidCategory_LMAPS
LIPID MAPS structural category (e.g. "Glycerophospholipids", "Sphingolipids").
- LipidCategory_functional
Functional category, with Oxylipins and Bile Acids as standalone groups rather than nested under Fatty Acyls.
- LipidCategory
Simplified category for plotting: same as LipidCategory_functional except Saccharolipids are shown as "Glycolipids".
Lipids that cannot be classified receive LipidClass = "Unknown".
Examples
df <- data.frame(
  LipidName = c("PC 36:2", "TG(54:3)", "SM d18:1/16:0",
                "Cer(d18:1/24:0)", "LPC 18:0", "CE 18:1"),
  logFC = c(1.2, -0.8, 0.5, -1.1, 0.3, 0.9),
  stringsAsFactors = FALSE
)
annotated <- annotate_lipids(df)
annotated[, c("LipidName", "LipidClass", "LipidCategory")]
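The idea of class and category assignment can be sketched with a plain lookup table. Everything below is illustrative base R under stated assumptions: the lmaps_category table is a hypothetical six-class excerpt, and taking the leading run of letters as the class is far cruder than the package's annotation methods.

```r
# Hypothetical lookup table: a handful of classes and their LIPID MAPS categories.
lmaps_category <- c(
  PC = "Glycerophospholipids", LPC = "Glycerophospholipids",
  TG = "Glycerolipids",
  SM = "Sphingolipids", Cer = "Sphingolipids",
  CE = "Sterol Lipids"
)

lipid_names <- c("PC 36:2", "TG(54:3)", "SM d18:1/16:0",
                 "Cer(d18:1/24:0)", "LPC 18:0", "CE 18:1", "Mystery 1")

# Take the leading run of letters as the class abbreviation.
class_guess <- sub("^([A-Za-z]+).*$", "\\1", lipid_names)

# Names whose prefix is not in the table are labelled "Unknown".
known <- class_guess %in% names(lmaps_category)
toy <- data.frame(
  LipidName = lipid_names,
  LipidClass = ifelse(known, class_guess, "Unknown"),
  LipidCategory_LMAPS = ifelse(known, unname(lmaps_category[class_guess]), "Unknown"),
  stringsAsFactors = FALSE
)
toy
```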
Default chain analysis class configuration
Description
Returns the default list that maps lipid classes to their parsing strategy.
Pass the output of this function as the cls_config argument of
parse_lipid_chains() to override individual entries.
Usage
default_chain_config()
Value
Named list with elements sn2, nacyl, long,
single, and excl.
Lipid Set Enrichment Analysis: full pipeline
Description
One-call interface to the complete easyLSEA workflow:
lipid annotation, KS and/or fgsea enrichment across three biological
levels (class, LIPID MAPS category, functional category), and fatty
acid chain analysis. Returns a structured easyLSEA_result object
that can be plotted and exported.
Usage
easyLSEA(
  data,
  lipid_col = "LipidName",
  fc_col = "logFC",
  pval_col = "P.Value",
  case_lbl = "Case",
  ref_lbl = "Reference",
  engine = c("both", "ks", "fgsea"),
  annotator = c("internal", "lipidAnnotator"),
  run_chains = TRUE,
  min_rank = "E",
  group_cols = NULL,
  min_n = 3L,
  n_perm = 2000L,
  fgsea_nperm = 10000L,
  plots = TRUE,
  bubble_label = c("FDR", "DS", "NES", "n"),
  output = c("combined", "separate"),
  seed = 42L,
  verbose = TRUE
)
Arguments
data |
A |
lipid_col |
Character(1). Name of the lipid identifier column.
Default: "LipidName". |
fc_col |
Character(1). Name of the log2 fold-change column.
Default: "logFC". |
pval_col |
Character(1) or |
case_lbl |
Character(1). Label for the case group, used in output
tables and plot titles. Default: "Case". |
ref_lbl |
Character(1). Label for the reference group.
Default: "Reference". |
engine |
Character(1). Enrichment engine: "both", "ks", or "fgsea". |
annotator |
Character(1). Lipid annotation method:
"internal" or "lipidAnnotator". |
run_chains |
Logical(1). Whether to run fatty acid chain analysis
in addition to LSEA. Default: TRUE. |
min_rank |
Character(1). Minimum confidence rank for chain analysis.
Ranks are ordered |
group_cols |
Character vector. Grouping columns to test in LSEA.
If |
min_n |
Integer(1). Minimum set size to test. Default: 3L. |
n_perm |
Integer(1). KS permutations for |
fgsea_nperm |
Integer(1). fgsea Monte Carlo permutations.
Default: 10000L. |
plots |
Logical(1). Whether to generate ggplot2 objects.
Set to |
bubble_label |
Character vector. Which statistics to show next to
each bubble in the LSEA bubble plots. Any subset of "FDR", "DS", "NES", and "n". |
output |
Character(1). Return format when both modules run:
"combined" or "separate". |
seed |
Integer(1) or |
verbose |
Logical(1). Print progress messages. Default: TRUE. |
Value
An object of class easyLSEA_result: a named list with five slots.
- $meta
Named list: call, date, labels, engine, counts.
- $lsea
Named list: results (data.frame with KS and/or fgsea statistics), combined (merged table with Convergence column).
- $chains
Named list: parsed and summary from parse_lipid_chains, or NULL if run_chains = FALSE.
- $plots
Named list of ggplot objects, or NULL if plots = FALSE.
- $input
Named list: data (annotated input), group_cols.
When output = "separate", returns list(lsea = ..., chains = ...) instead.
See Also
annotate_lipids for standalone annotation,
run_lsea for the enrichment engine,
parse_lipid_chains for chain analysis,
plot_lsea, plot_chains,
export_lsea() to save results.
Examples
data("lipid_example", package = "easyLSEA")
result <- easyLSEA(
  data = lipid_example,
  lipid_col = "LipidName",
  fc_col = "logFC",
  case_lbl = "NASH",
  ref_lbl = "Control",
  engine = "ks",
  plots = FALSE
)
print(result)
head(result$lsea$results)
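The n_perm and seed arguments point to a permutation-based p-value for the KS engine, but the exact scheme is not spelled out on this page. The following is therefore only a generic sketch of a label-permutation test around a two-sample KS statistic, written in base R on simulated data; it should not be read as the package's implementation.

```r
set.seed(42)  # arbitrary seed, for a reproducible illustration only

# Simulated ranking: 15 lipids in the set are shifted up, 85 are background.
logFC  <- c(rnorm(15, mean = 1, sd = 0.5), rnorm(85, mean = 0, sd = 0.5))
in_set <- rep(c(TRUE, FALSE), times = c(15, 85))

# KS distance between the set and the remaining lipids.
ks_stat <- function(member) {
  unname(ks.test(logFC[member], logFC[!member])$statistic)
}

d_obs  <- ks_stat(in_set)
n_perm <- 200L

# Null distribution: shuffle set membership and recompute the statistic.
d_null <- replicate(n_perm, ks_stat(sample(in_set)))

# Add-one estimate so the permutation p-value is never exactly zero.
p_perm <- (1 + sum(d_null >= d_obs)) / (1 + n_perm)
p_perm
```

With this estimator the smallest attainable p-value is 1 / (n_perm + 1), which is one reason a larger n_perm is needed to resolve small p-values.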
Export easyLSEA results to disk
Description
Saves the contents of an easyLSEA result object to a
timestamped output folder. Supported formats: CSV tables, a multi-sheet
Excel workbook, PDF or PNG plots, and a standalone HTML report.
Any combination of formats can be requested in a single call.
Usage
export_lsea(
  result,
  dir,
  prefix = "easyLSEA",
  format = c("csv", "excel", "pdf"),
  overwrite = FALSE,
  plot_width = NULL,
  plot_height = NULL,
  plot_dpi = 300L,
  verbose = TRUE
)
Arguments
result |
An easyLSEA_result object. |
dir |
Character(1). Base directory where the output folder will be
created. Required: there is no default, so the function never writes
to the working directory, the package directory, or the user's home
filespace unless the caller explicitly provides a location. For
examples, tests, or throwaway output, pass tempdir(). |
prefix |
Character(1). Prefix for the output folder name. The folder
is named <prefix>_<YYYY-MM-DD>. Default: "easyLSEA". |
format |
Character vector. One or more of |
overwrite |
Logical(1). If |
plot_width |
Numeric(1) or |
plot_height |
Numeric(1) or |
plot_dpi |
Integer(1). Resolution for PNG output. Default: 300L. |
verbose |
Logical(1). Print progress messages. Default: TRUE. |
Details
Output folder structure
<prefix>_<YYYY-MM-DD>/
  tables/
    lsea_results_ks.csv
    lsea_results_fgsea.csv
    lsea_combined.csv
    chain_results.csv
    chain_parsed.csv
    chain_wide.csv
  plots/
    lsea/
      bubble_ks.pdf
      bubble_fgsea.pdf
    chains/
      tile/
        tile_PC.pdf
        tile_TG.pdf ...
      trend/
        trend_length_PC.pdf
        trend_unsat_PC.pdf ...
  results.xlsx
  report.html
Dependencies for optional formats
Excel export requires openxlsx (install.packages("openxlsx")).
HTML export requires rmarkdown and knitr.
Value
Invisibly returns a named character vector of all file paths created. Useful for programmatic use or verification.
See Also
easyLSEA, run_lsea,
parse_lipid_chains
Examples
data("lipid_example", package = "easyLSEA")
result <- suppressWarnings(suppressMessages(easyLSEA(
  data = lipid_example,
  engine = "ks",
  n_perm = 100L,
  plots = FALSE,
  verbose = FALSE
)))
# Export CSV and PDF to a temporary folder
paths <- export_lsea(result, dir = tempdir(), format = c("csv", "pdf"))
paths
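To make the dated folder naming concrete, the sketch below builds a <prefix>_<YYYY-MM-DD> directory under tempdir() with base R and writes a single toy table into a tables/ subfolder. It only mimics the documented layout; export_lsea() does this internally, and the toy_table contents are invented for the illustration.

```r
prefix  <- "easyLSEA"
base    <- tempdir()  # throwaway location, as recommended for examples

# Folder name: prefix plus the current date in ISO format.
out_dir <- file.path(base, paste0(prefix, "_", format(Sys.Date(), "%Y-%m-%d")))

tables_dir <- file.path(out_dir, "tables")
dir.create(tables_dir, recursive = TRUE, showWarnings = FALSE)

# Invented two-row table standing in for real LSEA results.
toy_table <- data.frame(Group = c("PC", "TG"), FDR = c(0.01, 0.20))
csv_path  <- file.path(tables_dir, "lsea_results_ks.csv")
write.csv(toy_table, csv_path, row.names = FALSE)

list.files(out_dir, recursive = TRUE)
```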
Example lipidomics dataset
Description
A synthetic dataset of 200 lipid species simulating a case vs control lipidomics comparison, with known enrichment patterns built in: PC and PE species are enriched in the case group, TG species are depleted. Used in package examples and tests.
Usage
lipid_example
Format
A data.frame with 200 rows and 6 columns:
- LipidName
Character. Lipid identifier in shorthand notation (e.g. "PC 36:2").
- LipidClass
Character. Pre-assigned lipid class abbreviation.
- logFC
Numeric. Log2 fold change (case / control).
- P.Value
Numeric. Raw p-value from simulated differential analysis.
- adj.P.Val
Numeric. Benjamini-Hochberg adjusted p-value.
- sig
Integer. 1 if adj.P.Val < 0.05 and |logFC| > log2(1.25), 0 otherwise.
Source
Simulated data. See data-raw/lipid_example.R for the
generation script. Seed: 2026.
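The sig column combines an FDR cutoff with a minimum fold change. A minimal base-R sketch of that rule on five invented rows (not taken from lipid_example) could look like this:

```r
# Invented values; the last row has a small p-value but a small fold change.
toy <- data.frame(
  logFC   = c(1.2, -0.8, 0.1, -0.2, 0.2),
  P.Value = c(0.001, 0.004, 0.20, 0.60, 0.02)
)

# Benjamini-Hochberg adjustment, then the documented two-part rule.
toy$adj.P.Val <- p.adjust(toy$P.Value, method = "BH")
toy$sig <- as.integer(toy$adj.P.Val < 0.05 & abs(toy$logFC) > log2(1.25))
toy
```

The fifth row passes the FDR cutoff but not the fold-change cutoff (log2(1.25) is about 0.32), so it is flagged 0.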
Parse acyl chain composition from a lipidomics data.frame
Description
Applies biology-aware chain parsing to each lipid in data,
routing each species to the appropriate parser based on its lipid class:
sn-2 (PC, PE, PE O), N-acyl (SM, Cer, HexCer, GlcCer, Hex2Cer, Hex3Cer),
long-format (TG, DG, PS, PG, PA, PI, CL),
single-chain (LPC, LPE, LPI, LPG, LPA, LPS, CAR, FFA, FA, CE), or excluded.
Usage
parse_lipid_chains(
  data,
  lipid_col = "LipidName",
  class_col = "LipidClass",
  shorthand_col = "Shorthand",
  rank_col = "Confidence_rank",
  min_rank = "E",
  cls_config = default_chain_config()
)
Arguments
data |
A |
lipid_col |
Character(1). Name of the lipid identifier column.
Default: "LipidName". |
class_col |
Character(1). Name of the lipid class column (must contain
abbreviated class names such as "PC", "TG", "SM"). Default:
"LipidClass". |
shorthand_col |
Character(1) or |
rank_col |
Character(1) or |
min_rank |
Character(1). Minimum confidence rank to include in
analysis. Ranks are ordered |
cls_config |
Named list from default_chain_config(). |
Value
A named list with two elements:
- parsed
Long-format data.frame with one row per chain observation. Contains all columns from data plus chain fields (analysis_chain_cl, analysis_chain_cs, chain_type, etc.).
- summary
Per-lipid parsing log data.frame with columns LipidName, LipidClass, Confidence_rank, status, and chain_type.
See Also
default_chain_config, plot_chains()
Examples
data("lipid_example", package = "easyLSEA")
annotated <- annotate_lipids(lipid_example)
chains <- parse_lipid_chains(annotated)
head(chains$parsed)
head(chains$summary)
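The class-based routing described above can be pictured as a named list of class vectors plus a lookup. The sketch below is an assumption-laden illustration: cfg is a hypothetical stand-in whose element names follow default_chain_config() and whose class lists are copied from the Description, but the real configuration may be structured differently, and route_class() is not a package function.

```r
# Hypothetical stand-in for the configuration list.
cfg <- list(
  sn2    = c("PC", "PE", "PE O"),
  nacyl  = c("SM", "Cer", "HexCer", "GlcCer", "Hex2Cer", "Hex3Cer"),
  long   = c("TG", "DG", "PS", "PG", "PA", "PI", "CL"),
  single = c("LPC", "LPE", "LPI", "LPG", "LPA", "LPS", "CAR", "FFA", "FA", "CE"),
  excl   = character(0)
)

# Override individual entries: move PS from the long-format to the sn-2 group.
cfg <- modifyList(cfg, list(
  sn2  = c(cfg$sn2, "PS"),
  long = setdiff(cfg$long, "PS")
))

# Hypothetical router: first group that lists the class, else "excl".
route_class <- function(cls, config) {
  hit <- names(config)[vapply(config, function(members) cls %in% members, logical(1))]
  if (length(hit) == 0L) "excl" else hit[1]
}

vapply(c("PC", "PS", "TG", "XYZ"), route_class, character(1), config = cfg)
```

Building the override with modifyList() keeps the untouched entries intact, which matches the documented intent of changing individual entries rather than rewriting the whole list.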
Generate chain analysis plots
Description
Produces tile and trend plots for each lipid class with sufficient
chain observations. Returns a named list of ggplot
objects; does not write files. Use export_lsea() to save.
Usage
plot_chains(
  chains_result,
  case_lbl = "Case",
  ref_lbl = "Reference",
  fdr_thresh = 0.05,
  min_n_tile = 4L,
  min_n_trend = 5L,
  smooth_method = c("loess", "lm"),
  smooth_span = 0.75,
  smooth_weighted = TRUE,
  smooth_se = TRUE,
  show_points = TRUE,
  tile_label = c("both", "n", "sig", "none"),
  trend_test = c("spearman", "lm", "none"),
  trend_x_step_length = 2L,
  trend_x_step_unsat = 1L
)
Arguments
chains_result |
Named list returned by parse_lipid_chains(). |
case_lbl |
Character(1). Label for the case group. Default:
"Case". |
ref_lbl |
Character(1). Label for the reference group. Default:
"Reference". |
fdr_thresh |
Numeric(1). FDR threshold to colour individual lipid
points in trend plots (red = FDR sig, grey = NS) and to label
significant counts in tile cells. Default: 0.05. |
min_n_tile |
Integer(1). Minimum chain observations per class to
produce a tile plot. Default: 4L. |
min_n_trend |
Integer(1). Minimum chain observations per class to
produce trend plots. Default: 5L. |
smooth_method |
Character(1). Smoothing method for trend plots:
"loess" or "lm". |
smooth_span |
Numeric(1). Span for loess smoothing (only used when
smooth_method = "loess"). Default: 0.75. |
smooth_weighted |
Logical(1). If |
smooth_se |
Logical(1). Whether to display the 95% confidence
interval ribbon around the smoothing curve. Default: TRUE. |
show_points |
Logical(1). Whether to display individual lipid points
in trend plots, coloured by FDR significance. Default: TRUE. |
tile_label |
Character(1). What to display inside each tile cell:
"both", "n", "sig", or "none". |
trend_test |
Character(1). Statistical test to annotate on trend plots:
"spearman", "lm", or "none". |
trend_x_step_length |
Integer(1) or |
trend_x_step_unsat |
Integer(1) or |
Value
Named list of ggplot objects with elements
tile_<CLASS>, trend_length_<CLASS>,
trend_unsat_<CLASS>.
See Also
parse_lipid_chains, export_lsea()
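The statistics behind a trend plot can be sketched without any plotting: a Spearman rank correlation between chain length and logFC, and a loess curve weighted by abundance. The data below are simulated, the abundance column is an assumed stand-in for whatever weights the package uses, and the plots themselves are drawn by the package with ggplot2 rather than by this code.

```r
set.seed(42)  # arbitrary seed, for a reproducible illustration only

# Simulated class: logFC rises with total chain length, plus noise.
chain_length <- rep(seq(30, 44, by = 2), each = 4)
logFC        <- 0.1 * (chain_length - 36) + rnorm(length(chain_length), sd = 0.3)
abundance    <- runif(length(chain_length), min = 1, max = 100)  # assumed weights

# Rank-based trend test; exact = FALSE because chain lengths are tied.
trend <- cor.test(chain_length, logFC, method = "spearman", exact = FALSE)
trend$estimate
trend$p.value

# Abundance-weighted loess with the documented default span.
fit  <- loess(logFC ~ chain_length, weights = abundance, span = 0.75)
pred <- predict(fit, newdata = data.frame(chain_length = c(34, 40)))
pred
```

Spearman correlation only assumes a monotonic relationship, so it stays valid when the smoothed curve is clearly non-linear.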
Distribution enrichment boxplot per lipid set
Description
Produces a boxplot of logFC distributions for each lipid set, with jittered
individual lipid points, FDR/DS/NES labels for significant sets, and
red borders for significant sets. When engine = "both" (KS + fgsea),
fill colour encodes convergence (KS only, fgsea only, or KS+fgsea).
Usage
plot_distribution(
  data,
  lsea_result,
  group_col,
  fc_col = "logFC",
  case_lbl = "Case",
  ref_lbl = "Control",
  fdr_thresh = 0.05,
  min_n = 3L,
  sig_only = FALSE,
  label_angle = 0
)
Arguments
data |
A |
lsea_result |
A named list as returned by run_lsea(). |
group_col |
Character(1). Grouping column name
(e.g. |
fc_col |
Character(1). Column with log fold-change values.
Default: "logFC". |
case_lbl |
Character(1). Label for the case group. Default:
"Case". |
ref_lbl |
Character(1). Label for the reference group. Default:
"Control". |
fdr_thresh |
Numeric(1). FDR threshold for significance.
Default: 0.05. |
min_n |
Integer(1). Minimum number of lipids per set to include.
Default: 3L. |
sig_only |
Logical(1). If |
label_angle |
Numeric(1). Angle for FDR labels. Default: 0. |
Value
A ggplot object, or NULL if no groups pass
min_n.
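The convergence encoding (KS only, fgsea only, or KS+fgsea) amounts to comparing the significance calls of the two engines for each set. A minimal sketch with two invented result tables follows; the column names FDR_ks and FDR_fgsea and the "NS" label are assumptions made for the illustration, not the package's actual column names.

```r
# Invented per-engine results for four lipid classes.
ks <- data.frame(Group = c("PC", "PE", "TG", "SM"), Level = "LipidClass",
                 FDR_ks = c(0.001, 0.20, 0.03, 0.70))
fg <- data.frame(Group = c("PC", "PE", "TG", "SM"), Level = "LipidClass",
                 FDR_fgsea = c(0.004, 0.01, 0.30, 0.90))

# Merge by set and level, then label which engines call the set significant.
combined <- merge(ks, fg, by = c("Group", "Level"))
sig_ks   <- combined$FDR_ks < 0.05
sig_fg   <- combined$FDR_fgsea < 0.05

combined$Convergence <- ifelse(
  sig_ks & sig_fg, "KS+fgsea",
  ifelse(sig_ks, "KS only", ifelse(sig_fg, "fgsea only", "NS"))
)
combined
```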
Generate LSEA enrichment plots
Description
Produces bubble, barplot, and running sum plots from a run_lsea()
result. Returns a named list of ggplot objects.
Usage
plot_lsea(
  lsea_result,
  which = c("bubble_ks", "bubble_fgsea", "bubble_combined", "barplot", "running_sum"),
  fdr_thresh = 0.05,
  case_lbl = "Case",
  ref_lbl = "Reference",
  bubble_label = c("FDR", "DS", "NES", "n")
)
Arguments
lsea_result |
Named list returned by run_lsea(). |
which |
Character vector. Which plots to generate: any subset of
"bubble_ks", "bubble_fgsea", "bubble_combined", "barplot", and "running_sum". |
fdr_thresh |
Numeric(1). Significance threshold for highlighting.
Default: 0.05. |
case_lbl |
Character(1). Case label for plot annotations. |
ref_lbl |
Character(1). Reference label for plot annotations. |
bubble_label |
Character vector. Which statistics to display next to
each bubble. Any subset of "FDR", "DS", "NES", and "n". |
Value
Named list of ggplot objects.
See Also
run_lsea, export_lsea()
Print method for easyLSEA_result
Description
Print method for easyLSEA_result
Usage
## S3 method for class 'easyLSEA_result'
print(x, ...)
Arguments
x |
An easyLSEA_result object. |
... |
Ignored. |
Value
Invisibly returns the input easyLSEA_result object
(x). Called for its side effect of printing a formatted
summary of the enrichment results to the console.
Lipid Set Enrichment Analysis
Description
Runs KS-based LSEA, fgsea, or both for each grouping level in
group_cols and returns a tidy data.frame with enrichment
statistics.
Usage
run_lsea(
  data,
  group_cols = c("LipidClass", "LipidCategory_LMAPS", "LipidCategory_functional"),
  fc_col = "logFC",
  pval_col = "P.Value",
  lipid_id_col = NULL,
  case_lbl = "Case",
  ref_lbl = "Reference",
  engine = c("both", "ks", "fgsea"),
  fgsea_rank = c("pi_value", "logFC", "t_stat"),
  min_n = 3L,
  n_perm = 2000L,
  fgsea_nperm = 10000L,
  fgsea_eps = 0,
  seed = 42L,
  verbose = TRUE
)
Arguments
data |
A |
group_cols |
Character vector. Names of grouping columns to test.
Each column defines one level of analysis (e.g. class, LIPID MAPS
category, functional category). Default:
c("LipidClass", "LipidCategory_LMAPS", "LipidCategory_functional"). |
fc_col |
Character(1). Log2 fold-change column. Default:
"logFC". |
pval_col |
Character(1) or |
lipid_id_col |
Character(1) or |
case_lbl |
Character(1). Case group label. Default: "Case". |
ref_lbl |
Character(1). Reference group label. Default:
"Reference". |
engine |
Character(1). Enrichment engine: "both", "ks", or "fgsea". |
fgsea_rank |
Character(1). Rank metric for fgsea: "pi_value", "logFC", or "t_stat". |
min_n |
Integer(1). Minimum set size to test. Default: 3L. |
n_perm |
Integer(1). KS permutations for |
fgsea_nperm |
Integer(1). fgsea Monte Carlo permutations.
Default: 10000L. |
fgsea_eps |
Numeric(1). fgsea eps, the lower bound for estimated p-values;
0 removes the bound so that very small p-values are estimated more
accurately. Default: 0. |
seed |
Integer(1) or |
verbose |
Logical(1). Print progress messages. Default: TRUE. |
Value
A named list with elements:
- ks
data.frame of KS results (or NULL if engine = "fgsea").
- fgsea
data.frame of fgsea results (or NULL if engine = "ks" or fgsea is not installed).
- combined
data.frame merging both engines by Group and Level, including a Convergence column.
References
Korotkevich G, Sukhov V, Budin N, Shpak B, Artyomov MN, Sergushichev A (2021). Fast gene set enrichment analysis. bioRxiv. doi:10.1101/060012
Xiao Y, Hsiao TH, Suresh U, Chen HI, Wu X, Wolf SE, Chen Y (2014). A novel significance score for gene selection and ranking. Bioinformatics, 30(6), 801–807. doi:10.1093/bioinformatics/btr671
See Also
annotate_lipids(), plot_lsea,
export_lsea()
Examples
data("lipid_example", package = "easyLSEA")
annotated <- annotate_lipids(lipid_example)
result <- run_lsea(
  data = annotated,
  fc_col = "logFC",
  engine = "ks",
  case_lbl = "NASH",
  ref_lbl = "Control",
  n_perm = 100L
)
head(result$ks)
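For readers who want to see the two ingredients in isolation, the sketch below runs a plain two-sample KS test per lipid class and computes the pi-value of Xiao et al. (2014), i.e. the fold change scaled by -log10(p). It uses base R on simulated data and is not the package's code: run_lsea() may compute its statistics, permutation p-values, and the "pi_value" rank metric differently.

```r
set.seed(42)  # arbitrary seed, for a reproducible illustration only

# Simulated input: PC shifted up, TG shifted down, SM unchanged.
toy <- data.frame(
  LipidClass = rep(c("PC", "TG", "SM"), each = 40),
  logFC      = c(rnorm(40, 0.8, 0.5), rnorm(40, -0.8, 0.5), rnorm(40, 0, 0.5))
)
toy$P.Value <- 2 * pnorm(-abs(toy$logFC / 0.5))  # toy two-sided p-values

# Pi-value: fold change scaled by -log10(p), keeping the sign of logFC.
toy$pi_value <- toy$logFC * -log10(toy$P.Value)

# Two-sample KS test: logFC inside the set versus all other lipids.
ks_one <- function(set) {
  in_set <- toy$LipidClass == set
  test   <- ks.test(toy$logFC[in_set], toy$logFC[!in_set])
  data.frame(Group = set, n = sum(in_set),
             D = unname(test$statistic), p = test$p.value)
}

res <- do.call(rbind, lapply(unique(toy$LipidClass), ks_one))
res$FDR <- p.adjust(res$p, method = "BH")
res
```

The KS statistic is unsigned, so a direction for each set has to come from a separate quantity, for example the sign of the median logFC within the set.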
Summary method for easyLSEA_result
Description
Summary method for easyLSEA_result
Usage
## S3 method for class 'easyLSEA_result'
summary(object, padj_cutoff = 0.05, ...)
Arguments
object |
An easyLSEA_result object. |
padj_cutoff |
Numeric(1). FDR threshold for significant sets.
Default: 0.05. |
... |
Ignored. |
Value
Invisibly returns the input easyLSEA_result object
(object). Called for its side effect of printing a summary
table of the significant lipid sets to the console.