% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/build_XY_from_peaks.R
\name{build_XY_from_peaks}
\alias{build_XY_from_peaks}
\title{Build design matrix X and response Y from peak intensities}
\usage{
build_XY_from_peaks(
  peaks,
  labels,
  normalize = c("max", "none"),
  sparse = FALSE,
  name_cols = FALSE,
  name_digits = 4
)
}
\arguments{
\item{peaks}{Peak data from which to build the design matrix X. Either:
\itemize{
\item a numeric matrix/data.frame of intensities with rows = samples and columns = peaks, or
\item a list of per-sample numeric vectors (or small tables) aligned to a common set of peaks
(e.g., same names or column order).
}
Values are assumed non-negative; NAs are allowed and are ignored when computing per-sample maxima.}

\item{labels}{Outcome/response labels used to build Y. A vector or factor with one entry per sample
(length must equal nrow(peaks) for matrix/data.frame input, or length(peaks) for list input).}

\item{normalize}{Normalization to apply to each sample's peak intensities before constructing X.
One of "max" or "none" (matched via match.arg). "max" divides each sample's intensities by
that sample's maximum non-NA intensity; "none" applies no scaling.}

\item{sparse}{Logical; if TRUE, return X as a sparse Matrix::dgCMatrix. If FALSE, return a base R
dense matrix. Default is FALSE.}

\item{name_cols}{Logical; if TRUE, set column names of X from the m/z values
returned by MALDIquant::intensityMatrix (formatted as "mz_<mz>"). Default is FALSE.
This only applies when peaks is a list of MALDIquant::MassPeaks (or when attr(X, "mass")
is available); otherwise it is ignored. Enabling this can add noticeable overhead for
very wide matrices.}

\item{name_digits}{Integer scalar; number of decimal digits to use when formatting
m/z values into column names if name_cols = TRUE. Default is 4. Must be a
non-negative integer. Ignored when name_cols = FALSE or when no m/z vector is
available (i.e., for plain matrix/data.frame or generic list inputs).}
}
\value{
A list with:
\itemize{
\item X: numeric matrix or Matrix::dgCMatrix of dimension n_samples x n_peaks
\item Y: response vector or factor aligned to the rows of X (either as supplied or as coerced by the function)
}
}
\description{
Constructs a sample-by-peak design matrix (X) and an outcome vector/factor (Y) from
peak-intensity input. Accepts either a numeric matrix/data.frame (rows = samples,
columns = peaks) or a list of aligned per-sample peak vectors. Optionally applies
per-sample max normalization and can return X as a sparse dgCMatrix for memory efficiency.
}
\examples{
data("CitrobacterRKIspectra", "CitrobacterRKImetadata", package = "MSclassifR")

# Pre-process the raw spectra, then detect peaks per spectrum (no averaging)
spectra <- MSclassifR::SignalProcessing(CitrobacterRKIspectra)
peaks <- MSclassifR::PeakDetection(x = spectra, averageMassSpec = FALSE)

labels <- CitrobacterRKImetadata$Species   # adjust to your label column

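# Build a max-normalized sparse design matrix X and the aligned response Y
xy <- build_XY_from_peaks(peaks, labels, normalize = "max", sparse = TRUE)

dim(xy$X)                      # n_samples x n_peaks
length(xy$Y) == nrow(xy$X)     # Y has one entry per row of X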
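
# Illustrative sketch only, NOT the function's internals: per-sample "max"
# scaling as described for normalize = "max", shown in base R on a
# hypothetical toy matrix (the names toy, row_max and toy_scaled are
# assumptions made for this example).
toy <- matrix(c(2, 4, NA,
                5, 0, 10),
              nrow = 2, byrow = TRUE)          # rows = samples, columns = peaks
row_max <- apply(toy, 1, max, na.rm = TRUE)   # NAs are ignored when taking maxima
toy_scaled <- toy / row_max                    # divide each row by its own maximum
toy_scaled                                     # NA entries stay NA after scaling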

}
