dget package

Submodules

dget.adduct module

Class for adduct calculations.

class dget.adduct.Adduct(base: Formula, adduct: str)

Bases: object

Class used to create a molmass.Formula from a base molmass.Formula and an adduct string. This string should be in the format [nM+nX-nY]n+ where M is the base molecule and X, Y are gains / losses. Some valid examples are:

  • [M]+

  • [M-H]-

  • [M+Na]+

  • [2M-H]-

  • [M+2H]+

  • [M+K-2H]-

adduct

adduct string in the form [nM+nX-nY]n+

base

formula of the base molecule, represented by M in adduct

num_base

number of base molecules in adduct

formula

formula of the adduct

property composition: Composition

The composition of the adduct.

static is_valid_adduct(adduct: str) bool

Test to see if adduct string is valid.

Tests string against Adduct.regex and makes sure any +/- adducts match Adduct.regex_split.

Parameters:

adduct – adduct string in the form [nM+nX-nY]n+

Returns:

True if valid
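
The adduct format can be illustrated with the two patterns listed for Adduct.regex and Adduct.regex_split. The following is a minimal sketch using only the standard library; parse_adduct is a hypothetical helper written for illustration and is not part of dget, and the way the charge is derived here is an assumption.

```python
import re

# Patterns copied from the documented Adduct.regex and Adduct.regex_split.
ADDUCT_RE = re.compile(r"\[(\d*)M(.*)\](\d+)?([+-])")
SPLIT_RE = re.compile(r"([+-])(\d*)(\w+)")


def parse_adduct(adduct):
    """Hypothetical helper: split an adduct string into its parts."""
    match = ADDUCT_RE.fullmatch(adduct)
    if match is None:
        raise ValueError(f"invalid adduct string: {adduct!r}")
    # A missing count means one base molecule / one unit of charge.
    num_base = int(match.group(1) or 1)
    charge = int(match.group(3) or 1)
    if match.group(4) == "-":
        charge = -charge
    # Each gain / loss becomes (sign, count, formula), e.g. ('+', 1, 'Na').
    changes = [
        (sign, int(count or 1), formula)
        for sign, count, formula in SPLIT_RE.findall(match.group(2))
    ]
    return num_base, changes, charge


print(parse_adduct("[M+K-2H]-"))
print(parse_adduct("[2M-H]-"))
```

The sketch only splits the string; building the resulting molmass.Formula is left to Adduct itself.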

mz_range(min_fraction: float = 0.0) Tuple[float, float]

Return the spectrum mz range.

regex = re.compile('\\[(\\d*)M(.*)\\](\\d+)?([+-])')

regex_split = re.compile('([+-])(\\d*)(\\w+)')

property spectrum: Spectrum

The spectrum of the adduct.

dget.convolve module

Convolution implementations.

Deconvolution is used by DGet to recover the original deuteration pattern from a given mass spectrum.

dget.convolve.deconvolve(x: ndarray, psf: ndarray) Tuple[ndarray, ndarray]

Inverse of convolution.

Deconvolution is performed in the frequency domain.

Parameters:
  • x – array

  • psf – point spread function

Returns:

recovered data, remainder

Notes

Based on https://rosettacode.org/wiki/Deconvolution/1D
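
Frequency-domain deconvolution can be sketched with NumPy as below. This is an illustration of the general technique, not the dget implementation: fft_deconvolve is a hypothetical name, and the sketch assumes x is a noise-free full linear convolution of the unknown data with psf and that the transform of psf has no zeros.

```python
import numpy as np


def fft_deconvolve(x, psf):
    """Hypothetical sketch: recover h from x = convolve(h, psf)."""
    n = x.size
    out_len = n - psf.size + 1
    # Zero-pad both inputs to the length of x, then divide the spectra.
    # Assumption: the transform of psf is nonzero at every frequency.
    recovered = np.fft.irfft(np.fft.rfft(x, n) / np.fft.rfft(psf, n), n)[:out_len]
    # The remainder is whatever the recovered data fails to explain.
    remainder = x - np.convolve(recovered, psf)
    return recovered, remainder


pattern = np.array([0.1, 0.3, 0.6])   # made-up deuteration pattern
psf = np.array([0.8, 0.15, 0.05])     # made-up isotope envelope
recovered, remainder = fft_deconvolve(np.convolve(pattern, psf), psf)
print(recovered, remainder)
```

With real, noisy data the division can amplify noise, so the remainder is a useful check on the quality of the fit.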

dget.dget module

Class for deuteration calculations.

class dget.dget.DGet(deuterated_formula: str | Formula, tofdata: str | Path | TextIO | Tuple[ndarray, ndarray], adduct: str = '[M]+', cutoff: float | str | None = None, signal_mass_width: float = 0.33, signal_mode: str = 'peak height', loadtxt_kws: dict | None = None)

Bases: object

Deuteration calculation class.

This class contains functions for calculating deuteration from a molecular formula and mass spectra.

The lowest deuteration state to include in the calculation can be selected using the cutoff argument. This accepts floats to specify an m/z or a string in the format ‘D<int>’ to specify the lowest state. By default the lowest state will be the first where 2 consecutive states are < 1% and the accumulated probability is > 10%.

Signals are read from the data according to signal_mode: ‘peak area’ integrates the signal_mass_width region around each m/z, while ‘peak height’ and ‘raw’ select the highest peak within this region. If ‘raw’ is selected, no deconvolution is performed.

Mass spectra files are expected to be a delimited text file with at least 2 columns, one for mass and one for signals. Specify columns using the keyword ‘usecols’ in loadtxt_kws, a (zero indexed) tuple of ints for (mass, signal) columns. The delimiter can be specified using the ‘delimiter’ keyword. Mass spectra can also be passed as a tuple of numpy arrays, (masses, signals).
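
The ‘delimiter’ and ‘usecols’ keywords behave as they do in numpy.loadtxt. The sketch below reads a small in-memory file with the documented default keywords; the data values are made up, and passing unpack=True is an assumption made here to obtain separate arrays, not a statement about how DGet calls loadtxt.

```python
import io

import numpy as np

# Made-up, comma-delimited data: column 0 is mass, column 1 is signal.
text = io.StringIO("100.0,5.0\n101.0,20.0\n102.0,75.0\n")

# Same keys as the documented loadtxt_kws default.
loadtxt_kws = {"delimiter": ",", "usecols": (0, 1)}

# unpack=True returns one array per selected column.
masses, signals = np.loadtxt(text, unpack=True, **loadtxt_kws)
print(masses, signals)
```

The resulting (masses, signals) tuple has the same shape as the tuple form accepted for tofdata.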

deuterated_formula

formula of fully deuterated molecule

tofdata

path to mass spectra text file, or tuple of masses, signals

adduct

form of adduct ion, see dget.adduct

cutoff

cutoff for calculation as an m/z ‘123.4’ or state ‘D<int>’

signal_mass_width

range around each m/z to search for maxima or integrate

signal_mode

detection mode, one of ‘peak area’, ‘peak height’, ‘raw’

loadtxt_kws

parameters passed to numpy.loadtxt, defaults to {‘delimiter’: ‘,’, ‘usecols’: (0, 1)}

align_tof_with_spectra(alignment_mz: float | None = None) float

Shifts ToF data to better align with monoisotopic m/z.

Please calibrate your MS instead of using this.

Parameters:

alignment_mz – m/z used for alignment, defaults to monoisotopic m/z

Returns:

offset used for alignment

property base_name: str

The name of the base formula, with D instead of [2H].

common_adducts = ['[M]+', '[M+H]+', '[M+Na]+', '[M+H2]2+', '[2M+H]+', '[M-H]-', '[2M-H]-', '[M-H2]2-', '[M+Cl]-', '[M-H3O]-']

property deuteration: float

The deuteration of the base molecule.

Deuteration is calculated as the fraction of deuterium sites in the molecular formula that have been deuterated successfully. For example: 60% C2H5D1, 40% C2H6 would give a deuteration of 0.6.

Deuteration is only calculated for the states above the deuteration cutoff.
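
The worked example above can be reproduced with a weighted mean over the deuteration states. This is a minimal sketch, assuming deuteration is the probability-weighted mean number of D atoms divided by the number of deuterium sites; deuteration_from_probabilities is a hypothetical helper and the cutoff handling is not shown.

```python
import numpy as np


def deuteration_from_probabilities(probabilities):
    """Hypothetical helper: probabilities are ordered D0, D1, ..., DN."""
    probs = np.asarray(probabilities, dtype=float)
    n_sites = probs.size - 1
    if n_sites < 1 or probs.sum() <= 0.0:
        raise ValueError("need at least two states with non-zero total probability")
    states = np.arange(n_sites + 1)
    # Mean number of D atoms, divided by the number of available sites.
    return float(np.sum(states * probs) / (probs.sum() * n_sites))


# 40% C2H6 (D0) and 60% C2H5D1 (D1), one deuterium site.
print(deuteration_from_probabilities([0.4, 0.6]))
```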

property deuteration_probabilities: ndarray

The deuteration fraction of each possible deuteration.

Probabilities are listed in order of D=0 to N, where N is the number of deuterium atoms in the original molecular formula. Probabilities will sum to 1.0.

property deuteration_states: ndarray

Indexes of the valid deuteration states.

Valid states are those Dx-Dn, where n is the number of deuterium atoms in the base molecule and x is inferred from self.deuteration_cutoff if defined, or otherwise from the last 2 consecutive probabilities that are < 1% with a cumulative probability of at least 10%.

property deuterium_count: int

The number of deuterium atoms in the adduct.

property formula: Formula

The adduct formula.

guess_adduct_from_base_peak(adducts: List[Formula] | None = None) Tuple[Adduct, float]

Search for the adduct with the highest intensity.

If multiple adducts have the maximum intensity then the adduct with the monoisotopic mass closest to the local base peak is returned. This function will work best with highly deuterated samples.

Parameters:

adducts – adducts to try, defaults to DGet.common_adducts

Returns:

best adduct, mass difference from the adduct's base peak

min_fraction_for_spectra = 0.001

plot_predicted_spectra(ax: matplotlib.axes.Axes, mass_range: Tuple[float, float] | str = 'targets') None

Plot spectra over mass spectra on ax.

mass_range can be passed as a tuple of floats (start m/z, end m/z), ‘full’ to plot the entire mass range or ‘targets’ to plot the region around the predicted spectra.

Parameters:
  • ax – matplotlib axes to plot on

  • mass_range – range to plot

print_results(file: TextIO | None = None) None

Print results.

Parameters:

file – file to print to, or sys.stdout if None

property psf: ndarray

The point spread function used for (de)convolution.

This is the normalised spectrum of the adduct.

property residual_error: float | None

The normalised (0.0 - 1.0) sum of deconvolution residuals.

A high residual error is indicative of a poor fit between the data and isotopic spectra. This can result from an incorrect formula or contaminants in the mass spectra.

spectra(**kwargs) Generator[Spectrum, None, None]

Spectra of all compounds, from non-deuterated to fully deuterated.

kwargs are passed to molmass.Formula.spectrum()

property spectrum: Spectrum

The adduct spectrum.

subtract_baseline(mass_range: Tuple[float, float] | None = None, percentile: float = 25.0) float

Subtracts baseline of region.

Calculates the given percentile of the designated mass region and subtracts it from the mass spec signals.

Parameters:
  • mass_range – region to find baseline

  • percentile – percentile to use

Returns:

baseline amount subtracted from the signals
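
Percentile-based baseline subtraction can be sketched as follows. This is an illustration under assumptions, not the dget code: subtract_percentile_baseline is a hypothetical function, the mass range is treated as inclusive, and the whole spectrum is used when no range is given.

```python
import numpy as np


def subtract_percentile_baseline(masses, signals, mass_range=None, percentile=25.0):
    """Hypothetical sketch: returns (corrected signals, baseline)."""
    if mass_range is None:
        region = signals
    else:
        # Assumption: both ends of the range are inclusive.
        mask = (masses >= mass_range[0]) & (masses <= mass_range[1])
        region = signals[mask]
    baseline = float(np.percentile(region, percentile))
    return signals - baseline, baseline


masses = np.array([100.0, 100.5, 101.0, 101.5, 102.0])
signals = np.array([2.0, 2.0, 2.0, 2.0, 10.0])
corrected, baseline = subtract_percentile_baseline(masses, signals)
print(corrected, baseline)
```

A low percentile is used so that the estimate reflects the noise floor rather than the peaks.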

property target_masses: ndarray

The m/z of every possible spectrum.

A new spectrum is created by combining the spectra of every possible deuteration state.

property target_signals: ndarray

The signal for every m/z in the possible spectrum.

The signal_mass_width region around each of the target_masses is integrated or searched for the maximum peak height, depending on the current signal_mode.
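
The difference between the two ways of reading a signal can be sketched as below. read_signals is a hypothetical function, and treating the width as the total size of a window centred on each target m/z is an assumption; the data values are made up.

```python
import numpy as np


def read_signals(masses, signals, targets, mass_width=0.33, mode="peak height"):
    """Hypothetical sketch of reading one signal per target m/z."""
    out = np.zeros(len(targets))
    for i, target in enumerate(targets):
        # Assumption: window centred on the target, mass_width wide in total.
        window = np.abs(masses - target) <= mass_width / 2.0
        if not window.any():
            continue
        x, y = masses[window], signals[window]
        if mode == "peak area":
            # Trapezoidal integration over the window.
            out[i] = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))
        else:
            out[i] = y.max()
    return out


masses = np.array([99.9, 100.0, 100.1, 100.9, 101.0, 101.1])
signals = np.array([1.0, 4.0, 1.0, 2.0, 6.0, 2.0])
targets = np.array([100.0, 101.0])
heights = read_signals(masses, signals, targets, mode="peak height")
areas = read_signals(masses, signals, targets, mode="peak area")
print(heights, areas)
```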

dget.formula module

Module containing molmass helper functions.

dget.formula.divide_formulas(a: Formula, b: Formula) Tuple[int, Formula]

Divide Formula a by b. Returns the number of times b is in a and remainder.

Parameters:
  • a – numerator Formula

  • b – divisor Formula

Returns:

number of times b is in a, Formula of remainder
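
The idea of dividing one formula by another can be sketched with plain element-count dictionaries. This is only an illustration: divide_counts is a hypothetical helper, and the dictionaries stand in for molmass.Formula objects, which the real function operates on.

```python
def divide_counts(a, b):
    """Hypothetical sketch: a and b map element symbols to atom counts."""
    if not b:
        raise ValueError("divisor must contain at least one element")
    # b fits into a as many times as its scarcest element allows.
    times = min(a.get(element, 0) // count for element, count in b.items())
    remainder = {}
    for element, count in a.items():
        left = count - times * b.get(element, 0)
        if left > 0:
            remainder[element] = left
    return times, remainder


# C4H10O divided by C2H4 gives 2 with H2O left over.
print(divide_counts({"C": 4, "H": 10, "O": 1}, {"C": 2, "H": 4}))
```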

dget.formula.formula_in_formula(a: Formula, b: Formula) bool

Check if all atoms of a are in b.

Returns:

True if a in b

dget.formula.spectra_mz_spread(spectra: List[Spectrum], charge: int = 0) Spectrum

Calculate the m/z spread of the given spectra.

Each entry with the same unit mass is averaged, weighted by its relative intensity.

Parameters:

spectra – list of Spectrum to combine

Returns:

array of mean m/z values
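
The intensity-weighted averaging can be sketched with plain (m/z, intensity) pairs. This is a minimal illustration, not the dget code: weighted_mz_by_unit_mass is a hypothetical helper, the peak values are made up, and taking the unit mass as the m/z rounded to the nearest integer is an assumption.

```python
from collections import defaultdict


def weighted_mz_by_unit_mass(peaks):
    """Hypothetical sketch: peaks is an iterable of (mz, intensity) pairs."""
    sums = defaultdict(lambda: [0.0, 0.0])
    for mz, intensity in peaks:
        # Assumption: unit mass is the m/z rounded to the nearest integer.
        unit = round(mz)
        sums[unit][0] += mz * intensity
        sums[unit][1] += intensity
    # Weighted mean m/z for each unit mass with non-zero total intensity.
    return {
        unit: total / weight
        for unit, (total, weight) in sorted(sums.items())
        if weight > 0.0
    }


peaks = [(100.0, 1.0), (100.2, 3.0), (101.1, 2.0)]
spread = weighted_mz_by_unit_mass(peaks)
print(spread)
```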