Loading and Preparing Data
==========================

This guide shows how to load observational data into ``pgmuvi`` and prepare it
for fitting.

.. contents:: On this page
   :local:
   :depth: 2

Overview
--------

``pgmuvi`` expects data as three parallel arrays:

* **times**: observation epochs (any consistent time unit, e.g., days, MJD).
* **fluxes**: flux or magnitude measurements.
* **errors**: 1-σ uncertainties on the measurements.

All three arrays must have the same length. For multiband data, each array has
one row per observation across all bands (see :doc:`multiband`).

Creating a Lightcurve
----------------------

Pass the arrays directly to the constructor::

    import pgmuvi
    import numpy as np

    times = np.array([...])   # shape (N,)
    fluxes = np.array([...])  # shape (N,)
    errors = np.array([...])  # shape (N,)

    lc = pgmuvi.lightcurve.Lightcurve(times, fluxes, errors)

The data are stored internally as PyTorch tensors. You can retrieve them as
NumPy arrays via ``lc.xdata.cpu().numpy()``, etc.

Loading from a File
--------------------

**From a CSV file**

:meth:`~pgmuvi.lightcurve.Lightcurve.from_csv` reads a CSV file directly.
Column names are matched case-insensitively using common aliases, so in most
cases no extra arguments are required::

    import pgmuvi

    lc = pgmuvi.lightcurve.Lightcurve.from_csv("my_lightcurve.csv")

For multiband CSV files that include a numeric wavelength column, pass the
column name explicitly or let the method auto-detect it::

    # Explicit wavelength column
    lc = pgmuvi.lightcurve.Lightcurve.from_csv(
        "multiband.csv", wavelcol="wavelength_um"
    )

    # Or specify time and wavelength together
    lc = pgmuvi.lightcurve.Lightcurve.from_csv(
        "multiband.csv", xcol=["mjd", "wavelength_um"]
    )

If the CSV contains a **string band-identifier column** (e.g. ``band`` or
``filter`` with values like ``"V"``, ``"R"``), that column may be automatically
stored in :attr:`~pgmuvi.lightcurve.Lightcurve.band` for labelling purposes.
For **2-D (multiband) lightcurves** this happens automatically. For **1-D
lightcurves**, auto-population only occurs when the band-ID column contains
exactly one distinct non-empty label (matching the 1-D constructor contract);
if multiple distinct labels are present, ``band`` is left unset and a warning
is emitted. Note that these string labels are for human readability only; the
GP model requires a numeric wavelength in column 1 of ``xdata`` (see
:doc:`multiband`).

**From an Astropy-compatible format**

:meth:`~pgmuvi.lightcurve.Lightcurve.from_table` builds a light curve from an
:class:`astropy.table.Table` instance or any file format that Astropy can read
(FITS, VOTable, many ASCII dialects)::

    import pgmuvi

    lc = pgmuvi.lightcurve.Lightcurve.from_table("my_lightcurve.vot")

Example from an in-memory table::

    from astropy.table import Table
    import pgmuvi

    t = Table.read("my_lightcurve.fits")
    lc = pgmuvi.lightcurve.Lightcurve.from_table(t)

**From raw arrays**

For any other format, read the data manually and pass arrays directly::

    import numpy as np
    import pgmuvi

    # Assumes three numeric columns (time, flux, error) and no header row;
    # pass skiprows=1 to np.loadtxt if the file starts with a header line.
    data = np.loadtxt("my_lightcurve.csv", delimiter=",")
    lc = pgmuvi.lightcurve.Lightcurve(data[:, 0], data[:, 1], data[:, 2])

Adding More Observations
--------------------------

**Merging a new band into an existing multiband lightcurve**

:meth:`~pgmuvi.lightcurve.Lightcurve.merge` appends a new band to an existing
2-D light curve. The calling object must already be 2-D; 1-D inputs are
promoted automatically when a wavelength is supplied.
For 1-D inputs that have no ``band`` attribute set, you must also pass
``band=`` explicitly (otherwise a :exc:`ValueError` is raised)::

    # lc2d is an existing 2-D lightcurve; lc_new is a new single-band lc
    merged = lc2d.merge(lc_new, wavelength=0.80, band="I")  # 0.80 μm, band "I"

You can also merge directly from a CSV path::

    merged = lc2d.merge("new_band.csv", wavelength=0.80, band="I")

**Combining multiple lightcurves into one multiband object**

:meth:`~pgmuvi.lightcurve.Lightcurve.concat` is a class method that builds a
2-D light curve from a list of single-band (or already-multiband) objects.
Every 1-D input must carry both band information (either set at construction
time via ``band=`` or via :meth:`~pgmuvi.lightcurve.Lightcurve.from_csv`)
**and** a scalar wavelength value (``lc.wavelength``, ``lc.wave``, or
``lc.lambda_``); ``concat()`` raises a :exc:`ValueError` if either is missing::

    lc_V.band = "V"; lc_V.wavelength = 0.55
    lc_R.band = "R"; lc_R.wavelength = 0.64
    lc_I.band = "I"; lc_I.wavelength = 0.80

    combined = pgmuvi.lightcurve.Lightcurve.concat([lc_V, lc_R, lc_I])

Both methods accept ``on_conflict="skip"`` to drop duplicate bands and emit a
:exc:`UserWarning` rather than raising an error.

**Concatenating arrays before construction**

For simple cases where band information is not needed, concatenate the NumPy
arrays before constructing the :class:`~pgmuvi.lightcurve.Lightcurve`::

    import numpy as np
    import pgmuvi

    all_times = np.concatenate([times, new_times])
    all_fluxes = np.concatenate([fluxes, new_fluxes])
    all_errors = np.concatenate([errors, new_errors])

    lc = pgmuvi.lightcurve.Lightcurve(all_times, all_fluxes, all_errors)

.. note::

   For 2D / multiband data, ``xdata`` must have shape ``(N, 2)`` with column 0
   being time and column 1 being a numeric wavelength. See :doc:`multiband`.

Data Transformations
---------------------

GP optimisation can be sensitive to the scale of the input data.
``pgmuvi`` provides built-in transformations to rescale the time and flux axes:

.. list-table::
   :header-rows: 1
   :widths: 20 50

   * - Transform
     - Description
   * - ``'minmax'``
     - Rescale to [0, 1] using min and max.
   * - ``'zscore'``
     - Standardise to zero mean, unit variance.
   * - ``'robust_score'``
     - Standardise using median and MAD (median absolute deviation; robust
       to outliers).

Apply a transformation at construction time via the ``xtransform`` and
``ytransform`` keyword arguments::

    lc = pgmuvi.lightcurve.Lightcurve(
        times, fluxes, errors,
        xtransform="minmax",
        ytransform="zscore",
    )

The GP is trained in the transformed space, but all results and plots are
automatically inverse-transformed back to the original units.

.. _working-with-magnitudes:

Working with Magnitudes
------------------------

Native magnitude support is planned for a future release but is not currently
available. If your data are in magnitudes, convert them to (relative) flux
before constructing the :class:`~pgmuvi.lightcurve.Lightcurve`. A common
choice is:

.. math::

   f \propto 10^{-0.4\,m}

In code::

    import numpy as np
    import pgmuvi

    # mags and mag_errors are your input magnitudes and uncertainties
    fluxes = 10 ** (-0.4 * mags)
    errors = fluxes * np.log(10) * 0.4 * mag_errors

    lc = pgmuvi.lightcurve.Lightcurve(times, fluxes, errors)

Only relative variations matter for most ``pgmuvi`` analyses, so the overall
flux normalisation is arbitrary.

Checking Data Quality
----------------------

Before fitting, assess whether the observations are sufficient to detect the
variability timescales you are interested in::

    lc.assess_sampling_quality()

See :doc:`preprocessing` for more detail on sampling quality metrics and
filtering.

Exporting Data
---------------

The loaded data can be exported to an Astropy table or a VOTable file::

    table = lc.to_table()
    lc.write_votable("lightcurve_output.xml")
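
As a recap of the array preparation described on this page, the sketch below
converts magnitudes to relative fluxes (see :ref:`working-with-magnitudes`) and
stacks time and wavelength into the two-column ``xdata`` layout used for
multiband data. It uses NumPy only and does not call ``pgmuvi``. The helper
name ``mags_to_relative_flux`` and all numerical values are made up for
illustration and are not part of the ``pgmuvi`` API.

```python
import numpy as np


def mags_to_relative_flux(mags, mag_errors):
    """Hypothetical helper: magnitudes -> relative flux, first-order errors."""
    mags = np.asarray(mags, dtype=float)
    mag_errors = np.asarray(mag_errors, dtype=float)
    # f is proportional to 10^(-0.4 m); the normalisation is arbitrary.
    fluxes = 10.0 ** (-0.4 * mags)
    # |df/dm| = 0.4 * ln(10) * f, so sigma_f ~ 0.4 * ln(10) * f * sigma_m.
    errors = fluxes * np.log(10) * 0.4 * mag_errors
    return fluxes, errors


# Illustrative values only, not real observations.
times = np.array([0.0, 1.5, 3.0, 0.2, 1.7])              # e.g. days
mags = np.array([12.0, 12.5, 12.2, 11.0, 11.4])
mag_errors = np.array([0.02, 0.03, 0.02, 0.05, 0.04])
wavelengths = np.array([0.55, 0.55, 0.55, 0.80, 0.80])   # micron, one per row

fluxes, errors = mags_to_relative_flux(mags, mag_errors)

# Multiband layout: column 0 is time, column 1 is the numeric wavelength.
xdata = np.column_stack([times, wavelengths])

print(xdata.shape)  # (5, 2)
```

The fractional flux error depends only on the magnitude error
(``errors / fluxes`` equals ``0.4 * ln(10) * mag_errors``), which gives a quick
sanity check on the conversion before constructing a light curve.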