Database Schema
The LightcurveDB schema is designed to store and retrieve astronomical time-series data from the TESS mission efficiently. It is built on PostgreSQL with the SQLAlchemy ORM and supports complex relationships and polymorphic models.
Schema Overview
The database schema consists of several interconnected model groups:
Mission Hierarchy: Mission → MissionCatalog → Target
Instrument Hierarchy: Self-referential instrument tree
Observation System: Polymorphic observation models with FITS frame support
Processing Pipeline: PhotometricSource + DetrendingMethod → ProcessingGroup
Data Products: DataSet (lightcurves), TargetSpecificTime, and QualityFlagArray
Entity Relationship Diagram
---
config:
  theme: neutral
  layout: elk
  elk:
    mergeEdges: true
    nodePlacementStrategy: LINEAR_SEGMENTS
---
erDiagram
    Mission ||--o{ MissionCatalog : "has catalogs"
    MissionCatalog ||--o{ Target : "contains targets"
    Instrument ||--o{ Instrument : "parent/child"
    Instrument ||--o{ Observation : "produces"
    Observation ||--o{ FITSFrame : "polymorphic"
    Observation ||--o{ TargetSpecificTime : "has times"
    Observation ||--o{ DataSet : "has datasets"
    Observation ||--o{ QualityFlagArray : "has quality flags"
    Target ||--o{ TargetSpecificTime : "has times"
    Target ||--o{ DataSet : "has lightcurves"
    Target ||--o{ QualityFlagArray : "has quality flags"
    PhotometricSource ||--o{ ProcessingGroup : "used in"
    DetrendingMethod ||--o{ ProcessingGroup : "used in"
    ProcessingGroup ||--o{ DataSet : "produces"
    Mission {
        UUID id PK
        string name UK
        string description
        string time_unit
        decimal time_epoch
        string time_epoch_scale
        string time_epoch_format
        string time_format_name UK
    }
    MissionCatalog {
        int id PK
        UUID host_mission_id FK
        string name UK
        string description
    }
    Target {
        bigint id PK
        int catalog_id FK
        bigint name
    }
    Instrument {
        UUID id PK
        string name
        json properties
        UUID parent_id FK
    }
    Observation {
        int id PK
        string type
        array cadence_reference
        UUID instrument_id FK
    }
    FITSFrame {
        int id PK
        string type
        bigint cadence
        int observation_id FK
        bool simple
        int bitpix
        int naxis
        array naxis_values
        bool extended
        float bscale
        float bzero
        path file_path
    }
    PhotometricSource {
        int id PK
        string name UK
        string description
    }
    DetrendingMethod {
        int id PK
        string name UK
        string description
    }
    ProcessingGroup {
        int id PK
        string name
        string description
        int photometric_source_id FK
        int detrending_method_id FK
    }
    DataSet {
        int id PK
        int processing_group_id FK
        int target_id FK
        int observation_id FK
        array values
        array errors
    }
    TargetSpecificTime {
        bigint id PK
        int target_id FK
        int observation_id FK
        array barycentric_julian_dates
    }
    QualityFlagArray {
        bigint id PK
        string type
        int observation_id FK
        int target_id FK
        array quality_flags
        datetime created_on
    }
LightcurveDB Entity Relationship Diagram
Model Descriptions
Mission Models
- class lightcurvedb.models.Mission(**kwargs)
Bases:
LCDBModel, NameAndDescriptionMixin, CreatedOnMixin
Represents a space mission or survey program.
A Mission defines the top-level context for astronomical observations, including time system definitions and associated catalogs. Examples include TESS (Transiting Exoplanet Survey Satellite).
- id
Unique identifier for the mission
- Type:
UUID
- name
Unique name of the mission (e.g., “TESS”)
- Type:
str
- description
Detailed description of the mission
- Type:
str
- time_unit
Unit of time measurement (e.g., “day”)
- Type:
str
- time_epoch
Reference epoch for time calculations
- Type:
Decimal
- time_epoch_scale
Time scale for the epoch (e.g., “tdb”)
- Type:
str
- time_epoch_format
Format of the epoch specification
- Type:
str
- time_format_name
Unique name for the mission’s time format
- Type:
str
- catalogs
Associated catalogs for this mission
- Type:
list[MissionCatalog]
Examples
>>> mission = Mission(name="TESS",
...                   description="Transiting Exoplanet Survey Satellite")
- created_on: orm.Mapped[datetime.datetime]
- name: orm.Mapped[str]
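The time fields describe an epoch-relative time system. As a minimal sketch, assuming mission timestamps are stored as offsets from time_epoch in units of time_unit (the schema does not confirm this), a full Julian date can be recovered by adding the epoch back. The value 2457000.0 is the standard offset of the TESS BTJD system; whether LightcurveDB stores exactly this value in time_epoch is an assumption, and to_full_julian_date is a hypothetical helper.

```python
from decimal import Decimal

# Standard BTJD offset (BTJD = BJD - 2457000.0), assumed here to be the
# value held in Mission.time_epoch for TESS.
TESS_TIME_EPOCH = Decimal("2457000.0")


def to_full_julian_date(mission_time: Decimal, time_epoch: Decimal) -> Decimal:
    """Convert an epoch-relative timestamp (in days) to a full Julian date."""
    return time_epoch + mission_time


btjd = Decimal("1325.5")
bjd = to_full_julian_date(btjd, TESS_TIME_EPOCH)
```

Decimal is used because time_epoch is documented as a Decimal, which avoids floating-point rounding when adding a large epoch to a small offset.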
- class lightcurvedb.models.MissionCatalog(**kwargs)
Bases:
LCDBModel, NameAndDescriptionMixin, CreatedOnMixin
A catalog of astronomical targets associated with a mission.
MissionCatalog represents a specific catalog within a mission context, such as the TESS Input Catalog (TIC). It serves as a container for organizing targets observed by the mission.
- id
Primary key identifier
- Type:
int
- host_mission_id
Foreign key to the parent Mission
- Type:
UUID
- name
Unique catalog name (e.g., “TIC” for TESS Input Catalog)
- Type:
str
- description
Detailed description of the catalog
- Type:
str, optional
- host_mission
Parent mission this catalog belongs to
- Type:
Mission
- targets
Collection of targets in this catalog
- Type:
list[Target]
- created_on: orm.Mapped[datetime.datetime]
- name: orm.Mapped[str]
- class lightcurvedb.models.Target(**kwargs)
Bases:
LCDBModel
An astronomical target (star, planet, etc.) in a mission catalog.
Target represents an individual astronomical object that is observed during a mission. Each target is uniquely identified within its catalog by a numeric identifier (e.g., TIC ID for TESS targets).
- id
Primary key identifier
- Type:
int
- catalog_id
Foreign key to the MissionCatalog
- Type:
int
- name
Catalog-specific identifier (e.g., TIC ID)
- Type:
int
- catalog
The catalog this target belongs to
- Type:
MissionCatalog
- datasets
Processed lightcurve datasets for this target
- Type:
list[DataSet]
- target_specific_times
Time series specific to this target
- Type:
list[TargetSpecificTime]
- quality_flag_arrays
Target-specific quality flags
- Type:
list[QualityFlagArray]
Notes
The combination of catalog_id and name must be unique, ensuring no duplicate targets within a catalog.
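The composite uniqueness rule can be illustrated with a small stand-alone sketch. It uses an in-memory SQLite table rather than the project's PostgreSQL schema, and the table definition is a simplified assumption that only mirrors the documented catalog_id and name columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE target (
        id INTEGER PRIMARY KEY,
        catalog_id INTEGER NOT NULL,
        name INTEGER NOT NULL,
        UNIQUE (catalog_id, name)
    )
    """
)

insert = "INSERT INTO target (catalog_id, name) VALUES (?, ?)"
conn.execute(insert, (1, 12345678))
# The same identifier in a different catalog is allowed.
conn.execute(insert, (2, 12345678))

try:
    # Repeating (catalog_id, name) violates the composite constraint.
    conn.execute(insert, (1, 12345678))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```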
Instrument Model
- class lightcurvedb.models.Instrument(**kwargs)
Bases:
LCDBModel
Represents a scientific instrument or assembly used for observations.
Instruments form a hierarchical structure where an instrument can be either a physical device (e.g., a CCD) or an assembly containing other instruments (e.g., a camera with multiple CCDs). This allows modeling complex instrument configurations.
- id
Unique identifier for the instrument
- Type:
UUID
- name
Name of the instrument (e.g., “Camera 1”, “CCD 2”)
- Type:
str
- properties
JSON dictionary of instrument-specific properties and metadata
- Type:
dict
- parent_id
Foreign key to parent instrument (None for top-level instruments)
- Type:
UUID, optional
- parent
Parent instrument in the hierarchy
- Type:
Instrument, optional
- children
Child instruments if this is an assembly
- Type:
list[Instrument]
- observations
Observations made using this instrument
- Type:
list[Observation]
Examples
>>> camera = Instrument(name="TESS Camera 1")
>>> ccd = Instrument(name="CCD 1", parent=camera)
Notes
The self-referential relationship allows building instrument trees of arbitrary depth, useful for complex telescope configurations.
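Walking such a tree is a simple recursive traversal. The sketch below uses a hypothetical in-memory stand-in, InstrumentNode, instead of the ORM model, so it runs without a database; with the real model the same walk function would presumably iterate over the children relationship.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass(eq=False)
class InstrumentNode:
    """Hypothetical stand-in for an Instrument row (not the ORM class)."""

    name: str
    parent: Optional["InstrumentNode"] = None
    children: list["InstrumentNode"] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Mimic the ORM back-reference: setting a parent registers the child.
        if self.parent is not None:
            self.parent.children.append(self)


def walk(node: InstrumentNode, depth: int = 0) -> Iterator[tuple[int, str]]:
    """Yield (depth, name) for a node and all descendants, depth-first."""
    yield depth, node.name
    for child in node.children:
        yield from walk(child, depth + 1)


camera = InstrumentNode("TESS Camera 1")
ccd1 = InstrumentNode("CCD 1", parent=camera)
ccd2 = InstrumentNode("CCD 2", parent=camera)

tree = list(walk(camera))
```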
Observation Models
- class lightcurvedb.models.Observation(**kwargs)
Bases:
LCDBModel
Base class for astronomical observations.
Observation is a polymorphic base class that represents a collection of measurements taken by an instrument. Subclasses can specialize this for different observation types while maintaining a common interface.
This design allows mission-specific observations (e.g., TESSObservation, HSTObservation) to extend the base model with mission-specific fields while sharing common functionality.
- id
Primary key identifier
- Type:
int
- type
Polymorphic discriminator for subclass type
- Type:
str
- cadence_reference
Array of cadence numbers for time ordering
- Type:
ndarray[int64]
- instrument_id
Foreign key to the instrument used
- Type:
uuid.UUID
- instrument
The instrument that made this observation
- Type:
Instrument
- datasets
Processed versions of this observation
- Type:
list[DataSet]
- target_specific_times
Target-specific time corrections
- Type:
list[TargetSpecificTime]
Examples
Creating a mission-specific observation subclass:
>>> class TESSObservation(Observation):
...     __mapper_args__ = {
...         "polymorphic_identity": "tess_observation",
...     }
...     sector: Mapped[int]
...     orbit_number: Mapped[int]
Notes
This is a polymorphic base class using single table inheritance. The ‘type’ field determines the specific observation subclass. Mission-specific fields should be added via subclassing, not by modifying this base class.
- class lightcurvedb.models.TargetSpecificTime(**kwargs)
Bases:
LCDBModel
Time series data specific to a target-observation pair.
This model stores barycentric-corrected time values that account for the specific position of a target. It serves as a junction between Target and Observation with additional time data.
- id
Primary key identifier
- Type:
int
- target_id
Foreign key to the target
- Type:
int
- observation_id
Foreign key to the observation
- Type:
int
- barycentric_julian_dates
Array of barycentric Julian dates corrected for target position
- Type:
ndarray[float64]
- target
The astronomical target
- Type:
Target
- observation
The observation these times correspond to
- Type:
Observation
Notes
Barycentric correction removes the varying light travel time caused by the observer’s motion around the solar system barycenter, providing consistent timing for astronomical observations.
Frame Models
- class lightcurvedb.models.FITSFrame(**kwargs)
Bases:
LCDBModel, CreatedOnMixin
Represents a FITS (Flexible Image Transport System) frame.
FITSFrame stores metadata about individual FITS files used in astronomical observations. It uses polymorphic inheritance to support different frame types while maintaining FITS standard compliance.
- id
Primary key identifier
- Type:
int
- type
Polymorphic discriminator for frame type
- Type:
str
- cadence
Time-ordered frame number
- Type:
int
- observation_id
Foreign key to parent observation
- Type:
int
- simple
FITS primary keyword - file conforms to FITS standard
- Type:
bool
- bitpix
FITS primary keyword - bits per pixel
- Type:
int
- naxis
FITS primary keyword - number of axes
- Type:
int
- naxis_values
Array representation of NAXIS1, NAXIS2, etc.
- Type:
list[int]
- extended
FITS primary keyword - file may contain extensions
- Type:
bool
- bscale
Linear scaling factor (physical = bzero + bscale * stored)
- Type:
float
- bzero
Zero point offset for scaling
- Type:
float
- file_path
File system path to the FITS file
- Type:
Path, optional
Notes
The type-cadence combination must be unique, enforced by database constraint. Polymorphic on ‘type’ field allows for specialized frame subclasses.
- created_on: orm.Mapped[datetime.datetime]
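The scaling rule given for bscale and bzero can be written out directly. This is a minimal sketch with a hypothetical helper, to_physical, operating on a plain list; it is not part of the LightcurveDB API. The sample values use the common FITS convention of storing unsigned 16-bit data as signed integers with BSCALE = 1 and BZERO = 32768.

```python
def to_physical(stored, bscale=1.0, bzero=0.0):
    """Apply the FITS linear scaling: physical = bzero + bscale * stored."""
    return [bzero + bscale * value for value in stored]


# Signed 16-bit storage range mapped onto the unsigned range 0..65535.
stored_pixels = [-32768, 0, 32767]
physical_pixels = to_physical(stored_pixels, bscale=1.0, bzero=32768.0)
```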
Processing Models
- class lightcurvedb.models.PhotometricSource(**kwargs)
Bases:
LCDBModel, NameAndDescriptionMixin
Defines a source or method of photometric measurement.
PhotometricSource represents different ways of extracting photometric data from observations, such as aperture photometry with different aperture sizes or PSF photometry.
- id
Primary key identifier
- Type:
int
- name
Name of the photometric method (inherited from mixin)
- Type:
str
- description
Detailed description (inherited from mixin)
- Type:
str
- processing_groups
Processing groups using this photometric source
- Type:
list[ProcessingGroup]
Examples
>>> aperture_2px = PhotometricSource(name="Aperture_2px",
...                                  description="2 pixel radius aperture")
- name: orm.Mapped[str]
- class lightcurvedb.models.DetrendingMethod(**kwargs)
Bases:
LCDBModel, NameAndDescriptionMixin
Represents a method for removing systematic trends from lightcurves.
DetrendingMethod defines algorithms used to remove instrumental or systematic effects from photometric time series, such as PDC-SAP (Pre-search Data Conditioning Simple Aperture Photometry).
- id
Primary key identifier
- Type:
int
- name
Name of the detrending method (inherited from mixin)
- Type:
str
- description
Detailed description (inherited from mixin)
- Type:
str
- processing_groups
Processing groups using this detrending method
- Type:
list[ProcessingGroup]
Examples
>>> pdc = DetrendingMethod(name="PDC-SAP",
...                        description="Pre-search Data Conditioning")
- name: orm.Mapped[str]
- class lightcurvedb.models.ProcessingGroup(**kwargs)
Bases:
LCDBModel, NameAndDescriptionMixin
Combines a photometric source with a detrending method.
ProcessingGroup represents a unique combination of how photometry was extracted and how it was detrended. This allows tracking different processing pipelines applied to the same observations.
- id
Primary key identifier
- Type:
int
- name
Name of the processing group (inherited from mixin)
- Type:
str
- description
Detailed description (inherited from mixin)
- Type:
str
- photometric_source_id
Foreign key to photometric source
- Type:
int
- detrending_method_id
Foreign key to detrending method
- Type:
int
- photometric_source
The photometric extraction method
- Type:
PhotometricSource
- detrending_method
The detrending algorithm
- Type:
DetrendingMethod
- datasets
Lightcurve datasets using this processing
- Type:
list[DataSet]
Notes
The combination of photometric_source_id and detrending_method_id must be unique, ensuring no duplicate processing groups.
- name: orm.Mapped[str]
- class lightcurvedb.models.DataSet(**kwargs)
Bases:
LCDBModel
A processed lightcurve for a specific target and observation.
DataSet is the central model that connects a target, an observation, and a processing method to produce a final lightcurve. It stores the actual photometric measurements and uncertainties.
- id
Primary key identifier
- Type:
int
- processing_group_id
Foreign key to processing group
- Type:
int
- target_id
Foreign key to target
- Type:
int
- observation_id
Foreign key to observation
- Type:
int
- values
Array of photometric measurements (flux or magnitude)
- Type:
ndarray[float64]
- errors
Array of measurement uncertainties
- Type:
ndarray[float64], optional
- processing_group
The processing method used
- Type:
ProcessingGroup
- target
The astronomical target
- Type:
Target
- observation
The source observation
- Type:
Observation
Notes
This is the main table for storing lightcurve data. Each row represents one complete lightcurve for a target processed with a specific method.
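A complete lightcurve draws on several of these arrays at once. The sketch below assumes that Observation.cadence_reference, TargetSpecificTime.barycentric_julian_dates, DataSet.values and QualityFlagArray.quality_flags are index-aligned, with one entry per cadence; the schema description suggests this but does not state it. The helper assemble_lightcurve and the sample numbers are hypothetical.

```python
from typing import NamedTuple


class LightcurvePoint(NamedTuple):
    cadence: int
    bjd: float
    value: float


def assemble_lightcurve(cadences, bjds, values, quality_flags):
    """Zip index-aligned arrays into points, dropping flagged cadences."""
    if not (len(cadences) == len(bjds) == len(values) == len(quality_flags)):
        raise ValueError("all arrays must have the same length")
    return [
        LightcurvePoint(cadence, bjd, value)
        for cadence, bjd, value, flag in zip(cadences, bjds, values, quality_flags)
        if flag == 0  # keep only cadences with no quality bits set
    ]


points = assemble_lightcurve(
    cadences=[100, 101, 102],
    bjds=[2458325.50, 2458325.52, 2458325.54],
    values=[1.000, 0.998, 1.002],
    quality_flags=[0, 1, 0],
)
```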
Quality Flag Model
- class lightcurvedb.models.QualityFlagArray(**kwargs)
Bases:
LCDBModel, CreatedOnMixin
Stores quality flag arrays for astronomical observations.
QualityFlagArray represents bit-encoded quality information for time-series astronomical data. Each element in the array corresponds to a cadence in the observation, with individual bits representing different quality conditions or data issues.
This is a polymorphic base class that can be extended for mission-specific quality flag implementations with specialized bit definitions.
- Parameters:
type (str) – Quality flag type identifier (e.g., ‘pixel_quality’, ‘cosmic_ray’)
observation_id (int) – Foreign key to the parent observation
target_id (int, optional) – Foreign key to a specific target when flags are target-specific
quality_flags (ndarray[int32]) – Array of 32-bit integers where each bit represents a quality condition
- id
Primary key identifier
- Type:
int
- type
Polymorphic discriminator and quality flag category
- Type:
str
- observation_id
Reference to the parent observation
- Type:
int
- target_id
Reference to specific target (null for observation-wide flags)
- Type:
int or None
- quality_flags
Bit-encoded quality flag array
- Type:
ndarray[int32]
- observation
Parent observation relationship
- Type:
Observation
- target
Target relationship when flags are target-specific
- Type:
Target or None
- created_on
Timestamp of record creation (from CreatedOnMixin)
- Type:
datetime
Examples
Creating observation-wide quality flags:
>>> obs_flags = QualityFlagArray(
...     observation_id=12345,
...     quality_flags=np.array([0, 1, 4, 5], dtype=np.int32)
... )
Creating target-specific quality flags:
>>> target_flags = QualityFlagArray(
...     observation_id=12345,
...     target_id=67890,
...     quality_flags=np.array([0, 0, 2, 8], dtype=np.int32)
... )
Interpreting bit flags:
>>> # Bit 0: Cosmic ray
>>> # Bit 1: Saturation
>>> # Bit 2: Bad pixel
>>> cosmic_ray_mask = (obs_flags.quality_flags & 1) != 0
>>> saturated_mask = (obs_flags.quality_flags & 2) != 0
Extending with single-table inheritance (simple approach):
>>> class TESSQualityFlags(QualityFlagArray):
...     """TESS-specific quality flags with known bit definitions."""
...     __mapper_args__ = {
...         "polymorphic_identity": "tess_quality",
...     }
...
...     @property
...     def cosmic_ray_events(self):
...         """Return mask of cosmic ray events (bit 0)."""
...         flags = np.array(self.quality_flags, dtype=np.int32)
...         return (flags & 1) != 0
...
...     @property
...     def saturated_pixels(self):
...         """Return mask of saturated pixels (bit 1)."""
...         flags = np.array(self.quality_flags, dtype=np.int32)
...         return (flags & 2) != 0
...
...     @property
...     def spacecraft_anomaly(self):
...         """Return mask of spacecraft anomalies (bit 4)."""
...         flags = np.array(self.quality_flags, dtype=np.int32)
...         return (flags & 16) != 0
Extending with joined-table inheritance (advanced approach):
class SpectroscopicQualityFlags(QualityFlagArray):
    """Quality flags for spectroscopic observations."""
    __tablename__ = "spectroscopic_quality_flags"
    __mapper_args__ = {
        "polymorphic_identity": "spectroscopic_quality",
    }

    # Primary key also serves as foreign key to parent table
    id = orm.mapped_column(
        sa.ForeignKey("quality_flag_array.id"),
        primary_key=True
    )

    # Additional columns specific to spectroscopic data
    wavelength_calibration_quality = orm.mapped_column(
        sa.types.Float,
        comment="Wavelength calibration quality score (0-1)"
    )
    spectral_resolution = orm.mapped_column(
        sa.types.Float,
        comment="Actual spectral resolution achieved"
    )
    calibration_lamp_id = orm.mapped_column(
        sa.ForeignKey("calibration_lamp.id"),
        nullable=True
    )

    @property
    def wavelength_drift(self):
        """Return mask of wavelength drift (bit 8)."""
        flags = np.array(self.quality_flags, dtype=np.int32)
        return (flags & 256) != 0


class PhotometricQualityFlags(QualityFlagArray):
    """Quality flags for photometric observations."""
    __tablename__ = "photometric_quality_flags"
    __mapper_args__ = {
        "polymorphic_identity": "photometric_quality",
    }

    id = orm.mapped_column(
        sa.ForeignKey("quality_flag_array.id"),
        primary_key=True
    )

    # Photometry-specific metadata
    sky_background_level = orm.mapped_column(
        sa.types.Float,
        comment="Median sky background in counts"
    )
    fwhm = orm.mapped_column(
        sa.types.Float,
        comment="Full width at half maximum of PSF"
    )
    extinction_coefficient = orm.mapped_column(
        sa.types.Float,
        nullable=True
    )
Polymorphic querying examples:
>>> # Query all quality flags for an observation
>>> all_flags = session.query(QualityFlagArray).filter_by(
...     observation_id=12345
... ).all()

>>> # Query only TESS quality flags
>>> tess_flags = session.query(TESSQualityFlags).filter_by(
...     observation_id=12345
... ).all()

>>> # Use with_polymorphic for efficient joined loading
>>> from sqlalchemy.orm import with_polymorphic
>>>
>>> poly_flags = with_polymorphic(
...     QualityFlagArray,
...     [SpectroscopicQualityFlags, PhotometricQualityFlags]
... )
>>> query = session.query(poly_flags).filter(
...     poly_flags.observation_id == 12345
... )
>>>
>>> # Access subclass-specific attributes without additional queries
>>> for flag in query:
...     if isinstance(flag, SpectroscopicQualityFlags):
...         print(f"Spectral resolution: {flag.spectral_resolution}")
...     elif isinstance(flag, PhotometricQualityFlags):
...         print(f"Sky background: {flag.sky_background_level}")

>>> # Filter by polymorphic type
>>> spectro_only = session.query(QualityFlagArray).filter_by(
...     type="spectroscopic_quality"
... ).all()
Notes
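The combination of (type, observation_id, target_id) must be unique, preventing duplicate quality flag arrays for the same context. The constraint treats NULL values in target_id as equal, so only one observation-wide quality flag array (with NULL target_id) is allowed per type and observation_id combination. Note that PostgreSQL treats NULLs as distinct in unique constraints by default; treating them as equal requires NULLS NOT DISTINCT (PostgreSQL 15 and later) or an equivalent partial unique index.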
Quality flag bit definitions are mission and type-specific. Subclasses should document their specific bit meanings and may add helper methods for flag interpretation.
See also
Observation
Parent observation model
Target
Associated target for target-specific flags
- created_on: orm.Mapped[datetime.datetime]
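One way to document bit meanings is an enum.IntFlag. The sketch below is illustrative only: the bit layout is taken from the example comments in this section (bit 0 cosmic ray, bit 1 saturation, bit 2 bad pixel), not from an actual mission definition, and ExampleQualityBits and flagged_cadences are hypothetical names. Plain Python integers are used so the example runs without NumPy.

```python
from enum import IntFlag


class ExampleQualityBits(IntFlag):
    """Hypothetical bit layout mirroring the illustrative comments above."""

    COSMIC_RAY = 1 << 0
    SATURATION = 1 << 1
    BAD_PIXEL = 1 << 2


def flagged_cadences(quality_flags, mask):
    """Return indices of cadences where any bit in `mask` is set."""
    return [i for i, value in enumerate(quality_flags) if value & mask]


quality_flags = [0, 1, 4, 5]
cosmic_ray_idx = flagged_cadences(quality_flags, ExampleQualityBits.COSMIC_RAY)

# A raw value decodes into its named bits: 5 = COSMIC_RAY | BAD_PIXEL.
decoded = ExampleQualityBits(5)
```

Naming the bits keeps the magic numbers (1, 2, 4) out of calling code and lets masks be combined with the | operator.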
Polymorphic Models
The schema uses SQLAlchemy’s polymorphic inheritance for flexibility:
- Observation: Base class with polymorphic_on="type"
  - Allows different observation types to share common attributes
  - Subclasses can add specialized fields while maintaining relationships
- FITSFrame: Configured for polymorphism with polymorphic_on="type"
  - Supports different FITS frame types (e.g., science frames, calibration frames)
  - Identity "basefits" serves as the default type
- QualityFlagArray: Base class with polymorphic_on="type"
  - Enables mission-specific quality flag implementations
  - Identity "base_quality_flag" serves as the default type
  - Can be extended to add mission-specific bit interpretations
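Conceptually, a discriminator column selects which class a row is loaded as. The following is a simplified pure-Python sketch of that idea, not SQLAlchemy's actual implementation; Record, TESSRecord and from_row are hypothetical names, and the identity attribute plays the role of polymorphic_identity.

```python
class Record:
    """Hypothetical base class standing in for a polymorphic model."""

    identity = "base"
    _registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Each subclass registers itself under its own identity string.
        Record._registry[cls.identity] = cls

    @classmethod
    def from_row(cls, row):
        """Instantiate the class whose identity matches the row's `type`."""
        target_cls = cls._registry.get(row["type"], cls)
        obj = target_cls()
        obj.__dict__.update(row)
        return obj


class TESSRecord(Record):
    identity = "tess_observation"


rows = [
    {"type": "base", "id": 1},
    {"type": "tess_observation", "id": 2, "sector": 1},
]
loaded = [Record.from_row(row) for row in rows]
```

Rows with an unregistered type fall back to the base class here, which is a choice made for this sketch rather than a statement about how the ORM behaves.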
Mission-Specific Extensions
LightcurveDB supports mission-specific data through SQLAlchemy’s polymorphic inheritance. The Observation model serves as a base class that can be extended for specific missions.
Design Pattern
To add support for a new mission:
Create a subclass of Observation
Set a unique polymorphic_identity
Add mission-specific fields as Mapped columns
Register in your mission’s module
Example: TESS Observations
from lightcurvedb.models import Observation
from sqlalchemy import orm

class TESSObservation(Observation):
    """TESS-specific observation with orbit and sector information."""
    __mapper_args__ = {
        "polymorphic_identity": "tess_observation",
    }

    # TESS-specific fields
    sector: orm.Mapped[int]
    orbit_number: orm.Mapped[int]
    spacecraft_quaternion: orm.Mapped[dict]  # Store as JSON
    cosmic_ray_mitigation: orm.Mapped[bool]
Example: HST Observations
class HSTObservation(Observation):
    """Hubble Space Telescope observation with visit information."""
    __mapper_args__ = {
        "polymorphic_identity": "hst_observation",
    }

    visit_id: orm.Mapped[str]
    program_id: orm.Mapped[int]
    filter_name: orm.Mapped[str]
    exposure_time: orm.Mapped[float]
Querying Mission-Specific Data
# Query all observations (any mission)
all_obs = session.query(Observation).all()

# Query only TESS observations
tess_obs = session.query(TESSObservation).filter_by(sector=1).all()

# Polymorphic loading - automatically returns correct subclass
obs = session.query(Observation).first()
if isinstance(obs, TESSObservation):
    print(f"TESS Sector: {obs.sector}")
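Benefits
Type Safety: Mission-specific fields are properly typed
Clean Schema: The base Observation model carries no mission-specific fields
Extensibility: New missions are added as subclasses without modifying the base model (with single-table inheritance, their columns are added to the shared table)
Performance: Single-table inheritance loads any observation type without joins
Flexibility: Can query all observations or mission-specific ones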
Key Relationships
One-to-Many:
Mission → MissionCatalog → Target (hierarchical)
Instrument → Observation
ProcessingGroup → DataSet
Observation → QualityFlagArray
Target → QualityFlagArray (optional relationship)
Many-to-Many (via junction tables):
Target ↔ Observation (via TargetSpecificTime)
PhotometricSource + DetrendingMethod → ProcessingGroup
Self-Referential:
Instrument parent/child hierarchy for complex instrument configurations
Central Hub:
DataSet connects Target + Observation + ProcessingGroup
This is where the actual lightcurve data resides
Usage Examples
Querying for a target’s lightcurves:
from lightcurvedb.models import Target, DataSet

# Get all lightcurves for a specific TIC ID
target = session.query(Target).filter_by(name=12345678).first()
lightcurves = target.datasets

# Inspect the processing applied to each lightcurve
for lc in lightcurves:
    print(f"Processing: {lc.processing_group.name}")
    print(f"Values: {lc.values}")
Creating instrument hierarchy:
from lightcurvedb.models import Instrument
# Create camera with CCDs
camera = Instrument(name="TESS Camera 1")
ccd1 = Instrument(name="CCD 1", parent=camera)
ccd2 = Instrument(name="CCD 2", parent=camera)
session.add_all([camera, ccd1, ccd2])
Working with quality flags:
from lightcurvedb.models import QualityFlagArray, Target, Observation
import numpy as np

# Get quality flags for a specific target observation
target = session.query(Target).filter_by(name=12345678).first()
# Target has no documented observations attribute; reach one through a dataset
observation = target.datasets[0].observation

# Get quality flags for this target in this observation
quality_flags = session.query(QualityFlagArray).filter_by(
    observation=observation,
    target=target,
    type="base_quality_flag"
).first()

if quality_flags:
    # Convert to an array so the bitwise operations apply element-wise
    flags = np.asarray(quality_flags.quality_flags)

    # Check for cosmic ray events (bit 0)
    cosmic_ray_mask = (flags & 1) != 0
    num_cosmic_rays = np.sum(cosmic_ray_mask)
    print(f"Found {num_cosmic_rays} cadences with cosmic ray events")

    # Check for saturated pixels (bit 1)
    saturation_mask = (flags & 2) != 0
    print(f"Saturated in {np.sum(saturation_mask)} cadences")
Database Constraints
The schema enforces several important constraints:
Unique Constraints:
Mission.name must be unique
MissionCatalog.name must be unique
Target: (catalog_id, name) combination must be unique
ProcessingGroup: (photometric_source_id, detrending_method_id) must be unique
FITSFrame: (type, cadence) combination must be unique
QualityFlagArray: (type, observation_id, target_id) must be unique (with NULL handling)
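The NULL handling deserves a concrete illustration, because a plain UNIQUE constraint treats NULLs as distinct in SQLite and, by default, in PostgreSQL. This document does not say how LightcurveDB implements the rule; the sketch below shows one common approach, a partial unique index, using an in-memory SQLite database and a simplified, assumed table definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE quality_flag_array (
        id INTEGER PRIMARY KEY,
        type TEXT NOT NULL,
        observation_id INTEGER NOT NULL,
        target_id INTEGER,
        UNIQUE (type, observation_id, target_id)
    );
    -- Partial unique index: at most one observation-wide row
    -- (NULL target_id) per (type, observation_id).
    CREATE UNIQUE INDEX uq_observation_wide
        ON quality_flag_array (type, observation_id)
        WHERE target_id IS NULL;
    """
)

insert = (
    "INSERT INTO quality_flag_array (type, observation_id, target_id) "
    "VALUES (?, ?, ?)"
)
conn.execute(insert, ("base_quality_flag", 12345, None))
# A target-specific row for the same observation is still allowed.
conn.execute(insert, ("base_quality_flag", 12345, 67890))

try:
    # A second observation-wide row hits the partial index.
    conn.execute(insert, ("base_quality_flag", 12345, None))
    second_null_rejected = False
except sqlite3.IntegrityError:
    second_null_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM quality_flag_array").fetchone()[0]
```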
Cascade Deletes:
Deleting an Observation cascades to FITSFrame, TargetSpecificTime, DataSet, and QualityFlagArray
Deleting a ProcessingGroup cascades to DataSet
Deleting a Target cascades to DataSet
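Cascade behaviour of this kind is typically expressed with ON DELETE CASCADE on the foreign key. A minimal sketch follows, again using in-memory SQLite with simplified, assumed table definitions rather than the real PostgreSQL schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE observation (id INTEGER PRIMARY KEY);
    CREATE TABLE dataset (
        id INTEGER PRIMARY KEY,
        observation_id INTEGER NOT NULL
            REFERENCES observation (id) ON DELETE CASCADE
    );
    INSERT INTO observation (id) VALUES (1), (2);
    INSERT INTO dataset (observation_id) VALUES (1), (1), (2);
    """
)

# Deleting observation 1 removes its two dependent dataset rows as well.
conn.execute("DELETE FROM observation WHERE id = 1")
remaining = conn.execute("SELECT observation_id FROM dataset").fetchall()
```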
Referential Integrity:
All foreign keys are enforced at the database level
Orphaned records are prevented through proper relationships