Pipeline Overview

Overview_functionality_March.md · last modified 2026-03-18 14:15

report_pipeline_2026 — Functionality Overview (March 2026)

Apps (apps/)

Web applications for interactive data exploration and annotation. All run in the conda environment SA_3.13.

Node Inspector (DB) — port 5011

Interactive visualization of time-frequency acoustic data from PostgreSQL.
Browse per-node, per-day cochleograms with perceptual layers (EdB, Fg, Bg) and LAeq overlays.
- Stack: Bokeh, HoloViews, Datashader, PostgreSQL (via sa_fetcher)
- Start: ./start_node_inspector_db_app.sh
- Key files: main.py, config.py, projects.toml, datashaded_images.py, day_analysis_functions.py, data/retrieval.py, data/processing.py, plotting/overlays.py, ui/date_picker.py

Sound Annotator — port 5012

Web app for annotating soundscape audio clips with acoustic feature labels.
Supports multi-timescale annotation, multiple concurrent annotators, project management, and clip filtering.
- Stack: FastAPI, HTMX, Jinja2, Uvicorn
- Start: ./start_sa_annotator.sh
- Key files: main.py, config.py, app/services/annotation_service.py, app/services/project_service.py, app/services/session_service.py, models.py

Template Management — port 5013

Full template-building pipeline: prototype collection → clustering → classification → export.
- Stack: FastAPI, HTMX, Jinja2, HoloViews, NumPy, PostgreSQL
- Start: ./template_management/start_template_management.sh
- Key files: main.py, config.py, services/prototype_service.py, services/template_service.py, services/visualization_service.py, services/project_state.py

TF Rasterizer Browser — port 5015 (default)

Library (not a standalone server) for interactive, rasterized time-frequency visualization in the browser. Import it in notebooks or scripts with: from apps.tf_rasterizer_browser import show_tf_data
- Stack: Panel, HoloViews, Datashader, Bokeh, Zarr, xarray
- Key file: tf_rasterizer_in_browser.py

SA App Manager

Launcher/manager for the apps above. Reads apps_registry.toml to start and stop apps and to check their status.
- Key files: main.py, app_runner.py, config.py


Source Library (src/)

Shared Python modules used by apps and notebooks.

src/sa_config.py — Unified Configuration

Single entry point for all config. Loads and merges TOML files from config/ into one dict.
- load_config() / get_config() — load all shared config (paths, projects, analysis params, DB)
- get_project(name) — get merged project info (nodes, dates, portal, etc.)
- get_db_config() — get PostgreSQL connection config

src/sa_data/ — Data Access

src/node_day_analysis/ — Node-Day Selection

src/Analysis/ — Acoustic Analysis

src/Templates/ — Template Pipeline

src/Visualization/ — Visualization Helpers


Pipeline Package (sa_pipeline/)

Installable package for data retrieval and batch downloading.

CLI entry points in scripts/: data_retriever, batch_downloader, data_retriever_from_postgress.py


Configuration (config/)

All TOML-based, loaded by src/sa_config.py:
- path_config.toml — filesystem paths (data dirs, workspace, cache, SSD)
- project_config.toml — project/node definitions, date ranges, portal info
- analysis_config.toml — analysis parameters (clustering, template building, layers)
- data_retriever.toml — data retrieval settings
- batch_downloader.toml — batch download jobs and schedule
- acoustic_annotations_default.toml / acoustic_annotations_urban_NL.toml — annotation label definitions


Notebooks (notebooks/)

Development and analysis notebooks:
- start_with_me.ipynb — getting started / orientation
- Clip_download+processing.ipynb — audio clip downloading and processing
- day_overview_development.ipynb — day-level analysis development
- Template_development.ipynb — template building development
- template_experiment.ipynb / _v2 / _layered — template matching experiments
- template_finding_and_application.ipynb — end-to-end template workflow
- connect_annotations_to_templates.ipynb — link annotations to template classes
- read_annotations.ipynb — read and explore annotation data
- harmonic_sieve.ipynb / harmonic_tracker.ipynb / sieve.ipynb — harmonic/tonal analysis
- ridge_anaysis_dev.ipynb — ridge analysis development
- Weather_Collection_Demo.ipynb — weather data collection demo


Subprojects (git submodules)

sa_scheduler/

SA Projects report runner. Provides DB config (config.toml) and scheduling infrastructure.

sa_projects_testspace/

Test workspace for SA project templates and configuration testing.


Data Directories


Other Files