Overview_functionality_March.md · last modified 2026-03-18 14:15
apps/

Web applications for interactive data exploration and annotation. All use conda env SA_3.13.

Interactive visualization of time-frequency acoustic data from PostgreSQL. Browse per-node, per-day cochleograms with perceptual layers (EdB, Fg, Bg) and LAeq overlays.
- Stack: Bokeh, HoloViews, Datashader, PostgreSQL (via sa_fetcher)
- Start: ./start_node_inspector_db_app.sh
- Key files: main.py, config.py, projects.toml, datashaded_images.py, day_analysis_functions.py, data/retrieval.py, data/processing.py, plotting/overlays.py, ui/date_picker.py

Web app for annotating soundscape audio clips with acoustic feature labels. Supports multi-timescale annotation, multiple concurrent annotators, project management, and clip filtering.
- Stack: FastAPI, HTMX, Jinja2, Uvicorn
- Start: ./start_sa_annotator.sh
- Key files: main.py, config.py, app/services/annotation_service.py, app/services/project_service.py, app/services/session_service.py, models.py

Full template-building pipeline: prototype collection → clustering → classification → export.
- Stack: FastAPI, HTMX, Jinja2, HoloViews, NumPy, PostgreSQL
- Start: ./template_management/start_template_management.sh
- Key files: main.py, config.py, services/prototype_service.py, services/template_service.py, services/visualization_service.py, services/project_state.py
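The classification step of such a pipeline can be illustrated with nearest-template matching by cosine similarity, the measure that fit_templates() in src/Templates/ is described as using. This is a minimal, dependency-free sketch, not the project's implementation: the function names `cosine_similarity` and `classify_columns` and the toy data are assumptions, and production code would presumably vectorize this with NumPy.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_columns(columns, templates):
    """Return (best_template_index, similarity) for every spectral column."""
    matches = []
    for col in columns:
        scores = [cosine_similarity(col, t) for t in templates]
        best = max(range(len(templates)), key=scores.__getitem__)
        matches.append((best, scores[best]))
    return matches


# Toy data (hypothetical): two templates and two columns with three frequency bins each.
templates = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
columns = [[2.0, 0.1, 0.0], [0.0, 3.0, 2.5]]
matches = classify_columns(columns, templates)
```

Cosine similarity ignores overall level, so a loud and a quiet column with the same spectral shape map to the same template; a separate scaling factor would be needed to reconstruct the level.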

Library (not a standalone server) for interactive rasterized time-frequency visualization in the browser. Import it from notebooks or scripts with `from apps.tf_rasterizer_browser import show_tf_data`.
- Stack: Panel, HoloViews, Datashader, Bokeh, Zarr, xarray
- Key file: tf_rasterizer_in_browser.py

Launcher and manager for the apps above. Reads apps_registry.toml to start and stop apps and to check their status.
- Key files: main.py, app_runner.py, config.py

src/

Shared Python modules used by apps and notebooks.
src/sa_config.py — Unified Configuration

Single entry point for all config. Loads and merges TOML files from config/ into one dict.
- load_config() / get_config() — load all shared config (paths, projects, analysis params, DB)
- get_project(name) — get merged project info (nodes, dates, portal, etc.)
- get_db_config() — get PostgreSQL connection config

src/sa_data/ — Data Access

- node_data_access.py — Fetch data from PostgreSQL:
  - fetch_day_data() — raw day data from DB
  - fetch_node_day_values() — processed day data with scaling and Fg derivation
  - fetch_interval_values() — fetch arbitrary time intervals
- data_retrieval.py — DataProcessor class: transforms raw measurement data into interpretable format (TF layers, 1D layers, Fg derivation, dequantization, timestamp conversion)

src/node_day_analysis/ — Node-Day Selection

- node_day_selection.py:
  - get_available_node_days() — query DB for all available node-day combinations
  - select_day_node_combinations() — filter by nodes, portals, date range

src/Analysis/ — Acoustic Analysis

- harmonic_analysis.py — Ridge-based harmonic analysis:
  - compute_ridge_props() — summarise ridges (frequency, energy, duration, concurrency, salience)
  - find_harmonic_candidates() — generate candidate harmonic complexes from ridges
  - classify_ridges_fast() — classify ridges into harmonic complexes vs. isolated tones
- peak_mask_functions.py — Peak/ridge detection in spectrograms:
  - detect_narrow_events() — detect tonal and pulsed events at multiple timescales
  - find_peaks_simple() / find_peaks_above_bg() — column/row peak detection
  - find_ridges_in_peakMask() — trace ridges through peak masks (Numba-accelerated)
  - mask_lenghts(), mov_av() — supporting functions

src/Templates/ — Template Pipeline

- prototype_collection.py:
  - sample_random_seconds_from_node_days() — sample random seconds from DB for prototype building
  - sample_random_seconds() — random sampling from a single day's data
  - select_hourly_weighted_prototype_indices() — hourly-weighted prototype selection
- template_pipelines.py:
  - fit_templates() — match data columns to templates via cosine similarity
  - reconstruct_from_templates() — reconstruct layers using best-matching templates + scaling
  - prototype_collection_pipeline() — end-to-end prototype collection from task list
  - template_fitting_pipeline() — end-to-end template fitting across node-days
- template_matching.py:
  - find_best_templates_l1_offset_approx() — fast approximate L1 matching with offset
  - find_best_templates_l1_offset_exact() — exact L1 matching (slower, mathematically exact)
  - match_templates_to_layers() — match templates across multiple layers, compute residuals
- template_analysis.py:
  - build_match_info_contiguous() — segment template match sequences into contiguous runs
  - summarize_match_info() — per-class statistics (duration, longest segment, factor extremes)
  - create_class_match_summary() — aggregate summary across multiple node-days
  - create_aggregated_template_heatmap() — time-windowed template prevalence heatmap
  - template_fit_and_count_loop() — batch fit + count templates across task list
- template_ordering.py:
  - similarity_order_for_templates() — hierarchical clustering order (cosine similarity)
  - prune_templates() — remove templates with too few members
  - create_valid_order() — order: valid → no_data → artifacts
  - reorder_by_grouped_sorting() — sort within groups by prevalence counts
- template_visualization.py:
  - plot_templates() — visualize template matrix with optional prevalence curves

src/Visualization/ — Visualization Helpers

- im_info_builder.py:
  - create_im_info() — convert values dict into im_info format for tf_rasterizer_browser

sa_pipeline/

Installable package for data retrieval and batch downloading.

- data_retriever/ — Retrieve soundscape data from PostgreSQL, save to Zarr/pickle
  - retriever.py — DataRetriever class (core logic)
  - cli.py — CLI interface via Typer
- batch_downloader/ — Scheduled batch downloading from SA portals
  - scheduler.py — PreloaderScheduler with network resilience
- common/ — Shared utilities
  - timezone_utils.py — localize_and_convert_to_local()

CLI entry points in scripts/: data_retriever, batch_downloader, data_retriever_from_postgress.py
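What "network resilience" means for PreloaderScheduler is not spelled out in this overview. One common ingredient of resilient batch downloading is retrying failed requests with exponential backoff, sketched below. The function `retry_with_backoff`, its parameters, and the flaky demo task are hypothetical and do not describe the scheduler's actual behaviour.

```python
import time


def retry_with_backoff(task, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call task() until it succeeds, doubling the wait after each network failure."""
    for attempt in range(attempts):
        try:
            return task()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)


# Demo: a download that fails twice before succeeding.
calls = {"count": 0}


def flaky_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("portal unreachable")
    return "day_data"


# Passing delays.append as `sleep` records the waits instead of actually sleeping.
delays = []
result = retry_with_backoff(flaky_download, sleep=delays.append)
```

Making `sleep` injectable keeps the retry logic testable without real delays; a scheduled job would use the default `time.sleep`.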

config/

All TOML-based, loaded by src/sa_config.py:
- path_config.toml — filesystem paths (data dirs, workspace, cache, SSD)
- project_config.toml — project/node definitions, date ranges, portal info
- analysis_config.toml — analysis parameters (clustering, template building, layers)
- data_retriever.toml — data retrieval settings
- batch_downloader.toml — batch download jobs and schedule
- acoustic_annotations_default.toml / acoustic_annotations_urban_NL.toml — annotation label definitions

notebooks/

Development and analysis notebooks:
- start_with_me.ipynb — getting started / orientation
- Clip_download+processing.ipynb — audio clip downloading and processing
- day_overview_development.ipynb — day-level analysis development
- Template_development.ipynb — template building development
- template_experiment.ipynb / _v2 / _layered — template matching experiments
- template_finding_and_application.ipynb — end-to-end template workflow
- connect_annotations_to_templates.ipynb — link annotations to template classes
- read_annotations.ipynb — read and explore annotation data
- harmonic_sieve.ipynb / harmonic_tracker.ipynb / sieve.ipynb — harmonic/tonal analysis
- ridge_anaysis_dev.ipynb — ridge analysis development
- Weather_Collection_Demo.ipynb — weather data collection demo

sa_scheduler/

SA Projects report runner. Provides DB config (config.toml) and scheduling infrastructure.

sa_projects_testspace/

Test workspace for SA project templates and configuration testing.

- zarr_data/ — Zarr-format acoustic data
- zst_data/ — Zstandard-compressed data
- weather_store/ — Weather data (parquet: node-to-station index)
- logs/ — Application and scheduler logs
- archive/ — Archived/old files
- pyproject.toml — package config (sa-pipeline v0.1.0)
- toc_src.md — table of contents for src modules
- test_fib_runnable — compiled test binary (Fibonacci, likely a build-system test)