Data Architecture
Technical documentation for the NTZ Conveyor Belt system
Entity Hierarchy
Data is organized in a strict hierarchy from collection mission down to individual gaze samples:
Mission
│
└── Day
│
└── Session
│
└── Run
│
└── Segment
│
└── Turn
│
└── Frame
│
└── GazePoint
Current Data Model
┌─────────────────────────────────────────────────────────────────────────────┐
│ Mission │
│ ───────── │
│ id: int (PK) │
│ name: string │
│ start_date: date │
│ end_date: date │
│ location: string │
│ notes: text │
└─────────────────────────────────────────────────────────────────────────────┘
│
│ 1:many
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ Day │
│ ───── │
│ id: int (PK) │
│ mission_id: int (FK) │
│ date: date │
│ notes: text │
└─────────────────────────────────────────────────────────────────────────────┘
│
│ 1:many
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ Session │
│ ───────── │
│ id: int (PK) │
│ day_id: int (FK) │
│ skier_name: string │
│ start_time: datetime │
│ notes: text │
└─────────────────────────────────────────────────────────────────────────────┘
│
│ 1:many
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ Run │
│ ───── │
│ id: int (PK) │
│ session_id: int (FK) │
│ run_number: int │
│ video_file: string │
│ notes: text │
└─────────────────────────────────────────────────────────────────────────────┘
│
│ 1:many
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ Segment │
│ ───────── │
│ id: int (PK) │
│ run_id: int (FK) │
│ segment_number: int │
│ segment_type: enum │
│ start_frame_number: int │
│ end_frame_number: int │
└─────────────────────────────────────────────────────────────────────────────┘
│
│ 1:many
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ Turn │
│ ────── │
│ id: int (PK) │
│ segment_id: int (FK) │
│ turn_number: int │
│ cumulative_turn_number: int │
└─────────────────────────────────────────────────────────────────────────────┘
│
│ 1:many
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ Frame │
│ ─────── │
│ id: int (PK) │
│ turn_id: int (FK) │
│ frame_number: int │
│ timestamp: float │
│ video_frame_number: int │
│ phase_angle: float │
│ imu_data: JSON │
└─────────────────────────────────────────────────────────────────────────────┘
│
│ 1:many
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ GazePoint │
│ ─────────── │
│ id: int (PK) │
│ frame_id: int (FK) │
│ x: float │
│ y: float │
│ confidence: float │
│ ntz_index: int │
└─────────────────────────────────────────────────────────────────────────────┘
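The hierarchy above can be sketched with plain dataclasses. This is an illustration only — the real schema presumably lives in an ORM; field names follow the boxes above, and the `gaze_points` list is just a stand-in for the 1:many relation:

```python
from dataclasses import dataclass, field

@dataclass
class GazePoint:
    id: int
    frame_id: int
    x: float
    y: float
    confidence: float
    ntz_index: int

@dataclass
class Frame:
    id: int
    turn_id: int
    frame_number: int
    timestamp: float
    video_frame_number: int
    phase_angle: float
    imu_data: dict = field(default_factory=dict)
    # 1:many relation to GazePoint, held inline here for illustration
    gaze_points: list = field(default_factory=list)
```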
Data Volume Estimates
| Entity     | Per Run | Per Day (5 runs) | Per Mission (3 days) |
|------------|---------|------------------|----------------------|
| Runs       | 1       | 5                | 15                   |
| Segments   | ~4      | ~20              | ~60                  |
| Turns      | ~50     | ~250             | ~750                 |
| Frames     | ~18,000 | ~90,000          | ~270,000             |
| GazePoints | ~18,000 | ~90,000          | ~270,000             |
Assuming 30 fps video, ~10-minute runs, and ~20 turns per skiing segment.
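The frame estimate follows directly from those assumptions — a back-of-the-envelope check, not project code:

```python
FPS = 30          # assumed video frame rate
RUN_MINUTES = 10  # assumed run duration
RUNS_PER_DAY = 5
DAYS_PER_MISSION = 3

frames_per_run = FPS * RUN_MINUTES * 60            # 18,000
frames_per_day = frames_per_run * RUNS_PER_DAY     # 90,000
frames_per_mission = frames_per_day * DAYS_PER_MISSION  # 270,000
```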
Data Model Options
Option A: Current Model (implemented)
Turn-centric hierarchy
Frames belong to Turns. Turn boundaries are determined by the phase angle crossing the 0°/360° wrap point.
+ Natural for "aggregate by turn" analysis
+ Turn number available at query time
− Requires pre-processing to assign frames to turns
− Can't easily redefine turn boundaries later
Option B: Flat Frame Model (alternative)
Frames belong directly to Segment
Segment
└── Frame (with phase_angle, no turn FK)
└── GazePoint
Turn number computed at query time from phase angle.
+ Simpler import (no turn boundary detection)
+ Can redefine turn boundaries without re-import
− Turn aggregation requires window functions
− Heavier queries for turn-level stats
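Under Option B the turn number is derived on the fly from stored phase angles — in SQL this would be a window function over wrap events; in Python it reduces to a cumulative count (sketch under the same wrap-detection assumption as above):

```python
def turn_numbers(phase_angles, wrap_drop=180.0):
    """Assign a 0-based turn number to each frame at query time.

    Increments the turn counter at every phase-angle wrap
    (a drop larger than `wrap_drop` degrees between frames).
    """
    numbers, turn = [], 0
    for i, angle in enumerate(phase_angles):
        if i > 0 and phase_angles[i - 1] - angle > wrap_drop:
            turn += 1
        numbers.append(turn)
    return numbers
```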
Option C: Time-series Store (future)
Store frames as time-series data
Use TimescaleDB or similar for efficient time-range queries.
+ Optimized for "give me frames 1000-2000"
+ Better compression for large datasets
− Additional infrastructure
− Overkill for prototype scale
Option D: Denormalized for Viz (alternative)
Pre-compute flat table for visualization
VizData (materialized view or table)
──────────────────────────────────────────
segment_id | turn_number | phase_angle | timestamp | gaze_x | gaze_y | ntz_index
Keep normalized model for storage, denormalized for fast reads.
+ Single query for visualization
+ Best of both worlds
− Data duplication
− Refresh on updates
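The refresh step for Option D amounts to a join that flattens Turn → Frame → GazePoint into one row per gaze point. This sketch assumes a simple nested-dict input shape; column names follow the VizData layout above:

```python
def build_viz_rows(segment_id, turns):
    """Flatten a nested Turn -> Frame -> GazePoint structure into
    VizData-style rows (one dict per gaze point).

    `turns` is a list of {"turn_number", "frames"} dicts; each frame
    carries "phase_angle", "timestamp", "gaze_points"; each gaze
    point carries "x", "y", "ntz_index".
    """
    rows = []
    for turn in turns:
        for frame in turn["frames"]:
            for gp in frame["gaze_points"]:
                rows.append({
                    "segment_id": segment_id,
                    "turn_number": turn["turn_number"],
                    "phase_angle": frame["phase_angle"],
                    "timestamp": frame["timestamp"],
                    "gaze_x": gp["x"],
                    "gaze_y": gp["y"],
                    "ntz_index": gp["ntz_index"],
                })
    return rows
```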
NTZ Classification Options
Current: Fixed Pixel Bands
NTZ determined by gaze Y coordinate in video frame:
y > 700 → NTZ-0 (bottom of frame = close)
y > 500 → NTZ-1
y > 300 → NTZ-2
y ≤ 300 → NTZ-3+ (top of frame = far ahead)
+ Simple, fast
− Doesn't account for head tilt or terrain angle
− Arbitrary thresholds
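The band logic above, written out as a function — thresholds copied from the current scheme; a sketch, not the production classifier:

```python
def ntz_from_y(y):
    """Classify a gaze Y coordinate (pixels, origin at top of the
    video frame) into an NTZ index.

    Larger y = lower in the frame = terrain closer to the skier.
    """
    if y > 700:
        return 0   # NTZ-0: bottom of frame, close
    if y > 500:
        return 1   # NTZ-1
    if y > 300:
        return 2   # NTZ-2
    return 3       # NTZ-3+: top of frame, far ahead
```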
Alternative: Perspective-Corrected
Map video Y to estimated real-world distance, then classify.
Requires camera calibration and terrain model.
+ More accurate
− Needs additional data (camera params, terrain)
Alternative: PTL-Based
Detect actual mogul locations (PTLs) in video, classify gaze by nearest PTL.
Requires computer vision / drone mapping.
+ Ground truth accuracy
− Significant additional work
API Endpoints
| Endpoint | Purpose |
|----------|---------|
| /api/missions/ | List/detail missions |
| /api/segments/ | List segments (filter by type) |
| /api/conveyor/segment_data/?segment_id=X | Flat data for visualization |
| /api/conveyor/ntz_summary/?segment_id=X | NTZ distribution by phase bin |
Open Design Questions
- Should Turn be a first-class entity or computed at query time?
- How should NTZ bands be defined? (fixed pixels, perspective, PTL-based?)
- Should we store raw IMU or just derived phase angle?
- Multiple gaze points per frame — is this fixation vs saccade, or multi-eye?
- How to handle video sync if gaze/IMU clocks drift?