| segment_id | UUID | Primary key; globally unique identifier for each transcript segment |
| video_id | UUID | Unique YouTube video identifier |
| channel_id | UUID | Immutable Babbl Labs internal identifier for the channel |
| channel_uri | UUID | Immutable YouTube identifier for the channel |
| channel_custom_url | STRING | Mutable custom URL for channel |
| channel_name | STRING | Channel title from YouTube metadata |
| channel_description | STRING | Channel description from YouTube metadata |
| channel_locale | ENUM | Channel country code (ISO 3166-1 alpha-2) |
| channel_published_at | TIMESTAMP | Timestamp when channel was created |
| channel_coverage_initiated_at | TIMESTAMP | Timestamp when we initiated coverage |
| video_title | STRING | Video title from YouTube metadata |
| video_description | STRING | Video description from YouTube metadata |
| video_language | STRING | Video language code (ISO 639-1) |
| video_published_dt | TIMESTAMP | Timestamp when video was originally published |
| video_download_dt | TIMESTAMP | Timestamp when we downloaded the video |
| video_transcribed_at | TIMESTAMP | Timestamp when we transcribed the video |
| video_in_dataset_at | TIMESTAMP | Timestamp when video was included in dataset |
| model_transcription_tag | STRING | Identifier for transcription model used |
| segment_start | FLOAT | Starting point of segment in seconds from video start |
| segment_end | FLOAT | End point of segment in seconds from video start |
| segment_start_char | INT | Character index where segment starts in transcript |
| segment_end_char | INT | Character index where segment ends in transcript |
| segment_text | STRING | Complete verbatim transcript text for this segment |
| speaker_name | STRING | Name of the speaker (optional; may be null) |
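
As a rough illustration of how a consumer might represent and sanity-check one row, the sketch below models a subset of the columns above as a Python dataclass. The class name `Segment`, the helper `check_segment`, and the sample values are hypothetical and not part of the dataset; the checks assume only what the table states (times are seconds from video start, `speaker_name` may be null).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    """Hypothetical container for a subset of the columns in the table."""
    segment_id: str
    video_id: str
    segment_start: float       # seconds from video start
    segment_end: float         # seconds from video start
    segment_start_char: int    # character index in the transcript
    segment_end_char: int      # character index in the transcript
    segment_text: str
    speaker_name: Optional[str] = None  # nullable per the table

    def duration_seconds(self) -> float:
        """Length of the segment in seconds."""
        return self.segment_end - self.segment_start


def check_segment(seg: Segment) -> List[str]:
    """Return a list of problems found in the row (empty if none)."""
    problems = []
    if seg.segment_start < 0:
        problems.append("segment_start is negative")
    if seg.segment_end < seg.segment_start:
        problems.append("segment_end precedes segment_start")
    if seg.segment_end_char < seg.segment_start_char:
        problems.append("segment_end_char precedes segment_start_char")
    return problems


# Made-up sample row, for illustration only.
example = Segment(
    segment_id="00000000-0000-0000-0000-000000000001",
    video_id="00000000-0000-0000-0000-000000000002",
    segment_start=12.5,
    segment_end=17.0,
    segment_start_char=240,
    segment_end_char=258,
    segment_text="Hello and welcome.",
)
```

Calling `check_segment(example)` returns an empty list for this row, and `example.duration_seconds()` gives the segment length derived from `segment_start` and `segment_end`.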