The Core Dataset provides comprehensive YouTube Social Media Entity Intelligence, combining channel metadata, video content, transcription data, and entity recognition with sentiment analysis. This dataset enables deep analysis of how companies, brands, and financial instruments are discussed across YouTube’s financial media landscape.

Complete Data Dictionary
| Field | Type | Description |
|---|---|---|
| channel_id | UUID | Primary Key - Immutable Babbl Labs internal unique identifier for channel |
| channel_uri | UUID | YouTube’s immutable unique identifier for the channel |
| channel_custom_url | STRING | Mutable custom URL for channel |
| channel_name | STRING | Channel title from YouTube metadata |
| channel_description | STRING | Channel description from YouTube metadata |
| channel_locale | ENUM | Channel country code (ISO 3166-1 alpha-2) |
| channel_view_count | INT | Most recent channel total view count |
| channel_subscriber_count | INT | Most recent channel total subscriber count |
| channel_video_count | INT | Most recent channel total video count |
| channel_published_at | TIMESTAMP | Timestamp when channel was created |
| channel_coverage_initiated_at | TIMESTAMP | Timestamp when we initiated coverage |
| channel_last_updated_at | TIMESTAMP | Timestamp when we last polled YouTube metadata |
| video_id | UUID | Primary Key - Unique YouTube video identifier |
| video_view_count | INT | Most recent view count |
| video_like_count | INT | Most recent video like count |
| video_comment_count | INT | Most recent video comment count |
| video_published_dt | TIMESTAMP | Timestamp when video was originally published |
| video_transcribed_at | TIMESTAMP | Timestamp when we originally transcribed video |
| video_scored_at | TIMESTAMP | Timestamp when we originally processed video |
| video_last_updated_at | TIMESTAMP | Timestamp when we last polled YouTube metadata |
| video_title | STRING | Video title from YouTube metadata |
| video_description | STRING | Video description from YouTube metadata |
| video_language | STRING | Video language code (ISO 639-1) |
| model_transcription_tag | STRING | Identifier for transcription model used |
| model_scoring_tag | STRING | Identifier for scoring model used |
| segment_start | FLOAT | Starting point of segment in seconds |
| segment_end | FLOAT | End point of segment in seconds |
| segment_text | STRING | Transcribed text surrounding the entity mention (50 words before and after) |
| speaker_name | STRING | Name of speaker in this segment |
| speaker_associated_entity | UUID | Entity (company) the speaker is associated with, identified by FIGI |
| speaker_position | STRING | Known title of speaker |
| speaker_role_context | ENUM | Role of speaker (Host, Guest, ReferencedSource, Other) |
| entity_id | UUID | Immutable identifier of entity referenced |
| entity_symbol | STRING | Public company ticker symbol |
| entity_figi_id | UUID | Financial Instrument Global Identifier (FIGI) |
| entity_type | ENUM | Type of entity (ORG, PERSON, PRODUCT) |
| entity_name | STRING | Mapped entity name |
| entity_name_raw_anchor | STRING | Raw string of named entity detected |
| entity_sentiment_overt_buy_sell | ENUM | Overt buy/sell sentiment (POSITIVE, NEGATIVE, NONE_EXPRESSED, NULL) |
| entity_sentiment_generic | ENUM | General sentiment (POSITIVE, NEGATIVE, NONE_EXPRESSED, NEUTRAL) |
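
As an illustration of how these fields might be combined, the sketch below tallies `entity_sentiment_generic` labels per `entity_symbol` and derives a simple net sentiment score. It assumes the rows have already been loaded as Python dictionaries keyed by the field names above; the delivery format, the sample rows, the placeholder symbols `ABC` and `XYZ`, and the helper names `sentiment_counts` and `net_sentiment` are all hypothetical and not part of the dataset.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows. Field names follow the data dictionary;
# the values are invented purely for illustration.
rows = [
    {"entity_symbol": "ABC", "entity_type": "ORG",
     "entity_sentiment_generic": "POSITIVE", "speaker_role_context": "Host"},
    {"entity_symbol": "ABC", "entity_type": "ORG",
     "entity_sentiment_generic": "NEGATIVE", "speaker_role_context": "Guest"},
    {"entity_symbol": "ABC", "entity_type": "ORG",
     "entity_sentiment_generic": "POSITIVE", "speaker_role_context": "Host"},
    {"entity_symbol": "XYZ", "entity_type": "ORG",
     "entity_sentiment_generic": "NONE_EXPRESSED", "speaker_role_context": "Other"},
    # Entities without a ticker (for example a PERSON) are skipped below.
    {"entity_symbol": None, "entity_type": "PERSON",
     "entity_sentiment_generic": "NEUTRAL", "speaker_role_context": "Guest"},
]


def sentiment_counts(rows, field="entity_sentiment_generic"):
    """Count sentiment labels per ticker, skipping rows with no symbol."""
    counts = defaultdict(Counter)
    for row in rows:
        symbol = row.get("entity_symbol")
        if not symbol:
            continue
        counts[symbol][row.get(field)] += 1
    return counts


def net_sentiment(counter):
    """Return (positive - negative) / total mentions, a value in [-1, 1]."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return (counter["POSITIVE"] - counter["NEGATIVE"]) / total


counts = sentiment_counts(rows)
scores = {symbol: net_sentiment(c) for symbol, c in counts.items()}
print(scores)
```

Passing `field="entity_sentiment_overt_buy_sell"` would apply the same tally to the overt buy/sell labels. The scoring formula is only one possible choice; weighting by `video_view_count` or filtering on `speaker_role_context` are equally plausible variations.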