Field categories organize the YouTube Core Dataset’s 40+ fields into logical groupings for easier understanding and implementation.

Channel Fields

Channel-level metadata and statistics from YouTube.

| Field | Type | Description |
| --- | --- | --- |
| channel_id | UUID | Primary key: immutable Babbl Labs internal unique identifier |
| channel_uri | UUID | Immutable unique identifier that YouTube assigns to the channel |
| channel_custom_url | STRING | Mutable custom URL for the channel |
| channel_name | STRING | Channel title from YouTube metadata |
| channel_description | STRING | Channel description from YouTube metadata |
| channel_locale | ENUM | Channel country (ISO 3166-1 alpha-2) |
| channel_view_count | INT | Most recent total view count for the channel |
| channel_subscriber_count | INT | Most recent total subscriber count for the channel |
| channel_video_count | INT | Most recent total video count for the channel |
| channel_published_at | TIMESTAMP | Timestamp when the channel was created |
| channel_coverage_initiated_at | TIMESTAMP | Timestamp when we initiated coverage |
| channel_last_updated_at | TIMESTAMP | Timestamp when we last polled YouTube metadata |

Video Fields

Video-level metadata and engagement metrics from YouTube.

| Field | Type | Description |
| --- | --- | --- |
| video_id | UUID | Primary key: unique YouTube video identifier |
| video_view_count | INT | Most recent view count recorded |
| video_like_count | INT | Most recent like count recorded |
| video_comment_count | INT | Most recent comment count recorded |
| video_published_dt | TIMESTAMP | Timestamp when the video was originally published |
| video_transcribed_at | TIMESTAMP | Timestamp when we originally transcribed the video |
| video_scored_at | TIMESTAMP | Timestamp when we originally processed the video |
| video_last_updated_at | TIMESTAMP | Timestamp when we last polled YouTube metadata |
| video_title | STRING | Video title from YouTube metadata |
| video_description | STRING | Video description from YouTube metadata |
| video_language | STRING | Video language code (ISO 639-1) |

Processing Fields

Model and processing metadata for transcription and analysis.

| Field | Type | Description |
| --- | --- | --- |
| model_transcription_tag | STRING | Identifier for the transcription model used |
| model_scoring_tag | STRING | Identifier for the scoring model used |

Segment Fields

Transcript segment timing and content information.

| Field | Type | Description |
| --- | --- | --- |
| segment_start | FLOAT | Start of the segment, in seconds from the start of the video |
| segment_end | FLOAT | End of the segment, in seconds from the start of the video |
| segment_text | STRING | Transcribed text (50 words before and after the entity mention) |
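
Because segment_start and segment_end are both offsets in seconds from the start of the video, a segment's length and a human-readable position can be derived directly from them. The following is a minimal sketch; the helper names and the sample row are hypothetical and not part of the dataset.

```python
def segment_duration(segment_start: float, segment_end: float) -> float:
    """Return the segment length in seconds."""
    if segment_end < segment_start:
        raise ValueError("segment_end precedes segment_start")
    return segment_end - segment_start


def format_offset(seconds: float) -> str:
    """Render an offset in seconds as H:MM:SS (fractional part truncated)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Hypothetical row; the values are illustrative only.
row = {
    "segment_start": 168.7303438,
    "segment_end": 191.25,
    "segment_text": "example transcript text",
}

print(format_offset(row["segment_start"]))
print(segment_duration(row["segment_start"], row["segment_end"]))
```

Truncating rather than rounding keeps the formatted offset at or before the actual start of the segment.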

Speaker Fields

Speaker identification and context information.

| Field | Type | Description |
| --- | --- | --- |
| speaker_name | STRING | Name of the speaker in this segment |
| speaker_associated_entity | UUID | Entity (company) the speaker is associated with, identified by FIGI |
| speaker_position | STRING | Known title of the speaker in this segment |
| speaker_role_context | ENUM | Role of the speaker (Host, Guest, ReferencedSource, Other) |
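
The four speaker_role_context values can be modeled as an enumeration so that unexpected values fail loudly on ingest. This sketch assumes the values are stored as strings spelled exactly as listed above; the class and function names are hypothetical.

```python
from enum import Enum


class SpeakerRoleContext(Enum):
    """Roles listed for speaker_role_context (string spelling is assumed)."""

    HOST = "Host"
    GUEST = "Guest"
    REFERENCED_SOURCE = "ReferencedSource"
    OTHER = "Other"


def parse_role(value: str) -> SpeakerRoleContext:
    """Convert a raw field value to the enum; raises ValueError if unknown."""
    return SpeakerRoleContext(value)


print(parse_role("Guest"))
```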

Entity Fields

Named entity recognition and financial instrument mapping.

| Field | Type | Description |
| --- | --- | --- |
| entity_id | UUID | Immutable identifier of the entity referenced |
| entity_symbol | STRING | Public company ticker symbol (ORG entities only) |
| entity_figi_id | UUID | Financial Instrument Global Identifier (FIGI) |
| entity_type | ENUM | Type of entity (ORG, PERSON, PRODUCT) |
| entity_name | STRING | Mapped entity name derived from the raw anchor |
| entity_name_raw_anchor | STRING | Raw string of the named entity as detected |
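
Since entity_symbol is populated only for ORG entities, a common first step is to keep ORG mentions that carry a ticker and aggregate them by FIGI. The sketch below assumes rows arrive as dictionaries keyed by field name and that a missing symbol is represented as None; the function name and sample values are hypothetical placeholders, not real identifiers.

```python
from collections import Counter


def count_org_mentions(rows):
    """Count mentions per entity_figi_id for ORG entities with a ticker."""
    counts = Counter()
    for row in rows:
        if row.get("entity_type") == "ORG" and row.get("entity_symbol"):
            counts[row["entity_figi_id"]] += 1
    return counts


# Placeholder rows; the symbols and identifiers are made up for illustration.
sample_rows = [
    {"entity_type": "ORG", "entity_symbol": "AAA", "entity_figi_id": "figi-a"},
    {"entity_type": "ORG", "entity_symbol": "AAA", "entity_figi_id": "figi-a"},
    {"entity_type": "PERSON", "entity_symbol": None, "entity_figi_id": None},
    {"entity_type": "ORG", "entity_symbol": None, "entity_figi_id": None},
]

print(count_org_mentions(sample_rows))
```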

Sentiment Fields

Multi-layered sentiment analysis for entity mentions.

| Field | Type | Description |
| --- | --- | --- |
| entity_sentiment_overt_buy_sell | ENUM | Overt buy/sell sentiment (POSITIVE, NEGATIVE, NONE_EXPRESSED, NULL) |
| entity_sentiment_generic | ENUM | General sentiment (POSITIVE, NEGATIVE, NONE_EXPRESSED, NEUTRAL) |
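
For aggregation it can help to map the sentiment labels onto numbers. The mapping below is an assumption made for illustration, not a scoring defined by the dataset: NEUTRAL and NONE_EXPRESSED are both treated as 0, and NULL (whether it arrives as the string "NULL" or as a missing value) is treated as absent.

```python
from typing import Optional

# Assumed numeric mapping; adjust to suit the analysis at hand.
SCORES = {"POSITIVE": 1, "NEGATIVE": -1, "NEUTRAL": 0, "NONE_EXPRESSED": 0}


def sentiment_score(value: Optional[str]) -> Optional[int]:
    """Map a sentiment label to a number, or None when no value is present."""
    if value is None or value == "NULL":
        return None
    if value not in SCORES:
        raise ValueError(f"unexpected sentiment value: {value!r}")
    return SCORES[value]


labels = ["POSITIVE", "NEGATIVE", "NONE_EXPRESSED", "NULL"]
scores = [s for s in map(sentiment_score, labels) if s is not None]
print(sum(scores) / len(scores))
```

Keeping NULL as None rather than 0 prevents absent values from diluting an average.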

Data Types & Constraints

| Type | Format | Example |
| --- | --- | --- |
| UUID | 36-character identifier | 550e8400-e29b-41d4-a716-446655440000 |
| STRING | Variable-length text | "Bloomberg Technology" |
| INT | Integer number | 12906 |
| FLOAT | Floating-point number | 168.7303438 |
| TIMESTAMP | UTC timestamp | 1747251178 |
| ENUM | Predefined values | POSITIVE, NEGATIVE, NEUTRAL |

All timestamp fields are in UTC. Geographic codes follow ISO 3166-1 alpha-2, and language codes follow ISO 639-1.
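
As a rough illustration of working with these types, the sketch below converts a TIMESTAMP value to a timezone-aware datetime and checks the UUID format. It assumes TIMESTAMP values are Unix epoch seconds, as the example value 1747251178 suggests; the function names are hypothetical.

```python
import uuid
from datetime import datetime, timezone


def parse_timestamp(value: int) -> datetime:
    """Interpret a TIMESTAMP as Unix epoch seconds in UTC (an assumption)."""
    return datetime.fromtimestamp(value, tz=timezone.utc)


def is_uuid(value: str) -> bool:
    """Return True for a 36-character hyphenated UUID string."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return len(value) == 36


print(parse_timestamp(1747251178).isoformat())
print(is_uuid("550e8400-e29b-41d4-a716-446655440000"))
```

Passing tz=timezone.utc explicitly avoids the local-time conversion that datetime.fromtimestamp applies by default.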