The YouTube Core Dataset provides comprehensive YouTube Social Media Entity Intelligence, combining channel metadata, video content, transcription data, and entity recognition with sentiment analysis. This dataset enables deep analysis of how companies, brands, and financial instruments are discussed across YouTube’s financial media landscape.

Complete Data Dictionary

| Field | Type | Description |
| --- | --- | --- |
| channel_id | UUID | Primary Key - Immutable Babbl Labs internal unique identifier for channel |
| channel_uri | UUID | Immutable YouTube’s unique identifier for channel |
| channel_custom_url | STRING | Mutable custom URL for channel |
| channel_name | STRING | Channel title from YouTube metadata |
| channel_description | STRING | Channel description from YouTube metadata |
| channel_locale | ENUM | Channel geographic country (ISO 3166-1 alpha-2) |
| channel_view_count | INT | Most recent channel total view count |
| channel_subscriber_count | INT | Most recent channel total subscriber count |
| channel_video_count | INT | Most recent channel total video count |
| channel_published_at | TIMESTAMP | Timestamp when channel was created |
| channel_coverage_initiated_at | TIMESTAMP | Timestamp when we initiated coverage |
| channel_last_updated_at | TIMESTAMP | Timestamp when we last polled YouTube metadata |
| video_id | UUID | Primary Key - Unique YouTube video identifier |
| video_view_count | INT | Most recent view count |
| video_like_count | INT | Most recent video like count |
| video_comment_count | INT | Most recent video comment count |
| video_published_dt | TIMESTAMP | Timestamp when video was originally published |
| video_transcribed_at | TIMESTAMP | Timestamp when we originally transcribed video |
| video_scored_at | TIMESTAMP | Timestamp when we originally processed video |
| video_last_updated_at | TIMESTAMP | Timestamp when we last polled YouTube metadata |
| video_title | STRING | Video title from YouTube metadata |
| video_description | STRING | Video description from YouTube metadata |
| video_language | STRING | Video language code (ISO 639-1) |
| model_transcription_tag | STRING | Identifier for transcription model used |
| model_scoring_tag | STRING | Identifier for scoring model used |
| segment_start | FLOAT | Starting point of segment in seconds |
| segment_end | FLOAT | End point of segment in seconds |
| segment_text | STRING | Transcribed text (50 words before/after entity) |
| speaker_name | STRING | Name of speaker in this segment |
| speaker_associated_entity | UUID | Entity (company) speaker is associated with (FIGI) |
| speaker_position | STRING | Known title of speaker |
| speaker_role_context | ENUM | Role of speaker (Host, Guest, ReferencedSource, Other) |
| entity_id | UUID | Immutable identifier of entity referenced |
| entity_symbol | STRING | Public company ticker symbol |
| entity_figi_id | UUID | Financial Instrument Global Identifier (FIGI) |
| entity_type | ENUM | Type of entity (ORG, PERSON, PRODUCT) |
| entity_name | STRING | Mapped entity name |
| entity_name_raw_anchor | STRING | Raw string of named entity detected |
| entity_sentiment_overt_buy_sell | ENUM | Overt buy/sell sentiment (POSITIVE, NEGATIVE, NONE_EXPRESSED, NULL) |
| entity_sentiment_generic | ENUM | General sentiment (POSITIVE, NEGATIVE, NONE_EXPRESSED, NEUTRAL) |
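
As a rough illustration of how the entity and sentiment fields might be used together, the sketch below validates `entity_sentiment_generic` against the values listed in the dictionary and computes a simple per-ticker score. Several things here are assumptions rather than part of the dataset: that each record is available as a Python dict keyed by field name, the helper name `net_sentiment_by_symbol`, the "net sentiment" formula itself, and the sample rows with placeholder tickers `AAA` and `BBB`.

```python
from collections import Counter, defaultdict

# Allowed values copied from the data dictionary entry for
# entity_sentiment_generic.
GENERIC_SENTIMENT = {"POSITIVE", "NEGATIVE", "NONE_EXPRESSED", "NEUTRAL"}


def net_sentiment_by_symbol(rows):
    """Return {entity_symbol: (positive - negative) / mentions}.

    This score is an illustrative metric chosen for the example; it is
    not defined by the dataset.
    """
    counts = defaultdict(Counter)
    for row in rows:
        symbol = row.get("entity_symbol")
        if not symbol:
            # Skip entities with no ticker symbol in the record.
            continue
        sentiment = row.get("entity_sentiment_generic")
        if sentiment not in GENERIC_SENTIMENT:
            raise ValueError(f"unexpected sentiment value: {sentiment!r}")
        counts[symbol][sentiment] += 1

    scores = {}
    for symbol, counter in counts.items():
        mentions = sum(counter.values())
        scores[symbol] = (counter["POSITIVE"] - counter["NEGATIVE"]) / mentions
    return scores


# Hypothetical sample rows; tickers and values are invented placeholders.
sample_rows = [
    {"entity_symbol": "AAA", "entity_sentiment_generic": "POSITIVE"},
    {"entity_symbol": "AAA", "entity_sentiment_generic": "NEGATIVE"},
    {"entity_symbol": "AAA", "entity_sentiment_generic": "POSITIVE"},
    {"entity_symbol": "BBB", "entity_sentiment_generic": "NEUTRAL"},
    {"entity_symbol": None, "entity_sentiment_generic": "POSITIVE"},
]

scores = net_sentiment_by_symbol(sample_rows)
print(scores)
```

Counting per symbol before dividing keeps the validation and the scoring separate, so a different metric (for example one based on `entity_sentiment_overt_buy_sell`) could be swapped in without touching the validation step.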