Dataset Store

Broadcast Video / Audio

Broadcasting company's video and audio dataset

TAG

Broadcast Data

Multimodal

Large-scale Audio

Format

• Video (MP4, MOV)
• Audio (WAV, MP3)
• Script (JSON, TXT), etc.

Volume

1 million+ hours

Language Offered

Korean, English (other languages available upon request, e.g., Malay, Indonesian)

Features

• Includes diverse genres of data from major Korean broadcasters, such as news, entertainment, drama, educational programs, and radio

• Text data aligned with video and audio, including subtitles and scripts, can be provided upon consultation

• Additional data from international broadcasters can be arranged through further collaboration

Application Fields

Multimodal Model Development

Integrated video, audio, and script data supports multimodal AI development, including joint audiovisual understanding and recognition of context and emotion.

Audio Model Development

Over 1 million hours of audio data support the development of high-performance audio models (ASR, speaker separation, TTS), including models tuned to the characteristics of individual genres.

Contextual Awareness Boost

Training on realistic, complex dialogue from news and dramas helps models move toward human-level contextual understanding, including an accurate grasp of intent and background knowledge.

The dataset is also applicable to a range of other use cases.