Dataset Store

Book

Expert responses in various fields, including law, lifestyle, finance, and health (over 2.3 million counseling conversations)

Books available for AI training
Book
E-book
Printed book
Korean book

Format
• Original e-books (EPUB, PDF)
• Scanned physical books
• Customized structured data (e.g., JSON)
Volume
250,000+ titles
Language Offered
Korean (other languages available upon request)

Purchase Procedure

Category Selection

• Confirm required book categories and conditions (e.g., engineering textbooks, economics, liberal arts, foreign language study books)

• Review the need for data cleansing

Book List Selection

• Provide book lists based on requested categories and conditions (licensing terms vary by title)

• Select books for purchase from the provided list

Format Agreement

• Original files must be destroyed after extraction and cleansing, within the agreed timeframe

• If structured data is requested, additional cleansing costs will be quoted separately

Additional Processing / Refinement

• Discuss detailed processing and refinement criteria (text, images, tables, footnotes, etc.)

Category and Condition Examples

Category

Humanities

Literature

Religion

History

Biography

Society

Science

Computer · Internet

Linguistics

Economy · Business

Dictionaries

Education

Foreign Books

Travel · Maps

Hobbies · Leisure

Family · Health · Lifestyle

Arts · Popular Culture

Self-published Works

Adult

Textbooks

Condition Setting 1

Publication Date: Books published within the last year

Target Audience: No restrictions

Original Format: Prioritize EPUB format for easier text extraction

Others: Maximize text acquisition within budget

Condition Setting 2

Publication Date: No restrictions

Target Audience: Professional books such as university textbooks

Original Format: No restrictions (prefer higher text quality)

Others:
• Secure as many diverse topics as possible within the designated category
• If multiple editions exist, retain only the latest edition and remove duplicates
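
The edition rule in Condition Setting 2 can be sketched in a few lines of Python. This is only an illustration, not the vendor's actual pipeline: the `Book` record, its field names, and the sample titles are hypothetical, and it assumes editions are numbered and that a normalized (title, author) pair identifies a work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    # Hypothetical catalog record; the real book list schema is not specified.
    title: str
    author: str
    edition: int
    file_format: str  # e.g. "EPUB" or "PDF"


def keep_latest_editions(books):
    """Keep one record per (title, author), choosing the highest edition."""
    latest = {}
    for book in books:
        # Assumption: case and surrounding whitespace do not distinguish works.
        key = (book.title.strip().lower(), book.author.strip().lower())
        current = latest.get(key)
        if current is None or book.edition > current.edition:
            latest[key] = book
    return list(latest.values())


# Placeholder data for illustration only.
catalog = [
    Book("Intro to Circuits", "Kim", 2, "PDF"),
    Book("Intro to Circuits", "Kim", 3, "EPUB"),
    Book("Macroeconomics", "Lee", 1, "EPUB"),
]
selected = keep_latest_editions(catalog)
```

In practice, matching editions of the same work usually needs more than a title and author comparison (for example ISBN families or publisher metadata), so the key function is the part most likely to change.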

Features

• Wide coverage of books distributed in Korea
• Curation based on target readership (textbooks, practical guides, professional publications, etc.) and publication date
• Book list provided (licensing conditions may vary by title)
• High-quality data cleansing using in-house solutions (pricing negotiable)
• Additional sourcing available for academic papers, overseas books, etc., based on needs

Application Fields

Improve Generalization

Train models on Q&A collected from diverse fields and numerous users. This improves the model's ability to generalize, so it responds reliably to new subjects or unfamiliar question formats.

LLM Performance Improvement

The data has a clear and simple Q&A pair structure, so it can be used directly in training techniques such as few-shot learning or instruction tuning without complex preprocessing.
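
As a rough illustration of why a plain Q&A pair needs little preprocessing, the sketch below maps pairs onto a common instruction-tuning record layout and serializes them as JSON Lines. The key names (`instruction`, `input`, `output`, `field`) and the sample pair are assumptions made for this example; the dataset's actual delivery schema is not described here.

```python
import json


def to_instruction_record(question, answer, field=None):
    """Map one Q&A pair onto an instruction-tuning style record."""
    record = {
        "instruction": question.strip(),
        "input": "",  # no extra context in a plain Q&A pair
        "output": answer.strip(),
    }
    if field:
        record["field"] = field  # hypothetical metadata tag, e.g. "law"
    return record


def to_jsonl(pairs):
    """Serialize Q&A pairs as JSON Lines, one record per line."""
    # ensure_ascii=False keeps Korean text readable in the output file.
    return "\n".join(
        json.dumps(to_instruction_record(**pair), ensure_ascii=False)
        for pair in pairs
    )


# Placeholder pair for illustration only.
pairs = [
    {
        "question": " What should a lease contract include? ",
        "answer": "Names of the parties, the term, and the deposit. ",
        "field": "law",
    }
]
jsonl_text = to_jsonl(pairs)
```

The same records can be reused as few-shot examples by concatenating a handful of instruction and output fields ahead of a new question.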

Training of NLP Models

The data reflects real-world Q&A patterns, so it trains a model for both Natural Language Understanding (NLU) and Natural Language Generation (NLG) at once and improves the accuracy of its responses.

Applicable to diverse other use cases.