Dataset Store

Expert Q&A

A Korean Q&A dataset spanning domains such as law, finance, and medicine. It contains over 3.8 million high-quality answer sets, each pairing a question from the general public with a verified expert answer.

Expert Q&A Dataset across diverse categories

TAG

NLP
Question - Answer
QA
Format
JSON (additional structuring negotiable)
Volume
3.8 million entries+
MOQ
10,000 entries+ (per category)

*All text is in Korean.

Spec

(Based on Answer Volume*)

Category / Volume (sets)
Law / 140K
Medicine / 920K
Pharma / 340K
Biz · Finance / 320K
Insurance / 200K
Science / 380K
HR · Labor / 430K
Tax · Accounting / 180K
Real Estate / 150K
Parenting · Childcare / 180K

Data covering additional fields is also available, including Humanities · Arts, Dentistry, Traditional Korean Medicine, Trade, Financial Planning, Traffic Accidents, Nutrition · Diet, and Pets.

* Counting Criterion
Pricing depends on which of the two counting methods below is selected.
1) Answer-Based Volume: If a single question has multiple answers, each answer is counted as a separate set, so the same question may appear more than once in the dataset (Q:A = 1:N).
2) Question-Based Volume: Even if a single question has multiple answers, internal logic selects the best answer, resulting in a 1:1 match (Q:A = 1:1).
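
The difference between the two counting methods can be illustrated with a minimal Python sketch. The record fields (`question`, `answer`, `score`) are hypothetical and not the dataset's published schema, and the selection logic is internal and not described here, so picking the highest `score` is only a stand-in for it.

```python
# Hypothetical records: field names and values are illustrative assumptions,
# not the dataset's actual schema or contents.
records = [
    {"question": "Q1", "answer": "A1a", "score": 0.9},
    {"question": "Q1", "answer": "A1b", "score": 0.7},
    {"question": "Q2", "answer": "A2a", "score": 0.8},
]


def answer_based(records):
    """Q:A = 1:N. Every answer is one set, so a question may repeat."""
    return list(records)


def question_based(records):
    """Q:A = 1:1. Keep exactly one answer per question.

    The real best-answer logic is unspecified; choosing the highest
    hypothetical `score` is an assumption made for this sketch.
    """
    best = {}
    for rec in records:
        current = best.get(rec["question"])
        if current is None or rec["score"] > current["score"]:
            best[rec["question"]] = rec
    return list(best.values())


print(len(answer_based(records)))    # 3 sets under answer-based counting
print(len(question_based(records)))  # 2 sets under question-based counting
```

In this toy example the same three answers count as three sets under the answer-based method and two sets under the question-based method, which is why the chosen method affects volume and price.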

Features

• Covers various domains such as law, healthcare, insurance, science, economy, and arts
• Real user questions collected from actual platforms, answered by verified experts in each category
• Pricing varies by category; discounts are available for larger volumes
• Flexible Q&A pairing (1:N or 1:1 matching available based on needs)

Examples

Law
User Questions

Can I sell real estate without a Certificate of Title (Deed)? I am in the process of selling real estate and heard that a Certificate of Title is required among the necessary documents. After searching my home, I found it is missing, likely lost. Does this mean I cannot sell the property without the Certificate of Title…

Expert Answers

The Certificate of Title (Deed) under the former Real Estate Registration Act has been replaced by the current Registration Completion Information System. Even if the Certificate of Title (Deed) issued under the former law is lost or destroyed, it is absolutely never reissued. However, if you have lost…

Tax · Accounting
User Questions

If I have two businesses, is the progressive tax rate applied separately when reporting income? Hello. I plan to run two businesses: a mail-order business and a registered rental business. If the mail-order business and the rental business each generate 10 million KRW in income, when I file income tax, will the mail-order business’s 10 million KRW…

Expert Answers

Unlike corporations, business owners (sole proprietors) are considered the same individual even when operating multiple businesses concurrently, so their income will be aggregated for taxation. In other words, the income will not be taxed separately by 10 million KRW each; rather, the progressive …

Application Fields

Improve Generalization

Train models on Q&A collected from diverse fields and many users. This improves the model's ability to generalize, so it responds reliably to new subjects and unfamiliar question formats.

LLM Performance Improvement

The data has a clear, simple Q&A pair structure, so it can be used directly for techniques such as few-shot learning or instruction tuning without complex preprocessing.
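
As a rough illustration of how a Q&A pair might be reshaped for these techniques, the sketch below builds an instruction-tuning record and a few-shot prompt. The function names, the `instruction`/`input`/`output` keys, and the prompt wording are assumptions chosen for this example, not a format defined by the dataset.

```python
import json


def to_instruction_record(question, answer, category):
    """Wrap one Q&A pair in a common instruction-tuning layout.

    The key names and instruction wording are illustrative assumptions.
    """
    return {
        "instruction": f"Answer the following {category} question as an expert.",
        "input": question,
        "output": answer,
    }


def to_few_shot_prompt(examples, new_question):
    """Join (question, answer) example pairs and end with an open question."""
    parts = [f"Q: {q}\nA: {a}" for q, a in examples]
    parts.append(f"Q: {new_question}\nA:")
    return "\n\n".join(parts)


record = to_instruction_record("Sample question", "Sample answer", "law")
# ensure_ascii=False keeps Korean text readable when serialized to JSON.
print(json.dumps(record, ensure_ascii=False))
print(to_few_shot_prompt([("Sample question", "Sample answer")], "New question"))
```

Because each entry is already a question paired with an answer, both conversions are simple field mappings; any system prompt or chat template would depend on the model being trained.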

Training of NLP Models

The data reflects real-world Q&A patterns. Training on it exercises both Natural Language Understanding (NLU) and Natural Language Generation (NLG), improving the model's ability to communicate accurately.

Evaluation Data

Use the data to evaluate errors in model responses. It supports periodic checks and refinement of key AI model performance metrics such as factual accuracy, response consistency, and logical soundness.
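
One simple way to compare a model response against an expert answer is an overlap score. The sketch below computes a character-level F1, used here as an assumption because Korean text does not always split cleanly on whitespace. It is only a crude proxy: overlap does not measure factual accuracy or logic, which typically need human or model-based review.

```python
from collections import Counter


def char_f1(prediction, reference):
    """Character-level F1 between a model response and a reference answer.

    Whitespace is ignored. This is an illustrative metric chosen for the
    sketch, not an evaluation method prescribed by the dataset.
    """
    pred = [c for c in prediction if not c.isspace()]
    ref = [c for c in reference if not c.isspace()]
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared character at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


print(char_f1("abc", "abc"))  # 1.0
print(char_f1("abc", "xyz"))  # 0.0
```

Tracking the average score over a fixed evaluation subset across model versions gives a quick regression signal, which can then be followed by closer review of low-scoring responses.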

Expert Chatbot Development

Enables intensive learning of specialized industry language (e.g., medicine, finance). Effective for building domain-specific chatbots and expert-level QA systems when combined with professional literature.

Expert Information Retrieval

The Q&A data consists of verified expert answers. It can be used directly to build AI systems that give users highly reliable information together with objective reasoning.

Also applicable to a wide range of other use cases.