Software as a Service

Datumo Scope is a visualization-based dataset analysis solution.

Implement Data-centric AI by visually analyzing the coverage and edge cases of your training data.

Datumo Scope

As the first feature space-based dataset analysis solution in Korea, Datumo Scope supports the whole dataset workflow, from analysis through planning to selection. Improve your data pipeline and quickly achieve AI performance goals with Datumo Scope.

Datumo Scope - Visualization

Data distribution visualization

Datumo Scope provides a two-dimensional graph that places similar data close together and dissimilar data far apart, so you can quickly understand the coverage of the entire dataset through simple controls and a highly readable UI.

*A feature space is the space formed by compressing each feature vector, which encodes information about a data item, into a multi-dimensional point.

Datumo Scope automatically generates feature vectors or visualizes feature spaces using existing feature vectors.
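As an illustration of how existing feature vectors can be turned into a two-dimensional layout, here is a minimal sketch. It uses principal component analysis (PCA) computed by power iteration, which is a stand-in technique chosen for brevity and is not necessarily the method Datumo Scope uses. The function names and the sample vectors are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def center(vectors):
    """Subtract the per-dimension mean so the data is centered at the origin."""
    n, dim = len(vectors), len(vectors[0])
    means = [sum(v[i] for v in vectors) / n for i in range(dim)]
    return [[v[i] - means[i] for i in range(dim)] for v in vectors]

def leading_direction(rows, iterations=100, seed=0):
    """Approximate the direction of greatest variance by power iteration."""
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in rows[0]]
    for _ in range(iterations):
        # Apply X^T X to w without building the covariance matrix.
        scores = [dot(r, w) for r in rows]
        w = [sum(s * r[i] for s, r in zip(scores, rows)) for i in range(len(w))]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break  # no variance left in this data
        w = [x / norm for x in w]
    return w

def project_to_2d(vectors):
    """Map each feature vector to an (x, y) point on the two main directions."""
    rows = center(vectors)
    first = leading_direction(rows, seed=0)
    # Remove the first direction so the second search finds a different one.
    deflated = [[x - dot(r, first) * f for x, f in zip(r, first)] for r in rows]
    second = leading_direction(deflated, seed=1)
    return [(dot(r, first), dot(r, second)) for r in rows]

# Made-up 4-dimensional feature vectors forming two groups.
feature_vectors = [
    [0.0, 0.1, 0.0, 0.2],
    [0.1, 0.0, 0.1, 0.1],
    [5.0, 5.1, 4.9, 5.0],
    [5.1, 5.0, 5.0, 4.9],
]
points = project_to_2d(feature_vectors)
```

In the resulting points, items from the same group land close together and items from different groups land far apart, which is the property the graph relies on.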

Datumo Scope - Meta Data & Model Metric

Reflect metadata and model metrics

Datumo Scope provides a data distribution graph that reflects metadata and model metric information. You can filter data in various ways by querying information such as data collection environments and model performance indicators.

*Data collected under different weather and time conditions is separated and colored according to model performance metrics, making it easy to quickly identify edge cases.
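A query that combines metadata with a model metric can be sketched as follows. The field names (`weather`, `time`, `accuracy`), the function name, and the sample records are all hypothetical and only illustrate the idea; they do not reflect Datumo Scope's actual query interface.

```python
def filter_records(records, metric, below, **metadata):
    """Return records that match every metadata value and score below a threshold."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in metadata.items())
        and r[metric] < below
    ]

# Made-up records: collection conditions plus a per-item model metric.
records = [
    {"id": "img_001", "weather": "clear", "time": "day", "accuracy": 0.94},
    {"id": "img_002", "weather": "rain", "time": "night", "accuracy": 0.41},
    {"id": "img_003", "weather": "rain", "time": "day", "accuracy": 0.78},
    {"id": "img_004", "weather": "snow", "time": "night", "accuracy": 0.52},
]

# Candidate edge cases: night-time items on which the model performs poorly.
edge_cases = filter_records(records, metric="accuracy", below=0.6, time="night")
edge_case_ids = [r["id"] for r in edge_cases]
```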

Data Analysis

Use feature vectors to select data evenly and to search for similar data.

Analyze your dataset by combining multiple features.

Datumo Scope - Curation

Data curation

Datumo Scope automatically selects data while minimizing the loss of dataset coverage. You can freely set the number or ratio of data items to select, as well as the selection algorithm. By applying the curation function locally, you can create a more precise dataset.

*The curation function is used in various ways during the machine learning (ML) lifecycle, such as quickly analyzing the entire dataset coverage or classifying the dataset according to its purpose (Train/Test set split).
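One well-known way to pick a subset while keeping coverage is farthest-point sampling, which repeatedly adds the item that is farthest from everything already chosen. The sketch below is an assumption about what such a selection algorithm could look like, not a description of the algorithms Datumo Scope offers. The function names and the sample points are hypothetical.

```python
import math

def farthest_point_sampling(points, k, start=0):
    """Pick k indices so that the chosen points spread across the whole set."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    selected = [start]
    # Distance from every point to its nearest selected point so far.
    nearest = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        candidate = max(range(len(points)), key=nearest.__getitem__)
        selected.append(candidate)
        nearest = [
            min(d, math.dist(p, points[candidate]))
            for d, p in zip(nearest, points)
        ]
    return selected

def curate_by_ratio(points, ratio):
    """Select a fraction of the dataset instead of a fixed count."""
    k = max(1, round(ratio * len(points)))
    return farthest_point_sampling(points, k)

# Made-up 2D feature-space points: two dense clusters and one isolated point.
points = [(0, 0), (0.1, 0), (0, 0.1), (10, 10), (10.1, 10), (5, 5)]
chosen = farthest_point_sampling(points, k=3)
half = curate_by_ratio(points, ratio=0.5)
```

With three picks, the selection takes one point from each cluster plus the isolated point, rather than three near-duplicates from a single cluster.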

Datumo Scope - Similar Data Search

Similar data search

Automatically searches for data similar to the specified data. You can repeatedly search and edit the dataset within the specified range to configure the dataset as desired.

*You can select auxiliary data to adjust the search results to reflect your insights. You can also manually search for similar data by clicking on the surrounding points, but using the search function allows for more precise and efficient work.
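A similar data search over feature vectors can be sketched as a nearest-neighbor lookup. Cosine similarity is used here as an assumed similarity measure; the source does not say which measure Datumo Scope applies. The function names and the sample vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two vectors by angle: 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(vectors, query_index, top_k=3):
    """Return indices of the top_k vectors most similar to the query item."""
    query = vectors[query_index]
    scored = [
        (cosine_similarity(query, v), i)
        for i, v in enumerate(vectors)
        if i != query_index
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:top_k]]

# Made-up feature vectors; item 0 is the specified (query) item.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0], [0.7, 0.7]]
neighbors = most_similar(vectors, query_index=0, top_k=2)
```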

Data Management

Manage data efficiently to reduce wasted time, and augment it as needed for model training.

Data Versioning

Version the guidelines as they evolve during data collection and labeling, and manage the data through a clear visual interface.

Data Augmentation

Easily augment the data that feature space analysis shows to be needed.

Model Training

Model Training

Choose a model available through an API or upload your own model to quickly train on the selected data.

Model Analysis

View the performance results of the trained model in a dashboard.

Feature Vector Upload
Upload existing feature vectors to create a customized feature space. Check various versions of feature spaces and continue analyzing the dataset.
Cloud Integration
The initial data information collected for feature vector generation is stored in a separate database and automatically discarded.
On-Premise
To address security, performance, and compliance requirements, Datumo Scope can be deployed in an on-premises environment.

Other Use Cases

Gain insight and improve AI performance through Datumo Scope.

Everything starts from training data

Identify dataset biases, undersampling, and edge cases through Datumo Scope and reduce labeling time and cost.

Top-tier visualization
Transform source data into multi-dimensional vectors, then reduce them to two dimensions using state-of-the-art (SOTA) models. Check dataset coverage and bias with your own eyes.
Labeling cost optimization
Find the data most worth labeling by using metadata and model metric queries, similar data search, and clearly defined data collection or processing targets, and build a training dataset quickly and inexpensively.