---
title: Data Cleaning Agent
emoji: 🧹
colorFrom: blue
colorTo: green
sdk: docker
app_file: inference.py
pinned: true
---
This project implements a real-world OpenEnv environment simulating the task of a data analyst cleaning unorganised datasets.
Data cleaning is one of the most time-consuming and important steps in real-world data workflows. This environment allows AI agents to learn and be evaluated on:
- Handling missing values
- Standardizing inconsistent formats
- Converting incorrect data types
- Removing duplicates
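The four operations above can be sketched with pandas. This is an illustrative example of what each operation does, not the environment's actual implementation:

```python
import pandas as pd

# Toy dataset with a missing value, inconsistent casing, a string-typed
# numeric column, and a duplicate row.
df = pd.DataFrame({
    "name": ["JOHN DOE", "jane roe", None, "JOHN DOE"],
    "age": ["30", "25", "40", "30"],
})

df["name"] = df["name"].fillna("Unknown")           # handle missing values
df["name"] = df["name"].str.strip().str.title()     # standardize inconsistent formats
df["age"] = pd.to_numeric(df["age"], errors="coerce").astype("Int64")  # convert types
df = df.drop_duplicates().reset_index(drop=True)    # remove duplicates
```

After these steps the frame holds three unique, consistently formatted rows.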
In real-world data science pipelines, up to 80% of an analyst's time is spent cleaning data. This environment models that workflow in a structured, testable way, making it useful for training and evaluating AI agents. The agent implemented here handles some of the low-priority but necessary tasks in dataset cleaning, with notable scope for future extension.
Observation format:

```json
{
  "dataset": [...],
  "step_count": int,
  "remaining_errors": int
}
```

Action format:

```json
{
  "action_type": "fill_missing | standardize_name | convert_type | fix_date_format | remove_duplicates",
  "column": "optional"
}
```

The reward is dense and progressive, encouraging efficient and correct cleaning:
- +0.15 per error fixed
- +0.05 efficiency bonus
- -0.05 for ineffective actions
- -0.25 for worsening dataset
- +0.5 completion bonus
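The reward shaping above can be sketched as a small function. This is a hypothetical reconstruction from the listed values; the environment's actual reward code may differ:

```python
def step_reward(errors_before: int, errors_after: int, done: bool) -> float:
    """Hypothetical per-step reward mirroring the shaping described above."""
    reward = 0.0
    fixed = errors_before - errors_after
    if fixed > 0:
        reward += 0.15 * fixed + 0.05   # per error fixed, plus efficiency bonus
    elif fixed == 0:
        reward -= 0.05                  # ineffective action
    else:
        reward -= 0.25                  # action worsened the dataset
    if done and errors_after == 0:
        reward += 0.5                   # completion bonus
    return round(reward, 2)
```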
- Task 1: single error type; objective: fill missing values
- Task 2: missing values + inconsistent formats; requires multi-step reasoning
- Task 3: duplicates, incorrect types, multiple date formats, extra spaces and inconsistencies
Evaluation is deterministic and returns a score between 0.0 and 1.0.
- Field-level accuracy scoring
- Partial credit for partially correct rows
- Exact match yields full score
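A minimal sketch of field-level scoring with partial credit, as described above (hypothetical; the actual grader may weight fields differently):

```python
def grade(predicted: list[dict], target: list[dict]) -> float:
    """Score is the fraction of (row, field) pairs that exactly match the
    target, so partially correct rows earn partial credit and an exact
    match yields the full score of 1.0."""
    total = correct = 0
    for pred_row, gold_row in zip(predicted, target):
        for field, gold_value in gold_row.items():
            total += 1
            if pred_row.get(field) == gold_value:
                correct += 1
    return correct / total if total else 1.0
```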
A hybrid baseline agent is provided:
- Uses rule-based heuristics for stability
- Falls back to LLM for generalization
This ensures reproducible and meaningful baseline scores.
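The hybrid policy can be sketched as follows. This is a simplified illustration, not the shipped baseline: the rules, the state shape, and the LLM fallback (stubbed out here) are assumptions:

```python
def choose_action(state: dict) -> dict:
    """Hypothetical hybrid policy: deterministic heuristics first,
    LLM fallback only when no rule fires."""
    for row in state["dataset"]:
        for column, value in row.items():
            if value is None or value == "":
                return {"action_type": "fill_missing", "column": column}
            if isinstance(value, str) and value != value.strip():
                return {"action_type": "standardize_name", "column": column}
    # No rule matched: a real agent would query an LLM here for a
    # generalized action; this sketch returns a safe default instead.
    return {"action_type": "remove_duplicates", "column": None}
```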
| Endpoint | Description |
|---|---|
| /reset | Initialize environment |
| /step | Apply action |
| /state | Get current state |
| /tasks | List tasks |
| /grader | Get final score |
| /baseline | Run baseline agent |
```bash
pip install -r requirements.txt
uvicorn api.app:app --reload
python inference.py
```

Example input:

```json
{"name": "JOHN DOE", "age": "thirty", "date": "03-12-24"}
```

Example output:

```json
{"name": "John Doe", "age": 30, "date": "2024-03-12"}
```

- Deterministic grading
- Dense reward shaping
- Multi-step reasoning environment
- Fully OpenEnv compliant
- Dockerized + deployable
This environment is containerized and deployable on Hugging Face Spaces using Docker.
- This project was built for the Meta PyTorch OpenEnv Hackathon x Scaler School of Technology.
- Created by Team Bug Smashers: Rian, Amogh, Elveena