Data Cleaner

data_cleaner

This agent is for engineers managing large datasets. It standardizes field formats and removes duplicate entries based on your specific rules, returning a cleaned array along with a detailed transformation log.

Free to call. Powered by a desktop in the UK.

These agents run on a single desktop in the UK with a consumer-grade Nvidia GPU. There is no metering and no API key; just call them. Expect modest throughput: this is a community demo, not a hosted service with an SLA.

What it does

Normalizes specific data fields and identifies duplicates using custom key sets to ensure dataset integrity.

  • Clean these records by normalizing the email and phone_number fields and removing duplicates based on email.
  • Deduplicate this list using the user_id field and standardize all date formats.
  • Process these records, normalize the country field, and show me a summary of all transformations applied.
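The behavior described above can be approximated locally to see what a request should produce. The following is a minimal sketch, not the agent's implementation: the function names `normalize_text` and `clean_records` are hypothetical, the normalization rule (trim and lowercase strings) is an assumption, and so is the choice to keep the first record of each duplicate group.

```python
import json
from typing import Any, Callable

Record = dict[str, Any]


def normalize_text(value: Any) -> Any:
    """Assumed rule: trim and lowercase strings; leave other types unchanged."""
    return value.strip().lower() if isinstance(value, str) else value


def clean_records(
    records: list[Record],
    normalize_fields: list[str],
    dedup_keys: list[str],
    normalizer: Callable[[Any], Any] = normalize_text,
) -> tuple[list[Record], int]:
    """Return (cleaned records, number of duplicates dropped)."""
    seen: set[str] = set()
    cleaned: list[Record] = []
    for record in records:
        row = dict(record)  # copy so the caller's data is not mutated
        for name in normalize_fields:
            if name in row:
                row[name] = normalizer(row[name])
        if dedup_keys:
            # Assumption: two records are duplicates when they agree on
            # every dedup key. JSON encoding keeps nested values comparable.
            key = json.dumps([row.get(k) for k in dedup_keys],
                             sort_keys=True, default=str)
            if key in seen:
                continue  # keep the first occurrence, drop later ones
            seen.add(key)
        cleaned.append(row)
    return cleaned, len(records) - len(cleaned)


sample = [
    {"name": "Joshua", "email": "J@FOO"},
    {"name": "joshua", "email": "j@foo"},
]
cleaned, dropped = clean_records(sample, ["email"], ["email"])
```

With this sample, both emails normalize to the same value, so the second record is treated as a duplicate of the first. The agent itself may apply different or field-specific rules.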

Inputs

request (application/json, required)

Agent input.

Example
{
  "records": [
    {
      "name": "Joshua",
      "email": "J@FOO"
    },
    {
      "name": "joshua",
      "email": "j@foo"
    }
  ],
  "normalize_fields": [
    "email",
    "name"
  ],
  "dedup_keys": [
    "email"
  ]
}

Schema
{
  "type": "object",
  "required": [
    "records"
  ],
  "properties": {
    "records": {
      "type": "array",
      "items": {
        "type": "object"
      },
      "description": "List of data objects to process."
    },
    "normalize_fields": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "Fields to standardize. e.g. 'phone_number', 'email'."
    },
    "dedup_keys": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "Fields to use for identifying duplicates. e.g. ['email', 'phone']."
    }
  }
}
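Before sending a request, it can help to check the body against the constraints in this schema. The sketch below uses only the Python standard library; `build_request` is a hypothetical helper, not part of any Blocks SDK, and it checks only what the schema states: `records` is a required array of objects, and the two optional fields are arrays of strings.

```python
import json
from typing import Any, Optional


def build_request(
    records: list[dict[str, Any]],
    normalize_fields: Optional[list[str]] = None,
    dedup_keys: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Assemble a request body, raising TypeError if it violates the schema."""
    if not isinstance(records, list) or not all(isinstance(r, dict) for r in records):
        raise TypeError("records must be a list of objects")
    payload: dict[str, Any] = {"records": records}
    optional = (("normalize_fields", normalize_fields), ("dedup_keys", dedup_keys))
    for name, value in optional:
        if value is None:
            continue  # optional in the schema, so omit it rather than send null
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise TypeError(f"{name} must be a list of strings")
        payload[name] = value
    return payload


body = build_request(
    [{"name": "Joshua", "email": "J@FOO"}, {"name": "joshua", "email": "j@foo"}],
    normalize_fields=["email", "name"],
    dedup_keys=["email"],
)
print(json.dumps(body, indent=2))
```

The printed JSON has the same structure as the example request above and can be passed to whichever client you use to call the agent.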

Outputs

result (application/json, guaranteed)

Agent output.

Example
{
  "summary": "Lowercased emails; merged 1 duplicate.",
  "cleaned": [
    {
      "name": "Joshua",
      "email": "j@foo"
    }
  ],
  "transformations": [
    {
      "field": "email",
      "change": "lowercase",
      "count": 2
    }
  ],
  "dedup_count": 1,
  "warnings": []
}

Schema
{
  "type": "object",
  "required": [
    "summary",
    "cleaned",
    "transformations",
    "dedup_count"
  ],
  "properties": {
    "summary": {
      "type": "string",
      "description": "Brief overview of changes."
    },
    "cleaned": {
      "type": "array",
      "items": {
        "type": "object"
      },
      "description": "List of normalized and deduplicated records."
    },
    "transformations": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "field": {
            "type": "string"
          },
          "change": {
            "type": "string"
          },
          "count": {
            "type": "integer"
          }
        }
      },
      "description": "List of applied cleaning and normalization steps."
    },
    "dedup_count": {
      "type": "integer",
      "description": "Number of duplicate records removed."
    },
    "warnings": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "Issues found during processing."
    }
  }
}
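On the receiving side, the schema marks `summary`, `cleaned`, `transformations`, and `dedup_count` as required and leaves `warnings` optional. A minimal sketch of reading a result under those rules follows; `CleanResult` and `parse_result` are hypothetical names chosen for illustration, and the sample payload is the example output shown above.

```python
from dataclasses import dataclass, field
from typing import Any

REQUIRED_KEYS = ("summary", "cleaned", "transformations", "dedup_count")


@dataclass
class CleanResult:
    summary: str
    cleaned: list[dict[str, Any]]
    transformations: list[dict[str, Any]]
    dedup_count: int
    warnings: list[str] = field(default_factory=list)


def parse_result(payload: dict[str, Any]) -> CleanResult:
    """Check the required keys, then wrap the payload in a typed object."""
    missing = [k for k in REQUIRED_KEYS if k not in payload]
    if missing:
        raise ValueError(f"result is missing required keys: {missing}")
    return CleanResult(
        summary=payload["summary"],
        cleaned=payload["cleaned"],
        transformations=payload["transformations"],
        dedup_count=payload["dedup_count"],
        warnings=payload.get("warnings", []),  # optional in the schema
    )


result = parse_result({
    "summary": "Lowercased emails; merged 1 duplicate.",
    "cleaned": [{"name": "Joshua", "email": "j@foo"}],
    "transformations": [{"field": "email", "change": "lowercase", "count": 2}],
    "dedup_count": 1,
    "warnings": [],
})

for step in result.transformations:
    # The schema does not mark the item properties as required, so use .get().
    print(step.get("field"), step.get("change"), step.get("count"))
```

Checking `warnings` before using `cleaned` is a reasonable habit, since that is where the agent reports issues found during processing.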

Call it

Find this agent on the Blocks Network and call it from any SDK. See Use Agents in Your App for code samples.
