Data Normalizer

I spend hours at my job manually correcting data entries with inconsistent spelling and abbreviations, so I built a tool that automates this process using large language models.

The idea is to have a tool that takes in CSV exports and:

> Corrects inconsistent spelling (Coop vs co-op)

> Harmonizes abbreviations (Limited vs Ltd.)

> Corrects spelling mistakes (serbices vs services)

This is how the tool works:

  • You can upload a CSV file and specify which column you want to extract and harmonize.

  • The model automatically consolidates data by combining similar-looking phrases.

  • You can edit the proposed phrase names or consolidate entries further if the model has missed some groups.

  • Finally, you can download the harmonized CSV file and push it to the database.
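
The steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the grouping step uses Python's `difflib` string similarity as a stand-in for the language model, and the names `group_similar`, `harmonize_column`, `renames`, and the 0.8 cutoff are assumptions made for this example.

```python
import csv
import difflib
import io
from collections import Counter


def group_similar(values, cutoff=0.8):
    """Map each distinct value to a proposed group name.

    Stand-in for the model step: plain string similarity, not an LLM.
    """
    counts = Counter(values)
    canonical = []  # group names chosen so far
    mapping = {}
    # Visit the most frequent spellings first so they become the group names.
    for value, _ in counts.most_common():
        keys = [name.lower() for name in canonical]
        match = difflib.get_close_matches(value.lower(), keys, n=1, cutoff=cutoff)
        if match:
            mapping[value] = canonical[keys.index(match[0])]
        else:
            canonical.append(value)
            mapping[value] = value
    return mapping


def harmonize_column(csv_text, column, renames=None, cutoff=0.8):
    """Return the CSV text with one column harmonized.

    `renames` (hypothetical) maps a proposed group name to the name the
    user prefers, which also allows merging two proposed groups into one.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    mapping = group_similar([row[column] for row in rows], cutoff)
    renames = renames or {}
    for row in rows:
        proposed = mapping[row[column]]
        row[column] = renames.get(proposed, proposed)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


sample = (
    "id,company\n"
    "1,Acme Services Ltd.\n"
    "2,Acme Serbices Ltd.\n"
    "3,Acme Services Ltd.\n"
    "4,Food Co-op\n"
    "5,Food Coop\n"
    "6,Food Co-op\n"
)
print(harmonize_column(sample, "company"))
```

Plain string similarity handles typos and small spelling variants, but abbreviations such as Limited vs Ltd. may fall below the cutoff. Those are the cases where the model, or a manual entry in `renames`, has to do the work.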




