Every product page at every major European retailer contains a material composition declaration. In the EU, this is legally required under Regulation 1007/2011. The data is there. It's public. And it's unusable in its raw form.
"Poliester", "polyester (PES)", "polyester fiber", and "100% PES" are four strings collected from different product pages. They mean the same thing. Multiply that across hundreds of fiber labels and dozens of retailers and you have a dataset that cannot be queried without a normalization layer.
The first thing built wasn't a dashboard. It was a material alias table mapping every collected fiber string to a canonical taxonomy entry aligned to EU Regulation 1007/2011. That table was built edge case by edge case, with explicit rulings on ambiguous inputs.
The same process was applied to care instructions and to production country normalization, including regional naming questions and alias cleanup across multiple languages. The result is a database where every row means the same thing as every other row using the same label.
That consistency is the entire point. Without it, a dashboard is decoration. With it, the market becomes comparable.