NER-Ebay-2023

Named entity recognition on noisy product titles using BiLSTM and BiLSTM-CRF models.

NER-Ebay-2023 is a practical pipeline for named entity recognition on e-commerce product titles, built for a competition setting with millions of short, noisy text entries.


🔍 Summary

The project focuses on building sequence labeling models that can reliably extract entities from real-world marketplace data where:

  • Product titles are noisy and inconsistent.
  • Entities often span multiple words.
  • The label space mixes brands, attributes, and product categories.

🛠️ Modeling approach

  • BiLSTM and BiLSTM-CRF architectures for sequence tagging.
  • A BI (begin/inside) tagging scheme for multi-word entities, chosen to balance accuracy with memory usage.
  • Scripts and notebooks for:
    • Preprocessing titles.
    • Training / evaluation.
    • Efficient batched inference for large prediction sets.
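
To illustrate the BI tagging scheme mentioned above, here is a minimal sketch of converting labeled token spans to per-token tags and back. It is not taken from the repository: the function names (spans_to_bi, bi_to_spans), the example labels (Brand, Model, Type), and the "O" placeholder for untagged tokens are all assumptions made for illustration, and the project's actual label set and handling of untagged tokens may differ.

```python
from typing import List, Tuple

# (start token index, end token index exclusive, label); hypothetical layout
Span = Tuple[int, int, str]


def spans_to_bi(num_tokens: int, spans: List[Span], outside: str = "O") -> List[str]:
    """Convert labeled spans to per-token BI tags.

    The first token of an entity gets "B-<label>", the rest get "I-<label>".
    Tokens not covered by any span get the `outside` placeholder (an assumption).
    """
    tags = [outside] * num_tokens
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def bi_to_spans(tags: List[str], outside: str = "O") -> List[Span]:
    """Recover (start, end, label) spans from a BI tag sequence.

    Decoding is lenient: an "I-" tag that does not continue the open span
    starts a new span, which is convenient for imperfect model output.
    """
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags):
        prefix, _, name = tag.partition("-")
        if prefix == "I" and label is not None and name == label:
            continue  # still inside the currently open span
        if label is not None:
            spans.append((start, i, label))  # close the open span
            start, label = None, None
        if tag != outside and prefix in ("B", "I"):
            start, label = i, name  # open a new span
    if label is not None:
        spans.append((start, len(tags), label))
    return spans


if __name__ == "__main__":
    # Hypothetical product title, already tokenized.
    tokens = ["acme", "ultra", "boost", "shoes", "new"]
    spans = [(0, 1, "Brand"), (1, 3, "Model"), (3, 4, "Type")]
    tags = spans_to_bi(len(tokens), spans)
    for token, tag in zip(tokens, tags):
        print(f"{token}\t{tag}")
```

Because only B- and I- prefixes are used, each entity type contributes two output classes to the tagger, which keeps the output layer (and the CRF transition matrix, when one is used) smaller than richer schemes with extra end or single-token prefixes.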

📂 GitHub repository