Attribute Value Extraction from E-commerce Product Profiles

Kassem Sabeh

Dissertation

Free University of Bozen-Bolzano

Doctor of Philosophy (PHD), Free University of Bozen-Bolzano

11/04/2025

Handle:

https://hdl.handle.net/10863/48539

Abstract

E-commerce platforms are continuously expanding, requiring efficient methods for extracting structured product attribute values from unstructured data such as product descriptions and titles. This thesis addresses key challenges in product attribute value extraction, including the discovery of new values, implicit value identification, product attribute generation, and the correction and refinement of extracted data. Traditional methods, which rely on predefined dictionaries and extractive techniques, often fail to generalize in dynamic environments. To tackle these challenges, this thesis proposes several novel systems. OpenBrand integrates character-level representations to dynamically discover new brand values, while GAVI uses a category-aware generative model for implicit value extraction. For attribute identification, generative approaches are explored, with large language models (LLMs) showing promise in zero-shot generalization without retraining. CAVE, a post-extraction system, corrects noisy attribute values, and QPAVE refines coarse-grained product attributes into fine-grained values. These contributions are systematically evaluated, demonstrating improvements over existing methods and addressing limitations in scalability, accuracy, and generalization. Through the development and evaluation of these systems, this work provides a robust foundation for enhancing the accuracy and scalability of attribute value extraction in e-commerce. It also sets the stage for future work in multilingual contexts and dynamic product environments, ultimately contributing to both academic research and industry applications.

Files and links (1)

pdf

PhD_Thesis___Kassem_Sabeh_final (14.71 MB)

Embargoed Access

Details

Title: Attribute Value Extraction from E-commerce Product Profiles
Creators: Kassem Sabeh, Faculty of Engineering
Contributors: Johann Gamper (Supervisor), Faculty of Engineering
Fabian M. Suchanek (Supervisor)
Roman Klinger (Supervisor)
Awarding Institution: Free University of Bozen-Bolzano
Doctor of Philosophy (PHD)
Theses and Dissertations: Doctor of Philosophy (PHD), Free University of Bozen-Bolzano
Publisher: Free University of Bozen-Bolzano
Number of pages: XIV, 128
Identifiers: 991007099512701241
Academic Unit: Faculty of Engineering
Language: English
Resource Type: Dissertation