Structure-Preserving Pipelines for Digital Libraries
Conference proceeding   Peer reviewed

M Poesio, E Barbu, Egon W. Stemle and C Girardi
Proceedings of the 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH 2011), pp. 54-62
The 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (Portland, Oregon, 19/06/2011 - 24/06/2011)
2011
Handle:
https://hdl.handle.net/10863/8906

Abstract

Most existing HLT pipelines assume the input is pure text or, at most, HTML and either ignore (logical) document structure or remove it. We argue that identifying the structure of documents is essential in digital library and other types of applications, and show that it is relatively straightforward to extend existing pipelines to achieve ones in which the structure of a document is preserved.
URL:
https://www.aclweb.org/portal/content/acl-hlt-2011-workshop-language-technology-cultural-heritage-social-sciences-and-humanities
URL:
http://www.aclweb.org/anthology/W11-1508
