This paper describes a system that uses entity and topic coherence to improve Text Segmentation (TS) accuracy. First, the Latent Dirichlet Allocation (LDA) algorithm is used to obtain topics for the sentences in a document. We then perform entity mapping across a window of sentences to discover how entities transition from one sentence to the next. This entity information supports the LDA-based boundary detection by adjusting the proposed boundaries. We report the significance of the entity coherence approach as well as the superiority of our algorithm over existing work.
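The pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each sentence already has a single topic label (for example, the most probable topic from an LDA model) and a set of entity mentions, and it uses a hypothetical adjustment rule that moves each topic-based boundary to the position within a small window where adjacent sentences share the fewest entities. The function names, the `window` parameter, and the toy data are all assumptions made for illustration.

```python
def topic_boundaries(topics):
    """Candidate boundaries from per-sentence topic labels.

    Index i in the result means a new segment starts at sentence i.
    The labels are assumed to come from an LDA model (not shown here).
    """
    return [i for i in range(1, len(topics)) if topics[i] != topics[i - 1]]


def entity_overlap(entities, i):
    """Number of entities shared by sentence i-1 and sentence i."""
    return len(entities[i - 1] & entities[i])


def adjust_boundaries(candidates, entities, window=1):
    """Shift each candidate to the weakest entity link within the window.

    Hypothetical rule: among positions b-window .. b+window, pick the one
    with the lowest entity overlap; ties go to the position closest to b.
    """
    n = len(entities)
    adjusted = set()
    for b in candidates:
        lo, hi = max(1, b - window), min(n - 1, b + window)
        best = min(
            range(lo, hi + 1),
            key=lambda p: (entity_overlap(entities, p), abs(p - b)),
        )
        adjusted.add(best)
    return sorted(adjusted)


# Toy input: one topic label and one entity set per sentence (made up).
topics = [0, 0, 0, 1, 1, 1]
entities = [
    {"lda"},
    {"lda", "topic"},
    {"topic"},
    {"topic", "model"},
    {"court"},
    {"court", "judge"},
]

candidates = topic_boundaries(topics)
final = adjust_boundaries(candidates, entities, window=1)
print("topic-based boundaries:", candidates)
print("entity-adjusted boundaries:", final)
```

In this toy input the topic labels change at sentence 3, but sentences 2 and 3 still share an entity, while sentences 3 and 4 share none, so the sketch moves the boundary one sentence later. A real system would need a proper entity recognizer and a tuned window size.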
Adebayo Kolawole, J., Di Caro, L., Boella, G. (2017). Text segmentation with topic modeling and entity coherence. Berlin : Springer [10.1007/978-3-319-52941-7_18].
Text segmentation with topic modeling and entity coherence
John, Adebayo Kolawole
2017