Automatic Corpus-based Acquisition of Binary Terms
ACABIT is distributed under the GPL licence.
ACABIT is a terminology extraction program. It takes a linguistically annotated corpus as input and outputs a list of multi-word term (MWT) candidates, ranked from the most to the least representative of the corpus according to their log-likelihood (loglike) score. For each MWT candidate, an XML structure is provided that gathers all the base structures and variations encountered.
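The ranking step described above can be illustrated with a minimal Python sketch. It is not ACABIT's own code: it assumes the common contingency-table form of the log-likelihood score for a word pair, and the names `xlogx`, `loglike`, `rank_candidates` and the toy counts are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def xlogx(x):
    """Return x * ln(x), with the usual convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def loglike(a, b, c, d):
    """Log-likelihood score of a pair (w1, w2) from its 2x2 contingency table.

    a: pairs with w1 first and w2 second
    b: pairs with w1 first and another word second
    c: pairs with another word first and w2 second
    d: pairs containing neither w1 first nor w2 second
    (Assumed formulation, not taken from the ACABIT sources.)
    """
    n = a + b + c + d
    return (xlogx(a) + xlogx(b) + xlogx(c) + xlogx(d)
            - xlogx(a + b) - xlogx(a + c)
            - xlogx(b + d) - xlogx(c + d)
            + xlogx(n))


def rank_candidates(pair_counts):
    """Rank candidate pairs by decreasing log-likelihood score.

    pair_counts maps (w1, w2) to the number of times the pair was observed.
    """
    total = sum(pair_counts.values())
    first, second = Counter(), Counter()
    for (w1, w2), freq in pair_counts.items():
        first[w1] += freq
        second[w2] += freq

    scored = []
    for (w1, w2), a in pair_counts.items():
        b = first[w1] - a
        c = second[w2] - a
        d = total - a - b - c
        scored.append(((w1, w2), loglike(a, b, c, d)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


# Toy counts, invented for the example.
toy_counts = {
    ("optical", "fibre"): 12,
    ("optical", "system"): 2,
    ("new", "system"): 9,
    ("new", "fibre"): 1,
}
ranked = rank_candidates(toy_counts)
for pair, score in ranked:
    print(pair, round(score, 3))
```

With this formulation, a pair whose components co-occur exactly as often as chance predicts scores 0, and the score grows as the observed counts depart from independence.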
ACABIT uses the following programs:
- Brill's POS tagger for French (ATILF)
- French lemmatizer FLEMM (WARNING: the output data of FLEMM has been modified. You need to use FLEMM-v2.0 (1999))
- Brill's POS tagger (BRILL)
- Lemmatiser: lexical database CELEX
Loading
- Japanese ACABIT (JACABIT) by Koichi Takeuchi, University of Okayama, Japan
Old versions
To understand ACABIT, please read some of my publications, for example:
[Daille, B. 2003b]. B. DAILLE, "Conceptual structuring through term variations". In F. Bond, A. Korhonen, D. McCarthy and A. Villavicencio (eds.), Proceedings of the ACL 2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, 9-16, 2003.
Last updated: Thursday, 23 October 2014