Gensim focuses on topic modeling
There are many machine learning libraries, such as NLTK, spaCy, Gensim, scikit-learn, TensorFlow, Hugging Face, and so on. Gensim, however, focuses specifically on topic modeling. In the words of its author:
By now, Gensim is—to my knowledge—the most robust, efficient and hassle-free piece of software to realize unsupervised semantic modelling from plain text. It stands in contrast to brittle homework-assignment-implementations that do not scale on one hand, and robust java-esque projects that take forever just to run “hello world”.
Gensim = Generate Similar
Gensim's philosophy:
- Practicality – as industry experts, we focus on proven, battle-hardened algorithms to solve real industry problems. More focus on engineering, less on academia.
- Memory independence – there is no need for the whole training corpus to reside fully in RAM at any one time. Can process large, web-scale corpora using data streaming.
- Performance – highly optimized implementations of popular vector space algorithms using C, BLAS and memory-mapping.
Source:: What is Gensim? — gensim
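
The "memory independence" principle above can be illustrated with a small sketch. This is not Gensim's own code. It is a standard-library-only illustration of the streaming idea: the corpus is an iterable that reads one document at a time from disk and yields it as a sparse bag-of-words vector, so the full text never has to sit in RAM. The names `StreamedCorpus` and `build_vocabulary`, the one-document-per-line file layout, and the whitespace tokenization are all assumptions made for this example.

```python
import tempfile
from collections import Counter
from pathlib import Path


class StreamedCorpus:
    """Iterable that yields one bag-of-words vector per line of a text file."""

    def __init__(self, path, vocabulary):
        self.path = path
        self.vocabulary = vocabulary  # maps token -> integer id

    def __iter__(self):
        # Re-open the file on every pass so the corpus can be iterated repeatedly.
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                counts = Counter(
                    self.vocabulary[token]
                    for token in line.lower().split()
                    if token in self.vocabulary
                )
                # Sparse vector: sorted list of (token_id, count) pairs.
                yield sorted(counts.items())


def build_vocabulary(path):
    """First streaming pass: assign an id to each distinct token."""
    vocabulary = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            for token in line.lower().split():
                vocabulary.setdefault(token, len(vocabulary))
    return vocabulary


# Hypothetical three-document corpus, written to a temp file for the demo.
with tempfile.TemporaryDirectory() as tmp:
    corpus_path = Path(tmp) / "corpus.txt"
    corpus_path.write_text(
        "human machine interface\n"
        "graph of trees\n"
        "graph minors trees graph\n",
        encoding="utf-8",
    )
    vocabulary = build_vocabulary(corpus_path)
    corpus = StreamedCorpus(corpus_path, vocabulary)
    # Materialized as a list only to inspect this tiny demo;
    # a real pipeline would consume the iterable lazily.
    vectors = list(corpus)

print(vectors)
```

Only the vocabulary and the current line are held in memory at any moment. Because `__iter__` reopens the file each time, the same object supports the multiple passes that training algorithms typically need.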