San Francisco, California (PRWEB) June 22, 2016
RaRe Technologies today announced a major update of the software package Gensim, an open source machine learning toolkit for understanding human language. Gensim users and developers have shaped the newest release, 0.13.0, through their comments, requests and Python code contributions on the open source community GitHub and via the mailing list. This release includes new features like Word Mover’s Distance (WMD)*, a novel distance function between unstructured text documents, plus new Tutorials and Quickstarts. One of the new tools helps tune topic models through a visualisation that colors words according to their automatically discovered topic, either in a specific text document or in a general dictionary.
“Gensim has received a lot of industry traction lately. We’re happy that our dedication to pragmatic, production-ready quality is appreciated by our users,” said Radim Řehůřek, Managing Director of RaRe Technologies and creator of Gensim. “This release marks an important milestone where our community focused on and completed improvements to Gensim documentation and tutorials, a much requested feature. We’re one step closer to fulfilling Gensim’s motto: Topic Modeling for Humans.”
“Sixteen developers from our community have contributed code and tutorials to this release. That is not including community members answering numerous messages on our mailing list and troubleshooting reported issues,” said Lev Konstantinovskiy, Community Manager at Gensim. “It is great to see the community grow from a one-person PhD project in 2010 to a thriving open source community 6 years later with dozens of contributors and thousands of users.”
Gensim is offered as an open source vector space modeling and topic modeling toolkit, under the GNU LGPL v2.1 open source license. This means Python developers and the community as a whole can download the code and documentation for free via GitHub or PyPI, the Python Package Index. Support is offered through the Google Groups forum, where the Gensim community gathers to discuss the project.
About RaRe Technologies
RaRe Technologies, founded by Radim Řehůřek, expert machine learning data scientist and creator of Gensim, is an innovative, high-tech consulting and development firm powered by top data science PhDs, seasoned computer scientists and industry thought leaders. This elite group specializes in the architectural design and development of deep learning, data mining, natural language processing (NLP) and advanced machine learning systems for global clients. RaRe extends its consulting expertise through hands-on, fast-track training programs designed for IT teams seeking to master data science, including Python, data mining and information retrieval. To learn more about RaRe Technologies, visit their website at rare-technologies.com. Follow updates from RaRe Technologies through Twitter, LinkedIn or by visiting their blog.
About Gensim
Gensim is a commercial-grade topic modeling and natural language processing (NLP) toolkit implemented in the Python programming language. Billed as “topic modeling for humans,” Gensim is the most robust, efficient and hassle-free piece of software to realize unsupervised semantic modeling from plain, unstructured text. Specifically, Gensim is intended for handling large text collections using efficient streamed, incremental algorithms, including various latent semantic models and deep learning embeddings. Launched in 2011, Gensim is distributed as open source under the OSI-approved GNU LGPL v2.1 license, making it free for both personal and commercial use. Find Gensim’s code on GitHub and the support forum on Google Groups. Make sure to follow Gensim on Twitter to receive future updates in real time!
*WMD was introduced in a recent paper by Kusner et al. (2015).