Pilosa Launches Breakthrough Open Source Software to Dramatically Accelerate Data Queries

Open distributed bitmap index makes big data act like small data

Pilosa, an open distributed bitmap index, today launched into public beta. Pilosa decouples the index from data storage and optimizes it for massive scale. The result is dramatically accelerated query speeds across multiple, massive data sets. Pilosa is available today on GitHub.

Pilosa solves a fundamental problem in data science. The volume of enterprise data has grown faster than Moore's Law, yet the speed at which we can read it has stagnated. Despite several years of major advances in databases, the technology that retrieves data has gone untouched and read speeds have lagged far behind write speeds. Pilosa’s technology addresses this problem head-on, dramatically speeding up both queries to existing databases and the process of joining data from multiple stores.

“The next wave of scientific breakthroughs will come from research projects that work with datasets of a terabyte or more,” said Higinio (H.O.) Maycotte, CEO of Pilosa. “We know how to store that data, but nobody has focused on accelerating access to that data. That changes today. Our commitment to open source ensures that this fundamental problem is solved once and for all.”

Because Pilosa is a bitmap index, it is relatively small in volume and runs in memory rather than on disk. The first version ships with production-tested features, including single- and multi-node index support, replication, algorithm plugins, a data importer, and basic cluster management. There are eight patents in the first version alone.
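
For readers unfamiliar with bitmap indexes, the following is a minimal illustrative sketch in Python of the general idea: each attribute value maps to a bitmap whose set bits mark the records that have that attribute, so a multi-attribute query becomes a bitwise AND. The class name, method names, and attribute strings are hypothetical and chosen for illustration; this is not Pilosa's API or implementation.

```python
from collections import defaultdict


class BitmapIndex:
    """Toy in-memory bitmap index (hypothetical; not Pilosa's API)."""

    def __init__(self):
        # Maps an attribute value (e.g. "lang=go") to an int whose set bits
        # mark the IDs of the records that have that attribute.
        self.rows = defaultdict(int)

    def set_bit(self, attribute, record_id):
        """Record that `record_id` has `attribute`."""
        self.rows[attribute] |= 1 << record_id

    def intersect(self, *attributes):
        """Return the sorted record IDs that have every listed attribute."""
        if not attributes:
            return []
        result = self.rows.get(attributes[0], 0)
        for attribute in attributes[1:]:
            # A bitwise AND keeps only records present in both bitmaps.
            result &= self.rows.get(attribute, 0)
        return [i for i in range(result.bit_length()) if (result >> i) & 1]

    def count(self, attribute):
        """Return how many records have `attribute`."""
        return bin(self.rows.get(attribute, 0)).count("1")


# Example data: the record IDs and attributes are made up for illustration.
index = BitmapIndex()
index.set_bit("lang=go", 1)
index.set_bit("lang=go", 3)
index.set_bit("license=apache", 3)
index.set_bit("license=apache", 4)

matches = index.intersect("lang=go", "license=apache")
```

The index stores only membership bits, not the records themselves, which is why a structure of this kind can stay small relative to the data it describes and can be kept separate from the underlying data store.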

The software helps data scientists and engineers make sense of multiple, massive data sets without purchasing more hardware and without hours-long batch job wait times. Benchmark tests indicate that Pilosa queries remain consistently fast even at high volumes, without increasing complexity or processing rigor. No test exceeded 1.8 seconds, and most queries returned in fractions of a second. A simple query can traverse more than 2 billion edges in one second on commodity cloud hardware, approaching speeds otherwise seen only with expensive hardware such as GPUs.

“With Pilosa, you can work with all of your data, all at once. It’s exhilarating to experience this first hand,” said Troy Lanier, Vice President of Product at Pilosa. “Our focus today is really on building a community around this technology. We’ve already seen traction in bioinformatics and information security, but we’re excited to see where users take us. If you’re working with massive data sets, Pilosa can dramatically change the rules of the game.”

Pilosa is free, open-source software available under the Apache 2.0 license. It can be downloaded from GitHub at https://github.com/pilosa. To learn more about Pilosa, please visit: https://www.pilosa.com/

About Pilosa

Pilosa is an open distributed bitmap index. We dramatically accelerate query speed across multiple, massive data sets. Pilosa helps big data act much smaller. We created an index independent of storage, then optimized it for high-cardinality data at scale.

Pilosa was founded in 2017 with a commitment to building community-driven, open source software that unlocks the full power of data science.

Pilosa. Insanely fast queries on really big data.

To learn more about Pilosa, visit: https://www.pilosa.com

Contact Author

Crystal Germond
Jones-Dilworth, Inc.
+1 857.654.6723