Startups Using Spark in Boston

These are the Boston startups using Spark that we've found, based on their job posts and on information submitted by the startups themselves.

Interested in other technologies? Browse or search all of the built-in-boston tech stacks we've curated.

High-accuracy weather tools built by integrating existing weather data with analysis of weather's impact on cellular signals.

Big data platform for healthcare outcomes information.

“Predictive analytics platform for retail merchandise planning and product assortment optimization.”

Marketing platform for B2C marketers, aimed at increasing customer lifetime value (CLTV) by driving more cross-sells and repeat purchases.

Platform for more efficient clinical trials.

Platform aggregating and analyzing footwear and apparel fit data, providing retailers with tools to offer “highly personalized fit ratings and size recommendations to shoppers.”

Predictive analytics for healthcare data, targeting preventable admissions, member retention, and risk-based reimbursement eligibility.

Tech Stack Highlights

Machine Learning – We build models on our Spark platform using MLlib, as well as in custom Python environments where we use many of the popular Python-based machine learning libraries. We’ve invested the most in the PyTorch library, which we use for our deep learning models.

Spark & Scala – We use a Scala-based data pipeline hosted on Spark to ingest customer data and prepare it for use in our models.

Zeppelin & Jupyter – We work with data using Zeppelin notebooks for Spark and Jupyter in our Python environments.

Automation & Infrastructure – We use CircleCI to build and deploy both our services and infrastructure. We use AWS Lambda to automate infrastructure tasks and create custom notifications and alerts to simplify our internal workflows.

AWS – We host our infrastructure on AWS. We’ve built an independently audited platform that supports working with protected health information.
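As an illustration of the Automation & Infrastructure highlight above, here is a minimal sketch of a Python function in the `(event, context)` shape that AWS Lambda expects, used to build a custom alert. The event fields (`status`, `project`) and the `build_alert` helper are hypothetical and chosen only for illustration; the startup's actual events and notification targets are not described in the listing.

```python
import json


def build_alert(event):
    """Turn a (hypothetical) build event into a short human-readable alert."""
    status = event.get("status", "unknown")
    project = event.get("project", "unknown-project")
    return f"[{status.upper()}] build for {project}"


def handler(event, context):
    """Entry point in the (event, context) shape used by Python Lambda functions."""
    message = build_alert(event)
    # A real function would publish the message somewhere (for example a chat
    # webhook or a notification topic); this sketch only returns it.
    return {"statusCode": 200, "body": json.dumps({"message": message})}


# Local invocation with a made-up event; no AWS account is needed to run this.
result = handler({"status": "failed", "project": "data-pipeline"}, None)
print(result["body"])
```

Keeping the message formatting in its own function means it can be tested locally without deploying anything.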

Cybersecurity platform to detect threats coming from compromised user accounts.

Tech Stack Highlights

Apache Spark – We use Spark, Spark Streaming, and Apache Kafka for fast in-memory compute, real-time streaming, and a lambda architecture. These technologies power our cyber threat detection, remediation, and visualization software.

Cassandra – Our platform relies on the Apache Cassandra NoSQL database for long-term data analytics and reporting. We use Elasticsearch for real-time search and analysis and Redis for in-memory caching.

Docker – We’re built on a Docker container microservices architecture and the Ansible DevOps orchestration framework for flexible bare-metal, virtual machine, and cloud deployments.

Angular.js – We use the Angular front-end framework with D3.js, and Node.js on the backend.
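The lambda architecture mentioned in the Apache Spark highlight above combines a batch layer that recomputes views from full history with a speed layer that updates incrementally, merging both at query time. The following is a minimal pure-Python sketch of that idea, not the startup's implementation: the event fields, the failed-login metric, and all function names are assumptions for illustration, and plain in-memory structures stand in for Spark, Spark Streaming, and Kafka.

```python
from collections import Counter


def batch_view(events):
    """Batch layer: recompute per-user failed-login counts from the full history."""
    return Counter(e["user"] for e in events if e["type"] == "login_failed")


def update_speed_view(speed_view, event):
    """Speed layer: incrementally count events not yet covered by a batch run."""
    if event["type"] == "login_failed":
        speed_view[event["user"]] += 1
    return speed_view


def query(batch, speed, user):
    """Serving layer: merge the batch and real-time views at query time."""
    return batch[user] + speed[user]


# Historical events, already processed by the (simulated) batch job.
history = [
    {"user": "alice", "type": "login_failed"},
    {"user": "alice", "type": "login_ok"},
    {"user": "bob", "type": "login_failed"},
]
# Recent events that arrived after the last batch run.
recent = [
    {"user": "alice", "type": "login_failed"},
    {"user": "alice", "type": "login_failed"},
]

batch = batch_view(history)
speed = Counter()
for event in recent:
    update_speed_view(speed, event)

print(query(batch, speed, "alice"))  # 3
print(query(batch, speed, "bob"))    # 1
```

When the next batch run absorbs the recent events, the speed view can be reset, so query results stay the same while the incremental state stays small.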

“A machine learning platform for data scientists of all skill levels to build and deploy accurate predictive models.”

Matches app-wielding street teams with product companies that want better insight and promotion from their retail partners.
