Startups Using Spark in Boston
These are the Boston startups using Spark that we've found, based on their job posts and on information submitted by the startups themselves.
Interested in other technologies? Browse or search all of the built-in-boston tech stacks we've curated.
An app for managing health-related finances, with your insurance plan, claims, and HSA all in one place, plus budget forecasting & bill analysis.
High-accuracy weather tools that integrate existing weather data with analysis of weather's impact on cellular signals.
Marketing platform for B2C marketers, aimed at increasing customer lifetime value (CLTV) by driving more cross-sells and repeat purchases.
Platform aggregating and analyzing footwear and apparel fit data, providing retailers with tools to offer “highly personalized fit ratings and size recommendations to shoppers.”
Predictive analytics for healthcare data, targeting preventable admissions, member retention, and risk-based reimbursement eligibility.
Tech Stack Highlights
Machine Learning – We build models on our Spark platform using MLlib, as well as in custom Python environments where we use many of the popular Python-based machine learning libraries. We’ve invested most heavily in the PyTorch library, which we use for our deep learning models.
Spark & Scala – We use a Scala-based data pipeline hosted on Spark to ingest customer data and prepare it for use in our models.
Zeppelin & Jupyter – We work with data using Zeppelin notebooks for Spark and Jupyter in our Python environments.
Automation & Infrastructure – We use CircleCI to build and deploy both our services and infrastructure. We use AWS Lambda to automate infrastructure tasks and create custom notifications and alerts to simplify our internal workflows.
AWS – We host our infrastructure on AWS. We’ve built an independently audited platform that supports working with protected health information.
Cybersecurity platform to detect threats coming from compromised user accounts.
Tech Stack Highlights
Apache Spark – We use Spark, Spark Streaming, and the Apache Kafka frameworks for fast in-memory compute, real-time streaming, and lambda architecture. These technologies power our cyber threat detection, remediation and visualization software.
Cassandra – Our platform relies on the Apache Cassandra NoSQL database for long-term data analytics and reporting. We use Elasticsearch for real-time search and analysis and Redis for in-memory caching.
Docker – We’re built on a Docker container microservices architecture and the Ansible DevOps orchestration framework for flexible bare-metal, virtual machine, and cloud deployments.
Angular.js – We use the Angular front-end framework with D3.js, and Node.js on the backend.
“A machine learning platform for data scientists of all skill levels to build and deploy accurate predictive models.”
Matches app-wielding street teams with product companies who want better insight & promotion from their retail partners.