Data Scientist, Analytics

Remote

Company & Opportunity Overview

harpin AI is a seed-stage startup, backed and managed by a proven team, that is focused on revolutionizing how enterprise customers manage and improve the integrity, quality, and authenticity of their upstream consumer identity data.

Why was harpin AI built? The harpin AI team has spent decades building software for renowned customer-facing brands. Time and again, one challenge kept cropping up: managing identity and PII data securely and efficiently, with the goal of maximizing its value to the business. With the explosion of cloud-based systems of record, massive disparate data sets, and privacy and compliance changes, it has become increasingly difficult to wrangle PII, let alone use it to make smarter business decisions.

Our clients would ask, “How can we make our data accessible, easier to use, visually compelling and more accurate?” We decided to tackle this head-on. The answer wasn’t yet another data platform; we wanted to enhance the value of the systems of record that were already out there. Companies had already invested so much in mission-critical systems, and we knew that to be successful we would need to leverage what they already had. And so, harpin AI was born: a set of smart workflow-based tools that improve data quality and access.

Our tools, powered by cutting-edge AI, machine learning and simple workflows, turn fragmented data into clear, usable insights. With harpin AI, data is an asset, not an obstacle.

Case Study Examples

  1. Belami ECommerce
  2. Fire & Vine
  3. Morey’s Piers

Size of company: 23 employees
Founded: 2021
Funding: Seed-Stage, $6.5M raised through MK Capital
Revenue: Post-revenue
CEO: Scott Sahadi, founder, who has had successful exits from his last 3 start-ups
Current customers: Fortune 500 hospitality companies & small e-commerce brands
Industry focus: Hospitality, E-Commerce, Retail

Position Overview

Join harpin AI’s Data Science team to help shape the future of customer identity resolution. In this role, you’ll work closely with data scientists, engineers, and product leaders to design scalable data models, build reliable pipelines, and develop predictive models that power key business insights. You’ll own data quality and architecture within your domain, enabling self-service analytics through intuitive dashboards and robust metrics. From data exploration to machine learning, you’ll apply analytical rigor to solve complex problems and uncover actionable insights. Ideal candidates are hands-on, thrive in fast-paced startup environments, and are passionate about high-quality data and impactful analytics. Proficiency in Python, SQL, and Spark, along with a strong foundation in statistical analysis and dimensional modeling, is essential. If you’re excited to collaborate, move fast, and drive results through data, we’d love to hear from you.

What You Will Do

  • Understand data needs by interfacing with fellow data scientists and business partners.
  • Architect, build, and launch efficient & reliable new data models and pipelines in partnership with our Data Science and Engineering teams.
  • Perform data exploration, data analysis and machine learning model development.
  • Build out comprehensive metrics and dimensions to measure operational health and enable analysis and predictive modeling.
  • Design and develop dashboards to enable self-serve data consumption.
  • Become a data expert in your business domain, develop a deep understanding of how your data interacts with the rest of the business, and own data quality.
  • Recognize and adopt best practices in reporting and analysis: data integrity, test design, analysis, validation, and documentation.
  • Work with product leadership to evolve product positioning, roadmap, and use cases based on what we can (and cannot) practically develop.
  • Draw inferences and conclusions from processed data, identify trends and anomalies, and create dashboards and visualizations to communicate them.

Qualifications

  • 3+ years relevant work experience.
  • BS in Computer Science, Mathematics, Statistics, or Data Science.
  • Experience in a startup company environment is a plus.
  • Passion for high data quality and scaling/automating data science work.
  • Demonstrated leadership in data warehousing concepts.
  • Experience with exploratory data analysis, statistical analysis and machine learning model development.
  • Experience in schema design and dimensional data modeling.
  • Ability to perform basic statistical analysis to inform business decisions.
  • Ability to turn complex problems into simple solutions.
  • Track record of managing multiple projects simultaneously in a fast-paced environment.
  • Experience creating high performance algorithms for real-time systems.
  • Collaborative mindset and generative work culture – we do our best work together.
  • Strong proficiency in Python, Spark, and SQL, and the ability to work efficiently at scale with large data sets.
  • Flexible, nimble, and scrappy; startup mentality and willingness/ability to change direction quickly if best for the business.

Benefits

  • Stock Options
  • PTO
  • Paid Holidays
  • Medical, Dental, & Vision Benefits
  • 401K
