Senior / Lead Data Engineer (Python / AWS / Spark)

  • July 31, 2025
  • Engineering
  • Full time
  • Remote
  • Senior level
  • 2,800 to 6,000 USD
Amazon Web Services (AWS)
Apache Kafka
Apache Spark
Data Engineering
Python
Advanced English



About the Role

We are looking for a Senior or Lead Data Engineer to be the founding member of our new data team. In this high-impact role, you’ll report directly to the CTO and play a critical part in shaping the future of data at our company. You will work across three distinct product lines to standardize and unify data pipelines, processes, and governance. This is a unique opportunity to build from the ground up and influence data architecture and strategy at a multi-product SaaS company.

Responsibilities

  • Design, build, and optimize scalable data pipelines using Apache Spark, Kafka, and related tools
  • Develop and orchestrate data workflows using Apache Airflow
  • Architect and manage infrastructure in AWS, including services like S3, Kinesis, Lambda, EMR, Glue, and Redshift
  • Build and support real-time data pipelines and event-driven systems for streaming data ingestion and processing
  • Establish standardized data ingestion, transformation, and storage processes across multiple product lines
  • Work closely with product and engineering teams to unify and document data models and schemas
  • Ensure data quality, lineage, and observability across all pipelines
  • Lead data governance, compliance, and security practices from day one
  • Mentor future hires and define engineering standards as the data team grows

Qualifications

  • 5+ years of experience in data engineering, ideally in SaaS or multi-product environments
  • Expertise in AWS services relevant to data infrastructure (S3, Glue, Redshift, EMR, Lambda, Kinesis, etc.)
  • Strong proficiency in Python for building pipelines, ETL/ELT jobs, and infrastructure automation
  • Deep hands-on experience with Apache Spark, Kafka, and the broader Apache ecosystem
  • Proven success designing and supporting real-time data streams and batch workflows
  • Solid experience with Airflow or equivalent orchestration tools
  • Strong understanding of distributed systems, data architecture, and scalable design patterns
  • Passion for clean, maintainable code and building robust, fault-tolerant systems

Nice to Have

  • Experience with Delta Lake, Apache Iceberg, or Apache Hudi
  • Familiarity with containerized workloads (Docker, Kubernetes, ECS/EKS)
  • Background in building internal data platforms or centralized data lakes
  • Experience supporting data science and BI/analytics teams
  • Knowledge of data privacy, compliance, and security practices (e.g., SOC 2, GDPR)


How it works for talents

Get hired with Mappa

1. Apply for a job
2. Get matched: our AI-powered matching algorithm considers over 100,000 data points to curate a thoroughly vetted shortlist just for you.
3. Meet the company
4. Get hired

Ready to start?

Apply
Extra services

Take your international career to the next level

Dollar payments

Get paid in US dollars while working remotely and earn ~50% more than working locally.

Career growth

Strengthen your international career by working at the most exciting companies across the US and Europe.

Benefits

Mappa provides you with an extra annual salary. Make a difference and get rewarded for your efforts and achievements.
Get started

Secure your dream job in just a few steps

Apply now