Benefits & Perks
•Equity
Required Skills
Python
Kotlin
Kubernetes
Terraform
Helm
Airbyte is the open‑source standard for data movement. We've enabled data teams to move data from applications, APIs, unstructured sources and databases to data warehouses, lakes, and AI applications. With tens of thousands of connectors built and hundreds of thousands of companies adopting Airbyte, we've proven the economics of data integration at scale. And now Airbyte is building the frontier agentic data infrastructure, purpose-built for AI agents that need fast, accurate access to data across hundreds of sources. Our mission: make data available and actionable, everywhere.
We've raised $181M from the world's top investors (Benchmark, Accel, Altimeter, Coatue, Y Combinator, etc.) and we believe in product-led growth, where we build something awesome that all our users love. We’ve raised enough capital to explore boldly, but we still choose to move quickly, stay scrappy, and experiment constantly as we find the right paths in an AI-native landscape.
THE ROLE:
As a Software Engineer on our Data Replication team, you will design and build intelligent systems that dramatically improve how data moves through Airbyte, from first deployment and initial sync to ongoing execution at scale. You’ll leverage LLM-based tools, agentic workflows, and automation to accelerate connector rollout, improve sync reliability, reduce TCO (total cost of ownership), and make the data movement experience seamless for both OSS and Cloud users.
This role sits at the intersection of AI systems, distributed data platforms, and developer experience. Your work will directly impact sync performance, operational excellence, and how quickly Airbyte can ship improvements across its control plane, data plane, and connector ecosystem.
WHAT YOU’LL DO:
- Build AI-driven systems for data replication and connector lifecycle management, accelerating connector development, rollout, testing, and upgrades across OSS, Enterprise, and Cloud
- Design and implement agentic workflows that assist with diagnosing sync failures, schema evolution issues, performance regressions, and rollout risks across large fleets of connectors
- Build connectors and frameworks with AI to scale a wide range of reliable integrations
- Develop observability, anomaly detection, and automated remediation systems (ML + LLM hybrid) for data sync execution, job correctness, and CDC pipelines
- Improve control plane and data plane operations by automating deployment validation, release qualification, and environment testing (AWS, GCP, local, KIND)
- Own AI systems across the full lifecycle: design, prompt engineering, evaluation, deployment, monitoring, and iteration in production (LLMOps)
- Partner closely with platform, infra, and product teams to embed AI-powered capabilities into Airbyte’s deployment flows, APIs, and Cloud self-serve experience
- Build high-leverage internal tooling that helps Airbyte ship connector and CDK changes faster while maintaining correctness, performance, and cost efficiency
WHAT YOU’LL NEED:
- 5+ years of engineering experience (backend, platform, or distributed systems) with strong proficiency in Python and/or Kotlin
- Hands-on experience building or operating data pipelines, replication systems, or ETL/ELT platforms
- Experience designing systems that integrate LLMs with structured data, logs, APIs, or retrieval systems
- Familiarity with agentic or orchestration frameworks (e.g., LangChain, Pydantic AI, Temporal-style workflows)
- Experience deploying and monitoring production systems, including LLMOps, observability, and alerting
- Experience running services on Kubernetes, Helm, Terraform, and major cloud providers
- Strong understanding of APIs, databases, connectors, schemas, and telemetry in distributed environments
- Systems-level thinking with an emphasis on performance, reliability, cost, and scalability
- A startup-ready mindset: comfortable with ambiguity, moving fast, and owning problems end-to-end
- A builder’s instinct for automation, leverage, and developer experience
NICE TO HAVE:
- Experience with open-source platforms, especially in data integration or infrastructure tooling
- Familiarity with Airbyte, CDKs, or connector-based architectures
- Exposure to large-scale connector fleets, schema evolution, CDC, or long-running sync execution
- Background in control plane/data plane architectures or internal developer platforms
LOCATION:
- Onsite 5 days/week in San Francisco, CA
If you find this role exciting, we encourage you to apply even if you think you don’t meet all of the requirements!
Airbyte is an equal opportunity employer that does not discriminate on the basis of actual or perceived race, creed, color, religion, national origin, ancestry, age, physical or mental disability, pregnancy, genetic information, sex, sexual orientation, gender identity or expression, marital status, familial status, domestic violence victim status, veteran or military status, or any other legally recognized protected basis under federal, state or local laws. Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
Airbyte is committed to providing reasonable accommodations for qualified individuals with disabilities in our job application procedures. Please let us know if you need assistance or accommodations due to a disability.
