Required skills: SQL, Salesforce
## About GitHub
GitHub is the world’s leading platform for agentic software development — powered by Copilot to build, scale, and deliver secure software. Over 180 million developers, including more than 90% of the Fortune 100 companies, use GitHub to collaborate, and more than 77,000 organisations have adopted GitHub Copilot.
Locations
This role is remote within the United States.
Overview
The Revenue Operations & Data Governance team is building a Master Data Management (MDM) foundation to establish a trusted, unified view of our customers and accounts. As our MDM & Data Quality Intern, you will work directly alongside a Solution Architect to design and validate our MDM Proof of Concept, focused on a single, well-scoped entity domain (Accounts). Your work will lay the analytical and documentary groundwork that carries this initiative from exploration into a production-ready recommendation. This is a hands-on, high-impact role where your findings and deliverables will directly shape GitHub's long-term data strategy.
This is a remote summer internship lasting 12 consecutive weeks, with start dates between May 18 and June 15, 2026.
Responsibilities
- Partner with the Solution Architect to audit and map Account data across source systems (CRM, billing, product), documenting field-level lineage, ownership, and quality gaps.
- Support the design and testing of match/merge and survivorship rules for the Account entity, defining which source system wins for each attribute and why.
- Assist in building and validating a sandbox POC Golden Record for the Account domain, including deduplication logic, confidence scoring, and a sample output dataset.
- Measure and report on baseline data quality metrics (duplicate rates, completeness scores, and field-level accuracy) to establish a benchmark for MDM impact.
- Document POC outcomes, key decisions, edge cases, and a clear handoff package to guide the production engineering team.
- Develop a draft data stewardship process, including how records get flagged, reviewed, and approved, in collaboration with the Solution Architect and business stakeholders.
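For candidates less familiar with the terms used in the responsibilities above, the following is a minimal Python sketch of what completeness scoring, duplicate-rate measurement, and a survivorship rule can look like. It is illustrative only and is not GitHub's POC: the sample records, the field names, the `SURVIVORSHIP` priority table, and the choice of `domain` as a match key are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical Account records from three source systems (made-up data).
RECORDS = [
    {"source": "crm", "name": "Acme Corp.", "domain": "acme.com",
     "country": "US", "billing_email": None},
    {"source": "billing", "name": "ACME Corp", "domain": "acme.com",
     "country": None, "billing_email": "ap@acme.com"},
    {"source": "product", "name": "acme corp", "domain": "acme.com",
     "country": "US", "billing_email": None},
    {"source": "crm", "name": "Globex", "domain": "globex.io",
     "country": None, "billing_email": None},
]

# Assumed survivorship rule: for each attribute, sources in priority order.
SURVIVORSHIP = {
    "name": ["crm", "billing", "product"],
    "country": ["crm", "product", "billing"],
    "billing_email": ["billing", "crm", "product"],
}


def is_filled(value):
    """A value counts as present if it is neither None nor an empty string."""
    return value not in (None, "")


def completeness(records, field):
    """Share of records that have a non-empty value for `field`."""
    return sum(1 for r in records if is_filled(r.get(field))) / len(records)


def group_by_match_key(records, key="domain"):
    """Naive match rule: records sharing the same key are one account."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key].lower()].append(r)
    return dict(groups)


def duplicate_rate(records, key="domain"):
    """Share of records that are redundant copies of another record."""
    groups = group_by_match_key(records, key)
    return (len(records) - len(groups)) / len(records)


def golden_record(group, key="domain"):
    """Merge a matched group: each attribute comes from the highest-priority
    source that has a value. If a source appears twice in a group, the later
    record is used (a simplification)."""
    by_source = {r["source"]: r for r in group}
    merged = {key: group[0][key].lower()}
    for field, priority in SURVIVORSHIP.items():
        merged[field] = next(
            (by_source[s][field] for s in priority
             if s in by_source and is_filled(by_source[s].get(field))),
            None,
        )
    return merged


if __name__ == "__main__":
    print("country completeness:", completeness(RECORDS, "country"))
    print("duplicate rate:", duplicate_rate(RECORDS))
    for group in group_by_match_key(RECORDS).values():
        print(golden_record(group))
```

In practice, matching usually relies on fuzzy comparison and confidence scoring rather than a single exact key; the exact-match grouping here is only meant to keep the sketch short.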
Qualifications
Required Qualifications:
- Currently pursuing a Master's Degree in Data Management, Information Systems, Data Analytics, or a related field, with at least one quarter/semester remaining after the internship.
- Expected conferral date between December 2026 and August 2027.
- Foundational understanding of data quality, data modeling, or MDM concepts through coursework or project experience.
- Comfortable working with SQL and exploring relational or CRM datasets.
Preferred Qualifications:
- Familiarity with CRM data structures (Salesforce or similar) and common data quality challenges like duplicates, incomplete records, or inconsistent formatting.
- Exposure to MDM concepts such as entity resolution, match/merge logic, or survivorship rules, even in an academic context.
- Strong analytical thinking, able to investigate messy data, identify patterns, and form clear recommendations.
- Excellent documentation skills, able to translate technical findings into clear, business-facing write-ups and process guides.
- Collaborative and curious, comfortable asking questions and working within a structured mentorship model.
Compensation Range
The base pay range for this job is USD $31.82 - USD $84.42 per hour.
These pay ranges are intended to cover roles based across the United States. An individual's base pay depends on various factors, including geographical location and a review of the applicant's experience, knowledge, skills, and abilities. At GitHub, certain roles are eligible for benefits and additional rewards, including annual bonus and stock. These rewards are allocated based on individual impact in role. In addition, certain roles have the opportunity to earn sales incentives based on revenue or utilization, depending on the terms of the plan and the employee's role.
GitHub values
- Customer-obsessed
- Ship to learn
- Growth mindset
- Own the outcome
- Better together
- Diverse and inclusive
Manager fundamentals
- Model
- Coach
- Care
Leadership principles
- Create clarity
- Generate energy
- Deliver success
Who We Are
GitHub is the world’s leading AI-powered developer platform with 150 million developers and counting. We’re also home to the biggest open-source community on earth (and 99% of the world’s software has open-source code in its DNA). Many of the apps and programs you use every day are built on GitHub.
Our teams are dreamers, doers, and pioneers, leading the way in AI, driving humanitarian efforts around the globe, and even sending open source to Mars (and beyond!).
At GitHub, our goal is to create the space you need to do your best work. We’re remote-first and offer competitive pay, generous learning and growth opportunities, and excellent benefits to support you, wherever you are—because we know that people flourish when they can work on their own terms.
Join us, and let’s change the world, together.
EEO Statement
- GitHub is made up of people from a wide variety of backgrounds and lifestyles. We embrace diversity and invite applications from people of all walks of life. We don't discriminate against employees or applicants based on gender identity or expression, sexual orientation, race, religion, age, national origin, citizenship, disability, pregnancy status, veteran status, or any other differences. Also, if you have a disability, please let us know if there's any way we can make the interview process better for you; we're happy to accommodate!