Master Data Management & Data Quality Intern

GitHub

United States · On-site · Internship

Required skills

SQL

Salesforce

## About GitHub

GitHub is the world’s leading platform for agentic software development — powered by Copilot to build, scale, and deliver secure software. Over 180 million developers, including more than 90% of the Fortune 100 companies, use GitHub to collaborate, and more than 77,000 organisations have adopted GitHub Copilot.

Locations

In this role you can work from Remote, United States

Overview

The Revenue Operations & Data Governance team is building a Master Data Management (MDM) foundation to establish a trusted, unified view of our customers and accounts. As our MDM & Data Quality Intern, you will work directly alongside a Solution Architect to design and validate our MDM Proof of Concept, focused on a single, well-scoped entity domain (Accounts). Your work will lay the analytical and documentary groundwork that carries this initiative from exploration into a production-ready recommendation. This is a hands-on, high-impact role where your findings and deliverables will directly shape GitHub's long-term data strategy.

This is a remote summer internship lasting 12 consecutive weeks, with start dates between May 18 and June 15, 2026.

Responsibilities

  • Partner with the Solution Architect to audit and map Account data across source systems (CRM, billing, product), documenting field-level lineage, ownership, and quality gaps.
  • Support the design and testing of match/merge and survivorship rules for the Account entity, defining which source system wins for each attribute and why.
  • Assist in building and validating a sandbox POC Golden Record for the Account domain, including deduplication logic, confidence scoring, and a sample output dataset.
  • Measure and report on baseline data quality metrics (duplicate rates, completeness scores, and field-level accuracy) to establish a benchmark for MDM impact.
  • Document POC outcomes, key decisions, edge cases, and a clear handoff package to guide the production engineering team.
  • Develop a draft data stewardship process, including how records get flagged, reviewed, and approved, in collaboration with the Solution Architect and business stakeholders.
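
To make the metrics and survivorship work above more concrete, here is a minimal Python sketch of what a baseline measurement and a source-priority rule might look like. Everything in it is an illustrative assumption rather than GitHub's actual approach: the sample records, the field names, the use of `domain` as a match key, and the `SOURCE_PRIORITY` table are all hypothetical.

```python
from collections import Counter

# Hypothetical Account records pulled from two source systems.
accounts = [
    {"source": "crm", "name": "Acme Corp", "domain": "acme.com", "country": "US"},
    {"source": "billing", "name": "ACME Corp.", "domain": "acme.com", "country": None},
    {"source": "crm", "name": "Globex", "domain": "globex.io", "country": "DE"},
    {"source": "billing", "name": "Initech", "domain": None, "country": "US"},
]


def completeness(records, field):
    """Share of records in which `field` is populated."""
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def duplicate_rate(records, key):
    """Share of keyed records that repeat an already-seen match key."""
    keys = [r[key].lower() for r in records if r.get(key)]
    counts = Counter(keys)
    redundant = sum(n - 1 for n in counts.values())
    return redundant / len(keys) if keys else 0.0


# Hypothetical survivorship rule: for each attribute, the ordered list of
# source systems to trust, most trusted first.
SOURCE_PRIORITY = {"name": ["crm", "billing"], "country": ["billing", "crm"]}


def golden_record(cluster, priority):
    """Build one record from a cluster of matched records using source priority."""
    golden = {}
    for field, sources in priority.items():
        for src in sources:
            value = next(
                (r[field] for r in cluster if r["source"] == src and r.get(field)),
                None,
            )
            if value:
                golden[field] = value
                break  # first populated value from the most trusted source wins
    return golden


print(completeness(accounts, "domain"))
print(duplicate_rate(accounts, "domain"))
print(golden_record(accounts[:2], SOURCE_PRIORITY))
```

In this sketch an exact match on a lowercased domain stands in for match logic. A real POC would likely need fuzzy matching and confidence scoring, which this example does not attempt.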

Qualifications

Required Qualifications:

  • Currently pursuing a Master's Degree in Data Management, Information Systems, Data Analytics, or a related field, with at least one quarter/semester remaining after the internship.
  • Expected conferral date between December 2026 and August 2027.
  • Foundational understanding of data quality, data modeling, or MDM concepts through coursework or project experience.
  • Comfortable working with SQL and exploring relational or CRM datasets.

Preferred Qualifications:

  • Familiarity with CRM data structures (Salesforce or similar) and common data quality challenges like duplicates, incomplete records, or inconsistent formatting.
  • Exposure to MDM concepts such as entity resolution, match/merge logic, or survivorship rules, even in an academic context.
  • Strong analytical thinking, able to investigate messy data, identify patterns, and form clear recommendations.
  • Excellent documentation skills, able to translate technical findings into clear, business-facing write-ups and process guides.
  • Collaborative and curious, comfortable asking questions and working within a structured mentorship model.

Compensation Range

The base pay range for this job is USD $31.82 to USD $84.42 per hour.

These pay ranges are intended to cover roles based across the United States. An individual's base pay depends on various factors, including geographical location and a review of the applicant's experience, knowledge, skills, and abilities. At GitHub, certain roles are eligible for benefits and additional rewards, including annual bonus and stock. These rewards are allocated based on individual impact in role. In addition, certain roles have the opportunity to earn sales incentives based on revenue or utilization, depending on the terms of the plan and the employee's role.

GitHub values

  • Customer-obsessed

  • Ship to learn

  • Growth mindset

  • Own the outcome

  • Better together

  • Diverse and inclusive

Manager fundamentals

  • Model

  • Coach

  • Care

Leadership principles

  • Create clarity

  • Generate energy

  • Deliver success

Who We Are

GitHub is the world’s leading AI-powered developer platform with 150 million developers and counting. We’re also home to the biggest open-source community on earth (and 99% of the world’s software has open-source code in its DNA). Many of the apps and programs you use every day are built on GitHub.

Our teams are dreamers, doers, and pioneers, leading the way in AI, driving humanitarian efforts around the globe, and even sending open source to Mars (and beyond!).

At GitHub, our goal is to create the space you need to do your best work. We’re remote-first and offer competitive pay, generous learning and growth opportunities, and excellent benefits to support you, wherever you are, because we know that people flourish when they can work on their own terms.

Join us, and let’s change the world, together.

EEO Statement

  • GitHub is made up of people from a wide variety of backgrounds and lifestyles. We embrace diversity and invite applications from people of all walks of life. We don't discriminate against employees or applicants based on gender identity or expression, sexual orientation, race, religion, age, national origin, citizenship, disability, pregnancy status, veteran status, or any other differences. Also, if you have a disability, please let us know if there's any way we can make the interview process better for you; we're happy to accommodate!

About GitHub

GitHub · Series B

A software company that offers code hosting services that allow developers to build software for open-source and private projects.

  • Employees: 501-1,000
  • Headquarters: San Francisco
  • Valuation: $7.5B

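Reviews

Overall rating: 2.6 (3 reviews)

  • Work-life balance: 2.0
  • Compensation: 3.0
  • Company culture: 2.5
  • Career growth: 2.0
  • Management: 2.0
  • Would recommend to a friend: 25%

Pros

  • Remote-first culture transition
  • Cost savings potential
  • Technical assessment processes

Cons

  • Job security concerns from layoffs
  • Overly complicated hiring process
  • Poor work-life balance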

Salary range

22 data points

Senior/L5 · Senior Partner Engineer / Strategic Solutions Engineer / Senior Solutions Engineer (3 reports)

  • Total annual compensation: $212,394
  • Base salary: $163,380
  • Stock: -
  • Bonus: -

$200,460 - $236,600

Interview experience

3 interviews

  • Difficulty: 3.3 / 5
  • Duration: 14-28 weeks
  • Offer rate: 33%
  • Experience: 33% positive, 67% neutral, 0% negative

Interview process

  1. Application Review
  2. Recruiter Screen
  3. Technical Phone Screen
  4. Onsite/Virtual Interviews
  5. Team Matching
  6. Offer

Common questions

  • Coding/Algorithm
  • System Design
  • Behavioral/STAR
  • Technical Knowledge
  • Culture Fit