Data Engineer/Scientist for ML

Samsung

24A, Kifissias Avenue, Athens, Greece · On-site · Full-time · 1w ago

Required Skills

Python

Pandas

NumPy

SQL

NoSQL

Position Summary

We are seeking a specialized Data Engineer or Data Scientist to manage the complete lifecycle of the training data that powers our AI models. This role is pivotal in curating, sanitizing, and structuring high-quality speech and text datasets, which serve as the foundation for training state-of-the-art Automatic Speech Recognition (ASR), Text-to-Speech (TTS), and Machine Translation (MT) systems.

Role and Responsibilities

Data Pipeline Architecture: Design, build, and maintain robust pipelines for the ingestion, processing, and management of heterogeneous data sources, ensuring efficient flow from raw collection to model-ready inputs.

Unstructured Data Extraction: Extract and process high-fidelity speech data from complex, unstructured sources, including video feeds, multi-channel audio recordings, and raw text archives.

Corpus Curation & Management: Organize, structure, and analyze complex linguistic datasets, including speech-to-text alignments and parallel translation corpora, ensuring metadata accuracy and consistency.

Data Cleaning & Noise Reduction: Implement rigorous quality control protocols to identify and correct errors, remove artifacts, and apply noise reduction techniques to enhance audio clarity.

Dataset Enhancement Strategies: Develop and execute strategies to improve data quantity and diversity, including the application of data augmentation techniques and synthetic data generation.

Cross-Functional Collaboration: Partner closely with Machine Learning Engineers to align data preprocessing workflows and formatting with the specific requirements of various model architectures.

Skills and Qualifications

Programming Proficiency: Advanced proficiency in Python and core data manipulation libraries (e.g., Pandas, NumPy) with the ability to write clean, efficient, and scalable code.

Audio & Data Tooling: Hands-on experience with audio processing and analysis tools (e.g., librosa, torchaudio, Praat) and database management systems (SQL/NoSQL).

ML & NLP Fundamentals: Solid understanding of Machine Learning principles and the specific preprocessing and tokenization requirements for Natural Language Processing (NLP) and speech tasks.

Data Quality Expertise: Proven track record in handling large-scale, messy, or unstructured datasets, with a strong focus on data validation, cleaning, and sanitization techniques.

  • Please visit Samsung membership to see the Privacy Policy, which defaults according to your location, at: https://account.samsung.com/membership/policy/privacy. You can change Country/Language at the bottom of the page. If you are a resident of the European Economic Area, please click here: https://europe-samsung.com/ghrp/PrivacyNoticeforEU.html

About Samsung

A technology company that engages in consumer electronics, IT and mobile communications, and device solutions.

Employees: 10,001+

Headquarters: Seoul

Valuation: $267B

Reviews

Overall: 3.7 (15 reviews)

Work Life Balance: 2.0

Compensation: 2.5

Culture: 1.5

Career: 2.0

Management: 1.8

Recommend to a Friend: 15%

Pros

Hardware/technology leadership

Competitive salary offers for some roles

Sign-on bonuses available

Cons

Toxic culture and politics

Poor work-life balance with strict RTO policies

Micromanagement and employee tracking

Salary Ranges

22 data points

Senior/L5 · Digital Transformation Manager (1 report)

Total: $180,827 / year

Base: $157,414

Stock: -

Bonus: -

Interview Experience

6 interviews

Difficulty: 2.2 / 5

Duration: 14-28 weeks

Offer Rate: 67%

Experience: Positive 33%, Neutral 33%, Negative 34%

Interview Process

1. Application Review

2. Phone Screen

3. Technical/Video Interview

4. Team Interview

5. Offer

Common Questions

Technical Knowledge

Behavioral/STAR

Past Experience

Role-Specific Skills