Data Engineer - PySpark

Citigroup

Chennai, Tamil Nādu, India · On-site · Full-time · 1mo ago

This is a data engineer position: a programmer responsible for the design, development, implementation, and maintenance of data flow channels and data processing systems that support the collection, storage, batch and real-time processing, and analysis of information in a scalable, repeatable, and secure manner, in coordination with the Data & Analytics team.

The overall objective is to define optimal solutions for data collection, processing, and warehousing. The role requires Spark Java development expertise in big data processing, together with Python and Apache Spark, particularly within the banking and finance domain. He/She designs, codes, and tests data systems and works on integrating them into the internal infrastructure.

Responsibilities:

  • Ensure high-quality software development, with complete documentation and traceability
  • Develop and optimize scalable Spark Java-based data pipelines for processing and analyzing large-scale financial data
  • Design and implement distributed computing solutions for risk modeling, pricing, and regulatory compliance
  • Ensure efficient data storage and retrieval using Big Data technologies
  • Implement best practices for Spark performance tuning, including partitioning, caching, and memory management
  • Maintain high code quality through testing, CI/CD pipelines, and version control (Git, Jenkins)
  • Work on batch processing frameworks for market risk analytics
  • Promote unit/functional testing and code inspection processes
  • Work with business stakeholders and Business Analysts to understand the requirements
  • Work with data scientists to understand and interpret complex datasets

Qualifications:

  • 5-8 years of experience working in data ecosystems.
  • 4-5 years of hands-on experience with Hadoop, Scala, Java, Spark, Hive, Kafka, Impala, Unix scripting, and other big data frameworks.
  • 3+ years of experience with relational SQL and NoSQL databases: Oracle, MongoDB, HBase
  • Strong proficiency in Python and Spark Java, with knowledge of core Spark concepts (RDDs, DataFrames, Spark Streaming, etc.), Scala, and SQL
  • Data integration, migration, and large-scale ETL experience (common ETL platforms such as PySpark, DataStage, Ab Initio, etc.): ETL design and build, handling, reconciliation, and normalization
  • Data modeling experience (OLAP, OLTP, logical/physical modeling, normalization, knowledge of performance tuning)
  • Experience working with large and multiple datasets and data warehouses
  • Experience building and optimizing big data pipelines, architectures, and datasets.
  • Strong analytical skills and experience working with unstructured datasets
  • Ability to effectively use complex analytical, interpretive, and problem-solving techniques
  • Experience with Confluent Kafka, Red Hat jBPM, and CI/CD build pipelines and toolchain: Git, Bitbucket, Jira
  • Experience with external cloud platforms such as OpenShift, AWS, and GCP
  • Experience with container technologies (Docker, Pivotal Cloud Foundry) and supporting frameworks (Kubernetes, OpenShift, Mesos)
  • Experience integrating search solutions with middleware and distributed messaging (Kafka)
  • Highly effective interpersonal and communication skills with tech/non-tech stakeholders.
  • Experience with the software development life cycle and good problem-solving skills.
  • Excellent problem-solving skills and a strong mathematical and analytical mindset
  • Ability to work in a fast-paced financial environment

Education:

  • Bachelor’s/University degree or equivalent experience in computer science, engineering, or similar domain

Technical Skills:

  • Very proficient in Scala & Python
  • Advanced SQL knowledge
  • Very good and fast debugging skills
  • Versioning - GitHub (or Bitbucket)
  • CI/CD pipeline - RLM / UDeploy / Harness
  • IDE - IntelliJ and VS Code
  • Big Data - HDFS / Hive / Impala
  • Unix - scripting skills
  • Database - Hive / HBase / Snowflake / MongoDB

About Citigroup

Citigroup

Public

Citigroup Inc. or Citi is an American multinational investment bank and financial services company based in New York City. The company was formed in 1998 by the merger of Citicorp, the bank holding company for Citibank, and Travelers; Travelers was spun off from the company in 2002.

Employees: 10,001+

Headquarters: New York City

Reviews

3.3 (4 reviews)

Work Life Balance: 3.0

Compensation: 3.2

Culture: 2.8

Career: 2.5

Management: 2.7

35% Recommend to a Friend

Pros

Compensation increases for investment banking roles

Legitimate investment banking employer

Internship opportunities available

Cons

Unclear career progression paths

Limited meaningful experience in internships

Compensation raises lower than competitors

Salary Ranges

28 data points

Mid/L4 · Business Risk Intermediate Analyst (1 report)

Total: $77,165 / year

Base: $67,100

Stock: -

Bonus: -

Interview Experience

5 interviews

Difficulty: 2.8 / 5

Duration: 14-28 weeks

Experience: Positive 0% · Neutral 40% · Negative 60%

Interview Process

  1. Application Review
  2. Recruiter Screen
  3. Programming Assessment
  4. Hiring Manager Interview
  5. Panel/Superday Interviews
  6. Final Decision

Common Questions

Technical Knowledge

Case Study

Behavioral/STAR

Past Experience

Culture Fit