Data Engineer

Job ID: 73137

Posted 1 day ago

Los Angeles, CA

Contract

$60 - $70/hr

Remote

Job Details

Essential Skills
Proficiency in Python: Must be able to design classes in Python, work with Terraform, and use Git for code reviews and collaborative development.

Kafka Expertise: Strong experience with Apache Kafka, including setting up consumers and S3 sinks for topics, with a focus on streaming data pipelines; the more hands-on Kafka experience, especially with S3 integration, the better (a brief illustrative sketch of this pattern follows this list).
AWS Services: General experience with Amazon MSK (Managed Streaming for Apache Kafka), Glue for ETL processes, S3 for storage, ECS for container orchestration, and related services to build scalable data infrastructure.
Data Lake Architectures: Strong understanding of working with semi-structured data and modern data lake architectures, including handling raw data in formats like JSON and YAML.
Data Pipeline Development: Proven ability to design and implement ETL/ELT processes for data ingestion, transformation, and loading, including incremental loads, data quality checks, and integration with streaming sources.
SQL Mastery: Advanced SQL skills for querying, transforming, and optimizing data across data warehouses and lakes (e.g., Redshift, Snowflake, or similar).
Version Control: Comfort with Git for collaborative development, branching, and merging data engineering projects.
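
As a rough illustration of the consumer-plus-S3-sink pattern referenced above, the sketch below (Python, assuming the kafka-python and boto3 client libraries; the broker address, topic, and bucket names are placeholders) reads a topic and writes micro-batches to S3:

    # Illustrative sketch only: consume a Kafka topic and sink micro-batches to S3.
    # Assumes kafka-python and boto3; broker, topic, and bucket are placeholders.
    import json
    import time

    import boto3
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "events",                              # placeholder topic name
        bootstrap_servers=["broker:9092"],     # e.g. an MSK bootstrap endpoint
        auto_offset_reset="earliest",
        enable_auto_commit=False,              # commit only after a successful S3 write
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )
    s3 = boto3.client("s3")

    batch = []
    for message in consumer:
        batch.append(message.value)
        if len(batch) >= 500:                  # flush every 500 records
            s3.put_object(
                Bucket="example-data-lake",    # placeholder bucket
                Key=f"raw/events/{int(time.time())}.json",
                Body="\n".join(json.dumps(r) for r in batch).encode("utf-8"),
            )
            consumer.commit()                  # advance offsets once the batch is durable
            batch = []

In practice, Kafka Connect's S3 sink connector covers much of this pattern as configuration rather than custom code; the sketch simply makes the moving parts explicit.
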
Desirable Skills
Cloud Data Platforms: Experience with cloud data warehouses and lakes (e.g., AWS Redshift, Snowflake, or Google BigQuery) to support data pipeline deployment and management.
Data Quality and Testing: Ability to implement data validation strategies, tests (e.g., uniqueness, referential integrity), and monitoring in pipelines to ensure reliability (an illustrative sketch follows this list).

Performance Optimization: Skills in optimizing data pipelines and queries for large datasets, including partitioning, indexing, and leveraging formats like Apache Iceberg for scalable table management (nice to have).
Collaboration Tools: Proficiency with tools like Slack or Jira for coordinating with Analytics Engineers (AEs) and tracking progress.
Documentation: Capability to create clear documentation for data pipelines, schemas, and integration processes to support team handoff.
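
As a rough illustration of the validation tests referenced above, the sketch below (Python, run through any DB-API 2.0 cursor such as redshift_connector or the Snowflake Python connector; table and column names are placeholders) shows a uniqueness check and a not-null check:

    # Illustrative sketch only: simple warehouse data quality checks.
    # Works with any DB-API 2.0 cursor; table/column names are placeholders.

    def check_unique(cursor, table: str, key_column: str) -> bool:
        """True when key_column holds no duplicate values in table."""
        cursor.execute(
            f"SELECT {key_column} FROM {table} "
            f"GROUP BY {key_column} HAVING COUNT(*) > 1 LIMIT 1"
        )
        return cursor.fetchone() is None

    def check_not_null(cursor, table: str, column: str) -> bool:
        """True when column contains no NULL values."""
        cursor.execute(f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL")
        return cursor.fetchone()[0] == 0

    # Hypothetical usage with an open cursor `cur`:
    #   assert check_unique(cur, "orders", "order_id")
    #   assert check_not_null(cur, "orders", "customer_id")

Frameworks such as dbt tests or Great Expectations package the same checks declaratively; the point here is simply what a uniqueness or not-null test asserts.
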
Additional Considerations

Adaptability: Ability to quickly learn existing data infrastructure and adapt pipelines to incorporate new streaming sources, data lakes, or raw assets.
Problem-Solving: Strong analytical skills to troubleshoot pipeline challenges, such as data inconsistencies in semi-structured formats, and propose effective solutions.
Communication: Comfort working with AEs to understand requirements and provide updates, ensuring smooth collaboration.


TSG is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.
#LI-KY1

 
