Data Engineer

We are looking for a Data Engineer with strong hands-on experience in PySpark, Databricks, and Azure data platforms to design, build, and support end-to-end data pipelines. The ideal candidate will develop and optimize data transformations, build production-grade Python components, and maintain cloud-native Azure environments while collaborating with application teams to deliver high-quality, reliable data. The role involves working with large-scale datasets, implementing ETL/ELT best practices, optimizing Databricks clusters, and using modern cloud technologies to support AI/ML initiatives.

About the Role

Location: Hybrid (3 days per week) in Chicago, IL

Duration: 6+ Month Contract

Interview: 2 video interviews and a final onsite interview

Responsibilities

Design, build, and support end-to-end data pipelines, including ingestion, transformation, validation, and publishing.
Develop and optimize SQL and PySpark/Databricks transformations for large datasets.
Build production-grade Python modules with logging, error handling, testing, and integration with APIs/files.
Create, maintain, and operate Azure Data Factory (ADF) pipelines, including triggers, parameterization, monitoring, and failure handling.
Work within Azure environments: ADLS Gen2 (Blob Storage), Azure SQL, Azure App Service, and resource groups.
Provision and maintain Azure components using Pulumi (Infrastructure as Code).
Optimize Databricks clusters, workflows, and jobs for performance and reliability.
Participate in code reviews, documentation, and operational support, including triage and root cause analysis.
Collaborate with application teams for integration, troubleshooting, and operational coordination.

Qualifications

Education: Bachelor's degree in Computer Science, Engineering, or a related technical field (or equivalent experience).

Required Skills

Experience: 5+ years as a Data Engineer, including 3+ years working with ETL/ELT, PySpark, and SQL.
SQL: Advanced querying, CTEs, views, joins, complex transformations, and performance tuning.
Python: 2+ years building production-quality modules, unit/integration testing, logging, and CI/CD integration.
Databricks: 1+ year working with notebooks, jobs, workflows, external/managed tables, Delta Lake, and basic cluster configuration.
Azure Data Factory (ADF): 1+ year creating and maintaining pipelines, including triggers, parameterization, monitoring, and error handling.
Azure Cloud: Hands-on with ADLS Gen2, Azure SQL, Azure App Service, and general Azure portal/resource group operations.
Infrastructure as Code: Experience provisioning Azure resources with Pulumi.
ETL/ELT Concepts: Strong understanding of pipeline patterns, incremental loads, data validation, and troubleshooting.

Preferred Skills

Nice-to-have: R for data validation, TypeScript for Pulumi pipelines, Java/.NET for integration, and Angular/Spring Boot for minor troubleshooting.

Motion Recruitment

Chicago, IL, United States