Data Engineer - Databricks Specialist

Remote Full-time
We are seeking an experienced Data Engineer with deep expertise in Databricks to design, build, and maintain scalable data pipelines and analytics solutions. This role requires 5+ years of hands-on data engineering experience with a strong focus on the Databricks platform.

Key Responsibilities

Data Pipeline Development & Management:
- Design and implement robust, scalable ETL/ELT pipelines using Databricks and Apache Spark
- Process large volumes of structured and unstructured data
- Develop and maintain data workflows using Databricks Workflows, Apache Airflow, or similar orchestration tools
- Optimize data processing jobs for performance, cost efficiency, and reliability
- Implement incremental data processing patterns and change data capture (CDC) mechanisms

Databricks Platform Engineering:
- Build and maintain Delta Lake tables and implement medallion architecture (bronze, silver, gold layers)
- Develop streaming data pipelines using Structured Streaming and Delta Live Tables
- Manage and optimize Databricks clusters for various workloads
- Implement Unity Catalog for data governance, security, and metadata management
- Configure and maintain Databricks workspace environments across development, staging, and production

Data Architecture & Modeling:
- Design and implement data models optimized for analytical workloads
- Create and maintain data warehouses and data lakes on cloud platforms (Azure, AWS, or GCP)
- Implement data partitioning, indexing, and caching strategies for optimal query performance
- Collaborate with data architects to establish best practices for data storage and retrieval patterns

Performance Optimization & Monitoring:
- Monitor and troubleshoot data pipeline performance issues
- Optimize Spark jobs through proper partitioning, caching, and broadcast strategies
- Implement data quality checks and automated testing frameworks
- Manage cost optimization through efficient resource utilization and cluster management
- Establish monitoring and alerting systems for data pipeline health and performance

Collaboration & Best Practices:
- Work closely with data scientists, analysts, and business stakeholders to understand data requirements
- Implement version control using Git and follow CI/CD best practices for code deployment
- Document data pipelines, data flows, and technical specifications
- Mentor junior engineers on Databricks and data engineering best practices
- Participate in code reviews and contribute to establishing team standards

Required Qualifications

Experience & Skills:
- 5+ years of experience in data engineering with hands-on Databricks experience
- Strong proficiency in Python and/or Scala for Spark application development
- Expert-level knowledge of Apache Spark, including Spark SQL, DataFrames, and RDDs
- Deep understanding of Delta Lake and Lakehouse architecture concepts
- Experience with SQL and database optimization techniques
- Solid understanding of distributed computing concepts and data processing frameworks
- Proficiency with cloud platforms (Azure, AWS, or GCP) and their data services
- Experience with data orchestration tools (Databricks Workflows, Apache Airflow, Azure Data Factory)
- Knowledge of data modeling concepts for both OLTP and OLAP systems
- Familiarity with data governance principles and tools like Unity Catalog
- Understanding of streaming data processing and real-time analytics
- Experience with version control systems (Git) and CI/CD pipelines

Preferred Qualifications

- Databricks Certified Data Engineer certification (Associate or Professional)
- Experience with machine learning pipelines and MLOps on Databricks
- Knowledge of data visualization tools (Power BI, Tableau, Looker)
- Experience with infrastructure as code (Terraform, CloudFormation)
- Familiarity with containerization technologies (Docker, Kubernetes)
