Data Platform Development: Architect and optimize a scalable Lakehouse-based Data Platform (AWS S3, Delta Lake, Redshift) with robust pipelines built on Airflow, Airbyte, dbt, and Kafka (see the orchestration sketch after this list).
ML/AI Integration: Design and implement data infrastructure for ML/AI workflows, including feature stores, model training pipelines, and inference systems (see the feature-table sketch after this list).
Collaboration: Partner with Data Scientists and stakeholders to translate business requirements into scalable data and ML/AI solutions, enabling analytics, reporting, and predictive modeling.
Performance Optimization: Ensure the scalability, reliability, and cost-efficiency of data systems by tuning Kafka, dbt, Airflow, and the underlying storage layer.
Innovation & Research: Continuously research and implement new technologies to enhance the platform, applying best practices in Data Engineering, DataOps, and MLOps.
Data Governance Implementation: Design and implement Data Governance frameworks, including policies, processes, and tools for data quality, data lineage, data cataloging, access control, and regulatory compliance (e.g., CCPA and other privacy rules governing PII). Use tools such as AWS Glue, Lake Formation, or Collibra to ensure data integrity and security (see the access-control sketch after this list).
Technical Leadership: Define the technical roadmap, establish best practices in Data Engineering and DataOps, and mentor a team of data engineers.
Team Management: Plan and assign tasks to data engineers, evaluate performance, and support the professional development of the team.
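For illustration, a minimal sketch of the kind of pipeline orchestration the platform work involves, assuming an Airflow 2.4+ deployment that runs a dbt project after ingestion; the DAG name, schedule, and project paths are hypothetical placeholders, not part of the role description.

```python
# Hypothetical Airflow DAG: run and test dbt models on a daily schedule.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="lakehouse_daily_transform",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                   # Airflow 2.4+ style schedule argument
    catchup=False,
    tags=["lakehouse", "dbt"],
) as dag:
    # Build the staging and mart models defined in the dbt project
    # (paths are placeholders for wherever the project is deployed).
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt --profiles-dir /opt/dbt",
    )

    # Validate the freshly built models with the project's dbt tests.
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt --profiles-dir /opt/dbt",
    )

    dbt_run >> dbt_test
```

Keeping transformation logic in dbt and only the scheduling in Airflow is one common way to keep pipelines testable and cheap to operate.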
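For the ML/AI integration work, a minimal sketch of a feature table materialised in the Lakehouse, assuming the deltalake (delta-rs) Python package and an S3 path readable by both training and inference jobs; the bucket, path, and feature columns are hypothetical.

```python
# Hypothetical feature-table write: one Delta table shared by offline
# training and online inference. AWS credentials come from the environment.
import pandas as pd
from deltalake import write_deltalake

# Example feature frame; in practice this would come from an upstream
# dbt model or Spark job rather than being built inline.
features = pd.DataFrame(
    {
        "customer_id": [1, 2, 3],
        "orders_last_30d": [4, 0, 7],
        "avg_basket_value": [23.5, 0.0, 41.2],
    }
)

# Overwrite the feature table; training and inference read the same path,
# which keeps offline and online features consistent.
write_deltalake(
    "s3://example-lakehouse/features/customer_activity",  # hypothetical path
    features,
    mode="overwrite",
)
```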
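For the governance work, a minimal sketch of column-level access control, assuming AWS Lake Formation on top of the Glue Data Catalog; the account ID, IAM role, database, table, and column names are hypothetical placeholders.

```python
# Hypothetical Lake Formation grant: analysts get SELECT on a catalogued
# table, with PII columns excluded from the grant.
import boto3

lakeformation = boto3.client("lakeformation", region_name="eu-west-1")

lakeformation.grant_permissions(
    Principal={
        # Hypothetical IAM role used by the analytics team.
        "DataLakePrincipalIdentifier": "arn:aws:iam::123456789012:role/analytics-read"
    },
    Resource={
        "TableWithColumns": {
            "DatabaseName": "lakehouse_marts",  # hypothetical Glue database
            "Name": "customers",                # hypothetical table
            # Expose all columns except those flagged as PII.
            "ColumnWildcard": {"ExcludedColumnNames": ["email", "phone_number"]},
        }
    },
    Permissions=["SELECT"],
    PermissionsWithGrantOption=[],
)
```

Grants like this are typically managed as code (e.g., via Terraform or a governance tool) rather than issued ad hoc, so that access control stays auditable.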