Technical Lead Data Engineer

  • Full-time
  • Tech.PMI
  • HAN

Description

Join our Technology Innovations Center of Excellence (CoE) at Crossian, a fast-growing startup that reached a $100M milestone in 2024. In this dynamic role, you'll check in daily with your team to stay aligned on project goals and collaborate across our ecosystem, bringing an Agile mindset to a unicorn-worthy startup culture.

We are building an E-commerce Data Platform following a Lakehouse architecture, leveraging Airflow, Airbyte, dbt, and Redshift on AWS Cloud. The system is currently in the early development stage, giving you a unique opportunity to be involved from the ground up—shaping the architecture, defining data pipelines, and implementing best practices. The platform will integrate data from domains such as Storefront, Payment Gateway, Inventory, Catalog, Logistics, Marketing Insights, and CRM.
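
To make the stack concrete, here is a minimal, hedged sketch of how one daily pipeline in this architecture might be orchestrated: Airflow triggers an Airbyte sync from a source system into the lake, then runs dbt models against Redshift. The DAG id, Airbyte connection UUID, and dbt paths are hypothetical placeholders, not excerpts from our codebase.

```python
# Illustrative sketch only: the DAG id, connection UUID, and paths below
# are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator

with DAG(
    dag_id="storefront_orders_daily",  # hypothetical pipeline name
    start_date=datetime(2025, 1, 1),
    schedule="@daily",                 # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
):
    # Extract & load: Airbyte syncs the storefront source into the lake/warehouse.
    sync_storefront = AirbyteTriggerSyncOperator(
        task_id="sync_storefront",
        airbyte_conn_id="airbyte_default",          # assumed Airflow connection
        connection_id="<airbyte-connection-uuid>",  # placeholder
    )

    # Transform: dbt builds staging and mart models in Redshift.
    run_dbt = BashOperator(
        task_id="run_dbt",
        bash_command="dbt build --project-dir /opt/dbt --profiles-dir /opt/dbt",
    )

    sync_storefront >> run_dbt
```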

Why Join Us?

We’re not just about building platforms; we’re about creating a workplace where talent thrives. Here's what you’ll gain:

Exceptional Compensation
  • Up to 30 months of salary per year through competitive salary packages and performance-based bonuses.
  • Total annual compensation of $70,000, reflecting your expertise and contribution.
Growth Opportunities
  • Hands-on exposure to cutting-edge technologies and complex system architecture in a global-scale project.
  • Clear career advancement pathways and access to continuous professional development programs.
Global Vision
  • Contribute to projects that redefine how brands connect with consumers globally.
  • Opportunity to work on challenging problems in data analytics, customer engagement, and operational efficiency.

In addition to development tasks, you'll collaborate closely with other CoEs to understand their unique challenges and, drawing on your technical expertise, propose innovative solutions that elevate their performance and contribute to the organization's overall success.

Responsibilities

  1. Data Platform Development: Architect and optimize a scalable Lakehouse-based Data Platform (AWS S3, Delta Lake, Redshift) with robust pipelines using Airflow, Airbyte, dbt, and Kafka.
  2. ML/AI Integration: Design and implement data infrastructure for ML/AI workflows, including feature stores, model training pipelines, and inference systems (a brief sketch follows this list).
  3. Collaboration: Partner with Data Scientists and stakeholders to translate business requirements into scalable data and ML/AI solutions, enabling analytics, reporting, and predictive modeling.
  4. Performance Optimization: Ensure scalability, reliability, and cost-efficiency of data systems, optimizing tools like Kafka, dbt, Airflow, and storage solutions.
  5. Innovation & Research: Continuously research and implement new technologies to enhance the platform, applying best practices in Data Engineering, DataOps, and MLOps.
  6. Data Governance Implementation: Design and implement Data Governance frameworks, including policies, processes, and tools for data quality, data lineage, data cataloging, access control, compliance with privacy regulations (e.g., CCPA), and the safe handling of PII. Utilize tools like AWS Glue, Lake Formation, or Collibra to ensure data integrity and security.
  7. Technical Leadership: Define the technical roadmap, establish best practices in Data Engineering and DataOps, and mentor a team of data engineers.
  8. Team Management: Plan and assign tasks to data engineers, evaluate performance, and support the professional development of the team.
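
As an illustration of responsibility 2, below is a hedged sketch of materializing a batch feature table from Redshift for model training. Every table, column, cluster, and bucket name is a hypothetical placeholder; in practice this logic might live in a dbt model or a Spark job, with the snapshot registered in a feature store.

```python
# Hedged sketch: all identifiers below are hypothetical placeholders.
import pandas as pd
import redshift_connector  # AWS's Python driver for Amazon Redshift

conn = redshift_connector.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",  # placeholder
    database="analytics",
    user="etl_user",
    password="...",  # fetch from a secrets manager in real code
)

# Aggregate raw order events into per-customer training features.
features = pd.read_sql(
    """
    SELECT customer_id,
           COUNT(*)                                AS orders_90d,
           SUM(order_total)                        AS revenue_90d,
           DATEDIFF(day, MAX(order_ts), GETDATE()) AS days_since_last_order
    FROM storefront.orders
    WHERE order_ts >= DATEADD(day, -90, GETDATE())
    GROUP BY customer_id
    """,
    conn,
)

# Publish a snapshot to S3 so training and inference jobs share one
# feature definition (writing to s3:// requires s3fs and pyarrow).
features.to_parquet("s3://example-bucket/features/customer_orders_90d.parquet")
```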


Requirements & Skills

  1. Bachelor's degree in Computer Science or a related field.
  2. 6+ years of experience in data engineering, with at least 2 years in a leadership or team management role.
  3. Proficiency in advanced SQL and programming languages such as Python, C#, or Java for data processing and scripting.
  4. Strong experience with relational databases (e.g., PostgreSQL, MySQL) and cloud platforms (e.g., AWS, GCP, Azure).
  5. Deep knowledge of data architecture, including Lakehouse and Data Warehouse patterns, as well as NoSQL stores (e.g., Elasticsearch, Redis).
  6. Experience designing and implementing complex data architectures (e.g., Lakehouse, Data Mesh) and data processing systems (streaming and batch).
  7. Proven experience deploying ML/AI models into production, including building and managing data pipelines for model training and inference.
  8. Strong leadership skills, including team management, project planning, and effective communication with both technical and non-technical stakeholders.
  9. Excellent problem-solving skills, attention to detail, and the ability to work independently or lead a team in an Agile environment.
  10. Proficient in English, with the ability to communicate, read, and write effectively in a professional work environment.

Preferred (but not required)

  1. Knowledge of container orchestration, such as Kubernetes or AWS EKS.
  2. Experience with microservice architecture.
  3. Experience with data visualization tools, such as AWS QuickSight or Grafana.
  4. Experience with data streaming platforms, such as Apache Kafka or Apache NiFi.
  5. Knowledge of data governance and compliance regulations (e.g., GDPR, CCPA).
  6. Familiarity with Agile tooling, such as GitLab, Jira, Confluence, and Slack.