TZF-287 - Lead-sr. Data Engineer Azure

Buenos Aires
Permanent
Full-time

Posted 1 day ago

**About Fusemachines**

Fusemachines is a leading AI strategy, talent, and education services provider. Founded by Sameer Maskey Ph.D., Adjunct Associate Professor at Columbia University, Fusemachines has a core mission of democratizing AI. With a presence in 4 countries (Nepal, United States, Canada, and Dominican Republic) and more than 450 full-time employees, Fusemachines seeks to bring its global expertise in AI to transform companies around the world.

**About the role**

This is a remote, 1-year contract position responsible for leading, designing, building, and maintaining the infrastructure required for data integration, storage, processing, and analytics (BI, visualization, and advanced analytics) using Microsoft Azure in the Media domain.

**Qualification & Experience**
- Must have a full-time Bachelor's degree in Computer Science or similar from a top tier school.
- 4+ years of experience with Azure DevOps, Azure Cloud Platform, or other hyperscalers.
- At least 4 years of experience as a data engineer with strong expertise in Azure, working on generation of big datasets using different data sources, in the Media industry.
- Proven experience delivering projects and products for Data and Analytics as a data engineer.

**Following certifications**
- Microsoft Certified: Azure Fundamentals
- Microsoft Certified: Azure Data Engineer Associate
- Microsoft Certified: Azure Solutions Architect Expert (nice to have)
- Databricks Certified Associate Developer for Apache Spark
- Databricks Certified Data Engineer Associate (nice to have)

**Required skills/Competencies**
- Strong programming skills in one or more languages such as Python (must have) and Scala, and proficiency in writing efficient, optimized code for data integration, storage, processing, and manipulation.
- Strong experience using Markdown to document code, or automated documentation tools (e.g., PyDoc).
- Strong experience with scalable and distributed Data Processing Technologies such as Spark/PySpark (must have, experience with Azure Databricks is a plus), DBT and Kafka, to be able to handle large volumes of data.
- Expert in designing and implementing efficient ELT/ETL processes in Azure (experience with Azure Data Factory is a plus) and with open-source solutions, with the ability to develop custom integration solutions as needed.
- Skilled in Data Integration from different sources such as APIs, databases, flat files, event streaming, with technologies such as Azure Data Factory.
- Expertise in data cleansing, transformation, and validation.
- Hands-on experience with Jupyter Notebooks and with Python packaging and dependency management (Poetry, Pipenv).
- Proficiency with relational databases (Oracle, SQL Server, MySQL, Postgres, or similar) and NoSQL databases (MongoDB or Table).
- Good understanding of data modeling and database design principles, with the ability to design and implement efficient database schemas that meet the requirements of the data architecture and support data solutions.
- Strong understanding and experience with SQL and writing advanced SQL queries.
- Strong experience in designing and implementing Data Warehousing solutions in Azure with Azure Synapse Analytics and/or Snowflake.
- Familiarity with migration of code from one or more of SAS, R, Julia, SPSS to Python.
- Proven technical leadership on prior Big Data projects.
- Strong understanding of the software development lifecycle (SDLC), especially Agile methodologies.
- Strong knowledge of SDLC tools and technologies such as Azure DevOps, including project management software (Jira, Azure Boards, or similar), source code management (GitHub, Azure Repos, Bitbucket, or similar), CI/CD systems (GitHub Actions, Azure Pipelines, Jenkins, or similar), and binary repository managers (Azure Artifacts or similar).
- Strong understanding of DevOps principles, including continuous integration, continuous delivery (CI/CD), infrastructure as code (IaC), configuration management, automated testing and cost management.
- Strong knowledge of cloud computing, specifically Microsoft Azure services related to data and analytics, such as Azure Data Factory, Azure Databricks, Azure Synapse Analytics (formerly SQL Data Warehouse), Azure Stream Analytics, SQL Server, Azure Blob Storage, Azure Data Lake Storage, Azure SQL Database, etc.
- Experience in orchestration using technologies like Apache Airflow.
- Strong analytical skills to identify and address technical issues, performance bottlenecks, and system failures.
- Proficiency in debugging and troubleshooting issues in complex data and analytics environments and pipelines.
- Good understanding of Data Quality and Governance, including implementation of data quality checks and monitoring processes to ensure that data is accurate, complete, and consistent.
- Good understanding of BI solutions including PowerBI and Tableau.
- Knowledge of containers and their environments (Docker, Podman, Docker-Compose, Kubernetes, Minikube, Kind, etc.) is a plus.
- Effective
