MU289 - Data Engineer

Fusemachines

  • Buenos Aires
  • Permanent
  • Full-time
  • 1 day ago
**About Fusemachines**

Fusemachines is a leading AI strategy, talent, and education services provider. Founded by Sameer Maskey, Ph.D., Adjunct Associate Professor at Columbia University, Fusemachines has a core mission of democratizing AI. With a presence in four countries (Nepal, the United States, Canada, and the Dominican Republic) and more than 450 full-time employees, Fusemachines seeks to bring its global expertise in AI to transform companies around the world.

**About the role**

This is a remote, contract position responsible for designing, building, and maintaining the infrastructure required for data integration, storage, processing, and analytics (BI, visualization, and advanced analytics).

**Qualification & Experience**
- A Bachelor's degree in Computer Science or a similar field (full-time program).
- At least 2 years of experience as a data engineer with strong expertise in Azure or other hyperscalers.
- 2+ years of experience with Azure DevOps, Azure Cloud Platform, or other hyperscalers.
- Proven experience delivering data and analytics projects as a data engineer.
- The following certifications:
  - Microsoft Certified: Azure Fundamentals
  - Microsoft Certified: Azure Data Engineer Associate
  - Databricks Certified Associate Developer for Apache Spark
  - Databricks Certified Data Engineer Associate (nice to have)

**Required skills/Competencies**
- Strong programming skills in one or more languages such as **Python** (must have) or Scala, and proficiency in writing efficient, optimized code for data integration, storage, processing, and manipulation.
- Strong experience using Markdown to document code, or automated documentation tools (e.g., pydoc).
- Strong experience with scalable and distributed data processing technologies such as Spark/**PySpark** (must have; experience with Azure Databricks is a plus), dbt, and Kafka, in order to handle large volumes of data.
- Strong experience designing and implementing efficient ELT/ETL processes in Azure and with open-source solutions, developing custom integration solutions as needed.
- Skilled in Data Integration from different sources such as APIs, databases, flat files, event streaming.
- Expertise in data cleansing, transformation, and validation.
- Hands-on experience with Jupyter Notebooks and with Python packaging and dependency management tools: Poetry, Pipenv.
- Proficiency with relational databases (Oracle, SQL Server, MySQL, Postgres, or similar) and NoSQL databases (MongoDB or Azure Table Storage).
- Good understanding of data modeling and database design principles; able to design and implement efficient database schemas that meet the requirements of the data architecture and support data solutions.
- Strong knowledge of SQL.
- Strong experience in designing and implementing Data Warehousing solutions in Azure with Azure Synapse Analytics and/or Snowflake.
- Strong understanding of the software development lifecycle (SDLC), especially Agile methodologies.
- Strong knowledge of SDLC tools and technologies such as Azure DevOps, including project management software (Jira, Azure Boards, or similar), source code management (GitHub, Azure Repos, Bitbucket, or similar), CI/CD systems (GitHub Actions, Azure Pipelines, Jenkins, or similar), and binary repository managers (Azure Artifacts or similar).
- Strong understanding of DevOps principles, including continuous integration, continuous delivery (CI/CD), infrastructure as code (IaC), configuration management, automated testing and cost management.
- Knowledge in cloud computing specifically in Microsoft Azure services related to data and analytics, such as Azure Data Factory, **Azure Databricks**, Azure Synapse Analytics (formerly SQL Data Warehouse), Azure Stream Analytics, SQL Server, Azure Blob Storage, Azure Data Lake Storage, Azure SQL Database, etc.
- Experience in orchestration using technologies like Apache Airflow.
- Strong analytical skills to identify and address technical issues, performance bottlenecks, and system failures.
- Proficiency in debugging and troubleshooting issues in complex data and analytics environments and pipelines.
- Good understanding of Data Quality and Governance, including implementation of data quality checks and monitoring processes to ensure that data is accurate, complete, and consistent.
- Experience with BI solutions, including Power BI and Tableau, is a plus.
- Knowledge of containers and their environments (Docker, Podman, Docker-Compose, Kubernetes, Minikube, Kind, etc.).
- Good Problem-Solving skills: being able to troubleshoot data processing pipelines and identify performance bottlenecks and other issues.
- Strong written and verbal communication skills to collaborate with cross-functional teams, including data architects, DevOps engineers, data analysts, data scientists, developers, and operations teams.
- Ability to document processes, procedures, and deployment configurations.
- Understanding of Azure security practices, including network security groups, Azure Active Directory, and encryption.
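As an illustration only (not part of the posting): the data cleansing, transformation, and validation work described above can be sketched in plain Python. The record shape, field names, and rules below are hypothetical assumptions chosen for the example.

```python
# Hypothetical sketch of record-level cleansing/validation; the fields
# ("id", "event_ts", "country", "amount") and rules are assumptions,
# not requirements from the posting.
from datetime import datetime
from typing import Optional

def clean_record(raw: dict) -> Optional[dict]:
    """Validate and normalize one raw record; return None if invalid."""
    # Required fields must be present and non-empty.
    if not raw.get("id") or not raw.get("event_ts"):
        return None
    try:
        ts = datetime.fromisoformat(raw["event_ts"])
    except ValueError:
        return None  # drop records with unparseable timestamps
    # Coerce the numeric amount, dropping records that cannot be parsed.
    try:
        amount = round(float(raw["amount"]), 2) if raw.get("amount") is not None else 0.0
    except (TypeError, ValueError):
        return None
    return {
        "id": str(raw["id"]).strip(),
        "event_ts": ts.isoformat(),
        "country": str(raw.get("country", "")).strip().upper() or None,
        "amount": amount,
    }

raw_rows = [
    {"id": " 42 ", "event_ts": "2024-05-01T10:00:00", "country": "ar", "amount": "19.99"},
    {"id": "", "event_ts": "2024-05-01T10:05:00"},        # missing id -> dropped
    {"id": "43", "event_ts": "not-a-date", "amount": 5},  # bad timestamp -> dropped
]
clean_rows = [r for r in (clean_record(row) for row in raw_rows) if r is not None]
```

In a real pipeline the same per-record logic would typically run inside a PySpark UDF or DataFrame transformation rather than a list comprehension.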
