[Remote] Data Automation Engineer
Note: This is a remote position open to candidates in the USA. Aptonet is seeking a Data Automation Engineer to design and implement data automation solutions for an enterprise-scale, Microsoft Azure-based analytics and reporting platform. The role involves building scalable data pipelines, automating workflows, and supporting AI/ML initiatives while collaborating with stakeholders to optimize cloud-based data environments.
Responsibilities
- Design, develop, and maintain high-performance data pipelines using:
  - Azure Data Factory
  - Azure Synapse Pipelines
  - Apache Spark Notebooks
  - Python
  - SQL
  - Stored Procedures
- Translate business requirements into scalable data engineering and AI-driven solutions
- Continuously improve automation tools for reliability, scalability, and adaptability
- Research and implement AI/ML and Generative AI solutions to automate data processes and eliminate workflow bottlenecks
- Collaborate with implementation specialists, engineering teams, and customers to develop data-driven solutions
- Design and implement data ingestion, transformation, integration, and processing solutions
- Support advanced analytics, reporting, visualization, and AI/ML initiatives
- Implement:
  - Data migration
  - Data quality
  - Data integrity
  - Metadata management
  - Data security functions
- Monitor, troubleshoot, and optimize data pipeline performance
- Execute ETL performance testing and validate benchmark results
- Analyze:
  - Pipeline runtime
  - Throughput
  - Latency
  - Resource utilization
- Participate in performance testing for:
  - Azure Data Factory (ADF)
  - Azure Synapse
  - Databricks
- Support performance tuning activities including:
  - Query optimization
  - Partitioning
  - Indexing
- Validate data consistency and completeness after performance testing
- Collaborate with DevOps and infrastructure teams on compute, memory, and scaling optimization
- Document test results, findings, and recommendations
- Support Agile DevOps processes including Program Increment planning
- Maintain strict versioning and configuration control processes
Skills
- 2+ years of experience with two or more of the following: SQL, T-SQL, MDX/DAX, Python, PySpark
- Experience designing and building ETL and data engineering solutions
- Experience with: Azure Data Lake Services, Azure Synapse Analytics, Azure Data Factory, Integration Runtime
- Experience with Microsoft data and BI technologies including: SQL Server, Stored Procedures, SSIS, SSRS, SSAS (Cubes), Power BI
- Experience automating data processes using: Azure CLI, AWS CLI, Bash, PowerShell
- Experience with: Azure DevOps Repos, GitHub, Pipeline versioning, Release management
- Experience supporting production, development, testing, and integration environments
- Knowledge of Agile development methodologies
- Strong analytical, troubleshooting, and problem-solving skills
- Ability to support multiple projects simultaneously
- Strong communication and collaboration skills
- Bachelor's degree in Computer Science or a related technical field
- 2+ years of relevant professional experience
- U.S. Citizenship required
- Ability to successfully obtain and maintain a Public Trust clearance
- Demonstrated commitment to continuous learning and professional development
- Generative AI development experience
- Generative AI for Data Analytics experience
- Certifications including: Microsoft Azure Fundamentals, Azure Data Engineer, Power BI, and Azure AI; AWS Certified Data Engineer
- Experience with: Databricks, REST APIs, Docker, Enterprise ETL toolsets
- Performance tuning experience including: Indexing, Execution plans, Query analytics, Data profiling
- Knowledge of: Data encryption, Cloud virtual networks, Routing, Firewalls, Log Analytics, Monitoring tools
- Experience with: ARM templates, Bicep templates, RBAC access controls
- Data lineage and impact analysis experience using: Microsoft Purview, Synapse Pipeline Tracing
Company Overview