[Remote] Platform Engineer

Remote role, full-time, open position

Note: This is a remote position open to candidates in the USA. Hyrhub is seeking a Senior Infrastructure Architect / Platform Engineer for their AI/ML platform to provide technical leadership for the cloud platforms that support enterprise-scale generative AI applications. The role involves defining infrastructure architecture, leading platform standards, and collaborating with engineering teams to improve operational maturity across AI platforms.

Responsibilities

Define and drive the technical strategy for AI/ML platform infrastructure supporting generative AI applications, LLM integrations, model routing, and enterprise AI services
Architect, build, and operate scalable cloud platforms using AWS services such as EKS, ECS Fargate, Lambda, DynamoDB, S3, OpenSearch, Secrets Manager, CloudWatch, ALB, and MWAA
Establish reusable infrastructure patterns using CloudFormation, Helm, and Terraform to support reliable multi-environment and multi-region deployments
Lead CI/CD architecture using GitHub Actions, reusable workflows, OIDC-based AWS authentication, automated quality gates, deployment promotion, and environment approvals
Design and improve observability across AI platforms, including CloudWatch dashboards, logs, alarms, Prometheus/Grafana, OpenSearch, Langfuse, and LLM-specific operational metrics
Build platform capabilities for GenAI workloads, including model availability monitoring
Partner with software engineering teams to improve deployment reliability, rollback strategies, health checks, autoscaling, load testing, and runtime performance
Define and enforce security and compliance practices for infrastructure, including IAM permission boundaries, Secrets Manager usage, secret scanning, audit logging, tagging standards, and change-management controls
Provide technical leadership for cost optimization, capacity planning, environment standardization, and operational resilience across development, test, production, and sandbox environments
Mentor engineers, review architecture and infrastructure designs, and influence platform engineering practices across teams
Troubleshoot complex production issues across cloud infrastructure, networking, containers, serverless workloads, CI/CD systems, and observability platforms
Translate enterprise requirements for security, compliance, reliability, and governance into pragmatic engineering standards and automation

Skills

Bachelor's degree in Computer Science, Engineering, Information Technology, or a related technical field, or equivalent practical experience
7+ years of experience in DevOps, platform engineering, cloud infrastructure, site reliability engineering, or software engineering roles
Strong hands-on experience with AWS/Azure/GCP infrastructure and services, including container, serverless, networking, storage, observability, and security services
Experience designing and operating production systems on Kubernetes, ECS/Fargate, or comparable container orchestration platforms
Proficiency with infrastructure-as-code, especially CloudFormation, Terraform, Helm, or similar tooling
Strong CI/CD experience with GitHub Actions or similar platforms, including reusable workflows, automated testing, deployment gates, and cloud authentication
Experience building and operating observability solutions using CloudWatch, Prometheus/Grafana, OpenSearch, or similar tools
Strong understanding of cloud security practices, IAM, secrets management, least-privilege access, audit logging, and compliance requirements
Experience supporting distributed systems, microservices, APIs, asynchronous workloads, and multi-environment deployments
Demonstrated ability to lead technical design, mentor engineers, and influence engineering practices across teams
Experience supporting AI/ML or generative AI platforms, including LLM gateways, model routing, prompt observability, token metering, or model failover
Experience operating platforms in regulated enterprise environments, ideally healthcare, pharmaceutical, finance, or life sciences
Experience with multi-account, multi-region AWS architectures and enterprise governance patterns
Experience with cost optimization, autoscaling strategies, capacity planning, and cloud budget monitoring
Experience with load testing and performance validation using tools such as Locust or comparable frameworks
Strong Python or scripting skills for platform automation, operational tooling, and CI/CD extensions
Ability to communicate complex technical decisions clearly to engineering, security, operations, and leadership audiences

Company Overview

Hiring niche talent is still a problem faced by many companies. Hyrhub, founded in 2014 or 2018 (the listing gives both years), is headquartered in Bangalore, Karnataka, IN, with a workforce of 2-10 employees.

