DevOps Engineer
Overview
We are looking for an experienced DevOps Engineer to design, implement, and maintain
our cloud infrastructure, CI/CD pipelines, and AI operational workflows. You will ensure
the reliability, scalability, and cost-efficiency of our services while embedding best
practices for machine-learning operations.
Key Responsibilities
CI/CD Pipeline Development: Build and maintain automated build, test, and
deployment pipelines using GitHub Actions (or GitLab CI) and Docker.
Infrastructure as Code: Define and manage AWS infrastructure (EC2, Lambda,
RDS, S3, CloudFront, SQS) using Terraform or AWS CDK; implement autoscaling
and container orchestration (ECS/Fargate).
Monitoring & Alerting: Configure monitoring (CloudWatch, Prometheus) and
visualization (Grafana) to track system health, application performance, and API
usage; establish alerting for key metrics.
Cost Optimization: Analyze resource utilization and recommend optimizations
such as spot instances, serverless functions, and efficient caching strategies to
manage API and compute costs.
MLOps Workflows: Develop and maintain LangChain orchestration pipelines,
version control for prompts, and token-usage logging; integrate Redis caching for
prompt efficiency.
Security & Compliance: Implement robust IAM policies, manage secrets with
AWS Secrets Manager, and enforce encryption in transit (SSL/TLS) across all services.
Required Qualifications
Bachelor’s degree in Computer Science, Engineering, or equivalent.
2+ years of experience with:
o CI/CD & Containers: GitHub Actions (or GitLab CI), Docker, container
registries
o IaC: Terraform or AWS CDK (TypeScript or Python)
o Cloud Platforms: AWS services including EC2, Lambda, RDS, S3,
CloudFront, SQS/SNS, IAM
o Monitoring: AWS CloudWatch, Prometheus, Grafana
o MLOps: LangChain pipeline orchestration; Redis; OpenAI SDK integration;
usage instrumentation
Strong understanding of security best practices (IAM, secrets management,
network policies).
Preferred Skills
Kubernetes or ECS Fargate orchestration.
Production experience with AWS CDK v2.
Familiarity with policy-as-code frameworks (e.g., HashiCorp Sentinel for Terraform).
Demonstrated success optimizing costs for high-throughput AI workloads.