Job Title: Senior Software Engineer, Platform Reliability
Location: Remote-LATAM
We are looking for a Senior Software Engineer with 5+ years of experience in platform reliability, backend systems, and cloud infrastructure.
This role focuses on improving the reliability, scalability, and performance of a large-scale consumer video streaming platform.
Key Responsibilities:
Design and enhance highly reliable, scalable, and self-healing systems
Build and maintain observability (monitoring, logging, tracing, alerting)
Collaborate with engineering teams to improve system performance and architecture
Define and maintain SLAs/SLOs for backend services
Troubleshoot production issues and drive long-term solutions
Act as SME for platform reliability across streaming systems
Required Skills:
Strong coding experience in Golang and JavaScript/TypeScript
Expertise in microservices and distributed systems
Hands-on experience with Kubernetes, Terraform, and Linux systems
Knowledge of networking (TCP/IP, load balancing)
Experience with observability tools (Datadog, OpenTelemetry)
Good to Have:
Experience with eBPF
Knowledge of video streaming (HLS, transcoding, playback)