
Headquarters: Beaverton, Oregon
URL: https://www.discogs.com/about/careers

The Discogs Platform team is focused on several objectives: building and supporting performant, cost-effective, reliable infrastructure; developer experience tooling and mentorship; and creating "golden paths" for organization-wide standards and velocity. As a key member of the Platform team, the Senior Site Reliability Engineer - Data will be working closely with other Discogs engineering squads to develop and optimize scalable, well-planned relational database architectures, drive best practices and stability for our use of Kafka and change data capture, and contribute to the Platform team’s operations.

Location

This is a remote position, open to candidates located in OR, WA, CA, CO, TX, or IL.

Compensation

Starting Base Salary Range: $130,000 - $140,000 yearly

What You’ll Accomplish

Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions listed below.

Stewarding Discogs’ data stores as a key subject matter expert
Leading efforts on the reliability and design patterns of our Kafka and Kafka Connect implementations
Establishing data contracts and clear communication standards between CDC producers and consumers
Working closely with engineering squads to refactor and re-architect MySQL database schema and indexing for long-term scalability, performance, and cost effectiveness
Mentoring engineering squads on Platform best practices for MySQL, Kafka, and other software development lifecycle areas
Writing documentation and runbooks that contribute to the engineering organization’s knowledge base
Working in a containerized, orchestrated environment
Contributing to the Platform team’s disciplines of site reliability and operations, supporting both our squads and Platform’s central infrastructure
Participating in on-call rotation, responding to incidents, and troubleshooting data and other operations issues

What You’ll Contribute

Minimum Education and Experience

A Bachelor's Degree in Computer Science or similar area of focus, or equivalent relevant work experience.
5+ years of experience working with Kafka and relational database management systems (RDBMS).
6+ years of experience in Ops, DevOps, Site Reliability, Platform, or other systems roles.

Required Skills & Abilities:

Relational database schema design, query performance optimization, administration (MySQL, Percona Server, AWS RDS)
Kafka: Cluster administration (Strimzi), Kafka Connect (Debezium, JDBC)
CI/CD (GitHub Actions)
GitOps (ArgoCD)
Kubernetes (EKS, Kustomize, Karpenter, administration, application manifests)
AWS and cloud development (VPC, EKS, RDS, S3)
Observability (Datadog, Sentry)
Scripting (Shell, Python)
Track record of collaboration and mentorship
Excellent written communication and documentation skills
Continuous learning
Ownership and proactive approach to solving large problems

Preferred:

Infrastructure-as-code (Terraform)
Elasticsearch (ECK administration, scaling, performance)
Python (SQLAlchemy, FastAPI)
GraphQL (schema design, Apollo Federation)
REST APIs
HashiCorp Vault
Redis
Memcached
NoSQL databases
Data Lake/Warehouse
Data Governance
Data Security

The Platform team covers a wide range of technical topics and we'd love to hear about your skills beyond this list!

To apply: https://weworkremotely.com/remote-jobs/discogs-inc-senior-site-reliability-engineer-data-remote

Senior Site Reliability Engineer - Data (REMOTE)
