Software Engineer - Cloud Platform (Americas Remote) at Grafana Labs
United States (Remote)
Applications are now closed
Our Grafana Cloud pipeline moves millions of data points, log lines, and traces per second from our customers' environments into a highly available, low-latency stack that processes and stores the data, and serves it to dashboards and alerting tools. We aim to grow this to hundreds of millions per second, and it's critical that as we grow, we improve our performance, increase our reliability, and do it all more efficiently.
Cloud roles at Grafana Labs require engineers who have a passion for performance and reliability and who enjoy taking projects from conception to production. Grafana Cloud hosts its services in Kubernetes. The Cloud Platform team owns and maintains the platform that delivers Kubernetes and its required complementary services, including our release and deployment tools and services, to Grafana Engineering. The team also designs, implements, and maintains the virtual network infrastructure.
Because we deploy production services, we have on-call rotations to ensure the health of the system. We dogfood our own services, so being on call is an important way to understand our system and how to use the products we create.
Our culture is remote-first, and our engineering organization is largely remote. We provide guidance and meet regularly using video calls, and we need people who can work independently and communicate well. Even if you are located near one of our small offices, working from home is both common and encouraged. Our teams also plan in-person team-building meetups and gather to attend industry conferences.
We care deeply about open source, and our projects are generally open source. Check them out: https://github.com/grafana.
About the role:
We are looking for an experienced software or site reliability engineer to join the Grafana Labs R&D team. We are hiring for the Cloud Platform team that provides the platform on which Grafana Cloud delivers its services.
Responsibilities:
- Maintain and improve Grafana Labs’ provisioning, release and deployment tools and processes for infrastructure and services
- Provision and administer the core infrastructure platform, Kubernetes
- Provision and administer the required Cloud Service Provider resources
- Maintain and improve Grafana Labs’ monitoring tools and practices to maximise system uptime and health
Requirements:
- Commercial experience as a site reliability, network and/or software engineer in Cloud and Linux environments, especially with distributed architectures
- Programming experience -- we use Go, Jsonnet, Python and Shell
- Experience with containers and orchestration -- we use Docker and Kubernetes
- Proficiency with infrastructure as code and/or configuration management -- we use Terraform and Tanka/Jsonnet
- Experience with dashboards and monitoring tools like Grafana and Prometheus
Nice to have:
- Commercial experience in designing and managing networking in a Virtual Private Cloud
- Commercial experience of network services, including load balancers, firewalls and DNS
- Commercial experience of layer 2 and layer 3 networking, including VLANs and VPNs
- Commercial experience of the IP protocol suite, including BGP and NAT
- Experience working in remote and/or distributed business environments, demonstrating self-motivation and communication skills
Benefits:
- Flexible hours
- The equipment you need to get the job done
- Generous vacation policy of 30 days per annum with national holidays in your country of residence on top
- Grafana operates in 32+ countries. We try to operate as one team and focus on global benefits that our whole team can enjoy. Inevitably there are some regional variations, and we discuss the benefits offered in your country of residence during our interview process.
- We offer a competitive healthcare plan (Medical, Dental & Vision) for our US-based employees via our co-employer JustWorks.
- We offer a 4% employer contribution match on our 401(k)/pension plans or a one-time 4% salary increase after six months' tenure, depending on your location
Our hiring process:
- Video chat with one of our Talent Managers (30 mins)
- Video chat with 2 Hiring Managers (30 mins)
- Live Coding Interview with 2 Engineers (60 mins)
- Systems Design focused interview (45 mins)