- Design and test data transformation pipelines using big data platforms such as Airflow, Athena, Flink, Kinesis, Spark, Kafka, etc.
- Develop streaming and batch ETL pipelines that transform and slice data from multiple data sources (mostly using AWS services)
- Write, test, and optimize complex SQL queries and reports
- Administer, configure, run, and POC managed data services and platforms
- At least 3 years’ experience as a backend engineer or cloud engineer writing backend code
- You are comfortable with Java and Python, and well versed with SQL
- You are familiar with AWS services, components, and cloud infrastructure design
- You read documentation, love to learn new technologies and run POCs
- You are a team player who can take ownership of complicated tasks, become an expert in your area while mentoring others and sharing your knowledge, and lead group discussions
Although it’s not mandatory, it’s even nicer if you:
- Have experience running and configuring big data services such as Kafka, Airflow, and Presto, and using K8s and Docker as infra in your projects
- Have experience in Scala, Java, or Go
- Know how to configure and spin up an EMR cluster, run Athena queries, write a Lambda function, create a Spark job, monitor EKS, and write a CloudFormation template
- Have a degree in CS or a quantitative discipline