High proficiency in data warehouses (Redshift, Databricks, etc.) and manipulating data within them (using SQL or Spark).
High proficiency in the design, development, and monitoring of ETL pipelines.
Moderate experience (at least 2 years) working with AWS or another cloud provider (such as GCP or Azure).
Moderate experience in creating and evangelizing best practices and tools.
Moderate experience in interacting with stakeholders at various levels.
Some experience (at least 1 year) with common data science tools, packages (pandas, scikit-learn), and concepts.
Good programming skills (Python, R, Bash scripting, or any other language used for ETL pipelines).
Moderate experience working in Agile, DevOps, and test-driven development environments.
Experience in designing, developing, and optimizing an ML feature store is a plus.
Experience in working with SageMaker is a plus.
Experience in building CI/CD pipelines and data testing for data integrity and correctness is a plus.
Experience with building streaming applications using Kafka, Kinesis, or other message queues is a plus.
Experience using dbt (data build tool) for ETL is a plus.