
Senior Data Engineer
Job Description
At AmaWaterways, we believe meaningful careers begin with purpose, passion and a shared commitment to delivering unforgettable experiences. For those who value curiosity, connection and personal enrichment, AmaWaterways offers the opportunity to help craft meaningful river journeys that invite travelers to follow their own current. Built on a foundation of heartfelt hospitality, we treat our guests—and each other—with genuine care, warmth and respect. AmaWaterways fosters a collaborative environment both onboard our ships and across our global network of offices, where team members grow together, support one another and take pride in upholding the high standards and thoughtful service our company is known for.
We invite talented, motivated professionals to explore our career opportunities and begin their journey with AmaWaterways today.
Role Summary
AmaWaterways is hiring our first Senior Data Engineer to scale the modern data platform we are actively building on Snowflake, dbt, AWS, and Airflow. You will own the next generation of warehouse-native ingestion pipelines that are replacing our remaining Fivetran connectors, partner directly with the Director of Data Engineering on platform architecture, and help establish the engineering standards for a growing team. You will also become a power user of AI-native development tooling. This is not a 'we are exploring AI' role. Our daily engineering environment is Claude Code with custom skills and MCP servers, Snowflake Cortex for in-warehouse AI, and multi-model sparring through zen-mcp for architecture review. Candidates who already work this way will be productive in week one.
What You Will Build
You will inherit and extend an active portfolio of Snowflake-native pipelines. Twelve are already in production, and roughly twenty are on the roadmap. Recent and near-term work includes:
- A unified Brand Intelligence pipeline ingesting reviews, social, trade press, and awards across the river-cruise segment, with Snowflake Cortex driving sentiment, classification, translation, and entity extraction.
- A Competitive Intelligence pipeline with ten direct competitor scrapers feeding a unified pricing and promotions schema.
- An EDW build-out across Bronze, Silver, Gold, Reporting, and Activation layers, including dbt Mesh project structuring, the dbt Semantic Layer, dbt unit tests, and SCD2 modeling for conformed dimensions.
- The migration of the remaining Fivetran connectors to our standard Snowflake-native ingestion pattern: Snowpark Python stored procs, External Access Integrations, INGESTION_CONFIG and RUN_LOG admin tables, and Snowflake Tasks for scheduling. AWS Lambda handles the workloads that cannot run inside Snowpark.
- An Astronomer or AWS MWAA layer to govern task graphs once the pipeline count exceeds what Snowflake Tasks can cleanly manage. You will help decide which path we take.
- Salesforce Data Cloud, LiveRamp, and SFTP outbound integrations from our ACTIVATE layer.
Day-to-Day Responsibilities
Snowflake-native ingestion engineering
- Build pipelines using our standard template: Snowpark Python stored procs, External Access Integrations, network rules, Snowflake SECRETs hydrated from AWS Secrets Manager, idempotent deploy.py scripts, and config-driven INGESTION_CONFIG tables.
- Author and tune incremental load patterns (watermark cursors, MERGE statements, change-data capture where supported).
- Design conformed dimensions with SCD2 snapshots and append-only fact tables in dbt.
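As a rough illustration of the watermark-plus-MERGE pattern named in the list above, the sketch below renders an upsert statement for rows newer than a bound watermark. It is a minimal sketch under assumptions, not our actual template: `MergeSpec`, `build_merge_sql`, and the table and column names in the usage note are hypothetical, and the `%(watermark)s` placeholder assumes a pyformat-style bind parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MergeSpec:
    """Hypothetical description of one incremental load target."""

    target: str                      # table receiving the upsert
    staging: str                     # table holding freshly landed rows
    key_columns: tuple[str, ...]     # columns that identify a row
    value_columns: tuple[str, ...]   # columns overwritten on match
    watermark_column: str            # column compared against the cursor


def build_merge_sql(spec: MergeSpec) -> str:
    """Render a MERGE that upserts staged rows newer than the bound watermark."""
    on_clause = " AND ".join(f"t.{c} = s.{c}" for c in spec.key_columns)
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in spec.value_columns)
    all_columns = spec.key_columns + spec.value_columns
    insert_cols = ", ".join(all_columns)
    insert_vals = ", ".join(f"s.{c}" for c in all_columns)
    return (
        f"MERGE INTO {spec.target} t\n"
        f"USING (SELECT * FROM {spec.staging} "
        f"WHERE {spec.watermark_column} > %(watermark)s) s\n"
        f"ON {on_clause}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )
```

For example, a spec with target `SILVER.BOOKINGS`, staging `BRONZE.BOOKINGS_STG`, key `BOOKING_ID`, and watermark `UPDATED_AT` (all made-up names) yields a statement that could be executed with the last successful watermark bound as a parameter. Because the MERGE matches on the key columns, re-running the same window should not create duplicates, provided the staged rows are unique per key.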
Transformation and modeling
- Build and maintain dbt models across Bronze, Silver, Gold, and Reporting layers in our medallion warehouse.
- Use dbt Core and dbt Cloud, dbt unit tests, dbt Mesh for cross-project refs, and the dbt Semantic Layer for governed metrics.
- Keep VARIANT columns confined to Bronze. Gold and Reporting models are strictly typed.
Cloud infrastructure and DevOps
- Manage the AWS side of our pipelines: S3 staging, IAM roles, Lambda functions in Python, API Gateway where needed.
- Author Terraform for every AWS resource. No ad-hoc console work.
- Use AWS Secrets Manager as the source of truth for machine credentials. The naming convention is ama/{env}/{domain}/{name}. Never put credentials in .env files or GitHub repo secrets.
- Build and own CI/CD pipelines in GitHub Actions. The standard automation identity is SVC_ETL_RUNNER with RSA keypair auth to Snowflake.
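To make the ama/{env}/{domain}/{name} convention from the list above concrete, here is a small sketch of a helper that builds and parses secret names. The function names and the allowed character set (lowercase letters, digits, underscores, hyphens) are assumptions for illustration; only the four-part path shape comes from the convention itself.

```python
import re

# Assumed segment rule: lowercase alphanumerics, underscores, hyphens.
_SEGMENT = re.compile(r"^[a-z0-9][a-z0-9_-]*$")
_PREFIX = "ama"


def secret_name(env: str, domain: str, name: str) -> str:
    """Build a secret name of the form ama/{env}/{domain}/{name}."""
    for label, value in (("env", env), ("domain", domain), ("name", name)):
        if not _SEGMENT.match(value):
            raise ValueError(f"invalid {label} segment: {value!r}")
    return f"{_PREFIX}/{env}/{domain}/{name}"


def parse_secret_name(secret_id: str) -> dict[str, str]:
    """Split a secret name back into its env, domain, and name parts."""
    parts = secret_id.split("/")
    if len(parts) != 4 or parts[0] != _PREFIX:
        raise ValueError(f"not an ama/{{env}}/{{domain}}/{{name}} secret: {secret_id!r}")
    env, domain, name = parts[1:]
    secret_name(env, domain, name)  # reuse the segment validation
    return {"env": env, "domain": domain, "name": name}
```

A call such as `secret_name("prod", "salesforce", "api_key")` returns `ama/prod/salesforce/api_key`, which could then be passed as the secret identifier when fetching the value from AWS Secrets Manager. Centralizing the format in one helper keeps pipelines from drifting into ad-hoc names.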
Orchestration
- Operate Snowflake Tasks for the current generation of pipelines.
- Help design and stand up the next-tier orchestration layer in Astronomer or AWS MWAA, including Astronomer Cosmos integration for dbt and DAG migration from Snowflake Tasks.
Observability and quality
- Configure Snowflake Alerts (failure, zero-row, missed-run, freshness) and Microsoft Teams notifications through Power Automate.
- Build data quality checks into every pipeline using our ADMIN.DQ_EXCEPTION_LOG pattern and dedicated QC layers for cross-system reconciliation.
Web scraping and source integration
- Use Playwright with persistent browser profiles for SSO-protected and API-less sources (TrueVault, Tableau Admin Insights, internal SharePoint).
- Author OAuth, MSAL certificate auth, and refresh token flows for source APIs.
AI-native engineering
- Use Claude Code as your primary engineering environment, including custom skills, MCP servers, and the Claude Agent SDK for sub-agent fan-out work.
- Use Snowflake Cortex for in-warehouse LLM tasks. Author Cortex Analyst semantic YAML.
- Use multi-model sparring (Gemini, GPT-5, Ollama) through zen-mcp for architecture review and race-condition debugging.
- Author and maintain shared team skills in our internal AmaWaterways-IT/data-team-skills marketplace, following our skill conventions (under 500 lines, templates folder for heavy SQL or Jinja, trigger-only descriptions).
- Apply the Trail of Bits differential-review workflow to significant diffs.
Required Qualifications
- 6+ years of professional data engineering experience, including a stretch as a senior engineer on a small-to-mid-sized team.
- Expert SQL on Snowflake. You can read a query profile, identify spillage and partition pruning issues, and rewrite the query.
- Strong production Python. Code is ruff-clean, has pytest coverage, uses type hints, and never contains hardcoded secrets.
- Hands-on production experience with dbt (Core or Cloud), including incremental models, SCD2 snapshots, and dbt tests.
- Hands-on experience with Airflow or Astronomer for production orchestration.
- AWS fundamentals: IAM, S3, Lambda, and Terraform.
- GitHub plus GitHub Actions CI/CD with branch protection and code review discipline.
- You have shipped at least one production pipeline that replaced a managed ELT tool (Fivetran, Stitch, Airbyte, ADF) with custom warehouse-native code.
- You already use Claude Code, Cursor, Aider, or equivalent agent tooling daily. You can speak concretely about what works, what does not, and your context-management practices.
Strongly Preferred
- Snowflake Snowpark Python: writing and deploying stored procs, External Access Integrations, network rules, Snowflake SECRETs.
- dbt Mesh, dbt Semantic Layer, dbt unit tests.
- Snowflake Cortex (Complete, Search, Analyst) used in production.
- MCP server authoring or Claude Agent SDK applications.
- Playwright for browser automation.
- Salesforce Data Cloud, LiveRamp, or SFTP outbound delivery.
- Experience in the travel, hospitality, or consumer industries.
Nice to Have
- Cortex Analyst semantic YAML authoring.
- Power Automate flows for Teams notifications.
- Tableau REST API or GraphQL Metadata API work.
- Microsoft Graph API with MSAL certificate auth.
- Familiarity with the DeepMind agent-trap threat taxonomy or similar agent-security thinking.