AI Data Architect

Reports into Rochester, NY · $150K - $200K · Posted 11 months ago

Full-time · Onsite

Job Description

As a Data/AI Architect, you'll design and build data-driven cloud architectures on AWS, from S3 data lakes and Glue ETL pipelines to data warehouses and RAG-powered AI systems. You'll own the full data stack across a variety of industries and projects: on one engagement you might design a Redshift data warehouse with a medallion architecture that processes 31M transactions/month; on the next, you might build a Bedrock Knowledge Base with OpenSearch vector search. Real ownership, real variety.

What You'll Do:

Design and build S3 data lakes with multi-zone organization, partitioning strategies, lifecycle policies, and encryption
Implement medallion architecture (bronze/silver/gold) for data warehouses on Redshift, Snowflake, or Databricks
Build AWS Glue ETL pipelines (Python Shell and Spark) with incremental extraction, Data Catalog management, and optimized Parquet output
Design star/snowflake schemas, materialized views, and gold-layer models optimized for BI consumption (QuickSight, Power BI)
Configure data warehouse platforms — Redshift with Zero-ETL from Aurora, Snowflake with Snowpipe, Databricks with Delta Lake and Auto Loader
Design RAG systems using Bedrock Knowledge Base with OpenSearch Serverless vector search and Titan Embeddings
Architect document AI pipelines using Textract, Comprehend, and Bedrock for entity extraction
Design SageMaker ML pipelines for training, Model Registry, and inference
Lead data discovery sessions with client stakeholders and present architecture recommendations to technical and business audiences
Mentor delivery team members on data architecture patterns and AWS data services
Contribute to R&D projects evaluating emerging AWS data and AI capabilities

Required Skills:

5+ years professional IT experience, 2+ years professional AWS experience
At least one AWS Professional-level certification (Solutions Architect Professional or Data Engineer Specialty preferred)
Python for data pipelines (Glue jobs, Lambda, SageMaker scripts) and PySpark for Glue Spark jobs
SQL and NoSQL on AWS — Aurora PostgreSQL, RDS PostgreSQL, DocumentDB, DynamoDB — including schema design and query optimization
Data modeling — conceptual, logical, and physical models for AWS data platforms; normalized silver-layer schemas, denormalized star/snowflake gold-layer schemas, data dictionaries
Dimensional modeling and medallion architecture (bronze/silver/gold) on Redshift, Snowflake, or Databricks, including materialized views and incremental refresh patterns
AWS Glue ETL (Python Shell and Spark), Glue Data Catalog, and crawlers
S3 data lake architecture with partitioning, lifecycle policies, and encryption

Preferred:

RAG systems with Bedrock Knowledge Base and OpenSearch Serverless vector search
Amazon SageMaker for ML training, Model Registry, and inference
AWS HealthLake, FHIR R4 transformation, and HIPAA-compliant data pipelines
Document AI with Amazon Textract and Comprehend
Amazon Athena, QuickSight, or Power BI integration
Terraform or CloudFormation for data infrastructure as code
Step Functions, EventBridge, and Lambda for event-driven pipeline orchestration

About Innovative Solutions