Job Description
About BetaNXT:
BetaNXT is a leading provider of frictionless wealth management infrastructure, real-time data solutions, and an enhanced advisor experience. We invest in platforms, products, and partnerships to accelerate growth for the ecosystem we serve. Our connective approach empowers our clients to deliver a comprehensive, end-to-end advisor and investor experience.
BetaNXT is a premier provider of technology, data, and operations as services to a rich client base of wealth managers, institutional wealth firms, and digital brokers. It comprises three industry-leading businesses that together provide end-to-end solutions across the investment lifecycle.
Overview of the Senior Cloud Platform Engineer – OCI & Observability:
The Senior Cloud Platform Engineer is responsible for designing, operating, and continuously improving BetaNXT’s OCI‑based cloud platform, with a strong focus on observability, platform hygiene, capacity management, and production readiness. This role works deep in the technical layer of cloud infrastructure while partnering closely with Cloud Operations, DevOps, and external service providers to ensure stable, scalable, and well‑governed environments.
The role serves as a hands‑on technical authority for OCI environments, platform cleanup, patching readiness, and monitoring architecture, while actively participating in operational governance and incident response.
Duties and Responsibilities of the Senior Cloud Platform Engineer – OCI & Observability:
- Manage and improve OCI platform hygiene, including resource organization, compartment alignment, cleanup, and safe termination of unused assets, ensuring all actions follow formal change processes with clear technical guidance on execution sequencing
- Contribute to observability and monitoring architecture by supporting OCI monitoring migrations and integrating monitoring components (e.g., Alloy, node exporter, custom scripts) into environment build and bootstrapping processes to ensure consistent visibility across environments
- Support platform performance and capacity management by participating in review forums, providing insight into performance trends and scaling considerations, and enabling data-driven decisions for long-term platform health
- Coordinate and support production operations and maintenance activities, including scheduled outages (e.g., compute power-offs), aligning with internal teams, vendors, and partners to ensure proper timing, coverage, and governance adherence
- Respond to and support production incidents, including those affecting middleware and messaging platforms (e.g., IBM MQ), ensuring appropriate escalation, cross-team coordination, and post-change validation to confirm system stability
- Strengthen patching and environment readiness by identifying gaps in testing processes, defining requirements for reliable development/test environments, and collaborating with development teams to enable safe validation of platform changes
- Participate in change governance and operational forums (e.g., Emergency CAB), ensuring all platform changes are reviewed, risks are communicated, and execution aligns with governance standards
- Collaborate cross-functionally with Cloud Operations, DevOps, Security, development teams, and external vendors, communicating clearly with both technical and non-technical stakeholders to support platform reliability and change execution
Skills and Experience of the Senior Cloud Platform Engineer – OCI & Observability:
- 6+ years of hands-on experience with Oracle Cloud Infrastructure (OCI), including direct work in high-accountability production environments
- Bachelor’s degree in Engineering or related field
- Proven experience designing, implementing, and supporting cloud observability and monitoring solutions, including large-scale migrations or standardization efforts
- Deep expertise in cloud resource lifecycle management, platform patching and maintenance coordination, distributed system operations, and messaging/middleware platforms (e.g., IBM MQ)
- Strong ability to manage and operate complex cloud platforms, particularly in regulated or financial-services environments
- Excellent analytical and problem-solving skills, with the ability to troubleshoot complex platform and infrastructure issues
- Clear and effective written and verbal communication skills, especially during incidents, escalations, and complex change scenarios
- Demonstrated ability to coordinate across multiple teams, including internal engineering groups, operations, and third-party providers (e.g., DXC)
- Strong focus on automation, standardization, and maintaining long-term platform health and reliability
- Background in Site Reliability Engineering (SRE), platform engineering, or similar infrastructure-focused roles
