LLM Fine-Tuning Engineer (Open-Weight Models / Secure Environments)

Global | Posted 1 month ago
Full-time | Remote

Job Description

We are hiring an LLM Fine-Tuning Engineer to help us adapt and evaluate open-weight language models for use in secure environments.

This role is focused on the full post-training lifecycle: dataset design, supervised fine-tuning, parameter-efficient tuning, synthetic data generation, evaluation, and deployment support. You will work closely with technical teams operating in advanced cybersecurity contexts, helping turn foundation models into reliable task-specific systems.

IMPORTANT NOTE: We are not looking for prompt engineers or “LLM whisperers.” We are looking for someone who understands post-training, data, evaluation, and deployment deeply enough to make open-weight models reliable in real operational environments.

What You’ll Do

  • Fine-tune and adapt open-weight LLMs for specialized use cases in secure/local environments

  • Design, run, and compare different post-training approaches, including:

    • supervised fine-tuning (SFT)

    • parameter-efficient fine-tuning (LoRA / QLoRA and related methods)

    • preference tuning approaches such as DPO where appropriate

    • full fine-tuning when justified by the use case

  • Build and improve high-quality datasets for training and evaluation

  • Generate synthetic data and use it responsibly to expand coverage, improve robustness, and accelerate iteration

  • Assess model quality beyond headline metrics

  • Work with engineering teams to operationalize tuned models for local or restricted deployments

  • Collaborate with domain experts to translate real operational needs into measurable model requirements

What We’re Looking For

  • Strong experience fine-tuning LLMs or adjacent foundation models in production or serious research environments

  • Practical experience with multiple tuning approaches, ideally including SFT, LoRA, and QLoRA

  • Experience building and curating datasets for post-training

  • Experience generating and validating synthetic data for model training

  • Strong Python skills and solid experience with:

    • PyTorch

    • Hugging Face Transformers

    • tokenization, data preprocessing and training pipelines

  • Good understanding of GPU constraints, memory/performance tradeoffs, quantization-aware workflows, and practical training optimization

  • Strong debugging mindset and ability to investigate why a model improved, regressed, or failed

Nice to Have

  • Experience with secure, air-gapped, or otherwise restricted deployment environments

  • Experience deploying or serving tuned models

  • Background in cybersecurity, especially offensive security or vulnerability research

  • Experience with distributed training frameworks

How We Think About the Role

This is not a “prompt engineer” role, and it is not limited to running a few LoRA jobs. We want someone who can think deeply about:

  • when fine-tuning is the right tool versus prompting or orchestration

  • how to create data that improves behavior instead of just inflating metrics

  • how to evaluate models in ways that reflect real operational value

  • how to make open-weight models reliable in constrained environments

LLM Fine-Tuning Engineer (Open-Weight Models / Secure Environments) at Trenchant | Renata