14358 – AI/Machine Learning Engineer (onsite) – Austin, TX
Start Date: ASAP
Type: Temporary Project
Estimated Duration: 12+ months with possible extensions
Work Setting: Onsite. Remote work is permitted in accordance with TxDOT’s policies; the resource must be in the office a minimum of four days a week, or as approved by TxDOT.
Required:
• Availability to work 100% of the time at the Client’s site in Austin, TX
• Experience with production Python development (8+ years)
• Experience with AI/ML in production: built and deployed 2-3+ ML models serving real users, not just experiments (8+ years)
• Experience with AWS, Azure, GCP, or OCI for deploying and managing ML workloads (8+ years)
• Experience with Docker and Kubernetes (8+ years)
• Experience with databases: SQL (PostgreSQL, MySQL) and NoSQL/vector databases (8+ years)
• Experience with scripting in both Bash and PowerShell for automation (8+ years)
• Experience with transformers (BERT, GPT, T5), RAG systems, fine-tuning, prompt engineering, or building LLM applications
• Experience with MLflow, Weights & Biases, Kubeflow, Airflow, or similar platforms (a minimal tracking sketch follows this list)
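As a hedged illustration of the experiment-tracking experience listed above, the following minimal sketch logs parameters, a metric, and a model artifact with MLflow. The experiment name "demo-classifier", the scikit-learn RandomForest model, and the Iris dataset are illustrative assumptions, not part of this posting.

# Minimal MLflow tracking sketch (illustrative only): experiment name,
# model, and dataset are assumptions for demonstration.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mlflow.set_experiment("demo-classifier")  # hypothetical experiment name
with mlflow.start_run():
    params = {"n_estimators": 100, "max_depth": 5}
    model = RandomForestClassifier(**params).fit(X_train, y_train)

    mlflow.log_params(params)                                  # record hyperparameters
    accuracy = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_metric("accuracy", accuracy)                    # record evaluation metric
    mlflow.sklearn.log_model(model, "model")                   # persist the trained model artifact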
Preferred:
• Experience with CI/CD such as Azure DevOps, GitHub Actions, Jenkins, or similar automation pipelines
• Experience with PyTorch/TensorFlow, OpenCV, object detection, segmentation, or real-time inference
• Experience with Go or Rust for performance-critical components
• Experience with feature stores (Feast, Tecton) or advanced feature engineering
• Experience with model optimization: quantization, pruning, knowledge distillation
• Experience with edge deployment or resource-constrained model deployment
• Experience with frameworks for A/B testing ML models
• Experience with open-source ML projects
• Experience with real-time streaming data processing (Kafka, Kinesis)
Responsibilities include but are not limited to the following:
• Design, build, and deploy end‑to‑end AI/ML systems from initial concept through production, ensuring models serve real users at scale and comply with TxDOT’s governance and SDLC standards (a minimal serving sketch follows this list).
• Develop scalable ML pipelines and data workflows using Python, cloud‑native services (Azure AI, AWS SageMaker/Bedrock, GCP Vertex AI, OCI AI Services), and modern MLOps tooling such as MLflow, Kubeflow, Airflow, or Weights & Biases.
• Implement and maintain production‑grade infrastructure for model training, deployment, monitoring, and distributed large‑scale training across multi‑GPU or multi‑node environments.
• Engineer solutions leveraging advanced ML domains including NLP/LLMs (transformers, RAG, fine‑tuning), time‑series forecasting, anomaly detection, recommender systems, and vector/NoSQL database integrations.
• Develop DevOps‑aligned automation and containerized environments using Docker, Kubernetes, Bash, and PowerShell to support reliable CI/CD, reproducibility, and cloud‑based ML workload orchestration.
• Create internal tools, frameworks, and CLI‑first utilities that improve team efficiency, accelerate experimentation, and support greenfield AI initiatives across TxDOT.
• Collaborate with cross‑functional teams to translate ambiguous requirements into working AI solutions, providing technical leadership, identifying risks, ensuring compliance, and guiding the adoption of standardized AI governance practices.
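As a hedged illustration of the production deployment work described above, the following sketch serves a trained model over HTTP with FastAPI. The model file "model.joblib", the request schema, and the assumption of a numeric prediction are illustrative only; the same entry point could be packaged with Docker and run on Kubernetes.

# Minimal model-serving sketch (illustrative only): the model artifact,
# request schema, and feature layout are assumptions for demonstration.
from fastapi import FastAPI
from pydantic import BaseModel
import joblib

app = FastAPI()
model = joblib.load("model.joblib")  # hypothetical artifact produced by a training pipeline

class PredictRequest(BaseModel):
    features: list[float]  # one flat feature vector per request

@app.post("/predict")
def predict(req: PredictRequest):
    # scikit-learn models expect a 2-D array: one row per sample
    prediction = model.predict([req.features])[0]
    return {"prediction": float(prediction)}  # assumes a numeric prediction

# Run locally with: uvicorn serve:app --host 0.0.0.0 --port 8000
# The same entry point can be containerized with Docker and deployed on Kubernetes.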