We are seeking a skilled AI/ML Database Engineer based in Texas with five or more years of experience to design, build, and maintain scalable data systems that support analytics, applications, and emerging AI/ML use cases. This is a hybrid role (onsite at least one day per week) requiring strong expertise in data modeling, data pipelines, and modern database technologies across cloud environments. The ideal candidate combines solid computer science fundamentals with hands-on experience in both traditional and next-generation data platforms.
Responsibilities
• Design and implement data models and database architectures to support business and technical requirements
• Develop, optimize, and maintain ETL processes and data pipelines for reliable data ingestion and transformation (see the sketch following this list)
• Work with structured and unstructured data across multiple storage technologies
• Implement and manage SQL and NoSQL databases for performance, scalability, and reliability
• Write clean, efficient Python code for data processing, automation, and integration
• Deploy and manage vector databases to support AI/ML and semantic search use cases
• Collaborate with data scientists, engineers, and application teams to enable data-driven solutions
• Apply data quality, security, and governance best practices
• Monitor and troubleshoot database performance and reliability issues
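To give a sense of the pipeline work this role involves, here is a minimal ETL sketch using only the Python standard library. The file name, table schema, and cleaning rules are hypothetical; production pipelines would add logging, validation, and error handling.

```python
import csv
import sqlite3

def extract(path):
    """Extract: stream raw rows from a CSV export (hypothetical file)."""
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def transform(rows):
    """Transform: drop incomplete records and normalize fields."""
    for row in rows:
        if not row.get("email"):
            continue  # skip records missing a required field
        yield (row["email"].strip().lower(), row.get("region", "unknown"))

def load(records, db_path="warehouse.db"):
    """Load: upsert cleaned records into a SQLite staging table."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS customers (email TEXT PRIMARY KEY, region TEXT)"
    )
    con.executemany(
        "INSERT OR REPLACE INTO customers (email, region) VALUES (?, ?)", records
    )
    con.commit()
    con.close()

if __name__ == "__main__":
    load(transform(extract("customers.csv")))
```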
Required Qualifications
• Strong knowledge of data modeling and database design principles
• Experience building and maintaining ETL and data pipeline solutions
• Solid understanding of data structures (trees, graphs, hash tables, etc.)
• Hands-on experience with SQL and NoSQL databases
• Proficiency in Python for data engineering tasks
• Experience with vector databases (e.g., Pinecone, Weaviate, Chroma); see the example following this list
• Familiarity with at least one major cloud platform (GCP, AWS, or Azure)
• Strong problem-solving and analytical skills
• Good communication and collaboration abilities
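As a rough illustration of the vector-database and semantic-search work mentioned above, the sketch below uses the open-source Chroma client with an in-memory store; the collection name and documents are hypothetical, and Pinecone or Weaviate would follow a similar add/query pattern.

```python
import chromadb

# In-memory Chroma client; a persistent or hosted deployment would be
# configured differently. Collection name and documents are hypothetical.
client = chromadb.Client()
collection = client.create_collection(name="product_docs")

# Index a few documents; Chroma embeds them with its default model.
collection.add(
    documents=[
        "How to reset a user password",
        "Steps to configure database replication",
        "Guide to exporting analytics reports",
    ],
    ids=["doc1", "doc2", "doc3"],
)

# Semantic search: return the document closest in meaning to the query,
# even though it shares few exact keywords with it.
results = collection.query(query_texts=["replicating my database"], n_results=1)
print(results["documents"])
```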
Preferred Qualifications
• Experience supporting AI/ML or GenAI workloads
• Knowledge of data governance and data security frameworks
• Experience with big data technologies or distributed systems
• DevOps/DataOps experience (CI/CD, containerization, orchestration)