Careers · Software · Full-time
Member of Technical Staff — Software Engineering.
Define the end-to-end software platform and services roadmap for the AI Factory, and execute the strategy to keep the platform reliable, efficient, performant, and always available at scale.
Apply: send your resume to jobs@neyon.ai.
At NEYON AI, we are building the future of reliable, resilient, scalable, and performant AI infrastructure. In this role you will define the end-to-end software platform and services roadmap for the AI Factory and execute the strategy to build a platform, services, and tooling that are reliable, efficient, performant, and always available at scale. You will partner with a team of great software engineers and architects to solve complex problems in the AI infrastructure space.
Key responsibilities
- Software roadmap and execution. Define and execute the software strategy and roadmap for NEYON AI platforms and services, and develop end-to-end data-center-as-a-service management systems, platforms, and automated workflows.
- Software reliability. Ensure software platforms and systems are productionized with high availability, scalability, and performance. This includes building in fault tolerance and disaster recovery, removing scaling bottlenecks, and optimizing performance across customer environments.
- Observability and monitoring. Develop, implement, and maintain comprehensive monitoring, alerting, logging, and tracing for deep insight into AI infrastructure health and performance.
- Reliability best practices. Collaborate with engineers to embed reliability principles into the development lifecycle and drive operational excellence.
Minimum qualifications
- Bachelor's degree in Computer Science and Engineering, a related technical field, or equivalent practical experience.
- 10+ years of experience in software design, development, and system design.
- Experience defining and executing multi-year AI infrastructure software strategy and roadmaps across compute, network, storage, and critical environments for AI Factories.
- Deep expertise building and debugging complex software systems and platforms across compute, network, storage, and critical data center environments.
- Proven ability to troubleshoot complex issues across the entire stack.
- Excellent communication, collaboration, and problem-solving skills.
Preferred qualifications
- Hands-on experience in security and data protection.
Compensation
Total compensation includes meaningful equity in a fast-growing startup, along with a competitive salary and comprehensive benefits.
NEYON AI is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all innovators.
Interested? Send your resume to jobs@neyon.ai.