What we can offer
- Training and development
- Premium healthcare
- Full Social Insurance
- 13-month bonus
Job description
Navigos Search’s client is a startup developing AI S/W stacks for large-scale AI models and GPU/NPU cluster systems. The client operates clusters of hundreds of GPUs/NPUs for development and testing purposes within its own data centers. It also provides and manages on-premise and cloud infrastructure for a variety of customers.
The client’s System Engineers work with global development teams to install, integrate, and secure the H/W components of the various systems the company owns or supplies, while continuously automating this process.
Main responsibilities:
• Install and maintain various equipment in data centers, including CPU/GPU/NPU servers, high-speed interconnection networks such as InfiniBand and RoCE, storage servers, and firewalls.
• Initialize system H/W components, including firmware, GPU/NPU device drivers, and communication libraries for clustering.
• Analyze and resolve the causes of various H/W errors.
• Provide overall management and technical consulting for the operating infrastructure of the company and its customers.
Requirements
• 1 year of experience operating and managing Linux-based cluster systems.
• Extensive understanding of various H/W components of computer systems.
• Experience in analyzing various logs and operating monitoring solutions for large-scale IT infrast