· Mainstream SoC Platform Support
· Fully Hot-Swappable Modular Design
· Android Multi-Instance Support
· 80 Gbps Aggregate Peak Bandwidth
· Granular Network Security Control
· Smaller Single-Point Failure Impact Range
· Up to 3840-Channel AI Video Processing
· Highly Integrated Server Design
Supported models: Gemma, Llama, Qwen, Stable Diffusion
The CSD2-N128 features 16 built-in compute blades (128 compute nodes in total), with each node delivering 6-60 TOPS of computing power. Platform options include Qualcomm, Rockchip, Sophgo and SpacemiT. It supports private deployment of mainstream large AI models and multiple deep learning frameworks. It is equipped with 8 × 10GbE ports, giving a peak switching bandwidth of 80 Gbps. It uses a standard 2U rackmount server chassis and comes with an intelligent BMC management system.
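
The headline figures above follow from simple multiplication, and the short Python sketch below reproduces them. The class name `ChassisSpec` and its field names are illustrative and not part of any Firefly tooling. The value of 8 nodes per blade is derived from 128 nodes across 16 blades rather than stated directly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChassisSpec:
    """Headline numbers for one fully configured chassis (hypothetical helper)."""

    blades: int = 16          # compute blades per chassis (from the description)
    nodes_per_blade: int = 8  # derived: 128 nodes / 16 blades
    uplink_ports: int = 8     # 10GbE ports
    uplink_gbps: int = 10     # line rate of each port

    @property
    def nodes(self) -> int:
        return self.blades * self.nodes_per_blade

    @property
    def peak_bandwidth_gbps(self) -> int:
        return self.uplink_ports * self.uplink_gbps

    def aggregate_tops(self, tops_per_node: int) -> int:
        """Total AI compute if every node uses the same SoC."""
        return self.nodes * tops_per_node


spec = ChassisSpec()
print(spec.nodes)                # 128
print(spec.peak_bandwidth_gbps)  # 80
print(spec.aggregate_tops(6), spec.aggregate_tops(60))  # 768 7680
```

The 6 and 60 TOPS inputs are the two ends of the per-node range quoted above, so the chassis-level range works out to 768-7,680 TOPS.
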
Efficient and Low-Cost
A fully configured system can deploy up to 128 compute nodes, each delivering 6-60 TOPS of computing power. Depending on business requirements, each node can run 5-10 containers in parallel, so the entire system can virtualize and host 640-1,280 system containers. This makes fuller use of the hardware and significantly improves overall resource utilization.
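
As a rough illustration of the container figures above, the sketch below computes the capacity range and spreads a requested number of containers evenly across the nodes. The function names `capacity_range` and `spread` are hypothetical, and the even-spread policy is an assumption chosen for illustration; the source does not describe how containers are actually scheduled.

```python
from typing import List, Tuple


def capacity_range(nodes: int = 128, per_node: Tuple[int, int] = (5, 10)) -> Tuple[int, int]:
    """Return (minimum, maximum) container capacity for the whole chassis."""
    low, high = per_node
    return nodes * low, nodes * high


def spread(containers: int, nodes: int = 128, per_node_limit: int = 10) -> List[int]:
    """Distribute containers as evenly as possible; entry i is the count on node i."""
    if containers > nodes * per_node_limit:
        raise ValueError("request exceeds the per-node container limit")
    base, extra = divmod(containers, nodes)
    # The first `extra` nodes take one more container than the rest.
    return [base + 1 if i < extra else base for i in range(nodes)]


print(capacity_range())        # (640, 1280)
plan = spread(700)
print(min(plan), max(plan))    # 5 6
```

With 700 containers, 60 nodes carry six containers and the remaining 68 carry five, which stays inside the 5-10 per-node range quoted above.
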
It is widely applicable in fields such as edge computing, large model localization, smart city, smart healthcare, smart industry and intelligent security.

Technical Specifications

| Item | CSD2-N128Q8550 | CSD2-N128R3588S | CSD2-N128R3576 | CSD2-N128SPK3 |
| --- | --- | --- | --- | --- |
| Architecture | ARM | ARM | ARM | RISC-V |
| Compute nodes | Octa-core 64-bit Qualcomm QCS8550 processor, up to 3.36GHz | Octa-core 64-bit RK3588S processor, up to 2.4GHz | Octa-core 64-bit RK3576 processor, up to 2.2GHz | Octa-core 64-bit SpacemiT Key Stone K3 processor, up to 2.4GHz |
| Video encoding | H.264: 8K@30fps / 4K@120fps | H.264: 1×8K@30fps, 16×1080P@30fps | H.264: 1×4K@60fps | H.264: 4K@60fps |
| Video decoding | H.264/VP9/AV1: 8K@60fps / 4K@240fps | 8K@60fps / 4K@120fps (VP9/AVS2); 8K@30fps (H.264/AVC/MVC); 30×1080P@30fps (H.264) | 1×4K@120fps (VP9, AVS2, AV1); 1×4K@60fps (H.264/AVC) | H.264/VP9: 4K@120fps |
| AI computing power (INT8) | 6144 TOPS (48 TOPS × 128) | 768 TOPS (6 TOPS × 128) | 768 TOPS (6 TOPS × 128) | 7680 TOPS (60 TOPS × 128) |
| RAM | 16GB LPDDR5X × 128 | 16GB LPDDR5 × 128 (options: 4/8/16/32GB) | 8GB LPDDR4/LPDDR5 × 128 (options: 4/8/16GB) | 32GB LPDDR5 × 128 (options: 8/16/32GB) |
| Storage | 256GB UFS4.0 × 128 | 256GB eMMC × 128 (options: 16/32/64/128/256GB) | 64GB eMMC × 128 (options: 16/32/64/128/256GB) | 128GB UFS2.2 × 128 |

| Item | Specification |
| --- | --- |
| Launch status | Launch in June 2026 |
| Server form | 2U rack-mounted computing power server |
| Number of nodes | 16 compute blades (128 distributed compute nodes) + 1 control node |
| Control node | Octa-core 64-bit RK3588 processor, up to 2.4GHz, with up to 6 TOPS of computing power |
| Power | 2 × 1300W hot-swappable power supplies with 1+1 redundancy |
| Fan module | 14 high-speed cooling fans |

Physical Specifications

| Item | Specification |
| --- | --- |
| Size | Standard 2U rack server: 495.60mm × 928.52mm × 88.80mm |
| Installation requirements | IEC 297 universal cabinet installation: 19 inches wide and at least 800mm deep. Retractable slide rail installation: the distance between the front and rear mounting holes of the cabinet must be 543.5mm ~ 848.5mm |
| Environment | Operating temperature: 0°C ~ 30°C; storage temperature: -40°C ~ 60°C; operating humidity: 5% ~ 80% RH (non-condensing) |

Software Specifications

| Item | Specification |
| --- | --- |
| BMC | The BMC management system integrates a web-based management interface and supports Redfish, VNC, NTP, advanced monitoring and virtual media. The BMC management system also supports secondary development |
| Large language models | All models: support private deployment of very large Transformer-based models, including large language models such as the DeepSeek-R1, Gemma, Llama, ChatGLM, Qwen and Phi series |
| Vision large models | K3: supports private deployment of all vision large models. QCS8550: supports private deployment of vision large models including Qwen2.5-VL, InternVL3, etc. |
| AI painting | K3: supports private deployment of all image generation models. QCS8550: supports private deployment of the Stable Diffusion image generation model |
| Deep learning | All models: support traditional network architectures such as CNN, RNN and LSTM, and deep learning frameworks such as TensorFlow, PyTorch, PaddlePaddle, ONNX and Caffe. Custom operator development and Docker container management are also supported |

Interface Specifications

| Item | Specification |
| --- | --- |
| Network | 8 × 10Gbps SFP+, 1 × Gigabit Ethernet (RJ45, MGMT port used for the BMC management network) |
| Console | 1 × Console (RJ45, BMC debug serial port, baud rate 115200) |
| Display | 1 × VGA (maximum resolution 1080P, BMC management display) |
| USB | 3 × USB3.0, 1 × Type-C (OTG) |
| Button | 1 × Power, 1 × UID, 1 × Recovery, 1 × Reset |
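
Since the BMC is listed as supporting Redfish, the following Python sketch shows how a management script might address it over the dedicated MGMT port. The service root path `/redfish/v1/` and the `Members` / `@odata.id` collection layout come from the DMTF Redfish standard. The host name, the credentials, the sample payload and the assumption that this BMC exposes a `Systems` collection are all hypothetical and are not documented in the specifications above.

```python
import base64
import json
import urllib.request
from typing import List


def build_request(host: str, path: str, user: str, password: str) -> urllib.request.Request:
    """Prepare an HTTPS GET with HTTP Basic authentication (nothing is sent yet)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"https://{host}{path}",
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )


def member_paths(collection: dict) -> List[str]:
    """Extract the resource paths listed in a Redfish collection payload."""
    return [member["@odata.id"] for member in collection.get("Members", [])]


def fetch_json(request: urllib.request.Request, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON body (requires a reachable BMC)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Offline demonstration: the payload is hand-written, not captured from a device.
sample = {
    "Members@odata.count": 1,
    "Members": [{"@odata.id": "/redfish/v1/Systems/1"}],
}
req = build_request("bmc.example.internal", "/redfish/v1/Systems", "admin", "changeme")
print(req.full_url)          # https://bmc.example.internal/redfish/v1/Systems
print(member_paths(sample))  # ['/redfish/v1/Systems/1']
```

Against a live unit, `fetch_json(req)` would replace the sample payload. Many BMCs ship with self-signed certificates, so the TLS trust settings would likely need to be configured first.
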
The Firefly team, with over 20 years of experience in product design, research and development, and production, provides services including hardware, software and complete-machine customization, as well as OEM servers.