The Predicament of the ASIC Designer—Part I

By Bikash Roy Choudhury, Principal Architect, AKA “TechApps InfraGuru”

The business part of IT is synonymous with the databases, mail servers, and business intelligence applications that are responsible for running different business processes. The engineering R&D part of IT, which involves ASIC designers and developers, is more focused on software development, design, implementation, manufacturing, and the entire product lifecycle. Even though business users and engineering R&D users are two sides of the same coin, their infrastructures and technology use cases are different.

Most core IT teams seem to have very little knowledge of the infrastructure required for the workloads generated by applications and for the workflows used by software developers, ASIC chip designers, CAD users, and so on. Talking about requirements and infrastructure challenges with core IT teams and developers/designers at the same time is enlightening: there is always a disconnect between the two parties.

Engineering team members always have questions of their own:

  • How do I scale my compute cores and storage space seamlessly?
  • How can I complete my design jobs quickly for faster time to market?
  • How can I optimize my license cost to improve my ROI?

Core IT team members always have questions for the engineering R&D team:

  • How much storage do you need?
  • How much network bandwidth is required?
  • What are the IOPS and latency requirements for your applications?

When ASIC designers complain about slow performance, they open a ticket with IT. IT then scans the infrastructure to identify and fix the problem. Sometimes the problem lies with the EDA application, but most of the time it lies in the infrastructure below the application layer. Most performance problems are systemic rather than confined to a single component.

Bottlenecks in the storage, network, and compute infrastructures stem mainly from bursty I/O traffic, generated mostly during logical or physical verification. During these phases, a larger number of cores from the compute farm is used to simulate the designs, which generates a large amount of I/O. It is important to design the best possible infrastructure (storage, network, and compute) to optimally handle the workload generated by the different EDA applications and workflows. The most important data points to consider are:

  • Identify the NUMA-related challenges that arise when jobs span many cores and cross memory boundaries on multisocket hardware platforms (see the first sketch after this list).
  • Identify the right Linux® versions to run on the compute nodes so that file system protocols such as NFSv3 or pNFS are supported and application and workflow performance improves (the second sketch after this list shows a quick way to check which protocol version a client has negotiated).
  • Identify the pressure points—cores, memory, disks, network—to eliminate any I/O bottleneck while EDA applications access the design files from the shared storage infrastructure.
  • Determine the file system layout on the shared storage infrastructure. Most EDA design environments consist of millions of files, small and large, depending on the workflow. This is often the hardest point to get right.
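To make the NUMA point concrete, here is a minimal sketch of how a compute node's layout can be enumerated before deciding how to pin simulation jobs. It assumes a Linux host that exposes the standard sysfs entries under /sys/devices/system/node; the pinning advice in the output is illustrative, not a prescription.

```python
#!/usr/bin/env python3
"""Minimal sketch: report the NUMA layout of a Linux compute node."""
import glob
import os


def numa_topology():
    """Map each NUMA node to the CPU list that sysfs reports for it."""
    topology = {}
    for node_dir in sorted(glob.glob("/sys/devices/system/node/node[0-9]*")):
        node = os.path.basename(node_dir)
        with open(os.path.join(node_dir, "cpulist")) as f:
            topology[node] = f.read().strip()
    return topology


if __name__ == "__main__":
    topo = numa_topology()
    if len(topo) > 1:
        print(f"{len(topo)} NUMA nodes detected; keep a job's cores and memory "
              "on one node where possible to avoid crossing memory boundaries.")
    for node, cpus in topo.items():
        print(f"  {node}: CPUs {cpus}")
```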
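Similarly, before assuming that pNFS (NFSv4.1) is actually in play, it is worth checking what each client negotiated. The sketch below reads /proc/mounts on a Linux client, where the vers= mount option records the protocol version; the output format is my own choice for illustration.

```python
#!/usr/bin/env python3
"""Minimal sketch: list NFS mounts and the protocol version each negotiated."""


def nfs_mount_versions(mounts_file="/proc/mounts"):
    """Yield (mount point, fs type, vers= option) for every NFS mount."""
    with open(mounts_file) as f:
        for line in f:
            device, mount_point, fstype, options = line.split()[:4]
            if fstype.startswith("nfs"):
                vers = next((opt for opt in options.split(",")
                             if opt.startswith("vers=")), "vers=unknown")
                yield mount_point, fstype, vers


if __name__ == "__main__":
    # vers=3 means classic NFSv3; vers=4.1 is the minimum for pNFS layouts.
    for mount_point, fstype, vers in nfs_mount_versions():
        print(f"{mount_point}: {fstype}, {vers}")
```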

Almost all EDA applications are file based and use NFS as the file system protocol. All of these applications are accessed from a common location in a shared storage infrastructure, yet different EDA applications have different workload signatures. It is therefore imperative to architect the different parts of the workflow into different logical and physical containers on a single storage controller or on multiple storage controllers. The quick and easy approach is to provide a modular architecture for the different EDA applications and workloads; a sketch of such a layout follows below.
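To make the idea of modular containers concrete, here is a sketch of how the major parts of a design workflow might be mapped to separate volumes and mount options. The filer names, export paths, and options are hypothetical examples, not a recommendation for any specific storage platform; the point is that each workload signature gets its own container that can be placed, sized, and tuned independently.

```python
#!/usr/bin/env python3
"""Illustrative sketch: map EDA workflow stages to separate storage containers."""

# Hypothetical mapping of workflow stage -> (NFS export, mount options).
WORKFLOW_LAYOUT = {
    "tools":          ("filer1:/vol/eda_tools",   "ro,hard,vers=3,tcp,actimeo=600"),
    "source_trees":   ("filer1:/vol/eda_source",  "rw,hard,vers=3,tcp,noatime"),
    "verify_scratch": ("filer2:/vol/eda_scratch", "rw,hard,vers=4.1,tcp,noatime"),
    "release_data":   ("filer3:/vol/eda_release", "rw,hard,vers=3,tcp"),
}


def fstab_lines(layout, mount_root="/eda"):
    """Render the layout as /etc/fstab-style lines for review."""
    for stage, (export, options) in layout.items():
        yield f"{export}  {mount_root}/{stage}  nfs  {options}  0 0"


if __name__ == "__main__":
    for line in fstab_lines(WORKFLOW_LAYOUT):
        print(line)
```

Separating a read-mostly tool tree from write-heavy verification scratch space, for example, lets each be cached and tuned for its own access pattern instead of forcing one set of options onto every workload.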

I always envisage the ultimate infrastructure that can provide the best results for EDA applications. I also believe that IT is engaged in the same pursuit: the best possible architecture that can scale and perform, is reliable, and provides a single pane of management for the ASIC design requirements.

Stay tuned for Part II, in which I will map out the numerous EDA applications and workflows that are commonly found in a production chip design environment. It will also detail how the right choice of storage platform and sizing, along with architecting and tuning all the infrastructure components, has a significant impact on overall performance and job completion time.