A Hybrid ASIC and FPGA Architecture
FPGA is the abbreviation of Field Programmable Gate Array. It is a further development of earlier programmable devices such as the PAL, GAL, and EPLD. It appeared as a semi-custom circuit in the application-specific integrated circuit (ASIC) field: it addresses the shortcomings of full-custom circuits and also overcomes the limited gate count of the earlier programmable devices.
The FPGA adopts the concept of the Logic Cell Array (LCA), which comprises three parts: the Configurable Logic Block (CLB), the Input Output Block (IOB), and the Interconnect. The essential features of the FPGA are:
1) When an ASIC circuit is designed with an FPGA, the user obtains a usable chip without having to tape out and fabricate it.
2) An FPGA can serve as a prototype for other full-custom or semi-custom ASIC circuits.
3) The FPGA contains abundant flip-flops and I/O pins.
4) Among ASIC circuits, the FPGA is one of the devices with the shortest design cycle, the lowest development cost, and the smallest risk.
5) The FPGA uses a high-speed CHMOS process, has low power consumption, and is compatible with CMOS and TTL levels.
It can be said that the FPGA is one of the best choices for raising the integration level and reliability of small-volume systems. The operating state of an FPGA is set by a program stored in its internal RAM, so the internal RAM must be programmed before the device can work. The user can select among different programming methods according to the configuration mode.
On power-up, the FPGA reads the data from an EPROM into its internal configuration RAM. Once configuration is complete, the FPGA enters its working state. After power-down, the FPGA returns to a blank state and its internal logic relationships disappear, so the FPGA can be used repeatedly. Programming an FPGA does not require a dedicated FPGA programmer; a general-purpose EPROM or PROM programmer is sufficient. To change the function of the FPGA, only the EPROM needs to be swapped. Thus the same FPGA, loaded with different programming data, can implement different circuit functions, which makes the FPGA extremely flexible to use. The FPGA has several configuration modes: the parallel master mode pairs one FPGA with one EPROM; the master-slave mode lets one PROM program several FPGAs; the serial mode programs the FPGA from a serial PROM; and the peripheral mode treats the FPGA as a peripheral of a microprocessor, which programs it.
In electrical measurement and control systems, it is frequently necessary to acquire various analog and digital signals and to process them accordingly. Under ordinary circumstances, an ordinary MCU (for example, a 51- or 196-series single-chip microcontroller, or a control DSP) can complete the system task. But when the number of signals to be acquired is very large (especially the various signal and status quantities), the resources of an ordinary MCU alone are often insufficient. In that case, the designer can generally only adopt a multi-MCU processing scheme or rely on other chips to expand the system resources. This not only adds a large amount of external circuitry and system cost, but also greatly increases system complexity, which can affect system reliability; this is obviously not what the designer wants to see. An analog and digital acquisition and processing system based on FPGA technology exploits the FPGA's many I/O ports, whose control and function can be freely programmed and defined, together with internal logic written in VHDL, and so handles a large number of acquisition channels well. Because the logic written in VHDL processes each group of digital quantities in parallel, and because FPGA hardware operates at nanosecond-level speed, which current MCUs can hardly reach, this system can monitor changes in the signals in real time and faster than other systems. Therefore, in monitoring systems with a very large number of status quantities, this system can show its advantages.
Practice has shown that designing a DDS circuit with an FPGA is more flexible than using a dedicated DDS chip. Simply changing the ROM data in the FPGA lets the DDS generate an arbitrary waveform, which gives considerable flexibility. By comparison, the function of the FPGA is determined entirely by the design requirements and can be either complex or simple; the FPGA also supports in-system field upgrades. Although it falls slightly short in precision and speed, it can basically satisfy the operating requirements of the great majority of systems. Moreover, embedding the DDS design in a system built around an FPGA adds little to the system cost, whereas a dedicated chip costs many times more. Therefore, designing the DDS system with an FPGA offers a very high performance-to-price ratio.
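The ROM-based DDS idea described above can be illustrated with a small software model. The following Python sketch is only an illustration under stated assumptions, not the FPGA design itself: the function names (`make_sine_rom`, `dds_samples`), the 16-bit phase accumulator, and the 256-entry, 12-bit ROM are all hypothetical choices. It models a phase accumulator whose upper bits address a waveform ROM, so replacing the ROM contents changes the output waveform while the accumulator stays the same.

```python
import math
from typing import List


def make_sine_rom(depth_bits: int, amp_bits: int) -> List[int]:
    """Build one sine period as signed integers (hypothetical ROM contents)."""
    depth = 1 << depth_bits
    full_scale = (1 << (amp_bits - 1)) - 1
    return [round(full_scale * math.sin(2 * math.pi * i / depth))
            for i in range(depth)]


def dds_samples(rom: List[int], tuning_word: int, acc_bits: int, n: int) -> List[int]:
    """Model a DDS: a phase accumulator whose top bits index the ROM."""
    depth_bits = len(rom).bit_length() - 1
    if len(rom) != 1 << depth_bits:
        raise ValueError("ROM depth must be a power of two")
    if acc_bits < depth_bits:
        raise ValueError("accumulator must be at least as wide as the ROM address")
    mask = (1 << acc_bits) - 1
    acc = 0
    out = []
    for _ in range(n):
        # The upper address bits of the accumulator select the ROM entry.
        out.append(rom[acc >> (acc_bits - depth_bits)])
        # The accumulator wraps around, which produces the periodic output.
        acc = (acc + tuning_word) & mask
    return out


if __name__ == "__main__":
    sine_rom = make_sine_rom(depth_bits=8, amp_bits=12)
    # Tuning word of 2**14 with a 16-bit accumulator: one period every 4 clocks.
    print(dds_samples(sine_rom, tuning_word=1 << 14, acc_bits=16, n=8))
```

In this model the output frequency equals tuning_word × f_clk / 2^acc_bits, so the frequency is set by the tuning word and the waveform shape by the ROM data alone.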
10.1 Applications Emerge for Hybrid Devices
Implementation using an ASIC approach typically yields a faster, smaller, and lower-power design than implementation in FPGA technology. The growing requirements in the marketplace for design flexibility, however, are driving the need for hybrid ASIC/FPGA devices. The potential to change hardware configuration in real time, to support multiple design options with a single mask set, and to prolong a product's usable life all compel designers to look for a blending of high-density ASIC circuits with the inherent flexibility of FPGA circuits.
The ability to create a "base design" and then reuse the base with minimal changes for subsequent devices helps reduce design time and encourages standardization. Since many consumer and office products are offered with a range of low-end to high-end options, this base design concept can be used effectively, with features added to each successive model. Printers, fax machines, PCs, and digital imaging equipment are examples where this concept can be useful.
DSP applications are also well suited to the fast multiply and accumulate (MAC) processing capability of FPGAs. When building a DSP system, the design can take advantage of parallel structures and arithmetic algorithms to minimize resources and exceed the performance of single or multiple general-purpose DSP devices. DSP designers using both ASIC and FPGA within the same design can optimize a system for performance beyond the capabilities of either circuit technology on its own.
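As a reminder of what a MAC operation computes, the following Python sketch models a small FIR filter as a chain of multiply-accumulate steps. It is a behavioral illustration only: the function name `fir_mac` and the coefficient values are hypothetical, and the inner loop runs sequentially here, whereas an FPGA implementation could evaluate all taps in parallel in a single clock cycle.

```python
from typing import List


def fir_mac(samples: List[int], coeffs: List[int]) -> List[int]:
    """FIR filter written as explicit multiply-accumulate (MAC) steps."""
    history = [0] * len(coeffs)  # delay line holding the most recent samples
    out = []
    for x in samples:
        history = [x] + history[:-1]  # shift the new sample into the delay line
        acc = 0
        for h, c in zip(history, coeffs):
            acc += h * c  # one MAC per tap; hardware can do these in parallel
        out.append(acc)
    return out


if __name__ == "__main__":
    # An impulse input returns the coefficients themselves.
    print(fir_mac([1, 0, 0, 0], [1, 2, 3]))
```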
Other applications that lend themselves to the hybrid ASIC/FPGA approach are designs that support multiple standards, such as USB, FireWire, and Camera Link, in a single device. Similarly, designs that are finalized except for some undefined features or an emerging standard are excellent candidates for this technology. Without the benefit of programmable logic, the designer must decide between taping out the chip knowing that the PCI logic has a high probability of change, or waiting until the design requirements are firm, potentially impacting the end product's schedule. With both programmable logic and ASIC logic working together on a single device, situations like these can be accommodated. Similar issues, such as differing geographic or I/O standards, could also be handled within the FPGA cores without requiring mask and fabrication updates for each change.
10.2 Economics Play a Role in Using Hybrid Devices
While technical applications are emerging for the hybrid architecture, it is unlikely that design teams would utilize this new capability unless it is also economically viable. We will now explore the economics behind this new architecture.
To realize the performance and density advantages of an ASIC, design teams must accept higher non-recurring engineering (NRE) costs and a longer turnaround time (TAT) than with an FPGA. Unlike an off-the-shelf FPGA, each ASIC design requires a custom set of masks for silicon fabrication. The custom mask set allows circuitry and interconnections to be tailored to the requirements of each unique application, yielding high performance and density. However, the cost of mask sets is rapidly increasing (nearly doubling with each successive technology node). As a result, mask costs are becoming a significant portion of the per-die cost in many cases.
For example, consider the case where a mask set costs $1,000,000. For applications where only 1,000 chips are required, each chip will cost over $1,000, since the mask cost (plus many other expenses) must be amortized over the volume of chips sold. As the volume for this same ASIC rises, the effective cost of each die decreases.
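The amortization in this example is simple division, and a few lines of Python make the trend visible. This is a minimal sketch: the function name `mask_cost_per_chip` is hypothetical, and it covers only the mask-set share of the cost, not the "many other expenses" mentioned above.

```python
def mask_cost_per_chip(mask_set_cost: float, volume: int) -> float:
    """Share of the mask-set cost carried by each chip sold."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    return mask_set_cost / volume


if __name__ == "__main__":
    # The $1,000,000 mask set from the text, spread over growing volumes.
    for volume in (1_000, 10_000, 100_000, 1_000_000):
        share = mask_cost_per_chip(1_000_000, volume)
        print(f"{volume:>9,} chips: ${share:,.2f} of mask cost per chip")
```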
Conversely, FPGAs are standard products, where the mask charges for a small number of design passes are amortized over a large number of customers and chips, so the mask cost per chip sold is minimal. As a result, for each technology node there is a volume threshold below which it is more cost-effective to buy an FPGA chip than a smaller ASIC chip. TAT is another primary economic driver, having a direct impact on time-to-market for many applications. The time required for ASIC layout and fabrication is typically in the range of 2 to 5 months, much longer than for FPGAs, which generally require 1 to 4 weeks once a customer's RTL is firm.
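The volume threshold can be made concrete with a simple linear cost model. The sketch below is an assumption-laden illustration, not data from the text: it supposes that total ASIC cost is NRE plus a per-unit cost, that total FPGA cost is a per-unit cost only, and all dollar figures and function names are hypothetical.

```python
def break_even_volume(asic_nre: float, asic_unit_cost: float,
                      fpga_unit_cost: float) -> float:
    """Volume at which ASIC and FPGA total costs are equal (linear model)."""
    if fpga_unit_cost <= asic_unit_cost:
        raise ValueError("model assumes the FPGA unit cost exceeds the ASIC unit cost")
    return asic_nre / (fpga_unit_cost - asic_unit_cost)


def cheaper_option(volume: int, asic_nre: float, asic_unit_cost: float,
                   fpga_unit_cost: float) -> str:
    """Return which option has the lower total cost at a given volume."""
    asic_total = asic_nre + asic_unit_cost * volume
    fpga_total = fpga_unit_cost * volume
    return "FPGA" if fpga_total < asic_total else "ASIC"


if __name__ == "__main__":
    # Hypothetical figures: $1M NRE, $5 per ASIC, $25 per FPGA.
    print(break_even_volume(1_000_000, 5.0, 25.0))
    print(cheaper_option(10_000, 1_000_000, 5.0, 25.0))
    print(cheaper_option(200_000, 1_000_000, 5.0, 25.0))
```

Under these assumed numbers, the FPGA is the cheaper choice below the break-even volume and the ASIC above it, which matches the qualitative argument in the text.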
These NRE and TAT issues are compounded by customers' needs for multiple design passes. Since each ASIC design requires a unique mask set, if a customer discovers a logic error or needs to add features after tape-out, they must initiate another ASIC design pass, requiring additional NRE charges and silicon fabrication time. As silicon technologies progress and chips become more complex, design verification becomes increasingly difficult, and the chance of logic errors grows. In many cases, time-to-market pressures drive design teams to continue verification well into layout and sometimes beyond chip tape-out. This increases the risk that logic updates will be required, and therefore that the cost per chip will increase.
In summary, ASICs to date have offered higher performance in smaller chip sizes than FPGAs. However, the NRE for current technology nodes has rendered them very expensive for applications that require low quantities of chips, particularly when multiple design passes are required.
10.3 The Hybrid ASIC/FPGA Solution
Enter the hybrid ASIC/FPGA. As with an ASIC, the initial mask set must be purchased. But with the incorporation of FPGA cores into the ASIC, it is now possible to use the programmable circuitry to enable a single physical chip design to satisfy several different applications. This has the potential to eliminate multiple design passes and, in some cases, avoid costly respins. In the case where a customer requires similar ASICs for a family of products, FPGA circuitry can be added to the base ASIC logic and configured as needed to satisfy the multiple applications. Similarly, logic updates required to correct bugs discovered late in the verification process, or to accommodate changing market needs, can be handled with appropriately placed FPGA cores.
The question must be asked: why embed an FPGA into an ASIC if a two-chip solution could achieve the same results? The answer is both technical and economic. Technically, for a certain class of applications, the embedded solution offers greater performance with lower power dissipation. By embedding the FPGA into the ASIC, signals that must propagate from the ASIC through the FPGA and then back to the ASIC can avoid four chip boundary delays, two card crossings, and the associated power dissipation. By keeping the ASIC-to-FPGA interconnections on the die, valuable ASIC I/O pins are also conserved.
Economically, the embedded solution can be the less expensive option. As we will discuss, the FPGA fabric does not require any unique semiconductor processing above and beyond the base ASIC (unlike embedded flash or embedded DRAM). The resulting increase in ASIC cost is associated with the area occupied by the embedded FPGA core. In addition, the costs of assembly, test, and packaging of a second chip are eliminated.
In certain cases, it can be advantageous to include embedded FPGA on an ASIC if that FPGA eliminates the need for additional design passes. For example, at volumes of up to 250,000 pieces, 50K gates of embedded FPGA are cost-effective. Similarly, 10K gates of embedded FPGA are cost-effective versus a 2-pass ASIC design at volumes of up to 1M. In general, if mask costs rise, volumes decrease, or more design passes are avoided, then the embedded FPGA approach becomes progressively more cost-effective compared to the ASIC approach. This is because at low volumes, the mask costs (and NRE) for additional design passes become a significant adder to per-chip cost, and this can outweigh the cost impact of the larger die area required by the embedded FPGA circuitry. This analysis leads us to conclude that technology and market trends have created a need for the development of the hybrid ASIC/FPGA product. Mask costs for advanced technologies are growing, making multiple design passes too costly for many applications. Fortunately, the technology advancements that have driven this trend have also opened up the potential to embed significant amounts of FPGA gates onto an ASIC die, enough to handle some of the design updates that would otherwise require additional design passes.
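The trade-off between extra design passes and extra die area can be sketched with a simple per-chip cost model. The Python below is illustrative only and does not reproduce the volume figures quoted above: it assumes each design pass costs one full mask set, that the embedded FPGA adds a fixed per-chip cost for its die area, and every number and function name is hypothetical.

```python
def per_chip_cost(mask_set_cost: float, passes: int, volume: int,
                  die_cost: float) -> float:
    """Per-chip cost when `passes` mask sets are amortized over `volume` chips."""
    if volume <= 0 or passes <= 0:
        raise ValueError("volume and passes must be positive")
    return passes * mask_set_cost / volume + die_cost


def hybrid_break_even_volume(mask_set_cost: float, passes_avoided: int,
                             efpga_area_adder: float) -> float:
    """Volume below which paying for embedded FPGA area beats extra mask sets."""
    if efpga_area_adder <= 0:
        raise ValueError("the embedded FPGA area adder must be positive")
    return passes_avoided * mask_set_cost / efpga_area_adder


if __name__ == "__main__":
    # Hypothetical figures: $1M mask set, $10 base die, $2 per chip for eFPGA area.
    mask, die, adder = 1_000_000, 10.0, 2.0
    for volume in (100_000, 1_000_000):
        two_pass_asic = per_chip_cost(mask, 2, volume, die)
        one_pass_hybrid = per_chip_cost(mask, 1, volume, die + adder)
        print(volume, two_pass_asic, one_pass_hybrid)
    print(hybrid_break_even_volume(mask, 1, adder))
```

With these assumed inputs the hybrid is cheaper at the lower volume and the 2-pass ASIC at the higher one, and the break-even volume rises when mask costs rise or more passes are avoided, consistent with the general statement in the text.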
10.4 Hybrid Offering Overview
The IBM/Xilinx hybrid will first be available in IBM's Cu-08 90 nm ASIC offering, and will consist of three FPGA block sizes. Multiple blocks can be mixed and matched.
Physically, the FPGA cores are being ported to the same semiconductor process that the ASIC product uses. The issues encountered in this porting are similar to those of other third-party IP ports. One of the largest challenges is full-chip physical verification. Common design rules and transistor design points are critical in blending IP between suppliers. Minor differences in design rules can be accommodated, assuming that checking decks and other verification software are able to handle the mixture of design rules. Designing these tools for increased flexibility will likely be needed as more companies share IP.
To ensure that the FPGA can be integrated with the rest of the ASIC, agreement must be reached on metal stack options. In the case of the Cu-08 hybrid offering, 5 levels of metal were allocated to the FPGA blocks. This requires a re-layout of the FPGA cores, which were originally designed for a standard product with 9 levels of metal.
As part of the re-layout, the power distribution of the FPGA blocks will be designed to integrate easily into the ASIC power distribution methodology. Care needs to be taken to ensure that the power density required by the FPGA blocks is within the capability of the ASIC power supply routing. Due to extensive use of pass-gate structures, the FPGA blocks require standard 1.2 V levels, while the bulk of the chip operates at lower levels.
The embedded FPGA blocks consist of programmable logic blocks, configuration logic, test interface logic, and simplified I/O buffers for use in driving and receiving on-chip nets. Multiple end-user configuration modes are supported, including FPGA, serial, and parallel modes. Individual cores can be configured asynchronously, allowing for "on-the-fly" reconfiguration.
To design the new hybrid chips, a modified design methodology is being developed, as shown in Figure 10.1. This hybrid design flow incorporates two proven design methodologies, the IBM ASIC flow and the Xilinx FPGA flow, including several third-party vendor synthesis options. The ASIC methodology integrates the embedded FPGA as a hard core with appropriate ASIC-level models. The FPGA flow, including timing closure of the FPGA configuration, is done using Xilinx tools. The designer has the choice of using constraints or detailed timing from the Xilinx tool flow to close the ASIC timing at the FPGA core interfaces. If an FPGA configuration is known prior to the design of the ASIC, actual timing information can be passed to the ASIC tools from the FPGA tools. If the logic content of the embedded FPGA is unknown, the ASIC design can be completed using timing assertions and the embedded FPGA design can be completed later. If the embedded FPGA design is being reconfigured after the ASIC is in manufacturing, the final timing constraints from the completed ASIC can be passed to the FPGA tools for the new FPGA design.
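The three timing hand-off cases described above can be summarized as a small decision function. This Python sketch only restates the text in code form: the enum, its member names, and the function `timing_handoff` are hypothetical and do not correspond to any IBM or Xilinx tool interface.

```python
from enum import Enum


class TimingHandoff(Enum):
    """Which timing data crosses the ASIC/FPGA boundary (names are illustrative)."""
    FPGA_TIMING_TO_ASIC_TOOLS = "actual FPGA timing passed to the ASIC tools"
    ASIC_USES_TIMING_ASSERTIONS = "ASIC closed with timing assertions; FPGA designed later"
    ASIC_CONSTRAINTS_TO_FPGA_TOOLS = "final ASIC timing constraints passed to the FPGA tools"


def timing_handoff(fpga_config_known: bool, asic_in_manufacturing: bool) -> TimingHandoff:
    """Select the hand-off case from the two conditions named in the text."""
    if asic_in_manufacturing:
        # The ASIC is fixed, so a new FPGA configuration must meet its constraints.
        return TimingHandoff.ASIC_CONSTRAINTS_TO_FPGA_TOOLS
    if fpga_config_known:
        # The FPGA configuration exists before the ASIC design is completed.
        return TimingHandoff.FPGA_TIMING_TO_ASIC_TOOLS
    # The FPGA content is still unknown while the ASIC is being designed.
    return TimingHandoff.ASIC_USES_TIMING_ASSERTIONS


if __name__ == "__main__":
    print(timing_handoff(fpga_config_known=False, asic_in_manufacturing=False).value)
```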
The logical design of the chip must be partitioned prior to final synthesis. The logic destined for an FPGA block is processed independently of the logic destined for ASIC logic. When multiple FPGA logic blocks are used, each must be designed and optimized independently.
The ASIC physical design process treats the FPGA macro similarly to other large placeable objects, except for port assignment. During the initial ASIC design, the port assignment of each embedded FPGA block can be modified to accommodate floorplanning or timing requirements. Once the final ASIC design is taped out, the port assignments are fixed for subsequent FPGA configurations.
The IBM ASIC methodology has been described in references, and the Xilinx FPGA methodology is described in reference. As is to be expected, most of the issues in creating the hybrid methodology occur at the boundary between the two methodologies. The mechanics of the communications between the two systems can be accomplished by creating data translators; however, optimization between the two systems can be difficult, due to the significant architectural differences between traditional ASIC flows and traditional FPGA flows.