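CST and CUDA
Many thanks to everyone for the prompt and detailed answers. I will first summarize my own understanding for the benefit of later readers (if anything is wrong, please point it out so I can correct it):
1. What is CUDA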
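CUDA (Compute Unified Device Architecture) is a parallel computing architecture that uses the chip on the graphics card (the GPU) and its video memory to take over part of the CPU's computational workload. It was first proposed and implemented by NVIDIA.
2. CUDA and CST MWS
History: first, a company called Acceleware used NVIDIA's CUDA to provide a software interface that let users solve complex computational problems with CUDA technology. In 2008 CST adopted this technology as well, and announced that the CUDA-based solution could deliver performance gains of up to 40% over the existing acceleration product (see the press release quoted further down).
By CST 2010, Acceleware is nowhere to be seen: CST has integrated CUDA technology into CST 2010 itself. According to the official CST 2010 promotional material, as shown in the figure below,
As the figure above shows, the speed-up is far more than just 40%. This result is for a signal integrity check on a chip, where the speed-up can reach nearly 28 times.
The figure below shows CST's simulation acceleration structure, from the underlying hardware, to cluster partitioning during simulation, to optimization of the simulation algorithms.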
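3. CST hardware acceleration
CST's hardware acceleration includes CUDA, Distributed Computing, and MPI Computing (parallel computing across cluster nodes via message passing). Of course, good things are not free: according to moderator EDATOP, an Acceleration Token costs 9,000 euros for a year, followed by an annual maintenance fee of 15%. The list of acceleration options is shown in the figure below.
That is about all I can recall regarding CUDA in CST. CUDA is not used only in CST, though; this site also has quite a few posts on other CUDA applications:
Accelerating MATLAB with CUDA: /read-htm-tid-30481.html
There is also a superb thread that teaches CUDA programming step by step: /read-htm-tid-18915.html (reading it left me stirred and in tears; my respects once again to the expert who wrote it)
Finally, thanks to moderator EDATOP, lantianyi, and zhknpu for their help.
P.S. I admit I still harbor a sneaky hope: with the version before CST 2010, could one install Acceleware's third-party driver oneself and get acceleration that way? Heh, wicked thought!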
CST MICROWAVE STUDIO users can gain up to 40 percent performance improvements for electromagnetic simulations with Acceleware's next generation of software acceleration products
CALGARY, Alberta, and DARMSTADT, Germany, Aug. 25 -- Acceleware Corp., a leading developer of high performance computing applications, and Computer Simulation Technology (CST), a supplier of leading-edge electromagnetic (EM) simulation software, today announced a new CUDA (Compute Unified Device Architecture)-based Acceleware solution for accelerating EM simulations with CST MICROWAVE STUDIO (CST MWS). Trial tests of this new acceleration product have delivered performance gains of up to 40 percent compared to the current product. This faster version is compatible with current generation GPUs, delivering significant speed-ups to CST customers without having to upgrade hardware.
"CST MWS users already benefiting from our acceleration solutions will experience significant performance improvements with a simple software update," said Ryan Schneider, CTO of Acceleware. "This combined offering enables Acceleware and CST to deliver the first ever CUDA powered solution to solve electromagnetic problems."
The new Acceleware software uses NVIDIA's CUDA programming language that was designed to solve complex computational problems, enabling users to access multi-core parallel processing technology. Engineers and product designers who value the CST MWS time domain solver's efficiency, accuracy and user-friendly interface, will benefit from increased acceleration to help them tackle larger simulation problems and meet production deadlines.
"NVIDIA Tesla products in combination with CST and Acceleware's software provide an accelerated design environment for end users to reduce the time spent simulating products," said Andy Keane, general manager of the GPU computing business at NVIDIA. "Acceleware's use of CUDA in its libraries to harness the parallel architecture of the GPU will provide compelling value and considerable speed-ups for designers who frequently perform EM simulations."
Acceleware's CUDA enabled solvers take advantage of highly parallel NVIDIA compute hardware to offer performance gains while performing lengthy EM design simulations. Users will be able to run their CST MWS simulations even faster while supporting the same accelerated feature sets that designers use today, including open boundaries, lossy metals, Thin Sheet Technology (TST), Perfect Boundary Approximation (PBA), dispersive materials, and far field monitors. Single card, dual card, and quad card configurations are supported.
"Acceleware's hardware acceleration solutions have been well received by our customers. Achieving an accurate solution in a shorter time frame is instrumental in bringing their products to market earlier," commented Jonathan Oakley, VP of sales and marketing at CST of America. "We are really excited by the prospect of offering them an even faster, CUDA enhanced solution as part of their regular maintenance."
Availability
The CUDA enabled solver is expected to be available by the end of 2008. CST and Acceleware end users will benefit from this development as part of their maintenance contract. All 30 series products will be supported. The range of hardware acceleration products may be extended in the future. For more information, visit www.acceleware.com or www.cst.com.
CST develops and markets software for the simulation of electromagnetic fields in all frequency bands. Its success is based on the implementation of unique, leading-edge technology in a user-friendly interface. Furthermore, CST's "complete technology" complements its market and technology leading time domain solver, thus offering unparalleled accuracy and versatility for all applications. CST's customers operate in industries as diverse as telecommunications, defence, automotive, electronics, and medical equipment, and include market leaders such as IBM, Intel, Mitsubishi, Samsung, and Siemens. Timely, local support is provided through CST's direct sales and technical support forces. Together with its highly qualified distributors and representatives, CST supports its EM products in over 30 countries. CST's flagship product, CST MICROWAVE STUDIO (CST MWS) is the leading-edge tool for the fast and accurate simulation of high frequency (HF) devices such as antennas, filters, couplers, planar and multi-layer structures and SI and EMC effects. CST MWS offers considerable product to market advantages such as shorter development cycles, virtual prototyping before physical trials, and optimization instead of experimentation. Further information about CST is available on the Web at www.cst.com.
About Acceleware
Acceleware (TSX-V: AXE) develops and markets solutions that enable software vendors to leverage heterogeneous, multi-core processing hardware without rewriting their applications for parallel computing. This acceleration middleware allows customers to speed up simulation and data processing algorithms, benefiting from high performance computing technologies available in the market such as multiple-core CPUs, GPUs or other acceleration hardware. Acceleware solutions are deployed by companies worldwide such as Philips, Boston Scientific, Samsung, Eli Lilly, General Mills, Nokia, LG, RIM, Medtronic, Hitachi, Fujifilm, FDA, Mitsubishi, Sony Ericsson, AGC, NTT DoCoMo, and Renault to speed up product design, analyze data and make better business decisions in areas such as electronic manufacturing, oil & gas, medical and security imaging, industrial and consumer products, financial, and academic research. Acceleware is a public company on Canada's TSX Venture Exchange under the trading symbol AXE. For more information about Acceleware, visit www.acceleware.com.
Wikipedia's overview of CUDA
CUDA (an acronym for Compute Unified Device Architecture) is a parallel computing architecture developed by NVIDIA. CUDA is the computing engine in NVIDIA graphics processing units (GPUs) that is accessible to software developers through variants of industry standard programming languages. Programmers use 'C for CUDA' (C with NVIDIA extensions and certain restrictions), compiled through a PathScale Open64 C compiler,[1] to code algorithms for execution on the GPU. CUDA architecture shares a range of computational interfaces with two competitors: the Khronos Group's Open Computing Language[2] and Microsoft's DirectCompute[3]. Third party wrappers are also available for Python, Perl, Fortran, Java, Ruby, Lua, and MATLAB.
CUDA gives developers access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs. Using CUDA, the latest NVIDIA GPUs become accessible for computation like CPUs. Unlike CPUs however, GPUs have a parallel throughput architecture that emphasizes executing many concurrent threads slowly, rather than executing a single thread very fast. This approach of solving general purpose problems on GPUs is known as GPGPU.
In the computer game industry, in addition to graphics rendering, GPUs are used in game physics calculations (physical effects like debris, smoke, fire, fluids); examples include PhysX and Bullet. CUDA has also been used to accelerate non-graphical applications in computational biology, cryptography and other fields by an order of magnitude or more.[4][5][6][7] An example of this is the BOINC distributed computing client.[8]
CUDA provides both a low level API and a higher level API. The initial CUDA SDK was made public on 15 February 2007, for Microsoft Windows and Linux. Mac OS X support was later added in version 2.0[9], which supersedes the beta released February 14, 2008.[10] CUDA works with all NVIDIA GPUs from the G8X series onwards, including GeForce, Quadro and the Tesla line. NVIDIA states that programs developed for the GeForce 8 series will also work without modification on all future NVIDIA video cards, due to binary compatibility.
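This is what CST calls GPU Computing, one of its Hardware Acceleration methods. It is now licensed together with Distributed Computing and MPI Computing through the Acceleration Token. For details, see the GPU Computing Guide, which can be downloaded from the official website.
My company is currently considering whether to buy a token: 9,000 euros for a year, then an annual maintenance fee of 15%. With the simplest options, an Nvidia Quadro FX 5800 or an Nvidia Tesla C1060, you can get a 6-7x speed-up. We had CST technical support run a test: the model I am currently simulating takes 19 hours on an HP xw8400 personal workstation, while CST's GPU-accelerated run took less than two hours!
GPU acceleration has hardware requirements: the two accelerator cards above require at least 12 GB of RAM, and the workstation needs 800 W to 1000 W of power. More advanced acceleration devices require more; the GPU Computing Guide lists the required hardware configurations.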
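The figures quoted in this post allow a quick back-of-the-envelope check. The sketch below is only illustrative: it takes the 19-hour and 2-hour run times and the 9,000-euro token price from the post, and it assumes (the post does not spell this out) that the 15% maintenance fee is charged on the token price for every year after the first. The function and constant names are made up for this example.

```python
# Back-of-the-envelope figures from the post; the cost model is an assumption.

TOKEN_PRICE_EUR = 9000      # price of one Acceleration Token, as quoted in the post
MAINTENANCE_PERCENT = 15    # annual maintenance, ASSUMED to be a percentage of the token price


def speedup(cpu_hours, gpu_hours):
    """Ratio of the unaccelerated run time to the accelerated run time."""
    return cpu_hours / gpu_hours


def cumulative_cost(years):
    """Total licence cost in euros after `years` years (hypothetical model).

    Year 1 costs the full token price; each later year costs only the
    maintenance fee.
    """
    if years < 1:
        return 0.0
    yearly_maintenance = TOKEN_PRICE_EUR * MAINTENANCE_PERCENT / 100
    return TOKEN_PRICE_EUR + (years - 1) * yearly_maintenance


if __name__ == "__main__":
    # The GPU run took "less than two hours", so this is a lower bound.
    print("speed-up of at least", speedup(19, 2))
    for n in (1, 2, 3):
        print("cost after", n, "year(s):", cumulative_cost(n), "EUR")
```

Because the accelerated run finished in under two hours, the computed ratio is a lower bound on the speed-up actually observed in that test.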
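Thanks for the reply, moderator. I have just read the GPU Computing Guide you recommended.
Unfortunately all I have right now is a dual-core 2.6 GHz PC with 4 GB of RAM. If I understand correctly, even if I buy a CUDA-capable graphics card, I would still need to buy a CST token to get acceleration?
I'll wait and see!
Short answer: yes!
Using one GPU requires one Acceleration Token.
For details, see the CST Licensing Guide 2010.
Many thanks, moderator.
Looks like good things only come to those who pay, heh.
I have been using this for over a year and have some experience with it. If you have questions, leave me a message.
As far as I know, it is not as simple as just buying a graphics card. The hardware accelerator card was originally installed inside the case; later, because of power consumption and weight, it came to be built as a separate module.
The hardware acceleration feature is implemented jointly by three companies: CST provides the simulation software, Nvidia provides the accelerator hardware, and Acceleware provides the interface module between the two.
The information above is already a bit dated...
CST has just updated the GPU Computing Guide again (August 26), adding support information for the Tesla M-series accelerator cards and updating the driver links. GPU use now also supports Remote Login.
Reading through the current GPU Computing Guide, you no longer see any trace of Acceleware, and in my last web meeting with CST technical support they made no mention of having to deal with Acceleware either.
To use GPU Computing, you supply your own acceleration device and buy a Token; it is that simple. Hardware compatibility is something the customer has to take care of.
At present the single-GPU Nvidia Quadro FX 5800 and Nvidia Tesla C1060 are still installed in a PCI-E G2 x16 slot on the motherboard, and the card needs 200 W. A new generation of accelerator cards is due at the end of this year (the Tesla 2 series; the Quadro series seems to be getting an update too), and single-GPU accelerator cards will still plug into the motherboard.
The NVIDIA Quadro Plex 2200 D2 is a standalone module with two GPUs, drawing 640 W, with 750 W for the workstation. Judging by its size it should sit outside the case, and it occupies one PCI-E G2 x16 slot.
The NVIDIA Tesla S1070 requires a rack-mount system: four GPUs, 800 W, 750 W for the workstation, occupying two PCI-E G2 x16 slots.
The newly listed NVIDIA Tesla M1060 is an embedded module with a PCI-E interface and one GPU, though it is not clear whether it plugs into the motherboard or is connected externally.
This thread is really cutting-edge; it is the trend of the future~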