May 17, 2009

Japan's Next Generation Supercomputer Project may become stronger through the present revision



I almost fell off my chair when I read the following press releases issued on May 14 by RIKEN and NEC.

RIKEN: "RIKEN revises the configuration of the Next Generation Supercomputer System (NGS) (in Japanese)"
NEC: "NEC revises role of participation in the MEXT's Next Generation Supercomputer Project (in Japanese)"

I guess that NEC has chosen moderate improvement of its vector machine over the innovation for which the NGS offered an opportunity. I am very sorry to say that it looks like the business model of the last ice dealer.

A supercomputer development race is in practice a race among computer vendors, not governments, even when governments lead it as a national project. Hence, now that NEC has given up building the vector side of the NGS (and Hitachi follows NEC), the 10 PetaFLOPS race will be contested by Fujitsu, which implements the NGS for RIKEN, and IBM, which implements Sequoia for Lawrence Livermore National Laboratory.

The remaining vendor, Fujitsu, looks active: on May 13 it previewed a prototype of the new processor to be used in the NGS ("Fujitsu developed the fastest CPU in the world. A Japanese manufacturer won back for the first time in a decade. (in Japanese)"), and it had already released the architecture in March ("Fujitsu released HPC-ACE architecture that is used in NGS.").

An outsider like me is curious about the influence of the acquisition of Sun Microsystems, but Fujitsu must be fully aware of it.

When all three applicants for NGS R&D (Fujitsu, NEC and Hitachi) were chosen as NGS developers, a friend at IBM Research told me that such a formation would be impossible to imagine in the U.S. and would easily lead to failure there.
Indeed, the Earth Simulator was implemented by a single vendor, NEC.

Now I expect that the 10 PetaFLOPS competitor will recognize that the NGS project has become stronger than before, because the present revision means a single implementer. However, it discards the Japanese flavor of a "Scalar-Vector hybrid supercomputer".

Anyway, insufficient information about the NGS probably invites strange guesses that read like jokes. For example, the article "NEC abandons Japan's 'next-gen' supercomputer" guesses that a Taiwanese manufacturer may construct the NGS vector side. Hence, I believe it is becoming more important that RIKEN continue to release timely and accurate information about the NGS, which will help maximize its value.

May 6, 2009

Flu and Supercomputing



Cases of swine flu infection are reported every day. Looking at the 2009 H1N1 Flu Outbreak Map, we can easily see that the swine flu virus is carried along transportation networks. For picturing the present infection status correctly, this map is much better than common maps that paint an entire country even if only one person is infected.

I guess that many countries are studying how infection propagates over time and space, since this could also be useful in bioterrorism cases. For example, IBM is developing the Spatiotemporal Epidemiological Modeler (STEM), which is now available as open source from Eclipse.org. In April 2009, Eclipse approved STEM as a top-level Eclipse Technology Project, and an air travel model covering every US airport is available to drag in and use. The core team at IBM Almaden Research Center continues to run very large-scale computations using large data sets.

With the capability of Blue Gene supercomputer technology in mind, a joint team of the Scripps Research Institute and IBM Watson Research Center initiated Project Checkmate, which takes on the challenge of investigating the clever behavior of the avian flu virus and the human immune system. IBM uses a Blue Gene supercomputer to simulate the giant molecular structure of the H of an avian flu (H5N1) virus, where H is hemagglutinin, which acts as a hook for attaching to a host cell.

So far we know only the purpose and concept of Project Checkmate (movie). I hope that we will hear good results from Project Checkmate soon.

Apr 23, 2009

SGI into Rackable Systems, and Sun into Oracle



Following the $25M purchase of SGI by Rackable Systems on April 1, Oracle announced "Oracle Buys Sun", a $7.4B deal, on April 20.

After IBM withdrew from the Sun deal, we were concerned about Sun's future.
Now Sun Microsystems seems to have succeeded in finding a suitable boatman for the Sun ship. I wonder why Sun did not propose the purchase to Oracle first.

However, according to the NY Times article "In Sun, Oracle Sees a Software Gem", "Oracle executives emphasized that they did not regard Sun as a hardware company", although Safra Catz, Oracle’s president, called Sun a “modern technology company.”
From an HPC point of view, this message leaves us uneasy.

The Wall Street Journal article "Oracle Snatches Sun, Foiling IBM" reported a slightly different message about Sun's hardware business: "Safra Catz, one of two Oracle presidents, said Oracle intends to make Sun's hardware operations a profitable business unit."

In any case, it is true that Oracle expects to transform Sun into a highly profitable organization, and it must make new decisions about investment in Sun's system development, such as for HPC, which demands long-term resources and a great deal of money. For example, on April 14 Sun made a large announcement of advanced hardware: new blade servers, a new family of network products including InfiniBand, storage systems, and the large HPC system Constellation. This amounts to Sun promising substantial investment to these users. I therefore cannot understand why Sun was so insensitive to cost, as if it were neglecting the buyer's concerns before the deal was concluded.

Not surprisingly, according to the New York Times of April 20, Oracle executives said they could make Sun profitable, and industry analysts estimated Oracle’s cost savings from Sun operations. It is a difficult time, in particular, for Sun's hardware people.


An aside ("legs on a snake", as we say in Japanese):

Something feels wrong to me about the WSJ headline "Oracle Snatches Sun, Foiling IBM". In fact, the NYT reported in its April 20 article "I.B.M. Affirms Earnings Goal Despite Sales Slide" that Mark Loughridge, IBM CFO, suggested "a Sun deal did not pass muster".
It seems more reasonable to think that IBM had already lost interest and walked away from the Sun deal.

Mar 30, 2009

Who Says Elephants Can't Dance?



I am reading the book "Who Says Elephants Can't Dance? (Nikkei, 2002, Japanese translation)" rather late. The book is written by Louis Gerstner, who became IBM CEO from a non-IT company in 1993, brought dramatic success to a troubled IBM, and left IBM in the spring of 2002.
I intentionally passed over such a book while I was an IBM employee because it was too close to home for me to enjoy. Instead of reading it, I always imagined some scene from Dumbo because of the title. Now I know that this must be a rich and excellent book.

According to the book, when the selection board was looking for a new IBM CEO, Scott McNealy, CEO of Sun Microsystems, frankly told a journalist that they should select a dull person, whoever took the post.

Anyway, Gerstner revived IBM. As for HPC, IBM shipped the first scalable parallel system, SP1, in 1993 and lit the fuse for IBM's coming HPC prosperity, led by Irving Wladawsky-Berger.

In March 2009, just sixteen years later, the WSJ reported a rumor that an acquisition of Sun Microsystems had been proposed to both HP and IBM.

In the book, Gerstner clearly said that he never bought companies that contributed revenue but not profit. Hence, we may imagine a little of what Gerstner's attitude toward the rumor would be if he were the CEO.
Now we hold our breath awaiting the conclusion by Samuel Palmisano, whom Louis Gerstner appointed as his successor (if the WSJ's rumor is true).

Mar 22, 2009

RIKEN Symposium “The Third Generation PC Clusters”



The RIKEN Symposium, titled “The Third Generation PC Clusters”, was held on March 12 at the RIKEN Wako Institute, sponsored by the RIKEN Advanced Center for Computing and Communication and the RIKEN Next-Generation Supercomputer R&D Center.

All the presentations (mostly in Japanese) are now available from the RIKEN symposium program site.

The symposium was interesting to me because RIKEN reviewed its five years of experience with the RIKEN Super Combined Cluster (RSCC), which ranked seventh on the TOP500 list in June 2004 and will be succeeded by the next system, the RIKEN Integrated Cluster of Clusters (RICC), to be operated from August 2009. The aggregate computing capacity of the RICC is estimated at 100 + 106 + 64 TFLOPS, i.e., 270 TFLOPS, according to Kurokawa-san’s presentation (the sixth presentation).

Regarding the Next-Generation Supercomputer (NGS) of Japan, the framework of a six-floor building has appeared on Kobe Port Island (it looks like the Blue Waters building, and is taller), and the prototyping and evaluation phase for the NGS system units is scheduled for this year. In line with this schedule, the role of the RICC is expanded to:
1. Provide application development environments for NGS
2. Provide HW/SW testing environments for post-NGS
3. Provide higher performance systems for current users
4. Develop new application areas of RIKEN,
according to Dr. Ryutaro Himeno, director of RIKEN Advanced Center for Computing and Communication.

In the afternoon sessions there were three GPGPU presentations, including a performance evaluation with the HIMENO benchmark. The RICC will attach 100 GPGPUs to a 100-node multi-purpose cluster system.

More details with many pictures have been published by a mycom reporter on their web site (in Japanese).

Mar 17, 2009

Prof. Genki Yagawa and Dr. Tadashi Watanabe, world-class researcher and architect in HPC, are awarded the Japan Academy Prize in FY2009



The Japan Academy (Nippon Gakushi-in) was established on 15 January 1879 in the Meiji Period for the purpose of advancing education and science in Japan. Now it is operated under the auspices of the Ministry of Education, Culture, Sports, Science and Technology.

Last week the Japan Academy announced the 10 awardees of the Japan Academy Prize in FY2009, and Prof. Genki Yagawa and Dr. Tadashi Watanabe, a world-class researcher and architect in HPC, are among them.

Dr. Genki Yagawa, professor emeritus of the University of Tokyo, executive member of the Science Council of Japan, and Director of the Center for Computational Mechanics Research, Toyo University, is well known for his nuclear engineering research and has recently led Japan's simulation project on multi-scale, multi-physics phenomena.

Dr. Tadashi Watanabe, Project Leader of the Next Generation Supercomputer R&D Center, RIKEN, is a world-class supercomputer architect who has received the Cray award, among others.

It must be a landmark event that the Japan Academy honors researchers in the third mode of science (HPC) alongside those in the rich and traditional experimental and theoretical sciences.

Mar 10, 2009

HPC ASIA & APAN 2009 in Taiwan Last Week



Last week, HPC ASIA & APAN 2009 was held on March 2 – 5 in Kaohsiung, Taiwan. Its presentation materials and proceedings are ready for download from the advanced program.

It is the 10th HPC Asia event, hosted by the National Center for High-Performance Computing (NCHC) in Taiwan. The seventh, HPC Asia 2004, was held in Japan, followed by HPC Asia 2005 in China and HPC Asia 2007 in Korea.

The Taiwanese government is positive about HPC, and I think researchers from Taiwan also look very smart and active, such as Hsu-san, father of the Deep Blue project, and Chiu-san, who is developing Blue Gene.

The first day of HPC ASIA & APAN 2009 was dedicated to tutorials, and the three remaining days to keynotes, presentations and workshops.

AIST, Tokyo Institute of Technology, Keio University, Kyushu University, ISIT and RIKEN from Japan appear in the proceedings in HPC.

There were several keynote and invited speakers: the well-known Jack Dongarra; William Kramer, Deputy Director of the Blue Waters project (recently moved from General Manager, NERSC); Zhiwei Xu of ICT, which is developing the Dawning 5000, currently #1 in Asia on the TOP500; and Mark Seager of LLNL, who is leading the 20 PFLOPS Sequoia procurement.

There seems to be nothing new in their presentation materials. What I did find is that Rob Pennington (he visited Tokyo last fall) and Edward Seidel have left the Blue Waters project, and William Kramer and William Gropp have joined.


This is just an aside about HPC in Asia and the Middle East. TOP500 Asia has been announced. It targets Asia and the Middle East, and assembles and maintains a list of the 500 most powerful computer systems in this region. The list is compiled twice a year, during the GITEX exhibitions in Riyadh and Dubai. So there are now variations on the TOP500.

Mar 6, 2009

New Earth Simulator (ES2) and New Plasma Simulator into operation



According to recent Japanese news releases, two Japanese supercomputer sites have announced the start of operation. One is the 131 TFLOPS new Earth Simulator (ES2) at the Japan Agency for Marine-Earth Science and Technology (JAMSTEC), and the other is the 77 TFLOPS new Plasma Simulator at the National Institute for Fusion Science (NIFS) in Toki, Japan.

ES2, the successor of the well-known Earth Simulator, is an NEC SX-9/E vector supercomputer, and the new Plasma Simulator is a Hitachi SR16000 model L2, a POWER6-based supercomputer.

The two supercomputers use architectures different from the general-purpose, commodity-based cluster machines that are rapidly becoming dominant even in HPC, as shown in the Top500 supercomputer list.

JAMSTEC and NIFS can clearly specify their real requirements for successors from the application point of view (not only performance but also memory capacity, memory bandwidth, etc.), and may take into account continuity of their major application codes and programming know-how. Hence, one can imagine that the NEC SX-9/E and the Hitachi SR16000 were evaluated as the best for the JAMSTEC and NIFS computational environments, respectively.

The following is a summary of the two systems, based on the specifications listed further below.
(For comparison purposes, some data for ES2 and SR16000 are filled in from the SX-9 model A and from public IBM POWER6 material respectively, which should be a reasonable assumption. A value marked with ** is such a complemented value.)

- The FLOPS per core of the SX-9 is about five times that of POWER6 (not surprising, because an SX-9 CPU includes 8 vector units and a scalar unit).

- SX-9 gives large and steady memory bandwidth. Memory transfer per FLOP, 2.5 Byte/FLOP, is very large and stable because of the vector architecture (no cache).

- The Byte/FLOP of POWER6 varies between 0.21 and 4, depending on data location, i.e., in cache or in memory. Hence, cache-aware programming is desirable.

- SX-9 is expensive and not green (low MFLOPS/W), probably because of the rich hardware needed to deliver the highest vector performance.

- Peak performance per node is almost the same in the two simulators and below 1 TFLOPS: approx. 820 GFLOPS in ES2 and 600 GFLOPS in the Plasma Simulator (SR16000).

- SR16000 gives 102 MFLOPS/W energy efficiency, more than three times better than SX-9. (According to the Green500, the top is 500 MFLOPS/W for the IBM QS22 using the PowerXCell 8i processor, followed by 372 MFLOPS/W for Blue Gene/P. Even a recent quad-core Xeon server can provide 200+ MFLOPS/W.)

- SX-9 uses traditional air cooling. In contrast, SR16000 adopts an efficient direct water cooling system.

The following are characteristics of ES2 and new Plasma Simulator.

● ES2 (NEC SX-9/E)
(System)
- 131 TFLOPS vector peak performance, 20 TB memory, Fat-tree Network
- Number of nodes: 160
- Air Cooling
- Peak Performance/Power: **about 27.3 MFLOPS/W (819.2 GFLOPS/30 KVA)
- OS: NEC SUPER-UX
- Construction and 6 yr lease fee: about 18.9 B yen (about $192M)

(CPU)
- Vector Peak Performance: 102.4 GFLOPS (3.2 GHz Clock)
- 8 Vector units + 1 Scalar unit
- 32 port-memory port crossbar
- 65 nm CMOS 11 copper layers

(Node)
- Vector peak performance: 819.2 GFLOPS
- CPU/node: 8
- Memory/node: 128 GB (SMP)
- Memory bandwidth: **2,048 GB/s (8 CPU)
- Byte/FLOP: **2.5
- Inter node transfer: 128 GB/s (8 GB/s x 8 x 2)

● New Plasma Simulator (Phase 1: Hitachi SR16000 model L2)

(System)
- 77 TFLOPS peak performance, 16 TB memory, InfiniBand Fat Tree Network
- Number of nodes: 128
- External storage: 0.5 PB
- Direct water cooling
- Peak performance/Power: 102.1 MFLOPS/W
- OS: IBM AIX5L
- Contract price: about 5.4 B yen (about $55M)
(In Phase 2 (2012/10~2015/3), it will be upgraded to 315 TFLOPS.)

(CPU chip)
- Chip peak performance: 37.6 GFLOPS (Dual core)
- Dual core POWER6 processor (4.7 GHz clock)
- 32MB L3cache
- 8 channel memory controller (DDR2/DDR3)
- 65 nm CMOS copper + SOI

(Node)
- Peak performance: 601.6 GFLOPS
- CPU core/node: 32
- Memory/node: 128 GB (**32 way cc-NUMA SMP)
- Memory bandwidth: **128~160 GB/s (**4~5 GB/s x 32 core)
(When data are located in the L2 or L3 cache, the bandwidth becomes significantly larger. Memory bandwidth behavior differs from that of a vector processor, which shows a steady value.)
- Byte/FLOP: **0.21 (data on memory)~4 (data on L2 cache)
- Inter node transfer: 32 GB/s (bi-direction)
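
As a rough cross-check of the figures listed above, the following minimal sketch recomputes the derived ratios (system peak, Byte/FLOP, MFLOPS/W) from the per-CPU and per-node values. The function and variable names are only illustrative, and treating 30 KVA as 30 kW (power factor of 1) is an assumption, not something stated in the specifications.

```python
# Figures are copied from the ES2 and Plasma Simulator lists above.
# Values marked ** in the text are complemented estimates, not official specs.

def bytes_per_flop(bandwidth_gb_s, peak_gflops):
    """Memory bandwidth per unit of peak floating-point performance."""
    return bandwidth_gb_s / peak_gflops

def mflops_per_watt(peak_gflops, power_watts):
    """Peak energy efficiency in MFLOPS per watt."""
    return peak_gflops * 1000.0 / power_watts

# Per-core and per-node peak performance
sx9_cpu_gflops = 102.4               # one SX-9 CPU (treated as one core)
power6_core_gflops = 37.6 / 2        # dual-core chip at 37.6 GFLOPS
es2_node_gflops = 8 * sx9_cpu_gflops         # 8 CPUs per node
sr_node_gflops = 32 * power6_core_gflops     # 32 cores per node

# System peak performance (nodes x node peak)
es2_system_tflops = 160 * es2_node_gflops / 1000.0
sr_system_tflops = 128 * sr_node_gflops / 1000.0

# Byte/FLOP with data in main memory
es2_bf = bytes_per_flop(2048, es2_node_gflops)
sr_bf_memory = bytes_per_flop(128, sr_node_gflops)

# Assumption: 30 KVA per ES2 node is treated as 30,000 W
es2_mflops_w = mflops_per_watt(es2_node_gflops, 30_000)

core_ratio = sx9_cpu_gflops / power6_core_gflops
efficiency_ratio = 102.1 / es2_mflops_w

print(f"ES2 node peak:        {es2_node_gflops:.1f} GFLOPS")
print(f"SR16000 node peak:    {sr_node_gflops:.1f} GFLOPS")
print(f"ES2 system peak:      {es2_system_tflops:.1f} TFLOPS")
print(f"SR16000 system peak:  {sr_system_tflops:.1f} TFLOPS")
print(f"ES2 Byte/FLOP:        {es2_bf:.2f}")
print(f"SR16000 Byte/FLOP:    {sr_bf_memory:.2f} (data in memory)")
print(f"ES2 MFLOPS/W:         {es2_mflops_w:.1f}")
print(f"SX-9 CPU / POWER6 core peak ratio: {core_ratio:.1f}")
print(f"SR16000 / ES2 efficiency ratio:    {efficiency_ratio:.1f}")
```

Under these assumptions the node and system peaks agree with the 131 TFLOPS and 77 TFLOPS headline numbers, and the ratios agree with the "five times" and "more than three times" statements in the summary.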

NCAR's 76.4 TFLOPS BLUEFIRE is an IBM p575 with almost the same peak performance as the new Plasma Simulator.

If we simply measure the value of supercomputers on a performance/price scale, commodity-based clusters may come out as the favorite. However, it is essential in HPC that the scale depends on the supercomputer site, and there should be systems of different architectures for users to choose among by their own scale, such as memory, reliability, power and space in addition to peak performance and price.
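
To make the simple performance/price scale concrete, here is a small sketch using only the prices and peak figures quoted in this post. Note that the two prices do not cover the same scope (the ES2 figure includes construction and a six-year lease, while the Plasma Simulator figure is a contract price), so the result is only a rough illustration, and the names in the code are hypothetical.

```python
# Price (billion yen) and peak performance (TFLOPS) as quoted above.
# Caveat: the two price figures cover different contract scopes.
systems = {
    "ES2 (NEC SX-9/E)": {"price_byen": 18.9, "peak_tflops": 131.0},
    "Plasma Simulator (SR16000)": {"price_byen": 5.4, "peak_tflops": 77.0},
}

def myen_per_tflops(price_byen, peak_tflops):
    """Price per unit of peak performance, in million yen per TFLOPS."""
    return price_byen * 1000.0 / peak_tflops

cost = {name: myen_per_tflops(**spec) for name, spec in systems.items()}

for name, value in cost.items():
    print(f"{name}: about {value:.0f} M yen per peak TFLOPS")
```

On this one scale alone the scalar system looks about twice as economical, which is exactly why a single scale is not enough to judge a site's choice.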

Based on my hard experience competing against Japanese vector supercomputers at IBM Japan from around 1985 to 1995, I believe the advantage of vector supercomputers should be rediscovered through successive innovative challenges, including energy efficiency and price, if possible.

Feb 26, 2009

HPC related seminars (Japan) in March



Unusual wet snow is falling silently outside. However, March is coming soon, and the influenza pandemic that our government strongly warned about has not come. That must be very good for medical researchers in infectious diseases as well as for us. In addition, fortunately, not much of the notorious cedar pollen has been circulating in the air so far.

Here I briefly list the HPC related seminars in Japan in March that I know of.

- Pioneering Scientific Computation Forum 2009: “Computing in Molecular Science – Study Report and Introduction of New System”
Date/Time: 13:00-, March 8, 2009
Venue: Research Institute for Information Technology, Kyushu University (Fukuoka)

Besides many academic speakers, “Docking Simulation with Molecular Design Software SYBYL” is presented by Dr. Jun Midorikawa of WorldFusion, Inc., a long-lived Japanese bio-venture company.

More details: http://www.cc.kyushu-u.ac.jp/scp/users/c_news/2008/158.html#1

- RIKEN HPC Symposium
Date/Time: 10:00-, March 12, 2009
Venue: Suzuki Umetaro Hall, Wako Institute, RIKEN (Saitama)

The theme this time is “The Third Generation PC Clusters”.
More details: http://accc.riken.jp/HPC/Symposium/2008/index.html

The RIKEN Benchmark Contest continues, and two benchmark problems have been announced: Poisson FDM-BMT, a Poisson solver for incompressible fluid simulation, and ERI MO-BMT, a two-electron integral calculation in the Hartree-Fock MO (Molecular Orbital) method.

More details: http://accc.riken.jp/HPC/Symposium/2008/bmt.html

- PC Cluster Workshop in Osaka
Date/Time: 10:00 - 17:45, March 13, 2009
Venue: Kansai System Laboratory, Fujitsu (Osaka)

The PC Cluster Workshop in Osaka is organized by the PC Cluster Consortium. There are many presentations, such as “Introduction of SCore7”, the latest release of the high performance parallel programming environment.
More details: http://www.pccluster.org/event/workshop/pcc2009osaka/

The PC Cluster Consortium is promoting a parallel programming contest on PC clusters, in cooperation with SACSIS 2009 (Symposium on Advanced Computing Systems and Infrastructures 2009), to be held in May in Hiroshima.

More details: https://www2.cc.u-tokyo.ac.jp/procon/

- 2009 First Training Workshop for Parallelizing and Tuning of Codes
Date/Time:
(1) MPI Parallel Programming (9:30 - 17:30, March 26)
(2) Program Tuning (9:30 - 15:00, March 27)
(3) MPI Parallel Programming (9:30 - 17:30, April 2)
(4) Program Tuning (9:30 - 15:00, April 3)
Venue: Wako Institute, RIKEN (Saitama)

The combination of scalar code tuning and MPI parallelization makes this an excellent training workshop. The lecturer, Yukiya Aoyama of the Advanced Center for Computing and Communication, RIKEN, has long experience – probably almost twenty years – in such workshops.

More details: http://accc.riken.jp/HPC/training/2009-1.html


As an aside, here are a couple of overseas events.

- HPC User Forum -- 32nd HPC User Forum -- April 20 to 22, 2009 -- In Roanoke, VA
This user forum is operated by IDC.
More details: http://www.hpcuserforum.com

- ISC'09 -- Hamburg, Germany -- June 23-26
Online registration starts on March 2, and early birds get the savings. Registration for industry costs 900 EURO, which is approximately 115 Kyen. Although the Japanese yen has become stronger against the EURO than a year ago, it is still expensive.

More details: http://www.supercomp.de/isc09/

Feb 23, 2009

CRA Praised U.S. Congress for Strong Investments in Science, Innovation



In the U.S., the House Democratic leadership released an official stimulus summary on January 15, and it looked great for researchers, including those in HPC. However, the numbers in the Senate version of January 26 were not as generous as the House numbers. It looked quite difficult to predict the outcome because of significant differences between the two versions, including in how the science investments in the bill are handled, according to the policy blog of the Computing Research Association (CRA). CRA is an association of more than 200 North American academic departments of computer science, computer engineering, and related fields; laboratories and centers in industry, government, and academia engaging in basic computing research; and affiliated professional societies.

The CRA policy blog carried a happy message on February 13, “Computing Researchers Applaud Congress for Strong Investments in Science, Innovation”, since Congress had passed an economic stimulus package that included substantial investments in the nation's science and engineering enterprise. It took effect on February 17, only about one month after the House released the official stimulus summary.

In the meantime, the Japanese government's budget draft for FY2009 includes some investments for strengthening Japanese science and technology, and according to a newspaper the draft and its related bills will probably pass the Lower House before March after a long struggle. Hence, while I am happy to share such good news from the U.S., I am reminded of the still unchanged difference in the speed of policy decision making between the U.S. and Japan.

Anyway, the conference agreement for the American Recovery and Reinvestment Act includes more than $15 billion of investment in scientific research: $3 billion for the National Science Foundation, $1.6 billion for the Department of Energy’s Office of Science, $400 million for the Advanced Research Projects Agency-Energy (ARPA-E), $580 million for the National Institute of Standards and Technology, $8.5 billion for NIH(!), etc. Many HPC people in the U.S. must be very happy.

This stimulus money will be invested on top of the U.S. FY2009 budget for October 2008 to September 2009. Perhaps such a huge and strong boost will make very ambitious projects spring up like mushrooms!?

Feb 19, 2009

The first European PetaFLOPS machine to Germany



The German research center, Forschungszentrum Juelich has selected IBM Blue Gene/P to develop the first supercomputer in Europe capable of one PetaFLOPS. IBM will partner with Juelich-based Gauss Centre for Supercomputing to install the new IBM Blue Gene/P System in the first half of this year.

This new Blue Gene System is the first to include new water cooling technology, created by IBM Research, that uses room temperature water to cool the servers. The result is a 91 percent reduction in the air conditioning units that would have been required to cool Forschungszentrum Juelich's data center with an air-cooled Blue Gene.
Inauguration and naming of the new system will take place at an opening ceremony in mid 2009. Forschungszentrum Juelich has had JUBL (Juelich Blue Gene/L) and then JUGENE (Juelich Blue Gene/P), and now JU???? (Juelich Blue Gene/P Full System).

Key specifications of 1PetaFLOPS Blue Gene/P
Processors: 294,912 Processors
Type: PowerPC 450 core 850 MHz
Compute node: 4-way SMP processor
Memory: 144 Terabytes
Racks: 72
Network Latency: 160 Nanoseconds
Network Bandwidth: 5.1 Gigabytes per second
Energy consumption: 2200 Kilowatts
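
The key specifications above can be checked against the one PetaFLOPS claim with a short calculation. This is only a sketch: the figure of 4 floating-point operations per cycle per core is my assumption (a dual floating-point unit with fused multiply-add), not part of the list above, and the energy efficiency it yields is based on peak performance, so it is not comparable with measured Green500 values.

```python
# Values from the key specifications list above.
CORES = 294_912          # listed as "Processors"
CLOCK_GHZ = 0.85         # 850 MHz
CORES_PER_NODE = 4       # 4-way SMP compute node
RACKS = 72
MEMORY_TB = 144
POWER_KW = 2200

# Assumption (not in the list): 4 FLOPs per cycle per core.
FLOPS_PER_CYCLE = 4

peak_tflops = CORES * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000.0
nodes = CORES // CORES_PER_NODE
cores_per_rack = CORES // RACKS
memory_per_node_gb = MEMORY_TB * 1024 / nodes
peak_mflops_per_watt = peak_tflops * 1e6 / (POWER_KW * 1000)

print(f"Peak performance:   {peak_tflops:.1f} TFLOPS")
print(f"Compute nodes:      {nodes}")
print(f"Cores per rack:     {cores_per_rack}")
print(f"Memory per node:    {memory_per_node_gb:.1f} GB")
print(f"Peak MFLOPS/W:      {peak_mflops_per_watt:.0f}")
```

With that assumption the peak comes out just above 1,000 TFLOPS, consistent with the one PetaFLOPS headline.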

In addition, Forschungszentrum Juelich will install the 100 TFLOPS HPC-FF (for Fusion) cluster system for the fusion scientists' simulation programs for the ITER experimental fusion reactor. It will consist of 1,080 computing nodes, each equipped with two Nehalem EP quad-core processors from Intel. The grand total of 8,640 processor cores will run at a clock rate of 2.93 GHz each, will be able to access about 24 terabytes of total main memory, and will be water-cooled. The French supercomputer manufacturer Bull will integrate the system, and InfiniBand ConnectX QDR from the Israeli company Mellanox will be used as the network.
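
The HPC-FF numbers can be tied together in the same way. In this sketch the 4 double-precision FLOPs per cycle per core is an assumption about the Nehalem EP processor, not a figure from the announcement, and the memory per node is only as accurate as the "about 24 terabytes" total.

```python
# Values from the HPC-FF description above.
NODES = 1080
PROCESSORS_PER_NODE = 2
CORES_PER_PROCESSOR = 4   # quad-core
CLOCK_GHZ = 2.93
TOTAL_MEMORY_TB = 24      # "about 24 terabytes"

# Assumption (not in the announcement): 4 double-precision FLOPs per cycle per core.
FLOPS_PER_CYCLE = 4

total_cores = NODES * PROCESSORS_PER_NODE * CORES_PER_PROCESSOR
peak_tflops = total_cores * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000.0
memory_per_node_gb = TOTAL_MEMORY_TB * 1000 / NODES

print(f"Total cores:      {total_cores}")
print(f"Peak performance: {peak_tflops:.1f} TFLOPS")
print(f"Memory per node:  about {memory_per_node_gb:.0f} GB")
```

The total of 8,640 therefore counts cores rather than processor chips, and the peak lands close to the quoted 100 TFLOPS.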
ITER will go into operation in 2018 and will be the first fusion reactor to generate at least 500 megawatts of excess power. ITER will be constructed in Cadarache, in the south of France, by a consortium consisting of the European Union, Japan, the USA, China, Russia, India and South Korea.


Germany (or the EU) appears to be running one or two years behind the US in supercomputer performance, but steadily. The government there is much more positive about HPC investment than several years ago. In Japan, the fastest system is a T2K at the University of Tokyo (140 TFLOPS), and I cannot see any delivery plan for a system of more than 200 TFLOPS this year. Hence we may have to look at the backs of German supercomputers for a while (probably until the end of FY 2011, when the 10 PetaFLOPS Japanese Next Generation Supercomputer is delivered).

Feb 17, 2009

Bushfire and flooding in Australia



I have been staying in Cairns, on the northeast coast of Australia, since the middle of last week as originally planned. Heavy precipitation is natural at this time because it is the rainy season in a tropical climate region, but this year was special: rivers in the northern part of Queensland flooded due to heavy rain, and ferocious crocodiles moved from the rivers into the residential areas of some towns.

Since the main roads from the south were flooded, trucks were unable to reach Cairns and vegetables disappeared from supermarkets. Fortunately the trunk roads reopened just a day before my stay, so food was plentiful and the town was clean; I could not imagine the heavy flooding and could find no trace of it. Although the weather has been quite fine, the TV weather reports keep forecasting clouds and thunderstorms every day. I guess Australians do not care about weather forecasts as much as Japanese do.

On the other hand, a huge bushfire, said to be a once-in-100-years event, continued in the southeast part of Australia (the State of Victoria), and troops have just been deployed. I remember that ABC TV news reporters kept saying "it is still uncontrolled".

Southeastern Australia has many eucalyptus trees, which koalas like, and their leaves contain a lot of oil. Therefore, it is said that bushfire is a natural phenomenon in Australia (and that eucalyptus seeds sprout at the high temperatures during a bushfire).

However, this bushfire has grown to a very large scale, many people have lost all of their property or even their lives, and warnings and proper action by the authorities were delayed. It has become a social problem and will become a political issue later.

The Indian Ocean Dipole (IOD) is the keyword for understanding why such a serious drought has occurred in southeastern Australia. The IOD was discovered in 1999 by Dr. Yamagata, Program Director of the Frontier Research Center for Global Change, JAMSTEC, et al., and was published in Nature.

Moreover, in 2007 they succeeded in predicting the dipole mode phenomenon in the Indian Ocean using the Earth Simulator supercomputer. It was the first time in the world.

How deeply the Australian government understands such research findings, and how far it proceeds with the right actions against the risk of drought, is linked to the reduction of disasters. However, reality has already shown the answer.


I expected an adventurous trip this time, but nothing happened. The only trouble was our flight from Cairns (JetStar, known for its cheaper airfares), which was delayed almost 6 hours and arrived at midnight in Osaka instead of Tokyo.

Feb 9, 2009

NNSA awards IBM Contract to Build Next Generation Supercomputer



On February 3, 2009, the Department of Energy’s National Nuclear Security Administration (NNSA) announced a contract with IBM to bring world-leading supercomputing systems to its Lawrence Livermore National Laboratory to help continue to ensure the safety and reliability of the nation’s aging nuclear deterrent.

IBM will deliver two systems: Sequoia, a 20 PetaFLOPS system based on future Blue Gene technology, to be delivered starting in 2011 and deployed in 2012; and an initial delivery system called Dawn, a 500 TeraFLOPS Blue Gene/P system, scheduled for delivery in the first quarter of 2009. Dawn will lay the applications foundation for multi-PetaFLOPS computing on Sequoia.

Sequoia’s delivery and deployment schedule seems similar to that of the Next Generation Supercomputer R&D Project of Japan, which completes delivery by March 2012 (end of FY2011). Hence the two supercomputer projects will compete with each other in performance and in delivery/deployment schedule.

Sequoia will have 1.6 Petabytes of memory, 96 racks, 98,304 compute nodes, and 1.6 million cores. This means that we will soon enter the era of million-core systems. I cannot predict what will happen from the application code point of view.
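
To get a feeling for the scale, the announced Sequoia figures can be broken down per rack, per node and per core. This is a rough sketch based only on the rounded numbers above; the cores-per-node value is inferred by rounding, not taken from the announcement.

```python
# Rounded figures from the announcement quoted above.
PEAK_PFLOPS = 20
MEMORY_PB = 1.6
RACKS = 96
NODES = 98_304
CORES_APPROX = 1.6e6     # "1.6 million cores" (rounded)

nodes_per_rack = NODES // RACKS

# Inferred (an assumption): round the ratio to a whole number of cores per node.
cores_per_node = round(CORES_APPROX / NODES)
total_cores = cores_per_node * NODES

memory_per_node_gb = MEMORY_PB * 1e6 / NODES
peak_per_node_gflops = PEAK_PFLOPS * 1e6 / NODES
peak_per_core_gflops = peak_per_node_gflops / cores_per_node

print(f"Nodes per rack:      {nodes_per_rack}")
print(f"Cores per node:      {cores_per_node} (inferred)")
print(f"Total cores:         {total_cores:,}")
print(f"Memory per node:     about {memory_per_node_gb:.0f} GB")
print(f"Peak per node:       about {peak_per_node_gflops:.0f} GFLOPS")
print(f"Peak per core:       about {peak_per_core_gflops:.1f} GFLOPS")
```

Each core and each node is modest on its own; the challenge for application codes is keeping more than a million of them busy at once.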

Anyway, the House Democratic leadership released an official stimulus summary on January 15, and it looks great. After that, a Senate deal protected much of the NSF increase in the stimulus, and there are many gaps between the two budget plans, as shown in the Computing Research Policy blog of CRA. From the HPC point of view, we need to watch the progress of the U.S. stimulus plan against the current economic crisis for a while.

The U.S. has been the main driver of HPC from the beginning. It may therefore not be surprising if something unexpected happens in HPC because of the crisis.

Feb 2, 2009

"Cray CX1 Experience Seminar" in Tokyo



A Cray CX1 Experience Seminar was held by Cray Japan on the afternoon of January 27 at Akihabara Convention Hall in Tokyo. The Cray CX1 is a system I have been interested in since I found it placed casually on the floor of Cray's SC08 booth last November.

It was a good opportunity for me to attend this seminar, since I have left IBM and do not belong to any vendor now. (In this industry, vendors usually do not admit competitors to each other's seminars.) Many thanks for the courtesy of Nakano-san, president of Cray Japan, whom I have known for decades.

As shown below, the seminar was organized as an all-in-one introduction to the Cray CX1, probably much like the CX1 itself. It was therefore a fast-paced seminar, but it was easy to follow and I enjoyed it.

The conference room with 130 seats was almost full with attendees. That means that many people had various kinds of interests about the Cray's personal supercomputer with Accelerators.

(Program)
- Trend of HPC (Cray Japan)
- CX1 overview and its advantage (Cray Japan)
- Parallel Computing by CUDA: Prologue (NVIDIA Japan)
- CX1 Demonstration
(1) Employment of SCRYU/Tetra and LS-DYNA with Cray CX1 Job Launcher (Cray Japan)
(2) SCRYU/Tetra Demonstration on Windows HPC Server 2008 (Software Cradle)
(3) LS-DYNA Demonstration (JSOL)
- CPU information and the latest road map (Intel)
- Efforts of Microsoft for HPC (Microsoft)

The Cray CX1 looks like a typical small blade server that can run on 100V power, similar to, for example, the IBM BladeCenter S. Cray provides a compute blade (two models), a storage blade (two models), and a graphics blade that contains an NVIDIA QUADRO FX series or TESLA card. Since a four-blade configuration costs several million yen at the campaign price (according to a Cray brochure handed out at the seminar), the price range is that of a workgroup server. It is unusual for Cray, which has a track record in the supercomputer market at hundreds of millions of yen, to offer products in the workgroup server class, the lowest price range of HPC servers. They are going to enhance the Cray CX1 and follow-on systems in cooperation with Intel, NVIDIA and Microsoft.

According to the talk by NVIDIA, because TESLA is essentially the expensive, professional QUADRO without the graphics function, they offer it at around half the price of QUADRO. (In addition, it is supported by the huge sales volume and revenue of GEFORCE in the consumer market.) Their target for performance improvement is a doubling every 18 months. The NVIDIA speaker said that the double-precision feature had been added last November.

They reported that the number of downloads of CUDA, the application development environment for GPUs, has already exceeded 120,000. The CUDA ZONE web site may give an idea of how active GPU application development has become.

The CX1 demonstration of SCRYU/Tetra by Software Cradle, a well-known CAE software vendor in Japan, was a CFD calculation applied to an intake manifold and an engine cylinder. It scales reasonably well up to eight cores for a 180k-element model and up to 16 cores for models of 1 to 8 million elements. (It is approximately 10 times faster on 16 cores than on 1 core. The performance is not sensitive to the number of elements for cases of 1 million elements or more. InfiniBand is used.)

JSOL (formerly Japan Research Institute Solutions) demonstrated the LS-DYNA standard problems (Neon refined and 3 Vehicle Collision) for automotive crash analysis. Their results showed good scalability up to 32 cores: 26.8 times (IB interconnect) or 23.7 times (GbE interconnect) faster on 32 cores than on a single core.

I wonder why the performance difference is so small between the high-bandwidth, low-latency IB case and the low-speed GbE case. Maybe both codes are already well tuned to minimize communication for moderately parallel computing (I have not confirmed this yet).
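
To put the reported numbers in perspective, parallel efficiency is simply speedup divided by core count, and Amdahl's law, S = 1 / (f + (1 - f) / p), can be inverted to estimate an effective non-parallel fraction f. The sketch below is only arithmetic on the figures quoted above and was not presented at the seminar. Treating communication overhead as if it were a serial fraction is a simplifying assumption, and the SCRYU/Tetra speedup of 10 is the approximate value from the demonstration.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup that was actually achieved."""
    return speedup / cores


def amdahl_serial_fraction(speedup, cores):
    """Invert Amdahl's law S = 1 / (f + (1 - f) / p) to solve for f."""
    return (cores / speedup - 1.0) / (cores - 1.0)


# (speedup, cores) as quoted in the post; labels are for display only.
reported = {
    "LS-DYNA, IB": (26.8, 32),
    "LS-DYNA, GbE": (23.7, 32),
    "SCRYU/Tetra, IB (approx.)": (10.0, 16),
}

if __name__ == "__main__":
    for label, (speedup, cores) in reported.items():
        eff = parallel_efficiency(speedup, cores)
        frac = amdahl_serial_fraction(speedup, cores)
        print(f"{label}: efficiency {eff:.1%}, effective serial fraction {frac:.2%}")
```

Under these assumptions the LS-DYNA runs reach roughly 84% (IB) and 74% (GbE) efficiency on 32 cores, so the interconnect costs about ten points of efficiency at this scale. The gap would be expected to widen at higher core counts.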

If such ISV CAE software could show scalable performance and an acceptable licensing fee structure up to 1,024 cores or more, then, considering the trends in HPC platforms, industrial use of HPC would move to larger systems much faster. If so, many things will change. Do you agree?

Jan 30, 2009

Besides Cell/B.E., the Programming Contest for GPU Just Started in Japan



January 28 was the application deadline for “Cell Challenge 2009” of the Multi-core Programming Contest in Japan.

The specified assignment is "Calculation of Edit Distance between Two Character Strings." The champion will be determined by the sum of the scores in the preliminary rounds and the final rounds. The final rounds will be completed on March 20.
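
For readers unfamiliar with the problem, here is a minimal serial sketch of edit distance. It assumes the common Levenshtein definition, with a cost of 1 for each insertion, deletion and substitution; the contest's exact rules and input format are not given here and may differ. The function name is hypothetical. This is the plain dynamic-programming baseline that a Cell/B.E. or GPU implementation would try to parallelize, not a contest solution.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using two rows of the dynamic-programming table."""
    # Keep the shorter string as the inner dimension to reduce memory use.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

Each cell depends on its left, upper and upper-left neighbors, so cells on the same anti-diagonal are independent of one another. That dependency pattern is what makes the problem an interesting target for multi-core and GPU programming.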

The contest is led by three special interest groups of the Information Processing Society of Japan and is supported by the Kitakyushu Foundation for the Advancement of Industry, Science and Technology (FAIS) and four Cell/B.E. developers (Toshiba, etc.).

“GPU Challenge 2009” newly started on January 21 as a program attached to the Cell Challenge 2009. It is supported by the Global Scientific Information and Computing Center (GSIC), Tokyo Institute of Technology, NVIDIA, and so on.

The specified problem of GPU Challenge 2009 is the same as that of the Cell Challenge 2009. Applicants must run their programs under the CUDA programming environment provided by the GPU Challenge 2009 executive committee, on computers with NVIDIA GPUs (equivalent to Tesla S1070-400) installed at the Global Scientific Information and Computing Center (GSIC), Tokyo Institute of Technology.

The deadlines for application and final program submission are February 13 and March 25 respectively.

In almost the same period, Fixstars, which has been building a track record as a Cell/B.E. total solution company, is running its own Cell programming contest in Japan, called “Hack the Cell 2009”.

Its deadlines for application and final program submission are January 31 and March 6 respectively.

The assignment of Hack the Cell 2009 is "Optimization of Mersenne Twister Random Number Generator."
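
As a reference point for what is being optimized, below is a minimal, unoptimized sketch of the standard 32-bit Mersenne Twister (MT19937). That the contest targets this exact variant is an assumption, since the contest rules are not quoted here. The class and method names are hypothetical, while the constants are those of the published MT19937 algorithm.

```python
class MT19937:
    """Plain reference-style MT19937 generator (32-bit outputs)."""

    N, M = 624, 397
    MATRIX_A = 0x9908B0DF
    UPPER_MASK = 0x80000000
    LOWER_MASK = 0x7FFFFFFF

    def __init__(self, seed=5489):
        # Standard initialization recurrence of the reference implementation.
        self.state = [0] * self.N
        self.state[0] = seed & 0xFFFFFFFF
        for i in range(1, self.N):
            prev = self.state[i - 1]
            self.state[i] = (1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF
        self.index = self.N  # force a twist before the first output

    def _twist(self):
        # Regenerate all 624 state words in place.
        for i in range(self.N):
            y = (self.state[i] & self.UPPER_MASK) | (
                self.state[(i + 1) % self.N] & self.LOWER_MASK
            )
            value = self.state[(i + self.M) % self.N] ^ (y >> 1)
            if y & 1:
                value ^= self.MATRIX_A
            self.state[i] = value
        self.index = 0

    def next_u32(self):
        if self.index >= self.N:
            self._twist()
        y = self.state[self.index]
        self.index += 1
        # Tempering improves the distribution of the output bits.
        y ^= y >> 11
        y ^= (y << 7) & 0x9D2C5680
        y ^= (y << 15) & 0xEFC60000
        y ^= y >> 18
        return y & 0xFFFFFFFF


if __name__ == "__main__":
    generator = MT19937()
    print([generator.next_u32() for _ in range(3)])
```

The twist step touches the whole 624-word state and the tempering step is a handful of independent shifts and masks per output. Both are natural candidates for the SIMD units of the Cell/B.E., which is presumably why this makes a good contest assignment.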

The champion of the student category will be awarded priority for a Fixstars scholarship of 600,000 yen per year and an invitation to a five-day trip to San Francisco.

Fixstars' CTO has posted a message in Japanese on the contest's web site, which reads roughly as follows: "We are clearly short of smart programmers who can get great performance out of current high performance processors like Cell/B.E. Many programmers also remain hidden, without opportunities to demonstrate their abilities. We recognize that this situation is a serious problem, and that it is meaningful to provide opportunities for excellent programmers to be admired."

I imagine most HPC people would agree with his message.

Jan 25, 2009

Potsdam Scientists to Tackle New Type of Weather Simulations with IBM iDataPlex

The Potsdam Institute for Climate Impact Research (PIK)'s new iDataPlex computer was put into operation in January and offers 30 teraflops of processing power as described below.

Potsdam Scientists to Tackle New Type of Weather Simulations with IBM iDataPlex

One of the key reasons that iDataPlex stands apart from other high-performance computer platforms is its energy efficiency, estimated at approximately 230 megaflops of performance per watt.

Part of IBM’s “Big Green” initiative, iDataPlex maximizes performance per watt with innovative cooling techniques such as a rear-door heat exchanger.
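
The two quoted figures, 30 teraflops and approximately 230 megaflops per watt, imply a rough power draw for the system. The sketch below assumes that both figures refer to the same performance measure and the same configuration, which the announcement quoted here does not state. The function name is hypothetical.

```python
def implied_power_kw(flops, flops_per_watt):
    """Power in kilowatts implied by a performance and an efficiency figure."""
    return flops / flops_per_watt / 1e3


if __name__ == "__main__":
    # 30 teraflops at approximately 230 megaflops per watt.
    print(f"{implied_power_kw(30e12, 230e6):.0f} kW")
```

Under that assumption the machine would draw on the order of 130 kW.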

Although IBM iDataPlex deserves to be much better known as one of the smarter commodity-based clusters, we do not often come across a good explanation of it (is that only in Japan?).

I just happened to come across a chatty and interesting article about iDataPlex by Linux Magazine’s HPC editor, Douglas Eadline.

Doug Meets The iDataPlex

He visited Dave Weber, the program director of the Wall Street Center of Excellence, Worldwide Client Centers for IBM, and held a dialogue with Dave about iDataPlex.

First of all, he knew that iDataPlex meant "Large Scale Internet Data Center", and since he is an HPC guy he hoped he was not in the wrong meeting. During the meeting, however, he found that "this was obviously not your typical 1U server node. Indeed, it was almost like someone ask a group of HPC users to design a node.", and he went on to describe his findings about low-powered fans, cabling, the smart combination of commodity parts, energy efficiency, cost performance, TCO, and so on.

Finally Doug commented:
Instead of calling it the “iDataPlex for Web 2.0”, they should have just called it the “iDataPlex for HPC 2.0”!

I share Doug's view completely. IBM should explain the value of iDataPlex for HPC better and more often, as he does, shouldn't they?

Jan 21, 2009

HPC will grow under the Obama Government

The House Appropriations Committee released the bill text and the accompanying committee report "THE AMERICAN RECOVERY AND REINVESTMENT ACT OF 2009 [DISCUSSION DRAFT]" on January 15.

Computing Research Policy Blog quickly analyzed it in

More Detail on 2009 House Dem Stimulus and Recovery Plan (January 15, 2009):
http://www.cra.org/govaffairs/blog/archives/000715.html

The blog concluded that "In summary, though, this looks awfully good to us and will likely go a long way towards recharging the Nation's innovation engine."

The conclusion looks natural, because the report schedules a large amount of additional investment for 2009. For example, the Office of Science in the Department of Energy will see an increase of $2 billion under this plan, NSF will see an increase of $3 billion overall, and NIH will receive $1.5 billion for grants to improve university research facilities and another $1.5 billion in new research funding. These are well-known players driving HPC.

Therefore I believe that the possibility of postponing U.S. petascale computing projects, such as the ongoing Blue Waters Project, because of the U.S. economic crisis has been clearly wiped away.

According to a revised IDC New HPC Market Forecast that I received from Earl Joseph yesterday morning, the new HPC base case forecast predicts a decline in 2008, followed by an additional modest decline in 2009, then a return to growth in 2010 to 2012, resulting in an overall CAGR from 2007 to 2012 of 3.1% in revenues.
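
To see what a 3.1% CAGR over 2007 to 2012 means in total, recall that CAGR = (end / start)^(1 / years) - 1, so the cumulative growth factor is (1 + CAGR)^years. The sketch below uses only the 3.1% figure quoted above. No actual revenue numbers from the IDC forecast are assumed, and the function names are hypothetical.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def cumulative_growth(rate, years):
    """Total growth factor after compounding `rate` for `years` years."""
    return (1.0 + rate) ** years


if __name__ == "__main__":
    factor = cumulative_growth(0.031, 2012 - 2007)
    print(f"2012 revenue is about {factor:.3f} times 2007 revenue")
```

In other words, even with two years of decline, the forecast still puts 2012 revenue roughly 16 to 17 percent above the 2007 level.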

For 2008-2012, although the HPC market looks very severe in major industrial sectors such as U.S. automotive and finance, they assume that the government and academic sectors will be neutral and that the national security, energy, game and digital-content sectors and the petascale initiatives will be accelerators. I think this is reasonably consistent with the Stimulus and Recovery Plan.


While the Obama Government has rapidly begun such a stimulus and recovery plan, our Japanese government is spending its time on childish arguments in the Diet and cannot yet show an aggressive plan, far behind the Obama Government. That is very worrying.

Jan 18, 2009

Japan's Next Generation Supercomputer R&D Budget in FY2009

According to an A-Tip news article (issued by ATIP on 12 January 2009), Japan's Next Generation Supercomputer (NGSC) Project will receive a budget of 19,000 Myen for FY2009. The total plan (FY2006 – FY2012) is expected to reach 115,447 Myen.
The budget was recently sent to the Diet and is presently under discussion.

The contents of the NGSC Project budget in FY2009 are as follows:
- Pilot manufacturing and testing of system : 10,992 Myen
- Grand Challenge Software R&D: 1,877 Myen
- Facility Construction: 6,131 Myen

The Grand Challenge Software R&D Budget includes the following projects:
1. "Next Generation Integrated Nanoscience Simulation Software" Project, managed by the Institute for Molecular Science (IMS) at Okazaki
2. “Next-Generation Integrated Living Matter Simulation” Project, managed by RIKEN Wako Institute.
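
As a quick consistency check on the FY2009 figures above, the three budget items should add up to the 19,000 Myen total, and that total can be compared with the 115,447 Myen plan for FY2006 – FY2012. The sketch uses only the numbers quoted in this post; the variable and function names are hypothetical.

```python
# FY2009 budget items in Myen, as listed in the post.
FY2009_ITEMS = {
    "Pilot manufacturing and testing of system": 10_992,
    "Grand Challenge Software R&D": 1_877,
    "Facility Construction": 6_131,
}
TOTAL_PLAN_MYEN = 115_447  # FY2006 - FY2012


def fy2009_total(items=FY2009_ITEMS):
    """Sum of the FY2009 budget items in Myen."""
    return sum(items.values())


def share_of_total_plan(items=FY2009_ITEMS, total_plan=TOTAL_PLAN_MYEN):
    """FY2009 budget as a fraction of the whole multi-year plan."""
    return fy2009_total(items) / total_plan


if __name__ == "__main__":
    print(f"FY2009 total: {fy2009_total():,} Myen")
    print(f"Share of total plan: {share_of_total_plan():.1%}")
```

The items do sum to exactly 19,000 Myen, which is roughly one sixth of the total plan.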

Jan 10, 2009

Symposiums for HPC in January in Tokyo



・ International Workshop for Peta-scale Application Development Support Environment

The 3rd Advanced Supercomputing Environment (ASE) meeting will be held as an international workshop on peta-scale application development support environments on January 20, 2009 in the large conference room of the Information Technology Center (ITC) of the University of Tokyo.

The invited speaker is Dr. Jonathan Carter (NERSC, Lawrence Berkeley National Laboratory), who will talk on "Optimizing Scientific Applications for Multicore Architectures and T2K Open Supercomputer".

The details of the announcement are shown at
http://nkl.cc.u-tokyo.ac.jp/seminars/0901-ASE003/


・ HPCS2009
High performance computing and computational science symposium 2009


This symposium will be held on January 22 - 23, 2009 at the VLSI Design and Education Center (VDEC) of the University of Tokyo. A discounted registration fee applies until January 15. In particular, students can register free of charge instead of paying the normal registration fee of 17,000 yen.
The organizers must be generous.

The morning session on large-scale applications includes a presentation about Blue Gene/P, which is not heard so often in Japan:
"Optimization of the first principle molecular dynamics software PHASE in Blue Gene/P.", H. Imai and T. Moriyama (IBM Japan)

Dr. Jonathan Carter (NERSC) will deliver the keynote address,
"Optimizing Scientific Applications for Multicore Architectures", on the second day.

The final session will focus on GPUs. The paper that won the symposium's best paper award, "Speed-up by CUDA of the singular value decomposition for the square matrix" by T. Fukaya, Y. Yamamoto (Nagoya University), T. Uneyama and Y. Nakamura (Kyoto University), will be presented in this session. This may imply that interest in and expectations for accelerators are growing in Japan, too.

The details of the symposium are shown in http://www.hpcc.jp/hpcs/

Jan 1, 2009

Happy New Year from Tokyo!




Greetings for the new year!

Hope this year is filled with good times, happiness and success!







(Mt. Fuji from Amagi highland)