
Start using OpenMP

Posted 2007-9-14 23:15:45

[This post was last edited by newcfd at 2007/09/14 11:24pm, edit 3]

Install gcc 4.2.1 on Linux. I am using Ubuntu 7.04.
  1. Download 4.2.1 from http://gcc.gnu.org/mirrors.html.
     For C/C++, get gcc-g++-4.2.1.tar.gz and gcc-core-4.2.1.tar.gz
     and save them anywhere you want, for example the Desktop.
  2. gunzip gcc-g++-4.2.1.tar.gz
  3. gunzip gcc-core-4.2.1.tar.gz
  4. tar -xvf gcc-core-4.2.1.tar
  5. tar -xvf gcc-g++-4.2.1.tar
     (both tarballs unpack into the same gcc-4.2.1 directory)
  6. cd gcc-4.2.1
  7. ./configure
  8. make
  9. Log in as root (if you use sudo, you can instead type sudo make install).
  10. make install

[Added by newcfd on 2007-09-14 at 11:20pm]
Now you have gcc 4.2.1 in /usr/local/lib/gcc. Let me know if you have problems. If you use Fortran, you need to download gcc-fortran-4.2.1.tar.gz instead of gcc-g++-4.2.1.tar.gz; however, you still need gcc-core-4.2.1.tar.gz. I do not think it is necessary to download gcc-4.2.1.tar.gz, which covers all languages (Ada, Fortran, C/C++, Java, etc.); you probably will not use all of them.
Original poster | Posted 2007-9-18 00:48:33

Start using OpenMP

[This post was last edited by newcfd at 2007/09/18 01:13am, edit 2]

First example: the file name is testmp.cpp and the working project directory is test3.
#include <omp.h>
#include <math.h>
#include <iostream>
using namespace std;

//#define NUM_THREADS 4

int main()
{
    //omp_set_num_threads(NUM_THREADS);
#pragma omp parallel
    {
        double help;
        int id = omp_get_thread_num();
#pragma omp for
        for ( int i = 0; i < ( int ) 1.0e8; ++i )
        {
            help = pow( 3.4, 5.6 );
            help = pow( 3.4, 5.6 );
            help = pow( 3.4, 5.6 );
            help = pow( 3.4, 5.6 );
            help = pow( 3.4, 5.6 );
            help = pow( 3.4, 5.6 );
            help = pow( 3.4, 5.6 );
            help = pow( 3.4, 5.6 );
            help = pow( 3.4, 5.6 );
            help = pow( 3.4, 5.6 );
            help = pow( 3.4, 5.6 );
        }
        cout << " id = " << id << endl;
    }
    return 0;
}
Original poster | Posted 2007-9-18 00:50:13

Start using OpenMP

Makefile:

#############################################################################
# Makefile for building: test3
# Generated by qmake (1.07a) (Qt 3.3.8) on: Thu Sep 13 10:54:19 2007
# Project:  test3.pro
# Template: app
# Command: $(QMAKE) -o Makefile test3.pro
#############################################################################

####### Compiler, tools and options
CC       = gcc
CXX      = g++
LEX      = flex
YACC     = yacc
CFLAGS   = -pipe -Wall -W -O3 -fopenmp  -DQT_NO_DEBUG -DQT_THREAD_SUPPORT
CXXFLAGS = -pipe -Wall -W -O3 -fopenmp  -DQT_NO_DEBUG -DQT_THREAD_SUPPORT
LEXFLAGS =
YACCFLAGS= -d
INCPATH  = -I/opt/qt/qt-3.3.8/mkspecs/default -I. -I. -I$(QTDIR)/include
LINK     = g++
LFLAGS   =  -Wl,-rpath,$(QTDIR)/lib
LIBS     = $(SUBLIBS) -L$(QTDIR)/lib  -lgomp -L/usr/X11R6/lib -lqt-mt -L/usr/X11R6/lib -lXrandr -ldl -lpthread -lXext -lX11 -lm
AR       = ar cqs
RANLIB   =
MOC      = $(QTDIR)/bin/moc
UIC      = $(QTDIR)/bin/uic
QMAKE    = qmake
TAR      = tar -cf
GZIP     = gzip -9f
COPY     = cp -f
COPY_FILE= $(COPY)
COPY_DIR = $(COPY) -r
INSTALL_FILE= $(COPY_FILE)
INSTALL_DIR = $(COPY_DIR)
DEL_FILE = rm -f
SYMLINK  = ln -sf
DEL_DIR  = rmdir
MOVE     = mv -f
CHK_DIR_EXISTS= test -d
MKDIR    = mkdir -p

####### Output directory
OBJECTS_DIR = ./

####### Files
HEADERS =
SOURCES = testmp.cpp
OBJECTS = testmp.o
FORMS =
UICDECLS =
UICIMPLS =
SRCMOC   =
OBJMOC =
DIST       = test3.pro
QMAKE_TARGET = test3
DESTDIR  =
TARGET   = test3
first: all

####### Implicit rules
.SUFFIXES: .c .o .cpp .cc .cxx .C
.cpp.o:
        $(CXX) -c $(CXXFLAGS) $(INCPATH) -o $@ $<
.cc.o:
        $(CXX) -c $(CXXFLAGS) $(INCPATH) -o $@ $<
.cxx.o:
        $(CXX) -c $(CXXFLAGS) $(INCPATH) -o $@ $<
.C.o:
        $(CXX) -c $(CXXFLAGS) $(INCPATH) -o $@ $<
.c.o:
        $(CC) -c $(CFLAGS) $(INCPATH) -o $@ $<

####### Build rules
all: Makefile $(TARGET)
$(TARGET):  $(UICDECLS) $(OBJECTS) $(OBJMOC)
        $(LINK) $(LFLAGS) -o $(TARGET) $(OBJECTS) $(OBJMOC) $(OBJCOMP) $(LIBS)
mocables: $(SRCMOC)
uicables: $(UICDECLS) $(UICIMPLS)
$(MOC):
        ( cd $(QTDIR)/src/moc && $(MAKE) )
Makefile: test3.pro  /opt/qt/qt-3.3.8/mkspecs/default/qmake.conf /opt/qt/qt-3.3.8/lib/libqt-mt.prl
        $(QMAKE) -o Makefile test3.pro
qmake:
        @$(QMAKE) -o Makefile test3.pro
dist:
        @mkdir -p .tmp/test3 && $(COPY_FILE) --parents $(SOURCES) $(HEADERS) $(FORMS) $(DIST) .tmp/test3/ && ( cd `dirname .tmp/test3` && $(TAR) test3.tar test3 && $(GZIP) test3.tar ) && $(MOVE) `dirname .tmp/test3`/test3.tar.gz . && $(DEL_FILE) -r .tmp/test3
mocclean:
uiclean:
yaccclean:
lexclean:
clean:
        -$(DEL_FILE) $(OBJECTS)
        -$(DEL_FILE) *~ core *.core

####### Sub-libraries
distclean: clean
        -$(DEL_FILE) $(TARGET) $(TARGET)

FORCE:

####### Compile
testmp.o: testmp.cpp

####### Install
install:
uninstall:
Original poster | Posted 2007-9-18 00:56:49

Start using OpenMP

[This post was last edited by newcfd at 2007/09/18 01:04am, edit 2]

The Makefile is generated by Qt (qmake). A few things need to be done in order to compile the code:
  1. Add -fopenmp to CFLAGS and CXXFLAGS; qmake cannot add it for you.
  2. Add -lgomp to LIBS.
  3. export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
     (/usr/local/lib is where the OpenMP runtime libraries live)
  4. export OMP_NUM_THREADS=4
I have an Intel Core 2 Quad E6600 CPU, so I set OMP_NUM_THREADS=4. If you have a dual-core CPU, you can set OMP_NUM_THREADS=2. OpenMP can also be used together with MPI.

[Added by newcfd on 2007-09-18 at 01:01am]
Most Linux distributions ship Qt for free. To generate a Makefile, do the following in your working directory:
1. qmake -project (creates a .pro file)
2. qmake (creates the Makefile)
Then type make to compile your code. To run Qt programs, you need to add the Qt paths, for example:
export QTDIR=/opt/qt/qt-3.3.8
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$QTDIR/lib:/usr/local/lib
Posted 2007-10-29 13:09:45

Start using OpenMP

The OpenMP installation posts are finished; please continue.
Looking forward to the next part, on actually using it.
Original poster | Posted 2007-11-13 02:54:28

Start using OpenMP

Top 15 Mistakes in OpenMP
Correctness Mistakes
   1. Access to shared variables not protected
   2. Use of locks without flush (as of OpenMP 2.5, this is no longer a mistake)
   3. Read of shared variable without obeying the memory model
   4. Forget to mark private variables as such
   5. Use of ordered clause without ordered construct
   6. Declare loop variable in for-construct as shared
   7. Forget to put down for in #pragma omp parallel for
   8. Try to change the number of threads in a parallel region, after it has been started already
   9. omp_unset_lock() called from non-owner thread
  10. Attempt to change loop variable while in #pragma omp for
Performance Mistakes
   1. Use of critical when atomic would be sufficient
   2. Put too much work inside critical region
   3. Use of orphaned construct outside parallel region
   4. Use of unnecessary flush
   5. Use of unnecessary critical
Original poster | Posted 2007-11-13 03:03:05

Start using OpenMP

OpenMP Does Not Scale - Or Does It?
While at the ParCo conference two weeks ago, I had the pleasure to meet Ruud van der Pas again. He is a Senior Staff Engineer at Sun Microsystems and gave a very enlightening talk called Getting OpenMP Up To Speed. What I would like to post about is not the talk itself (although it contains some material that I have wanted to write about here for a long time), but the introduction he used to get our attention. He used an imaginary conversation, which I am reprinting here with his permission. Only one side of the conversation is shown, but it is pretty easy to fill in the other:
    Do you mean you wrote a parallel program, using OpenMP and it doesn’t perform?
    I see. Did you make sure the program was fairly well optimized in sequential mode?
    Oh. You didn’t. By the way, why do you expect the program to scale?
    Oh. You just think it should and used all the cores. Have you estimated the speed up using Amdahl’s Law?
    No, this law is not a new European Union environmental regulation. It is something else.
    I understand. You can’t know everything. Have you at least used a tool to identify the most time consuming parts in your program?
    Oh. You didn’t. You just parallelized all loops in the program. Did you try to avoid parallelizing innermost loops in a loop nest?
    Oh. You didn’t. Did you minimize the number of parallel regions then?
    Oh. You didn’t. It just worked fine the way it was. Did you at least use the nowait clause to minimize the use of barriers?
    Oh. You’ve never heard of a barrier. Might be worth reading up on. Do all processors roughly perform the same amount of work?
    You don’t know, but think it is okay. I hope you’re right. Did you make optimal use of private data, or did you share most of it?
    Oh. You didn’t. Sharing is just easier. I see. You seem to be using a cc-NUMA system. Did you take that into account?
    You’ve never heard of that. That is unfortunate. Could there perhaps be any false sharing affecting performance?
    Oh. Never heard of that either. It may come in handy to learn a little more about both. So, what did you do next to address the performance?
    Switched to MPI. Does that perform better then?
    Oh. You don’t know. You’re still debugging the code.
What a great way to start a talk on performance issues with OpenMP, don’t you think? And he manages to pack some of the most important problems while optimizing not only OpenMP-programs, but all parallel programs into a tiny introduction. At the end of his talk, he continued the imaginary conversation as follows:
    While we’re still waiting for your MPI debug run to finish, I want to ask you whether you found my information useful.
    Yes, it is overwhelming. I know.
    And OpenMP is somewhat obscure in certain areas. I know that as well.
    I understand. You’re not a Computer Scientist and just need to get your scientific research done.
    I agree this is not a good situation, but it is all about Darwin, you know. I’m sorry, it is a tough world out there.
    Oh, your MPI job just finished! Great.
    Your program does not write a file called ‘core’ and it wasn’t there when you started the program?
    You wonder where such a file comes from? Let’s get a big and strong coffee first.
I am sure the MPI crowd doesn’t really approve of this ending, but I found the talk far more entertaining than the usual conference talks. Of course he is teasing, of course he is exaggerating, but that is OK when you are a presenter and want to get your point across. Of course it also helps to put a smile on your face, so your audience knows you are not a die-hard fanatic. :smile:
By the way, this was not really the end of Ruud’s talk, he went on just a tiny bit further to pitch a new book on OpenMP, for which he is an author. Called Using OpenMP, this is a book I have been looking forward to for a while (and not just because the main author, Prof. Barbara Chapman is the second advisor for my thesis). Maybe I can finally add a real recommendation for a book on OpenMP to my list of recommended books on parallel programming.
Original poster | Posted 2007-11-13 03:51:16

Start using OpenMP

The most common mistake: parallelizing loops that are too fine-grained, or that consume a lot of memory bandwidth, so that they contribute negatively to overall scaling. Even very experienced OpenMP programmers make this mistake. To avoid it, programmers must understand the performance and scaling of every parallel region. This is one of the worst mistakes, because people are simply shooting themselves in the foot.
Original poster | Posted 2007-11-13 04:11:29

Start using OpenMP

//first test case: you may not be able to see the advantage of OpenMP. This is the code you should use in sum++ and sum--
#include <iostream>
#include <omp.h>
using namespace std;

#define N 200000000

// N doubles is about 1.6 GB; keep the array out of main's stack frame,
// or the program will crash with a stack overflow
static double a[N];

int main()
{
    double sum = 0.0;
    long i;

    for( i = 0; i < N; ++i )
    {
        a[ i ] = i;
    }

#pragma omp parallel for reduction(+:sum)
    for( i = 0; i < N; ++i )
    {
        sum += a[ i ];
    }
    cout << " sum " << sum << endl;

    return 0;
}