Open MPI and Boost MPI using too many file handles
I am running a project using Boost MPI (1.55) over Open MPI (1.6.1) on a compute cluster.
Our cluster has nodes with 64 CPUs each, and we spawn a single MPI process per CPU. Most of our communication is between individual processes: each one keeps a series of irecv() requests open (for different tags), and sends are performed with blocking send.
The problem is that after a short time of processing (usually under 10 minutes), we get this error, which terminates the program:
[btl_tcp_component.c:1114:mca_btl_tcp_component_accept_handler] accept() failed: Too many open files in system (23).

Closer debugging shows that it is network sockets taking up these file handles, and that we are hitting our OS limit of 65536 open files. Most of these sockets are in the TIME_WAIT state, which is apparently what TCP does for 60 seconds after a socket is closed (presumably to catch any late packets). I was under the impression that Open MPI did not close sockets and kept up to N^2 of them open so that every process could talk to every other process. Obviously 65536 is far beyond 64^2 (the most common cause of this error in an MPI context is simply that the file limit is less than N^2), and most of the handles we saw were sockets in a recently closed state.
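For reference, here is a small, Linux-specific sketch of how the system-wide handle count can be checked from code (it reads /proc/sys/fs/file-nr, which reports allocated, unused and maximum handles); it is illustrative only and not part of our actual project code:

// Linux-specific: print the system-wide file handle usage.
// /proc/sys/fs/file-nr contains three numbers: allocated, unused, maximum.
#include <fstream>
#include <iostream>

int main() {
    std::ifstream fnr("/proc/sys/fs/file-nr");
    long allocated = 0, unused = 0, maximum = 0;
    if (fnr >> allocated >> unused >> maximum) {
        std::cout << "file handles: " << allocated << " allocated of "
                  << maximum << " max (" << unused << " allocated but unused)\n";
    }
    return 0;
}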
Our C++ code is too large to include here, but I have written a simplified version of part of it to at least show our implementation and to ask whether there is a problem with our technique. Is there anything in our usage of MPI that would cause Open MPI to close and reopen lots of sockets?
namespace mpi = boost::mpi;
mpi::communicator world;

bool poll(ourDataType data, mpi::request &dataReq, ourDataType2 work, mpi::request workReq)
{
    if (dataReq.test()) {
        processData(data); // do a bunch of work
        dataReq = world.irecv(mpi::any_source, DATATAG, data);
        return true;
    }
    if (workReq.test()) {
        int target = assess(work);
        world.send(target, DATATAG, dowork);
        world.irecv(mpi::any_source, WORKTAG, data);
        return true;
    }
    return false;
}

bool receiveFinish(mpi::request finishReq)
{
    if (finishReq.test()) {
        world.send(0, RESULTS, results);
        resetSelf();
        finishReq = world.irecv(0, FINISH);
        return true;
    }
    return false;
}

void run()
{
    ourDataType data;
    mpi::request dataReq = world.irecv(mpi::any_source, DATATAG, data);
    ourDataType2 work;
    mpi::request workReq = world.irecv(mpi::any_source, WORKTAG, work);
    mpi::request finishReq = world.irecv(0, FINISH); // the root process can call a halt

    while (!receiveFinish(finishReq)) {
        bool doWeContinue = poll(data, dataReq);
        if (doWeContinue) {
            continue;
        }
        // otherwise we do other work
        results = otherwork();
        world.send(0, RESULTS, results);
    }
}
This is probably not the reason for Open MPI opening lots of sockets, but you are leaking requests in the following part of the poll() function:

if (workReq.test()) {
    int target = assess(work);
    world.send(target, DATATAG, dowork);
    world.irecv(mpi::any_source, WORKTAG, data); // <-------
    return true;
}

The request handle returned by world.irecv() is never saved and is therefore lost. Since workReq itself is never reassigned, once the request completes this branch will execute on every subsequent call to poll(), because testing an already completed request always returns true. You therefore end up posting lots of non-blocking receives that are never waited on or tested, not to mention the messages that get sent each time. A similar problem exists in receiveFinish(): finishReq is passed by value, so the assignment inside it will not affect the value in run().

A side note: is this really the code you use? The poll() function you call in run() takes two arguments, while the one shown here takes four, and there are no arguments with default values.
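A minimal sketch of one way to fix both leaks, assuming the question's helpers and tags (assess(), processData(), resetSelf(), dowork, results, DATATAG, WORKTAG, RESULTS, FINISH) and assuming the WORKTAG receive was meant to go into work: store the handle returned by irecv() back into the request object and pass the requests by reference so that run() sees the updated handles:

// Sketch only, not the asker's actual code. Assumes the question's types
// and helpers exist in the surrounding scope.
bool poll(ourDataType &data, mpi::request &dataReq,
          ourDataType2 &work, mpi::request &workReq)   // all by reference
{
    if (dataReq.test()) {
        processData(data);
        dataReq = world.irecv(mpi::any_source, DATATAG, data);  // keep the new handle
        return true;
    }
    if (workReq.test()) {
        int target = assess(work);
        world.send(target, DATATAG, dowork);
        workReq = world.irecv(mpi::any_source, WORKTAG, work);  // keep the new handle
        return true;
    }
    return false;
}

bool receiveFinish(mpi::request &finishReq)   // by reference, not by value
{
    if (finishReq.test()) {
        world.send(0, RESULTS, results);
        resetSelf();
        finishReq = world.irecv(0, FINISH);   // the new handle now reaches run()
        return true;
    }
    return false;
}

run() would then pass all four arguments, e.g. poll(data, dataReq, work, workReq) and receiveFinish(finishReq), so every pending receive has exactly one live request handle that is tested and eventually completed.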