Possible concurrency error in Network::connect

Possible concurrency error in Network::connect

Marco Barbosa

Hi,

I believe I may have run into a concurrency error in the Network::connect method. If a program that simply sends a Bottle through a port makes many calls to Network::connect, it eventually deadlocks. I've tested this on two different computers with YARP's head CVS revision and with both ACE 5.6.7 and 5.6.8 (released a couple of days ago). The results were similar on both machines.

I attach example code that demonstrates the problem. Just decompress, compile, and run the "echo" program in one shell and the "producer" in another. "producer" sends Bottles to the "echo" program, which sends them back. Before each write to the port, Network::connect is called to establish the connection between the opened ports. Except for the first call, it is therefore asked to connect ports that are already connected. I know this is not the optimal way of managing connections between ports, but a deadlock is not the behavior one would expect. I would expect the code to work, probably with greater overhead per Bottle because of the repeated attempts to connect already connected ports.


Cheers!

Marco Barbosa

Instituto de Sistemas e Robótica - Lisboa
Instituto Superior Técnico - Torre Norte
Av. Rovisco Pais, 1
1049-001 Lisboa
Portugal


_______________________________________________
Robotcub-hackers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/robotcub-hackers

Attachment: yarp_high_speed_test.tgz (1K)
Re: Possible concurrency error in Network::connect

paulfitz
Administrator
Marco Barbosa wrote:

>
> Hi,
>
> I believe I may have run into a concurrency error in the
> Network::connect method. If many calls to Network::connect are made in
> a program that simply sends a Bottle through a port, it will
> eventually come to a deadlock. I've tested this in two different
> computers with YARP's head cvs revision and with both ACE 5.6.7 and
> 5.6.8 (released a couple of days ago). In both machines the results
> were similar.
>
> I attach example code that demonstrates the problem. Just decompress,
> compile, run the "echo" program in one shell and the "producer" in
> another. "producer" is sending Bottles to the "echo" program which is
> sending them back. Before each write to the port a Network::connect is
> called in order to establish the connection between the opened ports.
> Besides the first time it is run, it is being called to connect
> already connected ports. I know this is not the optimal way of
> managing connections between ports, but the program coming to a
> deadlock is not the behavior one would expect. I would expect the code
> to work, probably with a greater overhead to send the Bottles because
> of the repeated attempts to connect already connected ports.

Hi Marco,

Thanks for the test-case.  I see one possible problem - the first thing
Network::connect does is remove any pre-existing connection between the
source and destination port (since it could be the wrong kind of
connection or in an undesired state).  So in your scenario, a connection
is continually being destroyed and recreated.  As far as YARP is
concerned, this shouldn't be a problem, but in the background a lot of
sockets in the TIME_WAIT state will be building up (it takes time for
TCP sockets to be officially declared dead and reusable).  (If you use a
unixy system, check "netstat" while your programs are running to see
what I mean).
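
For readers who would rather count the TIME_WAIT sockets programmatically than scan "netstat" output, here is a minimal sketch. It is not part of YARP or of the attached test case. It assumes a Linux system, where /proc/net/tcp lists one socket per row and the fourth field ("st") holds the TCP state in hex, with 06 meaning TIME_WAIT. The names SocketTally and tallySockets are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Totals for a socket table formatted like Linux's /proc/net/tcp.
struct SocketTally {
    std::size_t total = 0;     // all socket rows seen
    std::size_t timeWait = 0;  // rows whose state field is 06 (TIME_WAIT)
};

// Reads a table whose first row is a header and whose data rows start with
// "sl local_address rem_address st ...". Rows with fewer than four fields
// are skipped.
SocketTally tallySockets(std::istream& table) {
    SocketTally tally;
    std::string line;
    std::getline(table, line);  // discard the header row
    while (std::getline(table, line)) {
        std::istringstream fields(line);
        std::string slot, local, remote, state;
        if (!(fields >> slot >> local >> remote >> state)) {
            continue;  // blank or truncated row
        }
        ++tally.total;
        if (state == "06") {
            ++tally.timeWait;
        }
    }
    assert(tally.timeWait <= tally.total);
    return tally;
}
```

To use it on a live system, open a std::ifstream on /proc/net/tcp (and /proc/net/tcp6 for IPv6 sockets) and pass it to tallySockets while "echo" and "producer" are running.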

You might want to check you're not hitting a system limit there.  On my
machine (debian unstable) your test case runs to the 10000 loop limit
without a problem, unfortunately, so I can't see what happens.  Do you
get any insight from running your programs as:
  YARP_VERBOSE=1 ./echo
and
  YARP_VERBOSE=1 ./producer
(windows equivalent: set environment variable YARP_VERBOSE to 1)?

Best,
Paul


Re: Possible concurrency error in Network::connect

Marco Barbosa
Hi Paul,

Paul Fitzpatrick wrote:
> Thanks for the test-case.  I see one possible problem - the first thing Network::connect does is remove any pre-existing connection between the source and destination port (since it could be the wrong kind of connection or in an undesired state).  So in your scenario, a connection is continually being destroyed and recreated.  As far as YARP is concerned, this shouldn't be a problem, but in the background a lot of sockets in the TIME_WAIT state will be building up (it takes time for TCP sockets to be officially declared dead and reusable).  (If you use a unixy system, check "netstat" while your programs are running to see what I mean).
> You might want to check you're not hitting a system limit there.  On my machine (debian unstable) your test case runs to the 10000 loop limit without a problem, unfortunately, so I can't see what happens.

Indeed I do get a lot of TIME_WAIT ports, but I don't think I'm hitting any limit. In fact, even though the example loops for 10000 iterations, I never get past the 1000 mark. My current ephemeral port range is:

# cat /proc/sys/net/ipv4/ip_local_port_range
32768   61000

I have witnessed deadlocks with as few as 60 ports in the TIME_WAIT state.
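
To put those numbers side by side, here is a small sketch that computes how many ports the range above contains. It is only an illustration: the function name ephemeralPortCount is hypothetical, and it assumes the range is given as two whitespace-separated integers with inclusive bounds, as printed by /proc/sys/net/ipv4/ip_local_port_range.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns the number of ports in an inclusive range written as "low high",
// or -1 if the text cannot be parsed or the bounds are reversed.
long ephemeralPortCount(const std::string& rangeText) {
    std::istringstream in(rangeText);
    long low = 0;
    long high = 0;
    if (!(in >> low >> high) || low > high) {
        return -1;
    }
    const long count = high - low + 1;
    assert(count >= 1);  // a valid inclusive range holds at least one port
    return count;
}
```

For the range shown, ephemeralPortCount("32768   61000") gives 28233 ports, far more than the roughly 1000 iterations reached or the 60 TIME_WAIT ports seen at the earliest deadlock. This arithmetic alone does not rule out other system limits.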

> Do you get any insight from running your programs as:
>   YARP_VERBOSE=1 ./echo
> and
>   YARP_VERBOSE=1 ./producer
> (windows equivalent: set environment variable YARP_VERBOSE to 1)?

I have no clue what to look for, but one thing I've noticed is that the behavior differs depending on the carrier used. Shmem seems to be the fastest to crash (i.e. it needs the fewest loop iterations), followed by tcp and then udp.

I attach copies of the outputs of both programs using the different carriers.

The shmem case is probably the most interesting since the echo program gets a segfault. Here is the backtrace:

#0  0xb7a61845 in __lll_timedlock_wait () from /lib/libpthread.so.0
#1  0xb7a5db72 in _L_timedlock_132 () from /lib/libpthread.so.0
#2  0xb7a5d322 in pthread_mutex_timedlock () from /lib/libpthread.so.0
#3  0xb7dc3d6b in ACE_OS::mutex_lock () from /usr/lib/libACE.so.5.6.8
#4  0xb7eedffd in ACE_Mutex::acquire (this=0xb61041f8, tv=@0xb6a46e7c) at /usr/include/ace/Mutex.inl:89
#5  0xb7eed9ea in yarp::os::impl::ShmemInputStreamImpl::read (this=0xb61028c0, b=@0xb6a46ee8)
    at /home/mafb/software/yarp2/src/libYARP_OS/src/ShmemInputStream.cpp:194
#6  0xb7efd96a in yarp::os::impl::ShmemHybridStream::read (this=0xb6102828, b=@0xb6a46ee8)
    at /home/mafb/software/yarp2/src/libYARP_OS/include/yarp/os/impl/ShmemHybridStream.h:64
#7  0xb7ec8e2c in yarp::os::impl::InputStream::read (this=0xb610282c, b=@0xb6102608, offset=0, len=8)
    at /home/mafb/software/yarp2/src/libYARP_OS/include/yarp/os/impl/InputStream.h:45
#8  0xb7ecfd49 in yarp::os::impl::NetType::readFull (is=@0xb610282c, b=@0xb6102608)
    at /home/mafb/software/yarp2/src/libYARP_OS/src/NetType.cpp:50
#9  0xb7efbf62 in yarp::os::impl::Protocol::defaultExpectAck (this=0xb61025f0)
    at /home/mafb/software/yarp2/src/libYARP_OS/include/yarp/os/impl/Protocol.h:249
#10 0xb7efc183 in yarp::os::impl::AbstractCarrier::expectAck (this=0xb6102800, proto=@0xb61025f0)
    at /home/mafb/software/yarp2/src/libYARP_OS/include/yarp/os/impl/AbstractCarrier.h:119
#11 0xb7f1c94a in yarp::os::impl::Protocol::expectAck (this=0xb61025f0)
    at /home/mafb/software/yarp2/src/libYARP_OS/include/yarp/os/impl/Protocol.h:241
#12 0xb7f1db94 in yarp::os::impl::Protocol::write (this=0xb61025f0, writer=@0xb6a470d0)
    at /home/mafb/software/yarp2/src/libYARP_OS/include/yarp/os/impl/Protocol.h:429
#13 0xb7f06488 in yarp::os::impl::PortCoreOutputUnit::sendHelper (this=0xb6104220)
    at /home/mafb/software/yarp2/src/libYARP_OS/src/PortCoreOutputUnit.cpp:292
#14 0xb7f075a8 in yarp::os::impl::PortCoreOutputUnit::run (this=0xb6104220)
    at /home/mafb/software/yarp2/src/libYARP_OS/src/PortCoreOutputUnit.cpp:75
#15 0xb7f29a54 in theExecutiveBranch (args=0xb6104220) at /home/mafb/software/yarp2/src/libYARP_OS/src/ThreadImpl.cpp:45
#16 0xb7dc601e in ACE_OS_Thread_Adapter::invoke () from /usr/lib/libACE.so.5.6.8
#17 0xb7d7ee21 in ace_thread_adapter () from /usr/lib/libACE.so.5.6.8
#18 0xb7a5b175 in start_thread () from /lib/libpthread.so.0
#19 0xb7b3edae in clone () from /lib/libc.so.6



Related to these problems may be the fact that the regression tests are failing on my machine: "make test" blocks on the StampTest. I attach the outputs of "harness_os verbose regression StampTest" and "harness_os verbose regression". They may give you a hint of what is going on.

Are there any further tests I can run to better understand the problem?

Cheers,

Marco Barbosa

Instituto de Sistemas e Robótica - Lisboa
Instituto Superior Técnico - Torre Norte
Av. Rovisco Pais, 1
1049-001 Lisboa
Portugal




yarp(b7a736d0): Configuration file: /home/mafb/.yarp/conf/yarp_namespace.conf
yarp(b7a736d0): Configuration file: /home/mafb/.yarp/conf/yarp.conf
yarp(b7a736d0): name server address is tcp://127.0.0.1:10000
yarp(b7a736d0): sending to nameserver: NAME_SERVER register /in
yarp(b7a736d0): Registering tcp://localhost:10002 for /in
yarp(b7a736d0): sending to nameserver: NAME_SERVER set /in offers tcp text text_ack udp mcast shmem name_ser
yarp(b7a736d0): sending to nameserver: NAME_SERVER set /in accepts tcp text text_ack udp mcast shmem name_ser
yarp(b7a736d0): sending to nameserver: NAME_SERVER set /in ips 127.0.0.1 127.0.0.2 10.0.5.28 ::1 2001:690:2100:413:21b:24ff:fed7:b337 fe80::21b:24ff:fed7:b337%2
yarp(b7a736d0): sending to nameserver: NAME_SERVER set /in process 5931
yarp(b7a736d0): TcpFace: opening for address tcp://localhost:10002
yarp(b7a736d0): Child thread initializing
yarp(b7a72b90): Thread starting up
yarp(b7a736d0): Child thread initialized ok
yarp: Port /in active at tcp://localhost:10002
yarp(b7a736d0): sending to nameserver: NAME_SERVER register /out
yarp(b7a736d0): Registering tcp://localhost:10012 for /out
yarp(b7a736d0): sending to nameserver: NAME_SERVER set /out offers tcp text text_ack udp mcast shmem name_ser
yarp(b7a736d0): sending to nameserver: NAME_SERVER set /out accepts tcp text text_ack udp mcast shmem name_ser
yarp(b7a736d0): sending to nameserver: NAME_SERVER set /out ips 127.0.0.1 127.0.0.2 10.0.5.28 ::1 2001:690:2100:413:21b:24ff:fed7:b337 fe80::21b:24ff:fed7:b337%2
yarp(b7a736d0): sending to nameserver: NAME_SERVER set /out process 5931
yarp(b7a736d0): TcpFace: opening for address tcp://localhost:10012
yarp(b7271b90): Thread starting up
yarp(b7a736d0): Child thread initializing
yarp(b7a736d0): Child thread initialized ok
yarp: Port /out active at tcp://localhost:10012
yarp(b7a736d0): sending to nameserver: NAME_SERVER query /out
yarp(b7a736d0): creating a writer buffer
yarp(b7a736d0): /out: set envelope to 55 1.0
0 | StampTest: checking Stamp can serialize ok...
0 | StampTest: checking in text mode
0 | StampTest:   [sequence number write] passed ok
0 | StampTest:   [time stamp write] passed ok
0 | StampTest:   [sequence number read] passed ok
0 | StampTest:   [time stamp read] passed ok
0 | StampTest: checking in binary mode
0 | StampTest:   [sequence number write] passed ok
0 | StampTest:   [time stamp write] passed ok
0 | StampTest:   [sequence number read] passed ok
0 | StampTest:   [time stamp read] passed ok
0 | StampTest: checking envelopes work...
yarp(b7a736d0): Sending an onCompletion message
yarp(b7a736d0): freeing up a writer buffer
yarp(b7a736d0): /out: set envelope to 55 1.0
yarp(b7a736d0): finishing writes
yarp(b7a736d0): finished writes
yarp(b7a736d0): Sending an onCompletion message
yarp(b7a736d0): freeing up a writer buffer

0 | BottleTest: testing clear...
0 | BottleTest:   [size ok] passed ok
0 | BottleTest:   [size ok] passed ok
0 | BottleTest: testing sizes...
0 | BottleTest:   [empty bottle] passed ok
0 | BottleTest:   [add int] passed ok
0 | BottleTest:   [add string] passed ok
0 | BottleTest:   [clear] passed ok
0 | BottleTest: testing string representation...
0 | BottleTest:   [string rep] passed ok
0 | BottleTest:   [return from string rep] passed ok
0 | BottleTest: testing binary representation...
0 | BottleTest:   [type check] passed ok
0 | BottleTest:   [type check] passed ok
0 | BottleTest:   [recovery binary, length] passed ok
0 | BottleTest:   [recovery binary, integer] passed ok
0 | BottleTest:   [recovery binary, integer] passed ok
0 | BottleTest:   [type check] passed ok
0 | BottleTest:   [type check] passed ok
0 | BottleTest:   [player bug] passed ok
0 | BottleTest: testing streaming (just text mode)...
0 | BottleTest:   [to/from stream] passed ok
0 | BottleTest: testing types...
0 | BottleTest:   [hex works] passed ok
0 | BottleTest: check for bottle number 0
0 | BottleTest:   [ints] passed ok
0 | BottleTest:   [doubles] passed ok
0 | BottleTest:   [strings] passed ok
0 | BottleTest:   [arg 0] passed ok
0 | BottleTest:   [arg 1] passed ok
0 | BottleTest:   [arg 2] passed ok
0 | BottleTest:   [arg 3] passed ok
0 | BottleTest:   [arg 4] passed ok
0 | BottleTest: check for bottle number 1
0 | BottleTest:   [ints] passed ok
0 | BottleTest:   [doubles] passed ok
0 | BottleTest:   [strings] passed ok
0 | BottleTest:   [arg 0] passed ok
0 | BottleTest:   [arg 1] passed ok
0 | BottleTest:   [arg 2] passed ok
0 | BottleTest:   [arg 3] passed ok
0 | BottleTest:   [arg 4] passed ok
0 | BottleTest: check for bottle number 2
0 | BottleTest:   [ints] passed ok
0 | BottleTest:   [doubles] passed ok
0 | BottleTest:   [strings] passed ok
0 | BottleTest:   [arg 0] passed ok
0 | BottleTest:   [arg 1] passed ok
0 | BottleTest:   [arg 2] passed ok
0 | BottleTest:   [arg 3] passed ok
0 | BottleTest:   [arg 4] passed ok
0 | BottleTest: testing lists...
0 | BottleTest:   [list test 1] passed ok
0 | BottleTest:   [list test 2] passed ok
0 | BottleTest:   [list test 3] passed ok
0 | BottleTest: bot3 is (1 2) 4
0 | BottleTest:   [construction test 1] passed ok
0 | BottleTest:   [construction test 2] passed ok
0 | BottleTest:   [construction test 3] passed ok
0 | BottleTest:   [construction test 4] passed ok
0 | BottleTest: testing Value interface...
0 | BottleTest:   [type check] passed ok
0 | BottleTest:   [type check] passed ok
0 | BottleTest:   [type check] passed ok
0 | BottleTest:   [can get sublist] passed ok
0 | BottleTest:   [type check] passed ok
0 | BottleTest:   [type check] passed ok
0 | BottleTest:   [null type check] passed ok
0 | BottleTest: testing equality...
0 | BottleTest:   [A!=B] passed ok
0 | BottleTest:   [B!=C] passed ok
0 | BottleTest:   [A==C] passed ok
0 | BottleTest: testing range...
0 | BottleTest:   [subrange] passed ok
0 | BottleTest:   [subrange] passed ok
0 | BottleTest:   [self copy] passed ok
0 | BottleTest: testing find...
0 | BottleTest:   [seek key] passed ok
0 | BottleTest:   [seek key] passed ok
0 | BottleTest:   [seek absent key] passed ok
0 | BottleTest: testing vocab...
0 | BottleTest:   [plausible parse] passed ok
0 | BottleTest:   [vocab present] passed ok
0 | BottleTest:   [vocab match] passed ok
0 | BottleTest: testing blob...
0 | BottleTest:   [plausible parse] passed ok
0 | BottleTest:   [blob present] passed ok
0 | BottleTest:   [blob length] passed ok
0 | BottleTest:   [blob match] passed ok
0 | BottleTest: testing white space behavior...
0 | BottleTest:   [ok with tab] passed ok
0 | BottleTest:   [pre-tab ok] passed ok
0 | BottleTest:   [post-tab ok] passed ok
0 | BottleTest: checking pasa problem with lists missing last element...
0 | BottleTest:   [newline test checks out] passed ok
0 | BottleTest: testing standard compliance...
0 | BottleTest:   [exact number of integers, plus type/count] passed ok
0 | BottleTest:   [nested example] passed ok
0 | BottleTest: testing nesting detection...
0 | Botyarp(b7b976d0): Child thread initializing
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initialized ok
yarp(b7b976d0): Child thread initializing
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7395b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread shutting down
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7395b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread shutting down
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7395b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread shutting down
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7395b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread shutting down
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7395b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread shutting down
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7395b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread shutting down
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7395b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread shutting down
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7395b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Child thread initializing
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initialized ok
yarp(b7b976d0): Child thread initializing
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread shutting down
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Child thread initializing
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initialized ok
yarp(b7b976d0): Child thread initializing
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7395b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread shutting down
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7395b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread shutting down
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b976d0): Child thread initializing
yarp(b6b94b90): Thread starting up
yarp(b7b976d0): Child thread initialized ok
yarp(b7395b90): Thread shutting down
yarp(b6b94b90): Thread shutting down
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
tleTest:   [incomplete] passed ok
0 | BottleTest:   [complete] passed ok
0 | BottleTest: testing reread specialization is not broken...
0 | BottleTest:   [length check] passed ok
0 | BottleTest:   [type check] passed ok
0 | BottleTest:   [content check] passed ok
0 | BottleTest:   [length check] passed ok
0 | BottleTest:   [type check] passed ok
0 | BottleTest:   [content check] passed ok
0 | BottleTest: testing special characters...
0 | BottleTest:   [paths starting with a decimal] passed ok
0 | BottleTest:   [roundtripping quotes newline etc] passed ok
0 | BottleTest: testing append...
0 | BottleTest:   [add two bottles] passed ok
0 | BottleTest: testing stack functionality...
0 | BottleTest:   [popping double] passed ok
0 | BottleTest:   [bottle size decreased after pop] passed ok
0 | BottleTest:   [popping list and nested int] passed ok
0 | BottleTest:   [popping string] passed ok
0 | BottleTest:   [empty bottle pops null] passed ok
0 | BottleTest:   [bottle is empty after popping] passed ok
0 | BottleTest: no problems reported
0 | StringTest: testing null insertion
0 | StringTest:   [length with internal null] passed ok
0 | StringTest:   [null is there] passed ok
0 | StringTest:   [after null] passed ok
0 | StringTest: no problems reported
0 | AddressTest: checking string representation
0 | AddressTest:   [string rep example] passed ok
0 | AddressTest: checking address copy
0 | AddressTest:   [string rep example] passed ok
0 | AddressTest:   [invalid source] passed ok
0 | AddressTest:   [invalid copy] passed ok
0 | AddressTest: checking Contact wrapper
0 | AddressTest:   [good invalid] passed ok
0 | AddressTest:   [invalid source] passed ok
0 | AddressTest:   [invalid conversion] passed ok
0 | AddressTest:   [invalid copy] passed ok
0 | AddressTest: no problems reported
0 | StringInputStreamTest: test reading...
0 | StringInputStreamTest:   [len of first read] passed ok
0 | StringInputStreamTest:   [first read] passed ok
0 | StringInputStreamTest:   [the space] passed ok
0 | StringInputStreamTest:   [len of second read] passed ok
0 | StringInputStreamTest:   [second read] passed ok
0 | StringInputStreamTest: no problems reported
0 | TimeTest: testing delay (there will be a short pause)...
0 | TimeTest: delay was late(+) or early(-) by 0 ms
0 | TimeTest:   [delay for 0.75 seconds] passed ok
0 | TimeTest: no problems reported
0 | PortCommandTest: testing text-mode writing...
0 | PortCommandTest:   [basic data command] passed ok
0 | PortCommandTest:   [connect command] passed ok
0 | PortCommandTest: testing text-mode reading...
0 | PortCommandTest:   [basic data command] passed ok
0 | PortCommandTest: no problems reported
0 | StringOutputStreamTest: testing writing...
0 | StringOutputStreamTest:   [single write] passed ok
0 | StringOutputStreamTest:   [multiple writes] passed ok
0 | StringOutputStreamTest: no problems reported
0 | StreamConnectionReaderTest: testing reading...
0 | StreamConnectionReaderTest:   [one line] passed ok
0 | StreamConnectionReaderTest: no problems reported
0 | BufferedConnectionWriterTest: testing writing...
0 | BufferedConnectionWriterTest:   [two line writes] passed ok
0 | BufferedConnectionWriterTest: no problems reported
0 | ThreadTest: testing minimal thread functions to check for mem leakage...
0 | ThreadTest: ...done
0 | ThreadTest: testing cross-thread synchronization...
0 | ThreadTest: starting threads ...
0 | ThreadTest:   [thread count] passed ok
0 | ThreadTest: ... done threads
0 | ThreadTest:   [thread event counts] passed ok
0 | ThreadTest:   [thread event counts] passed ok
0 | ThreadTest: testing isRunning function
0 | ThreadTest:   [thread is running] passed ok
0 | ThreadTest:   [thread quit] passed ok
0 | ThreadTest: done
0 | ThreadTest: testing start/stop
0 | ThreadTest:   [not active] passed ok
0 | ThreadTest:   [active] passed ok
0 | ThreadTest:   [not active] passed ok
0 | ThreadTest:   [onStop was called] passed ok
0 | ThreadTest: done
0 | ThreadTest: Checking init/release synchronization
0 | ThreadTest: Starting thread... thread will wait 0.5 second
0 | ThreadTest:   [Syarp(b7b96b90): Thread shutting down
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Child thread did not initialize ok
yarp: Child thread did not start: Success
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7b96b90): Thread starting up
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread did not initialize ok
yarp: Child thread did not start: Success
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7b96b90): Thread starting up
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): Child thread initializing
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Child thread initializing
yarp(b7b96b90): Thread starting up
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Child thread did not initialize ok
yarp: Child thread did not start: Success
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Child thread did not initialize ok
yarp: Child thread did not start: Success
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
yarp(b7b96b90): Thread shutting down
yarp(b7b976d0): Thread being deleted
yarp(b7b976d0): TextCarrier::expectSenderSpecifier
yarp(b7b976d0): Sending a message on connection /out->text->/in
yarp(b7b976d0): Registering tcp://localhost:9999 for /foo
yarp(b7b976d0): Configuration file: /home/mafb/.yarp/conf/yarp_namespace.conf
yarp(b7b976d0): Configuration file: /home/mafb/.yarp/conf/yarp.conf
yarp(b7b976d0): name server address is tcp://127.0.0.1:10000
yarp(b7b976d0): sending to nameserver: NAME_SERVER register /foo2 tcp localhost 9999
yarp(b7b976d0): Registering tcp://localhost:9999 for /foo2
yarp(b7b976d0): sending to nameserver: NAME_SERVER set /foo2 offers tcp text text_ack udp mcast shmem name_ser
yarp(b7b976d0): sending to nameserver: NAME_SERVER set /foo2 accepts tcp text text_ack udp mcast shmem name_ser
yarp(b7b976d0): sending to nameserver: NAME_SERVER set /foo2 ips 127.0.0.1 127.0.0.2 10.0.5.28 ::1 2001:690:2100:413:21b:24ff:fed7:b337 fe80::21b:24ff:fed7:b337%2
yarp(b7b976d0): sending to nameserver: NAME_SERVER set /foo2 process 5955
yarp(b7b976d0): sending to nameserver: NAME_SERVER query /foo2
yarp(b7b976d0): sending to nameserver: NAME_SERVER query /bar2
yarp(b7b976d0): sending to nameserver: NAME_SERVER register /foo2 tcp localhost 9999
yarp(b7b976d0): Registering tcp://localhost:9999 for /foo2
yarp(b7b976d0): sending to nameserver: NAME_SERVER set /foo2 offers tcp text text_ack udp mcast shmem name_ser
yarp(b7b976d0): sending to nameserver: NAME_SERVER set /foo2 accepts tcp text text_ack udp mcast shmem name_ser
yarp(b7b976d0): sending to nameserver: NAME_SERVER set /foo2 ips 127.0.0.1 127.0.0.2 10.0.5.28 ::1 2001:690:2100:413:21b:24ff:fed7:b337 fe80::21b:24ff:fed7:b337%2
yarp(b7b976d0): sending to nameserver: NAME_SERVER set /foo2 process 5955
yarp(b7b976d0): sending to nameserver: NAME_SERVER query /junk
yarp(b7b976d0): sending to nameserver: NAME_SERVER query /foo2
yarp(b7b976d0): sending to nameserver: NAME_SERVER query /many/foo/0
yarp(b7b976d0): sending to nameserver: NAME_SERVER query /many/foo/1
yarp(b7b976d0): sending to nameserver: NAME_SERVER query /many/foo/2
yarp(b7b976d0): sending to nameserver: NAME_SERVER query /many/foo/3
yarp(b7b976d0): sending to nameserver: NAME_SERVER query /many/foo/4
yarp(b7b976d0): TcpFace: opening for address tcp://localhost:9999
yarp(b7b96b90): Thread starting up
yarp(b7b976d0): Child thread initializing
yarp(b7b976d0): Child thread initialized ok
tart synchronized on init] passed ok
0 | ThreadTest: Stopping thread... thread will wait 0.5 second
0 | ThreadTest:   [Stop synchronized on release] passed ok
0 | ThreadTest:   [Start synchronized on failed init] passed ok
0 | ThreadTest: done
0 | ThreadTest: Checking init failure/success notification
0 | ThreadTest:   [Thread is running] passed ok
0 | ThreadTest:   [Init success was properly notified] passed ok
0 | ThreadTest:   [Thread is not running] passed ok
0 | ThreadTest:   [Init failure was properly notified] passed ok
0 | ThreadTest: done
0 | ThreadTest: Checking init/release functions are actually called
0 | ThreadTest:   [init/release were called] passed ok
0 | ThreadTest: done
0 | ThreadTest: Checking runnable
0 | ThreadTest: Starting thread
0 | ThreadTest: Stopping thread
0 | ThreadTest:   [threadInit was called] passed ok
0 | ThreadTest:   [afterStart() was called] passed ok
0 | ThreadTest:   [thread main function was executed] passed ok
0 | ThreadTest:   [threadRelease was called] passed ok
0 | ThreadTest: done
0 | ThreadTest: testing running flag (bug #1695724)...
0 | ThreadTest:   [not running before start] passed ok
0 | ThreadTest:   [running after start] passed ok
0 | ThreadTest:   [not running after stop] passed ok
0 | ThreadTest:   [running after start] passed ok
0 | ThreadTest:   [not running after stop] passed ok
0 | ThreadTest:   [not running after thread exits] passed ok
0 | ThreadTest: no problems reported
0 | RateThreadTest: checking init failure/success notification
0 | RateThreadTest:   [thread is running] passed ok
0 | RateThreadTest:   [thread was stopped] passed ok
0 | RateThreadTest:   [init success was properly notified] passed ok
0 | RateThreadTest:   [thread stopped] passed ok
0 | RateThreadTest:   [init failure was properly notified] passed ok
0 | RateThreadTest: done
0 | RateThreadTest: Checking init/release synchronization
0 | RateThreadTest:   [synchronization on init] passed ok
0 | RateThreadTest:   [synchronization on release] passed ok
0 | RateThreadTest:   [synchronization on a failed init] passed ok
0 | RateThreadTest: done
0 | RateThreadTest: Testing runnable
0 | RateThreadTest:   [thread is running] passed ok
0 | RateThreadTest:   [thread was stopped] passed ok
0 | RateThreadTest:   [init was called] passed ok
0 | RateThreadTest:   [afterStart() was called] passed ok
0 | RateThreadTest:   [release was called] passed ok
0 | RateThreadTest: successful
0 | RateThreadTest: testing rate thread precision
0 | RateThreadTest: setting high res scheduler (this affects only windows)
0 | RateThreadTest: Thread1 requested period: 15[ms]
-->Starting rate thread: 15.00[ms]...thread quit
0 | RateThreadTest: Thread1 estimated: 15.16[ms]
0 | RateThreadTest: Thread2 requested period: 10[ms]
-->Starting rate thread: 10.00[ms]...thread quit
0 | RateThreadTest: Thread2 estimated period: 10.16[ms]
0 | RateThreadTest: Thread3 requested period: 1[ms]
-->Starting rate thread: 1.00[ms]...thread quit
0 | RateThreadTest: Thread3 estimated period: 1.16[ms]
0 | RateThreadTest: successful
0 | RateThreadTest: no problems reported
0 | ProtocolTest: trying to send a bottle across a fake stream
0 | ProtocolTest:   [text carrier header] passed ok
0 | ProtocolTest:   [text carrier response] passed ok
0 | ProtocolTest:   [added a bottle] passed ok
0 | ProtocolTest:   [data tag] passed ok
0 | ProtocolTest:   [bottle representation] passed ok
0 | ProtocolTest: no problems reported
0 | NameServerTest: checking register...
0 | NameServerTest:   [recover address] passed ok
0 | NameServerTest:   [machine name matches] passed ok
0 | NameServerTest:   [non-existent address] passed ok
0 | NameServerTest: checking client interface...
0 | NameServerTest:   [recover address] passed ok
0 | NameServerTest:   [machine name matches] passed ok
0 | NameServerTest:   [non-existent address] passed ok
0 | NameServerTest: checking dud connections don't affect memory...
0 | NameServerTest: no problems reported
0 | PortCoreTest: checking start/stop works (requires free port 9999)...
0 | PortCoreTest: there will be a small delay, stress-yarp(b7b976d0): /port: now preparing to shut down port

_______________________________________________
Robotcub-hackers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/robotcub-hackers

Attachment: verbose_outputs.tgz (224K)

Re: Possible concurrency error in Network::connect

paulfitz
Administrator
Marco Barbosa wrote:

>>
>> You might want to check you're not hitting a system limit there.  On
>> my machine (debian unstable) your test case runs to the 10000 loop
>> limit without a problem, unfortunately, so I can't see what happens.
> Indeed I do get a lot of TIME_WAIT ports, but I don't think I'm
> hitting any limit. In fact, even though the example loops for 10000
> iterations, I never get over the 1000 mark. My current ephemeral port
> range is:
>
> # cat /proc/sys/net/ipv4/ip_local_port_range
> 32768   61000
>
> I have witnessed deadlocks with as few as 60 ports in the TIME_WAIT
> state.
>
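
The two numbers discussed above, the ephemeral port range and the count of sockets in TIME_WAIT, can be read together with a short script. This is only a sketch, assuming a Linux /proc filesystem and IPv4 sockets (/proc/net/tcp does not list IPv6 connections); the function names are made up for illustration and are not part of YARP.

```python
from pathlib import Path

# In /proc/net/tcp the "st" column holds the socket state in hex;
# 06 is TCP_TIME_WAIT.
TIME_WAIT = "06"


def parse_port_range(text):
    """Parse the contents of /proc/sys/net/ipv4/ip_local_port_range."""
    low, high = (int(token) for token in text.split())
    return low, high


def count_time_wait(proc_net_tcp_text):
    """Count TIME_WAIT sockets in text formatted like /proc/net/tcp."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 3 and fields[3] == TIME_WAIT:
            count += 1
    return count


def report():
    """Summarise port usage on the local machine (Linux only)."""
    low, high = parse_port_range(
        Path("/proc/sys/net/ipv4/ip_local_port_range").read_text()
    )
    waiting = count_time_wait(Path("/proc/net/tcp").read_text())
    return (
        f"ephemeral ports {low}-{high} ({high - low + 1} total), "
        f"{waiting} sockets in TIME_WAIT"
    )
```

Calling print(report()) while the producer is running shows how close the TIME_WAIT count gets to the size of the range. With the range quoted above there are about 28,000 ephemeral ports, so 60 sockets in TIME_WAIT is far from exhausting them.
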
>>   Do you get any insight from running your programs as:
>>  YARP_VERBOSE=1 ./echo
>> and
>>  YARP_VERBOSE=1 ./producer
>> (windows equivalent: set environment variable YARP_VERBOSE to 1)?
> I have no clue what to look for, but one thing I've noticed is that
> there are different behaviors depending on the carrier used. Shmem
> seems to be the fastest to crash (i.e. less loop iterations needed),
> followed by tcp and then udp.
>
> [snip]
>
> Related to these problems may be the fact that the regression tests
> are failing on my machine. "make test" blocks on the StampTest. I
> attach the outputs of "harness_os verbose regression StampTest" and
> "harness_os verbose regression". They may give you a hint of what is
> going on.
>
> Are there any further tests I can make to better understand the problem?

Hi Marco,

If the regression tests are failing on your machine, we should figure
that out first.  I note that one of the crashes was in a semaphore
operation.  There are a set of users reporting problems of that kind,
when compiling against recent packaged versions of ACE on Ubuntu.

Let's do a quick sanity check.  Can you go to:
  http://eris.liralab.it/download/yarp/regression/
and download and run:
  harness_os_static_20090213

Run it as:
  ./harness_os_static_20090213 regression StampTest
and then if that works:
  ./harness_os_static_20090213 regression

Let me know how it goes.  This program is statically linked against
ACE.  If it works, then we should look more closely at your ACE setup.
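
Since the failure mode here is a test that blocks rather than one that exits with an error, it can help to run the harness under a timeout so that a hang is reported instead of waiting forever. A minimal sketch, assuming the harness signals failure through a nonzero exit status (not verified here); run_with_timeout is a hypothetical helper name:

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and classify the outcome as 'passed', 'failed' or 'hung'."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # if it has not finished within timeout_s seconds.
        result = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "hung"
    # A crash (for example a segfault) gives a negative return code,
    # which is reported as 'failed' as well.
    return "passed" if result.returncode == 0 else "failed"
```

For example, run_with_timeout(["./harness_os_static_20090213", "regression", "StampTest"], 120) distinguishes a blocked StampTest from a failing one; the 120 second limit is an arbitrary choice.
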

There's also a recently reported issue where locale settings can affect
YARP, specifically the representation of the decimal point.  There's a
fix for this, see:

https://sourceforge.net/tracker/index.php?func=detail&aid=2526259&group_id=62418&atid=500492
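
Whether a given locale changes the decimal point can be checked without YARP. The sketch below asks the C library which decimal point it would use under a named locale; decimal_point_for is a hypothetical helper name, and the call raises locale.Error if the named locale is not installed on the machine.

```python
import locale


def decimal_point_for(locale_name):
    """Return the decimal point used for numbers under locale_name.

    An empty string selects the locale from the environment
    (LANG, LC_ALL, LC_NUMERIC).
    """
    saved = locale.setlocale(locale.LC_NUMERIC)  # query current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        return locale.localeconv()["decimal_point"]
    finally:
        # Restore the previous setting so the caller is not affected.
        locale.setlocale(locale.LC_NUMERIC, saved)
```

decimal_point_for("C") returns "."; if decimal_point_for("") returns "," in the shell where the programs are started, numbers written as text may use a comma, which is the kind of mismatch the tracker item describes.
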

Best,
Paul



Re: Possible concurrency error in Network::connect

Marco Barbosa
Paul Fitzpatrick wrote:

> If the regression tests are failing on your machine, we should figure that out first.  I note that one of the crashes was in a semaphore operation.  There are a set of users reporting problems of that kind, when compiling against recent packaged versions of ACE on Ubuntu.
>
> Let's do a quick sanity check.  Can you go to:
>  http://eris.liralab.it/download/yarp/regression/
> and download and run:
>  harness_os_static_20090213
>
> Run it as:
>  ./harness_os_static_20090213 regression StampTest
> and then if that works:
>  ./harness_os_static_20090213 regression

Both of these worked ok.

> Let me know how it goes.  This program is statically linked against ACE.  If it works, then we should look more closely at your ACE setup.

Given that the problem might be in the packaged ACE provided by my distribution (openSUSE), I went on to compile ACE myself. I tried the following versions, and all of them gave some kind of problem when running YARP's tests: 5.6, 5.6.1, 5.6.6, 5.6.8. I noticed different behaviors (i.e. different tests passed or failed) depending on how ACE was compiled (using GNU Autoconf or the "traditional ACE/GNU Configuration"). Also, none of these builds passed all of ACE's own tests.

Finally, since I read a few days ago that someone had succeeded in compiling YARP with ACE 5.5.5, I tried that version, and now all of YARP's tests pass!! :-) (even though some of ACE's tests fail, namely: High_Res_Timer_Test, MEM_Stream_Test, Proactor_Test and Service_Config_Test. Are any of these relevant for YARP?)

Unfortunately, even with this setup, the example code I sent previously continues to fail with all the carriers (tcp, udp and shmem). :-(

Any idea what to test next?

Cheers,

Marco Barbosa

Instituto de Sistemas e Robótica - Lisboa
Instituto Superior Técnico - Torre Norte
Av. Rovisco Pais, 1
1049-001 Lisboa
Portugal




Re: Possible concurrency error in Network::connect

paulfitz
Administrator
Marco Barbosa wrote:

> Paul Fitzpatrick wrote:
>>
>> If the regression tests are failing on your machine, we should figure
>> that out first.  I note that one of the crashes was in a semaphore
>> operation.  There are a set of users reporting problems of that kind,
>> when compiling against recent packaged versions of ACE on Ubuntu.
>>
>> Let's do a quick sanity check.  Can you go to:
>>  http://eris.liralab.it/download/yarp/regression/
>> and download and run:
>>  harness_os_static_20090213
>>
>> Run it as:
>>  ./harness_os_static_20090213 regression StampTest
>> and then if that works:
>>  ./harness_os_static_20090213 regression
>>
> Both of these worked ok.
>
>> Let me know how it goes.  This program is statically linked against
>> ACE.  If it works, then we should look more closely at your ACE setup.
> Given that the problem might be in the packaged ACE provided by my
> distribution (openSUSE), I went on to compile ACE myself. I tried the
> following versions, and all of them gave some kind of problem when
> running YARP's tests: 5.6, 5.6.1, 5.6.6, 5.6.8. I noticed different
> behaviors (i.e. different tests passed or failed) depending on how ACE
> was compiled (using GNU Autoconf or the "traditional ACE/GNU
> Configuration"). Also, none of these builds passed all of ACE's own
> tests.
>
> Finally, since I read a few days ago that someone had succeeded in
> compiling YARP with ACE 5.5.5, I tried that version, and now all of
> YARP's tests pass!! :-) (even though some of ACE's tests fail, namely:
> High_Res_Timer_Test, MEM_Stream_Test, Proactor_Test and
> Service_Config_Test. Are any of these relevant for YARP?)
>
> Unfortunately, even with this setup, the example code I sent
> previously continues to fail with all the carriers (tcp, udp and
> shmem). :-(
>
> Any idea what to test next?

Hi Marco,

I'm surprised you are seeing trouble with such a range of ACE versions.  
Is it possible there are header files from another version of ACE
lurking in the system (/usr/include/ace, /usr/local/include/ace, ...)
while compiling either ACE or YARP?
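
One way to look for stray headers is to scan the usual include directories and report the version each copy declares. This is a sketch under the assumption that every ACE installation ships ace/Version.h with an ACE_VERSION string macro; find_ace_headers is a hypothetical helper name.

```python
import re
from pathlib import Path

# Matches a line such as:  #define ACE_VERSION "5.6.8"
VERSION_RE = re.compile(r'#\s*define\s+ACE_VERSION\s+"([^"]+)"')


def find_ace_headers(include_dirs):
    """Map each ace/Version.h found under include_dirs to its version."""
    found = {}
    for include_dir in include_dirs:
        header = Path(include_dir) / "ace" / "Version.h"
        if header.is_file():
            match = VERSION_RE.search(header.read_text(errors="replace"))
            found[str(header)] = match.group(1) if match else "unknown"
    return found
```

Calling find_ace_headers(["/usr/include", "/usr/local/include", "/opt/ace"]) and getting more than one entry, or an entry whose version differs from the ACE being built against, would point to a mixed installation.
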

As another check to try to pin this down, I've compiled your test case
statically and put it here:

  http://eris.liralab.it/download/yarp/regression/barbosa/

For me, it runs through the 9999 iterations.  If it works for you, then
I strongly suggest you compile ACE with YARP as described here:

  http://eris.liralab.it/wiki/ACE4YARP

If it fails for you, that will be very interesting, and we can then
probe what is different about your environment.

Best,
Paul



Re: Possible concurrency error in Network::connect

Marco Barbosa

On Feb 17, 2009, at 16:46, Paul Fitzpatrick wrote:
>
> I'm surprised you are seeing trouble with such a range of ACE
> versions.  Is it possible there are header files from another
> version of ACE lurking in the system (/usr/include/ace,
> /usr/local/include/ace, ...) while compiling either ACE or YARP?
>

No, I don't think so. I've been compiling ACE and copying it to a
folder I created (/opt/ace), to which I set the ACE_ROOT env.
variable. After compiling YARP against it, I just rm -rf that
folder and do a "make clean && rm CMakeCache.txt" inside YARP's
folder. I believe this doesn't leave any trace of the previous ACE
"installation".
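
After reconfiguring, the freshly generated CMakeCache.txt records which ACE paths CMake actually picked up, so it can be used to confirm that nothing from an earlier installation is still referenced. A minimal sketch that lists cache entries whose name contains ACE; the entry names in the usage note are hypothetical, since the names YARP's build uses are not given in this thread.

```python
import re

# Matches ACE as a whole word within an underscore-separated name,
# e.g. ACE_LIBRARY, but not SPACE or PLACE.
ACE_KEY = re.compile(r"(^|_)ACE(_|$)")


def ace_cache_entries(cache_text):
    """Return {name: value} for ACE-related entries of a CMakeCache.txt."""
    entries = {}
    for line in cache_text.splitlines():
        line = line.strip()
        # Skip blank lines and the two comment styles used in the cache.
        if not line or line.startswith(("#", "//")):
            continue
        key_and_type, sep, value = line.partition("=")
        if not sep:
            continue
        key = key_and_type.partition(":")[0]  # drop the :TYPE suffix
        if ACE_KEY.search(key):
            entries[key] = value
    return entries
```

Running ace_cache_entries(open("CMakeCache.txt").read()) in the YARP build folder should show only paths under the current ACE_ROOT; an entry such as a (hypothetical) ACE_LIBRARY pointing at /usr/lib would mean the packaged ACE is still being used.
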

> As another check to try to pin this down, I've compiled your test  
> case statically and put it here:
>
>  http://eris.liralab.it/download/yarp/regression/barbosa/
>
> If it fails for you, that will be very interesting, and we can then  
> probe what is different about your environment.
>

It does fail! :-O But that probably has to do with me using my own yarp
server. Could you compile the yarp server statically so I can check
whether it has anything to do with the failure?

Cheers,

Marco Barbosa

Instituto de Sistemas e Robótica - Lisboa
Instituto Superior Técnico - Torre Norte
Av. Rovisco Pais, 1
1049-001 Lisboa
Portugal







Re: Possible concurrency error in Network::connect

paulfitz
Administrator

>> As another check to try to pin this down, I've compiled your test
>> case statically and put it here:
>>
>>  http://eris.liralab.it/download/yarp/regression/barbosa/
>>
>> If it fails for you, that will be very interesting, and we can then
>> probe what is different about your environment.
>>
>
> It does fail! :-O

Super!  Now we're getting somewhere...

> But that probably has to do with me using my own yarp server. Could you
> compile the yarp server statically so I can check whether it has
> anything to do with the failure?

You can just use the statically linked harness_os program:

  http://eris.liralab.it/download/yarp/regression/harness_os_static_20090213

Harness_os is a regular "yarp" executable with extra tests linked in.

So:

  ./harness_os_static_20090213 server

Is the same as running "yarp server" (and all the other yarp commands
are available too).

At this stage, Marco, it might be better to file a bug report on the
YARP sourceforge site:

  https://sourceforge.net/tracker/?func=add&group_id=62418&atid=500492

And move further conversation there.  That way people can track the
problem if they are interested, or tune out.

Cheers,
Paul



Re: Possible concurrency error in Network::connect

Marco Barbosa

On Feb 17, 2009, at 17:54, Paul Fitzpatrick wrote:

> At this stage, Marco, it might be better to file a bug report on  
> the YARP sourceforge site:

For those of you interested in following the discussion on this  
problem, it will proceed here:

http://sourceforge.net/tracker/?func=detail&atid=500492&aid=2636658&group_id=62418

Cheers,

Marco Barbosa

Instituto de Sistemas e Robótica - Lisboa
Instituto Superior Técnico - Torre Norte
Av. Rovisco Pais, 1
1049-001 Lisboa
Portugal







Re: Possible concurrency error in Network::connect

Lorenzo Natale-2
Ok. Thanks.

Just to clarify, there were two problems:

- yarp regression tests failing on your system (Linux, which distribution?)
- the concurrency error in Network::connect()

They could be related of course, so I am wondering if you managed to
solve the first one and are just concentrating on the second one (on
which the bug report on sourceforge seems to focus).

I'm asking because by checking the last posts on this it seems that YARP
does not work correctly (regression tests fail) with ACE 5.6, 5.6.1,
5.6.6 and 5.6.8, but it does with ACE 5.5.5 (BTW I'm having similar
problems on Ubuntu after upgrading to 5.6, and I'm now downgrading ACE
to 5.5.5).

Have you guys confirmed that this is so? Or maybe you managed to solve
these problems by compiling ACE in a certain way?

Thanks,
Lorenzo

Marco Barbosa wrote:

> On Feb 17, 2009, at 17:54, Paul Fitzpatrick wrote:
>
>> At this stage, Marco, it might be better to file a bug report on  
>> the YARP sourceforge site:
>
> For those of you interested in following the discussion on this  
> problem, it will proceed here:
>
> http://sourceforge.net/tracker/?func=detail&atid=500492&aid=2636658&group_id=62418
>
> Cheers,
>
> Marco Barbosa
>
> Instituto de Sistemas e Robótica - Lisboa
> Instituto Superior Técnico - Torre Norte
> Av. Rovisco Pais, 1
> 1049-001 Lisboa
> Portugal


--
Istituto Italiano di Tecnologia
Lorenzo Natale, PhD
[hidden email]
via Morego, 30 16163 Genova
Ph: +39 010 71781400
Fax: +39 010 7170817
www.iit.it



Re: Possible concurrency error in Network::connect

Marco Barbosa

On Feb 27, 2009, at 7:58, Lorenzo Natale wrote:

> Ok. Thanks.
>
> Just to clarify, there were two problems:
>
> - yarp regression tests failing on your system (Linux, which  
> distribution?)

openSUSE 11.0 32-bit and openSUSE 11.1 64-bit (the kernel versions  
are on the bug report too).

> - the concurrency error in Network::connect()
>
> They could be related of course, so I am wondering if you managed to
> solve the first one and are just concentrating on the second one (on
> which the bug report on sourceforge seems to focus).
>
> I'm asking because by checking the last posts on this it seems that  
> YARP does not work correctly (regression tests fail) with ACE 5.6,  
> 5.6.1, 5.6.6 and 5.6.8, but it does with ACE 5.5.5 (BTW I'm having  
> similar problems on Ubuntu after upgrading to 5.6, and I'm now  
> downgrading ACE to 5.5.5).
>
> Have you guys confirmed that this is so? Or maybe you managed to
> solve these problems by compiling ACE in a certain way?
>


Yes, by compiling ACE 5.5.5 (using the "traditional configuration"  
and not automake) I was able to get YARP passing all regression tests.

I'm now going to install Debian on a PC to see if I can, just like
Paul, get everything working well. The hope is then to detect some
kind of difference between the systems.


Cheers,

Marco Barbosa

Instituto de Sistemas e Robótica - Lisboa
Instituto Superior Técnico - Torre Norte
Av. Rovisco Pais, 1
1049-001 Lisboa
Portugal





