Multiple issues with cppuhelper/source/exc_thrower.cxx

classic Classic list List threaded Threaded
5 messages Options
Bjoern Michaelsen Bjoern Michaelsen
Reply | Threaded
Open this post in threaded view
|

Multiple issues with cppuhelper/source/exc_thrower.cxx

Hi Caolán, all,

when looking into
https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/745836 which
has by now collected some 20 duplicates, I have more open questions
than answers.

Some disassembling lead to the real point of failure being
(in 3.3.2/3.4 on amd64):
239    OSL_ASSERT( xThrower.is() );
240    xThrower->throwException( exc );
   0x000000000001edd3 <+675>: mov    %rbx,%rsi
   0x000000000001edd6 <+678>: mov    (%rdi),%rax
   0x000000000001edd9 <+681>: callq  *0x18(%rax)
241 }

With an eip of 0x100000 (in the i386 bug reports) meaning the call goes
into nirvana.

Digging deeper, I found ExceptionThrower::get() on the 3.4 branch to:
a) not using a static variable for s_pThrower thus making the if always
   eval to true.
b) not even double-checking s_pThrower after aquireing the guard, making
   the whole thing pretty pointless.
c) However, I think even as is, the should not be harmful at least on
   gcc, if http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13684 is really
   fixed (and the testcode in the issue performs good on my machine).
   But what about other compilers?

Caolan has fixed this on master with commit
ure:c51c13ff92adbe1d3f22bee6d907132c48d16602, if it was broken.
Depending on how compilers handle this, it might be worth to cherrypick
to 3-4.

So unless I overlooked something there, the error really is not in the
singleton creation (at least with my gcc), leading to the cause to
possibly be something going wrong in the uno2cpp.mapInterface(..) call
just before. Does that sound plausible? I just want to make sure, before
digging even deeper into the disassembly.

Best,

Bjoern

--
https://launchpad.net/~bjoern-michaelsen


_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Caolán McNamara Caolán McNamara
Reply | Threaded
Open this post in threaded view
|

Re: Multiple issues with cppuhelper/source/exc_thrower.cxx

On Fri, 2011-05-27 at 10:53 +0200, Bjoern Michaelsen wrote:
> Hi Caolán, all,
>
> Digging deeper, I found ExceptionThrower::get() on the 3.4 branch to:
> a) not using a static variable for s_pThrower thus making the if always
>    eval to true.

heh!

> b) not even double-checking s_pThrower after aquireing the guard, making
>    the whole thing pretty pointless.
> c) However, I think even as is, the should not be harmful at least on
>    gcc, if http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13684 is really
>    fixed (and the testcode in the issue performs good on my machine).
>    But what about other compilers?

Well, if we look at the original code, given that s_pThrower is just a
normal local variable set to 0, then we *always* enter the first if, and
the first if always takes a MutexGuard from the global mutex, so we
always lock inside the first branch, which is the only one executed.

So the original code doesn't do what it apparently thought it was doing
of using a double-check lock, but ends up using a unconditional mutex
lock every time, so it should be ok on all platforms/compilers, no ?

> Caolan has fixed this on master with commit
> ure:c51c13ff92adbe1d3f22bee6d907132c48d16602, if it was broken.
> Depending on how compilers handle this, it might be worth to cherrypick
> to 3-4.

I think, given the unconditional mutex lock that it should be ok as it
stands for 3-4.

> So unless I overlooked something there, the error really is not in the
> singleton creation (at least with my gcc)

That seems to be the case. The bug report's stack is from svx
autocorrect, which means that ucbhelper::cancelCommandExecution and
cppu::throwException have successfully thrown exceptions at least a
hundred times or so before the crash, so its not the case that it's e.g.
the first throw or two through the uno bridge.

> leading to the cause to
> possibly be something going wrong in the uno2cpp.mapInterface(..) call
> just before. Does that sound plausible? I just want to make sure, before
> digging even deeper into the disassembly.

Is the crash reproducible for you in any way ? It'd be much more
confident with a reproducible scenario we could run under valgrind.

As an aside, what gcc is in use ? I recall a gcc bug where in a tight
loop each exception thrown incrementally busted unwind info 16bytes at a
time

https://bugzilla.redhat.com/show_bug.cgi?id=447912
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36419

but that's an old long-fixed bug now.

C.

_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Bjoern Michaelsen Bjoern Michaelsen
Reply | Threaded
Open this post in threaded view
|

Re: Multiple issues with cppuhelper/source/exc_thrower.cxx

Hi Caolán,

On Fri, 27 May 2011 11:02:28 +0100
Caolán McNamara <[hidden email]> wrote:

> Well, if we look at the original code, given that s_pThrower is just a
> normal local variable set to 0, then we *always* enter the first if,
> and the first if always takes a MutexGuard from the global mutex, so
> we always lock inside the first branch, which is the only one
> executed.

right.
 
> So the original code doesn't do what it apparently thought it was
> doing of using a double-check lock, but ends up using a unconditional
> mutex lock every time, so it should be ok on all platforms/compilers,
> no ?

If the stuff protected protected by the mutex has no side effects, yes.
If it has, the code would be executed twice -- although neatly
serialized by the mutex. Now gcc with bug 13684 fixed should never init
the static variable twice, so even if the ctor has side effects, it
would not hurt. And the pointer assignment is atomic and has no side
effects.

> I think, given the unconditional mutex lock that it should be ok as it
> stands for 3-4.

Unless a compiler does not keep static variables threadsafe and the
ctor has side effects (I have not looked at that yet).

> Is the crash reproducible for you in any way ? It'd be much more
> confident with a reproducible scenario we could run under valgrind.

Unfortunately not. Various descriptions:
- Returning to a document after 10 hours
- Calc crash, high CPU load from other apps
- crashed while opening one writer and one calc document

Most of these just point to high load -- of course exceptions are more
likely then.
 
> As an aside, what gcc is in use ?

gcc (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2

Best,

Bjoern

--
https://launchpad.net/~bjoern-michaelsen


_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Bjoern Michaelsen Bjoern Michaelsen
Reply | Threaded
Open this post in threaded view
|

Re: Multiple issues with cppuhelper/source/exc_thrower.cxx

On Fri, 27 May 2011 12:34:01 +0200
Bjoern Michaelsen
<[hidden email]> wrote:

> Unless a compiler does not keep static variables threadsafe and the
> ctor has side effects (I have not looked at that yet).

Ah, forget that. The mutex indeed takes care of that.

Best,

Bjoern

--
https://launchpad.net/~bjoern-michaelsen


_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Caolán McNamara Caolán McNamara
Reply | Threaded
Open this post in threaded view
|

Re: Multiple issues with cppuhelper/source/exc_thrower.cxx

In reply to this post by Bjoern Michaelsen
On Fri, 2011-05-27 at 12:34 +0200, Bjoern Michaelsen wrote:

> Hi Caolán,
>
> On Fri, 27 May 2011 11:02:28 +0100
> Caolán McNamara <[hidden email]> wrote:
>  
> > So the original code doesn't do what it apparently thought it was
> > doing of using a double-check lock, but ends up using a unconditional
> > mutex lock every time, so it should be ok on all platforms/compilers,
> > no ?
>
> If the stuff protected protected by the mutex has no side effects, yes.
> If it has, the code would be executed twice -- although neatly
> serialized by the mutex. Now gcc with bug 13684 fixed should never init
> the static variable twice, so even if the ctor has side effects, it
> would not hurt. And the pointer assignment is atomic and has no side
> effects.

My thinking is that the case in gcc#13684 was a bare

static foo;

while in our 3-4 we have

MutexGuard guard(Mutex::getGlobalMutex();
static foo;

The pointer assignment (in 3-4) is irrelevant really. In 3-4 it's all
just...

MutexGuard guard(...)
static ExceptionThrower s_thrower;
return &s_thrower;

so surely the compiler-generated (potentially thread-unsafe) equivalent
to

if (!thrower_initialized)
{
    call local static's s_thrower ctor
    thrower_initialized = true;
}

is protected by the MutexGuard, only one thread can enter and initialize
the local static, the other is blocked even before the check to see if
the local static is initialized, and only gets to have a look at that
when the other thread leaves, so its fine IMO.

Anyway, I think this is an aside, seeing as at the crash location we're
minutes/hours (given that I know that gets called even during startup)
past the time of initialization of that local object.

> Most of these just point to high load -- of course exceptions are more
> likely then.

We had a rash of crash-during-startup with the first call or two through
the uno bridge would blow up, which we could trace down to disk read
failures on the file-system of the mmaped home dir chunk of memory used
by the bridge. Would be awesome to even get a dmesg or /var/log/messages
dump of the minute before/after the time of the logged crash

C.

_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice