WIP: new VCL scheduler feature branch

classic Classic list List threaded Threaded
9 messages Options
Jan-Marek Glogowski Jan-Marek Glogowski
Reply | Threaded
Open this post in threaded view
|

WIP: new VCL scheduler feature branch

Hi everybody,

finally I have a working solution, which doesn't instantly show bugs on
Linux with KDE4 (feature/new-vcl-scheduler).

This fixes my mail merge performance problem since the 5.0 merge of idle
job handling.

It completely drops the idea of separated idle and timer handling and
drops a lot of special handling and workarounds for problems in the
original code, but keeps the Idle class for convenience. I tried to list
all the revert commit ids and started to test the original bugs.

Everything is just scheduled by priority and Idles now get a very low
one per default, while they previously had the same one, then timers.
Additionally it drops the 1ms lag for tasks, which actually adds up, as
idle tasks weren't instantly scheduled.

I already fixed the two most obvious busy loop bugs in Writer with
workarounds for the general idle job handler and the statistics
collector. Instead of polling for work I would like to switch these to
be enabled on demand, but that'll definitely take more work. Currently
they are converted to timers, which check for work every few seconds,
then should switch to idle - ok 0ms timeouts, until all work is processed.

I tried to simplify the priority groups to make it easier to select the
correct one. There are still some suspicios HIGHEST ones. I had to guess
the new dependencies from the original priority of the idles, but I'm
quite sure there is some missing stuff. I would like to get some kind of
annotations to be able to build a priority tree of all of them for a
better overview.

It would be great to get feedback from people of all other platforms /
VCL backends.

Thanks for your comments and feedback

Jan-Marek

P.S. it adds a VCL python GDB script, which can be used to dump the list
of currently scheduled tasks.

P.P.S. naming is sill a mess. Sometimes it's task, sometimes event. The
scheduler class is actually the base class for the timers and idles and
should be renamed Event or Task.
_______________________________________________
LibreOffice mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/libreoffice
Noel Grandin-2 Noel Grandin-2
Reply | Threaded
Open this post in threaded view
|

Re: WIP: new VCL scheduler feature branch

Nice work!

Some random low priority comments:

In "Just walk the task list once per timeout"
https://cgit.freedesktop.org/libreoffice/core/commit/?h=feature/new-vcl-scheduler&id=435e21fe0330436e76e5e053d5d5d94df734a554
The evaluate_entry label doesn't make the code any easier to read.

In "Reorganize Scheduler priority classes"
https://cgit.freedesktop.org/libreoffice/core/commit/?h=feature/new-vcl-scheduler&id=5e4361e84607fc6d7623b31630505da7c934b945
 (1) you haven't used a consistent mapping from old to new - sometimes LOW maps to HIGH_IDLE, sometimes HIGHEST maps to HIGH_IDLE. Perhaps there is a reason for this?
(2) I would call HIGH_IDLE, either PRE_RESIZE or BEFORE_RESIZE, because it's not any kind of IDLE anymore. I would call DEFAULT_IDLE just IDLE.

In "Handle all main loop and task events"
https://cgit.freedesktop.org/libreoffice/core/commit/?h=feature/new-vcl-scheduler&id=102c41c2e429bee489334361536779aa298bc181
    -    bProcessedEvent = bProcessedEvent || bScheduledEevent;
    +    bProcessedEvent |= bScheduledEevent;

In "Run Idle tasks immediatly"
https://cgit.freedesktop.org/libreoffice/core/commit/?h=feature/new-vcl-scheduler&id=a1ecb280872c5487615558f8d140a380ef3e0d36
It occurs to me that when we get an overload condition, it would be very helpful to dump the names of the current events.

_______________________________________________
LibreOffice mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/libreoffice
Jan-Marek Glogowski Jan-Marek Glogowski
Reply | Threaded
Open this post in threaded view
|

Re: WIP: new VCL scheduler feature branch

Hi Noel,

thanks for the feedback.

Am 17.09.2016 um 13:54 schrieb Noel Grandin:
> Nice work!
>
> Some random low priority comments:
>
> In "Just walk the task list once per timeout"
> https://cgit.freedesktop.org/libreoffice/core/commit/?h=feature/new-vcl-scheduler&id=435e21fe0330436e76e5e053d5d5d94df734a554
> The evaluate_entry label doesn't make the code any easier to read.

So you're suggestion is? Introduce an other level of indention for the
if? The while loop is short enough and the goto labels are just used in
there. If you think it improves readability, I can change it.

> In "Reorganize Scheduler priority classes"
> https://cgit.freedesktop.org/libreoffice/core/commit/?h=feature/new-vcl-scheduler&id=5e4361e84607fc6d7623b31630505da7c934b945
>  (1) you haven't used a consistent mapping from old to new - sometimes
> LOW maps to HIGH_IDLE, sometimes HIGHEST maps to HIGH_IDLE. Perhaps
> there is a reason for this?

From naming and code reading I tried to guess, if the idle should be
processed before drawing. As I wrote I would like to get a comment or
annotation for all non-default priorities, i.e. all calls to "SetPriority".

I have no idea, if my guesses are correct, just that mail merge is much
faster now and LO is still usable while the mail merge runs. Actually
you can see the slowdown of mail merge when doing stuff in LO :-)

> (2) I would call HIGH_IDLE, either PRE_RESIZE or BEFORE_RESIZE, because
> it's not any kind of IDLE anymore. I would call DEFAULT_IDLE just IDLE.

I just took the priority names from glib. I would keep the DEFAULT_IDLE,
or rename DEFAULT to TIMER? OTOH the Scheduler class sets the default
priority...

> In "Handle all main loop and task events"
> https://cgit.freedesktop.org/libreoffice/core/commit/?h=feature/new-vcl-scheduler&id=102c41c2e429bee489334361536779aa298bc181
>     -    bProcessedEvent = bProcessedEvent || bScheduledEevent;
>     +    bProcessedEvent |= bScheduledEevent;

Will do.

> In "Run Idle tasks immediatly"
> https://cgit.freedesktop.org/libreoffice/core/commit/?h=feature/new-vcl-scheduler&id=a1ecb280872c5487615558f8d140a380ef3e0d36
> It occurs to me that when we get an overload condition, it would be very
> helpful to dump the names of the current events.

Dumping the current event would probably not help much. But we could
dump the whole scheduler event state. OTOH you can dump it in GDB with

p *ImplGetSVData()->mpFirstSchedulerData

BTW: there are some LOK-Tests, which generate more then 1000 events. The
1000 is arbitrary from my POV. Probably checking the runtime of
ProcessAllPendingEvents() would be better?

Thanks,

Jan-Marek
_______________________________________________
LibreOffice mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/libreoffice
Noel Grandin-2 Noel Grandin-2
Reply | Threaded
Open this post in threaded view
|

Re: WIP: new VCL scheduler feature branch



On 2016/09/19 2:27 PM, Jan-Marek Glogowski wrote:

>
>> Some random low priority comments:
>>
>> In "Just walk the task list once per timeout"
>> https://cgit.freedesktop.org/libreoffice/core/commit/?h=feature/new-vcl-scheduler&id=435e21fe0330436e76e5e053d5d5d94df734a554
>> The evaluate_entry label doesn't make the code any easier to read.
>
> So you're suggestion is? Introduce an other level of indention for the
> if? The while loop is short enough and the goto labels are just used in
> there. If you think it improves readability, I can change it.
>

No, just inline that single line of code at the evaluate_entry label and then jump to the exit label.
But that's just my taste, if you disagree, feel free to ignore.

>> In "Reorganize Scheduler priority classes"
>> https://cgit.freedesktop.org/libreoffice/core/commit/?h=feature/new-vcl-scheduler&id=5e4361e84607fc6d7623b31630505da7c934b945
>>  (1) you haven't used a consistent mapping from old to new - sometimes
>> LOW maps to HIGH_IDLE, sometimes HIGHEST maps to HIGH_IDLE. Perhaps
>> there is a reason for this?
>
> From naming and code reading I tried to guess, if the idle should be
> processed before drawing. As I wrote I would like to get a comment or
> annotation for all non-default priorities, i.e. all calls to "SetPriority".
>
> I have no idea, if my guesses are correct, just that mail merge is much
> faster now and LO is still usable while the mail merge runs. Actually
> you can see the slowdown of mail merge when doing stuff in LO :-)
>

OK, just checking, because from the commit message it looks like a mechanical change, but then the mapping was not
consistent.

>> (2) I would call HIGH_IDLE, either PRE_RESIZE or BEFORE_RESIZE, because
>> it's not any kind of IDLE anymore. I would call DEFAULT_IDLE just IDLE.
>
> I just took the priority names from glib. I would keep the DEFAULT_IDLE,
> or rename DEFAULT to TIMER? OTOH the Scheduler class sets the default
> priority...
>

Ah, then probably leave them alone, if they map to something we already use.


_______________________________________________
LibreOffice mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/libreoffice
Jan-Marek Glogowski Jan-Marek Glogowski
Reply | Threaded
Open this post in threaded view
|

Merging new VCL scheduler feature branch

In reply to this post by Jan-Marek Glogowski
Hi everybody,

I've updated and fixed the feature branch and I'm confident enough to
merge it :-)

Last build is almost successful
(http://ci.libreoffice.org/job/lo_gerrit/1836/), except that I had three
JUnit failures on Windows for my last three builds, which I couldn't
reproduce locally.

I'll just quote my mail from last month.

Am 16.09.2016 um 23:49 schrieb Jan-Marek Glogowski:

>
> This fixes my mail merge performance problem since the 5.0 merge of idle
> job handling.
>
> It completely drops the idea of separated idle and timer handling and
> drops a lot of special handling and workarounds for problems in the
> original code, but keeps the Idle class for convenience. I tried to list
> all the revert commit ids and started to test the original bugs.
>
> Everything is just scheduled by priority and Idles now get a very low
> one per default, while they previously had the same one then timers.
> Additionally it drops the 1ms lag for tasks, which actually adds up, as
> idle tasks weren't instantly scheduled.
>
> I already fixed the two most obvious busy loop bugs in Writer with
> workarounds for the general idle job handler and the statistics
> collector. Instead of polling for work I would like to switch these to
> be enabled on demand, but that'll definitely take more work. Currently
> they are converted to timers, which check for work every few seconds,
> then should switch to idle - ok 0ms timeouts, until all work is processed.

The last patch is just a hack. Quite probably setting the document to
the busy state should "simply" disable the DocumentTimerManager, also in
regard to SwLayIdle::DoIdleJob / SwLayIdle::SwLayIdle /
SwViewShell::LayoutIdle. This feels duplicated, but seems quite hard to
untangle.

> I tried to simplify the priority groups to make it easier to select the
> correct one. There are still some suspicions HIGHEST ones. I had to guess
> the new dependencies from the original priority of the idles, but I'm
> quite sure there is some missing stuff. I would like to get some kind of
> annotations to be able to build a priority tree of all of them for a
> better overview.
>
> It would be great to get feedback from people of all other platforms /
> VCL backends.
>
> Thanks for your comments and feedback
>
> Jan-Marek
>
> P.S. it adds a VCL python GDB script, which can be used to dump the list
> of currently scheduled tasks.
>
> P.P.S. naming is sill a mess. Sometimes it's task, sometimes event. The
> scheduler class is actually the base class for the timers and idles and
> should be renamed Event or Task.

_______________________________________________
LibreOffice mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/libreoffice
Michael Meeks-5 Michael Meeks-5
Reply | Threaded
Open this post in threaded view
|

Re: Merging new VCL scheduler feature branch

Hi jmux,

On 10/24/2016 03:20 PM, Jan-Marek Glogowski wrote:
> I've updated and fixed the feature branch and I'm confident enough to
> merge it :-)

        First - thanks for grasping the nettle here. There is nothing terribly
beautiful about our old scheduler code, and we need someone brave to
come along and re-work it. This is certainly brave =)

        Overall - there are some good things here; but there are also some bits
I'm not so thrilled with, and since I'm confident that even the
simplest, and most obvious change to this code results in regressions,
the "drastically changes its behaviour" type of patch is going to be
very useful to be able to bisect them out - I would really suggest
re-basing it and cherry-picking the work in pieces.

> I'll just quote my mail from last month.

        Sorry for taking so long to get to review it.

>> It completely drops the idea of separated idle and timer handling and
>> drops a lot of special handling and workarounds for problems in the
>> original code, but keeps the Idle class for convenience.

        I like this piece FWIW. I'd like a much more glib-like main-loop for
what we have here.

        However - what the timer vs. idle separation achieved was to give a
'true idle' sense - ie. some tasks were only performed when there were
no incoming OS events - such as key-events, re-sizing, re-painting etc.
That is really important to be able to handle producer vs. consumer
mis-matches, and to keep rendering and typing crisp when we're loaded.
It is not entirely obvious how that works in the new world but its one
of the most important pieces.

>> P.S. it adds a VCL python GDB script, which can be used to dump the list
>> of currently scheduled tasks.

        This is great =)

>> P.P.S. naming is sill a mess. Sometimes it's task, sometimes event. The
>> scheduler class is actually the base class for the timers and idles and
>> should be renamed Event or Task.

        Sure. I append some more detailed notes on various pieces.

        On windows, with the branch - when I re-size a writer window in master
- after a few seconds of frantic event production - I get the window
failing to respond - there are no re-renders at all.

        Interestingly - I also see a large number of SfxItemDisruptor_Impl
items in the scheduler debug - which is curious. Possibly some problem
in the (legacy) manual linked-list logic ?

        I'd love to see some of the old bugs in this area re-tested - I started
to add a few of them to the tracker:

https://bugs.documentfoundation.org/showdependencytree.cgi?id=103542

        Bug 91727 eg. was a good one to re-test from an idle handling
perspective; but I think there are more of those I've not added yet.

        Have you tested suspend & resume - which (with the associated clock
jumping) can be something of a challenge ?

        Anyhow some more notes appended, I think dripping these in piece by
piece would be great.

        HTH,

                Michael.

* This fragment of another patch turns from a synchronous paint on
  re-size to an async/idle one - is that expected ? and/or why ?
        + would love to have that separated in its own bibisectable
          commit.

diff --git a/vcl/source/window/paint.cxx b/vcl/source/window/paint.cxx
index 2d23020..6eb69af 100644
--- a/vcl/source/window/paint.cxx
+++ b/vcl/source/window/paint.cxx
@@ -670,11 +670,8 @@ IMPL_LINK_NOARG(Window, ImplHandleResizeTimerHdl,
Idle *, void)
     if( mpWindowImpl->mbReallyVisible )
     {
         ImplCallResize();
-        if( mpWindowImpl->mpFrameData->maPaintIdle.IsActive() )
-        {
-            mpWindowImpl->mpFrameData->maPaintIdle.Stop();
-            mpWindowImpl->mpFrameData->maPaintIdle.Invoke( nullptr );
-        }
+        if( !mpWindowImpl->mpFrameData->maPaintIdle.IsActive() )
+            mpWindowImpl->mpFrameData->maPaintIdle.Start();
     }
 }

* void DocumentTimerManager::StartIdling()

        + Not really clear about the switch to a timer here:

@@ -97,7 +100,8 @@ IMPL_LINK( DocumentTimerManager, DoIdleJobs, Idle*,
pIdle, void )
-                pIdle->Start();
+                pTimer->SetTimeout( 2000 );
+                pTimer->Start();

        + Where do these random numbers come from ? why two
          second delay for ~all of the core writer functionality
          layout, spell-checking etc. ?
        + Can we have some 'busy' mode instead - which we can switch
          a document into - when it is undergoing heavy manipulation
          that will disable this work ?
                + I wonder how much of your interactivity problem
                  this would solve by itself as the 1st commit ? =)

* I like this guy:

+    static const SAL_CONSTEXPR sal_uInt64 ImmediateTimeoutMs = 0;
...
-    if ( !nMS )
-        nMS = 1;

        A good place to go =) Then again - we need to skip polling, or
do a non-sleeping poll in several backends in this case - and there was
some frightning OS/X backend comment around this.

* The changes here:
        + void Scheduler::ProcessAllPendingEvents()

        Are quite brave it seems to me; but I like brave =)
                + the 1000 event timeout was something of a sanity check.

* Change Idle to be a Timer subclass
        + Like that.

commit 8dc024db4325d804423355aa3360d5da5fb14a7a
Author: Jan-Marek Glogowski <[hidden email]>
Date:   Tue Sep 13 12:20:45 2016 +0200

    Don't wait in Yield with pending events

    This re-introduces some functionality of commit
      87199d3829257420429057336283c55be6ae7481

commit 06d731428ef6cf93c7333e8228bfb6088853b52f
Author: Luboš Luňák <[hidden email]>
Date:   Sat Jan 31 17:40:48 2015 +0100

    make idle timers actually activate only when idle

        * Worth re-testing bug#91727 ...

        * What is this 'idle' concept - and where do OS
          events fit into it ?

-        pSVData->mpSalTimer->CallCallback( idle );

        * How do we only process 'idle' events when there are
          no more OS events to consume ? ie. we are truly idle ?

        + eg. we want to process as many resize events as possible
                + before we actually do the 'idle' re-paint etc.

commit 5abe75596c3df7fdf100533856361a7ed007f561
Author: Jan-Marek Glogowski <[hidden email]>
Date:   Thu Sep 8 06:55:30 2016 +0200

    Revert all SalYieldResult changes => bool

-SalYieldResult SvpSalInstance::DoYield(bool bWait, bool
bHandleAllCurrentEvents, sal_uLong const nReleased)
+bool SvpSalInstance::DoYield(bool bWait, bool bHandleAllCurrentEvents,
sal_uLong const nReleased)

        * Really dislike that - please just make the enum more
          descriptive to express what is going on.
        * The enum having only two values - but being descriptive
          is good.
        * I can't tell what DoYield returns from the above signature, or
          where to look to find that out.
        * I fear the previous mess of '!' operators, and unclear
          booleans and re-use of this result - which made the
          original code really, really hard to follow (for stupid
          people like me).

* There are going to be big problems here:
        + we are going to need to bisect into this:
        + things like this - need re-basing / squashing out of there:

@@ -279,14 +278,8 @@ void Scheduler::Start()
         return;
     }

-<<<<<<< HEAD
     DBG_TESTSOLARMUTEX();

-    // Mark timer active
-    mbActive = true;
-
-=======
->>>>>>> Use mpSchedulerData for delete and active handling


commit 1478c78dfd60c0c22d3823be2ef1d91d4e615164
Author: Jan-Marek Glogowski <[hidden email]>
Date:   Wed Jul 20 10:54:30 2016 +0200

    tdf#97087 GDB pretty print the Scheduler task list

    In addition to the GDB pretty printer, this annotates a lot more
    Timers and Idles.

        + love that.

--
[hidden email] <><, Pseudo Engineer, itinerant idiot
_______________________________________________
LibreOffice mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/libreoffice
Michael Stahl-2 Michael Stahl-2
Reply | Threaded
Open this post in threaded view
|

Re: Merging new VCL scheduler feature branch

On 27.10.2016 20:39, Michael Meeks wrote:
> Interestingly - I also see a large number of SfxItemDisruptor_Impl
> items in the scheduler debug - which is curious. Possibly some problem
> in the (legacy) manual linked-list logic ?

FYI those are a recent addition from this change which removed some
indirection:

ommit 22c75d86db9351ab271942a755a2a75a76920943
Author:     Michael Stahl <[hidden email]>
AuthorDate: Wed Jul 27 20:55:45 2016 +0200

    sfx2: just use Idle in SfxItemDisruptor_Impl

    This Application::InsertIdleHdl() stuff is ancient and now obsolete.

of course i never checked how many of these would pile up so i hope
"large number" isn't too large to cause new performance issues.

> commit 5abe75596c3df7fdf100533856361a7ed007f561
> Author: Jan-Marek Glogowski <[hidden email]>
> Date:   Thu Sep 8 06:55:30 2016 +0200
>
>     Revert all SalYieldResult changes => bool
>
> -SalYieldResult SvpSalInstance::DoYield(bool bWait, bool
> bHandleAllCurrentEvents, sal_uLong const nReleased)
> +bool SvpSalInstance::DoYield(bool bWait, bool bHandleAllCurrentEvents,
> sal_uLong const nReleased)
>
> * Really dislike that - please just make the enum more
>  descriptive to express what is going on.
> * The enum having only two values - but being descriptive
>  is good.
> * I can't tell what DoYield returns from the above signature, or
>  where to look to find that out.
> * I fear the previous mess of '!' operators, and unclear
>  booleans and re-use of this result - which made the
>  original code really, really hard to follow (for stupid
>  people like me).

i'd like to join your haters-of-meaningless-bools club.

_______________________________________________
LibreOffice mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/libreoffice
Michael Meeks-5 Michael Meeks-5
Reply | Threaded
Open this post in threaded view
|

Re: Merging new VCL scheduler feature branch


On 10/27/2016 07:56 PM, Michael Stahl wrote:
> FYI those are a recent addition from this change which removed some
> indirection:
>
> ommit 22c75d86db9351ab271942a755a2a75a76920943
> Author:     Michael Stahl <[hidden email]>
>     sfx2: just use Idle in SfxItemDisruptor_Impl

        Interesting =) I saw about a dozen+ or so of them in the queue.

        I'd love to replace the sw/ internal idle handling loop with the
standard VCL one too - but that in turn requires really quite some
gradation of priorities - so that we do layout at idle before we do
document statistics, before spell checking, before we do grammar
checking, etc. =) at least if we want to keep the same behavior.

        That rather cuts across the "collapse all priorities" patch FWIW -
personally, I'd prefer them as per glib - separated by some 100's so we
could fine-tune these relative things easily.

        Then again - if the scheudler performs much better after jmux's work -
then we can whack many more, more granular tasks into it - which is
good; but then we will need more granular prioritisation too.

> i'd like to join your haters-of-meaningless-bools club.

        Ah - it's really Tor's club; I'm a late convert - and this particular
piece of code converted me when I did the change ;-)

        ATB,

                Michael.

--
[hidden email] <><, Pseudo Engineer, itinerant idiot
_______________________________________________
LibreOffice mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/libreoffice
Jan-Marek Glogowski Jan-Marek Glogowski
Reply | Threaded
Open this post in threaded view
|

Re: Merging new VCL scheduler feature branch

In reply to this post by Jan-Marek Glogowski
I've rebased the branch to current master and added a documentation
patch on top, containing a README.scheduler.

This lists my "design" ideas, TODO lists, etc.

JMG

_______________________________________________
LibreOffice mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/libreoffice