Topics

CI failures due to leaked work on the gui thread

Balazs Racz
 

Hey,

I've been trying to track down some flakiness in the CI runs that are plaguing our developer productivity. I came across a problem that needs a bit more expertise in java/swing and I'm looking for ideas how to identify culprits.

We often have code like
new org.netbeans.jemmy.QueueTool().waitEmpty(100);
to wait for work on the GUI thread to acquiesce. This is great as a synchronization mechanism to avoid race conditions due to the multiple threads progressing in a way that is unpredictable from the main thread.

This call itself will detect work on the GUI thread that is left over from running other tests. These are leaked objects that keep doing work on the GUI thread, and can interfere with the running of the existing test in arbitrary ways (famous example would be a ConcurrentModificationException) in addition to making the above statement be really unpredictable.

I made this mistake of leaking work to the GUI thread in my code (it was a real bug), and after identifying and fixing it I have committed a regression test
Now this regression test fails one out of four times on CI, signaling that there is some work on the GUI thread that *other* tests leaked.

I need ideas on how to figure out 
- what work is done on the GUI thread
- which test it comes from
so we can go and fix it.

A great way to leak work onto the GUI thread is to have a javax.swing.Timer (in code or test) that is not stopped before the test exits.

thanks
Balazs


Bob Jacobsen
 

I think this is an important thing to get a handle on. Thanks for looking into it.

I’ve got comments on two aspects. First, on use of waitEmpty(100):

When a developer is running locally, it’s very hard for the Swing queue to stay empty: Lots of non-test-related things can add to the queue, so you have to run tests with hands off and fingers crossed. Even on the person-less CI servers, things can happen that add to the queue: javax.swing.Timer objects “tick” occasionally, etc. Making all that more robust would be good.

Reading the Jemmy source code (not an entirely reliable activity…), it seems that waitEmpty(100) requires the queue to stay empty for 100msec, while waitEmpty() requires the queue to be instantaneously empty. In many cases, tests could use waitEmpty() and become more robust. But waitEmpty() might not always be sufficient if enough work has to be moved through the event queue. (Even more robust would be to use waitFor(…) on something that’s supposed to happen, which both avoids Swing subtleties and turns the test into a Real Test that actually checks some computation)

Second, I also agree that we should get better at cleaning up the code between tests. Occasionally, people will semi-systematically try to find threads and Timers to kill, but it’s hard to keep up on that.

Again reading the Jemmy code, it _looks_ like it should be getting and displaying the head item on the Swing queue when waitEmpty fails (by timing out), but I don’t see anything like that in the logs. Maybe logging output is not being properly setup for our Jemmy use?

If we could get that to work, it would provide some hints. Maybe that’d be enough to track down the problems sequentially.

Another (slow!) approach is to add a waitEmpty(100) check to the shutDown call that’s at the end of (almost all) tests, the run the scripts/run_tests_separately test. Since it’s only partially effective, it’d take some runs to track down where the all the issues are, but it should eventually flag the file and perhaps even tests that are leaving things behind.

Bob

On Dec 16, 2018, at 12:27 AM, Balazs Racz <balazs.racz@...> wrote:

Hey,

I've been trying to track down some flakiness in the CI runs that are plaguing our developer productivity. I came across a problem that needs a bit more expertise in java/swing and I'm looking for ideas how to identify culprits.

We often have code like
new org.netbeans.jemmy.QueueTool().waitEmpty(100);
to wait for work on the GUI thread to acquiesce. This is great as a synchronization mechanism to avoid race conditions due to the multiple threads progressing in a way that is unpredictable from the main thread.

This call itself will detect work on the GUI thread that is left over from running other tests. These are leaked objects that keep doing work on the GUI thread, and can interfere with the running of the existing test in arbitrary ways (famous example would be a ConcurrentModificationException) in addition to making the above statement be really unpredictable.

I made this mistake of leaking work to the GUI thread in my code (it was a real bug), and after identifying and fixing it I have committed a regression test.
Now this regression test fails one out of four times on CI, signaling that there is some work on the GUI thread that *other* tests leaked.

I need ideas on how to figure out
- what work is done on the GUI thread
- which test it comes from
so we can go and fix it.

A great way to leak work onto the GUI thread is to have a javax.swing.Timer (in code or test) that is not stopped before the test exits.

thanks
Balazs
--
Bob Jacobsen
@BobJacobsen

Bob Jacobsen
 

I checked the following with waitEmpty() in PR #6309. It’s passed five Travis runs at this point, so it’s showing a much-better success rate.

Bob

On Dec 16, 2018, at 10:43 AM, Bob Jacobsen via Groups.Io <rgj1927=gmail.com@groups.io> wrote:

it seems that waitEmpty(100) requires the queue to stay empty for 100msec, while waitEmpty() requires the queue to be instantaneously empty. In many cases, tests could use waitEmpty() and become more robust.
--
Bob Jacobsen
@BobJacobsen

Paul Bender
 

This is somewhat consistent with the last time we discussed this.


Perhaps we have cleaned up a few more timers since then.

Paul


On Dec 16, 2018, at 2:17 PM, Bob Jacobsen <rgj1927@...> wrote:

I checked the following with waitEmpty() in PR #6309.  It’s passed five Travis runs at this point, so it’s showing a much-better success rate.

Bob

On Dec 16, 2018, at 10:43 AM, Bob Jacobsen via Groups.Io <rgj1927@...> wrote:

it seems that waitEmpty(100) requires the queue to stay empty for 100msec, while waitEmpty() requires the queue to be instantaneously empty.  In many cases, tests could use waitEmpty() and become more robust.

--
Bob Jacobsen
rgj1927@...






Bob Jacobsen
 

Found one!

I put some test code in the JUnitUtil-tearDown method to look for events on the AWT queue, then used the run_tests_separately script. It found a consistent case in jmri/jmrit/conditional/ConditionalEditBaseTest, see below (two separate runs).

They’re slightly different, but it seems that the test is ending while windows are still changing (with the AWT LOST_FOCUS events), which is somehow triggering a JMRI-specific event and then further JMRI activity. The last FOCUS_GAINED event can (but don’t always) end up being quite late .

This doesn’t entirely explain why waitEmpty(200) occasionally fails; I need better instrumentation for that, will work on it as time is available, maybe later this week.

In the below case, I’d appreciate any suggestions on how to find what’s going on. I don’t know much about the Conditional GUI code, and I don’t see what jmri.beans.Bean$$Lambda$8/191382150@4c40b76e is (though I suspect it’s Timer-related, I don’t see why it’s a jmri.beans.Beans lambda)

Bob


jmri/jmrit/conditional/ConditionalEditBaseTest
JUnit version 4.12
.Using org.netbeans.jemmy.drivers.APIDriverInstaller driver installer
entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: java.awt.event.InvocationEvent[INVOCATION_DEFAULT,runnable=jmri.beans.Bean$$Lambda$8/191382150@4c40b76e,notifier=null,catchExceptions=false,when=1545062209603] on sun.lwawt.macosx.LWCToolkit@72181dfc
.entry: java.awt.FocusEvent[FOCUS_GAINED,permanent,opposite=javax.swing.JTabbedPane[,0,0,389x304,invalid,layout=com.apple.laf.AquaTabbedPaneUI$AquaTruncatingTabbedPaneLayout,alignmentX=0.0,alignmentY=0.0,border=,flags=16777568,maximumSize=,minimumSize=,preferredSize=,haveRegistered=false,tabPlacement=TOP],cause=TRAVERSAL_FORWARD] on jmri.jmrit.picker.PickListModel$1[,0,0,364x144,invalid,alignmentX=0.0,alignmentY=0.0,border=,flags=251658568,maximumSize=,minimumSize=,preferredSize=,autoCreateColumnsFromModel=false,autoResizeMode=AUTO_RESIZE_SUBSEQUENT_COLUMNS,cellSelectionEnabled=false,editingColumn=-1,editingRow=-1,gridColor=javax.swing.plaf.ColorUIResource[r=255,g=255,b=255],preferredViewportSize=java.awt.Dimension[width=250,height=112],rowHeight=16,rowMargin=1,rowSelectionAllowed=true,selectionBackground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=0,g=0,b=217],selectionForeground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=255,g=255,b=255],showHorizontalLines=true,showVerticalLines=true]
..
Time: 3.748




jmri/jmrit/conditional/ConditionalEditBaseTest
JUnit version 4.12
.Using org.netbeans.jemmy.drivers.APIDriverInstaller driver installer
entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: java.awt.event.InvocationEvent[INVOCATION_DEFAULT,runnable=jmri.beans.Bean$$Lambda$8/191382150@8b96fde,notifier=null,catchExceptions=false,when=1545062327033] on sun.lwawt.macosx.LWCToolkit@fa6a01b
..entry: java.awt.FocusEvent[FOCUS_GAINED,permanent,opposite=javax.swing.JTabbedPane[,0,0,389x304,invalid,layout=com.apple.laf.AquaTabbedPaneUI$AquaTruncatingTabbedPaneLayout,alignmentX=0.0,alignmentY=0.0,border=,flags=16777568,maximumSize=,minimumSize=,preferredSize=,haveRegistered=false,tabPlacement=TOP],cause=TRAVERSAL_FORWARD] on jmri.jmrit.picker.PickListModel$1[,0,0,364x144,invalid,alignmentX=0.0,alignmentY=0.0,border=,flags=251658568,maximumSize=,minimumSize=,preferredSize=,autoCreateColumnsFromModel=false,autoResizeMode=AUTO_RESIZE_SUBSEQUENT_COLUMNS,cellSelectionEnabled=false,editingColumn=-1,editingRow=-1,gridColor=javax.swing.plaf.ColorUIResource[r=255,g=255,b=255],preferredViewportSize=java.awt.Dimension[width=250,height=112],rowHeight=16,rowMargin=1,rowSelectionAllowed=true,selectionBackground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=0,g=0,b=217],selectionForeground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=255,g=255,b=255],showHorizontalLines=true,showVerticalLines=true]
..
Time: 3.729



--
Bob Jacobsen
@BobJacobsen

Balazs Racz
 

I have 45 :)

Jemmy exit wait failed for test apps.FileLocationPaneTest
Jemmy exit wait failed for test jmri.implementation.QuadOutputSignalHeadTest
Jemmy exit wait failed for test jmri.implementation.SingleTurnoutSignalHeadTest
Jemmy exit wait failed for test jmri.implementation.DoubleTurnoutSignalHeadTest
Jemmy exit wait failed for test jmri.jmrit.MemoryContentsTest
Jemmy exit wait failed for test jmri.jmrit.audio.swing.AudioSourceFrameTest
Jemmy exit wait failed for test jmri.jmrit.beantable.LogixTableActionTest
Jemmy exit wait failed for test jmri.jmrit.beantable.usermessagepreferences.UserMessagePreferencesPaneTest
Jemmy exit wait failed for test jmri.jmrit.conditional.ConditionalTreeEditTest
Jemmy exit wait failed for test apps.LaunchJmriAppBase
Jemmy exit wait failed for test jmri.jmrit.display.AnalogClock2DisplayTest
Jemmy exit wait failed for test jmri.jmrit.consisttool.ConsistToolFrameTest
Jemmy exit wait failed for test jmri.jmrit.display.SignalHeadIconTest
Jemmy exit wait failed for test jmri.jmrit.display.IndicatorTrackIconTest
Jemmy exit wait failed for test jmri.jmrit.display.controlPanelEditor.ControlPanelEditorTest
Jemmy exit wait failed for test apps.LaunchJmriAppBase
Jemmy exit wait failed for test apps.LaunchJmriAppBase
Jemmy exit wait failed for test jmri.jmrit.logix.OBlockTest
Jemmy exit wait failed for test jmri.jmrit.operations.automation.actions.SelectTrainActionTest
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.display.layoutEditor.LayoutEditorTest
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.simpleclock.configurexml.SimpleTimebaseXmlTest
Jemmy exit wait failed for test jmri.jmrit.roster.RosterEntryTest
Jemmy exit wait failed for test jmri.jmrit.simpleclock.SimpleTimebaseTest
Jemmy exit wait failed for test jmri.jmrit.symbolicprog.CVNameComparatorTest
Jemmy exit wait failed for test jmri.jmrit.symbolicprog.ComboOffRadioButtonTest
Jemmy exit wait failed for test jmri.jmrit.ussctc.SignalHeadSectionTest
Jemmy exit wait failed for test jmri.jmrix.acela.AcelaLightManagerTest
Jemmy exit wait failed for test jmri.jmrix.acela.AcelaMenuTest
Jemmy exit wait failed for test jmri.jmrix.bachrus.GraphPaneTest
Jemmy exit wait failed for test jmri.jmrix.anyma.UsbLightManagerTest
Jemmy exit wait failed for test jmri.jmrix.can.cbus.CbusThrottleTest
Jemmy exit wait failed for test jmri.jmrix.dcc4pc.swing.monitor.Dcc4PcMonPaneTest
Jemmy exit wait failed for test jmri.jmrix.dccpp.swing.mon.DCCppMonPaneTest

I got this by a slightly aggressive test where in TearDown I look for 500 msec of emptiness on the event thread and if it doesn't happen in 1 sec, I alert.

There are a lot of UI tests here, so I think I would need to understand in a bit more detail what's going on with the frames for these failures. The test was run on Xvfb, so no input events, but if a window is left open that could be receiving focus events from other tests running.

However, the non-UI tests that are on this list are definitely suspicious. For example everything in jmti.implementation: the SignalHead tests are probably leaking a swing timer for the blinking appearances.

How did you get the list of events that do happen on the event thread with the nice printout?

thanks
Balazs


On Mon, Dec 17, 2018 at 5:11 PM Bob Jacobsen <rgj1927@...> wrote:
Found one!

I put some test code in the JUnitUtil-tearDown method to look for events on the AWT queue, then used the run_tests_separately script.  It found a consistent case in jmri/jmrit/conditional/ConditionalEditBaseTest, see below (two separate runs).

They’re slightly different, but it seems that the test is ending while windows are still changing (with the AWT LOST_FOCUS events), which is somehow triggering a JMRI-specific event and then further JMRI activity.  The last FOCUS_GAINED event can (but don’t always) end up being quite late .

This doesn’t entirely explain why waitEmpty(200) occasionally fails; I need better instrumentation for that, will work on it as time is available, maybe later this week. 

In the below case, I’d appreciate any suggestions on how to find what’s going on.  I don’t know much about the Conditional GUI code, and I don’t see what jmri.beans.Bean$$Lambda$8/191382150@4c40b76e is (though I suspect it’s Timer-related, I don’t see why it’s a jmri.beans.Beans lambda)

Bob


jmri/jmrit/conditional/ConditionalEditBaseTest
JUnit version 4.12
.Using org.netbeans.jemmy.drivers.APIDriverInstaller driver installer
entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: java.awt.event.InvocationEvent[INVOCATION_DEFAULT,runnable=jmri.beans.Bean$$Lambda$8/191382150@4c40b76e,notifier=null,catchExceptions=false,when=1545062209603] on sun.lwawt.macosx.LWCToolkit@72181dfc
.entry: java.awt.FocusEvent[FOCUS_GAINED,permanent,opposite=javax.swing.JTabbedPane[,0,0,389x304,invalid,layout=com.apple.laf.AquaTabbedPaneUI$AquaTruncatingTabbedPaneLayout,alignmentX=0.0,alignmentY=0.0,border=,flags=16777568,maximumSize=,minimumSize=,preferredSize=,haveRegistered=false,tabPlacement=TOP],cause=TRAVERSAL_FORWARD] on jmri.jmrit.picker.PickListModel$1[,0,0,364x144,invalid,alignmentX=0.0,alignmentY=0.0,border=,flags=251658568,maximumSize=,minimumSize=,preferredSize=,autoCreateColumnsFromModel=false,autoResizeMode=AUTO_RESIZE_SUBSEQUENT_COLUMNS,cellSelectionEnabled=false,editingColumn=-1,editingRow=-1,gridColor=javax.swing.plaf.ColorUIResource[r=255,g=255,b=255],preferredViewportSize=java.awt.Dimension[width=250,height=112],rowHeight=16,rowMargin=1,rowSelectionAllowed=true,selectionBackground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=0,g=0,b=217],selectionForeground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=255,g=255,b=255],showHorizontalLines=true,showVerticalLines=true]
..
Time: 3.748




jmri/jmrit/conditional/ConditionalEditBaseTest
JUnit version 4.12
.Using org.netbeans.jemmy.drivers.APIDriverInstaller driver installer
entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: java.awt.event.InvocationEvent[INVOCATION_DEFAULT,runnable=jmri.beans.Bean$$Lambda$8/191382150@8b96fde,notifier=null,catchExceptions=false,when=1545062327033] on sun.lwawt.macosx.LWCToolkit@fa6a01b
..entry: java.awt.FocusEvent[FOCUS_GAINED,permanent,opposite=javax.swing.JTabbedPane[,0,0,389x304,invalid,layout=com.apple.laf.AquaTabbedPaneUI$AquaTruncatingTabbedPaneLayout,alignmentX=0.0,alignmentY=0.0,border=,flags=16777568,maximumSize=,minimumSize=,preferredSize=,haveRegistered=false,tabPlacement=TOP],cause=TRAVERSAL_FORWARD] on jmri.jmrit.picker.PickListModel$1[,0,0,364x144,invalid,alignmentX=0.0,alignmentY=0.0,border=,flags=251658568,maximumSize=,minimumSize=,preferredSize=,autoCreateColumnsFromModel=false,autoResizeMode=AUTO_RESIZE_SUBSEQUENT_COLUMNS,cellSelectionEnabled=false,editingColumn=-1,editingRow=-1,gridColor=javax.swing.plaf.ColorUIResource[r=255,g=255,b=255],preferredViewportSize=java.awt.Dimension[width=250,height=112],rowHeight=16,rowMargin=1,rowSelectionAllowed=true,selectionBackground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=0,g=0,b=217],selectionForeground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=255,g=255,b=255],showHorizontalLines=true,showVerticalLines=true]
..
Time: 3.729



--
Bob Jacobsen
rgj1927@...






Balazs Racz
 



On Mon, Dec 17, 2018 at 5:11 PM Bob Jacobsen <rgj1927@...> wrote:
Found one!
[snip]
 
and I don’t see what jmri.beans.Bean$$Lambda$8/191382150@4c40b76e is

lambdas in jmri.Bean are from the various firePropertyChange functions, where the property change listeners of the bean properties are executed on the GUI thread. So this could come from any jmri class calling a firePropertyChange function.
 

(though I suspect it’s Timer-related, I don’t see why it’s a jmri.beans.Beans lambda)

Bob Jacobsen
 

On Dec 17, 2018, at 8:24 AM, Balazs Racz <balazs.racz@...> wrote:

How did you get the list of events that do happen on the event thread with the nice printout?
Pretty brute-force. I inserted this:

Object entry = java.awt.Toolkit.getDefaultToolkit().getSystemEventQueue().peekEvent();
if (entry != null) System.err.println("entry: "+entry);

Bob

--
Bob Jacobsen
@BobJacobsen

Dave Sand
 

Both of these examples are using the PickListModel in conjunction with a JTabbedPane.  This implies that testPickListTables is the current test.

I can re-create the symptoms using the tearDown change.  If I add new EventTool().waitNoEvent(10); before the JUnitUtil.dispose(frame); the symptom does not appear.

Dave Sand



On Dec 17, 2018, at 10:11 AM, Bob Jacobsen <rgj1927@...> wrote:

Found one!

I put some test code in the JUnitUtil-tearDown method to look for events on the AWT queue, then used the run_tests_separately script.  It found a consistent case in jmri/jmrit/conditional/ConditionalEditBaseTest, see below (two separate runs).

They’re slightly different, but it seems that the test is ending while windows are still changing (with the AWT LOST_FOCUS events), which is somehow triggering a JMRI-specific event and then further JMRI activity.  The last FOCUS_GAINED event can (but don’t always) end up being quite late .

This doesn’t entirely explain why waitEmpty(200) occasionally fails; I need better instrumentation for that, will work on it as time is available, maybe later this week.

In the below case, I’d appreciate any suggestions on how to find what’s going on.  I don’t know much about the Conditional GUI code, and I don’t see what jmri.beans.Bean$$Lambda$8/191382150@4c40b76e is (though I suspect it’s Timer-related, I don’t see why it’s a jmri.beans.Beans lambda)

Bob


jmri/jmrit/conditional/ConditionalEditBaseTest
JUnit version 4.12
.Using org.netbeans.jemmy.drivers.APIDriverInstaller driver installer
entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: java.awt.event.InvocationEvent[INVOCATION_DEFAULT,runnable=jmri.beans.Bean$$Lambda$8/191382150@4c40b76e,notifier=null,catchExceptions=false,when=1545062209603] on sun.lwawt.macosx.LWCToolkit@72181dfc
.entry: java.awt.FocusEvent[FOCUS_GAINED,permanent,opposite=javax.swing.JTabbedPane[,0,0,389x304,invalid,layout=com.apple.laf.AquaTabbedPaneUI$AquaTruncatingTabbedPaneLayout,alignmentX=0.0,alignmentY=0.0,border=,flags=16777568,maximumSize=,minimumSize=,preferredSize=,haveRegistered=false,tabPlacement=TOP],cause=TRAVERSAL_FORWARD] on jmri.jmrit.picker.PickListModel$1[,0,0,364x144,invalid,alignmentX=0.0,alignmentY=0.0,border=,flags=251658568,maximumSize=,minimumSize=,preferredSize=,autoCreateColumnsFromModel=false,autoResizeMode=AUTO_RESIZE_SUBSEQUENT_COLUMNS,cellSelectionEnabled=false,editingColumn=-1,editingRow=-1,gridColor=javax.swing.plaf.ColorUIResource[r=255,g=255,b=255],preferredViewportSize=java.awt.Dimension[width=250,height=112],rowHeight=16,rowMargin=1,rowSelectionAllowed=true,selectionBackground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=0,g=0,b=217],selectionForeground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=255,g=255,b=255],showHorizontalLines=true,showVerticalLines=true]
..
Time: 3.748




jmri/jmrit/conditional/ConditionalEditBaseTest
JUnit version 4.12
.Using org.netbeans.jemmy.drivers.APIDriverInstaller driver installer
entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: java.awt.event.InvocationEvent[INVOCATION_DEFAULT,runnable=jmri.beans.Bean$$Lambda$8/191382150@8b96fde,notifier=null,catchExceptions=false,when=1545062327033] on sun.lwawt.macosx.LWCToolkit@fa6a01b
..entry: java.awt.FocusEvent[FOCUS_GAINED,permanent,opposite=javax.swing.JTabbedPane[,0,0,389x304,invalid,layout=com.apple.laf.AquaTabbedPaneUI$AquaTruncatingTabbedPaneLayout,alignmentX=0.0,alignmentY=0.0,border=,flags=16777568,maximumSize=,minimumSize=,preferredSize=,haveRegistered=false,tabPlacement=TOP],cause=TRAVERSAL_FORWARD] on jmri.jmrit.picker.PickListModel$1[,0,0,364x144,invalid,alignmentX=0.0,alignmentY=0.0,border=,flags=251658568,maximumSize=,minimumSize=,preferredSize=,autoCreateColumnsFromModel=false,autoResizeMode=AUTO_RESIZE_SUBSEQUENT_COLUMNS,cellSelectionEnabled=false,editingColumn=-1,editingRow=-1,gridColor=javax.swing.plaf.ColorUIResource[r=255,g=255,b=255],preferredViewportSize=java.awt.Dimension[width=250,height=112],rowHeight=16,rowMargin=1,rowSelectionAllowed=true,selectionBackground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=0,g=0,b=217],selectionForeground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=255,g=255,b=255],showHorizontalLines=true,showVerticalLines=true]
..
Time: 3.729



--
Bob Jacobsen
rgj1927@...






Dan Boudreau
 

I’ll investigate the operations gui thread “failures”.

 

Thanks for doing this.

 

Dan

 


From: jmri@jmri-developers.groups.io <jmri@jmri-developers.groups.io> on behalf of Balazs Racz <balazs.racz@...>
Sent: Monday, December 17, 2018 11:24:27 AM
To: jmri@jmri-developers.groups.io
Subject: Re: [jmri-developers] CI failures due to leaked work on the gui thread
 
I have 45 :)

Jemmy exit wait failed for test apps.FileLocationPaneTest
Jemmy exit wait failed for test jmri.implementation.QuadOutputSignalHeadTest
Jemmy exit wait failed for test jmri.implementation.SingleTurnoutSignalHeadTest
Jemmy exit wait failed for test jmri.implementation.DoubleTurnoutSignalHeadTest
Jemmy exit wait failed for test jmri.jmrit.MemoryContentsTest
Jemmy exit wait failed for test jmri.jmrit.audio.swing.AudioSourceFrameTest
Jemmy exit wait failed for test jmri.jmrit.beantable.LogixTableActionTest
Jemmy exit wait failed for test jmri.jmrit.beantable.usermessagepreferences.UserMessagePreferencesPaneTest
Jemmy exit wait failed for test jmri.jmrit.conditional.ConditionalTreeEditTest
Jemmy exit wait failed for test apps.LaunchJmriAppBase
Jemmy exit wait failed for test jmri.jmrit.display.AnalogClock2DisplayTest
Jemmy exit wait failed for test jmri.jmrit.consisttool.ConsistToolFrameTest
Jemmy exit wait failed for test jmri.jmrit.display.SignalHeadIconTest
Jemmy exit wait failed for test jmri.jmrit.display.IndicatorTrackIconTest
Jemmy exit wait failed for test jmri.jmrit.display.controlPanelEditor.ControlPanelEditorTest
Jemmy exit wait failed for test apps.LaunchJmriAppBase
Jemmy exit wait failed for test apps.LaunchJmriAppBase
Jemmy exit wait failed for test jmri.jmrit.logix.OBlockTest
Jemmy exit wait failed for test jmri.jmrit.operations.automation.actions.SelectTrainActionTest
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.display.layoutEditor.LayoutEditorTest
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase
Jemmy exit wait failed for test jmri.jmrit.simpleclock.configurexml.SimpleTimebaseXmlTest
Jemmy exit wait failed for test jmri.jmrit.roster.RosterEntryTest
Jemmy exit wait failed for test jmri.jmrit.simpleclock.SimpleTimebaseTest
Jemmy exit wait failed for test jmri.jmrit.symbolicprog.CVNameComparatorTest
Jemmy exit wait failed for test jmri.jmrit.symbolicprog.ComboOffRadioButtonTest
Jemmy exit wait failed for test jmri.jmrit.ussctc.SignalHeadSectionTest
Jemmy exit wait failed for test jmri.jmrix.acela.AcelaLightManagerTest
Jemmy exit wait failed for test jmri.jmrix.acela.AcelaMenuTest
Jemmy exit wait failed for test jmri.jmrix.bachrus.GraphPaneTest
Jemmy exit wait failed for test jmri.jmrix.anyma.UsbLightManagerTest
Jemmy exit wait failed for test jmri.jmrix.can.cbus.CbusThrottleTest
Jemmy exit wait failed for test jmri.jmrix.dcc4pc.swing.monitor.Dcc4PcMonPaneTest
Jemmy exit wait failed for test jmri.jmrix.dccpp.swing.mon.DCCppMonPaneTest

I got this by a slightly aggressive test where in TearDown I look for 500 msec of emptiness on the event thread and if it doesn't happen in 1 sec, I alert.

There are a lot of UI tests here, so I think I would need to understand in a bit more detail what's going on with the frames for these failures. The test was run on Xvfb, so no input events, but if a window is left open that could be receiving focus events from other tests running.

However, the non-UI tests that are on this list are definitely suspicious. For example everything in jmti.implementation: the SignalHead tests are probably leaking a swing timer for the blinking appearances.

How did you get the list of events that do happen on the event thread with the nice printout?

thanks
Balazs


On Mon, Dec 17, 2018 at 5:11 PM Bob Jacobsen <rgj1927@...> wrote:
Found one!

I put some test code in the JUnitUtil-tearDown method to look for events on the AWT queue, then used the run_tests_separately script.  It found a consistent case in jmri/jmrit/conditional/ConditionalEditBaseTest, see below (two separate runs).

They’re slightly different, but it seems that the test is ending while windows are still changing (with the AWT LOST_FOCUS events), which is somehow triggering a JMRI-specific event and then further JMRI activity.  The last FOCUS_GAINED event can (but don’t always) end up being quite late .

This doesn’t entirely explain why waitEmpty(200) occasionally fails; I need better instrumentation for that, will work on it as time is available, maybe later this week. 

In the below case, I’d appreciate any suggestions on how to find what’s going on.  I don’t know much about the Conditional GUI code, and I don’t see what jmri.beans.Bean$$Lambda$8/191382150@4c40b76e is (though I suspect it’s Timer-related, I don’t see why it’s a jmri.beans.Beans lambda)

Bob


jmri/jmrit/conditional/ConditionalEditBaseTest
JUnit version 4.12
.Using org.netbeans.jemmy.drivers.APIDriverInstaller driver installer
entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: java.awt.event.InvocationEvent[INVOCATION_DEFAULT,runnable=jmri.beans.Bean$$Lambda$8/191382150@4c40b76e,notifier=null,catchExceptions=false,when=1545062209603] on sun.lwawt.macosx.LWCToolkit@72181dfc
.entry: java.awt.FocusEvent[FOCUS_GAINED,permanent,opposite=javax.swing.JTabbedPane[,0,0,389x304,invalid,layout=com.apple.laf.AquaTabbedPaneUI$AquaTruncatingTabbedPaneLayout,alignmentX=0.0,alignmentY=0.0,border=,flags=16777568,maximumSize=,minimumSize=,preferredSize=,haveRegistered=false,tabPlacement=TOP],cause=TRAVERSAL_FORWARD] on jmri.jmrit.picker.PickListModel$1[,0,0,364x144,invalid,alignmentX=0.0,alignmentY=0.0,border=,flags=251658568,maximumSize=,minimumSize=,preferredSize=,autoCreateColumnsFromModel=false,autoResizeMode=AUTO_RESIZE_SUBSEQUENT_COLUMNS,cellSelectionEnabled=false,editingColumn=-1,editingRow=-1,gridColor=javax.swing.plaf.ColorUIResource[r=255,g=255,b=255],preferredViewportSize=java.awt.Dimension[width=250,height=112],rowHeight=16,rowMargin=1,rowSelectionAllowed=true,selectionBackground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=0,g=0,b=217],selectionForeground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=255,g=255,b=255],showHorizontalLines=true,showVerticalLines=true]
..
Time: 3.748




jmri/jmrit/conditional/ConditionalEditBaseTest
JUnit version 4.12
.Using org.netbeans.jemmy.drivers.APIDriverInstaller driver installer
entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: java.awt.event.InvocationEvent[INVOCATION_DEFAULT,runnable=jmri.beans.Bean$$Lambda$8/191382150@8b96fde,notifier=null,catchExceptions=false,when=1545062327033] on sun.lwawt.macosx.LWCToolkit@fa6a01b
..entry: java.awt.FocusEvent[FOCUS_GAINED,permanent,opposite=javax.swing.JTabbedPane[,0,0,389x304,invalid,layout=com.apple.laf.AquaTabbedPaneUI$AquaTruncatingTabbedPaneLayout,alignmentX=0.0,alignmentY=0.0,border=,flags=16777568,maximumSize=,minimumSize=,preferredSize=,haveRegistered=false,tabPlacement=TOP],cause=TRAVERSAL_FORWARD] on jmri.jmrit.picker.PickListModel$1[,0,0,364x144,invalid,alignmentX=0.0,alignmentY=0.0,border=,flags=251658568,maximumSize=,minimumSize=,preferredSize=,autoCreateColumnsFromModel=false,autoResizeMode=AUTO_RESIZE_SUBSEQUENT_COLUMNS,cellSelectionEnabled=false,editingColumn=-1,editingRow=-1,gridColor=javax.swing.plaf.ColorUIResource[r=255,g=255,b=255],preferredViewportSize=java.awt.Dimension[width=250,height=112],rowHeight=16,rowMargin=1,rowSelectionAllowed=true,selectionBackground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=0,g=0,b=217],selectionForeground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=255,g=255,b=255],showHorizontalLines=true,showVerticalLines=true]
..
Time: 3.729



--
Bob Jacobsen
rgj1927@...






Dan Boudreau
 

I found that I had to increase the wait time for “emptiness” from 1 sec to 2 seconds when testing on my IDE or I would experience random test failures.  I believe I was able to fix all of the operations thread exit failures.  See “Fix thread queue and test cleanup #6324”.

 

Dan

 

From: jmri@jmri-developers.groups.io [mailto:jmri@jmri-developers.groups.io] On Behalf Of Balazs Racz
Sent: Monday, December 17, 2018 11:24 AM
To: jmri@jmri-developers.groups.io
Subject: Re: [jmri-developers] CI failures due to leaked work on the gui thread

 

I have 45 :)

 

Jemmy exit wait failed for test apps.FileLocationPaneTest

Jemmy exit wait failed for test jmri.implementation.QuadOutputSignalHeadTest

Jemmy exit wait failed for test jmri.implementation.SingleTurnoutSignalHeadTest

Jemmy exit wait failed for test jmri.implementation.DoubleTurnoutSignalHeadTest

Jemmy exit wait failed for test jmri.jmrit.MemoryContentsTest

Jemmy exit wait failed for test jmri.jmrit.audio.swing.AudioSourceFrameTest

Jemmy exit wait failed for test jmri.jmrit.beantable.LogixTableActionTest

Jemmy exit wait failed for test jmri.jmrit.beantable.usermessagepreferences.UserMessagePreferencesPaneTest

Jemmy exit wait failed for test jmri.jmrit.conditional.ConditionalTreeEditTest

Jemmy exit wait failed for test apps.LaunchJmriAppBase

Jemmy exit wait failed for test jmri.jmrit.display.AnalogClock2DisplayTest

Jemmy exit wait failed for test jmri.jmrit.consisttool.ConsistToolFrameTest

Jemmy exit wait failed for test jmri.jmrit.display.SignalHeadIconTest

Jemmy exit wait failed for test jmri.jmrit.display.IndicatorTrackIconTest

Jemmy exit wait failed for test jmri.jmrit.display.controlPanelEditor.ControlPanelEditorTest

Jemmy exit wait failed for test apps.LaunchJmriAppBase

Jemmy exit wait failed for test apps.LaunchJmriAppBase

Jemmy exit wait failed for test jmri.jmrit.logix.OBlockTest

Jemmy exit wait failed for test jmri.jmrit.operations.automation.actions.SelectTrainActionTest

Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase

Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase

Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase

Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase

Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase

Jemmy exit wait failed for test jmri.jmrit.display.layoutEditor.LayoutEditorTest

Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase

Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase

Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase

Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase

Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase

Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase

Jemmy exit wait failed for test jmri.jmrit.operations.OperationsTestCase

Jemmy exit wait failed for test jmri.jmrit.simpleclock.configurexml.SimpleTimebaseXmlTest

Jemmy exit wait failed for test jmri.jmrit.roster.RosterEntryTest

Jemmy exit wait failed for test jmri.jmrit.simpleclock.SimpleTimebaseTest

Jemmy exit wait failed for test jmri.jmrit.symbolicprog.CVNameComparatorTest

Jemmy exit wait failed for test jmri.jmrit.symbolicprog.ComboOffRadioButtonTest

Jemmy exit wait failed for test jmri.jmrit.ussctc.SignalHeadSectionTest

Jemmy exit wait failed for test jmri.jmrix.acela.AcelaLightManagerTest

Jemmy exit wait failed for test jmri.jmrix.acela.AcelaMenuTest

Jemmy exit wait failed for test jmri.jmrix.bachrus.GraphPaneTest

Jemmy exit wait failed for test jmri.jmrix.anyma.UsbLightManagerTest

Jemmy exit wait failed for test jmri.jmrix.can.cbus.CbusThrottleTest

Jemmy exit wait failed for test jmri.jmrix.dcc4pc.swing.monitor.Dcc4PcMonPaneTest

Jemmy exit wait failed for test jmri.jmrix.dccpp.swing.mon.DCCppMonPaneTest

 

I got this by a slightly aggressive test where in TearDown I look for 500 msec of emptiness on the event thread and if it doesn't happen in 1 sec, I alert.

 

There are a lot of UI tests here, so I think I would need to understand in a bit more detail what's going on with the frames for these failures. The test was run on Xvfb, so no input events, but if a window is left open that could be receiving focus events from other tests running.

 

However, the non-UI tests that are on this list are definitely suspicious. For example everything in jmti.implementation: the SignalHead tests are probably leaking a swing timer for the blinking appearances.

 

How did you get the list of events that do happen on the event thread with the nice printout?

 

thanks

Balazs

 

 

On Mon, Dec 17, 2018 at 5:11 PM Bob Jacobsen <rgj1927@...> wrote:

Found one!

I put some test code in the JUnitUtil-tearDown method to look for events on the AWT queue, then used the run_tests_separately script.  It found a consistent case in jmri/jmrit/conditional/ConditionalEditBaseTest, see below (two separate runs).

They’re slightly different, but it seems that the test is ending while windows are still changing (with the AWT LOST_FOCUS events), which is somehow triggering a JMRI-specific event and then further JMRI activity.  The last FOCUS_GAINED event can (but don’t always) end up being quite late .

This doesn’t entirely explain why waitEmpty(200) occasionally fails; I need better instrumentation for that, will work on it as time is available, maybe later this week. 

In the below case, I’d appreciate any suggestions on how to find what’s going on.  I don’t know much about the Conditional GUI code, and I don’t see what jmri.beans.Bean$$Lambda$8/191382150@4c40b76e is (though I suspect it’s Timer-related, I don’t see why it’s a jmri.beans.Beans lambda)

Bob


jmri/jmrit/conditional/ConditionalEditBaseTest
JUnit version 4.12
.Using org.netbeans.jemmy.drivers.APIDriverInstaller driver installer
entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: java.awt.event.InvocationEvent[INVOCATION_DEFAULT,runnable=jmri.beans.Bean$$Lambda$8/191382150@4c40b76e,notifier=null,catchExceptions=false,when=1545062209603] on sun.lwawt.macosx.LWCToolkit@72181dfc
.entry: java.awt.FocusEvent[FOCUS_GAINED,permanent,opposite=javax.swing.JTabbedPane[,0,0,389x304,invalid,layout=com.apple.laf.AquaTabbedPaneUI$AquaTruncatingTabbedPaneLayout,alignmentX=0.0,alignmentY=0.0,border=,flags=16777568,maximumSize=,minimumSize=,preferredSize=,haveRegistered=false,tabPlacement=TOP],cause=TRAVERSAL_FORWARD] on jmri.jmrit.picker.PickListModel$1[,0,0,364x144,invalid,alignmentX=0.0,alignmentY=0.0,border=,flags=251658568,maximumSize=,minimumSize=,preferredSize=,autoCreateColumnsFromModel=false,autoResizeMode=AUTO_RESIZE_SUBSEQUENT_COLUMNS,cellSelectionEnabled=false,editingColumn=-1,editingRow=-1,gridColor=javax.swing.plaf.ColorUIResource[r=255,g=255,b=255],preferredViewportSize=java.awt.Dimension[width=250,height=112],rowHeight=16,rowMargin=1,rowSelectionAllowed=true,selectionBackground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=0,g=0,b=217],selectionForeground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=255,g=255,b=255],showHorizontalLines=true,showVerticalLines=true]
..
Time: 3.748




jmri/jmrit/conditional/ConditionalEditBaseTest
JUnit version 4.12
.Using org.netbeans.jemmy.drivers.APIDriverInstaller driver installer
entry: sun.awt.TimedWindowEvent[WINDOW_LOST_FOCUS,opposite=null,oldState=0,newState=0] on frame0
.entry: java.awt.event.InvocationEvent[INVOCATION_DEFAULT,runnable=jmri.beans.Bean$$Lambda$8/191382150@8b96fde,notifier=null,catchExceptions=false,when=1545062327033] on sun.lwawt.macosx.LWCToolkit@fa6a01b
..entry: java.awt.FocusEvent[FOCUS_GAINED,permanent,opposite=javax.swing.JTabbedPane[,0,0,389x304,invalid,layout=com.apple.laf.AquaTabbedPaneUI$AquaTruncatingTabbedPaneLayout,alignmentX=0.0,alignmentY=0.0,border=,flags=16777568,maximumSize=,minimumSize=,preferredSize=,haveRegistered=false,tabPlacement=TOP],cause=TRAVERSAL_FORWARD] on jmri.jmrit.picker.PickListModel$1[,0,0,364x144,invalid,alignmentX=0.0,alignmentY=0.0,border=,flags=251658568,maximumSize=,minimumSize=,preferredSize=,autoCreateColumnsFromModel=false,autoResizeMode=AUTO_RESIZE_SUBSEQUENT_COLUMNS,cellSelectionEnabled=false,editingColumn=-1,editingRow=-1,gridColor=javax.swing.plaf.ColorUIResource[r=255,g=255,b=255],preferredViewportSize=java.awt.Dimension[width=250,height=112],rowHeight=16,rowMargin=1,rowSelectionAllowed=true,selectionBackground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=0,g=0,b=217],selectionForeground=com.apple.laf.AquaImageFactory$SystemColorProxy[r=255,g=255,b=255],showHorizontalLines=true,showVerticalLines=true]
..
Time: 3.729



--
Bob Jacobsen
rgj1927@...





Balazs Racz
 



On Tue, Dec 18, 2018 at 10:15 PM Dan Boudreau <daboudreau@...> wrote:

I found that I had to increase the wait time for “emptiness” from 1 sec to 2 seconds when testing


what was the code you had to add for this timeout change? (sry I can't find it in the PR...) 

Balazs

Dan Boudreau
 

I copied your code and added it to the operations shut down so I could troubleshoot.  Waiting for 500msec or 250msec for emptiness didn’t change the tests, but did speed things up.

 

public class OperationsTestCase {

 

    @Before

    public void setUp() {

        JUnitUtil.setUp();

 

        // Set things up outside of operations

        JUnitUtil.resetInstanceManager();

        JUnitUtil.initInternalTurnoutManager();

        JUnitUtil.initInternalLightManager();

        JUnitUtil.initInternalSensorManager();

        JUnitUtil.initDebugThrottleManager();

        JUnitUtil.initIdTagManager();

        JUnitUtil.initShutDownManager();

        JUnitUtil.resetProfileManager();

 

        JUnitOperationsUtil.resetOperationsManager();

    }

   

    private final boolean testThreadRunning = false;

 

    @After

    public void tearDown() {

        if (testThreadRunning) {

            final Semaphore sem = new Semaphore(0);

            new Thread(new Runnable() {

                @Override

                public void run() {

                    new QueueTool().waitEmpty(250);

                    sem.release();

                }

            }).start();

            try {

                if (!sem.tryAcquire(2000, TimeUnit.MILLISECONDS)) {

                    System.err.println("Jemmy exit wait failed for test " + this.getClass().getName());

                    Assert.fail("Jemmy is not empty after this test");

                }

            } catch (InterruptedException e) {

                // ignore.

            }

        }

        JUnitUtil.tearDown();

    }

}

 

 

Dan

 

From: jmri@jmri-developers.groups.io [mailto:jmri@jmri-developers.groups.io] On Behalf Of Balazs Racz
Sent: Tuesday, December 18, 2018 4:29 PM
To: jmri@jmri-developers.groups.io
Subject: Re: [jmri-developers] CI failures due to leaked work on the gui thread

 

 

 

On Tue, Dec 18, 2018 at 10:15 PM Dan Boudreau <daboudreau@...> wrote:

I found that I had to increase the wait time for “emptiness” from 1 sec to 2 seconds when testing

 

what was the code you had to add for this timeout change? (sry I can't find it in the PR...) 

 

Balazs

Balazs Racz
 

Oh I see what you mean. I misunderstood you first.


On Tue, Dec 18, 2018 at 10:35 PM Dan Boudreau <daboudreau@...> wrote:

I copied your code and added it to the operations shut down so I could troubleshoot.  Waiting for 500msec or 250msec for emptiness didn’t change the tests, but did speed things up.


db123@bergqvist.se
 

A small note:

I think it's a good idea to have:

PUBLIC VOID setUp() {
JUnitUtil.setUp();
// Set things up outside of operations
JUnitUtil.resetInstanceManager();
JUnitUtil.resetProfileManager();

Since the ConfigurationManager may be auto created by init???Manager() and it could be affected by the ProfileManager. I don't know if this is important, but if the ProfileManager is reset after the InstanceManager is reset, this risk is eliminated.

The JavaDoc for ProfileManager say:
This manager, and its configuration, fall outside the control of the ConfigureManager since the ConfigureManager's configuration is influenced by this manager.
http://jmri.sourceforge.net/JavaDoc/doc/jmri/profile/ProfileManager.html

And this is the recommended order by the documentation of JMRI Unit testing:
http://jmri.sourceforge.net/help/en/html/doc/Technical/JUnit.shtml#ResetInstMgr

Daniel

2018-12-18 22:35 skrev Dan Boudreau:

I copied your code and added it to the operations shut down so I could
troubleshoot. Waiting for 500msec or 250msec for emptiness didn’t
change the tests, but did speed things up.
PUBLIC CLASS OperationsTestCase {
@Before PUBLIC VOID setUp() {
JUnitUtil.setUp();
// Set things up outside of operations
JUnitUtil.resetInstanceManager();
JUnitUtil.initInternalTurnoutManager();
JUnitUtil.initInternalLightManager();
JUnitUtil.initInternalSensorManager();
JUnitUtil.initDebugThrottleManager();
JUnitUtil.initIdTagManager();
JUnitUtil.initShutDownManager();
JUnitUtil.resetProfileManager();
JUnitOperationsUtil.resetOperationsManager();
}

Bob Jacobsen
 

Although we started with “leaked work”, I think many of the examples (at least the ones I’ve been looking at) are somewhat different: The test, including the@Afterroutine, are complete but Swing is still working. This can happen because e.g. the window close is followed by lots of listeners being kicked off, which take a while to settle down.

In a couple instances, I’ve been able to see that after-close activity trigger initializations in the _next_ test’s InstanceManager as the close-down effort touches managers.

How to fix this? A long wait at the end of each of 22k tests really isn’t going to work. But perhaps we only have to wait after closing a window. So we could add a wait if the resetWindows(false, false) call had to close something.

That doesn’t catch nice tests that close their own windows, so those would have to have waits added too. But we could add a resettable counter to JmriJFrame to know if any of those were created, and use that in the decision whether to do a end-test wait.

Bob

--
Bob Jacobsen
@BobJacobsen

db123@bergqvist.se
 

Are the tests running sequential or in parallell?

Can we assume that test A is finished before test B is started? (I now that things are leaving behind and threads may still be running). Can we assume that TestA.tearDown() is executed before TestB.setUp() ?

In the later case, there is a radical but simple solution that could work.

If the InstanceManager knows whenether it's in "run" mode (running ordinary application) or in "test" mode (running the JUnit tests), JUnitUtil could tell the InstanceManager when a test starts and ends. The InstanceManager could then hold a list of threads that has been active before and a list of threads that are active now.

InstanceManager.getDefault() would check if in "test" mode, and if so, test if the current thread is in the "before" list. If so, throw an exception. If the current thread is not in either "active now" list or "active before" list, add it to the "active now" list.

JUnitUtil.tearDown (or any similar method) would then tell InstanceManager to move all threads in "active now" list to the "active before" list.

By doing this, the InstanceManager would now which threads are related to the current thread and which threads is "old".

If there is an interest in testing this idea, I could try it and do a PR.

Daniel

2018-12-20 08:13 skrev Bob Jacobsen:

Although we started with “leaked work”, I think many of the examples
(at least the ones I’ve been looking at) are somewhat different: The
test, including the@Afterroutine, are complete but Swing is still
working. This can happen because e.g. the window close is followed by
lots of listeners being kicked off, which take a while to settle down.
In a couple instances, I’ve been able to see that after-close activity
trigger initializations in the _next_ test’s InstanceManager as the
close-down effort touches managers.
How to fix this? A long wait at the end of each of 22k tests really
isn’t going to work. But perhaps we only have to wait after closing a
window. So we could add a wait if the resetWindows(false, false) call
had to close something.
That doesn’t catch nice tests that close their own windows, so those
would have to have waits added too. But we could add a resettable
counter to JmriJFrame to know if any of those were created, and use
that in the decision whether to do a end-test wait.
Bob
--
Bob Jacobsen
@BobJacobsen

Bob Jacobsen
 

TestA@Afterdoes complete before testB@Before(and similar for@AfterClassand @BeforeClass); that can be counted on.

There’s code in JUnitUtil’s setUp and tearDown to do some thread checks. It could certainly do the kinds of things you describe, no need to involve the InstanceManager (which should just _manage_ _instances_) It would be good to get rid of extra threads (though we’d probably have to move to a common Timer implementation to manage it)

But that doesn’t solve the problem in the subject line, because that’s about the AWT/Swing thread. You can’t make that go away.

Bob

On Dec 20, 2018, at 8:24 AM, db123@... wrote:

Are the tests running sequential or in parallell?

Can we assume that test A is finished before test B is started? (I now that things are leaving behind and threads may still be running). Can we assume that TestA.tearDown() is executed before TestB.setUp() ?

In the later case, there is a radical but simple solution that could work.

If the InstanceManager knows whenether it's in "run" mode (running ordinary application) or in "test" mode (running the JUnit tests), JUnitUtil could tell the InstanceManager when a test starts and ends. The InstanceManager could then hold a list of threads that has been active before and a list of threads that are active now.

InstanceManager.getDefault() would check if in "test" mode, and if so, test if the current thread is in the "before" list. If so, throw an exception. If the current thread is not in either "active now" list or "active before" list, add it to the "active now" list.

JUnitUtil.tearDown (or any similar method) would then tell InstanceManager to move all threads in "active now" list to the "active before" list.

By doing this, the InstanceManager would now which threads are related to the current thread and which threads is "old".

If there is an interest in testing this idea, I could try it and do a PR.

Daniel

2018-12-20 08:13 skrev Bob Jacobsen:
Although we started with “leaked work”, I think many of the examples
(at least the ones I’ve been looking at) are somewhat different: The
test, including the@Afterroutine, are complete but Swing is still
working. This can happen because e.g. the window close is followed by
lots of listeners being kicked off, which take a while to settle down.
In a couple instances, I’ve been able to see that after-close activity
trigger initializations in the _next_ test’s InstanceManager as the
close-down effort touches managers.
How to fix this? A long wait at the end of each of 22k tests really
isn’t going to work. But perhaps we only have to wait after closing a
window. So we could add a wait if the resetWindows(false, false) call
had to close something.
That doesn’t catch nice tests that close their own windows, so those
would have to have waits added too. But we could add a resettable
counter to JmriJFrame to know if any of those were created, and use
that in the decision whether to do a end-test wait.
Bob
--
Bob Jacobsen
@BobJacobsen

--
Bob Jacobsen
@BobJacobsen

db123@bergqvist.se
 

The reason for my suggestion was these sentences:

In a couple instances, I’ve been able to see that after-close activity trigger initializations in the _next_ test’s InstanceManager as the close-down effort touches managers.
How to fix this?
Daniel

2018-12-20 17:40 skrev Bob Jacobsen:

TestA@Afterdoes complete before testB@Before(and similar for@AfterClassand @BeforeClass); that can be counted on.
There’s code in JUnitUtil’s setUp and tearDown to do some thread
checks. It could certainly do the kinds of things you describe, no
need to involve the InstanceManager (which should just _manage_
_instances_) It would be good to get rid of extra threads (though
we’d probably have to move to a common Timer implementation to manage
it)
But that doesn’t solve the problem in the subject line, because that’s
about the AWT/Swing thread. You can’t make that go away.
Bob

On Dec 20, 2018, at 8:24 AM, db123@... wrote:
Are the tests running sequential or in parallell?
Can we assume that test A is finished before test B is started? (I now that things are leaving behind and threads may still be running). Can we assume that TestA.tearDown() is executed before TestB.setUp() ?
In the later case, there is a radical but simple solution that could work.
If the InstanceManager knows whenether it's in "run" mode (running ordinary application) or in "test" mode (running the JUnit tests), JUnitUtil could tell the InstanceManager when a test starts and ends. The InstanceManager could then hold a list of threads that has been active before and a list of threads that are active now.
InstanceManager.getDefault() would check if in "test" mode, and if so, test if the current thread is in the "before" list. If so, throw an exception. If the current thread is not in either "active now" list or "active before" list, add it to the "active now" list.
JUnitUtil.tearDown (or any similar method) would then tell InstanceManager to move all threads in "active now" list to the "active before" list.
By doing this, the InstanceManager would now which threads are related to the current thread and which threads is "old".
If there is an interest in testing this idea, I could try it and do a PR.
Daniel
2018-12-20 08:13 skrev Bob Jacobsen:
Although we started with “leaked work”, I think many of the examples
(at least the ones I’ve been looking at) are somewhat different: The
test, including the@Afterroutine, are complete but Swing is still
working. This can happen because e.g. the window close is followed by
lots of listeners being kicked off, which take a while to settle down.
In a couple instances, I’ve been able to see that after-close activity
trigger initializations in the _next_ test’s InstanceManager as the
close-down effort touches managers.
How to fix this? A long wait at the end of each of 22k tests really
isn’t going to work. But perhaps we only have to wait after closing a
window. So we could add a wait if the resetWindows(false, false) call
had to close something.
That doesn’t catch nice tests that close their own windows, so those
would have to have waits added too. But we could add a resettable
counter to JmriJFrame to know if any of those were created, and use
that in the decision whether to do a end-test wait.
Bob
--
Bob Jacobsen
@BobJacobsen
--
Bob Jacobsen
@BobJacobsen

Paul Bender
 

On Dec 20, 2018, at 11:24 AM, "db123@..." <db123@...> wrote:

Are the tests running sequential or in parallell?
this got me thinking...

At least in Maven, It is possible to run tests in separate JVM forks. Several years ago, we tried to do that but found quite a few test dependencies.

If we can use this, I think this would actually eliminate some of the leaked test state ( at least on Travis which uses Maven. Appveyor is using ant still I believe.)

Let me play with this...

Paul