Topics

Note about Timer and TimerTask

danielb987
 

While working on a test for LogixNG and testing code using a timer, I had a problem with a test failing in about 1 of 20 times.

My code called TimerTask.cancel() before register a new timer task. It turned out that the problem was that sometimes TimerTask.cancel() was called when the timer task was started but not completed, and TimerTask.cancel() doesn't wait for the timer task to end before returning. And there is no way to force TimerTask.cancel() to wait.

To solve this, I added code to ensure that if TimerTask.run() was started, the method that called TimerTask.cancel() would not finish until TimerTask.run() was finished. This got rid of the problem completely, but added some extra overhead.

I wasn't aware of this problem before I was hit by it, so I write this to alert others. Note that the code that called TimerTask.cancel() wasn't in the tests but in my normal code, but the tests was hit by it.

Second, I wonder if there is a better timer that doesn't have this problem. I think that it's bad that neither Timer nor TimerTask has a method that cancels the timer task and wait until it's guaranteed that the timer task will not continue. I solved it, but maybe there is a better way?

Daniel

ps.
When I discovered that a particular test had this problem, I created a PackageTest class that looked like the code below. It forced the error to happen on every execution and let me track down the problem much more easily.
ds.



package jmri.jmrit.logixng;

import org.junit.runner.RunWith;
import org.junit.runners.Suite;

@RunWith(Suite.class)
@Suite.SuiteClasses({
MyClassTest.class,
MyClassTest.class,
MyClassTest.class,
MyClassTest.class,
MyClassTest.class,
MyClassTest.class,
MyClassTest.class,
// And 100 more rows ....
})

public class PackageTest {
}

Bob Jacobsen
 

I surprised this wasn’t automatically solved by the synchronization of the timer task with the rest of the code. How was the task protected?

Bob


On Jan 8, 2020, at 6:59 AM, danielb987 <db123@...> wrote:

My code called TimerTask.cancel() before register a new timer task. It turned out that the problem was that sometimes TimerTask.cancel() was called when the timer task was started but not completed, and TimerTask.cancel() doesn't wait for the timer task to end before returning. And there is no way to force TimerTask.cancel() to wait.
--
Bob Jacobsen
@BobJacobsen

danielb987
 

https://github.com/danielb987/JMRI/blob/LogixNG_new_19/java/src/jmri/jmrit/logixng/digital/expressions/Timer.java

The timer task is created in the method getNewTimerTask() at lines 157 - 177. See the method TimerTask.run() on lines 161 - 175.

The timer is stopped in the method stopTimer() at lines 220 - 252.

The variable _timerTask is the current running timer task.
_timerTask._timerIsRunning is when the task is running. (Bad name, I should rename the variable).
_timerTask._stopTimer tells the timer task to not run the task.

If there is a better way to solve this, I will be happy to update my code :)

Daniel

2020-01-08 17:52 skrev Bob Jacobsen:

I surprised this wasn’t automatically solved by the synchronization of
the timer task with the rest of the code. How was the task protected?
Bob

On Jan 8, 2020, at 6:59 AM, danielb987 <db123@...> wrote:
My code called TimerTask.cancel() before register a new timer task. It turned out that the problem was that sometimes TimerTask.cancel() was called when the timer task was started but not completed, and TimerTask.cancel() doesn't wait for the timer task to end before returning. And there is no way to force TimerTask.cancel() to wait.
--
Bob Jacobsen
@BobJacobsen

Svata Dedic
 

If the Timer class is meant to be used from one single dedicated thread, the code may eventually work. But as j.u.Timer is used, which fires up its scheduled tasks in a private thread, I would suspect that the user-provided callback is allowed eventually to manipulate the Timer instance, isn't it ?
Anyway, at least the private j.u.Timer thread AND the scheduling thread both access the logix timer shared state, so it's likely that the Timer should be thread-safe - right ?

-S.

Dne 08. 01. 20 v 18:15 danielb987 napsal(a):

https://github.com/danielb987/JMRI/blob/LogixNG_new_19/java/src/jmri/jmrit/logixng/digital/expressions/Timer.java The timer task is created in the method getNewTimerTask() at lines 157 - 177. See the method TimerTask.run() on lines 161 - 175.
The timer is stopped in the method stopTimer() at lines 220 - 252.
The variable _timerTask is the current running timer task.
_timerTask._timerIsRunning is when the task is running. (Bad name, I should rename the variable).
_timerTask._stopTimer tells the timer task to not run the task.
If there is a better way to solve this, I will be happy to update my code  :)
Daniel
2020-01-08 17:52 skrev Bob Jacobsen:
I surprised this wasn’t automatically solved by the synchronization of
the timer task with the rest of the code.  How was the task protected?

Bob


On Jan 8, 2020, at 6:59 AM, danielb987 <db123@...> wrote:

My code called TimerTask.cancel() before register a new timer task. It turned out that the problem was that sometimes TimerTask.cancel() was called when the timer task was started but not completed, and TimerTask.cancel() doesn't wait for the timer task to end before returning. And there is no way to force TimerTask.cancel() to wait.
--
Bob Jacobsen
@BobJacobsen




danielb987
 

The method TimerTask.run() executes the method c.execute() which runs its code in ThreadingUtil.runOnGUIEventually() or ThreadingUtil.runOnGUI(), so the code will always be executed on the GUI thread.

See the methods DefaultConditionalNG.execute() and DefaultConditionalNG.runOnGUI() here:
https://github.com/danielb987/JMRI/blob/LogixNG_new_19/java/src/jmri/jmrit/logixng/implementation/DefaultConditionalNG.java

Daniel

2020-01-08 21:44 skrev Svata Dedic:

If the Timer class is meant to be used from one single dedicated
thread, the code may eventually work. But as j.u.Timer is used, which
fires up its scheduled tasks in a private thread, I would suspect that
the user-provided callback is allowed eventually to manipulate the
Timer instance, isn't it ?
Anyway, at least the private j.u.Timer thread AND the scheduling
thread both access the logix timer shared state, so it's likely that
the Timer should be thread-safe - right ?
-S.
Dne 08. 01. 20 v 18:15 danielb987 napsal(a):
https://github.com/danielb987/JMRI/blob/LogixNG_new_19/java/src/jmri/jmrit/logixng/digital/expressions/Timer.java The timer task is created in the method getNewTimerTask() at lines 157 - 177. See the method TimerTask.run() on lines 161 - 175.
The timer is stopped in the method stopTimer() at lines 220 - 252.
The variable _timerTask is the current running timer task.
_timerTask._timerIsRunning is when the task is running. (Bad name, I should rename the variable).
_timerTask._stopTimer tells the timer task to not run the task.
If there is a better way to solve this, I will be happy to update my code  :)
Daniel
2020-01-08 17:52 skrev Bob Jacobsen:
I surprised this wasn’t automatically solved by the synchronization of
the timer task with the rest of the code.  How was the task protected?
Bob

On Jan 8, 2020, at 6:59 AM, danielb987 <db123@...> wrote:
My code called TimerTask.cancel() before register a new timer task. It turned out that the problem was that sometimes TimerTask.cancel() was called when the timer task was started but not completed, and TimerTask.cancel() doesn't wait for the timer task to end before returning. And there is no way to force TimerTask.cancel() to wait.
-- Bob Jacobsen
@BobJacobsen

Randall Wood
 

Cancelling a Timer or TimerTask is purely about ensuring a timer does not trigger a task; I think the Javadocs are very clear about that. Once a task is started, the Timer and TimerTask APIs provide no method to kill the running task, and I’m not sure they should (it’s not easy to interrupt a thread and clean up from that unless *everything* is thread safe, and if you are relying on Swing for anything, it’s not thread safe), so I can see why Sun choose not to provide any easy way to shoot yourself in the foot with such an API.

If you *really* want to allow a LogixNG that is running to be killed, you need to provide a separate and safe way to kill that. (Put another way, I think your stopTimer is not well thought out; it’s one thing to prevent a scheduled activity from running next time, but quite a different thing to kill a running activity; and conflating the two is just asking for trouble; which is partly why the warning you suppressed was presented.)


If really want be able to cancel a running LogixNG computation, you should be using Futures, not delaying the cancel for the next computation until no computation is running (what happens if a user schedules an operation every N seconds, but the operation takes 2*N seconds to complete? It appears to be that the design intent is that becomes uncancelable).

Randall

On Jan 8, 2020, at 17:07, danielb987 <db123@...> wrote:

The method TimerTask.run() executes the method c.execute() which runs its code in ThreadingUtil.runOnGUIEventually() or ThreadingUtil.runOnGUI(), so the code will always be executed on the GUI thread.

See the methods DefaultConditionalNG.execute() and DefaultConditionalNG.runOnGUI() here:
https://github.com/danielb987/JMRI/blob/LogixNG_new_19/java/src/jmri/jmrit/logixng/implementation/DefaultConditionalNG.java

Daniel

2020-01-08 21:44 skrev Svata Dedic:
If the Timer class is meant to be used from one single dedicated
thread, the code may eventually work. But as j.u.Timer is used, which
fires up its scheduled tasks in a private thread, I would suspect that
the user-provided callback is allowed eventually to manipulate the
Timer instance, isn't it ?
Anyway, at least the private j.u.Timer thread AND the scheduling
thread both access the logix timer shared state, so it's likely that
the Timer should be thread-safe - right ?
-S.
Dne 08. 01. 20 v 18:15 danielb987 napsal(a):
https://github.com/danielb987/JMRI/blob/LogixNG_new_19/java/src/jmri/jmrit/logixng/digital/expressions/Timer.java The timer task is created in the method getNewTimerTask() at lines 157 - 177. See the method TimerTask.run() on lines 161 - 175.
The timer is stopped in the method stopTimer() at lines 220 - 252.
The variable _timerTask is the current running timer task.
_timerTask._timerIsRunning is when the task is running. (Bad name, I should rename the variable).
_timerTask._stopTimer tells the timer task to not run the task.
If there is a better way to solve this, I will be happy to update my code  :)
Daniel
2020-01-08 17:52 skrev Bob Jacobsen:
I surprised this wasn’t automatically solved by the synchronization of
the timer task with the rest of the code.  How was the task protected?
Bob
On Jan 8, 2020, at 6:59 AM, danielb987 <db123@...> wrote:
My code called TimerTask.cancel() before register a new timer task. It turned out that the problem was that sometimes TimerTask.cancel() was called when the timer task was started but not completed, and TimerTask.cancel() doesn't wait for the timer task to end before returning. And there is no way to force TimerTask.cancel() to wait.
-- Bob Jacobsen
rgj1927@...



danielb987
 

I don't mind if TimerTask.run() runs to completion if I call TimerTask.cancel(). I don't need to abort the TimerTask once it's running. But what I don't want is to have TimerTask.cancel() to _return_ while TimerTask.run() is executed.

I'm fine with this:
1) Timer starts
2) TimerTask.run() starts
3) TimerTask.cancel() is executed from another task and waits
4) TimerTask.run() is finished
5) TimerTask.cancel() returns

I'm fine with this:
1) Timer starts
2) TimerTask.cancel() is executed from another task and aborts the Timer
3) TimerTask.run() is never executed

What must _not_ happen is this:
1) Timer starts
2) TimerTask.run() starts
3) TimerTask.cancel() is executed from another task and returns
4) TimerTask.run() continues

TimerTask.cancel() doesn't need to abort the running TimerTask. But it must not return until TimerTask is completed, if TimerTask is started. Since this is not the case, I need to solve that in some way. If you have a better way to solve it, I will listen. But I cannot create a new TimerTask before I'm guaranteed that the old TimerTask is either finished or will never run.

You wrote in another mail about poor tests. I'm not sure the tests are poor. It may be that the main code in JMRI that calls Timer.cancel() or TimerTask.cancel() and expects cancel() to either abort the task or wait until the task is finished. I'm well aware that the javadoc for cancel() is very clear, but the problem is that when you call cancel(), you do it because you don't want the TimerTask to run anymore. And if you call cancel() from any other thread than the timer thread itself, you have no way to know if the timer task is currently running or not. This makes Timer and TimerTask hard to use and error prone.

Daniel

2020-01-08 23:50 skrev Randall Wood via Groups.Io:

Cancelling a Timer or TimerTask is purely about ensuring a timer does
not trigger a task; I think the Javadocs are very clear about that.
Once a task is started, the Timer and TimerTask APIs provide no method
to kill the running task, and I’m not sure they should (it’s not
easy to interrupt a thread and clean up from that unless *everything*
is thread safe, and if you are relying on Swing for anything, it’s
not thread safe), so I can see why Sun choose not to provide any easy
way to shoot yourself in the foot with such an API.
If you *really* want to allow a LogixNG that is running to be killed,
you need to provide a separate and safe way to kill that. (Put another
way, I think your stopTimer is not well thought out; it’s one thing
to prevent a scheduled activity from running next time, but quite a
different thing to kill a running activity; and conflating the two is
just asking for trouble; which is partly why the warning you
suppressed was presented.)
See also section 2.6 in
http://www.cs.utexas.edu/users/dahlin/Classes/UGOS/reading/programming-with-threads.pdf
If really want be able to cancel a running LogixNG computation, you
should be using Futures, not delaying the cancel for the next
computation until no computation is running (what happens if a user
schedules an operation every N seconds, but the operation takes 2*N
seconds to complete? It appears to be that the design intent is that
becomes uncancelable).
Randall

On Jan 8, 2020, at 17:07, danielb987 <db123@...> wrote:
The method TimerTask.run() executes the method c.execute() which
runs its code in ThreadingUtil.runOnGUIEventually() or
ThreadingUtil.runOnGUI(), so the code will always be executed on the
GUI thread.
See the methods DefaultConditionalNG.execute() and
DefaultConditionalNG.runOnGUI() here:
https://github.com/danielb987/JMRI/blob/LogixNG_new_19/java/src/jmri/jmrit/logixng/implementation/DefaultConditionalNG.java
Daniel
2020-01-08 21:44 skrev Svata Dedic:

If the Timer class is meant to be used from one single dedicated
thread, the code may eventually work. But as j.u.Timer is used,
which
fires up its scheduled tasks in a private thread, I would suspect
that
the user-provided callback is allowed eventually to manipulate the
Timer instance, isn't it ?
Anyway, at least the private j.u.Timer thread AND the scheduling
thread both access the logix timer shared state, so it's likely
that
the Timer should be thread-safe - right ?
-S.
Dne 08. 01. 20 v 18:15 danielb987 napsal(a):
https://github.com/danielb987/JMRI/blob/LogixNG_new_19/java/src/jmri/jmrit/logixng/digital/expressions/Timer.java
The timer task is created in the method getNewTimerTask() at lines
157 - 177. See the method TimerTask.run() on lines 161 - 175.
The timer is stopped in the method stopTimer() at lines 220 - 252.
The variable _timerTask is the current running timer task.
_timerTask._timerIsRunning is when the task is running. (Bad name,
I should rename the variable).
_timerTask._stopTimer tells the timer task to not run the task.
If there is a better way to solve this, I will be happy to update
my code :)
Daniel
2020-01-08 17:52 skrev Bob Jacobsen:
I surprised this wasn’t automatically solved by the
synchronization of
the timer task with the rest of the code. How was the task
protected?
Bob
On Jan 8, 2020, at 6:59 AM, danielb987 <db123@...> wrote:
My code called TimerTask.cancel() before register a new timer task.
It turned out that the problem was that sometimes TimerTask.cancel()
was called when the timer task was started but not completed, and
TimerTask.cancel() doesn't wait for the timer task to end before
returning. And there is no way to force TimerTask.cancel() to wait.
-- Bob Jacobsen
@BobJacobsen
Links:
------
[1] https://jmri-developers.groups.io/g/jmri/message/2545
[2] https://groups.io/mt/69529560/1303822
[3] https://jmri-developers.groups.io/g/jmri/post
[4] https://jmri-developers.groups.io/g/jmri/editsub/1303822
[5] https://jmri-developers.groups.io/g/jmri/leave/defanged

Paul Bender
 

On Jan 8, 2020, at 6:09 PM, danielb987 <db123@...> wrote:
What must _not_ happen is this:
1) Timer starts
2) TimerTask.run() starts
3) TimerTask.cancel() is executed from another task and returns
4) TimerTask.run() continues
Why is that a problem?

I haven’t looked at the code, but If you really need that to happen, you need to synchronize the code, possibly with a synchronization object ( I.e. not just with a synchronized keyword, but with a lock or a semaphore or...

Paul

danielb987
 

I need to know that the TimerTask is either finished or will never be executed. It doesn't matter which, but I must be sure that I don't end up in a situation where I think the TimerTask is completed but it's still running.

What happen was that I got a NullPointerException in 1 of 20 executions of the test that tested the timer, and I found that that was because of a concurrency problem.

I'm using a couple of variables that are protected by synchronization blocks.

Daniel

2020-01-09 00:17 skrev Paul Bender:

On Jan 8, 2020, at 6:09 PM, danielb987 <db123@...> wrote:
What must _not_ happen is this:
1) Timer starts
2) TimerTask.run() starts
3) TimerTask.cancel() is executed from another task and returns
4) TimerTask.run() continues
Why is that a problem?
I haven’t looked at the code, but If you really need that to happen,
you need to synchronize the code, possibly with a synchronization
object ( I.e. not just with a synchronized keyword, but with a lock or
a semaphore or...
Paul

Randall Wood
 

TimerTask .cancel() is not designed to halt a running Runnable; it is designed to prevent a Runnable from being run *if it has not started.*

As Paul noted, if you really want to know the Runnable is not running, you need to monitor it separately.

Randall

On Jan 8, 2020, at 18:45, danielb987 <db123@...> wrote:

I need to know that the TimerTask is either finished or will never be executed. It doesn't matter which, but I must be sure that I don't end up in a situation where I think the TimerTask is completed but it's still running.

What happen was that I got a NullPointerException in 1 of 20 executions of the test that tested the timer, and I found that that was because of a concurrency problem.

I'm using a couple of variables that are protected by synchronization blocks.

Daniel

2020-01-09 00:17 skrev Paul Bender:
On Jan 8, 2020, at 6:09 PM, danielb987 <db123@...> wrote:
What must _not_ happen is this:
1) Timer starts
2) TimerTask.run() starts
3) TimerTask.cancel() is executed from another task and returns
4) TimerTask.run() continues
Why is that a problem?
I haven’t looked at the code, but If you really need that to happen,
you need to synchronize the code, possibly with a synchronization
object ( I.e. not just with a synchronized keyword, but with a lock or
a semaphore or...
Paul

Bob Jacobsen
 

Why not synchronize (on the same object) the run() and the call to cancel()?

(Note: I don’t think the synchronize operations in getNewTimerTask() and run() i.e. line 164 are on the same object as the ones in stop timer. That means they don’t prevent simultaneous running…)

It is very hard to do better than the Java-provided synchronization tools. Most of the things that people create themselves just reduce the probability of a race condition, they don’t actually remove it. It particular, adding flags is never needed if the “synchronized” operations are set up right, and if they aren’t then adding flags can never (literally never) completely remove race conditions from Java code; an atomic “synchronized” operation is necessary. (There are lots of higher-level operations, like the various Atomic* classes, but that’s not my point)

Bob

On Jan 8, 2020, at 3:08 PM, danielb987 <db123@...> wrote:

I don't mind if TimerTask.run() runs to completion if I call TimerTask.cancel(). I don't need to abort the TimerTask once it's running. But what I don't want is to have TimerTask.cancel() to _return_ while TimerTask.run() is executed.

I'm fine with this:
1) Timer starts
2) TimerTask.run() starts
3) TimerTask.cancel() is executed from another task and waits
4) TimerTask.run() is finished
5) TimerTask.cancel() returns

I'm fine with this:
1) Timer starts
2) TimerTask.cancel() is executed from another task and aborts the Timer
3) TimerTask.run() is never executed

What must _not_ happen is this:
1) Timer starts
2) TimerTask.run() starts
3) TimerTask.cancel() is executed from another task and returns
4) TimerTask.run() continues

TimerTask.cancel() doesn't need to abort the running TimerTask. But it must not return until TimerTask is completed, if TimerTask is started. Since this is not the case, I need to solve that in some way. If you have a better way to solve it, I will listen. But I cannot create a new TimerTask before I'm guaranteed that the old TimerTask is either finished or will never run.

You wrote in another mail about poor tests. I'm not sure the tests are poor. It may be that the main code in JMRI that calls Timer.cancel() or TimerTask.cancel() and expects cancel() to either abort the task or wait until the task is finished. I'm well aware that the javadoc for cancel() is very clear, but the problem is that when you call cancel(), you do it because you don't want the TimerTask to run anymore. And if you call cancel() from any other thread than the timer thread itself, you have no way to know if the timer task is currently running or not. This makes Timer and TimerTask hard to use and error prone.

Daniel

2020-01-08 23:50 skrev Randall Wood via Groups.Io:
Cancelling a Timer or TimerTask is purely about ensuring a timer does
not trigger a task; I think the Javadocs are very clear about that.
Once a task is started, the Timer and TimerTask APIs provide no method
to kill the running task, and I’m not sure they should (it’s not
easy to interrupt a thread and clean up from that unless *everything*
is thread safe, and if you are relying on Swing for anything, it’s
not thread safe), so I can see why Sun choose not to provide any easy
way to shoot yourself in the foot with such an API.
If you *really* want to allow a LogixNG that is running to be killed,
you need to provide a separate and safe way to kill that. (Put another
way, I think your stopTimer is not well thought out; it’s one thing
to prevent a scheduled activity from running next time, but quite a
different thing to kill a running activity; and conflating the two is
just asking for trouble; which is partly why the warning you
suppressed was presented.)
See also section 2.6 in
http://www.cs.utexas.edu/users/dahlin/Classes/UGOS/reading/programming-with-threads.pdf
If really want be able to cancel a running LogixNG computation, you
should be using Futures, not delaying the cancel for the next
computation until no computation is running (what happens if a user
schedules an operation every N seconds, but the operation takes 2*N
seconds to complete? It appears to be that the design intent is that
becomes uncancelable).
Randall
On Jan 8, 2020, at 17:07, danielb987 <db123@...> wrote:
The method TimerTask.run() executes the method c.execute() which
runs its code in ThreadingUtil.runOnGUIEventually() or
ThreadingUtil.runOnGUI(), so the code will always be executed on the
GUI thread.
See the methods DefaultConditionalNG.execute() and
DefaultConditionalNG.runOnGUI() here:
https://github.com/danielb987/JMRI/blob/LogixNG_new_19/java/src/jmri/jmrit/logixng/implementation/DefaultConditionalNG.java
Daniel
2020-01-08 21:44 skrev Svata Dedic:
If the Timer class is meant to be used from one single dedicated
thread, the code may eventually work. But as j.u.Timer is used,
which
fires up its scheduled tasks in a private thread, I would suspect
that
the user-provided callback is allowed eventually to manipulate the
Timer instance, isn't it ?
Anyway, at least the private j.u.Timer thread AND the scheduling
thread both access the logix timer shared state, so it's likely
that
the Timer should be thread-safe - right ?
-S.
Dne 08. 01. 20 v 18:15 danielb987 napsal(a):
https://github.com/danielb987/JMRI/blob/LogixNG_new_19/java/src/jmri/jmrit/logixng/digital/expressions/Timer.java
The timer task is created in the method getNewTimerTask() at lines
157 - 177. See the method TimerTask.run() on lines 161 - 175.
The timer is stopped in the method stopTimer() at lines 220 - 252.
The variable _timerTask is the current running timer task.
_timerTask._timerIsRunning is when the task is running. (Bad name,
I should rename the variable).
_timerTask._stopTimer tells the timer task to not run the task.
If there is a better way to solve this, I will be happy to update
my code :)
Daniel
2020-01-08 17:52 skrev Bob Jacobsen:
I surprised this wasn’t automatically solved by the
synchronization of
the timer task with the rest of the code. How was the task
protected?
Bob
On Jan 8, 2020, at 6:59 AM, danielb987 <db123@...> wrote:
My code called TimerTask.cancel() before register a new timer task.
It turned out that the problem was that sometimes TimerTask.cancel()
was called when the timer task was started but not completed, and
TimerTask.cancel() doesn't wait for the timer task to end before
returning. And there is no way to force TimerTask.cancel() to wait.
-- Bob Jacobsen
@BobJacobsen
Links:
------
[1] https://jmri-developers.groups.io/g/jmri/message/2545
[2] https://groups.io/mt/69529560/1303822
[3] https://jmri-developers.groups.io/g/jmri/post
[4] https://jmri-developers.groups.io/g/jmri/editsub/1303822
[5] https://jmri-developers.groups.io/g/jmri/leave/defanged

--
Bob Jacobsen
@BobJacobsen

danielb987
 

How should I synchronize the run() method?

final Object lock = new Object();

class MyTimerTask implements TimerTask {
void run() {
synchronize(lock) {
// Do something
}
}
}

void stopTimer() {
synchronize(lock) {
_timerTask.cancel();
}
}

If I do this, I may still end up with TimerTask.run() is started, but paused before entering the synchronized section. Then stopTimer() is called and executes the synchronized block, which is not a problem since TimerTask.run() has not yet entered the synchronized block.

After stopTimer() has called _timerTask.cancel() and left the synchronized block, TimerTask.run() will continue to execute and enter the synchronized block.

I need to _know_ that the timer task has either finished, aborted or will never execute, before I return from stopTimer(). I will be very happy to solve this in a better way, but I don't know how.

--------

stopTimer() is synchronized on 'this'. MyTimerTask.run() is synchronized on 'timer', which is set in the method getNewTimerTask() before creating the TimerTask object:
final Timer timer = this;

Timer is the name of the main class, so it doesn't refer to the Timer that runs the TimerTask. But maybe I should rename that.

So it's the same object that both stopTimer() and MyTimerTask.run() is using to synchronize on.

Before adding this code, I got the error if I ran the test about 20 times. After adding the code, I have run the test several hundred times and never get the error.

Daniel

2020-01-09 07:58 skrev Bob Jacobsen:

Why not synchronize (on the same object) the run() and the call to cancel()?
(Note: I don’t think the synchronize operations in getNewTimerTask()
and run() i.e. line 164 are on the same object as the ones in stop
timer. That means they don’t prevent simultaneous running…)
It is very hard to do better than the Java-provided synchronization
tools. Most of the things that people create themselves just reduce
the probability of a race condition, they don’t actually remove it. It
particular, adding flags is never needed if the “synchronized”
operations are set up right, and if they aren’t then adding flags can
never (literally never) completely remove race conditions from Java
code; an atomic “synchronized” operation is necessary. (There are
lots of higher-level operations, like the various Atomic* classes, but
that’s not my point)
Bob

On Jan 8, 2020, at 3:08 PM, danielb987 <db123@...> wrote:
I don't mind if TimerTask.run() runs to completion if I call TimerTask.cancel(). I don't need to abort the TimerTask once it's running. But what I don't want is to have TimerTask.cancel() to _return_ while TimerTask.run() is executed.
I'm fine with this:
1) Timer starts
2) TimerTask.run() starts
3) TimerTask.cancel() is executed from another task and waits
4) TimerTask.run() is finished
5) TimerTask.cancel() returns
I'm fine with this:
1) Timer starts
2) TimerTask.cancel() is executed from another task and aborts the Timer
3) TimerTask.run() is never executed
What must _not_ happen is this:
1) Timer starts
2) TimerTask.run() starts
3) TimerTask.cancel() is executed from another task and returns
4) TimerTask.run() continues
TimerTask.cancel() doesn't need to abort the running TimerTask. But it must not return until TimerTask is completed, if TimerTask is started. Since this is not the case, I need to solve that in some way. If you have a better way to solve it, I will listen. But I cannot create a new TimerTask before I'm guaranteed that the old TimerTask is either finished or will never run.
You wrote in another mail about poor tests. I'm not sure the tests are poor. It may be that the main code in JMRI that calls Timer.cancel() or TimerTask.cancel() and expects cancel() to either abort the task or wait until the task is finished. I'm well aware that the javadoc for cancel() is very clear, but the problem is that when you call cancel(), you do it because you don't want the TimerTask to run anymore. And if you call cancel() from any other thread than the timer thread itself, you have no way to know if the timer task is currently running or not. This makes Timer and TimerTask hard to use and error prone.
Daniel
2020-01-08 23:50 skrev Randall Wood via Groups.Io:
Cancelling a Timer or TimerTask is purely about ensuring a timer does
not trigger a task; I think the Javadocs are very clear about that.
Once a task is started, the Timer and TimerTask APIs provide no method
to kill the running task, and I’m not sure they should (it’s not
easy to interrupt a thread and clean up from that unless *everything*
is thread safe, and if you are relying on Swing for anything, it’s
not thread safe), so I can see why Sun choose not to provide any easy
way to shoot yourself in the foot with such an API.
If you *really* want to allow a LogixNG that is running to be killed,
you need to provide a separate and safe way to kill that. (Put another
way, I think your stopTimer is not well thought out; it’s one thing
to prevent a scheduled activity from running next time, but quite a
different thing to kill a running activity; and conflating the two is
just asking for trouble; which is partly why the warning you
suppressed was presented.)
See also section 2.6 in
http://www.cs.utexas.edu/users/dahlin/Classes/UGOS/reading/programming-with-threads.pdf
If really want be able to cancel a running LogixNG computation, you
should be using Futures, not delaying the cancel for the next
computation until no computation is running (what happens if a user
schedules an operation every N seconds, but the operation takes 2*N
seconds to complete? It appears to be that the design intent is that
becomes uncancelable).
Randall
On Jan 8, 2020, at 17:07, danielb987 <db123@...> wrote:
The method TimerTask.run() executes the method c.execute() which
runs its code in ThreadingUtil.runOnGUIEventually() or
ThreadingUtil.runOnGUI(), so the code will always be executed on the
GUI thread.
See the methods DefaultConditionalNG.execute() and
DefaultConditionalNG.runOnGUI() here:
https://github.com/danielb987/JMRI/blob/LogixNG_new_19/java/src/jmri/jmrit/logixng/implementation/DefaultConditionalNG.java
Daniel
2020-01-08 21:44 skrev Svata Dedic:
If the Timer class is meant to be used from one single dedicated
thread, the code may eventually work. But as j.u.Timer is used,
which
fires up its scheduled tasks in a private thread, I would suspect
that
the user-provided callback is allowed eventually to manipulate the
Timer instance, isn't it ?
Anyway, at least the private j.u.Timer thread AND the scheduling
thread both access the logix timer shared state, so it's likely
that
the Timer should be thread-safe - right ?
-S.
Dne 08. 01. 20 v 18:15 danielb987 napsal(a):
https://github.com/danielb987/JMRI/blob/LogixNG_new_19/java/src/jmri/jmrit/logixng/digital/expressions/Timer.java
The timer task is created in the method getNewTimerTask() at lines
157 - 177. See the method TimerTask.run() on lines 161 - 175.
The timer is stopped in the method stopTimer() at lines 220 - 252.
The variable _timerTask is the current running timer task.
_timerTask._timerIsRunning is when the task is running. (Bad name,
I should rename the variable).
_timerTask._stopTimer tells the timer task to not run the task.
If there is a better way to solve this, I will be happy to update
my code :)
Daniel
2020-01-08 17:52 skrev Bob Jacobsen:
I surprised this wasn’t automatically solved by the
synchronization of
the timer task with the rest of the code. How was the task
protected?
Bob
On Jan 8, 2020, at 6:59 AM, danielb987 <db123@...> wrote:
My code called TimerTask.cancel() before register a new timer task.
It turned out that the problem was that sometimes TimerTask.cancel()
was called when the timer task was started but not completed, and
TimerTask.cancel() doesn't wait for the timer task to end before
returning. And there is no way to force TimerTask.cancel() to wait.
-- Bob Jacobsen
@BobJacobsen
Links:
------
[1] https://jmri-developers.groups.io/g/jmri/message/2545
[2] https://groups.io/mt/69529560/1303822
[3] https://jmri-developers.groups.io/g/jmri/post
[4] https://jmri-developers.groups.io/g/jmri/editsub/1303822
[5] https://jmri-developers.groups.io/g/jmri/leave/defanged
--
Bob Jacobsen
@BobJacobsen

RadSolution
 

"
 when you call cancel(), you do it because you don't want the TimerTask to run anymore
"
I don't agree..... I think:
"
when you call cancel(), you do it because you don't want the TimerTask to be run() any time from now on.
It has no effect on already-running tasks.

"

Just my take on the logic; I'm not familiar with Java nor this class.

Dave R


On Thursday, 9 January 2020, 07:34:49 GMT, danielb987 <db123@...> wrote:


How should I synchronize the run() method?

final Object lock = new Object();

class MyTimerTask implements TimerTask {
    void run() {
        synchronize(lock) {
            // Do something
        }
    }
}

void stopTimer() {
    synchronize(lock) {
        _timerTask.cancel();
    }
}

If I do this, I may still end up with TimerTask.run() is started, but
paused before entering the synchronized section. Then stopTimer() is
called and executes the synchronized block, which is not a problem since
TimerTask.run() has not yet entered the synchronized block.

After stopTimer() has called _timerTask.cancel() and left the
synchronized block, TimerTask.run() will continue to execute and enter
the synchronized block.

I need to _know_ that the timer task has either finished, aborted or
will never execute, before I return from stopTimer(). I will be very
happy to solve this in a better way, but I don't know how.

--------

stopTimer() is synchronized on 'this'. MyTimerTask.run() is synchronized
on 'timer', which is set in the method getNewTimerTask() before creating
the TimerTask object:
final Timer timer = this;

Timer is the name of the main class, so it doesn't refer to the Timer
that runs the TimerTask. But maybe I should rename that.

So it's the same object that both stopTimer() and MyTimerTask.run() is
using to synchronize on.

Before adding this code, I got the error if I ran the test about 20
times. After adding the code, I have run the test several hundred times
and never get the error.

Daniel

2020-01-09 07:58 skrev Bob Jacobsen:
> Why not synchronize (on the same object) the run() and the call to
> cancel()?
>
> (Note:  I don’t think the synchronize operations in getNewTimerTask()
> and run() i.e. line 164 are on the same object as the ones in stop
> timer.  That means they don’t prevent simultaneous running…)
>
> It is very hard to do better than the Java-provided synchronization
> tools.  Most of the things that people create themselves just reduce
> the probability of a race condition, they don’t actually remove it. It
> particular, adding flags is never needed if the “synchronized”
> operations are set up right, and if they aren’t then adding flags can
> never (literally never) completely remove race conditions from Java
> code; an atomic “synchronized” operation is necessary.  (There are
> lots of higher-level operations, like the various Atomic* classes, but
> that’s not my point)
>
> Bob
>
>> On Jan 8, 2020, at 3:08 PM, danielb987 <db123@...> wrote:
>>
>> I don't mind if TimerTask.run() runs to completion if I call
>> TimerTask.cancel(). I don't need to abort the TimerTask once it's
>> running. But what I don't want is to have TimerTask.cancel() to
>> _return_ while TimerTask.run() is executed.
>>
>> I'm fine with this:
>> 1) Timer starts
>> 2) TimerTask.run() starts
>> 3) TimerTask.cancel() is executed from another task and waits
>> 4) TimerTask.run() is finished
>> 5) TimerTask.cancel() returns
>>
>> I'm fine with this:
>> 1) Timer starts
>> 2) TimerTask.cancel() is executed from another task and aborts the
>> Timer
>> 3) TimerTask.run() is never executed
>>
>> What must _not_ happen is this:
>> 1) Timer starts
>> 2) TimerTask.run() starts
>> 3) TimerTask.cancel() is executed from another task and returns
>> 4) TimerTask.run() continues
>>
>> TimerTask.cancel() doesn't need to abort the running TimerTask. But it
>> must not return until TimerTask is completed, if TimerTask is started.
>> Since this is not the case, I need to solve that in some way. If you
>> have a better way to solve it, I will listen. But I cannot create a
>> new TimerTask before I'm guaranteed that the old TimerTask is either
>> finished or will never run.
>>
>> You wrote in another mail about poor tests. I'm not sure the tests are
>> poor. It may be that the main code in JMRI that calls Timer.cancel()
>> or TimerTask.cancel() and expects cancel() to either abort the task or
>> wait until the task is finished. I'm well aware that the javadoc for
>> cancel() is very clear, but the problem is that when you call
>> cancel(), you do it because you don't want the TimerTask to run
>> anymore. And if you call cancel() from any other thread than the timer
>> thread itself, you have no way to know if the timer task is currently
>> running or not. This makes Timer and TimerTask hard to use and error
>> prone.
>>
>> Daniel
>>
>> 2020-01-08 23:50 skrev Randall Wood via Groups.Io:
>>> Cancelling a Timer or TimerTask is purely about ensuring a timer does
>>> not trigger a task; I think the Javadocs are very clear about that.
>>> Once a task is started, the Timer and TimerTask APIs provide no
>>> method
>>> to kill the running task, and I’m not sure they should (it’s not
>>> easy to interrupt a thread and clean up from that unless *everything*
>>> is thread safe, and if you are relying on Swing for anything, it’s
>>> not thread safe), so I can see why Sun choose not to provide any easy
>>> way to shoot yourself in the foot with such an API.
>>> If you *really* want to allow a LogixNG that is running to be killed,
>>> you need to provide a separate and safe way to kill that. (Put
>>> another
>>> way, I think your stopTimer is not well thought out; it’s one thing
>>> to prevent a scheduled activity from running next time, but quite a
>>> different thing to kill a running activity; and conflating the two is
>>> just asking for trouble; which is partly why the warning you
>>> suppressed was presented.)
>>> See also section 2.6 in
>>> http://www.cs.utexas.edu/users/dahlin/Classes/UGOS/reading/programming-with-threads.pdf
>>> If really want be able to cancel a running LogixNG computation, you
>>> should be using Futures, not delaying the cancel for the next
>>> computation until no computation is running (what happens if a user
>>> schedules an operation every N seconds, but the operation takes 2*N
>>> seconds to complete? It appears to be that the design intent is that
>>> becomes uncancelable).
>>> Randall
>>>> On Jan 8, 2020, at 17:07, danielb987 <db123@...> wrote:
>>>> The method TimerTask.run() executes the method c.execute() which
>>>> runs its code in ThreadingUtil.runOnGUIEventually() or
>>>> ThreadingUtil.runOnGUI(), so the code will always be executed on the
>>>> GUI thread.
>>>> See the methods DefaultConditionalNG.execute() and
>>>> DefaultConditionalNG.runOnGUI() here:
>>> https://github.com/danielb987/JMRI/blob/LogixNG_new_19/java/src/jmri/jmrit/logixng/implementation/DefaultConditionalNG.java
>>>> Daniel
>>>> 2020-01-08 21:44 skrev Svata Dedic:
>>>>> If the Timer class is meant to be used from one single dedicated
>>>>> thread, the code may eventually work. But as j.u.Timer is used,
>>>>> which
>>>>> fires up its scheduled tasks in a private thread, I would suspect
>>>>> that
>>>>> the user-provided callback is allowed eventually to manipulate the
>>>>> Timer instance, isn't it ?
>>>>> Anyway, at least the private j.u.Timer thread AND the scheduling
>>>>> thread both access the logix timer shared state, so it's likely
>>>>> that
>>>>> the Timer should be thread-safe - right ?
>>>>> -S.
>>>>> Dne 08. 01. 20 v 18:15 danielb987 napsal(a):
>>> https://github.com/danielb987/JMRI/blob/LogixNG_new_19/java/src/jmri/jmrit/logixng/digital/expressions/Timer.java
>>>> The timer task is created in the method getNewTimerTask() at lines
>>>> 157 - 177. See the method TimerTask.run() on lines 161 - 175.
>>>>> The timer is stopped in the method stopTimer() at lines 220 - 252.
>>>>> The variable _timerTask is the current running timer task.
>>>>> _timerTask._timerIsRunning is when the task is running. (Bad name,
>>>>> I should rename the variable).
>>>>> _timerTask._stopTimer tells the timer task to not run the task.
>>>>> If there is a better way to solve this, I will be happy to update
>>>>> my code  :)
>>>>> Daniel
>>>>> 2020-01-08 17:52 skrev Bob Jacobsen:
>>>> I surprised this wasn’t automatically solved by the
>>>> synchronization of
>>>> the timer task with the rest of the code.  How was the task
>>>> protected?
>>>> Bob
>>>> On Jan 8, 2020, at 6:59 AM, danielb987 <db123@...> wrote:
>>>> My code called TimerTask.cancel() before register a new timer task.
>>>> It turned out that the problem was that sometimes TimerTask.cancel()
>>>> was called when the timer task was started but not completed, and
>>>> TimerTask.cancel() doesn't wait for the timer task to end before
>>>> returning. And there is no way to force TimerTask.cancel() to wait.
>>>> -- Bob Jacobsen
>>>> rgj1927@...
>>> Links:
>>> ------
>>> [1] https://jmri-developers.groups.io/g/jmri/message/2545
>>> [2] https://groups.io/mt/69529560/1303822
>>> [3] https://jmri-developers.groups.io/g/jmri/post
>>> [4] https://jmri-developers.groups.io/g/jmri/editsub/1303822
>>> [5] https://jmri-developers.groups.io/g/jmri/leave/defanged
>>
>>
>>
>
> --
> Bob Jacobsen
> rgj1927@...
>
>
>
>
>


Svata Dedic
 

Dne 09. 01. 20 v 0:08 danielb987 napsal(a):
What must _not_ happen is this:
1) Timer starts
2) TimerTask.run() starts
3) TimerTask.cancel() is executed from another task and returns
4) TimerTask.run() continues
TimerTask.cancel() doesn't need to abort the running TimerTask. But it must not return until TimerTask is completed, if TimerTask is started.
In general, killing a pending operation is very very peculiar and usually requires support from the killed code, either checking some "cancelled" variable, interrupt status.

BTW - consider also using Thread.interrupt() on the running task's thread to break potential waits and I/O operations.

Some patterns include using (deprecated) Thread.stop(), which throws ThreadDeath in the target thread... but the code MUST be written in a way it really cleans up all locks, mutexes etc in finally blocks, which is rarely the case. And then there's flaky code that catches Error, including ThreadDeath.

It's far better to *design* without "kill running" operation as it cannot be ensured consistently.

The method TimerTask.run() executes the method c.execute() which runs its code in ThreadingUtil.runOnGUIEventually() or ThreadingUtil.runOnGUI(), so the code will always be executed on the GUI thread.
2/ if the ONLY way to execute the user code is to use runOnGUI*(), the Timer implementation should enforce that, and should not rely on the caller's good behaviour.

3/ if Timer code design *requires* being operated from GUI thread, it should check that, checking for EDT in its operations. Prevents accidental user errors early.

4/ It's a mistake to assume the code can be treated as single threaded: It is accessed at least from GUI thread (by your design) AND the j.u.Timer private thread. Any code invoked directly indirectly from TimerTask.run(), including calls from user code must use synchronization when accessing Timer data, and ALL OTHER operations must synchronize when accessing that data as well. Or ensure any callbacks happen in the designed thread.

This is not the case:
* timerStatusRef vs. _timerIsRunning may become inconsistent
* there are synchronized AND unsynchronized accesses to _timerTask and other data (other than worked with from TimerTask.run()): either the code is single-threaded, and the synchronization is redundant, or the synchronization should be done properly.
* for some reason, if-comparison then set is done manually (and without ensuring consistency) for AtomicReference, instead of its weakCompareAndSet() which was invented for exactly that purpose.

So asking once again is GUI thread the *only* calling thread to all the methods ? Or can be start(), reset(), evaluate(), get*+set*() eventually called in OTHER threads as well ?

The code is inconsistent with your answer, so I'd rather have the design idea confirmed.

There are also other considerations: each Timer runs its own private Thread, where thread construction is expensive (with OS support) and each thread maintenance have some costs (i.e. stack, ...). Thread pools were invented for this some decade ago. Given that the user code should be (finally) executed in GUI only, Timer threads are just for ticking wasting system handles and resources.

There are libraries that use sorted time queues, pool threads ... and almost solve "scheduling problem". Why invent another squared wheel again ?

-S.

P.S.: 1/ having anything but visual code executing in GUI thread is a bad idea, and contributes to JMRI being flaky and irresponsive and unreliable as it is. The less (new) code developed with this bad design, the better.

Svata Dedic
 

Sorry, copy / paste typo:
Dne 09. 01. 20 v 10:00 Svata Dedic napsal(a):
weakCompareAndSet() which was invented for exactly that purpose.
should be compareAndSet().

I was originally thinking about a weak variant, but adding ordering would be better. And weak* is now the same impl as compareAndSet() anyway :)

-S.

Randall Wood
 



On 09-Jan-2020, at 04:00, Svata Dedic <svatopluk.dedic@...> wrote:

Dne 09. 01. 20 v 0:08 danielb987 napsal(a):
What must _not_ happen is this:
1) Timer starts
2) TimerTask.run() starts
3) TimerTask.cancel() is executed from another task and returns
4) TimerTask.run() continues
TimerTask.cancel() doesn't need to abort the running TimerTask. But it must not return until TimerTask is completed, if TimerTask is started.

In general, killing a pending operation is very very peculiar and usually requires support from the killed code, either checking some "cancelled" variable, interrupt status.

BTW - consider also using Thread.interrupt() on the running task's thread to break potential waits and I/O operations.

Some patterns include using (deprecated) Thread.stop(), which throws ThreadDeath in the target thread... but the code MUST be written in a way it really cleans up all locks, mutexes etc in finally blocks, which is rarely the case. And then there's flaky code that catches Error, including ThreadDeath.

It's far better to *design* without "kill running" operation as it cannot be ensured consistently.

To add to Svata’s statement, if you have this requirement, you cannot use TimerTask, since per its documentation, cancel() is designed to be callable from multiple threads, including the thread executing its Runnable. If that must not happen, then you can’t use an API that allows it.

The method TimerTask.run() executes the method c.execute() which runs its code in ThreadingUtil.runOnGUIEventually() or ThreadingUtil.runOnGUI(), so the code will always be executed on the GUI thread.

2/ if the ONLY way to execute the user code is to use runOnGUI*(), the Timer implementation should enforce that, and should not rely on the caller's good behaviour.

3/ if Timer code design *requires* being operated from GUI thread, it should check that, checking for EDT in its operations. Prevents accidental user errors early.

If you are only going to run this on a GUI, you should look into using javax.swing.Timer instead of java.util.Timer.

4/ It's a mistake to assume the code can be treated as single threaded: It is accessed at least from GUI thread (by your design) AND the j.u.Timer private thread. Any code invoked directly indirectly from TimerTask.run(), including calls from user code must use synchronization when accessing Timer data, and ALL OTHER operations must synchronize when accessing that data as well. Or ensure any callbacks happen in the designed thread.

This is not the case:
* timerStatusRef vs. _timerIsRunning may become inconsistent
* there are synchronized AND unsynchronized accesses to _timerTask and other data (other than worked with from TimerTask.run()): either the code is single-threaded, and the synchronization is redundant, or the synchronization should be done properly.
* for some reason, if-comparison then set is done manually (and without ensuring consistency) for AtomicReference, instead of its weakCompareAndSet() which was invented for exactly that purpose.

So asking once again is GUI thread the *only* calling thread to all the methods ? Or can be start(), reset(), evaluate(), get*+set*() eventually called in OTHER threads as well ?

The code is inconsistent with your answer, so I'd rather have the design idea confirmed.

There are also other considerations: each Timer runs its own private Thread, where thread construction is expensive (with OS support) and each thread maintenance have some costs (i.e. stack, ...). Thread pools were invented for this some decade ago. Given that the user code should be (finally) executed in GUI only, Timer threads are just for ticking wasting system handles and resources.

There are libraries that use sorted time queues, pool threads ... and almost solve "scheduling problem". Why invent another squared wheel again ?

-S.

P.S.: 1/ having anything but visual code executing in GUI thread is a bad idea, and contributes to JMRI being flaky and irresponsive and unreliable as it is. The less (new) code developed with this bad design, the better.