Qu represents the gradient at a DDP interval given an updated Hessian of the value function Vxx_p. If Vxx_p provides a good approximation of the problem, then we can regard Qu as the gradient of our OC problem.
In contrast, Qu.T * Quu^-1 * Qu represents the expected improvement of the optimization; in fact, this is the quantity used in the Armijo rule.
Generally speaking, we could use the norm of Qu as the stopping criterion. However, I have seen cases where this value is very small (around 10^-5) while there is still a significant expected improvement, i.e. Qu.T * Quu^-1 * Qu is around 10^-3.
We could use both values as stopping criteria. @nmansard, do you think this is a good idea, or would it just make our code more complex?
I included a simple test of the stopping criterion for the LQR case (!27 (merged)). Basically, I build and solve a KKT problem for an LQR problem, and then I check that the gradient of the Lagrangian equals zero.
Note that this test only works for linear systems with quadratic cost functions.
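The idea of the test can be sketched as follows. This is not the code from !27, just a toy scalar LQR (dimensions, gains, and costs are made up) showing the KKT construction and the check that the gradient of the Lagrangian vanishes:

```python
import numpy as np

# Toy scalar LQR, horizon T = 2 (all numbers are illustrative).
A, B = 0.9, 0.5          # dynamics x_{t+1} = A x_t + B u_t
Q, R = 1.0, 0.1          # cost 0.5*Q*x^2 + 0.5*R*u^2 at each node
x0ref = 1.0

# Decision variables z = (x0, u0, x1, u1, x2).
H = np.diag([Q, R, Q, R, Q])              # Hessian of the quadratic cost
# Constraints: x0 = x0ref, x1 - A x0 - B u0 = 0, x2 - A x1 - B u1 = 0.
J = np.array([[1.0, 0.0, 0.0, 0.0, 0.0],
              [-A,  -B,  1.0, 0.0, 0.0],
              [0.0, 0.0, -A,  -B,  1.0]])
b = np.array([x0ref, 0.0, 0.0])

# KKT system: [H J^T; J 0] [z; lam] = [0; b]
KKT = np.block([[H, J.T], [J, np.zeros((3, 3))]])
sol = np.linalg.solve(KKT, np.concatenate([np.zeros(5), b]))
z, lam = sol[:5], sol[5:]

# At the optimum the gradient of the Lagrangian must vanish
# and the constraints must hold.
grad_L = H @ z + J.T @ lam
assert np.allclose(grad_L, 0.0)
assert np.allclose(J @ z, b)
```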
The Jacobian of the constraints (with an explicit x0 variable and the constraint x0 = x0ref) is:

```
J = [  I   0                 ]
    [ -fx  I        -fu      ]
    [     -fx  I        -fu  ]
    [       ...          ... ]
```
The transpose multiplies one lambda per timestep from 0 to T, and produces one "equality" per x and one per u. Looking only at the part of J^T corresponding to x, we have a main diagonal of I and a superdiagonal of -fx^T. Hence:
lambda_T = - lx_T
and recursively
lambda_t - fx^T lambda_t+1 = - lx_t
This recurrence is the same as the recurrence on Vx, so lambda_t = Vx_t (up to the sign convention chosen for the multipliers).
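This identification can be checked numerically. The sketch below uses random illustrative values for lx_t and a single fixed fx (an assumption for brevity; in general fx is time-varying), and compares the multiplier recursion with the Vx recursion, where at an optimal trajectory (Qu = 0) the Vx recursion reduces to Vx_t = lx_t + fx^T Vx_{t+1}:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 3
fx = rng.standard_normal((2, 2))                 # fixed fx for simplicity
lx = [rng.standard_normal(2) for _ in range(T + 1)]

# Multipliers from the KKT stationarity conditions:
# lambda_T = -lx_T, then lambda_t = fx^T lambda_{t+1} - lx_t.
lam = [None] * (T + 1)
lam[T] = -lx[T]
for t in range(T - 1, -1, -1):
    lam[t] = fx.T @ lam[t + 1] - lx[t]

# Value-function gradient along an optimal trajectory (Qu = 0):
# Vx_T = lx_T, Vx_t = lx_t + fx^T Vx_{t+1}.
Vx = [None] * (T + 1)
Vx[T] = lx[T]
for t in range(T - 1, -1, -1):
    Vx[t] = lx[t] + fx.T @ Vx[t + 1]

# The two sequences coincide up to the sign convention of the multipliers.
assert all(np.allclose(lam[t], -Vx[t]) for t in range(T + 1))
```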