sabato 25 giugno 2016
Midterm week and next steps
lunedì 20 giugno 2016
Summary of the work of the first month
Mid-term evaluation has now arrived so it's time to summarize the work I've done and check which goals I have achieved. During this period I enjoied working with the community and the advices given by the mentors and by the other members have been really helpful. The most important change to the project is that, discussing with mentors, we decided to start implementing ode15i because it's more general than ode15s and to build ode15s later around it.
Here you can find the code I've written so far:
https://bitbucket.org/Francesco_Faccio/octave
The most difficult task in the first part of this project was to have Octave compiled with link to Sundials. After having accomplished this, I checked the presence and usability of the library nvector_serial which contains the implementation of IDADENSE and IDABAND modules. I aggregated its build flags with the flags of sundials_ida and included the header nvector_serial.h in the dld-function.
I checked the licenses of Sundials and SUPERLUMT (a package which will be used as a sparse direct solver, independent from Sundials) and they have 3-Clause license, so they are compatible with GNU-license and can be used.
After configuring the correct flags, I began writing a minimal wrapper of ode15i of the form:
[t, y] = ode15i (odefun, tspan, y0, yp0, options)
The first problem was to deal with Sundials' own types. Sundials uses realtype and N_Vector. An N_Vector is a vector of realtype, while a realtype can be both a float, a double or a long double, depending on how Sundials has been built (default type is double). I assumed to use the default double realtype and wrote functions N_Vector ColToNVec (ColumnVector data, long int n) and ColumnVector NVecToCol (N_Vector v, long int n), which convert an Octave ColumnVector to an N_Vector and viceversa.
I checked some minimal input conditions, wrote a few input validation tests, set AbsTol, Reltol, tspan, y0 and yp0 of type realtype or N_Vector.
Once preprocessed data, the moment of glory of Sundials' functions arrived. The first call was to IDACreate(), which creates an IDA memory block and returns a pointer which is then passed as the first argument to all subsequent ida function calls.
Sundials then asks to provide a function which computes the residual function in the DAE. This function must have the form:
int flag = resfun (realtype tt, N_Vector yy, N_Vector yp, N_Vector rr, void *user data)
As a temporary workaround I wrote a function which converts yy, yp and tt in ColumnVector(s), uses feval to evaluate the DAE (passed through a global pointer of type octave_function) and put the output in rr.
Then a call to IDAInit, IDASVtolerances (or IDASStolerances if AbsTol is a scalar), IDADense and IDADlsSetDenseJacFn (if supplied) sets up the linear solver.
Sundials accepts only a Jacobian Function of the form J = DF/DY + cj*DF/DYP, where cj is a scalar proportional to the inverse of the step size (cj is computed by IDA's solver). I used the same workaround of the residual to evaluate J when it's required.
Finally in the main loop a call to IDASolve solves the DAE system and gives the solution in output.
What this wrapper can solve:
this wrapper of ode15i can solve a system of differential equations of the form f(t, y, y') = 0, integrating from t0 to tf with initial conditions y0 and yp0. The output of the function is the solution of the DAE evaluated ONLY in the points supplied in tspan.
It accepts as option a scalar RelTol and a scalar or vector AbsTol. If the user wants to supply a Jacobian, it must be of the form J = DF/DY + cj DF/DYP. Both the system of DAE and the Jacobian must be a function handle.
I tested this wrapper using the 2 benchmark problems described in the previous post.
In Robertson chemical kinetics problem I found the right solution both passing a Jacobian and letting Sundials approximate it. The script I used is the following:
function res = robertsidae(t, y, yp)
res = [-(yp(1) + 0.04*y(1) - 1e4*y(2)*y(3));
-(yp(2) - 0.04*y(1) + 1e4*y(2)*y(3) + 3e7*y(2)^2);
y(1) + y(2) + y(3) - 1];
endfunction
function jacc = jacobian(t, y, yp, c)
jacc = [-0.04-c, 1e4*y(3), 1e4*y(2);
0.04, -c-6*1e7*y(2)-1e4*y(3), -1e4*y(2);
1, 1, 1];
endfunction
options = odeset('RelTol', 1e-3, 'AbsTol', 1e-6, ...
'Jacobian', @jacobian);
y0 = [1; 0; 0];
yp0 = [-1e-4; 1e-4; 0];
tspan = [0 4*logspace(-6, 6)];
[t, y] = ode15i(@robertsidae, tspan, y0, yp0, options);
y(:, 2) = 1e4*y(:, 2);
figure(1);
semilogx(t, y)
xlabel('time');
ylabel('species concentration');
title('Robertson DAE problem with a Conservation Law');
legend ('y1', 'y2', 'y3');
The second problem was to find the solution of a 2-D heat equation semidiscretized to a DAE on the unit square, as described in the previous post.
After discretizing the domain, the system of 100 differential algebraic equations was passed to ode15i (as sparse methods are in progress, I used the dense solver of Sundials also for this problem, without passing the Jacobian).
Here you can find the script I wrote to solve the problem. A 3-D time dependent plot shows how the solution evolves in time and space.
uu0 = zeros(100, 1);
%Initialize uu in all grid points
for j = 1:10
yfact = (1 / 9).*(j - 1);
offset = 10.*(j - 1);
for i = 1:10
xfact= (1 / 9).*(i - 1);
loc = offset + (i - 1);
uu0(loc + 1) = 16 * xfact * (1 - xfact) * yfact * (1 - yfact);
endfor
endfor
up0 = zeros(100, 1);
%Set values of uu and up at boundary points
for j = 1:10
offset = 10 * (j - 1);
for i = 1:10
loc = offset + (i - 1);
if (j == 1 || j == 10 || i == 1 || i == 10 )
uu0 (loc + 1) = 0; up0 (loc + 1) = 0;
endif
endfor
endfor
function res = klu (t, uu, up)
res = [ uu(1);
uu(2);
uu(3);
uu(4);
uu(5);
uu(6);
uu(7);
uu(8);
uu(9);
uu(10);
uu(11);
up(12) - 81 * (uu(11) + uu(13) + uu(2) + uu(22) - 4*uu(12));
up(13) - 81 * (uu(12) + uu(14) + uu(3) + uu(23) - 4*uu(13));
up(14) - 81 * (uu(13) + uu(15) + uu(4) + uu(24) - 4*uu(14));
up(15) - 81 * (uu(14) + uu(16) + uu(5) + uu(25) - 4*uu(15));
up(16) - 81 * (uu(15) + uu(17) + uu(6) + uu(26) - 4*uu(16));
up(17) - 81 * (uu(16) + uu(18) + uu(7) + uu(27) - 4*uu(17));
up(18) - 81 * (uu(17) + uu(19) + uu(8) + uu(28) - 4*uu(18));
up(19) - 81 * (uu(18) + uu(20) + uu(9) + uu(29) - 4*uu(19));
uu(20);
uu(21);
up(22) - 81 * (uu(21) + uu(23) + uu(12) + uu(32) - 4*uu(22));
up(23) - 81 * (uu(22) + uu(24) + uu(13) + uu(33) - 4*uu(23));
up(24) - 81 * (uu(23) + uu(25) + uu(14) + uu(34) - 4*uu(24));
up(25) - 81 * (uu(24) + uu(26) + uu(15) + uu(35) - 4*uu(25));
up(26) - 81 * (uu(25) + uu(27) + uu(16) + uu(36) - 4*uu(26));
up(27) - 81 * (uu(26) + uu(28) + uu(17) + uu(37) - 4*uu(27));
up(28) - 81 * (uu(27) + uu(29) + uu(18) + uu(38) - 4*uu(28));
up(29) - 81 * (uu(28) + uu(30) + uu(19) + uu(39) - 4*uu(29));
uu(30);
uu(31);
up(32) - 81 * (uu(31) + uu(33) + uu(22) + uu(42) - 4*uu(32));
up(33) - 81 * (uu(32) + uu(34) + uu(23) + uu(43) - 4*uu(33));
up(34) - 81 * (uu(33) + uu(35) + uu(24) + uu(44) - 4*uu(34));
up(35) - 81 * (uu(34) + uu(36) + uu(25) + uu(45) - 4*uu(35));
up(36) - 81 * (uu(35) + uu(37) + uu(26) + uu(46) - 4*uu(36));
up(37) - 81 * (uu(36) + uu(38) + uu(27) + uu(47) - 4*uu(37));
up(38) - 81 * (uu(37) + uu(39) + uu(28) + uu(48) - 4*uu(38));
up(39) - 81 * (uu(38) + uu(40) + uu(29) + uu(49) - 4*uu(39));
uu(40);
uu(41);
up(42) - 81 * (uu(41) + uu(43) + uu(32) + uu(52) - 4*uu(42));
up(43) - 81 * (uu(42) + uu(44) + uu(33) + uu(53) - 4*uu(43));
up(44) - 81 * (uu(43) + uu(45) + uu(34) + uu(54) - 4*uu(44));
up(45) - 81 * (uu(44) + uu(46) + uu(35) + uu(55) - 4*uu(45));
up(46) - 81 * (uu(45) + uu(47) + uu(36) + uu(56) - 4*uu(46));
up(47) - 81 * (uu(46) + uu(48) + uu(37) + uu(57) - 4*uu(47));
up(48) - 81 * (uu(47) + uu(49) + uu(38) + uu(58) - 4*uu(48));
up(49) - 81 * (uu(48) + uu(50) + uu(39) + uu(59) - 4*uu(49));
uu(50);
uu(51);
up(52) - 81 * (uu(51) + uu(53) + uu(42) + uu(62) - 4*uu(52));
up(53) - 81 * (uu(52) + uu(54) + uu(43) + uu(63) - 4*uu(53));
up(54) - 81 * (uu(53) + uu(55) + uu(44) + uu(64) - 4*uu(54));
up(55) - 81 * (uu(54) + uu(56) + uu(45) + uu(65) - 4*uu(55));
up(56) - 81 * (uu(55) + uu(57) + uu(46) + uu(66) - 4*uu(56));
up(57) - 81 * (uu(56) + uu(58) + uu(47) + uu(67) - 4*uu(57));
up(58) - 81 * (uu(57) + uu(59) + uu(48) + uu(68) - 4*uu(58));
up(59) - 81 * (uu(58) + uu(50) + uu(49) + uu(69) - 4*uu(59));
uu(60);
uu(61);
up(62) - 81 * (uu(61) + uu(63) + uu(52) + uu(72) - 4*uu(62));
up(63) - 81 * (uu(62) + uu(64) + uu(53) + uu(73) - 4*uu(63));
up(64) - 81 * (uu(63) + uu(65) + uu(54) + uu(74) - 4*uu(64));
up(65) - 81 * (uu(64) + uu(66) + uu(55) + uu(75) - 4*uu(65));
up(66) - 81 * (uu(65) + uu(67) + uu(56) + uu(76) - 4*uu(66));
up(67) - 81 * (uu(66) + uu(68) + uu(57) + uu(77) - 4*uu(67));
up(68) - 81 * (uu(67) + uu(69) + uu(58) + uu(78) - 4*uu(68));
up(69) - 81 * (uu(68) + uu(60) + uu(59) + uu(79) - 4*uu(69));
uu(70);
uu(71);
up(72) - 81 * (uu(71) + uu(73) + uu(62) + uu(82) - 4*uu(72));
up(73) - 81 * (uu(72) + uu(74) + uu(63) + uu(83) - 4*uu(73));
up(74) - 81 * (uu(73) + uu(75) + uu(64) + uu(84) - 4*uu(74));
up(75) - 81 * (uu(74) + uu(76) + uu(65) + uu(85) - 4*uu(75));
up(76) - 81 * (uu(75) + uu(77) + uu(66) + uu(86) - 4*uu(76));
up(77) - 81 * (uu(76) + uu(78) + uu(67) + uu(87) - 4*uu(77));
up(78) - 81 * (uu(77) + uu(79) + uu(68) + uu(88) - 4*uu(78));
up(79) - 81 * (uu(78) + uu(70) + uu(69) + uu(89) - 4*uu(79));
uu(80);
uu(81);
up(82) - 81 * (uu(81) + uu(83) + uu(72) + uu(92) - 4*uu(82));
up(83) - 81 * (uu(82) + uu(84) + uu(73) + uu(93) - 4*uu(83));
up(84) - 81 * (uu(83) + uu(85) + uu(74) + uu(94) - 4*uu(84));
up(85) - 81 * (uu(84) + uu(86) + uu(75) + uu(95) - 4*uu(85));
up(86) - 81 * (uu(85) + uu(87) + uu(76) + uu(96) - 4*uu(86));
up(87) - 81 * (uu(86) + uu(88) + uu(77) + uu(97) - 4*uu(87));
up(88) - 81 * (uu(87) + uu(89) + uu(78) + uu(98) - 4*uu(88));
up(89) - 81 * (uu(88) + uu(90) + uu(79) + uu(99) - 4*uu(89));
uu(90);
uu(91);
uu(92);
uu(93);
uu(94);
uu(95);
uu(96);
uu(97);
uu(98);
uu(99);
uu(100)];
endfunction
tspan = linspace(0, 0.3, 100);
options = odeset('RelTol', 1e-6, 'AbsTol', 1e-8);
[t, y] = ode15i(@klu, tspan, uu0, up0, options);
sol = zeros(10, 10, 100);
for z = 1:100
for i = 1:10
sol(:, i, z) = y(z, (((i - 1) * 10) + 1):(i * 10));
endfor
endfor
figure(1);
for k = 1:100
surf(sol(:, :, k))
axis([0 10 0 10 0 1]);
xlabel('X');
ylabel('Y');
zlabel('Heat');
title('2-D heat equation semidiscretized to a DAE on the unit square');
pause(0.03)
endfor
This is the solution at time t = 0.0091:
Larger problems will be used in order to test the efficiency of the code, because this 2 were solved almost immediately.
giovedì 9 giugno 2016
First Goals and next steps
I have not written a post for a while because I have had some health issues.
During the first two weeks of GSoC I have worked on Autotools and I have compiled Octave with link to SUNDIALS. The first step for doing this was to check the presence and usability of ida.h in configure.ac, so I used the macro OCTAVE_CHECK_LIB which also sets the flags CPPFLAGS, LDFLAGS and LIBS. Then I set the right configuration variables in the build-aux folder and modified build-env namespace. Finally I wrote a dld-function which includes ida.h and calls the function IDACreate from SUNDIALS which returns a pointer to the IDA memory structure.
This dld-function generates an oct-file which can be executed from Octave.
All these changes are visible in my public repository on Bitbucket:
https://bitbucket.org/Francesco_Faccio/octave
In the next few days I will further investigate the recursive dependencies of SUNDIALS and their license and set up the correct build flags for such dependencies, I will write more tests in configure.ac in order to check the availability of functions and headers of the library.
After discussing with mentors we decided to start the implementation of ode15i because it's more close to IDA and more general than ode15s. Once ode15i will be written, ode15s will be built around it.
We have also decided which are the next steps before midterm evaluation:
- implement a minimal .oct wrapper for IDA in Octave with a primitive interface such as $[t , y] = ode15i (odefun, tspan, y0, yp0, Jacobian)$
that invokes IDA with all options set to default values - use two benchmark problems to test the correctness and speed of the code:
I will compare it with the C implementation of SUNDIALS and with the m-file implementation relying on the mex interface of SUNDIALS
As benchmark problems we have chosen two examples which deal with dense and sparse methods.
The first one regards Robertson chemical kinetics problem, in which differential equations are given for species $y_{1}$ and $y_{2}$ while an algebraic equation determines $y_{3}$. The equations for the species concentrations $y_{i}(t)$ are:
\begin{eqnarray*} \begin{cases} y_{1}^{'} = -0.04y_{1} + 10^{4}y_{2}y_{3} \\ y_{2}^{'} = 0.04y_{1} - 10^{4}y_{2}y_{3} - 3\cdot 10^{7}y_{2}^{2} \\ 0 = y_{1} + y_{2} + y_{3} - 1 \end{cases} \end{eqnarray*}
The initial values are taken as $y_{1} = 1$, $y_{2} = 0$ and $y_{3} = 0$. This example computes the three concentration components on the interval from $t = 0$ through $t = 4\cdot 10^{10}$.
This is the plot of the solution (the value of $y_{2}$ is multiplied by a factor of $10^{4}$).
Dense methods of IDA are applied for solving this problem.
The second problem is a $2D$ heat equation, semidiscretized to a DAE. The DAE system arises from the Dirichlet boundary condition $u = 0$, along with the differential equations arising from the discretization of the interior of the region.
The domain is the unit square $\Omega = \{0 \leq x, y \geq 1\}$ and the equations solved are:
\begin{eqnarray*} \begin{cases} \partial u/\partial t = u_{xx} + u_{yy} & (x, y) \in \Omega \\ u = 0 & (x, y) \in \partial \Omega \end{cases} \end{eqnarray*}
The time interval is $0 \leq t \leq 10.24$, and the initial conditions are $u = 16x(1 − x)y(1 − y)$.
We discretize the PDE system (plus boundary conditions) with central differencing on a $10 \times 10$ mesh, so as to obtain a DAE system of size $N = 100$. The dependent variable vector $u$ consists of the values $u(x_{j}, y_{k}, t)$ grouped first by $x$, and then by $y$. Each discrete boundary condition becomes an algebraic equation within the DAE system.
In this problem IDA's sparse direct methods are used and the Jacobian is stored in compressed sparse column (CSC) format.
Regarding functions which deal with input ode check:
Functions check_input and set_ode_options, which I started to write before the beginning of the coding period, will be improved after the midterm evaluation.