GSoC 2016 ode15s

martedì 23 agosto 2016

Summary of the work - GSoC

MathJax TeX Test Page Hello!

In this final blog post I want to summarize the work done so far for the GSoC. First of all I want to thank Octave community for their support:
when I had some problems I have always found someone in the mailing list or in the IRC channel willing to help me.

In my public repository I pushed three final commits:

https://bitbucket.org/Francesco_Faccio/octave-gsoc/commits/all

The first commit I pushed regards the work we did in the first part of the GSoC and it is a patch for Octave which allows to include IDA module of Sundials library as a dependence of Octave.

I wrote some configuration tests, checking for sundials_ida and sundials_nvecserial headers and functions and a macro to test whether sundials_ida has been configured with double precision realtype. I included a test for klu library in suitesparse, because it is required for sundials' sparse solver.
If all these conditions are satisfied, you can use ode15i and ode15s in Octave (you can see in my previous post how to configure sundials and octave correctly and how to run benchmark problems).

The second commit regards my work on solvers ode23 and ode45 which were already in Octave.
Since there were a lot of duplicate code, we decided to refactor the solvers, exporting the check of the options into two private functions:

$[defaults, classes, attributes] = odedefaults (n, t0, tf)$

is a function which returns a struct defaults, containing the default values of all the solvers, and structs classes and attributes, containing the arguments for validateattributes.

$options = odemergeopts (useroptions, options, classes, attributes, fun_name);$

is a function which loops over default fields and, if a user provides a new value, uses validateattributer or validatestring to test the value and overwrites the default one.

odeset.m and odeget.m have been modified in such a way that the user can add new special fields to an ODE option structure. They don't depend on private functions ode_struct_value_check and known_option_names, but use inputParser to perform input checking. This version of odeset doesn't include the option to catch a misspelled option name yet, so that 'Rel' doesn't become automatically 'RelTol'.

We wrote function decic, which can compute consistent initial conditions for ode15i, close to initial guesses y0 and yp0.

The third commit contains files ode15i.m, ode15s.m and a common interface __ode15__.cc

ode15i

The main functionalities of ode15i have been achieved. Here's a list of the options accepted by the solver:

RelTol: You can specify a relative error tolerance as a positive scalar. The default value is 1e-3.

AbsTol: You can specify an absolute error tolerance as a positive scalar or vector. The default value is 1e-6.

NormControl: Sundials controls the relative error using the norm of the solution at each step, so option NormControl is always "on".

OutputFcn, OutputSel: You can specify them to plot the components of the solution you want after each successful integration step. The default output function is $@odeplot$, but you can provide a custom output function.

Refine: You can increase the number of output points using the refine factor. __ode15__.cc uses the IDAGetDky function to reach the highest possible order of interpolation.

Stats: Only standard statistics are printed (number of successful steps, number of failed attempts and the number of function evaluations).

Events: If you provide an event function, it must be of the form $ [value,isterminal,direction] = EventFcn(t,y,yp)$. The solver localizes a change of sign on your event function and uses linear interpolation to extimate the time when the event occurs. A way to improve this functionality will be to use the rootfinding function of Sundials in order to have an higher order of interpolation.

MaxStep: Use this option to specify an upper bound for the integration step. The default value is $0.1*abs(tf - t0)$.

InitialStep: Use this option to specify the first integration step of the solver. Its default value depends on the tspan and the initial slope.

Jacobian: You can specify the Jacobian as a function which returns the modified Jacobian $[\frac{\partial F}{\partial y}, \frac{\partial F}{\partial y'}] = jac (t, y, y')$ or as a cell array containing the two constant Jacobian matrices $ \{[\frac{\partial F}{\partial y}], [\frac{\partial F}{\partial y'}] \}$.

In both cases, if the two parts of the Jacobian are sparse, the Jacobian will be stored in CCS format.
If you don't provide the Jacobian, it will be extimated internally by Sundials, using difference-quotient approximation.
In this implementation you can't provide only one part of the Jacobian yet, but you have to specify both.

JPattern, Vectorized: These options haven't been developed yet.

MaxOrder: You can specify an upper bound for the order of the solver. The default value is 5.

This solver can be extended in order to increase the compatibility with Matlab:
- completing the missing options
- allowing the user to solve equations with complex number
- allowing the user to use single precision variables.

ode15s

This solver uses the same dld-function of ode15i. Inside the m-file, the semi-explicit DAE is written as an implicit one and solve as ode15i.
The solver uses the same options of ode15i, but in addition, it has:

NonNegative: this option has not been implemented yet. It can be implemented using IDASetConstraint function.

Events: Same as ode15i, but it must have the form $[value,isterminal,direction] = EventsFcn(t,y)$.

Jacobian: You can provide a function which returns the Jacobian of the form $[\frac{\partial F}{\partial y}] = jac(t,y)$ or a constant Jacobian matrix.

Mass: You can provide a constant Mass matrix or a function handle of the form $M(t,y)$ or $M(t)$. If the Mass matrix depends on the state, the Jacobian Matrix will be computed internally from Sundials. If you provide a sparse Jacobian and a sparse Mass matrix, they will be stored in CCS format.

MStateDependence: Specify MStateDependence to "strong" or "weak" if the Mass matrix has the form $M(t,y)$. Use the value "none" if the Mass matrix has the form $M(t)$ or it is constant.

MvPattern: This option has not been developed yet.

MassSingular: This option has not been developed yet. The solver solves the problem both with and without a singular mass matrix in the same way.

InitialSlope: You can specify the initial slope of the problem. The default value is a vector of zeros.

BDF: this implementation of ode15s uses always BDF method.

This solver can be extended in order to increase the compatibility with Matlab:
- completing the missing options
- allowing the user to solve equations with complex number
- allowing the user to use single precision variables.

This code has not been merged yet, since Octave is concluding a release process.

Concluding, I really enjoied working on this project and there is still a lot of work to do. I will work in the next weeks to extend the functionalities of ode15i and ode15s. I plan to continue to contribute to Octave development, maybe both in ode and in some statistics project.

domenica 14 agosto 2016

How to replicate test

Hi!

Here's a little guide to replicate the test I made for ode15i

STEP 1

Download last version of Sundials (only IDA module is required (IDA version > 2.8.0), but SundialsTB, idas and kinsol will be used for some test). Follow the install guide of Sundials to build IDA module with KLU (you will have to specify the lib and include directories where you have previously installed KLU). Use SUNDIALS_PRECISION as double (default) in configuration.

STEP 2

Clone my octave repo you can find here: https://bitbucket.org/Francesco_Faccio/octave
and build Octave from source. KLU is already a Octave dependence, but you can specify other versions you have installed at configuration time.

STEP 3

Here you can find the scripts and C functions I used to test ode15i.
You can test my oct-file running the scripts in folder Octave and the same scripts in Matlab folder for testing Matlab's ode15i.

To test the C examples of Sundials compile the C file you can find in folder C, linking against libklu, libsundials_ida and libsundials_nvecserial. If you installed these libraries in default path, type:

$ gcc -c idaHeat2D_klu.c idaRoberts_dns.c
$ gcc -o idaHeat2D_klu idaHeat2D_klu.o  -lsundials_ida
 -lsundials_nvecserial -lklu -lm
$ gcc -o idaRoberts_dns idaRoberts_dns.o  -lsundials_ida
 -lsundials_nvecserial -lklu -lm
$ ./idaRoberts_dns
$ ./idaHeat2D_klu

Sundials provides some mex interfaces for solving the benchmark problems. In order to test that, you have to edit file SundialsTB/install_STB.m changing mex compiler (near row 20):

mexcompiler = 'mkoctfile --mex';

Install Octave and run from Octave install_STB.m script. Choose to compile IDAS and kinsol interface and install the Toolbox. A startup file is generated where you chose to install SundialsTB: run it.

Now you could be able to run the script in MEX folder, which uses Sundials mex interface for solving the dense problem.

For problem ida_klu in Octave and Matlab folder, you can choose to test the solver with a sparse jacobian function, a dense jacobian function, a cell jacobian dense or a sparse jacobian dense.

Since I added the cell jacobian option in ode15i recently and I changed some parameters in the benchmark problems, I have some new results:

Problem Robertson eq (tspan from zero to 4e10, abstol = [1e-8, 1e-14,1e-6], reltol = 1e-4, dense methods):

Matlab: 0.41s
Octave: 0.06s
C impl: 0.002s
Mex impl: 0.12s

Without jacobian:

Matlab: 0.45s
Octave: 0.07s
Mex impl: 0.15s

Problem Heat2D (tspan from zero to 10.24, reltol = 1e-5, abstol = 1e-8, sparse methods):

With a sparse jacobian function:

Matlab: 0.34s
Octave: 0.13s
C impl: 0.005s

With a dense jacobian function:

Matlab: 0.44s
Octave: 0.15s

With a dense jacobian cell:

Matlab: 0.45s
Octave: 0.08s

With a sparse jacobian cell:

Matlab: 0.34s
Octave: 0.07s

Without a jacobian:

Matlab: 0.56s
Octave: 0.47s

It seems a quite goode result, considering that C and Mex implementations are specificly designed for solving these problems. Since I use a m-interface before calling my oct-file, there is a little overhead, but it will be negligible for bigger more complex problems.

For these test I configured Octave with -O2 flag and all the results refer to the code till this commit

lunedì 1 agosto 2016

Performance test

Hello!

As klu library is now linked to ode15i, I could test the performance of the code both in the dense and sparse case. I tested my oct-file against Matlab ode15i m-file, the C implementation of sundials examples and the m-file relying on the mex interface of sundials. Although the code still contains unefficient use of memory, I think that the results are pretty interesting.

Before going into details, I have to specify that C and mex implementations are written for solving these specific problems, so their code is highly optimized (specific Jacobian matrices are constructed directly inside functions required by IDA, while in my oct-file I receive a function that when evaluated it returns the Jacobian matrices I have to pass to IDA). In other words, solving a problem requires more time than solving the problem.

The first test regards the solution of Robertson chemical kinetics problem, in which the Jacobian is a 3x3 dense matrix. I found a solution for RelTol = 1e-4, AbsTol = [1e-8, 1e-14, 1e-6], tspan = [0, 0.4] and I got (in mean):

0.015s for the oct-file
0.13s for the mex-file
0.002s for the C-implementation
0.19s for Matlab ode15i.

The second test regards the solution of a 2D heat equation. In this problem the Jacobian was a 100x100 sparse matrix with 330 non zero elements. I tested with RelTol = 1e-5, AbsTol = 1e-8 and tspan =[0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24] and got:

0.42s for the oct-file
0.005s for the C-implementation
0.45s for Matlab ode15i.

Unfortunately Sundials doesn't provide a mex interface for klu, so I couldn't test a sparse mex file version for this problem.

We can now make some considerations:
-Oct-file is 10 times faster than mex-file and this confirms that mex files cannot reach the performance of an equivalent (in this case, of a more general) otc-file
-Our solver is already faster than Matlab m-file
-The C implementation is the fastest, but code is designed to solve only a specific problem. We can try to get closer to this speed.

In order to improve the performance of the solver, the next step is to try to reduce looping and the use of memory and to write a DAE class, using functors and lambda functions to avoid the use of global pointers.

sabato 25 giugno 2016

Midterm week and next steps

Hello! This is my final post before the end of midterm evaluation and I would like to show what I've done during these days and which are the next steps. After the previous post I wrote some of the options that the user can provide using odeset. I wrote a function to allow the user to pass the modified Jacobian through a function that returns DF/DY and DF/DYP or through a cell array (if the Jacobian is constant) and I completed options MaxStep, InitialStep, MaxOrder (commit ea1a311). I started with mentors to write function decic, which computes initial conditions of the problem F(t,x,x') considering some values of x and x' fixed by the user. This function will be submitted as a patch in the next weeks. I tried to build Octave with KLU module included (it is an optional part of IDA which depends on SuiteSparse) but I had some problems with CMake that I hope to solve during the next days. KLU module will be used to solve sparse problems in ode15i, when a sparse Jacobian is supplied. Once I will be able to write the "sparse part" of ode15i, I will show some efficiency tests between my .oct file, Sundials' MEX and C file and ode15i in Matlab. After completing this part, I will try to improve the quality of the code: - writing a class whose methods will be able to call the function supplied by the user without global pointers and perform more efficient data conversion (avoiding loops) - Refactoring the code, putting all the checks in an other function For the examples I provided in the previous post, consider commit 4f60e96.

lunedì 20 giugno 2016

Summary of the work of the first month

Hello!

Mid-term evaluation has now arrived so it's time to summarize the work I've done and check which goals I have achieved. During this period I enjoied working with the community and the advices given by the mentors and by the other members have been really helpful. The most important change to the project is that, discussing with mentors, we decided to start implementing ode15i because it's more general than ode15s and to build ode15s later around it.
Here you can find the code I've written so far:

https://bitbucket.org/Francesco_Faccio/octave

The most difficult task in the first part of this project was to have Octave compiled with link to Sundials. After having accomplished this, I checked the presence and usability of the library nvector_serial which contains the implementation of IDADENSE and IDABAND modules. I aggregated its build flags with the flags of sundials_ida and included the header nvector_serial.h in the dld-function.

I checked the licenses of Sundials and SUPERLUMT (a package which will be used as a sparse direct solver, independent from Sundials) and they have 3-Clause license, so they are compatible with GNU-license and can be used.

After configuring the correct flags, I began writing a minimal wrapper of ode15i of the form:

[t, y] = ode15i (odefun, tspan, y0, yp0, options)

The first problem was to deal with Sundials' own types. Sundials uses realtype and N_Vector. An N_Vector is a vector of realtype, while a realtype can be both a float, a double or a long double, depending on how Sundials has been built (default type is double). I assumed to use the default double realtype and wrote functions N_Vector ColToNVec (ColumnVector data, long int n) and ColumnVector NVecToCol (N_Vector v, long int n), which convert an Octave ColumnVector to an N_Vector and viceversa.

I checked some minimal input conditions, wrote a few input validation tests, set AbsTol, Reltol, tspan, y0 and yp0 of type realtype or N_Vector.

Once preprocessed data, the moment of glory of Sundials' functions arrived. The first call was to IDACreate(), which creates an IDA memory block and returns a pointer which is then passed as the first argument to all subsequent ida function calls.

Sundials then asks to provide a function which computes the residual function in the DAE. This function must have the form:

int flag = resfun (realtype tt, N_Vector yy, N_Vector yp, N_Vector rr, void *user data)

As a temporary workaround I wrote a function which converts yy, yp and tt in ColumnVector(s), uses feval to evaluate the DAE (passed through a global pointer of type octave_function) and put the output in rr.

Then a call to IDAInit, IDASVtolerances (or IDASStolerances if AbsTol is a scalar), IDADense and IDADlsSetDenseJacFn (if supplied) sets up the linear solver.

Sundials accepts only a Jacobian Function of the form J = DF/DY + cj*DF/DYP, where cj is a scalar proportional to the inverse of the step size (cj is computed by IDA's solver). I used the same workaround of the residual to evaluate J when it's required.

Finally in the main loop a call to IDASolve solves the DAE system and gives the solution in output.

What this wrapper can solve:

this wrapper of ode15i can solve a system of differential equations of the form f(t, y, y') = 0, integrating from t0 to tf with initial conditions y0 and yp0. The output of the function is the solution of the DAE evaluated ONLY in the points supplied in tspan.
It accepts as option a scalar RelTol and a scalar or vector AbsTol. If the user wants to supply a Jacobian, it must be of the form J = DF/DY + cj DF/DYP. Both the system of DAE and the Jacobian must be a function handle.

I tested this wrapper using the 2 benchmark problems described in the previous post.

In Robertson chemical kinetics problem I found the right solution both passing a Jacobian and letting Sundials approximate it. The script I used is the following:

function res = robertsidae(t, y, yp)
res = [-(yp(1) + 0.04*y(1) - 1e4*y(2)*y(3));
       -(yp(2) - 0.04*y(1) + 1e4*y(2)*y(3) + 3e7*y(2)^2);
        y(1) + y(2) + y(3) - 1];
endfunction

function jacc = jacobian(t, y, yp, c)
jacc = [-0.04-c, 1e4*y(3), 1e4*y(2);
        0.04, -c-6*1e7*y(2)-1e4*y(3), -1e4*y(2);
        1, 1, 1];
endfunction

options = odeset('RelTol', 1e-3, 'AbsTol', 1e-6, ...
   'Jacobian', @jacobian);

y0 = [1; 0; 0];
yp0 = [-1e-4; 1e-4; 0];
tspan = [0 4*logspace(-6, 6)];

[t, y] = ode15i(@robertsidae, tspan, y0, yp0, options);

y(:, 2) = 1e4*y(:, 2);
figure(1);
semilogx(t, y)
xlabel('time');
ylabel('species concentration');
title('Robertson DAE problem with a Conservation Law');
legend ('y1', 'y2', 'y3');

As a result I got this:

The second problem was to find the solution of a 2-D heat equation semidiscretized to a DAE on the unit square, as described in the previous post.

After discretizing the domain, the system of 100 differential algebraic equations was passed to ode15i (as sparse methods are in progress, I used the dense solver of Sundials also for this problem, without passing the Jacobian).

Here you can find the script I wrote to solve the problem. A 3-D time dependent plot shows how the solution evolves in time and space.

uu0 = zeros(100, 1);

%Initialize uu in all grid points
for j = 1:10 
  yfact = (1 / 9).*(j - 1);
  offset = 10.*(j - 1);
  for i = 1:10 
    xfact= (1 / 9).*(i - 1);
    loc = offset + (i - 1);
    uu0(loc + 1) = 16 * xfact * (1 - xfact) * yfact * (1 - yfact);
  endfor
endfor

up0 = zeros(100, 1);

%Set values of uu and up at boundary points
for j = 1:10 
  offset = 10 * (j - 1);
  for i = 1:10 
    loc = offset + (i - 1);
    if (j == 1 || j == 10 || i == 1 || i == 10 ) 
      uu0 (loc + 1) = 0; up0 (loc + 1) = 0;
    endif
  endfor
endfor


function res = klu (t, uu, up)
 res = [ uu(1);
         uu(2);
         uu(3);
         uu(4);
         uu(5);
         uu(6);
         uu(7);
         uu(8);
         uu(9);
         uu(10);
         uu(11);
         up(12) - 81 * (uu(11) + uu(13) + uu(2) + uu(22) - 4*uu(12));
         up(13) - 81 * (uu(12) + uu(14) + uu(3) + uu(23) - 4*uu(13));
         up(14) - 81 * (uu(13) + uu(15) + uu(4) + uu(24) - 4*uu(14));
         up(15) - 81 * (uu(14) + uu(16) + uu(5) + uu(25) - 4*uu(15));
         up(16) - 81 * (uu(15) + uu(17) + uu(6) + uu(26) - 4*uu(16));
         up(17) - 81 * (uu(16) + uu(18) + uu(7) + uu(27) - 4*uu(17));
         up(18) - 81 * (uu(17) + uu(19) + uu(8) + uu(28) - 4*uu(18));
         up(19) - 81 * (uu(18) + uu(20) + uu(9) + uu(29) - 4*uu(19));
         uu(20);
         uu(21);
         up(22) - 81 * (uu(21) + uu(23) + uu(12) + uu(32) - 4*uu(22));
         up(23) - 81 * (uu(22) + uu(24) + uu(13) + uu(33) - 4*uu(23));
         up(24) - 81 * (uu(23) + uu(25) + uu(14) + uu(34) - 4*uu(24));
         up(25) - 81 * (uu(24) + uu(26) + uu(15) + uu(35) - 4*uu(25));
         up(26) - 81 * (uu(25) + uu(27) + uu(16) + uu(36) - 4*uu(26));
         up(27) - 81 * (uu(26) + uu(28) + uu(17) + uu(37) - 4*uu(27));
         up(28) - 81 * (uu(27) + uu(29) + uu(18) + uu(38) - 4*uu(28));
         up(29) - 81 * (uu(28) + uu(30) + uu(19) + uu(39) - 4*uu(29));
         uu(30);
         uu(31);
         up(32) - 81 * (uu(31) + uu(33) + uu(22) + uu(42) - 4*uu(32));
         up(33) - 81 * (uu(32) + uu(34) + uu(23) + uu(43) - 4*uu(33));
         up(34) - 81 * (uu(33) + uu(35) + uu(24) + uu(44) - 4*uu(34));
         up(35) - 81 * (uu(34) + uu(36) + uu(25) + uu(45) - 4*uu(35));
         up(36) - 81 * (uu(35) + uu(37) + uu(26) + uu(46) - 4*uu(36));
         up(37) - 81 * (uu(36) + uu(38) + uu(27) + uu(47) - 4*uu(37));
         up(38) - 81 * (uu(37) + uu(39) + uu(28) + uu(48) - 4*uu(38));
         up(39) - 81 * (uu(38) + uu(40) + uu(29) + uu(49) - 4*uu(39));
         uu(40);
         uu(41);
         up(42) - 81 * (uu(41) + uu(43) + uu(32) + uu(52) - 4*uu(42));
         up(43) - 81 * (uu(42) + uu(44) + uu(33) + uu(53) - 4*uu(43));
         up(44) - 81 * (uu(43) + uu(45) + uu(34) + uu(54) - 4*uu(44));
         up(45) - 81 * (uu(44) + uu(46) + uu(35) + uu(55) - 4*uu(45));
         up(46) - 81 * (uu(45) + uu(47) + uu(36) + uu(56) - 4*uu(46));
         up(47) - 81 * (uu(46) + uu(48) + uu(37) + uu(57) - 4*uu(47));
         up(48) - 81 * (uu(47) + uu(49) + uu(38) + uu(58) - 4*uu(48));
         up(49) - 81 * (uu(48) + uu(50) + uu(39) + uu(59) - 4*uu(49));
         uu(50);
         uu(51);
         up(52) - 81 * (uu(51) + uu(53) + uu(42) + uu(62) - 4*uu(52));
         up(53) - 81 * (uu(52) + uu(54) + uu(43) + uu(63) - 4*uu(53));
         up(54) - 81 * (uu(53) + uu(55) + uu(44) + uu(64) - 4*uu(54));
         up(55) - 81 * (uu(54) + uu(56) + uu(45) + uu(65) - 4*uu(55));
         up(56) - 81 * (uu(55) + uu(57) + uu(46) + uu(66) - 4*uu(56));
         up(57) - 81 * (uu(56) + uu(58) + uu(47) + uu(67) - 4*uu(57));
         up(58) - 81 * (uu(57) + uu(59) + uu(48) + uu(68) - 4*uu(58));
         up(59) - 81 * (uu(58) + uu(50) + uu(49) + uu(69) - 4*uu(59));
         uu(60);
         uu(61);
         up(62) - 81 * (uu(61) + uu(63) + uu(52) + uu(72) - 4*uu(62));
         up(63) - 81 * (uu(62) + uu(64) + uu(53) + uu(73) - 4*uu(63));
         up(64) - 81 * (uu(63) + uu(65) + uu(54) + uu(74) - 4*uu(64));
         up(65) - 81 * (uu(64) + uu(66) + uu(55) + uu(75) - 4*uu(65));
         up(66) - 81 * (uu(65) + uu(67) + uu(56) + uu(76) - 4*uu(66));
         up(67) - 81 * (uu(66) + uu(68) + uu(57) + uu(77) - 4*uu(67));
         up(68) - 81 * (uu(67) + uu(69) + uu(58) + uu(78) - 4*uu(68));
         up(69) - 81 * (uu(68) + uu(60) + uu(59) + uu(79) - 4*uu(69));
         uu(70);
         uu(71);
         up(72) - 81 * (uu(71) + uu(73) + uu(62) + uu(82) - 4*uu(72));
         up(73) - 81 * (uu(72) + uu(74) + uu(63) + uu(83) - 4*uu(73));
         up(74) - 81 * (uu(73) + uu(75) + uu(64) + uu(84) - 4*uu(74));
         up(75) - 81 * (uu(74) + uu(76) + uu(65) + uu(85) - 4*uu(75));
         up(76) - 81 * (uu(75) + uu(77) + uu(66) + uu(86) - 4*uu(76));
         up(77) - 81 * (uu(76) + uu(78) + uu(67) + uu(87) - 4*uu(77));
         up(78) - 81 * (uu(77) + uu(79) + uu(68) + uu(88) - 4*uu(78));
         up(79) - 81 * (uu(78) + uu(70) + uu(69) + uu(89) - 4*uu(79));
         uu(80);
         uu(81);
         up(82) - 81 * (uu(81) + uu(83) + uu(72) + uu(92) - 4*uu(82));
         up(83) - 81 * (uu(82) + uu(84) + uu(73) + uu(93) - 4*uu(83));
         up(84) - 81 * (uu(83) + uu(85) + uu(74) + uu(94) - 4*uu(84));
         up(85) - 81 * (uu(84) + uu(86) + uu(75) + uu(95) - 4*uu(85));
         up(86) - 81 * (uu(85) + uu(87) + uu(76) + uu(96) - 4*uu(86));
         up(87) - 81 * (uu(86) + uu(88) + uu(77) + uu(97) - 4*uu(87));
         up(88) - 81 * (uu(87) + uu(89) + uu(78) + uu(98) - 4*uu(88));
         up(89) - 81 * (uu(88) + uu(90) + uu(79) + uu(99) - 4*uu(89));
         uu(90);
         uu(91);
         uu(92);
         uu(93);
         uu(94);
         uu(95);
         uu(96);
         uu(97);
         uu(98);
         uu(99);
         uu(100)];
endfunction

tspan = linspace(0, 0.3, 100);
options = odeset('RelTol', 1e-6, 'AbsTol', 1e-8);

[t, y] = ode15i(@klu, tspan, uu0, up0, options);


sol = zeros(10, 10, 100);
for z = 1:100
  for i = 1:10
    sol(:, i, z) = y(z, (((i - 1) * 10) + 1):(i * 10));
  endfor
endfor

figure(1);

for k = 1:100
  surf(sol(:, :, k))
  axis([0 10 0 10 0 1]);
  xlabel('X');
  ylabel('Y');
  zlabel('Heat');
  title('2-D heat equation semidiscretized to a DAE on the unit square');
  pause(0.03)
endfor

This is the solution at time t = 0.0091:

Larger problems will be used in order to test the efficiency of the code, because this 2 were solved almost immediately.

giovedì 9 giugno 2016

First Goals and next steps

MathJax TeX Test Page Hello!

I have not written a post for a while because I have had some health issues.

During the first two weeks of GSoC I have worked on Autotools and I have compiled Octave with link to SUNDIALS. The first step for doing this was to check the presence and usability of ida.h in configure.ac, so I used the macro OCTAVE_CHECK_LIB which also sets the flags CPPFLAGS, LDFLAGS and LIBS. Then I set the right configuration variables in the build-aux folder and modified build-env namespace. Finally I wrote a dld-function which includes ida.h and calls the function IDACreate from SUNDIALS which returns a pointer to the IDA memory structure.
This dld-function generates an oct-file which can be executed from Octave.

All these changes are visible in my public repository on Bitbucket:
https://bitbucket.org/Francesco_Faccio/octave

In the next few days I will further investigate the recursive dependencies of SUNDIALS and their license and set up the correct build flags for such dependencies, I will write more tests in configure.ac in order to check the availability of functions and headers of the library.

After discussing with mentors we decided to start the implementation of ode15i because it's more close to IDA and more general than ode15s. Once ode15i will be written, ode15s will be built around it.

We have also decided which are the next steps before midterm evaluation:

implement a minimal .oct wrapper for IDA in Octave with a primitive interface such as $[t , y] = ode15i (odefun, tspan, y0, yp0, Jacobian)$
that invokes IDA with all options set to default values

use two benchmark problems to test the correctness and speed of the code:
I will compare it with the C implementation of SUNDIALS and with the m-file implementation relying on the mex interface of SUNDIALS

As benchmark problems we have chosen two examples which deal with dense and sparse methods.

The first one regards Robertson chemical kinetics problem, in which differential equations are given for species $y_{1}$ and $y_{2}$ while an algebraic equation determines $y_{3}$. The equations for the species concentrations $y_{i}(t)$ are:

\begin{eqnarray*} \begin{cases} y_{1}^{'} = -0.04y_{1} + 10^{4}y_{2}y_{3} \\ y_{2}^{'} = 0.04y_{1} - 10^{4}y_{2}y_{3} - 3\cdot 10^{7}y_{2}^{2} \\ 0 = y_{1} + y_{2} + y_{3} - 1 \end{cases} \end{eqnarray*}

The initial values are taken as $y_{1} = 1$, $y_{2} = 0$ and $y_{3} = 0$. This example computes the three concentration components on the interval from $t = 0$ through $t = 4\cdot 10^{10}$.

This is the plot of the solution (the value of $y_{2}$ is multiplied by a factor of $10^{4}$).

Dense methods of IDA are applied for solving this problem.

The second problem is a $2D$ heat equation, semidiscretized to a DAE. The DAE system arises from the Dirichlet boundary condition $u = 0$, along with the differential equations arising from the discretization of the interior of the region.
The domain is the unit square $\Omega = \{0 \leq x, y \geq 1\}$ and the equations solved are:

\begin{eqnarray*} \begin{cases} \partial u/\partial t = u_{xx} + u_{yy} & (x, y) \in \Omega \\ u = 0 & (x, y) \in \partial \Omega \end{cases} \end{eqnarray*}

The time interval is $0 \leq t \leq 10.24$, and the initial conditions are $u = 16x(1 − x)y(1 − y)$.
We discretize the PDE system (plus boundary conditions) with central differencing on a $10 \times 10$ mesh, so as to obtain a DAE system of size $N = 100$. The dependent variable vector $u$ consists of the values $u(x_{j}, y_{k}, t)$ grouped first by $x$, and then by $y$. Each discrete boundary condition becomes an algebraic equation within the DAE system.

In this problem IDA's sparse direct methods are used and the Jacobian is stored in compressed sparse column (CSC) format.

Regarding functions which deal with input ode check:
Functions check_input and set_ode_options, which I started to write before the beginning of the coding period, will be improved after the midterm evaluation.

domenica 15 maggio 2016

Timeline

Hello!

This is a Timeline for the project ode15s.

As discussed with the mentors, our goals for the mid-term evaluations are to build Octave with all the dependencies of SUNDIALS and to create an m-file which deals with the input parameters and the options of a generic ODE/DAE solver.
The final goal is to have a well tested and documented implementation of ode15s.

Community Bonding Period:
-Familiarize with Autotools and the structure of Octave
-Study the documentation of SUNDIALS and Oct-files

Week 1-2 (May 23 - Jun 5):
-Add SUNDIALS as a dependency and build Octave from source (I will usa a dld_function which calls a function of SUNDIALS)

Week 2-3 (Jun 6 - 19):

-Write an m-file which deals with the input of a generic ode/dae solver

Midterm Evaluations

Week 4 (Jun 20 - 26):
-Design the code of ode15s (I will choose which functions of SUNDIALS will be used)

Week 5-6 (Jun 27 - Jul 10):
-Write Oct-files

Week 7 (Jul 11 - 17):

-Write tests

Week 8-9 (Jul 18 - 31):
-Test compatibility between Matlab and Octave
-Test the performance of the algorithm

Week 10-11 (Aug 1 - 14):
-Write the documentation and perform more tests

Week 12 (Aug 15 - 21)
-Review of the work

OPTIONAL: If I finish the work early, I will try to write a (slower) version of ode15s which uses DASPK or DASSL (this is for people who don't have SUNDIALS installed)

Final Evaluations