Bstrain step by step

If you want to run Bstrain over your region of interest, here is a step-by-step guide.

Prepare your files

1- Crop your data set (2D velocity/displacement field from GNSS or optical correlation) to the region of interest
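
For instance, with a plain-text velocity file loaded into NumPy, cropping can be a simple bounding-box mask (a minimal sketch; the file name, column layout and bounds are placeholders for your own data):

```python
import numpy as np

# Hypothetical input: one row per station, first two columns lon/lat (degrees).
data = np.loadtxt("velocity_field.txt")

# Region of interest (placeholder bounds, in degrees).
lon_min, lon_max, lat_min, lat_max = 5.0, 9.0, 44.0, 47.0

mask = ((data[:, 0] >= lon_min) & (data[:, 0] <= lon_max) &
        (data[:, 1] >= lat_min) & (data[:, 1] <= lat_max))
np.savetxt("velocity_field_crop.txt", data[mask])
```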

2- Choose the appropriate Cartesian projection minimizing the distortions for your area, and project your data to obtain the coordinates (lon, lat) in km units. If you want to use the plotting utilities from the GitLab repository, choose a UTM projection and note its EPSG code (to be given as the -utm option when plotting)
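
As an illustration, here is one way to project geographic coordinates to UTM kilometres with pyproj (a sketch; EPSG:32632, UTM zone 32N, is just an example — use the zone covering your area and pass its code to the -utm plotting option):

```python
from pyproj import Transformer

# Example: WGS84 (EPSG:4326) to UTM zone 32N (EPSG:32632); pick your own zone.
transformer = Transformer.from_crs("EPSG:4326", "EPSG:32632", always_xy=True)

lon_deg, lat_deg = 7.5, 45.2           # one station, for illustration
x_m, y_m = transformer.transform(lon_deg, lat_deg)
x_km, y_km = x_m / 1e3, y_m / 1e3      # Bstrain expects km units
print(x_km, y_km)
```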

3- Your data should be an eight-column ASCII file containing longitude (km), latitude (km), XX (unused), YY (unused), Veast (mm/yr), err_east (mm/yr), Vnorth (mm/yr), err_north (mm/yr). Note that uncertainties must be strictly greater than zero; otherwise the code fails to provide a misfit value, which will be NaN. XX and YY should not be empty; we usually leave them at placeholder values (e.g. 7 and 8000), as they do not affect the code.
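
A quick sanity check of the expected eight-column layout might look like this (a sketch; the file name is an assumption):

```python
import numpy as np

# Columns: lon (km), lat (km), XX (unused), YY (unused),
#          Veast (mm/yr), err_east (mm/yr), Vnorth (mm/yr), err_north (mm/yr)
d = np.loadtxt("input_data.txt")
assert d.shape[1] == 8, "expected 8 columns"
assert np.all(d[:, 5] > 0) and np.all(d[:, 7] > 0), \
    "uncertainties must be strictly positive, otherwise the misfit becomes NaN"
print(f"{d.shape[0]} stations OK")
```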

4- In your calculation directory, you should have: Regression2D.f90, parameters.in, Makefile, qhull, nn, and the fmodel* and lib* files and directories. Regression2D.f90 needs to be modified only once, to set the path to the input data file (a relative path works) at line 231: open(1, file='PATH_TO_DATA', status='old')

5- Edit the parameters.in file by modifying:
- npointm, the number of stations used, i.e. the number of lines in your input data file (adjust ncell_max and ncell_ma accordingly, as they depend on the complexity of the expected signal)
- the size of the triangulation box (longmin, longmax, latmin, latmax), which should be slightly larger than the data extent
- the size of the data-selection box (longminda, longmaxda, latminda, latmaxda)
- the size of the plotting box (longmind, longmaxd, latmind, latmaxd)
- data_amplitude_max, if you want to discard data based on their norm
- ratio_init, which sets the number of nodes of the first model as a fraction of the data (default 0.3, i.e. one third of the data); it can be increased or decreased to speed up convergence.
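
To pick consistent box bounds, you can derive them from the projected data extent plus a margin (a sketch; the 5% margin is an arbitrary choice, not a value prescribed by Bstrain):

```python
import numpy as np

d = np.loadtxt("input_data.txt")          # projected data, lon/lat in km
lon, lat = d[:, 0], d[:, 1]

# Triangulation box: data extent padded by ~5% on each side (arbitrary margin).
pad_lon = 0.05 * (lon.max() - lon.min())
pad_lat = 0.05 * (lat.max() - lat.min())
print("longmin, longmax =", lon.min() - pad_lon, lon.max() + pad_lon)
print("latmin,  latmax  =", lat.min() - pad_lat, lat.max() + pad_lat)
print("npointm =", d.shape[0])
```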

6- To get a correct run, you need to fill parameters.in with the a priori values that you expect for \(d, \omega, I_2, V_e, V_n\). The size of each prior must be adjusted to match the expected pattern as closely as possible:
- adjust the priors on the velocities by choosing the mean of the uniform distribution and its associated width (theta 1 or 2, depending on the velocity component)
- adjust the minimum and maximum values for the tensor invariants (divmin, divmax, etc.)
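
A simple way to pick the velocity priors is to centre them on the observed mean and make them wide enough to bracket the data (a sketch; using mean ± three standard deviations is our own heuristic, and the mapping of theta 1/2 to east/north is an assumption to verify):

```python
import numpy as np

d = np.loadtxt("input_data.txt")
for name, col in (("Veast (theta 1?)", 4), ("Vnorth (theta 2?)", 6)):
    v = d[:, col]
    # Centre of the uniform prior and a width covering ~3 standard deviations.
    print(f"{name}: mean = {v.mean():.2f} mm/yr, "
          f"suggested half-width = {3 * v.std():.2f} mm/yr")
```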

7- If you deal with large deformations, you may want to use the Lagrangian definition of the strain rate tensor; in that case, set the variable lagrangian to 1 in parameters.in (see the About section)
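
For reference, with the standard continuum-mechanics definitions (our addition, written here for a displacement field \(u\); the rate forms are analogous with velocities), the infinitesimal strain tensor keeps only the terms linear in the displacement gradients, while the Lagrangian (Green) tensor retains the quadratic term that becomes significant at large deformation:

\[
\varepsilon_{ij} = \frac{1}{2}\left(\partial_i u_j + \partial_j u_i\right),
\qquad
E_{ij} = \frac{1}{2}\left(\partial_i u_j + \partial_j u_i + \partial_i u_k \, \partial_j u_k\right)
\]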

8- You have to adjust the prior on sigma, the hyperparameter in parameters.in that scales the data uncertainties. If you set sigma_max to 2, you allow Bstrain to multiply the uncertainties given in the dataset by a factor of up to 2. If you trust your velocity field and its uncertainties, we recommend limiting sigma_max to 1.1, i.e. no more than 10% of deviation. Note, however, that this parameter should theoretically be left free, as the Bayesian framework should recover the noise level of the data. In our experience, setting sigma completely free gives results that are too smooth and do not fit the observations correctly. We plan to implement more than one sigma parameter for combining different data sets (one sigma per data set).
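
In other words, assuming sigma simply rescales the reported data uncertainties in the misfit (our reading of the description above; the exact prior bounds are set in parameters.in), the effective uncertainty on each observation is:

\[
\widetilde{\mathrm{err}}_i = \sigma \cdot \mathrm{err}_i,
\qquad
\sigma \le \sigma_{\max}
\]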

9- Adjust the number of iterations (both the total number nsample and the burn_in iterations). Keep them small for the first tests so that you first get a complete run

10- Compile and run!

Warning

The size of the cubes that store the PDFs is crucial (nvt, nvp and nvd variables in parameters.in)! nvt and nvp set the spatial resolution of your final maps, while nvd sets the number of bins used to sample the PDF. The larger nvd, the better defined the results and the statistics you can extract from the cubes. However, cubes can get very large if nvd is increased too much: be careful and start with small cubes to find the proper parametrization! In areas where large strain-rate contrasts are expected, sampling the PDF over too few bins may make you miss the lowest strain gradients.
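
A back-of-the-envelope memory estimate helps choose the cube size before running (a sketch; it assumes 8-byte double-precision values and one cube per invariant, which may not match the actual storage):

```python
# Hypothetical cube dimensions (spatial resolution x number of PDF bins).
nvt, nvp, nvd = 200, 200, 100

bytes_per_value = 8  # assuming double precision
size_gb = nvt * nvp * nvd * bytes_per_value / 1024**3
print(f"one cube of {nvt}x{nvp}x{nvd} values ~ {size_gb:.2f} GB")
```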

Check your results

Here are a few things you can check to assess the quality of the run:

1- Have a look at mpi.out and check that no NaN values appear toward the end of the run.

2- Run the check.py code and have a look at the VR values, which should be high (close to 100% ideally, and in any case not lower than 0). Look at the convergence graph to see whether the number of iterations allows for convergence, and whether the burn_in is adequate. If not, adjust the corresponding values in parameters.in
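
Assuming VR stands for the usual variance reduction, in percent (our interpretation; check the definition used in check.py), a perfect fit gives 100%:

\[
\mathrm{VR} = 100 \times \left(1 - \frac{\sum_i \left(d_i^{\mathrm{obs}} - d_i^{\mathrm{pred}}\right)^2}{\sum_i \left(d_i^{\mathrm{obs}}\right)^2}\right)
\]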

3- Run the plot_compar.py code to look at the differences between the average, median and maximum-mode maps of some invariants. If the discrepancy is large, for instance if the average is patchy while the median is smooth, then convergence is probably not complete, or the PDFs are highly asymmetric.

4- You can then check the shape of some PDFs by selecting suspicious pixels

5- Carefully look at the maps and at the results along profile lines: the maximum mode often still exhibits triangular shapes even when convergence is good, but the median and average should no longer bear triangular shapes. If they do, it may be due to a local outlier (check the input data set), to a sigma that should be higher, or to incomplete convergence.