.. _sec.surrogate.alamo:

Tutorial 1: ALAMO
=================

This tutorial focuses on the use of the ALAMO tool for building
algebraic surrogate models. ALAMO builds simplified algebraic models,
which are particularly well suited for rigorous equation oriented
optimization. To keep the execution of this tutorial fast, a toy problem
is used. In this case study the flowsheet calculations and sample
generation are done within FOQUS, alternatively, the user can provide a
simulation model such as: Excel, Aspen plus, Aspen custom modeler, etc.

Note: Before starting this tutorial the ALAMO product must be downloaded
from the products page on the CCSI website. The path for the ALAMO
executable file must be set in FOQUS settings (see Section
:ref:`section.settings`). Some installations may have two executables;
**alamo.exe** is the main executable for accessing the ALAMO trainer,
and **alamo-ui.exe** is a second executable for launching the stand-alone
user interface application. The **ALAMO EXE** path should be set as the path
to the main executable.

The FOQUS file (**Surrogate_Tutorial_1.foqus**),
where Steps 1 to 42 of this tutorial have been completed
is located in: :path:`examples/tutorial_files/Surrogates`

.. note:: |examples_reminder_text|

Flowsheet Setup
---------------

#. Open FOQUS.

#. Name the session “Surrogate_Tutorial_1” (Figure
   :ref:`fig.tut.sur.session`).

.. figure:: ../figs/session1.svg
   :alt: Session Set Up
   :name: fig.tut.sur.session

   Session Set Up

3. Navigate to the Flowsheet Editor (Figure
   :ref:`fig.tut.sur.flowsheet`).

4. Add a Flowsheet Node named “eq.”

5. Display the Node Editor by clicking the **Node Editor** toggle
   button.

.. figure:: ../figs/flowsheet.svg
   :alt: Flowsheet Setup
   :name: fig.tut.sur.flowsheet

   Flowsheet Setup

The **Node Editor** displays (Figure :ref:`fig.tut.sur.nodeEdit.Input`).
The first step to setting up the node for this problem is to add input
and output variables to the node.

6. If the input variables table is not displayed as shown in Figure
   :ref:`fig.tut.sur.nodeEdit.Input`, click
   the **Variables** tab and then click the **Input Variables** toolbox
   section.

7. Add the variables “x1” and “x2” by clicking the **Add** icon (+)
   above the input table.

8. Edit the **Min/Max** value for both variables to be “-10.0” and
   “10.0.”

9. Add two output variables “z1” and “z2.”

.. figure:: ../figs/nodeInput.svg
   :alt: Node Variables
   :name: fig.tut.sur.nodeEdit.Input

   Node Variables

To keep the execution time short, the node will not be assigned to a
simulation model and calculations are performed directly in FOQUS.

10. Click on the **Node Script** tab in the Node Editor to enter the
    test equation (this step replaces the use of a simulator).

11. Enter the following equations (Figure
    :ref:`fig.tut.sur.nodeEdit.eq`):

    ::

               f["z1"] = x["x1"] + x["x2"]
               f["z2"] = x["x1"]**2 + x["x2"]**2


    The node script calculations are written in Python. The dictionary
    “f” stores output values while the dictionary “x” stores input
    values.

    .. figure:: ../figs/nodeEq.svg
       :alt: Node Script
       :name: fig.tut.sur.nodeEdit.eq

       Node Script

12. Test the model by running the flowsheet with the value “2” for “x1”
    and “x2.” After running, the output variables should have the values
    “4.0” for “z1” and “8.0” for “z2.”

Creating Initial Samples
------------------------

There are two ways to start an ALAMO run: (1) generate a set of initial
data, (2) use ALAMO’s adaptive sampling with no initial data and let
ALAMO generate its own samples. Adaptive sampling can be used with
initial data to generate more points if needed. In this case, initial
data is provided and adaptive sampling is used.

13. Select the UQ tool by clicking on the **Uncertainty** button on the
    Home window (Figure :ref:`fig.tut.sur.new.uq.ens`).

14. Click the **Add New** button.

15. The **Add New Ensemble - Model Selection** dialog will appear. Click
    **OK** to set up the sampling scheme.

.. figure:: ../figs/uqNewEns.svg
   :alt: Add a New Sample Ensemble
   :name: fig.tut.sur.new.uq.ens

   Add a New Sample Ensemble

16. The sample ensemble setup dialog displays (Figure
    :ref:`fig.tut.sur.new.uq.sample1`).
    Select **Choose sampling scheme**.

17. Click the **All Variable** button.

18. Select the **Sampling scheme** tab.

.. figure:: ../figs/uqSample1.svg
   :alt: Sample Distributions
   :name: fig.tut.sur.new.uq.sample1

   Sample Distributions

19. The **Sampling scheme** dialog should display (Figure
    :ref:`fig.tut.sur.new.uq.sample2`).
    Select “Latin Hypercube” from the list.

20. Set the **# of samples** to “1000.”

21. Click **Generate Samples**.

22. Click **Done**.

.. figure:: ../figs/uqSample2.svg
   :alt: Sample Methods
   :name: fig.tut.sur.new.uq.sample2

   Sample Methods

23. Once the samples have been generated a new sample ensemble displays
    in the UQ tool window (Figure :ref:`fig.tut.sur.new.uq.sample3`).
    Click **Launch** to run and generate the samples.

.. figure:: ../figs/uqSample3.svg
   :alt: Run Samples
   :name: fig.tut.sur.new.uq.sample3

   Run Samples

Data Selection
--------------

Initial and validation data can be specified by creating filters that
specify subsets of flowsheet data. In this tutorial only initial data
will be used. A filter must be created to separate the results of the
single test run from the UQ samples.

24. Click on the **Surrogates** button from the Home window. The
    surrogate tool displays :ref:`fig.tut.sur.data`.

25. Select “ALAMO” from the **Tool** drop-down list.

26. Click **Edit Filters** in the **Flowsheet Results** section to
    create a filter.

.. figure:: ../figs/data.svg
   :alt: Surrogate Data
   :name: fig.tut.sur.data

   Surrogate Data

27. Figure :ref:`fig.tut.sur.dataFilter_surrogate_upd`
    displays the Data Filter Editor.

28. Add the filter for initial data.

    #. Click **New Filter**, and enter “f1” as the filter name.

    #. Type the **Filter expression**: c(“set”) = = “UQ_Ensemble”.

29. Click **Done**.

.. figure:: ../figs/dataFilter_surrogate_upd.png
   :alt: Data Filter Dialog
   :name: fig.tut.sur.dataFilter_surrogate_upd

   Data Filter Dialog

Variable Selection
------------------

In this section, input and output variables need to be selected.
Generally, any input variables that vary in the data set should be
selected. However, in some cases, variables may be found to have no, or
very little, effect on the outputs. Only the output variables of
interest need to be selected. Note: Each output is independent from each
other and for the model building, selecting one output is the same as
selecting more.

30. Select the **Variable\ s** tab (Figure
    :ref:`fig.tut.sur.variables`).

31. Select the checkbox for both input variables.

32. Select the checkbox for both output variables.

.. figure:: ../figs/variables.svg
   :alt: Variable Selection
   :name: fig.tut.sur.variables

   Variable Selection

.. _tutorial.alamo.methodsettings:

Method Settings
---------------

The most important feature to generate "good" algebraic models is to
configure the settings accordingly to the problem to be solved. Each
setting has a good description in FOQUS. The JSON parser is used to read
method settings values. Strings must be contained in quotes. Lists have
the following format: [element 1, element 2].

33. Click on the **Method Settings** tab (see Figure
    :ref:`fig.alamo.method.settings.1` and :ref:`fig.alamo.method.settings.2`).

34. Set the **FOQUS Model (for UQ)** to “alamo_surrogate_uq.py.”

35. Set the **FOQUS Model (for Flowsheet)** to “alamo_surrogate_fs.py”

36. Set **Initial Data Filter** to “f1”

37. Set **SAMPLER** to select the adaptive sampling method: “None”
    “Random” or “SNOBFIT.” Use “None” in this tutorial.

38. Set **MONOMIALPOWER** to select the single variable term powers to
    [1,2,3].

39. Set **MULTI2POWER** to select the two variable term powers to [1].

40. Select functions to be considered as basis functions (**EXPFCNS**,
    **LOGFCNS**, **SINFCNS**, **COSFCNS**, **LINFCNS**, **CONSTANT**).

41. Leave the rest of settings as default (see Table
    :ref:`tutorial.alamo.table`).

42. Save this FOQUS session for use in the ACOSSO and BSS-ANOVA
    tutorials.

.. figure:: ../figs/Alamo_Method_Settings_1.png
   :alt: ALAMO Method Settings
   :name: fig.alamo.method.settings.1

   ALAMO Method Settings

.. figure:: ../figs/Alamo_Method_Settings_2.png
   :alt: ALAMO Method Settings Continued
   :name: fig.alamo.method.settings.2

   ALAMO Method Settings Continued

Execution
---------

43. Click the **Run** icon at the top of the window.

44. The ALAMO **Execution** tab starts displaying execution file path,
    sub-directories, input files, and output files.

    #. ALAMO version.

    #. License Information.

    #. Step 0 displays the data set to be used by ALAMO.

    #. Step 1 displays the modeler used by ALAMO to generate the
       algebraic model.

    #. Once the surrogate model has finished, the equations are
       displayed in the execution window. It may be necessary to scroll
       up a little. The result is shown in Figure :ref:`fig.alamo.res`.

    #. Finally, the statistics display the quality metrics of the models
       generated.

.. figure:: ../figs/alamo_exec.svg
   :alt: ALAMO Execution
   :name: fig.alamo.res

   ALAMO Execution

Results
-------

The results are exported as a PSUADE driver file that can be used
perform UQ analysis of the models, and a FOQUS Python plugin model that
allows it to be used in a FOQUS flowsheet. The equations can also be
viewed in the results section.

See tutorial Section :ref:`tutorial.surrogate.uq` and
:ref:`tutorial.surrogate.fs` for information
about analyzing the model with the UQ tools or running the model on the
flowsheet.

As mentioned in section `1.5 <#tutorial.alamo.methodsettings>`__ the
method settings are very important. A brief description and hints are
included in Table :ref:`tutorial.alamo.table`.

.. _tutorial.alamo.table:
.. table:: ALAMO Method Settings

   +-----------------------------------+-----------------------------------+
   | **Method Settings**               | **Description**                   |
   +-----------------------------------+-----------------------------------+
   | Initial Data Filter               | Filter to be applied to the       |
   |                                   | initial data set. Data filters    |
   |                                   | help the user to generate models  |
   |                                   | based on specific data for each   |
   |                                   | variable.                         |
   +-----------------------------------+-----------------------------------+
   | Validation Data filter            | Data set used to compute model    |
   |                                   | errors at the validation phase.   |
   |                                   | The number of data points in a    |
   |                                   | preexisting validation data set   |
   |                                   | can be specified by the user.     |
   +-----------------------------------+-----------------------------------+
   | SAMPLER                           | Adaptative sampling method to be  |
   |                                   | used. Options: "None", "Random"   |
   |                                   | and "SNOBFIT". Adaptive sampling  |
   |                                   | method to be used by ALAMO when   |
   |                                   | more sampling points are needed   |
   |                                   | by the model. If **Random** is    |
   |                                   | used a simulator must be provided |
   |                                   | by the user. If **SNOBFIT** is    |
   |                                   | used a simulator must be provided |
   |                                   | by the user and MATLAB must be    |
   |                                   | installed.                        |
   +-----------------------------------+-----------------------------------+
   | MAXTIME                           | Maximum execution time in         |
   |                                   | seconds. This time includes all   |
   |                                   | the steps on the algorithm, if    |
   |                                   | simulations are needed they run   |
   |                                   | in this time.                     |
   +-----------------------------------+-----------------------------------+
   | MINPOINTS                         | Convergence is assessed only if   |
   |                                   | the simulator is able to compute  |
   |                                   | the output variables for at least |
   |                                   | MINPOINTS of the data set. A      |
   |                                   | reduced number of MINPOINTS may   |
   |                                   | reduce the computational time to  |
   |                                   | get a model, but also reduces the |
   |                                   | accuracy of the model. MINPOINTS  |
   |                                   | must be a positive integer.       |
   +-----------------------------------+-----------------------------------+
   | PRESET                            | Value to be used if the simulator |
   |                                   | fails. This value must be         |
   |                                   | carefully chosen to be an         |
   |                                   | otherwise not realizable value    |
   |                                   | for the output variables.         |
   +-----------------------------------+-----------------------------------+
   | MONOMIALPOWERS                    | Vector of monomial powers to be   |
   |                                   | considered as basis functions,    |
   |                                   | use empty vector for none [].     |
   |                                   | Exponential terms allowed in the  |
   |                                   | algebraic model. i.e., if         |
   |                                   | selecting [1,2] the model         |
   |                                   | considers x1 and x1**2 as basis   |
   |                                   | functions.                        |
   +-----------------------------------+-----------------------------------+
   | MULTI2POWER                       | Vector of pairwise combination of |
   |                                   | powers to be considered as basis  |
   |                                   | functions. Pairwise combination   |
   |                                   | of powers allowed in the          |
   |                                   | algebraic model. i.e., [1,2]      |
   |                                   | allows terms like x1*x2 in the    |
   |                                   | algebraic model.                  |
   +-----------------------------------+-----------------------------------+
   | MULTI3POWER                       | Vector of three variables         |
   |                                   | combinations of powers to be      |
   |                                   | considered as basis functions.    |
   +-----------------------------------+-----------------------------------+
   | EXPFCNS, LOGFCNS, SINFCNS,        | Use or not of exp, log, sin, cos, |
   | COSFCNS, LINFCNS, CONSTANT        | linear, and constant functions as |
   |                                   | basis functions in the model.     |
   +-----------------------------------+-----------------------------------+
   | RATIOPOWER                        | Vector of ratio combinations of   |
   |                                   | powers to be considered in the    |
   |                                   | basis functions. Ratio            |
   |                                   | combinations of powers are [empty |
   |                                   | as default].                      |
   +-----------------------------------+-----------------------------------+
   | Radial Basis Functions            | Radial basis functions centered   |
   |                                   | around the data set provided by   |
   |                                   | the user. These functions are     |
   |                                   | Gaussian and are deactivated if   |
   |                                   | their textual representation      |
   |                                   | requires more than 128 characters |
   |                                   | (in the case of too many input    |
   |                                   | variables and/or datapoints).     |
   +-----------------------------------+-----------------------------------+
   | RBF parameter                     | Constant penalty used in the      |
   |                                   | Gaussian radial basis functions.  |
   +-----------------------------------+-----------------------------------+
   | Modeler                           | Fitness metric to be used for     |
   |                                   | model building. Options: BIC      |
   |                                   | (Bayesian Information Criterion), |
   |                                   | Mallow’s Cp, AICc (Corrected      |
   |                                   | Akaike’s Information Criterio),   |
   |                                   | HQC (Hannan-Quinn Information     |
   |                                   | Criterion), MSE (Mean Square      |
   |                                   | Error), and Convex Penalty.       |
   +-----------------------------------+-----------------------------------+
   | ConvPen                           | Convex penalty term. Used if      |
   |                                   | Convex Penalty is selected.       |
   +-----------------------------------+-----------------------------------+
   | Regularizer                       | Regularization method is used to  |
   |                                   | reduce the number of potential    |
   |                                   | basis functions before the        |
   |                                   | optimization.                     |
   +-----------------------------------+-----------------------------------+
   | Tolrelmetric                      | Convergence tolerance for the     |
   |                                   | chosen fitness metric is needed   |
   |                                   | to terminate the algorithm.       |
   +-----------------------------------+-----------------------------------+
   | ScaleZ                            | If used, the variables are scaled |
   |                                   | prior to the optimization problem |
   |                                   | is solved. The problem is solved  |
   |                                   | using a mathematical programming  |
   |                                   | solver. Usually, scaling the      |
   |                                   | variables may help the            |
   |                                   | optimization procedure.           |
   +-----------------------------------+-----------------------------------+
   | GAMS                              | GAMS is the software used to      |
   |                                   | solve the optimization problems.  |
   |                                   | The executable path is expected   |
   |                                   | or the user must declare GAMS.exe |
   |                                   | in the environment path.          |
   +-----------------------------------+-----------------------------------+
   | GAMS Solver                       | Solver to be used by GAMS to      |
   |                                   | solve the optimization problems.  |
   |                                   | Mixed integer quadratic           |
   |                                   | programming solver is expected    |
   |                                   | like BARON (other solvers can be  |
   |                                   | used).                            |
   +-----------------------------------+-----------------------------------+
   | MIPOPTCR                          | Relative convergence tolerance    |
   |                                   | for the optimization problems     |
   |                                   | solved in GAMS. The optimization  |
   |                                   | problem is solved when the optcr  |
   |                                   | is reached. 5 to 1 % is expected  |
   |                                   | (0.005 to 0.001).                 |
   +-----------------------------------+-----------------------------------+
   | MIPOPTCA                          | Absolute convergence tolerance    |
   |                                   | for mixed-integer optimization    |
   |                                   | problems. This must be a          |
   |                                   | nonnegative scalar.               |
   +-----------------------------------+-----------------------------------+
   | LINEARERROR                       | If true, a linear objective       |
   |                                   | function is used when solving the |
   |                                   | mixed integer optimization        |
   |                                   | problems; otherwise, a quadratic  |
   |                                   | objective function is used.       |
   +-----------------------------------+-----------------------------------+
   | CONREG                            | Specify whether constraint        |
   |                                   | regression is used or not, if     |
   |                                   | true bounds on output variables   |
   |                                   | are enforced.                     |
   +-----------------------------------+-----------------------------------+
   | CRNCUSTOM                         | If true, Custom constraints are   |
   |                                   | entered in the Variable tab.      |
   +-----------------------------------+-----------------------------------+
   | CRNINITIAL                        | Number of random bounding points  |
   |                                   | at which constraints are sampled  |
   |                                   | initially (must be a nonnegative  |
   |                                   | integer).                         |
   +-----------------------------------+-----------------------------------+
   | CRNMAXITER                        | Maximum allowed constrained       |
   |                                   | regressions iterations.           |
   |                                   | Constraints are enforced on       |
   |                                   | additional points during each     |
   |                                   | iteration (must be positive       |
   |                                   | integer).                         |
   +-----------------------------------+-----------------------------------+
   | CRNVIOL                           | Number of bounding points added   |
   |                                   | per round per bound in each       |
   |                                   | iteration (must be positive       |
   |                                   | integer).                         |
   +-----------------------------------+-----------------------------------+
   | CRNTRIALS                         | Number of random trial bounding   |
   |                                   | points per round of constrained   |
   |                                   | regression (must be a positive    |
   |                                   | integer).                         |
   +-----------------------------------+-----------------------------------+
   | CUSTOMBAS                         | A list of user-supplied custom    |
   |                                   | basis functions can be provided   |
   |                                   | by the user. The parser is not    |
   |                                   | case sensitive and allows for any |
   |                                   | Fortran functional expression in  |
   |                                   | terms of the XLABELS (symbol ^    |
   |                                   | may be used to denote power).     |
   +-----------------------------------+-----------------------------------+