FOQUS Installation and Running¶
This chapter covers how to install and run FOQUS as well as how to install other optional components for use within FOQUS.
Quick Start¶
For those familiar with the details, here is a summary of how to install and run FOQUS:
Download and install Anaconda.
In a terminal, to setup and install:
conda create --name ccsi-foqus python=3.7
conda activate ccsi-foqus
pip install ccsi-foqus
foqus --make-shortcut  # Create Desktop shortcut (Windows only)
In a terminal, to run:
conda activate ccsi-foqus
foqus
On Windows, double-click the Desktop shortcut made above.
For detailed explanations, see the following sub-sections.
Install Python¶
Python version 3.6 or 3.7 is required to run FOQUS.
We recommend using either the Miniconda or Anaconda Python distribution and package management system. The choice of Miniconda or Anaconda is up to the user: Miniconda is smaller and quicker to download, while Anaconda is larger but more self-contained. For Windows users, Anaconda is likely the better choice, as it also comes with the “Anaconda Prompt”, a command terminal already set up for working with Anaconda. The primary advantage of using Miniconda or Anaconda is the ability to isolate and customize a Python environment specifically for FOQUS without having to modify your existing system Python environment. Both allow an ordinary user to create self-contained Python environments without any need for administrator privileges. These separate environments can have different sets of packages, isolating version dependencies when working with multiple Python projects.
If you already have a working installation of Python 3.6 or 3.7 that you prefer over Anaconda, you can skip these steps.
Anaconda or Miniconda Install and Setup¶
Install Miniconda or Anaconda following the install instructions for your operating system.
Create a ccsi-foqus conda environment; this environment will be referred to as “ccsi-foqus” in the installation documentation, but you can use any name you like. If you would like to install multiple versions of FOQUS (for example, a stable version and the latest development version), you can do so by running the following command multiple times with a different environment name after the --name flag. In a terminal (or, on Windows, in the Anaconda Prompt) type:
conda create --name ccsi-foqus python=3.7
Then follow the prompts. This will create a new conda environment with a minimal set of packages. To use a different version of python, change the version specified after python= in the command.
Activate the environment on Linux in a terminal type:
conda activate ccsi-foqus
If you create an environment in which to install FOQUS, you will need to ensure that environment is active before installing FOQUS. On Windows, once FOQUS is installed a batch file is created that will activate the proper environment when running FOQUS. On Linux or Mac, you will need to activate the appropriate environment before running FOQUS.
Install FOQUS¶
Note
In previous releases we instructed you to download the FOQUS code and install it in place. As of version 1.5.0, this is no longer required. The pip install method below is now the preferred method to install FOQUS.
To install FOQUS, open the Anaconda prompt (or appropriate terminal or shell depending on operating system and choice of Python), and run the following commands:
conda activate ccsi-foqus
pip install ccsi-foqus
foqus --make-shortcut # Windows only
This will install FOQUS and all the required packages into the ccsi-foqus conda environment. The last command creates a Desktop shortcut for easier, non-terminal startup of FOQUS (Windows only, for now).
Install FOQUS Examples¶
Note
In previous releases the examples were packaged inside the main ccsi-foqus package. Since version 3.5.0, they are no longer part of the ccsi-foqus package, but are instead distributed as a separate archive. The steps described below are now the correct method to install the FOQUS examples.
To obtain the FOQUS examples, go to the “Releases” section of the FOQUS code repository on GitHub at https://github.com/CCSI-Toolset/FOQUS/releases, and locate the release of interest based on the version number.
Then, expand the “Assets” section.
The examples are packaged in a ZIP archive named ccsi-foqus-X.Y.Z-examples.zip, where X.Y.Z is the FOQUS version number.
Finally, download the archive containing the examples, and extract it to a directory of your choice on your system. Throughout the rest of this documentation, we will refer to this directory as the Examples directory.
Run FOQUS¶
The specific command to launch FOQUS depends on the operating system.
To launch FOQUS, open the Anaconda prompt (or appropriate terminal or shell depending on operating system and choice of Python), and run the following commands:
conda activate ccsi-foqus
foqus
Alternatively on Windows you can start FOQUS by double-clicking on the “ccsi-foqus” Desktop shortcut created when FOQUS was first installed. That shortcut can be recreated at any time by opening a terminal, as described above, and starting FOQUS with the “make shortcut” option:
foqus --make-shortcut
Note
The first time FOQUS is run, it will ask for a working directory location. This is the location where FOQUS will put any working files. This setting can be changed later.
Note
Files passed as command line arguments to FOQUS will be relative to where FOQUS is run. Once FOQUS starts, file paths will be relative to the FOQUS working directory.
Note
If, when running on a remote Linux server or virtual machine, you encounter an error when starting FOQUS similar to:
PyQt5 or Qt not available
or:
qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
Available platform plugins are: eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, webgl, xcb.
Try installing the libgl1-mesa-glx and/or libxkbcommon-x11-0 packages using the package manager appropriate for your Linux distribution (e.g., apt-get install on Ubuntu).
Install Optional Software¶
There are several optional pieces of software which are not written in Python and are not easily installed automatically. There are a couple of packages which most users will want to install. The first is PSUADE, which provides the FOQUS UQ functionality. The second is TurbineLite, which requires Windows and is used to interface with Excel, Aspen, and gPROMS software.
Other software listed below will enable additional features of FOQUS if available.
Install PSUADE (current version: 1.7.12)¶
PSUADE (Problem Solving environment for Uncertainty Analysis and Design Exploration) is a software toolkit containing a rich set of tools for performing uncertainty analysis, global sensitivity analysis, design optimization, model calibration, and more.
PSUADE install instructions are on the PSUADE GitHub site. For Windows users, there is an installer on the PSUADE releases page for convenience.
Install Turbine and SimSinter (Windows Only)¶
- Install Microsoft SQL Server Compact 4.0.
- Download and install the latest releases of SimSinter and TurbineLite.
- Install SimSinter first, then TurbineLite.
- After the install, the Turbine Web API Service will start automatically when Windows starts, but it will not start directly after the install. Do one of the following two things (only after install):
- Restart computer, or
- Start the “Turbine Web API service”:
- open Task Manager
- go to the “Services” tab
- click the “Services” button (in the lower right corner)
- right-click “Turbine Web API Service” from the list, and
- click “Start”
Install ALAMO¶
ALAMO (Automated Learning of Algebraic Models for Optimization) is a software toolkit that generates algebraic models of simulations, experiments, or other black-box systems. For more information, go to the ALAMO Home Page.
Download ALAMO and request a license from the ALAMO download page.
Install NLopt¶
NLopt is an optional optimization library which can be used by FOQUS. Unfortunately, the Python module is not available to be installed with pip. See the NLopt Installation Instructions, or install NLopt with conda as follows:
conda activate ccsi-foqus
conda install -c conda-forge nlopt
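As a quick, optional check (not part of the official instructions), you can confirm that the module is importable from the ccsi-foqus environment by running the following in a Python interpreter; the printed message is only illustrative:
import nlopt  # should succeed without error if the conda install worked
print("nlopt module imported successfully")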
Install SnobFit¶
SnobFit is an optional optimization library which can be used by FOQUS for unconstrained optimization. The Python package can be installed with pip:
conda activate ccsi-foqus
pip install SQSnobFit
The plugin has been developed for FOQUS versions 2.1 and greater. For further details on the available versions and installation, see the SQSnobFit PyPI package page.
Once the Python package is downloaded, navigate to the SQSnobFit folder (likely $CONDA_PREFIX/lib/python3.7/site-packages/SQSnobFit/) and modify the _snobfit.py file, making the following changes:
Comment out or remove the following code lines just below the def minimize(...) function definition:
if budget <= 0:
budget = 100000
Then replace:
return Result(fbest, xbest), objfunc.get_history()
with:
return (request,xbest,fbest)
in the def minimize() function.
Install R¶
R is a software toolbox for statistical computing and graphics. R version 3.1+ is required for the ACOSSO and BSS-ANOVA surrogate models and the Basic Data’s SolventFit model.
Follow instructions from the R website to download and install R.
Open R and type the following to install and load the prerequisite packages:
install.packages('quadprog')
library(quadprog)
install.packages('abind')
library(abind)
install.packages('MCMCpack')
library(MCMCpack)
install.packages('MASS')
library(MASS)
q()
The last command exits R. When asked to save workspace image, type “y”.
Open FOQUS, go to the “Settings” tab, and set the “RScript Path” to the proper location of the R executable.
Optional FOQUS Settings¶
Go to the FOQUS settings tab:
- Set ALAMO and PSUADE locations.
- Test TurbineLite config.
Introduction¶
The Framework for Optimization, Quantification of Uncertainty, and Surrogates (FOQUS) software provides a graphical interface and standard platform for several Carbon Capture Simulation Initiative (CCSI) tools. The primary feature of FOQUS is its ability to interact with commonly-used chemical engineering process modeling software. Models constructed using a variety of software can be combined into a larger composite model. CCSI tools SimSinter and the Turbine Science Gateway (TSG) provide connectivity to external process simulation software. SimSinter provides a standard library to enable interfacing with other software; TSG provides a simulation job queuing system that can be used on: (1) a single workstation, (2) networked workstations, (3) cluster, or (4) cloud computing resources.
In FOQUS, simulations can be connected in a meta-flowsheet, which enables parts of a process to be modeled using the most appropriate software and combines them into a single large model, possibly including recycle streams. For example, in studying a carbon capture system for a coal-fired power plant: a power plant may be modeled in Thermoflex; a solvent-based carbon capture system may be modeled in Aspen Plus; and a compression system may be modeled in gPROMS. To optimize the entire system, these models can be combined into a single large model. The resulting meta-flowsheet can be used for simulation-based optimization, uncertainty quantification (UQ), or generation of surrogate models.
This section provides a brief overview and motivating examples for different uses of FOQUS.
Simulation Based Optimization¶
Simulation-based optimization considers a process simulation to be a black box model, which is a model where the mathematical details are not known. In this case, models are evaluated using process simulation software; multiple models can be combined to form larger models. Due to the long run times and the limitations of the methods used, a limited set of optimization variables (usually less than 30) is considered. Simulation-based optimization has some advantages and disadvantages, compared to equation-based optimization methods. With simulation-based optimization, there is no need to provide simplified algebraic models, problem formulation is relatively simple, and a good solution can usually be obtained; however, a provably-global optimum cannot be found and it is impractical to deal with very large numbers of variables. Large numbers of variables may be found in superstructure and heat integration problems where the structure of a process is being optimized. Both simulation and equation-based optimization methods are used in CCSI.
Capture of CO2 from a pulverized coal-fired power plant involves several very different systems including: a boiler, steam cycle, flue gas desulfurization, carbon capture, and CO2 compression. It is convenient to separate many of these processes into smaller, more reliable simulations. The different processes may also be better simulated in different software packages. Although some process simulation software contains optimization features, there are several reasons these may not be practical for a large composite system. It may be hard to develop a large model of the entire system that reliably converges. Many optimization methods have a difficult time dealing with simulation errors, and many black box derivative free optimization solvers are better able to handle occasional simulation failures. It may not be practical to simulate the entire process accurately using a single tool. Derivatives are also difficult to estimate for many systems when models do not provide exact derivatives, making derivative-free methods a good option.
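To make the idea of simulation-based (black-box, derivative-free) optimization concrete, the following minimal Python sketch treats an ordinary function as if it were an expensive simulation and optimizes it with a derivative-free solver. It is a generic illustration using SciPy's Nelder-Mead method, not the optimization drivers used inside FOQUS, and the function and variable names are hypothetical:
import math
from scipy.optimize import minimize

def black_box(x):
    # Stand-in for an expensive process simulation returning an objective value
    # (e.g., cost of electricity) for design variables x[0] and x[1].
    return (x[0] - 1.0) ** 2 + (x[1] - 2.5) ** 2 + math.sin(x[0])

# Derivative-free search: no gradients of the "simulation" are required.
result = minimize(black_box, x0=[0.0, 0.0], method="Nelder-Mead")
print(result.x, result.fun)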
The motivating example used to demonstrate the optimization framework is fairly simple. The system consists of a series of bubbling fluidized bed (BFB) CO2 adsorbers and regenerators modeled in Aspen Custom Modeler (ACM). The details of the BFB system are described in the CCSI BFB model documentation. A cost analysis for a 650 MW power plant and capture system is presented in an Excel spreadsheet. The simulation and spreadsheet files are provided in the examples directory in the FOQUS installation directory (see the tutorial in Section ref{tutorial.sim.flowsheet} for more information). The spreadsheet contains capital cost as well as operating and maintenance cost estimates, which are used to estimate the cost of electricity.
In this example, the objective function is the cost of electricity; the decision variables are design and operating variables in the ACM model. The cost of electricity is minimized while maintaining a 90 percent CO2 capture rate. The BFB system model and the cost of electricity are contained in separate models connected in a FOQUS flowsheet, which enables the cost of electricity to be calculated in Excel, using data acquired from the ACM model. See Sections ref{tutorial.sim.flowsheet} and ref{sec.opt.tutorial} for more information about the optimization problem.
Uncertainty Quantification¶
The Uncertainty Quantification (UQ) module of FOQUS encompasses a rich selection of mathematical, statistical, and diagnostic tools for application users to perform UQ studies on their simulation models. The PSUADE tool provides most of the UQ functionality available in FOQUS (Tong 2011). The recommended systematic multi-step approach consists of the following steps:
- Define the objectives of the analysis (e.g., identify the most important sources of uncertainties).
- Specify a simulation model to be studied. Acquire the model input files and the executable that runs the simulation (i.e., an executable that uses the specified inputs and generates model outputs). Identify the outputs of interest, identify all relevant sources of uncertainties, and ensure that they can be used as input variables to the simulation model.
- Select some or all input parameters that have uncertainty attributed. Characterize the prior probability distribution of these selected parameters by specifying the upper/lower bounds. For non-uniform prior distributions (e.g., Gaussian), additional information (e.g., mean and standard deviation) is required to define the shape of the prior distribution. This prior distribution represents the user’s best initial guess about the selected parameters’ uncertainties.
- Identify, if available, relevant data from physical experiments that can be used for model parameter calibration. Model calibration is a process that uses the observational data to update the prior distribution, yielding a distribution consistent with the observations.
- Select a sample scheme and sample size. From this information, a set of input values are sampled from the prior distribution. The choice of sampling scheme (which affects how the samples populate the input space) depends on the UQ objective(s) specified in the first step.
- “Run” the input samples. Running the input samples is the process where each sampled input value is fed to the simulation executable (specified in Step 2) and the corresponding output value is returned.
- Analyze the results and make decisions on how to proceed.
Steps 1-4 are often done through expert knowledge elicitation and/or literature search. Steps 5-7 can be achieved through software provided in the FOQUS UQ module.
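As a purely conceptual illustration of Steps 5 and 6 (not the PSUADE/FOQUS API), the sketch below draws Monte Carlo samples from uniform prior bounds and evaluates a toy stand-in for the simulation executable; the bounds and the toy model are hypothetical:
import numpy as np

rng = np.random.default_rng(0)
lower = np.array([300.0, 1.0])   # hypothetical lower bounds for two uncertain inputs
upper = np.array([400.0, 5.0])   # hypothetical upper bounds
samples = rng.uniform(lower, upper, size=(100, 2))  # Step 5: sample the prior

def toy_simulation(x):
    # Step 6 stand-in: each sample would normally be fed to the simulation executable
    return x[0] * np.sqrt(x[1])

outputs = np.array([toy_simulation(s) for s in samples])
print(outputs.mean(), outputs.std())  # Step 7 would analyze these results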
The FOQUS UQ module provides a number of sampling and analysis methods, including:
- Parameter screening methods: computes the importance of input parameters to identify which are important (to be kept in subsequent analyses) and which to ignore (to be weeded out).
- Response surface (used interchangeably with ‘surrogate’) construction: approximates the relationship between the input samples and their outputs via a smooth mathematical function. This response surface or surrogate can then be used in place of the actual simulation model to speed up lengthy simulations (a conceptual sketch is given after this list).
- Response surface validation methods: evaluates how well a given response surface fits the data. This is important for choosing different response surfaces.
- Basic uncertainty analysis: propagates input uncertainty to output uncertainty.
- Sensitivity analysis methods: quantifies how much varying an input value can impact the resulting output value.
- Bayesian calibration: applies observational data to refine the estimate of input uncertainties.
- Visualization tools: views computed distributions and response surfaces.
- Diagnostics tools (mainly in the form of scatter plots): checks samples and model behaviors (e.g., outliers).
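The following sketch illustrates the response-surface idea from the list above in its simplest possible form: a least-squares polynomial fit to sampled input/output data that can then be evaluated cheaply in place of the simulation. It is a generic example, not one of the response-surface methods provided by PSUADE, and the data are synthetic:
import numpy as np

x = np.linspace(0.0, 1.0, 20)                 # hypothetical 1-D input samples
y = np.exp(x) + 0.01 * np.random.randn(20)    # hypothetical simulation outputs
coeffs = np.polyfit(x, y, deg=2)              # least-squares quadratic fit
surrogate = np.poly1d(coeffs)
print(surrogate(0.5))                         # cheap evaluation replacing a simulation run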
The adsorber 650.1 subsystem process model is used to demonstrate the UQ framework. The A650.1 process model was developed and is continuously refined by our Process Synthesis and Design Team. The model is based on their design and optimization of an initial full-scale design of a solid sorbent capture system for a net 650 MW (before capture) supercritical pulverized coal power plant. The A650.1 model describes a solid sorbent-based carbon capture system that uses the NETL-32D sorbent. NETL-32D is a mixture of polyethyleneamine (PEI) and aminosilanes impregnated into the mesoporous structure of a silica substrate. CO2 removal is achieved through chemical reactions between CO2 and the amine sites within the sorbent. The A650.1 model is implemented in Aspen Custom Modeler (ACM) and contains many components (e.g., adsorbers, regenerators, compressors, heat exchangers). For the UQ analyses, this manual focuses on the adsorber units, which are responsible for the adsorption of CO2 from the input flue gas.
In its original form, the A650.1 model is a deterministic simulation model, meaning that all the parameters (e.g., chemical reaction parameters, heat and mass transfer coefficients) are considered to have fixed values (most likely fixed to a mean value, or to a lower or upper bound for robustness). With the FOQUS UQ module, the model uncertainties can be addressed. Thus, UQ analysis of the A650.1 model helps to develop a robust design by addressing the following questions:
- How accurately does each subsystem model predict actual system performance (under uncertain operating conditions)?
- Which input parameters should be examined to improve prediction accuracy?
- What is each input parameter’s contribution to prediction uncertainty?
Optimization Under Uncertainty¶
The Optimization Under Uncertainty (OUU) module in FOQUS is an extension of simulation-based optimization by including the contribution of model parameter uncertainties in the objective function. OUU is useful when inclusion of uncertainties may significantly alter the optimal design configurations. A straightforward approach to include the effect of uncertainty is to replace the objective function with its statistical mean on an ensemble drawn from the probability distributions of the continuous uncertain parameters (other options are available in FOQUS). Alternatively, users can provide a set of ‘scenarios’, where each scenario is associated with a probability. The latter case is often called ‘scenario optimization.’ The FOQUS OUU accommodates both continuous and scenario-based uncertain parameters. OUU makes use of the flowsheet for evaluations of the objective function. Naturally, OUU requires more computational resources than deterministic optimization. However, the ensemble runs can be launched in parallel so ideally, the turnaround time remains about the same as that of deterministic optimization if high performance computing capability (such as the CCSI Turbine gateway) is used in conjunction with FOQUS.
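A minimal sketch of the mean-objective idea described above, assuming a toy model with one design variable and one uncertain parameter (this is not the FOQUS OUU driver; the model, distribution, and sample size are hypothetical):
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
theta_samples = rng.normal(loc=1.0, scale=0.1, size=50)  # ensemble of the uncertain parameter

def toy_model(d, theta):
    # Stand-in for one flowsheet evaluation with design variable d and uncertain theta
    return (d - theta) ** 2 + 0.1 * d

def mean_objective(d):
    # Objective handed to the optimizer: the statistical mean over the ensemble
    return np.mean([toy_model(d[0], t) for t in theta_samples])

result = minimize(mean_objective, x0=[0.0], method="Nelder-Mead")
print(result.x)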
Surrogate Models¶
Process simulations are often time consuming and occasionally fail to converge. For mathematical optimization, it is sometimes necessary to replace a simulation with a surrogate model, which is a simplified model that executes much faster. FOQUS contains tools for creating and quantifying the uncertainty associated with surrogate models.
ALAMO¶
While simulation-based optimization can often do a good job of providing optimal design and operating conditions for a predetermined flowsheet, it cannot provide an optimal flowsheet. To obtain a more optimal flowsheet, a mixed integer nonlinear program must be solved. These types of problems cannot generally be solved using simulation-based optimization. A solution is to generate relatively simple algebraic models that accurately represent the high fidelity models. FOQUS currently provides an interface for ALAMO (Cozad et al. 2014), which builds surrogate models that are well suited for superstructure optimization.
ACOSSO¶
The Adaptive Component Selection and Shrinkage Operator (ACOSSO) surface approximation was developed under the Smoothing Spline Analysis of Variance (SS-ANOVA) modeling framework (Storlie et al. 2011). As it is a smoothing type method, ACOSSO works best when the underlying function is somewhat smooth. For functions which are known to have sharp changes or peaks, etc., other methods may be more appropriate. Since it implicitly performs variable selection, ACOSSO can also work well when there are a large number of input variables. To facilitate the description of ACOSSO, the univariate smoothing spline is reviewed first. The ACOSSO procedure also allows for categorical inputs (Storlie et al. 2013).
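For reference, the standard textbook form of the univariate smoothing spline (given here as background, not quoted from the ACOSSO papers) estimates \(f\) from data \((x_i, y_i)\) by minimizing
\[
\frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - f(x_i)\bigr)^2 \;+\; \lambda \int \bigl(f''(x)\bigr)^2\,dx ,
\]
where \(\lambda \ge 0\) trades off fidelity to the data against smoothness of \(f\); a larger \(\lambda\) yields a smoother estimate.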
BSS-ANOVA¶
The Bayesian Smoothing Spline ANOVA (BSS-ANOVA) is essentially a Bayesian version of ACOSSO (Reich 2009). It is a Gaussian Process (GP) model with a non-conventional covariance function that borrows its form from SS-ANOVA. It tackles the high dimensionality (of inputs) on two fronts: (1) variable selection to eliminate uninformative variables from the model and (2) restricting the level of interactions involved among the variables in the model. This is done through a fully Bayesian approach which can also allow for categorical input variables with relative ease. Since it is closely related to ACOSSO, it generally works well in similar settings as ACOSSO. The BSS-ANOVA procedure also allows for categorical inputs (Storlie et al. 2013).
Flowsheets and Settings¶
This chapter provides general information about using FOQUS and constructing flowsheets. The FOQUS flowsheet provides the basis for other analysis tools.
Reference¶
Getting Started¶
Follow the installation instructions provided in the FOQUS Installation and Running chapter.
The first time FOQUS is started, the user is prompted to specify a working directory. The working directory preference is stored in %APPDATA%\.foqus.cfg on Windows (APPDATA is an environment variable). On Linux or OSX, the working directory is specified in $HOME/.foqus.cfg. Additionally, the user can override the working directory when starting FOQUS by using the --working_dir <working dir> or -w <working dir> command line option. Log files, user plugins, and files related to other FOQUS tools are stored in the working directory. The working directory can be changed at a later time from within FOQUS. A full list of FOQUS command line arguments is available using the -h or --help arguments.
Settings¶
The settings screen shows FOQUS settings that are related to the general FOQUS setup and are unlikely to change between sessions. The settings screen is accessible by clicking the Settings button at the top of the Home window. The FOQUS settings can be stored in two locations: (1) “%APPDATA%\.foqus.cfg” on Windows or “$HOME/.foqus.cfg” on Linux or OSX, or (2) “foqus.cfg” in the working directory.
The Settings screen displays settings grouped into tabs. Figure Settings, FOQUS Tab shows the Settings, FOQUS tab.
Settings, FOQUS Tab
Options in the Settings, FOQUS tab are described below.
- Save settings to working directory, when this checkbox is selected the settings file will be read from the specified working directory. This setting is useful when running multiple copies of FOQUS, to ensure the settings do not conflict. When starting additional copies of FOQUS, it is best to start them from the command line, giving each copy of FOQUS its own independent working directory via the working directory option. If FOQUS is started more than once from the Windows start menu, each copy will use the same working directory. Starting FOQUS multiple times with the same working directory may cause unusual behavior in FOQUS.
- Use DMF if available, when checkbox is selected the Data Management Framework (DMF) module will be loaded and the DMF options will be shown in the Session menu.
- Automatically create backup session file, when checkbox is selected each time a FOQUS session is saved it will be saved twice. A backup copy with a universally unique identifier appended to the file name will be saved. This will allow the user to load any previous save point of the session.
- Smaller session files, when checkbox is selected significant storage space is saved by excluding formatting from the session file; this makes the session files less human readable. A more readable session file can be useful for debugging.
- FOQUS Flowsheet Run Method enables the user to select between running simulations on the same computer as FOQUS, or on a remote Turbine gateway. Running simulations remotely allows parallel execution. The default setting is “Local”. If the user switches from “Local” to “Remote”, a warning message will appear. The user will be informed that the models that have been uploaded to the Local Turbine may not be available in the Remote Turbine Gateway. Therefore, the user may need to upload these models into Turbine again in order to run the models remotely.
- Working Directory is the path to the FOQUS working directory. The Working Directory is where FOQUS reads and writes files needed to function. When running multiple copies of FOQUS, the Working Directory can also be specified from the command line using the “-w” or “--working_dir” options. After changing the Working Directory, FOQUS should be restarted.
- PSUADE EXE is the path to the PSUADE executable. PSUADE provides FOQUS’s UQ features.
- SimSinter Home is the path to the SimSinter interface for creating Sinter configuration files for simulations to be run with FOQUS. This setting is not required but it allows easy access to the SimSinter configuration GUI when uploading simulation to Turbine.
- iREVEAL Home is the path to the iREVEAL installation. This is required to use the iREVEAL surrogate model module.
- ALAMO EXE is the path to the ALAMO executable. This is required to use the ALAMO surrogate model module.
- RScript Path is the path to the RScript executable. This is required for surrogate model modules that use R as a platform.
- Java Home is the path to the Java installation. The DMF and the iREVEAL surrogate modules require Java.
- Revert Changes The settings changes are applied when the user navigates away from the settings screen. To undo changes made to settings the revert button can be clicked before changing to another screen.
The Turbine tab contains settings for configuring the local and remote instance of Turbine. Figure Settings, Turbine Tab shows the FOQUS Turbine settings.
Settings, Turbine Tab
The first section in the Turbine tab is TurbineLite (Local). This section contains settings related to the local installation of Turbine, and is only applicable when running FOQUS on the Windows platform.
- Test tests the connection to the local Turbine server to make sure it is configured and running properly.
- Start Service starts the Turbine server service on Windows. The user must have permission to start services to use this button.
- Stop Service stops the Turbine server service on Windows. The user must have permission to stop services to use this button.
- Change Port can reconfigure the local Turbine server service on Windows to use a different port. This may be necessary if Turbine conflicts with another service.
- Aspen Version, Aspen 7.3 is still in common use but its API differs slightly from newer versions. This option allows FOQUS to be used with Aspen 7.3.
- TurbineLite Home is the location of the TurbineLite installation. For local simulation runs FOQUS needs to know where TurbineLite is installed so it can launch Turbine consumers to run simulations. This setting is not needed if simulations are only run remotely.
- Turbine Configuration (local) is the path to the TurbineLite gateway configuration file for running simulations locally. If simulations are only run remotely, this setting is not needed. New/Edit displays a form to create or edit a Turbine configuration file. Having a setting for both local and remote Turbine allows easy switching between run methods.
The second section in the Turbine tab is Turbine Gateway (Remote). This section contains settings related to a remote instance of Turbine.
- Test tests the connection to the remote Turbine server to make sure it is configured and running properly.
- Turbine Configuration (remote), is the path to the Turbine gateway configuration file for running simulations remotely. If simulations are only run locally, this setting is not needed. New/Edit displays a form to create or edit a Turbine configuration file. Having a setting for both local and remote Turbine allows easy switching between run methods.
- Check Interval (sec) is the number of seconds between checking the remote Turbine server for job results. This number should not be set too low to avoid overwhelming the Turbine server with requests.
- Number of Times to Resubmit Failed Jobs is the number of times to resubmit jobs that fail. Jobs occasionally fail due to software bugs. This allows a job to be retried.
The Logging tab contains settings related to the FOQUS log files, which provide debugging information. The FOQUS log files are stored in the logs directory in the working directory. Figure Settings, Logging Tab shows the FOQUS log settings. There are two log files: (1) FOQUS and (2) Turbine Client.
Settings, Logging Tab
- The level sliders indicate how much information to send to the logs.
- The Log Files section enables the user to specify where the log information is sent. The File Out checkboxes turn on or off the file output of logs. The Std. Out checkboxes enable or disable the output to the screen.
- Format allows the format of the log messages to be changed. See the documentation for the Python logging module for more information (a short example follows this list).
- Rotate Log Files turns on or off log file rotation. When a log file reaches a certain size, a new log file is started and the contents of the old log are moved to a new file. There currently seems to be a bug in the log file rotation which occasionally makes the log file output stop; therefore, the Rotate Log Files option is labeled as an experimental feature.
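As an example of the Python logging format strings mentioned above (the exact pattern used by FOQUS may differ), the following snippet configures a logger with a typical format:
import logging

logging.basicConfig(
    format="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
    level=logging.DEBUG,
)
logging.getLogger("example").info("a message formatted with the pattern above")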
Flowsheet¶
The meta-flowsheet defines connections between simulations. The flowsheet defines the order that simulations are performed and what data is transferred between them. Simulations are represented as nodes in the flowsheet. These simulations may be links to external simulation software through the Turbine gateway, or custom simulations or simulation wrappers written in Python. Directed edges in the flowsheet connect nodes. The edges also specify which variables in the simulations are equivalent.
If the flowsheet contains cycles, they are solved iteratively. Tear streams are selected by FOQUS based on two criteria: (1) minimize the maximum number of times any cycle is torn and (2) minimize the total number of tear edges (which is only considered when two tear sets have the same value for the first criterion).
FOQUS currently has two methods available for solving flowsheets with recycle: (1) direct substitution and (2) Wegstein (Wegstein 1958). FOQUS will solve strongly connected components in the order they are encountered in the flowsheet. FOQUS flowsheets are generally not very complicated, so if a strongly connected component contains more than one tear stream, the tears are solved simultaneously. More advanced solution options will be added if a need arises. Figure Flowsheet Recycle shows how a simple flowsheet with recycle would be solved. A conceptual sketch of the two solution methods follows the figure.
Flowsheet Recycle
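The sketch below illustrates, for a single scalar tear variable, how direct substitution and Wegstein acceleration solve the fixed-point problem x = g(x); it is a conceptual example, not the FOQUS tear solver, and g is a hypothetical recycle relationship:
def solve_tear(g, x0, tol=1e-8, max_iter=100, accelerate=True):
    """Solve x = g(x) starting from x0 by direct substitution or Wegstein."""
    x_prev, g_prev = x0, g(x0)
    x = g_prev  # the first step is always plain direct substitution
    for _ in range(max_iter):
        gx = g(x)
        if abs(gx - x) < tol:
            return x
        x_new = gx  # direct substitution step
        if accelerate and x != x_prev:
            s = (gx - g_prev) / (x - x_prev)  # secant slope of g
            if s != 1.0:
                q = s / (s - 1.0)             # Wegstein acceleration factor
                x_new = q * x + (1.0 - q) * gx
        x_prev, g_prev, x = x, gx, x_new
    return x

# Hypothetical recycle relationship g(x) = 0.5*x + 2, with fixed point x = 4
print(solve_tear(lambda x: 0.5 * x + 2.0, x0=0.0))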
Flowsheet Editor¶
Figure Flowsheet Editor illustrates the main Flowsheet Editor screen and a description of the pieces. The toolbar on the left contains various flowsheet tools.
Flowsheet Editor
The first three buttons are mouse mode buttons. The current mouse mode is shown by the depressed button. The remaining buttons on the toolbar perform an action. The flowsheet editing toolbar and flowsheet are described in detail below.
- Selection mode enables the user to select nodes and edges. Multiple items may be selected by holding down the Shift key. To deselect everything, click an empty area of the flowsheet while not holding the Shift key. Selected items can be moved by dragging them. To move multiple items, hold down the Shift key while dragging. The last item selected becomes the current object to be edited in the Node or Edge Editor.
- Add node mode enables the user to add a node by clicking anywhere on the flowsheet. Once a location is clicked, a dialog box opens where the new node name can be entered. If Cancel is selected, no node is added. The new node name cannot be “graph” and cannot match any existing node name.
- Add edge mode enables edges to be added by selecting the node that the edge originates from, followed by the node the edge terminates at.
- Center flowsheet in display centers the display on the flowsheet.
- Delete selected deletes all selected nodes and edges. If a node is deleted, all edges connecting to that node are also deleted.
- Run a simulation starts a single simulation run. This is primarily used to test a simulation before running optimization or UQ.
- Stop a simulation is enabled when a simulation is running and stops any running simulation. The simulation may take several seconds to stop.
- Set inputs to defaults returns all of the inputs to their default values.
- Determine tear edges makes it easier to see where initial guesses are needed and makes it possible to edit the tear set before running the flowsheet. If tear streams are needed but not specified before running a flowsheet, they will be specified automatically; however, the inputs that will be used as the initial guess will not be known before running.
- Flowsheet solver settings contains options related to tear solvers.
- Toggle node editor display displays or hides the Node Editor. The user can change the node being edited by selecting from Name in the Node Editor or selecting it on the flowsheet in selection mode.
- Toggle edge editor display displays or hides the edge editor. The user can change the edge being edited in the Edge Editor, or by selecting it in selection mode.
- Show results from all flowsheet runs displays the results of all flowsheet runs in a table view. This can be exported to a spreadsheet.
- Node represents a simulation or calculations.
- Edge connects simulation data, represents data transfer between two nodes.
Node Editor¶
The Node Editor enables the assignment of simulations to a node, and editing variables. Figure Node Editor Window shows the Node Editor window with the input variables section of the toolbox displayed.
Node Editor Window
- Apply immediately applies any changes made in the Node Editor. This is not usually needed; changes are applied when the current node is changed, when the Node Editor is closed, or when some other action is taken that requires the flowsheet, such as running it.
- Revert sets the node back to the version where the changes were last applied. This is usually the original state of the node when the editor was opened.
- Run can be used to run the simulation represented by this node only. This can be used for testing to make sure the node is properly configured without running the whole flowsheet.
- Stop Run is active when a simulation is currently running. It stops a single node run or a flowsheet run.
- There are three tabs in the Node Editor: (1) the Variables tab, shown in Figure Node Editor Window, (2) the Position tab, which displays the coordinates of the node, and (3) the Node Script tab, which enables the entry of Python code to be executed after the simulation is run.
- Name displays the name of the node currently being edited. The current node can be changed by selecting from existing nodes in the drop-down menu.
- Code displays the error status code for the node.
- Message displays a more detailed description of the error status of the node.
- Type enables the user to select the type of model to run. The model types are none, Turbine, DMF Lite, DMF Server, or Python Plugin. None allows no model to be assigned to the node; this is useful when the node only executes a script entered directly into FOQUS. Turbine is used to execute Aspen, gPROMS, or Excel simulations. Python plugins are custom simulations or wrappers written by the user. Surrogate model methods may also produce Python plugin models.
- Model enables selection of the models available on Turbine or loaded Python plugins.
- Input Variables enables viewing and editing the node’s input
variables. Most of these variables are added automatically when a
simulation is selected.
- Add variable enables the addition of an input variable. There are two reasons to add an input: (1) to use a variable to pass information to another simulation (even if the variable is not used in any node calculation, it can receive data from the previous simulation and be passed on to the next simulation) and (2) to use in a node script. For example, a variable could be added that provides output in different units of measure.
- Remove variable removes variables. If an input variable is removed that originally came from a Turbine simulation, the simulation will run with the default value.
- Tags displays a tag browser that lists commonly used variable tags.
- Input Variables table displays information about variables. Most attributes can be edited, except for the Name column within the Input Variables table. The rows for input variables are color coded depending on whether they are set by an edge from results in another node. White rows are not connected. Yellow rows are set by a tear edge. These variables serve as initial guesses but their value may change once the simulation has run. Red rows are set by an edge that is not a tear edge. The value set for these inputs does not matter and it may change once the simulation has run.
- Output Variables is a variable table similar to the Input Variables table for node output variables. This area is displayed by clicking Output Variables.
- Settings displays simulation settings. A description is provided for each setting. The available settings vary depending on simulation.
Node Variables¶
Variables in the node editor are grouped into two sections, inputs and outputs. The input and output variable tables are accessible as described in the previous section. The contents of the variable tables are described here.
The columns in the input variable list are:
- Name is the name of the variable,
- Value is the current value,
- Unit is the unit of measure,
- Type is the data type (float, int, or str),
- Default is the default value,
- Min is the minimum value,
- Max is the maximum value,
- Description is a description string,
- Tag is a list of strings that can be used to attach additional information to a variable
- Distribution is a distribution type,
- Param1 is the first parameter of a parametric distribution; the exact meaning depends on the selected distribution, and
- Param2 is the second parameter of a parametric distribution; the exact meaning depends on the selected distribution.
The minimum and maximum values are not enforced when running simulations; a value can be given outside the range. The Optimization and UQ features make use of these values to set upper and lower bounds on decision variables or sampling. The distribution information is used when setting up sampling for UQ. In the future, this may also be used for things like optimization under uncertainty. Integer and string type variables cannot currently be used as optimization decision variables, or sampled with the UQ tool.
The rows of the input variable table are color coded. Some of the input variables may be set by connections to other nodes. White rows are variables whose values are not set by a connection. The variables that are red have values set by a connection; the value given will be overwritten and does not matter. The values that are colored yellow are inputs set by a connection that is a tear stream. The values of these variables serve as an initial guess for solving recycles.
The output variable table is similar to the input table, however it only contains the columns: Name, Value, Unit, Type, Description, and Tags. The value of the outputs may not correspond to the inputs until the simulation has been run.
Node Script¶
There are three types of Node Script that can be used: (1) Pre runs before a node simulation, (2) Post runs after a node simulation, and (3) Total scripts how the node runs the simulation.
Figure Node Script Tab illustrates the Node Script tab of the Node Editor with calculations for an optimization test problem.
Node Script Tab
Node scripts can be any valid Python code. The input and output variables for node scripts are stored in dictionaries x and f. The dictionary keys are the variable names. The f dictionary is used to update the node variables after the calculations are executed.
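For example, a minimal Post node script might look like the following; the variable names temp_C, flow_in, temp_K, and flow_out are hypothetical and would need to exist in the node's input and output variable tables:
# Unit conversion on an input, written to an output variable
f['temp_K'] = x['temp_C'] + 273.15
# Simple calculation combining an input with a constant
f['flow_out'] = 0.95 * x['flow_in']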
Edge Editor¶
The Edge Editor is illustrated in Figure Edge Editor. The Edge Editor can be used to set connections between node variables.
Edge Editor
- Index is the index of the current edge. The current edge can be changed by selecting an index from the drop-down menu, but since the index is not a very meaningful identifier it is usually more convenient to select the edge to edit with the graphical selection tool.
- Origin Node is the node where an edge starts. This may be edited by selecting a different node from the drop-down menu.
- Destination Node is the node to which the edge goes.
- Curve can be a positive or negative number. The greater the magnitude of number, the more curved an edge will appear in the flowsheet. This setting is used to keep edges from overlapping in the flowsheet display.
- Tear marks this edge as a tear. Before a simulation is run, if a valid tear set is not specified, FOQUS locates one.
- Active specifies whether the edge is active or not. This allows connections to be temporarily disabled.
- Variable Connections table displays which variables are connected. Inputs or outputs in the origin node can be connected to inputs in the destination node.
- Add connection adds a new connection.
- Remove connection deletes the selected connections.
- Auto automatically connects variables having the same name. For example, in connecting a simulation to a spreadsheet to calculate costs there are a large number of variables for which it makes sense that the variables have the same name in the simulation and spreadsheet. Auto should be used with great care. Connecting variables with the same name is often not what is wanted. For example two simulations may have a variable named FlowAIn; however, it is very unlikely that they should be connected. It is more likely FlowAOut should be connected to FlowAIn.
Sample Results¶
Flowsheet evaluations that have been run in a FOQUS session can be viewed by clicking the table button in the flowsheet toolbar (#13 in Figure Flowsheet Editor). The results are displayed in a table, and the contents can be copied and pasted into a spreadsheet or exported to a CSV file. Figure Flowsheet Results Table Window shows the Flowsheet Results Table window.
Flowsheet Results Table Window
- Menu contains a menu with four sub menus.
- Import data from files or the clipboard.
- Export data to files or the clipboard.
- Edit or delete data.
- View options for the table.
- The Current Filter drop-down list enables the user to select a data filter, which can be used to filter and sort data.
- Edit Filters enables the user to create or edit data filters.
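Once the results have been exported to a CSV file as described above, they can be post-processed outside FOQUS; a brief sketch using pandas is shown below (the file name flowsheet_results.csv is hypothetical and should match the name chosen when exporting):
import pandas as pd

results = pd.read_csv("flowsheet_results.csv")
print(results.columns.tolist())  # inspect the available input/output columns
print(results.describe())        # quick summary statistics over the stored runs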
Error Codes¶
Error codes are listed in the Flowsheet Results table for the whole flowsheet and for individual nodes. Table Flowsheet Error Codes shows the flowsheet error codes and Table Node Error Codes shows the node error codes. The most common flowsheet error is 1, a node calculation failed. The most common node error is 7, Turbine simulation error. These errors are typically caused by a simulation that fails to converge or has some other calculation error (e.g., ACM does not converge or an Excel spreadsheet simulation with a division by 0 error).
Flowsheet Error Codes

| Code | Meaning |
|---|---|
| -1 | Did not run or finish |
| 0 | Success |
| 1 | A simulation/node failed to solve |
| 2 | A simulation/node failed to solve while solving tears |
| 3 | Failed to create a worker node |
| 5 | Unknown tear solver |
| 11 | Wegstein failed, reached iteration limit |
| 12 | Direct failed, reached iteration limit |
| 16 | Presolve node error |
| 17 | Postsolve node error |
| 19 | Unhandled exception during evaluation (see log) |
| 20 | Flowsheet thread terminated |
| 21 | Missing session name |
| 40 | Error connecting to Turbine |
| 50 | Error loading session or inputs |
| 100 | Single node calculation success |
| 201 | Cycle in determining calculation order (invalid tear set) |
Node Error Codes

| Code | Meaning |
|---|---|
| -1 | Did not run or finish |
| 0 | Success |
| 1 | Simulation error (see log) |
| 3 | Exceeded maximum wait time |
| 4 | Failed to create Turbine session ID |
| 5 | Failed to add Turbine job |
| 6 | Exceeded maximum run time |
| 7 | Turbine simulation error |
| 8 | Failed to start Turbine job |
| 10 | Failed to get Turbine jobs status |
| 11 | Flowsheet thread terminated |
| 20 | Error in node script |
| 23 | Could not convert Numpy value to list |
| 27 | Cannot read variable result (see log) |
Tutorial¶
Tutorial 1: Creating a Flowsheet¶
The Basics¶
This tutorial provides information about the basic use of FOQUS and setting up a very simple flowsheet. A single node flowsheet will be created that performs a simple calculation using a square root so that simulation errors can be observed when a negative input value is provided.
This tutorial will show the user the procedure for creating a flowsheet in FOQUS. However, if the user is interested, the finished flowsheet is available in: examples/tutorial_files/Flowsheets/Tutorial_1
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
- Start FOQUS (see Section Getting Started).
- In the session form enter the Session Name as “Simple_Flow” (Figure Setting the Session Name).
Setting the Session Name
- Set the session description.
- Select the Description tab (Figure Setting the Session Description).
- Type the description shown in Figure Setting the Session Description. The buttons above the Description tab box can be used to format the text.
Setting the Session Description
- Click the Flowsheet button at the top of the Home window (Figure Flowsheet, Input Variables).
- Add a node named “calc.”
- Click the Add Node button in the toolbar on the left side of the Home window.
- Click a location on the gridded flowsheet area.
- Enter the node name “calc” in the dialog box.
- Click the Select Mode button in the toolbar.
- Open the Node Editor by clicking the Node Editor button in the toolbar.
- Add input variables to the node. (When linking a node to an external
simulation the input and output variables are populated
automatically, and this step is not necessary.)
- Click + above the Input Variables table.
- Enter x1 in the variable Name dialog box.
- Click + above the Input Variables table.
- Enter x2 in the variable Name dialog box.
- Enter -2 and 2 for the Min and Max of x1 in the Input Variables table.
- Enter -1 and 4 for the Min and Max of x2 in the Input Variables table.
- Enter 1 for the value of x1.
- Enter 4 for the value of x2.
Flowsheet, Input Variables
- Add an output variable to the node. (When linking a node to an
external simulation the input and output variables are populated
automatically.)
- Click Output Variables to show the Output Variables table (Figure Flowsheet, Output Variables).
- Click + above the Output Variables table to add a variable.
- Enter z in the output Name dialog box.
Flowsheet, Output Variables
In this example, the node is not linked to any external simulation. FOQUS nodes contain a section called the node script, which can be used to do calculations before, after, or instead of a simulation linked to the node. The node script can be used for things such as unit conversion, simple calculations, or simulation convergence procedures. Node scripts are written in Python. The Input Variables are contained in a dictionary named x and the Output Variables are contained in a dictionary named f. The dictionary keys are the variable names shown in the input and output tables. Only Output Variables can be modified by a node script.
Add a calculation to the node.
Click the Node Script tab (Figure Node Calculation).
- Enter the following code into the Python code box:
f['z'] = x['x1']*math.sqrt(x['x2'])
Click the Variables tab.
Click the Run button (Figure Node Calculation).
The flowsheet should run successfully and the output value should be 2. Rerun the flowsheet with a negative value for x2, and observe the result. The simulation should report an error, because taking the square root of a negative number raises an error in the node script.
Node Calculation
- Save the FOQUS session.
- Click the Session drop-down menu at the top of the Home window (Figure Save Session).
- Click Save. The exact location of save in the menu depends on whether or not the data management framework is enabled.
- The Change Log entry can be left blank.
- The default file name is the session name. Change the file name and location if desired.
Save Session
Automatically running FOQUS for a set of user-defined input conditions¶
This procedure requires the Uncertainty Tab.
Therefore, the instructions for this procedure can be found in the documentation under:
Uncertainty Quantification / Tutorial / Simulation Ensemble Creation and Execution / Automatically running FOQUS for a set of user-defined input conditions
The link for these instructions is shown below:
https://foqus.readthedocs.io/en/latest/chapt_uq/tutorial/sim.html
Tutorial 2: Creating a Flowsheet with Linked Simulations¶
This tutorial is referenced by other tutorials. Save the flowsheet in a convenient location for future use.
This tutorial demonstrates how to link simulations to nodes, and how to connect nodes in a flowsheet. Two models are used: (1) a bubbling fluidized bed model in ACM and (2) a cost of electricity (COE) model in Excel. The COE model estimates the cost of electricity for a 650 MW (net before adding capture) supercritical pulverized coal power plant with solid sorbent post combustion CO\(_2\) capture process added.
The files for this tutorial are located in: examples/test_files/Optimization/Model_Files/.
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
- Start FOQUS. The Session window displays (Figure Session Setup).
- Enter “BFB_opt” in Session Name (without quotes).
- Click the Description tab. The problem description box displays and is shown in (Figure Session Description).
- In the problem description box enter information about the problem being solved in the FOQUS session; this information can be more extensive than what is shown in the example.
- Save the session file. Click Save Session from the Session drop-down menu. Enter change log information and a file name when prompted. The Creation Time in the metadata page will be the time the session is first saved. The Modification Time will be the last time the session was saved. The ID is a unique identifier that changes each time the user saves the simulation. The Change Log tab provides a record of the changes made each time the session is saved.
Session Setup
Session Description
There are two models needed for this optimization problem: (1) the ACM model for the BFB capture system and (2) the Excel cost estimating spreadsheet. These models are provided in the examples/test_files/Optimization/Model_Files/ directory. There are two SimSinter configuration files: (1) BFB_sinter_config_v6.2.json for the process model and (2) BFB_cost_v6.2.3.json for the cost model. The next step is to upload the models to Turbine.
Open the Add/Update Model to Turbine dialog box (Figure Open Upload to Turbine Dialog).
In this case, the SimSinter configuration files have already been created. If a SimSinter configuration file needs to be created for the simulation, Create/Edit displays the SimSinter configuration GUI (see Figure Upload to Turbine Dialog). See the SimSinter documentation or Chapter Simulation Standard Interface (SimSinter) for more information.
Click Browse to select a SimSinter configuration file (Figure Upload to Turbine Dialog). Once the SimSinter configuration file is selected, the simulation file and sinterconfig file are automatically added to the files to upload. The application type is entered automatically. If there are additional files required for the simulation, those files can be added by clicking Add File.
Enter the simulation name in Simulation Name. This name is determined by the user, but will default to the SimSinter configuration file name. For this tutorial use BFB_v6_2.
Click OK to upload the simulation.
- Repeat the upload process for the cost model. Name the model BFB_v6_2_Cost.
Upload to Turbine Dialog
The next step is to create the flowsheet. Figure Flowsheet Editor illustrates the steps to draw the flowsheet.
- Click Flowsheet at the top of the Home window.
- Click Add Node mode.
- Add two nodes to the flowsheet. Name the first node “BFB” and the second node “cost”.
- Click Add Edge mode.
- Click the BFB node followed by the cost node.
- Click Selection mode and select the BFB node.
- Click Toggle Node Editor. The Node Editor displays as illustrated in Figure Node Editor.
Flowsheet Editor
Each node must be assigned the appropriate simulation. Use the Node Editor to set the simulation type and the simulation name from the simulations uploaded to Turbine. The Node Editor is illustrated in Figure Node Editor.
- Under Model and Type, set the simulation Type to Turbine. This indicates that the simulation is to be run with Turbine.
- Under Model, set the simulation of the BFB node to BFB_v6_2.
- The Variables and Settings are automatically populated from the SimSinter configuration file. Variable values, Min/Max, and descriptions can be changed; however, for this problem, the values taken from the SimSinter configuration should not be changed.
- Repeat the process for the cost node, assigning it the BFB_v6_2_Cost simulation.
Node Editor
The connections between variables in the BFB simulation and the cost estimation spreadsheet must be set, so that required information can be transferred from the BFB simulation to the cost simulation.
- Click Toggle Node Editor to hide the Node Editor (Figure Flowsheet Editor).
- Select the edge on the flowsheet with the Selection tool.
- Click Toggle Edge Editor to show the Edge Editor. The Edge Editor is shown in Figure Edge Editor.
- For convenience, all of the variables that should be connected from the ACM model to the Excel spreadsheet have been given the same names in their SimSinter configuration files. To connect the variables, click Auto in the Edge Editor. Auto connects variables of the same name. Since connecting by name alone is often not what is intended, the Auto button should be used carefully. There should be 46 connected variables.
Edge Editor
The flowsheet should now be ready to run. Test the flowsheet by executing a single evaluation before setting up the optimization problem.
- Click Run in the Flowsheet Editor (Figure Flowsheet Editor).
- The flowsheet may take a few minutes to run. The BFB simulation takes a significant amount of time to open in ACM. While running optimization, the evaluations take less time because the simulation remains open. The simulation should complete successfully. A message box displays when the simulation is done. The status bar also indicates that the simulation is running.
- While the simulation is running, Stop is enabled.
- Once the simulation runs successfully, Save the FOQUS session again, and keep it for use in later tutorials.
Tutorial 3: Flowsheets with Recycle¶
This section provides a tutorial on working with flowsheets containing recycle. Sections Tutorial 1: Creating a Flowsheet and Tutorial 2: Creating a Flowsheet with Linked Simulations provide tutorials for creating flowsheets; in this section, a pre-constructed flowsheet is used.
The file for this tutorial is Mass_Bal_Test_02.foqus, and this file is located in examples/tutorial_files/Flowsheets/Tutorial_3.
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
- Open FOQUS.
Open the Mass_Bal_Test_02.foqus file.
- Open the Session drop-down menu on the right side of the Session button (Figure Flowsheet with Recycle).
- Select Open Session from the drop-down menu.
- Locate Mass_Bal_Test_02.foqus in the file browser, and open it.
Click Flowsheet button from the toolbar at the top of the Home window.
The flowsheet is shown in Figure Flowsheet with Recycle. The flowsheet consists of two reactors in recycle loops. The flowsheet contains mixers, reactors, separators, and splitters. Each node uses a set of simple calculations in the node script section. The tear edges are shown in light blue.
Flowsheet with Recycle
- Inspect a node.
- Make sure the Selection tool is selected (Figure React_01 Node).
- Open the Node Editor by clicking the Node Edit button in the left toolbar in the Flowsheet view.
- Click the “React_01” node.
- Click the Input Variables table. Note: Some input rows are colored red. This denotes that their values are set by the output of the previous flowsheet node, through the edge connecting “Mix_01” to “React_01.”
- Click the Node Script tab.
- Note the equations. Input Variables are stored in the x dictionary and Output Variables are stored in the f dictionary (a minimal example of this convention is given after the figures below).
- Click the gear icon in the left toolbar (see Figure React_01 Node). The tear solver settings are shown in Figure Tear Solver Settings.
React_01 Node
Tear Solver Settings
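The node scripts in this flowsheet follow the convention noted above: inputs are read from the x dictionary and outputs are written to the f dictionary. Below is a minimal sketch of such a script; the variable names (“flow_in”, “split_frac”, and “flow_out”) are hypothetical and not taken from this example.

f['flow_out'] = x['flow_in'] * (1.0 - x['split_frac'])  # read inputs from x, write the result to an output in f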
- Remove the tear edges.
- Close the Node Editor.
- Open the Edge Editor. Click the Edge Editor icon in the left toolbar (see Figure Edge Edit).
- Click the edge between “React_01” and “Sep_01.”
- In the Edge Editor, clear the Tear checkbox.
- Repeat for the other tear edge.
- Close the Edge Editor.
Edge Edit
There should now be no tear edges in the flowsheet. The user can select tear edges or FOQUS can automatically select a set. If there is not a valid set of tear edges marked when a flowsheet is run, tear edges will automatically be selected.
- Automatically select a tear edge set by clicking the Tear icon in the left toolbar (see Figure Edge Edit).
- Open the Node Editor and look at node “Sep_01.” In the Input Variables table, notice that some of the input lines are colored yellow. The yellow inputs serve as initial guesses for the tear solver. The final value will be different from the initial value.
- Click the Run button on the left toolbar. The flowsheet should solve quickly.
- The results of the completed run are in the flowsheet. An entry will also be created in the Flowsheet Results data table (see Section Tutorial 4: Flowsheet Result Data).
Tutorial 4: Flowsheet Result Data¶
Flowsheet evaluation results are stored in a table in the FOQUS session. This data can be used for many purposes. The flowsheet evaluations may be single runs, part of an optimization problem, or part of a UQ ensemble. This tutorial provides information about sorting, filtering, and exporting data.
The FOQUS file for this tutorial is Simple_flow.foqus, and this file is located in: examples/tutorial_files/Flowsheets/Tutorial_4
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
The Simple_flow.foqus file is similar to the one created in the tutorial Section Tutorial 2: Creating a Flowsheet with Linked Simulations, but it has been run an additional 100 times using a UQ ensemble (see Tutorial 1: Simulation Ensemble Creation and Execution).
- Open FOQUS.
- Open the Simple_flow.foqus session from the example files.
- Click the Flowsheet button from the Home window.
- Click Flowsheet Data in the toolbar on the left side of the Home window.
Flowsheet Results Data Table, All Data
A data table should be displayed like the one shown in the figure below. There are 102 flowsheet evaluations. The first two evaluations are single runs, as can be seen in the SetName column, and the remaining 100 evaluations are from a UQ ensemble. The Error column shows that several of the evaluations resulted in an error from a negative number being passed to the square root function.
This tutorial is broken up into mini-tutorials in the remaining subsections, which can be done independently. They each use the example data file described above.
Sorting Data¶
- Open FOQUS.
- Open the Simple_flow.foqus session from the example files.
- Click Flowsheet in the main toolbar at the top of the FOQUS Home window.
- Click Flowsheet Data in the toolbar on the left side of the Home Window.
- Click Edit Filters.
- Click New Filter.
- Enter “Sort1” as the new filter name.
- Click New Filter.
- Enter “Sort2” as the new filter name.
- Select “Sort1” from the Filter drop-down list.
- Enter ["-result"] as the Sort by Column entry. Include the square brackets. The square brackets indicate that there is a list of sort terms, although in this case there is only one. If multiple sort terms are given, the additional terms are used to sort results that have the same value for the previous terms. The “-” in front of result indicates that the results should be sorted in reverse order. The names of the sort terms come from the column headings and are case sensitive.
- Click Done to save the filters and return to the results table.
Sort1 Data Filter
- Select “Sort1” from the Current Filter drop-down list.
- The results are shown below. The data should be sorted in reverse alphabetical order by result. Some of the columns are hidden to make the relevant results easier to see.
Sort1 Data Filter Results
- Click Edit Filters.
- Select “Sort2” from Filter drop-down list.
- Enter ["err", "-result"] in the Sort Term field. This sorts the data first by Error code and then by result in reverse alphabetical order (a conceptual sketch of this two-key sort is given at the end of this subsection).
- Click Done.
Sort2 Data Filter
- Select “Sort2” in the Current Filter drop-down list.
- The results are shown below. The data should be sorted so that all Error code zero results come first, and are then sorted in reverse alphabetical order by result.
Sort2 Data Filter Result
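As a conceptual sketch only (plain Python, not FOQUS code), the ["err", "-result"] specification behaves like a stable two-key sort, ascending by err and descending by result:

rows = [{"err": 0, "result": "b"}, {"err": 1, "result": "a"}, {"err": 0, "result": "c"}]
rows.sort(key=lambda r: r["result"], reverse=True)  # secondary key: result, reversed ("-result")
rows.sort(key=lambda r: r["err"])                   # primary key: err (stable sort keeps the result order)
# rows is now ordered by err, and within equal err values by result in reverse order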
Filtering Data¶
- Open FOQUS.
- Open the Simple_flow.foqus session from the example files.
- Click the Flowsheet button in the Home window.
- Click the Results Data button (Table icon in left toolbar).
- In the data table dialog, click Edit Filters.
- Click New Filter and enter “Filter1” in the Filter field as the new filter name.
The filter expression is a Python expression. The c("Column Name") function returns a numpy array containing the column data. The expression should evaluate to a column of bools, where rows containing True are included in the filtered results and rows containing False are excluded. If combining multiple logical expressions, the numpy logical functions (https://docs.scipy.org/doc/numpy-1.15.1/reference/routines.logic.html) should be used. Numpy is imported as np.
- In this example, results without errors in the “Single_runs” set should be selected. In the filter expression field enter np.logical_and(c("err") == 0, c("set") == "Single_runs") (a small sketch of how such an expression evaluates is given after these steps).
- Click Done.
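The following is a small sketch (outside FOQUS) of how such a filter expression evaluates. The column values are made up for illustration, and plain numpy arrays stand in for what the c() function returns.

import numpy as np

err = np.array([0, 0, 1, 0])                                                       # stands in for c("err")
setname = np.array(["Single_runs", "UQ_Ensemble", "Single_runs", "Single_runs"])   # stands in for c("set")
mask = np.logical_and(err == 0, setname == "Single_runs")                          # boolean array; True rows are kept
print(mask)  # [ True False False  True]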
Filter1 Data Filter
- In the data table dialog, select “Filter1” from the Current Filter drop-down list.
- The result is displayed in the Figure below.
Filter1 Data Filter Result
Exporting Data¶
This tutorial uses a spreadsheet program such as Excel or Open Office. The exported data is subject to the selected filter. See the previous tutorials in this section for more information about sorting and filtering data to be exported.
FOQUS can export data directly to the Clipboard. The data can be pasted into a spreadsheet or as text. Copying data to the Clipboard eliminates the need for an intermediate file when creating spreadsheets.
- Open FOQUS.
- Open a spreadsheet program.
- Open the Simple_flow.foqus session from the example files.
- Click the Flowsheet button in the Home window.
- Click the Results Data button (Table icon in left toolbar).
- Click on the Menu drop-down list in the data table dialog.
- Select “Export” from the Menu drop-down list.
- Click Copy Data to Clipboard.
- Select Paste in the spreadsheet program. The data table in FOQUS should paste into the spreadsheet. Filters can be used to sort or reduce the exported data.
CSV (comma separated value) files can be read by almost any spreadsheet program, and are common formats readable by many types of software. FOQUS exports CSV files using the column headings from the data table as a header.
- Open FOQUS.
- Open a spreadsheet program.
- Open the Simple_flow.foqus session from the example files.
- Click the Flowsheet button in the Home window.
- Click the Results Data button (Table icon in left toolbar).
- Click the Menu drop-down list.
- Select “Export” from the Menu drop-down list.
- Click Export to CSV File.
- Enter a file name in the file dialog.
- In the spreadsheet program, open the CSV file exported in the previous step.
Tutorial 5: Using a Remote Turbine Instance¶
A remote Turbine instance may be used instead of TurbineLite. TurbineLite, used by default, runs simulations (e.g., Aspen Plus) on the user’s local machine. The remote Turbine gateway has several potential advantages over TurbineLite, while the main disadvantage is the effort required for installation and configuration. Some reasons to run a remote Turbine instance are:
- Simulations can be run in parallel. The Turbine gateway can distribute simulations to multiple machines configured to run FOQUS flowsheet consumers. FOQUS consumers are additional instances of FOQUS running on remote systems that can evaluate a FOQUS flowsheet.
- Simulations can be run on machines other than the user’s, so as not to tie up the user’s machine running simulations.
Running Remote Turbine on Your Own Computer¶
For this tutorial, the FOQUS file is Simple_flow.foqus, and this file is located in: examples/tutorial_files/Flowsheets/Tutorial_5
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
To run a remote Turbine instance on your own computer (e.g., if your computer has multiple processors):
- Navigate to the folder where your FOQUS working directory is located (“A” in Figure FOQUS Working Directory and Folder).

FOQUS Working Directory and Folder
- Create a blank folder (“B” in Figure FOQUS Working Directory and Folder). Here, this folder is called “Foqus_1”.
- Open the Anaconda prompt, and navigate to the folder where you downloaded FOQUS from GitHub (Please see Figure The Anaconda Prompt for Remote Turbine).

The Anaconda Prompt for Remote Turbine
- Once you navigate to the above-mentioned folder, type "foqus --consumer -w" (without the quotes) followed by the location of the folder created in Step 2 (in quotes); please see Figure The Anaconda Prompt for Remote Turbine. A full example command is shown after this procedure.
- If successful, a message will appear (shown at the bottom of Figure The Anaconda Prompt for Remote Turbine).
- If you have not done so already, open another Anaconda prompt and use it to open FOQUS.
- Click “Settings” at the top menu (“C” in Figure Remote Turbine Setting).

Remote Turbine Setting
- Under “FOQUS Flowsheet Run Method”, select “Remote” (“A” in Figure Remote Turbine Setting).
- A message box will appear to warn the user that the simulations may need to be re-uploaded to Turbine. Click “OK” to continue (“B” in Figure Remote Turbine Setting).
- Click the “Turbine” tab (“A” in Figure Turbine Lite Setup for Running Turbine Remotely).

Turbine Lite Setup for Running Turbine Remotely
- If necessary, copy the item in “Turbine Configuration (remote)” (“C” in Figure Turbine Lite Setup for Running Turbine Remotely) to a convenient location (e.g., Notepad), just in case.
- Make sure that the item under “Turbine Configuration (remote)” (“C” in Figure Turbine Lite Setup for Running Turbine Remotely) is the same as the item under “Turbine Configuration (local)” (“B” in Figure Turbine Lite Setup for Running Turbine Remotely). The reason you will be using Turbine Lite (instead of Turbine Gateway) is that you will be running the simulation remotely, but still on your own computer (instead of on other computers or in AWS - Amazon Web Service).
- Run the flowsheet. The run should be successful (Figure Example of Running the Flowsheet with Remote Turbine).

Example of Running the Flowsheet with Remote Turbine
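For reference, the full consumer command from Step 4 would look something like the following; the path here is hypothetical and should be replaced with the folder created in Step 2.

foqus --consumer -w "C:\Users\<username>\FOQUS_working_dir\Foqus_1"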
Running Remote Turbine on AWS (Amazon Web Service)¶
The steps below demonstrate how to set up FOQUS to run flowsheets remotely if the user would like to run FOQUS in parallel in AWS (see Figure Remote Turbine Settings).
- Obtain a user name, password, and URL from the site’s Turbine administrator.
- Open FOQUS.
- Click Settings at the top right of the Home window (Figure Run Method Settings).
- Select “Remote” from the FOQUS Flowsheet Run Method drop-down list. A message box will appear. The user will be warned that the models that have been uploaded to Turbine Local may not be available in Turbine Remote Gateway, which means that the user may need to upload the models into Turbine again (please see Step 7).
- Click the Turbine tab; this displays the Turbine settings shown in Figure Remote Turbine Settings.
Run Method Settings
- Create a Turbine configuration file; this contains your password in
plain text, so it is very important that if you are allowed to choose
your own password, you choose one that is not used for any other
purpose.
- Click New/Edit next to the Turbine Configuration (remote) field. The Turbine Configuration window displays (see Figure Remote Turbine Settings).
- Select “Cluster/Cloud” from the Turbine Gateway Version drop-down list in the Turbine Configuration window.
- Enter the Turbine URL in the Address field.
- Enter the User name and Password.
- Click Save as and enter a new file name.
- Set the remote Turbine configuration file. Click Browse next to the Turbine Configuration (remote) field. Select the file created in Step 6e.
Remote Turbine Settings
At this point the remote gateway is ready to use. The last step is to ensure that all simulations referenced by flowsheets to be run are uploaded to the remote Turbine gateway.
- Upload any necessary simulations to Turbine (see Section Adding or Changing Turbine Simulations and the tutorial in Section Tutorial 2: Creating a Flowsheet with Linked Simulations)
Once all settings are specified there is no apparent difference between running flowsheets locally or on a remote Turbine gateway, and FOQUS can readily be switched between the two.
Optimization¶
Contents¶
Reference¶
The simulation based optimization tool provides a plug-in system where different derivative free optimization (DFO) solvers can be used with FOQUS. Several solvers are provided with FOQUS. The CMA-ES solver (Hansen 2006) is a good global DFO solver. The NLopt library provides access to several DFO solvers (Johnson 2015). SLSQP and BFGS from the Scipy module are also provided (Jones et al. 2015). Since FOQUS does not generally have access to derivative information, the Scipy solvers rely on finite difference approximations and should only be used with well-behaved functions. Due to convergence tolerances in process simulators, finite difference approximations may not be good for many of FOQUS’s intended applications.
CMA-ES offers a restart feature, which can be used to resume an optimization if it is interrupted for any reason. Other solvers may use an auto-save feature, which does not provide the ability to restart, but will allow optimization to start from the best solution found up to the point the optimization was interrupted. Samples making up the population in CMA-ES can be run in parallel. The NLopt and Scipy plugins do not offer parallel computing for standard optimization. For any solver, parallel computation can be used for parameter estimation and optimization under uncertainty, where multiple flowsheet evaluations go into an objective function calculation.
Problem Set Up¶
See Chapter Flowsheets and Settings for information about setting up a flowsheet in FOQUS. Once the flowsheet has been set up and tested, an optimization problem can be added. FOQUS allows multiple flowsheet evaluations to be used to calculate a single objective function value. This allows FOQUS to do parameter estimation and scenario based optimization under uncertainty. There are three types of variables used in the optimization problem: (1) fixed variables do not change during the optimization, (2) decision variables are modified by the optimization algorithm to find the best value of the objective function, and (3) sample variables, which are used to construct the multiple flowsheet evaluations that can go into an objective calculation. If no sample variables are defined, each objective function value will be based on a single flowsheet evaluation. Figure Optimization Variable Selection shows the Variables tab selection form.
Optimization Variable Selection
- The Variables tab contains the form for variables selection.
- The Variable column shows the name of input variables in the flowsheet. If a variable is set by a connection to another variable through an edge, it is not shown in the table. The format for a variable name is {Node Name}.{Variable Name}.
- The Type column allows the variables to be assigned as one of three types (1) fixed, (2) decision, or (3) sample.
- The Scale column allows the scaling method to be set for each variable. Decision variables must be scaled. Scaling is ignored for other variables. In the FOQUS example files, there is a scaling spreadsheet that provides a demonstration of the different scaling methods. The upper and lower bounds are used in the scaling calculations. Regardless of the scaling method, the optimizer sees the decision variables as running from 0 at their minimum to 10 at their maximum (see the formula after this list).
- The Min and Max columns are used to define the upper and lower bounds for the variables. FOQUS requires that all optimization problems be bounded.
- The Value column provides the starting point for the optimization. How the starting point is used depends on the optimization method. The starting point for sample variables is irrelevant. Fixed variables will remain at their starting point during the optimization.
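For example, with linear scaling the relationship implied above is (a sketch of the convention described here, not a statement of FOQUS internals):

\[x_{scaled} = 10\,\frac{x - x_{min}}{x_{max} - x_{min}}\]

so a decision variable at its lower bound maps to 0 and at its upper bound maps to 10.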
The sample variables define a set of samples that will be used to calculate an objective function. For each objective function, the decision variables are fixed at values set by the optimization solver, and the flowsheet is evaluated for each row on the sample table. The results of the samples can be used to calculate the objective function. Using the Samples tab is optional. If no sample variables are set, each objective function value will be based on a single simulation. Figure Optimization Sample Table shows the Samples table form.
Optimization Sample Table
- The Samples tab contains the table used to define samples for objective function calculations. If there are no sample variables, the table should be empty.
- Add Sample adds a row to the Samples table.
- Delete Samples deletes the selected rows from the Samples table.
- Generate Samples opens a dialog box that provides a selection of methods to generate samples or read samples from a file.
- Clear Samples clears the Samples table.
Once the variables and (optionally) samples have been selected, the objective function and constraints can be defined. FOQUS is set up to handle multi-objective optimization, but no multi-objective optimization plug-ins are currently provided in the FOQUS installer, so some of the options may seem to be extraneous. There are two methods for entering the objective function and constraints into FOQUS: (1) Simple Python expressions and (2) a more extensive Python function. Python expressions are easier and sufficient for most cases. If the objective function is complicated it may be necessary to write a Python function, which can be as complex as needed.
The variables used in the Python code for the objective function or constraints are stored in two Python dictionaries, “f” for outputs and “x” for inputs. There are two ways to index the dictionaries, depending on whether or not sample variables are used. For an input variable with sampling, the indexing is x[Sample Index]['Node Name']['Variable Name'][Time Step Index]. If no sample variables are defined, the sample index is not needed, so the indexing would be x['Node Name']['Variable Name'][Time Step]. Node Name and Variable Name are strings, so they should be in quotes. The sample and time step indexes are integers. For steady state simulations, the time step should be 0.
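For example, assuming a node named “BFB” with an output variable “removalCO2” (names taken from the tutorial later in this chapter) and a steady state simulation (time step 0), the two forms would be:

f[0]['BFB']['removalCO2'][0]   # with sample variables: sample 0, node "BFB", variable "removalCO2", time step 0
f['BFB']['removalCO2'][0]      # without sample variables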
Figure Optimization Simple Objective Function shows the form for entering the objective function and constraints as Python expressions.
Optimization Simple Objective Function
- The Objective/Constraints tab contains the form used to enter the objective function and constraints.
- The drop-down list enables the selection of either the “Simple Python Expression” or “Custom Python” form of the objective function.
- + adds an objective function to the table. The solvers currently available are single objective and will only use the first objective function.
- - removes the selected objective from the table.
- The Python expression for the objective function can be entered in the Expression column.
- The Penalty Scale column is intended for use with multi-objective solvers and allows the constraint violation penalty to be applied differently to objective functions with different magnitudes.
- The Value for Failure column contains the value to be assigned to the objective function if the objective cannot be evaluated for any reason. The value should be higher than the expected highest value for a successful objective.
- + adds an inequality constraint.
- - removes selected inequality constraints.
- The inequality constraints are in the form \(g(\mathbf{x}) \leq 0\). The Expression column contains the Python expression for \(g(\mathbf{x})\).
- The Penalty Factor contains the coefficient \(a\) used in calculating the penalty for a constraint violation, see Equations (1) to (3).
- The Form column contains a selection of different methods to calculate a constraint penalty.
- Check Input checks the problem for any mistakes that can be detected before running the optimization.
- Variable Explorer enables the user to browse the variables in the simulation. They can be copied and pasted into the Python expression. The variables are provided without the sample index.
The calculations for each type of constraint penalty are given in Equations (1) to (3).
If the Simple Python Expression method of entering the objective function does not offer enough flexibility, the Custom Python method can be used. The Custom Python method enables the user to enter the objective calculation as a Python function, which also should include any required constraint penalties.
Figure Custom Objective Function shows the Custom Python objective form. The top text box provides instructions for writing a custom objective function. The bottom text box provides a place to enter Python code. The numpy and math modules have been imported and are available as numpy and math. To use the Custom Python objective, the user must define a function called “objfunc(x, f, fail).” The three arguments are: (1) “x” is the dictionary of input variables, (2) “f” is the dictionary of output variables, and (3) “fail” is a boolean vector that indicates whether a particular sample calculation has failed. The “objfunc” function should return three values: (1) a list of objective function values for multi-objective optimization (in most cases with single objective optimization this will be a list with one value), (2) a list of constraint violations, and (3) the total constraint penalty. The constraint violation and penalty information are only used for debugging, so they are not required. It is safe to return [0] and 0 for the constraint information regardless of whether a constraint penalty has been added to the objective.
Custom Objective Function
The code in Figure Objective Function Code provides an example of a custom objective function for parameter estimation. The objective function minimizes the sum of the differences between simulation and empirical data. In this case the decision variables would be model parameters. The first line defines a function with three arguments. The “x” and “f” arguments are the input and output variables. The variable indexing is explained in the simple objective function section. The “fail” argument is a boolean vector where element “i” is true if sample “i” failed. If there are no sample variables, “fail” will only have one element.
The “if” in the function determines whether any flowsheet evaluation failed, and assigns a bad objective function value if so. If all the flowsheet evaluations were successful, the results are used to calculate the objective function. In the objective function calculation, Python list comprehension is used to calculate the sum of squared errors. In this case, no constraint penalty is needed. The objective function is returned as a list with only one element. The last two return values are debugging information for constraints. In this case, the “zeros” are just placeholders and have no real utility.
def objfunc(x, f, fail):
    if any(fail):  # any simulation failed
        obj = 100000
    else:  # simulations successful
        obj = sum([(f[i]['Test']['y'][0] - x[i]['Test']['ydata'][0])**2
                   for i in range(len(f))])
    return [obj], [0], 0
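As a quick standalone check of the function above (outside FOQUS), with two made-up samples from a node named “Test”:

x = [{'Test': {'ydata': [1.0]}}, {'Test': {'ydata': [3.0]}}]
f = [{'Test': {'y': [1.1]}}, {'Test': {'y': [2.7]}}]
print(objfunc(x, f, [False, False]))   # approximately ([0.1], [0], 0)
print(objfunc(x, f, [False, True]))    # ([100000], [0], 0), because one sample failed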
Solver Options¶
The Solver tab in the Optimization button tool enables the selection of the DFO method and setting of solver parameters. Figure Optimization Solver Form illustrates the solver form.
Optimization Solver Form
Elements of the solver form are:
- Select Solver drop-down list, which enables the user to select from available DFO solvers.
- Description text box provides a description of the selected DFO solver.
- Solver Options table contains the solver settings and a description of each option. The settings depend on the selected plug-in.
Running Optimization¶
The optimization monitor is displayed under the Run tab in the Optimization button tool. The optimization monitor, illustrated in Figure Optimization Monitor Form, is used to monitor the progress of the optimization as it runs.
Optimization Monitor Form
Elements of the optimization monitor are:
- Start starts the optimization.
- Stop stops the optimization. The best solution found when optimization is stopped is stored in the flowsheet.
- Update delay is how often the user interface communicates with the optimization thread to update the display.
- Optimization Solver Messages displays output from the optimization solver.
- Best Solution Parallel Coordinate Plot displays the scaled values of the decision variables. This plot is helpful in identifying when variables are at, or near, their bounds.
- Objective Function Plot displays the objective function value at each iteration.
- Status Box displays the current iteration, how many samples have been run, how many samples were successful, and how many failed.
- Clear deletes solver messages from the solve message box.
As the optimization runs, the FOQUS flowsheet is updated to include the best solution found. If sampling is used, the first sample in the best objective function is stored in the flowsheet. If for any reason the optimization terminates, the best solution found is available in the flowsheet. The results for all flowsheet evaluations done for the optimization are available in the Results table in the Flowsheet Editor.
Tutorial¶
Tutorial 1: Optimization¶
This tutorial is a step-by-step walk through of simulation-based optimization. This tutorial builds on the tutorial in Section Tutorial 2: Creating a Flowsheet with Linked Simulations.
The files for this tutorial are located in: examples/test_files/Optimization/Model_Files/
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
- Open FOQUS.
- Load the FOQUS session from the tutorial “Creating a Flowsheet with Linked Simulations” in Section Tutorial 2: Creating a Flowsheet with Linked Simulations or if that tutorial has not yet been completed, complete it first.
Problem Set Up¶
If the simulation runs successfully and the results are reasonable, proceed to define the optimization problem. There are four steps to setting up the optimization problem: (1) select the variables, (2) define samples (optional), (3) define the objective function and constraints, and (4) select and configure the solver.
- Select the Optimization button from the toolbar at the top of the Home window (Figure Optimization Problem Variables). Select the Variables tab.
- Select “Decision” from the drop-down list in the Type column as the variable type for all 17 variables shown. If more than 17 variables are shown, the edge connecting the “BFB” node to the “Cost” node was most likely not configured properly. The scale will automatically change to linear, which is acceptable for most problems.
- The Min, Max, and Value columns can be changed. The Min and Max columns define the lower and upper bounds. The Value column specifies the initial point. For this example the defaults are acceptable.
Optimization Problem Variables
If more than one flowsheet evaluation is used in the objective function calculation (e.g., parameter estimation or optimization under uncertainty), the next step is to setup the samples under the Samples tab. In this case only one evaluation is used to calculate an objective function value, so the sample setup is not needed. The next step is to define the objective function and constraints using the form under the Objective/Constraints tab as shown in Figure Optimization Problem Objective.

Optimization Problem Objective
Select the Objective/Constraints tab (see Figure Optimization Problem Objective).
In the drop-down list, verify “Simple Python Expression” is selected.
Add an objective function by clicking + to the right of the Objective Function table.
- The objective function is the cost of electricity from the cost spreadsheet. Enter:
f.Cost.COE
in the Expression column. Enter 1 in the Penalty Scale column. This setting is used mostly for multi-objective optimization to apply the constraint penalty to different objectives.
Enter 500 in the Value for Failure column. This should be worse than the objective for any non-failed simulations.
Add a constraint by clicking + next to the Inequality Constraints table.
- The constraint is that the fraction of CO\(_2\) captured must be greater than or equal to 0.9. The constraint is in the form \(g(\mathbf{x}) \leq 0\); therefore, in the Expression column enter:
0.9 - f.BFB.removalCO2
Enter 1000 for the Penalty Factor.
The constraint penalty Form should be linear.
The Variable Explorer button can be used to help select flowsheet variables.
Solver Settings¶
The last step before running the optimization is to select and configure the solver. The solver configuration form is shown in Figure Optimization Solver Setup.
Optimization Solver Setup
- Select the Solver tab (see Figure Optimization Solver Setup).
- Select “OptCMA” from the Select Solver drop-down list.
- The default options are acceptable. Solver options are described in the Solver Options table.
Running Optimization¶
The optimization run form is shown in Figure Optimization Monitor.
Optimization Monitor
- Click the Run tab to display the optimization run form (see Figure Optimization Monitor).
- Click Start.
- Once the optimization has run for a while, click Stop.
As the optimization runs, the best result found is stored in the flowsheet. If an optimization is run with sample variables, the first sample in the set with the best objective function will be stored in the flowsheet. All simulation results can be viewed in the Flowsheet Results table.
The run form displays some diagnostic information as the optimization runs. The parts of the display labeled in Figure Optimization Monitor are described below.
- The Optimization Solver Messages window displays information from the solver.
- The Best Solution Parallel Coordinate Plot shows the value of the scaled decision variables, which is useful to see where the best solution is relative to the variable bounds.
- The Objective Function Plot shows the best value of the objective function found as a function of the optimization iteration or sample number.
- While the optimization is running, the status bar shows the amount of time that has elapsed since starting the optimization.
Tutorial 2: Parameter Estimation¶
Note: The NLopt solvers are used for this tutorial, but they are an optional part of the installation. See the install instructions for more information about installing NLopt.
This tutorial provides a very simple example of using sampling with optimization. Sampling can be used to do optimization under uncertainty, where there are several scenarios with differing values of uncertain parameters. Sampling can also be used to do parameter estimation, where estimated values must be compared against several data points. This tutorial will focus on parameter estimation.
At any point in this tutorial, the FOQUS session can be saved and the tutorial can be started again from that point.
The model is given by Equation (1), \(y = ax^{2} + bx + c\). The unknown parameters are \(a\), \(b\), and \(c\). The x and y data are given in Table x-y Data.
| Sample | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| x | 0 | 1 | 2 | 3 | 4 |
| y | 1 | 0 | 3 | 10 | 21 |
The first step is to create a flowsheet with one node. The node will have the input variables: a, b, c, x, and ydata; and output variable y.
- Open FOQUS.
- In the Session Name field, enter “PE_tutorial” (see Figure Session Setup).
- Click the Flowsheet button in the top toolbar.
Session Setup
- Add a node to the flowsheet named “model.”
- Click Add Node in the left toolbar (see Figure Adding Node and Inputs).
- Click anywhere on the gridded flowsheet area.
- Select “model” in the Name drop-down list and then click OK.
- Click the Selection Mode icon in the left toolbar (see Figure Adding Node and Inputs).
- Click the Node Editor icon in the left toolbar (see Figure Adding Node and Inputs).
- In the Node Edit input table, add the variables a, b, c, x, and
ydata. The ydata variable will be used as an input for the known y
sample point data, later in the tutorial.
- Click the Add Input icon (see Figure Adding Node and Inputs).
- Enter “a” for the variable name in the Name column.
- Enter -10 and 10 for the min and max in the Min and Max columns for a, b, c, and x.
- Repeat for all of the inputs.
- Enter 1 for the value of a, b, and c in the Value column.
- Enter 2 for the value of x in the Value column.
- The Value, Min, and Max for ydata do not matter.

Adding Node and Inputs
- Click Output Variables (see Figure Adding Outputs).
- Add the output variable y.
- Click the Add Output icon (see Figure Adding Outputs).
- Enter “y” for the variable name in the Name column.
Adding Outputs
Add the model equation to the node.
Click the Node Script tab.
Enter the following code in the calculations box:
f['y'] = x['a']*x['x']**2 + x['b']*x['x'] + x['c']
Adding Node Calculation
- Return to the Output Variables table in the Node Editor, by clicking on the Variables tab, and selecting Output Variables.
- Click Run in the left toolbar in the FOQUS Home window, to test a single flowsheet evaluation and ensure there are no errors.
- When the run is complete, there should be no error and the value of y should be 7 in the Output Variables table.
The next step is to setup the optimization. The objective function is to minimize the sum of the squared errors between the estimated value of y and the observed value of y. There are five data points in Table x-y Data, so there are five flowsheet evaluations that need to go into the calculation of the objective.
Click the Optimization button in the top toolbar of the Home window (see Figure Optimization Variables).
- Select “Decision” in the Type column drop-down lists for “model.a,” “model.b,” and “model.c.” The Scale column will automatically be set to linear.
Select “Sample” in the Type column drop-down lists for “model.x” and “model.ydata.”
Optimization Variables
The decision variables in the optimization problem will be changed by the optimization solver to try to minimize the objective, and the sample variables are used to construct the samples that will go into the objective function calculation.
- Select the Samples tab (see Figure Optimization Samples).
- Click Add Sample five times to add five samples.
- Enter the data from Table x-y Data in the Samples table.
- For larger sample sets, Generate Samples has an option to load from a CSV file. The CSV file must be saved as “CSV (MS-DOS)” as the file type, as follows:

Sample Variable data (csv file)
Optimization Samples
The objective function is the sum of the square difference between y and ydata for each sample in Table x-y Data. The optimization solver changes the a, b, and c to minimize the objective.
Click the Objective/Constraints tab.
Click the Add Objective icon on the right side of the Objective Function table (see Figure Objective Function).
In the Expression column, enter the following (without any line break):
sum([(ff.model.y - xx.model.ydata)**2 for (ff,xx) in zip(f,x)])
The above expression uses Python list comprehension to calculate the sum of squared errors.
- The keys for x (the inputs) and f (the outputs) are:
- the dummy variable name for the sample index (i.e., ff for outputs and xx for inputs)
- the node name (i.e., model)
- the variable name (i.e., y and ydata)
The user must then specify which dummy index corresponds to the outputs and which corresponds to the inputs. In this case, ff is for the outputs and xx is for the inputs; hence, “for (ff,xx) in zip(f,x)” (without the quotes). A plain-Python sketch of this calculation is given after the figure below.
Enter 1 for the Penalty Scale.
Enter 100 for the Value for Failure.
No constraints are required.

Objective Function
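The following plain-Python sketch (not FOQUS code) shows what the objective expression computes. The model output values are illustrative only, one per sample, and the data values come from Table x-y Data.

ymodel = [1.2, 0.1, 2.8, 9.7, 21.4]   # stands in for ff.model.y across the five samples (illustrative)
ydata = [1, 0, 3, 10, 21]             # stands in for xx.model.ydata (from Table x-y Data)
sse = sum([(ym - yd)**2 for (ym, yd) in zip(ymodel, ydata)])   # sum of squared errors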
Once the objective is set up, a solver needs to be selected and configured. Almost any solver in FOQUS should work well for this problem with the default values.
- Click the Solver tab (see Figure Optimization Samples).
- Select “NLopt” from the Select Solver drop-down list. NLopt is a collection of solvers that share a standard interface (Johnson 2015).
- Select “BOBYQA” under the Solver Options table in the Settings column drop-down list.
Optimization Samples
- Click the Run tab (see Figure Running Optimization).
- Click the Start button.
- The Optimization Solver Messages window displays the solver progress. As the solver runs, the best result found is placed into the flowsheet.
- The Best Solution Parallel Coordinate Plot shows the scaled decision variable values for the best solution found so far.
- The Objective Function Plot shows the value of the objective function as the optimization progresses.
Running Optimization
The best result at the end of the optimization is stored in the flowsheet. All flowsheet evaluations run during the optimization are stored in the flowsheet results table.
- Once the optimization has completed, click Flowsheet in the top toolbar.
- Open the Node Editor and look at the Input Variables table. The approximate result should be \(a = 2\), \(b = -3\), and \(c = 1\) (see Figure Flowsheet, Input Variables Results).
Flowsheet, Input Variables Results
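As a quick check (outside FOQUS) that these parameter values reproduce the data in Table x-y Data:

a, b, c = 2, -3, 1
xs = [0, 1, 2, 3, 4]
print([a*x**2 + b*x + c for x in xs])   # [1, 0, 3, 10, 21], matching the y data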
Uncertainty Quantification (UQ)¶
Contents¶
Reference¶
The Uncertainty Quantification (UQ) module of FOQUS provides a multitude of analysis and visualization capabilities to facilitate the understanding of uncertainty’s impact on a given system. In a generic UQ study, the workflow is usually comprised of the following steps:
- Define the objectives of the analysis.
- Specify and acquire the simulation model, which implements a mapping from inputs to outputs.
- Select the inputs that have uncertainty and characterize said uncertainty in the form of prior distributions.
- Identify relevant data from physical experiments that can be used to refine these prior distributions on the inputs.
- Generate a set of input samples according to the input distribution.
- Propagate the set of input samples through the simulation model to get the corresponding output values.
- Analyze the results to make informed decisions about subsequent analyses.
FOQUS UQ provides tools to perform Steps 5-7. With respect to Step 7, a variety of analysis capabilities are available. They include parameter screening methods, response surface construction/validation/prediction, uncertainty analysis, sensitivity analysis, and visualization.
In this chapter, components of the UQ user interface are first explained, then the use of these components for UQ analyses is illustrated.
UQ User Interface¶
The UQ module enables the user to perform UQ studies on a flowsheet. From the Uncertainty button on the Home window, the user can configure different simulation ensembles (different sets of samples generated using different sampling schemes), run them, and perform analyses. This screen is illustrated in Figure Uncertainty Quantification Screen.

Uncertainty Quantification Screen
Simulation Ensemble Table displays all of the simulation ensembles: each ensemble being a row in the table. A simulation ensemble is a collection of sample points where each sample point has a different set of values for the uncertain variables. The values of these variables are generated based on the sampling scheme designated by the user. When launched, the output values of the sample points are calculated based on the generated sample input values. Subsequently, the corresponding simulation outputs can be analyzed. For each ensemble, the table displays the Ensemble index, Run Status (how many have been completed), Setup and Launch options (discussed below), and a Descriptor. The Descriptor contains the name of the corresponding node in the flowsheet or the name of the file if the ensemble was loaded from a file. Additional sample information such as # Inputs, # Outputs, Sample Design, and Sample Size are also displayed on the right.
- Add New creates a simulation ensemble (a set of input samples) as a new row in the Simulation Ensemble Table. Once clicked, a dialog is displayed to prompt the user to choose between using (1) a flowsheet (an exact simulation model) or (2) a response surface (an approximate simulation model or an emulator) associated with the ensemble. If using an emulator, the user must browse a PSUADE-formatted file that contains the training data for the emulator (in this version, the response surface type has been designated inside the sample file and can only be changed by editing the sample file) and select the output(s) to be evaluated by the trained emulator. Subsequently, a simulation setup dialog box is displayed for setting up the distributions of input variables and the sampling scheme to generate samples of the uncertain input variables. This Simulation Ensemble Setup dialog is explained in further detail in Section Simulation Ensemble Setup Dialog.
Load from File loads a simulation ensemble from a sample file that conforms to the PSUADE full file format. (See Section File Formats for details on the PSUADE full file format.) The user can click Save Selected to save an existing ensemble as a PSUADE full file, and load it by clicking Load from File to perform further analyses.
Clone Selected clones the selected simulation ensemble and adds the copy as a new row at the end of the table. This ensemble can then be edited (e.g., depending on whether the ensemble has been run, the user has different options for modifying the ensemble). This allows the user to create a new ensemble similar to the current ensemble without having to start from scratch (i.e., setting up the input parameters). For example: (1) Clone Selected can be used in conjunction with Load from File to clone an existing ensemble before input/output modification to prepare a new but similar ensemble for UQ analysis. (2) Clone Selected can also be used to prepare a fresh ensemble for evaluation via a different simulation model. In this case, the user should save the cloned ensemble, reload it by clicking Add New, associate it with a node, and then click Launch to start the runs.
Delete Selected deletes the currently selected simulation ensemble.
Revise enables a user to change a simulation ensemble before launching the run. If the ensemble was previously run or it is cloned from an already-generated sample, the corresponding button becomes View so the user can view the input samples in the simulation ensemble.
Launch starts the simulation process of the ensemble. (Launch is not enabled until the user has setup everything needed for simulations.) A simulation is launched for each sample point to compute the corresponding outputs.
Analyze, when enabled (after all simulation results are ready), enables the user to perform various UQ analysis to the ensemble. When clicked, a new dialog box displays, allowing the user to configure and run analysis.
Data Manipulation enables (1) the deletion of inputs, outputs, or samples, (2) the modification of output values for specific sample points, and (3) the range-based filtering of samples.
Inspection/Deletion/Output Value Modification serves three purposes: it enables the user to (1) view the numerical values of samples in table form, (2) delete variables and/or samples, and (3) edit the output values of specific samples. Deletion creates a new simulation ensemble as a new row in the simulation table that contains only those inputs/outputs and samples that were not selected for deletion. Output Value Modification changes the values within the ensemble itself.
Filtering enables the user to filter samples based on the values of an input or output. First, select the ensemble to be filtered from the Simulation Ensemble Table. Once filtering is complete, a new simulation ensemble is added as a new row in the simulation table. The new simulation ensemble contains only those samples that satisfy the filtering criterion (with input or output samples within the specified range).
Reset Table resets the table to default, meaning all variable and sample selections are unselected and output values within the table are reverted back to their original values, thus undoing any edits to the table.
The table displays the values of inputs and outputs for each sample. Inputs are highlighted in pink; outputs are highlighted in yellow. The user can select which variables (columns) to delete by selecting the checkboxes on top. Likewise, the user can select which samples (rows) to delete by selecting the checkboxes on the left. Multiple samples can also be selected/deselected by using (1) Shift+Click or Ctrl+Click to select multiple rows, or (2) right-clicking to bring up a menu to check or uncheck the checkboxes corresponding to the rows of the selected samples. In addition, the user can change any output value by editing the appropriate cell. These modified cells are highlighted green until changes are made permanent by clicking the appropriate button.
Perform Deletion then Save as New Ensemble creates a new simulation ensemble as a new row in the Simulation Ensemble Table. The new ensemble is without the variables and samples that were previously selected for removal.
Make Output Value Changes Permanent overwrites the output values in the current ensemble with those that are highlighted green in the table.
The Filtering tab is illustrated in Figure Filtering Tab and enables the user to filter samples based on the values of an input or output.
Filtering Tab
Filtering Dialog Box
Click Add/Edit Filters in the Flowsheet Results window within the “Filtering Tab”.
- Within the Filter Dialog Box, click “New Filter” to add a filter.
- Enter a filter expression in Python format. Variables can be dragged into the expression from the “Columns” list. Click Done.
- Select a “Current Filter”, after which the filtered ensemble can be saved by clicking “Save as New Ensemble”.
The single-output Analysis of Ensemble dialog, which is displayed when Analyze is clicked for the selected ensemble, has two modes, as shown in Figure Analysis Dialog, Ensemble Data Analysis, Wizard Mode and Figure Response Surface Based Mixed Epistemic-Aleatory Uncertainty Analysis.
Analysis Dialog, Ensemble Data Analysis, Wizard Mode
Select Wizard or Expert mode. The Wizard mode provides more detailed guidance on how to perform UQ analysis. For users familiar with UQ analysis techniques, the Expert mode provides more functionality and flexibility but with less guidance on its use. For example, users will be able to customize the input distributions, as well as run more advanced uncertainty analysis that handles both epistemic and aleatory uncertainties.
The Analyses Performed section provides the user a history of previous analyses that were performed. The results of these analyses are cached, so the user can plot the analysis results without having to recompute them.
The Analysis Table populates as the user performs analyses. It lists previous analyses that the user has performed, along with some of the main analysis settings (analysis type, inputs and outputs analyzed, etc.)
Depending on the type of analysis performed, the Additional Info button displays any additional settings or parameters set by the user in the selected analysis that were not shown in the Analysis Table.
The Results button will display the results of the selected analysis.
The Delete button will delete the selected analysis from the history of previous analyses. Once deleted, the user will need to perform the analysis again to see its results.
The Qualitative Parameter Selection (top part of the Analysis of Ensemble dialog) houses the controls for parameter selection analysis. Parameter selection is a qualitative sensitivity analysis method that identifies a group of dominant input parameters that are recommended for inclusion in subsequent UQ analyses, as they are the ones that most impact the output uncertainty. The parameter screening results are shown as bar graphs so that the user can rank the uncertain parameters visually.
Before performing parameter selection, the user must select a single output for identifying parameter sensitivities from the Choose output to analyze drop-down list.
There are several methods of parameter selection. The list of parameter selection methods available depends on the sample scheme of the selected ensemble. Select the appropriate method from the Choose Parameter Selection Method drop-down list. Then click Compute input importance to start the analysis.
The Ensemble Data radio button directs FOQUS to perform analyses on the raw ensemble data.
To view plots of the raw ensemble data, choose the desired input(s) from the Select the input(s) drop-down lists. Then click Visualize. If multiple inputs are selected, each must be unique.
To perform an analysis, select the desired analysis (“Uncertainty Analysis” or “Sensitivity Analysis”) from the Choose UQ Analysis drop-down list. Uncertainty Analysis computes and displays the probability distribution of the single selected output parameter and displays its sufficient statistics, such as mean, standard deviation, skewness, and kurtosis. Sensitivity Analysis computes and displays each uncertain input parameter’s contribution to the total variance of the output. If Sensitivity Analysis is selected, choose the type of sensitivity analysis desired in the next drop-down list. There are three options for Sensitivity Analysis: (1) first-order, (2) second-order, and (3) total-order.
- First-order analysis examines the effect of varying an input parameter alone.
- Second-order analysis examines the effect of varying pairs of input parameters.
- Total-order analysis examines the total effect of varying an input parameter, both alone and in combination with any other input parameters. (Standard definitions of these sensitivity indices are sketched after this list.)
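For reference, the first-, second-, and total-order sensitivities correspond to the standard variance-based (Sobol) indices; the definitions below are the standard ones and are not quoted from the FOQUS documentation. For output \(Y\), input \(X_i\), and \(X_{\sim i}\) denoting all inputs except \(X_i\):

\[S_i = \frac{\mathrm{Var}\left(E[Y \mid X_i]\right)}{\mathrm{Var}(Y)}, \qquad S_{ij} = \frac{\mathrm{Var}\left(E[Y \mid X_i, X_j]\right)}{\mathrm{Var}(Y)} - S_i - S_j, \qquad S_{T_i} = 1 - \frac{\mathrm{Var}\left(E[Y \mid X_{\sim i}]\right)}{\mathrm{Var}(Y)}\]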
Click Analyze to run the analysis. (Note: Raw ensemble data analysis may not be suitable if the sample size is small. It may be useful if the data set has tens of thousands of sample points or if an adequate response surface cannot be constructed. Otherwise, response surface-based analyses are recommended.)
Analysis Dialog, Response Surface Analysis, Wizard Mode
Response Surface enables the user to perform all analyses related to response surfaces. A response surface is an approximation of the input-to-output relationship. This is an inexpensive way to approximate the values of outputs given different input values when the actual simulation of output values is computationally intensive. FOQUS uses the data (i.e., input-output samples) to fit a response surface scheme. The first step in this analysis is to select which output to analyze.
Select the Response Surface Model to be used to approximate the input-to-output mapping. Selection of “Polynomial” or “MARS” requires one further selection in the second drop-down list. If “Polynomial” is chosen in the first drop-down list and “Legendre” is chosen in the second drop-down list, the user needs to specify a number for the Legendre polynomial order before analysis can proceed.
The response surface selected must be validated before further analyses can be performed. The user can specify the error envelope for the validation plot. When Validate is clicked, the resulting plots display the best fit between the response surface (based on the model selected) and the actual data.
Choose UQ Analysis enables the user to perform response-surface-based UQ analyses. Select the analysis in the first drop-down list. If the desired analysis is Sensitivity Analysis, select the desired type of sensitivity analysis in the second drop-down list and then click Analyze. Uncertainty Analysis and Sensitivity Analysis compute and display the same quantities as in item 30. However, the results displayed are based on samples drawn from the trained response surface, not the simulation ensemble itself. Moreover, if the selected response surface has uncertainty, the resulting plots also reflect this uncertainty information.
FOQUS also provides visualization capabilities, enabling the user to view the response surface as a function of one or multiple inputs. Up to three inputs can be visualized at once. Click Visualize to view. A 2-D line plot displays if only one input parameter is selected. A 3-D surface plot and a 2-D contour plot display if two input parameters are selected. A 3-D isosurface plot with a slider bar displays if three input parameters are chosen. For the isosurface plot, the user can use the slider to selectively display the 3-D input parameter space that activates a particular range in the output parameter.
Finally, the Bayesian Inference of Ensemble dialog (shown in Figure Bayesian Inference Dialog) is used to calculate the posterior distributions (prior distributions integrated with data) of the uncertain input parameters. Inference utilizes Markov Chain Monte Carlo (MCMC) to compute the posterior distributions, using response surfaces that serve as fast approximations to the actual simulation model.
Bayesian Inference Dialog
Inference uses a response surface to approximate the input-to-output mapping. In Output Settings, select the observed outputs and select the response surface type that works best with each observed output. As in item 32, further selections may be required based on the response surface chosen. The simulation ensemble is used as the training data for generating the response surfaces.
The user can specify which inputs are fixed, design (fixed per experiment, but changes between experiments), or variable using the Input Settings Table. In addition, the user can specify which inputs are displayed in the resulting plots of the posterior distributions. By default, once inference completes, all inputs will be displayed in the plots. To omit specific inputs, clear the checkboxes from the Display column of the table. Finally, in Expert mode, this table can also be used to modify the input prior distributions. The default prior is the input distribution specified in the simulation ensemble. To change the prior distribution type, use the drop-down list in the PDF column and enter corresponding values for the PDF parameters. To change the range of a uniform prior, scroll all the way to the right to modify Min/Max.
The Observations section enables the user to add experimental data in the form of observations of certain output variables. At least one observation is required. Currently, the observation noise model is assumed to be a normal distribution. Other distributions may be supported in the future. To specify the observation noise model, enter the mean (and standard deviation, if standard inference is selected) for each output observation. For convenience, the Mean and Standard Deviation fields have been populated with the statistics from the ensemble uncertainty analysis. If any inputs are selected as design inputs, their values will also be required here.
Save Posterior Input Samples to File checkbox, when selected, saves the posterior input samples as a PSUADE sample file (format described in Section File Formats). This file characterizes the input uncertainty as a set of samples, which can be re-used in the Simulation Ensemble Setup dialog, to evaluate the outputs corresponding to these posterior input samples.
If saving posterior samples to a file, click Browse to set the name and location of where this file is saved.
Click Infer to start the analysis. (Note: If the inference returns an invalid posterior distribution (i.e., one with no samples), it usually means the prior distributions or the observation data distributions are not prescribed appropriately. In this case, it is recommended that the user experiment with different priors and/or data distribution means and/or standard deviations.)
Inference calculations often take a very long time. If inference has run to completion, use Replot to generate new plots (e.g., to only display a subset of the input posterior graphs) from the cached inference results.
Simulation Ensemble Setup Dialog¶
The Simulation Ensemble Setup dialog (shown in Figure Simulation Ensemble Setup Dialog, Distributions Tab) is used to create a new simulation ensemble. This is done by: (1) setting up distribution parameters and generating samples, or (2) loading samples from a file. This dialog is displayed when selecting Add New on the UQ window (Figure Uncertainty Quantification Screen).

Simulation Ensemble Setup Dialog, Distributions Tab
Choose how to generate samples. There are three options: (1) Choose sampling scheme (default), (2) Load flowsheet samples, or (3) Load all samples from a single file. Option 3 is explained in item 11.
If Choose Sampling Scheme is selected, the Distributions tab is displayed. The user specifies the input uncertainty information.
The Distributions Table is pre-populated with input variable information gathered from the flowsheet node. Under the Type column drop-down list, the user can select “Fixed” or “Variable”. Selecting “Fixed” means that the input is fixed at its default value for all the samples. Changing the type to “Variable” means that the input is uncertain; therefore, its value varies between samples. With any fixed input, the only parameter that can be changed is the Default value (i.e., all samples of this input are fixed at this default value). With any variable input, the Min/Max values, as well as the probability distribution function (PDF), for that input can be changed. Some PDFs have their own parameters (e.g., mean and standard deviation for a normal distribution), which are required in the columns to the right of the distribution column. See the PSUADE manual for more details on the different PDFs.
All Fixed and All Variable are convenient ways to set all the inputs to variable or fixed.
Note: A “Sample” PDF refers to sampling with replacement (i.e., input samples would be randomly drawn, with replacement, from a sample file). If the selected distribution for any input is “Sample”, then the following parameters are required: (1) the path of the sample file (which must conform to the sample format specified in Section File Formats); (2) the output index that designates which output is to be used.
In the Sampling scheme tab (Figure Simulation Ensemble Setup Dialog, Sampling Scheme Tab), specify the sampling scheme, the sample size, and perform sample generation.
Simulation Ensemble Setup Dialog, Sampling Scheme Tab
Each radio button displays a different list of sampling schemes on the right. The radio buttons serve as a guide to help in the selection of the appropriate sampling schemes for target analyses. A sampling scheme must be selected from the list on the right to proceed.
Set the number of samples to be generated from the # of samples spinbox.
When all parameters are set, click Generate Samples. This generates the values for all the input variables, based on the sampling scheme selected.
Once samples have been generated, click Preview Samples to view the samples that were generated. This displays the sample values in table form, as well as graphically as a scatter plot.
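Conceptually, sample generation maps points from a space-filling design onto the ranges of the variable inputs. A minimal outside sketch using scipy.stats.qmc (assuming SciPy 1.7 or later; the dimensions and bounds below are hypothetical):

```python
# Conceptual sketch of Latin Hypercube sample generation over two variable inputs.
from scipy.stats import qmc

sampler = qmc.LatinHypercube(d=2, seed=42)
unit_samples = sampler.random(n=500)                 # points in the unit hypercube [0, 1)^2

lower = [300.0, 0.1]    # hypothetical Min values for the two variable inputs
upper = [400.0, 0.9]    # hypothetical Max values
samples = qmc.scale(unit_samples, lower, upper)      # map onto the input ranges

print(samples.shape)    # (500, 2)
```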
From item 1, if the user elects to load all samples from a single file, click Browse to select the file containing the samples (Figure Simulation Ensemble Setup Dialog, Load Samples Option). This file must conform to the PSUADE full file format, the PSUADE sample format, or the CSV format (all formats are described in Section File Formats). Note: This is the only place where all the formats are supported. Once the file is loaded, the file name displays in the text box. These samples can now be used in the same way as an ensemble that was newly generated (as described above).
Simulation Ensemble Setup Dialog, Load Samples Option
Tutorials¶
This section contains five tutorials that illustrate the use of FOQUS UQ to facilitate the UQ workflow discussed above.
Each tutorial will refer to example files located in the examples directory of the FOQUS download.
Tutorial 1: Simulation Ensemble Creation and Execution¶
Creating a simulation ensemble using the variables’ distributions¶
In this tutorial, a simulation ensemble is created (using FOQUS) and run.
The FOQUS file for this tutorial is Rosenbrock_no_vectors.foqus, and this file is located in: examples/tutorial_files/UQ/Tutorial_1
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
From the FOQUS main screen, click the Session button and then select Open Session to open a session. Browse to the folder shown above, and select the “Rosenbrock_no_vectors.foqus” file (Figure Home Screen).
Home Screen
Opening this file loads a session that has a flowsheet with one node (Figure Flowsheet for Rosenbrock Example). See Section Tutorial 1: Creating a Flowsheet for a detailed example of creating a flowsheet.
Flowsheet for Rosenbrock Example
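For reference, the node in this example evaluates a Rosenbrock-type test function. The standard n-dimensional form is shown below; the exact expression coded in the example node may differ slightly.

```python
import numpy as np

def rosenbrock(x):
    """Standard n-dimensional Rosenbrock test function."""
    x = np.asarray(x, dtype=float)
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)

print(rosenbrock([1.0, 1.0, 1.0]))   # global minimum value: 0.0
```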
Click the Uncertainty button (Figure Uncertainty Quantification Screen).
Uncertainty Quantification Screen
Click Add New to create a new simulation ensemble.
The Add New Ensemble dialog displays (Figure Add New Ensemble Dialog, Flowsheet Option). The “Use flowsheet” option should be enabled.
- This item describes additional features and is provided for information only; it is not intended to be followed as part of the step-by-step tutorial. An alternative is to use an emulator by selecting “Use emulator.” This alternative is preferred if the actual simulation model is too computationally expensive to be practical for a large number of samples. This option enables the user to trade off accuracy for speed by training a response surface to approximate the actual simulation model. If this option is selected (Figure Add New Ensemble Dialog, Emulator Option), the user needs to provide a training data file containing a small simulation ensemble generated from the actual simulation model. This training data file should be in the PSUADE full file format (see section File Formats).
- Click Browse and select the training data file with which to train the response surface. The inputs, outputs, and response surface type are read from the training data and populated accordingly on this dialog box.
- Select Output(s) of Interest. To select multiple outputs, the user can use Shift + Click to select a range, or use Ctrl + Click to select/deselect individual outputs.
Click OK.
Add New Ensemble Dialog, Flowsheet Option
Add New Ensemble Dialog, Emulator Option
This displays the Simulation Ensemble Setup dialog box (Figure Simulation Ensemble Setup Dialog, Distributions Tab) that prompts the user for options specific to the creation of input samples.
Within the Distributions tab, the Distributions Table has all the inputs from the flowsheet node, each displayed in its own row.
- Click the All Variable button.
- Change the Type of “x2” to “fixed.”
- Enter 5 into the Default column for “x2.”
Subsequently, other cells in the row are enabled or disabled according to the type selection.
Simulation Ensemble Setup Dialog, Distributions Tab
In this dialog, extra options related to simulation ensemble setup are discussed.
Change the PDF of “x6” by exploring the drop-down list in the PDF column of the Distributions Table. The drop-down list is denoted by box (9c) in Figure Simulation Ensemble Setup Dialog, Distributions Tab, PDF Selection. If any of the parametric distributions are selected (e.g., “Normal”, “Lognormal”, “Weibull”), the user is prompted to enter the appropriate parameters for the selected distribution. If the non-parametric distribution “Sample” is selected, the user needs to specify the name of the sample file (in CSV or PSUADE sample format; both are described in Section File Formats) that contains samples for the variable “x6.” The user also needs to specify the output index to indicate which output in the sample file to use. The resulting simulation ensemble would contain “x6” samples that are randomly drawn (with replacement) from the samples in this file.
Simulation Ensemble Setup Dialog, Distributions Tab, PDF Selection
Alternatively, for the sample generation option (box (8) of Figure Simulation Ensemble Setup Dialog, Distributions Tab), try selecting “Load all samples from a single file.” With this selection, a new dialog box prompts the user to browse to a PSUADE full file, a PSUADE sample file, or a CSV file (all formats are described in Section File Formats) that contains all the samples for all the input variables in the model.
Both of these options offer the user additional flexibility with respect to characterizing input uncertainty or generating the input samples directly.
Once complete, switch to the Sampling Scheme tab (Figure Simulation Ensemble Setup Dialog, Sampling Scheme Tab).
Simulation Ensemble Setup Dialog, Sampling Scheme Tab
Select a sampling scheme. Assume the user is unsure which sampling scheme to use but wants to perform some kind of response surface analysis; the following steps help find a suitable one.
- Click For response surface analysis. Note the list on the right changes accordingly.
- Select “Latin Hypercube” from the list on the right.
To generate 500 samples, change the value in “# of samples.” Some sampling schemes may impose a constraint on the number of samples. If the user has entered an incompatible sample size, a pop-up window displays with guidance on the recommended sample size.
Click Generate Samples to generate the sample values for all the variable input parameters. On Windows, if the user did not install PSUADE in its default location (C:\Program Files (x86)\psuade_project 1.7.1\bin\psuade.exe) and the user did not update the PSUADE path in FOQUS settings (refer to Section Settings), then the user is prompted to locate the PSUADE executable in a file dialog.
Once the samples are generated, the user can examine them by clicking Preview Samples. This displays a table of the values, as well as the option to view scatter plots of the input values. The user can also select multiple inputs at once to view them as separate scatter plots on the same figure.
When finished, click Done.
The simulation ensemble should be displayed in the Simulation Ensemble Table. If the user would like to change any of the parameters and regenerate a new set of samples, simply click the Revise button.
Next, calculate the output value for each sample. Click Launch. The user should see the progress bar quickly advance, displaying the status of completed runs (Figure Simulation Ensemble Added).
Simulation Ensemble Added
Next, look at the output.
Click Analyze for “Ensemble 1” (Figure Simulation Ensemble Evaluation Complete).
Simulation Ensemble Evaluation Complete
Step 1 of “Analysis” (bottom of the page) is to select Ensemble Data (Figure Simulation Ensemble Analysis).
Simulation Ensemble Analysis
Step 2 of “Analysis” is to select “Rosenbrock.f” (Figure Simulation Ensemble Analysis).
Step 3 of “Analysis” is to keep the analysis method as “Uncertainty Analysis” and then click Analyze. The user should see two graphs displaying the probability and cumulative distributions plots (Figure Uncertainty Analysis Results). Users should keep in mind these figures are intended to show what type of plots they would get, but they should not expect to reproduce the exact same plots.
Uncertainty Analysis Results
Prior to this, the “Rosenbrock” example was selected to illustrate the process of creating and running a simulation ensemble because simulations complete quickly using this simple model. But from this point on, the adsorber subsystem of the A650.1 design is used as a motivating example to better illustrate how one would apply UQ within the context of CCSI.
A quick recap on our motivating example: The A650.1 design consists of two coupled reactors: (1) the two-stage bubbling fluidized bed adsorber and (2) moving bed regenerator, in which the output (outlet of sorbent stream) from one reactor is the input (inlet) for the other. The performance of the entire carbon capture system is obtained by solving these two reactors simultaneously, accounting for the interactions between the reactors. However, it is also necessary to study the individual effects of the adsorber and the regenerator without the side effects of their coupling since the two reactors display distinct characteristics under different operating conditions. Thus, the Process Design/Synthesis Team has given us a version of the A650.1 model that can be run in two modes: (1) coupled and (2) decoupled. In this section, analysis results are presented from running the A650.1 model using the decoupled mode and examining the adsorber in isolation from the regenerator.
Automatically running FOQUS for a set of user-defined input conditions¶
In this tutorial, we will show you how to automatically run a set of user-defined input conditions in FOQUS.
This procedure requires the user to specify the input conditions in a CSV (comma-separated values) file, which can be created in Excel.
We will use a simple example to show the procedure.
- Open FOQUS.
- Go to the “Session” tab, and under “Session Name” type: basic_example (please see Figure Specifying the Session Name).

Specifying the Session Name
- Go to the “Flowsheet” tab, and click the “Add Node” button (“A” in Figure Inserting a Node and Specifying the Inputs).

Inserting a Node and Specifying the Inputs
- Insert a node called “example” (without the quotes) (“B” in Figure Inserting a Node and Specifying the Inputs).
- Open the Node Editor by clicking the Toggle Node Editor button (“C” in Figure Inserting a Node and Specifying the Inputs).
- Under the Node Editor, click “Input Variables” and the green “+” button (“D” in Figure Inserting a Node and Specifying the Inputs).
- Insert input variables x1 and x2 (“E” in Figure Inserting a Node and Specifying the Inputs).
- For x1, specify the value, default, minimum, and maximum as 3, 3, -10, and 10, respectively (“E” in Figure Inserting a Node and Specifying the Inputs).
- For x2, specify the value, default, minimum, and maximum as 4, 4, -10, and 10, respectively (“E” in Figure Inserting a Node and Specifying the Inputs).
- Under the Node Editor, click “Output Variables” and the green “+” button (“A” and “B” in Figure Specifying the Outputs).
Specifying the Outputs
- Insert output variables y1 and y2 (“C” in Figure Specifying the Outputs).
- Under the Node Editor, click “Node Script” (“A” in Figure Inserting the Equations).
Inserting the Equations
- In the first line under “Node Script (Python Code)”, type: f['y1'] = 2 * x['x1'] + 3 * x['x2'] (“B” in Figure Inserting the Equations).
- In the second line under “Node Script (Python Code)”, type: f['y2'] = 3 * x['x1'] + 5 * x['x2'] (“B” in Figure Inserting the Equations). (A consolidated sketch of these equations, together with the CSV file built in the following steps, appears after this list of steps.)
- Open Microsoft Excel.
- Type example.x1 and example.x2 as the headings in Cells A1 and B1 (please see Figure Specifying the Inputs in Excel).
Specifying the Inputs in Excel
- Type 1, 3, 5, 7, 9 under example.x1 (please see Figure Specifying the Inputs in Excel).
- Type 0, 2, 4, 6, 8 under example.x2 (please see Figure Specifying the Inputs in Excel).
- Save the Excel file, with file name “example_samples” (without the quotes), and “CSV (MS-DOS)” as the file type.
- Return to FOQUS, and go to the “Uncertainty” tab (“A” in Figure The Uncertainty Tab in FOQUS).
The Uncertainty Tab in FOQUS
- Click the “Add New” button (“B” in Figure The Uncertainty Tab in FOQUS).
- Select “Use flowsheet”, and click “OK” (“C” and “D” in Figure The Uncertainty Tab in FOQUS).
- Select “Load all samples from a single file” (“A” in Figure Uploading the CSV File Containing the Inputs).
Uploading the CSV File Containing the Inputs
- Click “Browse”, and select the “example_samples” CSV file (“B” in Figure Uploading the CSV File Containing the Inputs).
- Click “Done” (“C” in Figure Uploading the CSV File Containing the Inputs).
- The user-specified inputs should appear in the “Ensemble” table (please see Figure The User-Specified Inputs in the Uncertainty Tab).
The User-Specified Inputs in the Uncertainty Tab
- Run these inputs by clicking the “Launch” button (please see Figure The User-Specified Inputs in the Uncertainty Tab).
- After the runs are finished, the results are shown in the table at the bottom of the “Uncertainty” tab (please see Figure The Results of the Runs in the Uncertainty Tab).
The Results of the Runs in the Uncertainty Tab
- The user can also view the results in the Flowsheet tab by clicking the “Results and Filtering” button (“A” in Figure The Results of the Runs in the Flowsheet Table).
The Results of the Runs in the Flowsheet Table
- The Flowsheet Table contains the results (“B” in Figure The Results of the Runs in the Flowsheet Table).
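As a cross-check of the steps above, the sketch below (illustrative only; it runs outside FOQUS) writes the same example_samples.csv file with Python’s csv module and evaluates the two node-script equations directly, so the y1 and y2 values reported in the Uncertainty tab can be compared against hand-computed results. The file name, headers, sample values, and formulas are taken from the steps above; everything else is an assumption for illustration.

```python
import csv

rows = list(zip([1, 3, 5, 7, 9], [0, 2, 4, 6, 8]))

# Write the CSV exactly as the tutorial describes:
# node-qualified input names as headers, one sample per row.
with open("example_samples.csv", "w", newline="") as fp:
    writer = csv.writer(fp)
    writer.writerow(["example.x1", "example.x2"])
    writer.writerows(rows)

# Evaluate the same equations as the node script:
# y1 = 2*x1 + 3*x2 and y2 = 3*x1 + 5*x2
for x1, x2 in rows:
    y1 = 2 * x1 + 3 * x2
    y2 = 3 * x1 + 5 * x2
    print(x1, x2, y1, y2)
```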
Tutorial 2: Data Manipulation¶
In this tutorial, instructions to change the data before analysis are described. Current capabilities include sample filtering, input/output variable deletion, and output value modification.
The files for this tutorial are located in: examples/tutorial_files/UQ/Tutorial_2
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
Filtering¶
Filtering involves selecting samples whose values of a certain input or output fall within a certain range. Typically, when runs are returned from the Turbine gateway, some simulations may have failed to converge in Aspen; the samples corresponding to these failed runs should be excluded from analysis. Follow the steps below to filter out the samples from failed runs:
Click Load from File on the UQ window (Figure Data Manipulation, Filtering Tab).
Select the file “gmoat5012_9levels.res” in the examples/tutorial_files/UQ/Tutorial_2 folder. This file is an actual simulation ensemble that has already been run on the Turbine gateway. To find this file, the user may need to change the file filter to “All files.”
Select the Filtering tab.
Data Manipulation, Filtering Tab
Next, filter the loaded simulation ensemble based on its output values:
Click on “New Filter” and create a filter named “f1”.
Add the Filter Expression c("output.status") == 0, since the user should keep only the samples in which the output parameter status is “0.”
Click “Done”.
Data Manipulation, Filtering Dialog Box
Select “f1” as the “Current Filter” in the Flowsheet Result window within the Filtering tab.
Once the filtering is complete, click “Save as New Ensemble”; a new row should be added to the simulation table.
Data Manipulation, Applying the filter
Once filtering is complete, a new row should be added to the simulation table (Figure Data Manipulation, Filtering Results). This ensemble contains only those samples that have a status value of “0.” Analysis can now be performed on this new ensemble because this ensemble contains only the valid simulations (i.e., those with output status value of 0), in which Aspen calculations have properly converged.
Data Manipulation, Filtering Results
Variable Deletion¶
If an input or output variable is to be removed from consideration for analysis, this can be done in the Inspection/Deletion/Output Value Modification tab. Delete the status output from the previous filtering as it is no longer needed for further analysis.
- Verify that the ensemble that resulted from filtering is selected. If not, select that ensemble.
- Click the Inspection/Deletion/Output Modification tab.
- Scroll to the right of the table to the outputs, which are colored yellow.
- Select the checkbox corresponding to the “status” output (the first output).
- Click Perform Deletion then Save as New Ensemble.
The results are illustrated in Figure Data Manipulation, Inspection/Deletion. Note: The output count has decreased by one for the new ensemble. The user can verify that the “status” output was removed in the new ensemble by viewing this in the Inspection/Deletion/Output Value Modification tab again. Deletion of an input can be performed similarly by selecting its checkbox and clicking the Perform Deletion then Save as New Ensemble button.

Data Manipulation, Inspection/Deletion
Output Value Modification¶
To change the value of an output for one or more samples, follow the steps below:
- Select an ensemble.
- Click the Inspection/Deletion/Output Value Modification tab.
- Scroll to the right to the outputs.
- Click on a cell for one of the outputs and enter a new value. Do the same for another cell. Notice that the modified cells turn green. This indicates the cells that have been modified.
- Click Make Output Value Changes Permanent to permanently change the values. The modified cells will turn yellow, indicating the permanent change. If the user wishes to reset the table and start over before making changes permanent, click the Reset Table button.

Data Manipulation, Value Modification
Tutorial 3: Single-Output Analysis¶
From the Single-Output Analysis Screen, the user can perform analyses that are specific to a particular output of interest. Here, the “removalCO2” output parameter is discussed.
The files for this tutorial are located in: examples/tutorial_files/UQ/Tutorial_3
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
Parameter Selection¶
For simulation models that have a large number of input parameters, it is common practice to down-select to a smaller subset of the most important input parameters that are most relevant to the output of interest. This is done so subsequent detailed studies can be performed more efficiently. By using a smaller set of inputs, a smaller set of samples may be needed.
From the UQ window, load the file “gmoat5012_9levels.filtered” in examples/tutorial_files/UQ/Tutorial_3. (This file contains the same set of samples that resulted from data filtering. They are included here to make each demo self-contained.)
Click Analysis. A new page is displayed (Figure [fig:uqt_analysis_param]).
Analysis Dialog, Parameter Selection
[fig:uqt_analysis_param]
Under the Qualitative Parameter Selection section, select “removalCO2” as the output.
Select “MOAT” as the method to be used.
Click Compute input importance. A graph should appear with the results (Figure [fig:uqt_param_results]).
Parameter Selection Results
[fig:uqt_param_results]
The bars in the plot represent the importance of a particular input in determining the value of the output. For example, the values of dH3 and dS3 are very important in determining the value of removalCO2, whereas Hce and hp have no effect (the y-axis displays the average change in the model output as a result of changing each input within its respective range; for example, from Figure [fig:uqt_param_results], changing dH2 within its range results in an average change in CO\(_2\) removal of as much as about 57 percent, with a margin of +/- 3 percent). Thus, it would be safe to exclude any inputs that have negligible bar lengths from analysis. Next, down-select the ten most important inputs based on these results. See Section [subsubsec:uqt_vardel] for details. Change the number of samples and scheme as desired and then generate new samples. Click Launch to run these samples to obtain another simulation ensemble that can be analyzed.
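MOAT is the Morris one-at-a-time screening method, computed by PSUADE inside FOQUS. As an outside illustration of the same kind of per-input importance measure, the sketch below uses the third-party SALib package (an assumed dependency) on a toy model; larger mu* values correspond to longer bars in a plot like the one above.

```python
# Illustrative Morris (MOAT-style) screening with SALib; FOQUS uses PSUADE internally.
from SALib.sample.morris import sample as morris_sample
from SALib.analyze import morris

problem = {
    "num_vars": 3,
    "names": ["dH3", "dS3", "Hce"],          # names borrowed from the tutorial plot
    "bounds": [[-1.0, 1.0]] * 3,
}

X = morris_sample(problem, N=100, num_levels=4)
Y = 3.0 * X[:, 0] + 2.0 * X[:, 1] ** 2 + 0.01 * X[:, 2]   # toy model

Si = morris.analyze(problem, X, Y, num_levels=4)
print(dict(zip(problem["names"], Si["mu_star"])))   # larger mu* => more important input
```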
Ensemble Data Analysis¶
If the user is interested in the output uncertainty of “removalCO2” based on the uncertainties from the ten most important input parameters, perform uncertainty analysis, which would compute the probability distribution and sample statistics of “removalCO2.”
Load “lptau20k_10inputs_4outputs.filtered” from the examples/tutorial_files/UQ/Tutorial_3 folder. Assume this is the file that the user would receive after running the cloned simulation ensemble in which the user has down-selected the ten most important inputs, set the Sampling Scheme to “Quasi-Monte Carlo (LPTAU)”, set the sample size to 20K, and performed data filtering to retain only the samples with the status output equal to “0.”
Click Analyze. A new page displays (Figure [fig:uqt_analysis_ua]).
Select “Ensemble Data” to indicate that analysis is to be directly performed on the raw sample data.
Select “removalCO2” as the output variable to analyze.
Select “Uncertainty Analysis” and then click Analyze.
Analysis Dialog, Ensemble Data Uncertainty Analysis
[fig:uqt_analysis_ua]
Once uncertainty analysis is complete, results display (Figure [fig:uqt_ua_results]) illustrating the probability distribution function (PDF), cumulative distribution function (CDF), and the summary statistics (e.g., mean, standard deviation) of “removalCO2” (top left corner of the PDF plot). This is used to evaluate if the output uncertainty is acceptable. If the output uncertainty is too great or the user prefers the system to operate within a higher percentage of capture, pursue further analyses to understand the relationships between the inputs and outputs, and investigate what can be done to reduce the output uncertainties by reducing the input uncertainties.
Ensemble Data Uncertainty Analysis Results
[fig:uqt_ua_results]
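For reference, the statistics reported in the corner of the PDF plot can be reproduced for any array of output samples with numpy and scipy; the array below is a synthetic stand-in for the “removalCO2” samples.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the removalCO2 output samples
y = np.random.default_rng(1).normal(loc=0.9, scale=0.03, size=20000)

print("mean:", y.mean())
print("std :", y.std(ddof=1))
print("skew:", stats.skew(y))
print("kurt:", stats.kurtosis(y))          # excess kurtosis by default

# Empirical CDF: sorted samples paired with cumulative probabilities
y_sorted = np.sort(y)
cdf = np.arange(1, y.size + 1) / y.size
```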
Next, the user may apply variance-based sensitivity analysis to quantify each input’s contribution to the output variance:
From the bottom of the “Analysis” section, select “Sensitivity Analysis.”
There are three options for sensitivity analysis: (1) first-order, (2) second-order, and (3) total-order. First-order analysis examines the effect of varying an input parameter alone. Second-order analysis examines the effect of varying pairs of input parameters. Total-order analysis examines the total effect of varying an input parameter, both alone and in combination with any other input parameters. For this demonstration, select “Total-order” and click Analyze. The total sensitivity indices display in a graph. Note: If the simulation ensemble has more than ten inputs, “Total-order” is disabled (since any reasonable sample size is not sufficient). Additionally, since quantitative sensitivity analysis in general requires large ensembles with many samples (thousands or more), ensemble sensitivity analysis (without the use of response surfaces) is often less practical and accurate than response surface based analyses. The result is illustrated in Figure [fig:uqt_sa_results].

Ensemble Data Total-order Sensitivity Analysis Results
[fig:uqt_sa_results]
These results confirm that “removalCO2” is more sensitive to “dH3” and “dS3” than other inputs. (The y-axis displays an approximate percentage of output variance attributed to each individual parameter. Since total sensitivity includes higher order interaction terms with other parameters, the sum of these total sensitivity indices usually exceeds 1.)
Ensemble Data Visualization¶
In this release, ensemble data visualization is only available in “Expert” mode. At the top of the “Analyze” page, toggle the bar to expert mode and select “removalCO2” as the output. Next to “Visualize Data,” choose an input (e.g., “UQ_dH1”) and click Visualize for a 2-D scatter plot of “removalCO2” versus that input (Figure [fig:uqt_splot1_results]).
Ensemble Data Visualization of One Input
[fig:uqt_splot1_results]
Next, select a second input (e.g., “UQ_dH2”) and click Visualize for a 3-D scatter plot of “removalCO2” versus the two inputs. (Note: The input selections must be unique for the Visualize button to be enabled.) Figure [fig:uqt_splot2_results] shows the results.
Ensemble Data Visualization of Two Inputs
[fig:uqt_splot2_results]
The plot in Figure [fig:uqt_splot2_results] can be rotated by clicking and dragging.
Tutorial 4: Response Surface Based Analysis¶
For simulation models that are expensive to run, response surface analysis can be a resourceful option. To construct a response surface, a space-filling sampling design is desired. For example, quasi-Monte Carlo (LPTAU) or Latin hypercube sampling schemes are recommended. Additionally, there are several possibilities for curve fitting methods. If the sample size is relatively small, polynomial regression or Gaussian process (if installed as part of PSUADE) is preferred. Alternatively, if the sample size is large enough (one hundred or more), cubic splines (if installed) may also be feasible.
The file for this tutorial is lptau100_10inputs_4outputs.dat, and this file is located in: examples/tutorial_files/UQ/Tutorial_4
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
Response Surface Model Validation¶
To proceed with response surface based analysis, the user needs to find a suitable response surface with which to approximate the input-to-output mapping. Validation is performed to see how well a particular response surface can predict a subset of the withheld data.
Load the “lptau100_10inputs_4outputs.dat” file. Note: This is an extremely small simulation ensemble, as this is used to highlight the differences (in validation results) between a good response surface and a bad one.
Click Analyze for the current ensemble. A new dialog page displays (Figure [fig:uqt_rs_validate]).
Under “Analysis” (bottom section), under Step 1, select “Response Surface.”
Analysis Dialog, Response Surface Validation of Linear Model
[fig:uqt_rs_validate]
Under Step 2, select “removalCO2” as the output for analysis.
Under Step 3, select “Polynomial” for response surface method.
There are multiple types of polynomial response surfaces, with increasing complexity as the user navigates down the list. For now, select “Linear” in the next drop-down list.
Insert 5.00 as the error envelope for the validation plot. Click Validate. The result is illustrated in Figure [fig:uqt_rs_validate_results].

Linear Response Validation Results
[fig:uqt_rs_validate_results]
The cross-validation results for the linear regression model are displayed as a histogram of errors to the left and a plot of predicted values versus actual values to the right. The histogram displays the cross validation error distribution, which provides the user information on what the errors are like overall. If this distribution is not centered on zero, there may be a systematic bias in the response surface model. If the distribution is too wide, it is not a good fit. As for the plot of predicted values versus actual values, the more closely the points are to the diagonal, the better the fit. Most response surface models, with the exception of MARS, also provide uncertainty information about the response surface. The vertical error bars on the left plot reflect the uncertainty in the linear response’s predictions.
In summary, these two figures should provide sufficient information for the user to judge how good the fit is. As is apparent in the figures, the linear model consistently overestimates and thus is an ill-suited response surface to model our data. In general, the user may use a few response surface methods to see which method gives the best fit.
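Following that suggestion, a quick way to compare candidate polynomial orders is cross-validated error. The sketch below uses scikit-learn (an assumption; FOQUS itself performs validation through PSUADE) with synthetic data in place of the ensemble.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 4))
y = X[:, 0] ** 2 + X[:, 1] * X[:, 2] + 0.05 * rng.normal(size=100)  # mildly nonlinear

for degree in (1, 2):   # "Linear" vs "Quadratic" polynomial response surfaces
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    rmse = -cross_val_score(model, X, y, cv=5,
                            scoring="neg_root_mean_squared_error").mean()
    print(f"degree {degree}: CV RMSE = {rmse:.4f}")   # lower RMSE => better fit
```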
Response Surface Based Uncertainty Analysis¶
These capabilities are similar to those for ensemble data analysis. The difference is that the results are now derived from a much larger ensemble that is computed from the response surface. With the 100 samples from the ensemble data, a response surface is trained and is used to generate 100K samples internally to compute the results for uncertainty and sensitivity analyses. (Note: Validation must be performed before these analyses are available.)
After the response surface validation step, select “Uncertainty Analysis” to be the UQ analysis in Step 7 of “Analysis” (Figure [fig:uqt_rs_validate]). Click Analyze and a distribution representing the output uncertainty will be displayed (Figure [fig:uqt_rsua_results]).

Response Surface Based Uncertainty Analysis Results
[fig:uqt_rsua_results]
Compare the response surface based uncertainty results (Figure [fig:uqt_rsua_results]) to the results from ensemble data analysis (Figure [fig:uqt_ua_results]). The two main differences are easily seen.
- Two PDFs on the top plot: A response surface (in this case, linear regression) is used to predict the output values corresponding to the input samples. From the validation step (left plot of Figure [fig:uqt_rs_validate_results]), note that there is error associated with the response surface’s predictions. This error is propagated in uncertainty analysis, in the form of standard deviations around the predicted output values (i.e., the means). Accordingly, two histograms are presented: the “mean PDF” represents the output probability distribution computed from the response surface’s predicted output values only, without consideration for the uncertainties surrounding these predicted values; the “ensemble PDF” represents the output probability distribution that encompasses the uncertainties surrounding these predicted values. In most cases, the ensemble PDF should have a larger spread because it accounts for more uncertainties (i.e., those that stem from the approximations inherent in the response surface). A small numerical sketch of this broadening follows this list.
- Multiple cumulative distribution functions (CDFs) on bottom plot: The “mean CDF” is constructed from a cumulative sum on the mean PDF in the top plot. Since each predicted output value (i.e., the mean) has an associated standard deviation, this information is used to construct other PDFs that correspond to output values that are +/- 1, 2, and 3 standard deviations from the mean. These PDFs are then converted to CDFs and shown as colored lines. These colored lines provide an uncertainty “envelope” around the mean CDF.
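The sketch below illustrates why the ensemble PDF is wider than the mean PDF: each surrogate prediction carries a standard deviation, and resampling around the means folds that uncertainty into the spread. The predictions and a constant response-surface error are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
means = rng.normal(0.9, 0.02, size=100_000)     # surrogate-predicted outputs ("mean PDF")
stds = np.full_like(means, 0.01)                # per-prediction response-surface error

ensemble = rng.normal(means, stds)              # predictions with RS uncertainty folded in

print("mean PDF std    :", means.std())
print("ensemble PDF std:", ensemble.std())      # larger spread, as in the top plot
```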
Response Surface Based Mixed Epistemic-Aleatory Uncertainty Analysis¶
In “Expert Mode”, the user can perform more advanced uncertainty analysis that handles both epistemic and aleatory uncertainties. To do so, the user will need to designate the uncertainty type (epistemic or aleatory) for each uncertain input. In general, epistemic uncertainties are reducible uncertainties that arise due to lack of knowledge, such as simplifying assumptions in a mathematical model. Therefore, epistemic uncertainty is often characterized by upper and lower bounds. On the other hand, aleatory uncertainties are irreducible uncertainties that represent natural, physical variability in the phenomenon under study. As such, aleatory uncertainties are often characterized by distributions. Hence, the user is required to provide a PDF for each aleatory input. (In FOQUS, with the exception of mixed epistemic-aleatory uncertainty analysis, all uncertain inputs are treated as aleatory inputs.)
To perform mixed epistemic-aleatory uncertainty analysis (Figure Response Surface Based Mixed Epistemic-Aleatory Uncertainty Analysis), switch to “Expert Mode” by clicking the Mode button that toggles between the analysis modes. After response surface validation, select “Uncertainty Analysis” in the first Choose UQ Analysis drop-down list, then “Epistemic-Aleatory” in the secondary drop-down list, for the UQ analysis. In the input table, designate the parameter Type (“Epistemic”, “Aleatory” or “Fixed”) and the corresponding information for each input. Once complete, click Analyze. In this tutorial, dH1 and dH2 are treated as epistemic uncertain parameters, and the rest are aleatory.

Response Surface Based Mixed Epistemic-Aleatory Uncertainty Analysis
The result of mixed epistemic-aleatory uncertainty analysis is a plot (Figure Response Surface Based Mixed Epistemic-Aleatory Uncertainty Analysis Results) containing multiple CDFs. In the mixed analysis, the epistemic inputs are sampled according to their lower and upper bounds. Each sample point spawns a response surface based uncertainty analysis, in which the epistemic inputs are fixed at their sampled value and the aleatory input uncertainties are propagated to generate a CDF that represents the output uncertainty. A slider is provided for the user to extract the probability range corresponding to a particular value of the output.

Response Surface Based Mixed Epistemic-Aleatory Uncertainty Analysis Results
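A conceptual sketch of the nested sampling just described: each epistemic sample fixes the epistemic inputs within their bounds and spawns an aleatory propagation whose empirical CDF becomes one curve in the plotted family. The surrogate, bounds, and distributions below are toy stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def surrogate(e1, e2, a):
    """Stand-in for the response surface: epistemic inputs e1, e2; aleatory array a."""
    return e1 + 0.5 * e2 + a

cdfs = []
for _ in range(20):                              # outer loop: epistemic samples from bounds
    e1 = rng.uniform(-1.0, 1.0)
    e2 = rng.uniform(-0.5, 0.5)
    aleatory = rng.normal(0.0, 0.1, size=5000)   # inner loop: aleatory propagation
    y = np.sort(surrogate(e1, e2, aleatory))
    cdfs.append((y, np.arange(1, y.size + 1) / y.size))  # one CDF per epistemic sample

print(len(cdfs), "CDFs, one per epistemic sample")
```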
Response Surface Based Sensitivity Analysis¶
For quantitative sensitivity analysis, follow these steps:
- In the Choose UQ Analysis drop-down list (Step 6 of “Analysis”), select “Sensitivity Analysis.”
- In the next drop-down list, select “First-order” and click Analyze. (This analysis may take a long time depending on the sample size and the response surface used.)
Prediction errors are associated with the response surface’s predictions of the output values (left plot of Figure [fig:uqt_rs_validate_results]). Earlier, it was observed that the response surface error contributed to the output uncertainty, leading to a larger spread in the output PDF (top plot of Figure [fig:uqt_rsua_results]). In Figure [fig:uqt_rssa_results], the response surface error contributed to uncertainty (shown as blue error bars) surrounding each input’s contribution to the output variance (shown as yellow bars).

Response Surface Based First-order Sensitivity Results
[fig:uqt_rssa_results]
Response Surface Based Visualization¶
The response surface that has been validated can also be visualized.
Select one input next to “Visualize Response Surface.”
Click Visualize to display a 2-D line plot that displays “removalCO2” versus the selected input.
1-D Response Surface Visualization
[fig:uqt_rs1_results]
Select another input next to the first one for a 2-D response surface visualization.
Click Visualize to display a figure with a 3-D surface plot and a 2-D contour plot (Figure [fig:uqt_rs2_results]).
2-D Response Surface Visualization
[fig:uqt_rs2_results]
Select another input next to the second one for a 3-D response surface visualization.
Click Visualize to display a 3-D isosurface plot. Move the slider to see the points in the 3-D input space that fall within the small range of “removalCO2” (Figure [fig:uqt_rs3_results]).
3-D Response Surface Visualization
[fig:uqt_rs3_results]
Tutorial 5: Bayesian Inference¶
For each output variable, the user specifies an observed value (from physical experiments) with the associated uncertainties (in the form of standard deviation), if applicable. Whether standard inference or SolventFit is selected, the tool will launch a Markov Chain Monte Carlo (MCMC) algorithm to compute the posterior distributions of the uncertain input parameters. These input posterior distributions represent a refined hypothesis about the input uncertainties in light of what was previously known (in the form of input prior distributions) and what was observed currently (in the form of noisy outputs).
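The MCMC computation itself is handled by PSUADE. Purely to illustrate how a prior, a response surface, and a normal observation noise model combine into a posterior, here is a bare-bones Metropolis sampler for a single uncertain input; all numbers and the surrogate are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def surrogate(theta):
    """Stand-in response surface mapping one uncertain input to one output."""
    return 2.0 * theta + 1.0

obs_mean, obs_std = 2.0, 0.1        # observed output and its noise (normal model)
lo, hi = -1.0, 1.0                  # uniform prior bounds on the input

def log_post(theta):
    if not (lo <= theta <= hi):
        return -np.inf                                   # outside the prior support
    return -0.5 * ((surrogate(theta) - obs_mean) / obs_std) ** 2

theta, samples = 0.0, []
for _ in range(20000):              # Metropolis random walk
    prop = theta + 0.05 * rng.normal()
    if np.log(rng.random()) < log_post(prop) - log_post(theta):
        theta = prop
    samples.append(theta)

posterior = np.array(samples[5000:])                     # drop burn-in
print("posterior mean +/- std:", posterior.mean(), posterior.std())
```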
The file for this tutorial is lptau100_10inputs_4outputs.filtered, and this file is located in: examples/tutorial_files/UQ/Tutorial_5
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
Load the “lptau5k_10inputs_4outputs.filtered” file from the above-mentioned folder.
Click Analyze for the current ensemble and a new dialog box displays (Figure [fig:uqt_analysis_infer]).
Analysis Dialog, Bayesian Inference
[fig:uqt_analysis_infer]
Select “Response Surface” in the “Analysis” section.
Select “Output variable to analyze” to be “removalCO2.”
Select “Linear Regression” as the response surface.
Insert 5.00 as the error envelope for the validation plot. Click Validate. The GUI allows the user to proceed with Bayesian inference after one output has been validated; however, the user may want to validate all outputs since they are all used in the inference.
Once validation is completed, click Infer at the lower right corner, which displays a new dialog box (Figure [fig:uqt_infer]).
In the Output Settings table (on the left), select the second, third, and fourth outputs as the observed outputs. The user can experiment with using different response surface models (for example, linear polynomials) to approximate the mapping from inputs to each of the outputs.
In the Input Settings table (on the right), designate input types (variable, design, or fixed) and if necessary, switch to Expert Mode to revise the prior distribution on the input parameters. The prior distribution represents knowledge that the user possesses about the inputs before observational data (from experiments) has been incorporated into this knowledge. If the user does not have any updated knowledge about the simulation ensemble, it is OK to leave the table as is.
In the Observations table (in the middle), select the number of experiments from which the user can get observational data. In essence, if the user has \(N\) observations, then \(N\) should be set as the number of experiments. The table will then populate columns for design inputs (if any) and observed outputs. Currently, only normal distribution is supported as the noise model for observations. Enter the mean and standard deviation for each of these observations. For convenience, the mean and standard deviation values are prepopulated with the results from uncertainty analysis. These values have been provided as a sanity check for the user, in case the observation for a particular output is way out of range from these distributions.
Bayesian Inference Dialog for Standard Inference
[fig:uqt_infer]
To save an input sample drawn from the posterior distribution, select the Save Posterior Input Samples to File checkbox and select a location and file name to store the sample.
Click Infer to start the analysis. Inference can take a long time; thus, a stop feature has been implemented. Once inference starts, the Infer button changes to Stop. To stop inference calculations, click Stop which changes the button back to Infer, allowing the user to restart the calculations from scratch. If inference is allowed to run its course, its results are interpolated to produce heat maps (off-diagonal subplots in Figure [fig:uqt_infer_results]) for visualization. This interpolation step can take a few minutes and while it is running, Infer is disabled.
[fig:uqt_infer_results]
Once the inference and interpolation steps are complete, two windows will be displayed: a multi-plot figure of the prior distributions and another multi-plot figure of the posterior distributions. If the user has selected the Save Posterior Input Samples to File checkbox, then a sample file will also be written to the designated file location.
In the resulting prior and posterior plots (Figure [fig:uqt_infer_results]), the univariate input distributions are displayed as histograms on the diagonal. The bivariate input distributions (between pairs of inputs) are displayed as heat maps in the off-diagonal subplots. On these heat maps, the regions in red reflect the input space with higher probability. In the posterior plots, the red regions represent inputs that are more likely to have generated the specified observations on the outputs. By comparing the prior and the posterior figures, the user can see the “before” and “after” impact of inference on our knowledge of the input uncertainty.
To zoom in on any one of the subplots, left-click; to zoom out, right-click. To display a subset of these subplots, clear the checkbox for the inputs to be omitted (from the first column of the Input Prior Table) and click Replot (Figure [fig:uqt_infer_replot_results]).
[fig:uqt_infer_replot_results]
File Formats¶
Most UQ capabilities within FOQUS rely on PSUADE. As such, different UQ components require input files in PSUADE formats. CSV (comma-separated values) files are also compatible. The specific requirements are explained in the UQ section Tutorials and section Optimization Under Uncertainty (OUU).
PSUADE Full File Format¶
The following is an example of the full PSUADE file format. Comments in red do not appear in the file and are only for instructional purposes.
This file format is accepted when:
- The user loads an existing ensemble by clicking the Load From File button from the Uncertainty Quantification Screen.
- The user creates a new ensemble by clicking the Add New button from the Uncertainty Quantification Screen and selecting the Load all samples from a single file radio button in the user’s selection of sample generation (Simulation Ensemble Setup Dialog, Load Samples Option).
- The user performs optimization under uncertainty from the main Optimization Under Uncertainty Screen and selects the Load Model From File radio button for the user’s model; for this file, the user does not need to specify the first block (i.e., the PSUADE_IO block).
This file format is written when:
- The user saves an existing ensemble by clicking the Save Selected button from the Uncertainty Quantification Screen.
PSUADE Sample File Format¶
The following is an example of the sample file format. Comments in red do NOT appear in the file and are only for instructional purposes.
This file format is accepted when:
- The user creates a new ensemble by clicking the Add New button from the Uncertainty Quantification Screen and selecting the Load all samples from a single file radio button in the user’s selection of sample generation (Simulation Ensemble Setup Dialog, Load Samples Option).
- The user creates a new ensemble by clicking the Add New button from the Uncertainty Quantification Screen and selecting the Choose sampling scheme radio button in the user’s selection of sample generation (Simulation Ensemble Setup Dialog, Distributions Tab); in the Distributions tab, if the user designates an input variable’s PDF to be of type “Sample”, the “Param 1” field will generate a Select File button that prompts for the sample file representing the input’s PDF.
- Similar to above, when the user enters Expert Mode within the Analysis dialog; within Expert Mode (Response Surface Based Mixed Epistemic-Aleatory Uncertainty Analysis), the user can change the input distribution before performing response surface based analysis.
- The user performs optimization under uncertainty from the main Optimization Under Uncertainty Screen; if any of the variables are designated as random variables, the UQ Setup tab will be displayed and any prompt for loading existing sample (e.g., “Load existing sample for Z3” or “Load existing sample for Z4”) will require this file format. (Currently, the UQ Setup tab is missing from the Figure because no variables have been designated as random).
This file format is written when:
- The user wants to save the results of inference by clicking Save Posterior Input Samples to File within Bayesian Inference (Bayesian Inference Dialog), which is accessible from the Analysis screen of UQ (Analysis Dialog, Ensemble Data Analysis, Wizard Mode).
Comma Separated Values (CSV) File Format¶
The following is an example of the CSV file format. Comments in red do not appear in the file and are only for instructional purposes. CSV files can be easily generated using Excel and exporting in the .csv format.
Variable names are specified in the first line, with input names and then output names. Output names can be specified, even if there is no data available for them yet. Data is only required for inputs. In addition, the variable names line is not required in those places where a PSUADE sample file is acceptable.
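For illustration, a file following this layout and reusing the variable names from the earlier tutorial might look like the block below; output names (e.g., example.y1 and example.y2) could also be listed in the header even though only input data are supplied.

```
example.x1,example.x2
1,0
3,2
5,4
7,6
9,8
```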
This file format is accepted when:
- The user loads an existing ensemble by clicking the Load from File button from the Uncertainty Quantification Screen. Variable names are required.
- The user creates a new ensemble by clicking the Add New button from the Uncertainty Quantification Screen and selecting the Load all samples from a single file radio button in the user’s selection of sample generation (Simulation Ensemble Setup Dialog, Load Samples Option).
- The user creates a new ensemble by clicking the Add New button from the Uncertainty Quantification Screen and selecting the Choose sampling scheme radio button in the user’s selection of sample generation (Simulation Ensemble Setup Dialog, Distributions Tab); in the Distributions tab, if the user designates an input variable’s PDF to be of type “Sample”, the “Param 1” field will generate a Select File button that prompts for the sample file representing the input’s PDF.
- Similar to above, when the user enters Expert Mode within the Analysis dialog; within Expert Mode(Response Surface Based Mixed Epistemic-Aleatory Uncertainty Analysis), the user can change the input distribution before performing response surface based analysis.
- The user performs optimization under uncertainty from the main Optimization Under Uncertainty Screen; if any of the variables are designated as random variables, the UQ Setup tab will be displayed and any prompt for loading existing sample (e.g., “Load existing sample for Z3” or “Load existing sample for Z4”) will require this file format. (Currently, the UQ Setup tab is missing from the Figure because no variables have been designated as random).
Optimization Under Uncertainty (OUU)¶
Contents¶
[sec:ouu_overview]
Reference¶
The FOQUS OUU module supports several variants of optimization under uncertainty. This chapter first presents the mathematical formulations of these variants. Subsequently, details of the OUU graphical user interface will be discussed.
OUU Variables¶
Suppose a simulation model is available for an OUU study. Let this simulation model be represented by the following function:

\[Y = F(Z_1, Z_2, Z_3, Z_4),\]
which is characterized by four types of variables:
- Design/Decision/Optimization variables
- Notation: \(Z_1\) with dimension \(n_1\)
- Definition: Design variables are continuous variables that may be bounded or unbounded. They are generally the set of optimization variables in a single-stage optimization or the set of outer optimization variables in the two-stage optimization.
- Recourse/Operating variables
- Notation: \(Z_2\) with dimension \(n_2\)
- Definition: Operating variables are optimization variables in the inner optimization for a given scenario (or realization) of the uncertain variables in a two-stage optimization.
- Discrete uncertain variables
- Notation: \(Z_3\) with dimension \(n_3\)
- Definition: Discrete variables are uncertain variables that have an enumerable set of states (called scenarios) such that each state is associated with a finite probability and the sum of probabilities for all the scenarios is equal to \(1\).
- Continuous uncertain variables
- Notation: \(Z_4\) with dimension \(n_4\)
- Definition: Continuous uncertain variables are associated with a joint probability distribution function from which a sample can be drawn to compute the basic statistics.
OUU Objective Functions¶
In the presence of uncertainties, OUU seeks to find the optimal solution in some statistical sense. For example, an optimization goal may be to find the design settings that minimize the statistical mean of the system response. Other popular objective functions are:
- a linear combination of statistical mean and standard deviation of some selected output,
- keeping the probability of exceeding the best value smaller than some percentage at any point in the design space (this is analogous to conditional value at risk).
Note that these metrics are defined in the design variable space - that is, at each iteration of an OUU algorithm, the selected metric will be computed for the decision point under consideration. Since the calculation of these statistical metrics requires a sample (possibly large), OUU can benefit from parallel computing capabilities (e.g., the Turbine gateway).
Mathematical Formulations¶
FOQUS supports two types of OUU methods: single-stage OUU and two-stage OUU. The main difference between single-stage and two-stage OUU is the presence of the recourse (or operating) variables. Strictly speaking, since recourse variables are generally hidden (they are only needed in the inner stage and their values are not used in the outer stage of two-stage OUU), the distinction between single-stage and two-stage OUU is not clear. Nevertheless, for the sake of clarity, we will describe details of each formulation separately. The current OUU does not support linearly or nonlinearly-constrained optimization.
Single-Stage Formulation¶
In this formulation, there is no recourse variable:

\[Y = F(Z_1, Z_3, Z_4),\]
and the optimization problem becomes:

\[\min_{Z_1} \; \Phi_{Z_3,Z_4}\left[F(Z_1,Z_3,Z_4)\right],\]
where \(\Phi_{Z_3,Z_4} [F(Z_1,Z_3,Z_4)]\) is the statistical metric (one of the three options given above).
For example, if the objective function is the statistical mean, then the formulation becomes:

\[\min_{Z_1} \; \sum_{j=1}^{n_3} \pi_j \left( \int F(Z_1, Z_3^{(j)}, Z_4)\, P(Z_4)\, dZ_4 \right),\]
where, again, \(n_3\) is the number of scenarios for the discrete uncertain variables, \(\pi_j\) is the probability of the \(j\)-th scenario, and \(P(Z_4)\) is the joint probability of the continuous uncertain variables.
Two-Stage Formulation¶
In this formulation all four types of variables are present. The objective function is given by:

\[\min_{Z_1} \; \Phi_{Z_3,Z_4}\left[ \min_{Z_2} F(Z_1,Z_2,Z_3,Z_4) \right].\]

If the objective function is the statistical mean, the formulation becomes:

\[\min_{Z_1} \; \sum_{j=1}^{n_3} \pi_j \left( \int \left[ \min_{Z_2} F(Z_1,Z_2,Z_3^{(j)},Z_4) \right] P(Z_4)\, dZ_4 \right).\]

Letting \(G(Z_1,Z_3,Z_4) = \min_{Z_2} F(Z_1,Z_2,Z_3,Z_4)\), the two-stage equation can be rewritten as:

\[\min_{Z_1} \; \sum_{j=1}^{n_3} \pi_j \left( \int G(Z_1,Z_3^{(j)},Z_4)\, P(Z_4)\, dZ_4 \right),\]
which is a single-stage OUU with respect to the \(G\) function.
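A conceptual sketch of this structure: the inner problem defines \(G\) by minimizing a toy \(F\) over \(Z_2\) (here with scipy.optimize.minimize), and the outer objective averages \(G\) over the discrete scenarios and a sample of \(Z_4\). The model, scenarios, and sample below are illustrative stand-ins, not the FOQUS implementation.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def F(z1, z2, z3, z4):
    """Toy simulation model with all four variable types (scalars here)."""
    return (z1 - 1.0) ** 2 + (z2 - z3) ** 2 + 0.1 * z4 * z1

def G(z1, z3, z4):
    """Inner (recourse) optimization over Z2 for a fixed scenario."""
    res = minimize(lambda z2: F(z1, z2[0], z3, z4), x0=[0.0])
    return res.fun

def outer_objective(z1, scenarios, probs, z4_sample):
    """Scenario-weighted mean of G over the continuous uncertainty sample."""
    return sum(p * np.mean([G(z1, z3, z4) for z4 in z4_sample])
               for z3, p in zip(scenarios, probs))

scenarios, probs = [0.0, 1.0], [0.4, 0.6]          # discrete Z3 scenarios and probabilities
z4_sample = rng.normal(0.0, 1.0, size=50)          # sample of continuous Z4
print(outer_objective(0.8, scenarios, probs, z4_sample))
```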
OUU User Interface¶
The OUU module enables the user to perform optimization under uncertainty studies on a flowsheet. From the OUU tab, the user can set up the different types of optimization parameters, select from the different OUU options, and run the optimization. This screen is shown in Figure [fig:ouu_screen].
- Model provides two options for setting up the model: (1) select a node from the flowsheet that has already been instantiated; or (2) load the model from a file in the PSUADE full file format (with the opt_driver variable set to the simulation executable.)
- Variables displays all variables defined in the model that can be used in this context. Each available variable can be set to one of the following six types:
- “Fixed”: The parameter’s value is fixed throughout the optimization process.
- “Opt: Primary Continuous (Z1)”: Continuous parameter for the outer optimization.
- “Opt: Primary Discrete (Z1d)”: Discrete parameter for the outer optimization.
- “Opt: Recourse (Z2)”: Recourse parameter for the inner optimization.
- “UQ: Discrete (Z3)”: Discrete or categorical uncertain parameter that contributes to scenarios.
- “UQ: Continuous (Z4)”: Continuous uncertain parameter with a given probability distribution.
- Optimization Setup allows users to select the objective function for OUU. It also allows users to select the inner optimization solver. There are two options for the inner solver: (1) the simulation model provided by users is an optimizer itself, and (2) the simulation provided by users needs to be wrapped around by another optimizer in FOQUS.
- UQ Setup allows users to set up the continuous uncertain parameters. There are two options: (1) FOQUS can generate a sample internally, or (2) a user-generated sample can be loaded into FOQUS. The sample size should be larger than the number of continuous uncertain parameters. Optionally, a response surface can be turned on so that the statistical moments can be computed more accurately even with small samples. Users can also select a smaller subset of the sample for building the response surfaces and then evaluate them on the larger sample.
- Launch/Progress has the ‘Run OUU’ button to launch OUU runs.
Tutorials¶
This section walks through a few examples of running OUU.
The files for these tutorials are located in: examples/tutorial_files/OUU
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
Example 1: OUU with Discrete Uncertain Parameters Only¶
This example has only discrete uncertain parameters and the objective function is computed from the mean estimation with the scenarios from a sample file.
- Start FOQUS and click the ‘OUU’ icon.
- Under ‘Model’, browse and load examples/OUU/ouu_optdriver.in.
- Under ‘Variables’, set variables \(1-4\) as \(Z_1\), variables \(5-8\) as \(Z_2\), and variables \(9-12\) as \(Z_3\).
- Under ‘Optimization Setup’, select the first objective function (default) and select ‘use model as optimizer’ as the ‘Inner Solver’.
- Under ‘UQ Setup’ and ‘Discrete Random Variables’, browse the examples/OUU/ directory and load the ex1_x3sample.smp sample file (see Figure [fig:ouu_ex1]).
- Go to ‘Launch/Progress’ page, click ‘Run OUU’ and see OUU in action.
Example 2: OUU with Continuous Uncertain Parameters Only¶
This example has only continuous uncertain parameters and the objective function is computed from the mean estimation with a Latin hypercube sample of size \(200\) for \(Z_4\).
- Start FOQUS and click the ‘OUU’ icon.
- Under ‘Model’, browse and load examples/OUU/ouu_optdriver.in.
- Under ‘Variables’, set variables \(1-4\) as \(Z_1\), variables \(5-8\) as \(Z_2\), and variables \(9-12\) as \(Z_4\).
- Under ‘Optimization Setup’, select the first objective function (default) and select ‘use model as optimizer’ as the ‘Inner Solver’.
- Under ‘UQ Setup’ and ‘Continuous Random Variables’, select ‘Generate new sample for \(Z_4\)’, set ‘Sample Scheme’ to ‘Latin Hypercube’ and set sample size to \(200\) (see Figure [fig:ouu_ex2]).
- Go to ‘Launch/Progress’ page, click ‘Run OUU’ and see OUU in action.
Example 3: OUU with Continuous Uncertain Parameters and Response Surface¶
This example is similar to Example 2 except that response surfaces will be used on the \(Z_4\) sample (that is, the \(Z_4\) sample will be used to construct response surfaces and the means will be estimated from a large sample evaluated on the response surfaces).
- Start FOQUS and click the ‘OUU’ icon.
- Under ‘Model’, browse and load examples/OUU/ouu_optdriver.in.
- Under ‘Variables’, set variables \(1-4\) as \(Z_1\), variables \(5-8\) as \(Z_2\), and variables \(9-12\) as \(Z_4\).
- Under ‘Optimization Setup’, select the first objective function (default) and select ‘use model as optimizer’ as the ‘Inner Solver’.
- Under ‘UQ Setup’ and ‘Continuous Random Variables’, select ‘Generate new sample for \(Z_4\)’, set ‘Sample Scheme’ to ‘Latin Hypercube’ and set sample size to \(200\).
- Under ‘UQ Setup’ and ‘Continuous Random Variables’, check the ‘Use Response Surface’ box (see Figure [fig:ouu_ex2]).
- Go to ‘Launch/Progress’ page, click ‘Run OUU’ and see OUU in action.
Example 4: OUU with Discrete and Continuous Uncertain Parameters¶
This example has both discrete and continuous parameters. The discrete scenarios will be loaded from a sample file. A Latin hypercube sample will be generated for the continuous variables.
- Start FOQUS and click the ‘OUU’ icon.
- Under ‘Model’, browse and load examples/OUU/ouu_optdriver.in.
- Under ‘Variables’, set variables \(1-4\) as \(Z_1\), variables \(5-8\) as \(Z_2\), variable \(9\) as \(Z_3\), and variables \(10-12\) as \(Z_4\).
- Under ‘Optimization Setup’, select the first objective function (default) and select ‘use model as optimizer’ as the ‘Inner Solver’.
- Under ‘UQ Setup’ and ‘Discrete Random Variables’, browse the examples/OUU/ directory and load the ex456_x3sample.smp sample file.
- Under ‘UQ Setup’ and ‘Continuous Random Variables’, select ‘Generate new sample for \(Z_4\)’, set ‘Sample Scheme’ to Latin hypercube and set ‘Sample Size’ to \(100\).
- Go to ‘Launch/Progress’ page, click ‘Run OUU’ and see OUU in action.
Example 5: OUU with Mixed Uncertain Parameters and Response Surface¶
This example is similar to Example 4 except that response surfaces will be used to estimate the means for the continuous uncertain variables.
- Start FOQUS and click the ‘OUU’ icon.
- Under ‘Model’, browse and load examples/OUU/ouu_optdriver.in.
- Under ‘Variables’, set variables \(1-4\) as \(Z_1\), variables \(5-8\) as \(Z_2\), variable \(9\) as \(Z_3\), and variables \(10-12\) as \(Z_4\).
- Under ‘Optimization Setup’, select the first objective function (default) and select ‘use model as optimizer’ as the ‘Inner Solver’.
- Under ‘UQ Setup’ and ‘Discrete Random Variables’, browse the examples/OUU/ directory and load the ex456_x3sample.smp sample file.
- Under ‘UQ Setup’ and ‘Continuous Random Variables’, select ‘Generate new sample for \(Z_4\)’, set ‘Sample Scheme’ to Latin hypercube and set ‘Sample Size’ to \(100\).
- Under ‘UQ Setup’ and ‘Continuous Random Variables’, check the ‘Use Response Surface’ box.
- Go to ‘Launch/Progress’ page, click ‘Run OUU’ and see OUU in action.
Example 6: OUU with User-provided Samples and Response Surface¶
This example is similar to Example 4 except that a sample for \(Z_4\) will be used (instead of the Latin hypercube sample generated internally).
- Start FOQUS and click the ‘OUU’ icon.
- Under ‘Model’, browse and load examples/OUU/ouu_optdriver.in.
- Under ‘Variables’, set variables \(1-4\) as \(Z_1\), variables \(5-8\) as \(Z_2\), variable \(9\) as \(Z_3\), and variables \(10-12\) as \(Z_4\).
- Under ‘Optimization Setup’, select the first objective function (default) and select ‘use model as optimizer’ as the ‘Inner Solver’.
- Under ‘UQ Setup’ and ‘Discrete Random Variables’, browse the examples/OUU/ directory and load the ex456_x3sample.smp sample file.
- Under ‘UQ Setup’ and ‘Continuous Random Variables’, check ‘Load existing sample for \(Z_4\)’ and load the \(Z_4\) sample examples/OUU/ex6_x4sample.smp.
- Go to ‘Launch/Progress’ page, click ‘Run OUU’ and see OUU in action.
Example 7: OUU with Large User-provided Samples and Response Surface¶
This example is similar to Example 5 except that a sample for \(Z_4\) is provided (instead of generated internally).
- Start FOQUS and click the ‘OUU’ icon.
- Under ‘Model’, browse and load examples/OUU/ouu_optdriver.in.
- Under ‘Variables’, set variables \(1-4\) as \(Z_1\), variables \(5-8\) as \(Z_2\), and variables \(9-12\) as \(Z_4\).
- Under ‘Optimization Setup’, select the first objective function (default) and select ‘use model as optimizer’ as the ‘Inner Solver’.
- Under ‘UQ Setup’ and ‘Continuous Random Variables’, check ‘Load existing sample for \(Z_4\)’ and load the \(Z_4\) sample examples/OUU/ex7_x4sample.smp (\(10000\) sample points).
- Under ‘UQ Setup’ and ‘Continuous Random Variables’, check ‘Use Response Surface’ and set ‘Sample Size’ to \(100\).
- Go to ‘Launch/Progress’ page, click ‘Run OUU’ and see OUU in action.
Surrogate Modeling¶
Contents¶
Surrogate Models Overview¶
Large-scale computational models are crucial tools for analyzing complex systems. However, when they are coupled with uncertainty quantification and optimization methods, the resulting computational expense can become intractable. To reduce this computational burden, surface approximation methods, also known as black box or surrogate models, are commonly used. FOQUS provides a selection of surrogate modeling tools, all using a similar workflow. This section provides an overview of the surrogate modeling features and capabilities. The details of each tool are provided in the tutorial sections.
The following surrogate modeling tools are currently available:
- ACOSSO – Adaptive COmponent Selection and Shrinkage Operator is a regularization method for simultaneous model fitting and variable selection based on nonparametric regression methods. ACOSSO is suitable for approximating models with many inputs and no sharp changes.
- ALAMO – Automated Learning of Algebraic Models for Optimization generates algebraic models from data sets. These surrogate models are ideal for equation-oriented optimization problems (which are easily differentiable), such as superstructure optimization.
- BSS-ANOVA – Bayesian Smoothing Spline Analysis of Variance is a method similar to ACOSSO.
- iREVEAL – Surrogate models for CFD simulations using Kriging or Neural Networks. It contains special features specifically designed for working with CFDs.
Data Selection¶
The Data tab allows the selection of training data to be used to generate a surrogate model (Surrogate Data Form). If the session is associated with flowsheet data (results from single flowsheet runs, optimization runs, or UQ samples), then the flowsheet data is available as training data and the table will be populated accordingly.
Surrogate Data Form
- Run the surrogate modeling method.
- Stop the surrogate modeling method.
- Surrogate modeling tool enables the user to select the desired surrogate modeling tool from the Tool drop-down list.
- Description of the selected surrogate method.
- Add Samples enables the user to generate new training data using a model specified in the flowsheet or an emulator (i.e., a basic response surface provided as part of the UQ module).
- Flowsheet Results are summarized below.
- The data table has a Menu drop-down list that contains display, import/export, and edit commands.
- Select a data filter from the Current Filter drop-down for current data display.
- Add or edit new data filters from Edit Filters. This dialog is shown in Figure Sort1 Data Filter Results.
- The Display table displays the results of flowsheet evaluations stored in the FOQUS session file. The columns are:
- SetName is a name assigned to samples. This is typically equivalent to one UQ sample run or one optimization run.
- ResultName is a string representing a result name.
- Error is the simulation result status; 0 indicates success, other numbers represent an error. A column for each node displays the error status of each node.
- Time displays the time when the result was stored.
- Elapsed Time describes how long a result took to calculate.
- Tags enables a list of string labels to be applied to results. This could be used to mark results to be used for a particular purpose such as model validation.
- The remaining columns display the input and output variables.
Filters can be used to select data. See Section Tutorial 4: Flowsheet Result Data for more information on creating filters for the results. The “All” and “None” filters are available by default. These can be used, for example, to assign all the data as a training set, or to split the data into separate training and test sets.
Variables¶
The Variables section is illustrated in Figure Surrogate Variable Selection. This section allows selection of the input and output variables used in a surrogate model. Some surrogate methods such as ALAMO may generate and run additional samples while building surrogates. The Min/Max columns provide bounds on the variables. Selecting the checkbox next to the variable Name indicates that it should be included in the surrogate generation. Failing to select a checkbox for any variable will result in an error during surrogate generation.
Surrogate Variable Selection
Method Settings¶
The Method Settings table is illustrated in Figure Surrogate Settings. The settings available in this table depend on the surrogate tool. A description of each setting is provided in the third column of the table.
Surrogate Settings
Execution¶
Clicking Run starts the surrogate model building process. The execution monitor displays after Run is clicked (see Figure Surrogate Status Monitor) and shows the status of the surrogate build. The messages displayed depend on the surrogate tool.
Surrogate Status Monitor
After a successful execution and model building, the results are displayed. If the surrogate modeling tool ends with an error, the errors are displayed in this window. After surrogate generation completes, one or two Python files will be generated depending on the tool. Each tool generates a file that encodes the surrogate model as a general Python script that can be used to evaluate output values for UQ analyses within the UQ module. The other file, if available, is a FOQUS flowsheet plugin model that allows the surrogate to be run in a FOQUS flowsheet. The next version of FOQUS will generate a FOQUS flowsheet plugin model (i.e., the second file) for all surrogate tools.
Tutorial¶
Tutorial 1: ALAMO¶
This tutorial focuses on the use of the ALAMO tool for building algebraic surrogate models. ALAMO builds simplified algebraic models, which are particularly well suited for rigorous equation-oriented optimization. To keep the execution of this tutorial fast, a toy problem is used. In this case study the flowsheet calculations and sample generation are done within FOQUS; alternatively, the user can provide a simulation model such as Excel, Aspen Plus, Aspen Custom Modeler, etc.
Note: Before starting this tutorial the ALAMO product must be downloaded from the products page on the CCSI website. The path for the ALAMO executable file must be set in FOQUS settings (see Section Settings).
The FOQUS file (Surrogate_Tutorial_1.foqus), where Steps 1 to 42 of this tutorial have been completed is located in: examples/tutorial_files/Surrogates
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
Flowsheet Setup¶
- Open FOQUS.
- Name the session “Surrogate_Tutorial_1” (Figure Session Set Up).
Session Set Up
- Navigate to the Flowsheet Editor (Figure Flowsheet Setup).
- Add a Flowsheet Node named “eq.”
- Display the Node Editor by clicking the Node Editor toggle button.
Flowsheet Setup
The Node Editor displays (Figure Node Variables). The first step in setting up the node for this problem is to add input and output variables to the node.
- If the input variables table is not displayed as shown in Figure Node Variables, click the Variables tab and then click the Input Variables toolbox section.
- Add the variables “x1” and “x2” by clicking the Add icon (+) above the input table.
- Edit the Min/Max value for both variables to be “-10.0” and “10.0.”
- Add two output variables “z1” and “z2.”
Node Variables
To keep the execution time short, the node will not be assigned to a simulation model and calculations are performed directly in FOQUS.
Click on the Node Script tab in the Node Editor to enter the test equation (this step replaces the use of a simulator).
Enter the following equations (Figure Node Script):
f["z1"] = x["x1"] + x["x2"] f["z2"] = x["x1"]**2 + x["x2"]**2
The node script calculations are written in Python. The dictionary “f” stores output values while the dictionary “x” stores input values.
Node Script
Test the model by running the flowsheet with the value “2” for “x1” and “x2.” After running, the output variables should have the values “4.0” for “z1” and “8.0” for “z2.”
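For reference, the same calculation can be reproduced outside FOQUS with a few lines of plain Python (this simply mirrors the node script above; it does not use any FOQUS API):

# Standalone check of the node equations (outside FOQUS) for x1 = x2 = 2.
x = {"x1": 2.0, "x2": 2.0}
f = {}
f["z1"] = x["x1"] + x["x2"]          # expect 4.0
f["z2"] = x["x1"]**2 + x["x2"]**2    # expect 8.0
print(f)                             # {'z1': 4.0, 'z2': 8.0}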
Creating Initial Samples¶
There are two ways to start an ALAMO run: (1) generate a set of initial data, or (2) use ALAMO’s adaptive sampling with no initial data and let ALAMO generate its own samples. Adaptive sampling can also be combined with initial data to generate more points if needed. In this tutorial, initial data is provided and adaptive sampling is not used.
- Select the UQ tool by clicking on the Uncertainty button on the Home window (Figure Add a New Sample Ensemble).
- Click the Add New button.
- The Add New Ensemble - Model Selection dialog will appear. Click OK to set up the sampling scheme.
Add a New Sample Ensemble
- The sample ensemble setup dialog displays (Figure Sample Distributions). Select Choose sampling scheme.
- Click the All Variable button.
- Select the Sampling scheme tab.
Sample Distributions
- The Sampling scheme dialog should display (Figure Sample Methods). Select “Latin Hypercube” from the list.
- Set the # of samples to “1000.”
- Click Generate Samples.
- Click Done.
Sample Methods
- Once the samples have been generated a new sample ensemble displays in the UQ tool window (Figure Run Samples). Click Launch to run and generate the samples.
Run Samples
Data Selection¶
Initial and validation data can be specified by creating filters that specify subsets of flowsheet data. In this tutorial only initial data will be used. A filter must be created to separate the results of the single test run from the UQ samples.
- Click on the Surrogates button from the Home window. The surrogate tool displays Surrogate Data.
- Select “ALAMO” from the Tool drop-down list.
- Click Edit Filters in the Flowsheet Results section to create a filter.
Surrogate Data
- Figure Data Filter Dialog displays the Data Filter Editor.
- Add the filter for initial data.
- Click New Filter, and enter “f1” as the filter name.
- Type the Filter expression: c("set") == "UQ_Ensemble".
- Click Done.

Data Filter Dialog
Variable Selection¶
In this section, input and output variables need to be selected. Generally, any input variables that vary in the data set should be selected. However, in some cases, variables may be found to have no, or very little, effect on the outputs. Only the output variables of interest need to be selected. Note: Each output is modeled independently, so for model building, selecting one output is equivalent to selecting several.
- Select the Variables tab (Figure Variable Selection).
- Select the checkbox for both input variables.
- Select the checkbox for both output variables.
Variable Selection
Method Settings¶
The most important step in generating “good” algebraic models is to configure the settings according to the problem to be solved. Each setting has a description in FOQUS. The JSON parser is used to read method settings values. Strings must be contained in quotes. Lists have the following format: [element 1, element 2].
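Because the settings are parsed as JSON, a quick way to check that a value is formatted correctly is to try parsing it with Python's standard json module; the example values below are illustrative only, and the setting names they refer to are listed in the steps and table that follow:

import json

json.loads("[1, 2, 3]")    # a list value, e.g. for MONOMIALPOWER
json.loads("[1]")          # a one-element list, e.g. for MULTI2POWER
json.loads('"None"')       # a string value must be wrapped in quotes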
- Click on the Method Settings tab (see Figure ALAMO Method Settings).
- Set the FOQUS Model (for UQ) to “ALAMO_tutorial_UQ.py.”
- Set the FOQUS Model (for Flowsheet) to “ALAMO_tutorial_FS.py”
- Set Initial Data Filter to “Initial.”
- Set SAMPLER to select the adaptive sampling method: “None”, “Random”, or “SNOBFIT”. Use “None” in this tutorial.
- Set MONOMIALPOWER to select the single variable term powers to [1,2,3].
- Set MULTI2POWER to select the two variable term powers to [1].
- Select functions to be considered as basis functions (EXPFCNS, LOGFCNS, SINFCNS, COSFCNS).
- Leave the rest of settings as default (see Table ALAMO Method Settings).
- Save this FOQUS session for use in the ACOSSO and BSS-ANOVA tutorials.
ALAMO Method Settings
Execution¶
- Click the Run icon at the top of the window.
- The ALAMO Execution tab starts displaying the execution file path, sub-directories, input files, and output files.
- ALAMO version.
- License Information.
- Step 0 displays the data set to be used by ALAMO.
- Step 1 displays the modeler used by ALAMO to generate the algebraic model.
- Once the surrogate model has finished, the equations are displayed in the execution window. It may be necessary to scroll up a little. The result is shown in Figure ALAMO Execution.
- Finally, the statistics display the quality metrics of the models generated.
ALAMO Execution
Results¶
The results are exported as a PSUADE driver file that can be used to perform UQ analysis of the models, and as a FOQUS Python plugin model that allows the surrogate to be used in a FOQUS flowsheet. The equations can also be viewed in the results section.
See tutorial Section Tutorial 4: Surrogates with UQ Tools and Tutorial 5: Surrogates with the Flowsheet for information about analyzing the model with the UQ tools or running the model on the flowsheet.
As mentioned in the Method Settings section, the method settings are very important. A brief description and hints are included in Table ALAMO Method Settings.
Method Settings | Description |
Initial Data Filter | Filter to be applied to the initial data set. Data filters help the user to generate models based on specific data for each variable. |
Validation Data filter | Data set used to compute model errors at the validation phase. The number of data points in a preexisting validation data set can be specified by the user. |
SAMPLER | Adaptive sampling method to be used by ALAMO when more sampling points are needed by the model. Options: “None”, “Random”, and “SNOBFIT”. If Random is used, a simulator must be provided by the user. If SNOBFIT is used, a simulator must be provided by the user and MATLAB must be installed. |
MAXTIME | Maximum execution time in seconds. This time includes all steps of the algorithm; if simulations are needed, they are run within this time. |
MINPOINTS | Convergence is assessed only if the simulator is able to compute the output variables for at least MINPOINTS of the data set. A reduced number of MINPOINTS may reduce the computational time to get a model, but also reduces the accuracy of the model. MINPOINTS must be a positive integer. |
PRESET | Value to be used if the simulator fails. This value must be carefully chosen to be an otherwise not realizable value for the output variables. |
MONOMIALPOWER | Vector of monomial powers to be considered as basis functions; use an empty vector [] for none. These are the powers allowed for single-variable terms in the algebraic model, e.g., selecting [1,2] allows x1 and x1**2 as basis functions. |
MULTI2POWER | Vector of pairwise combinations of powers to be considered as basis functions, e.g., [1,2] allows terms like x1*x2 in the algebraic model. |
MULTI3POWER | Vector of three variables combinations of powers to be considered as basis functions. |
EXPFCNS, LOGFCNS, SINFCNS, COSFCNS | Whether or not to use exp, log, sin, and cos functions as basis functions in the model. |
RATIOPOWER | Vector of ratio combinations of powers to be considered in the basis functions. Ratio combinations of powers are [empty as default]. |
Radial Basis Functions | Radial basis functions centered around the data set provided by the user. These functions are Gaussian and are deactivated if their textual representation requires more than 128 characters (in the case of too many input variables and/or datapoints). |
RBF parameter | Constant penalty used in the Gaussian radial basis functions. |
Modeler | Fitness metric to be used for model building. Options: BIC (Bayesian Information Criterion), Mallow’s Cp, AICc (Corrected Akaike’s Information Criterion), HQC (Hannan-Quinn Information Criterion), MSE (Mean Square Error), and Convex Penalty. |
ConvPen | Convex penalty term. Used if Convex Penalty is selected. |
Regularizer | Regularization method is used to reduce the number of potential basis functions before the optimization. |
Tolrelmetric | Convergence tolerance on the chosen fitness metric used to terminate the algorithm. |
ScaleZ | If used, the variables are scaled before the optimization problem is solved. The problem is solved using a mathematical programming solver, and scaling the variables may help the optimization procedure. |
GAMS | GAMS is the software used to solve the optimization problems. The executable path is expected or the user must declare GAMS.exe in the environment path. |
GAMS Solver | Solver to be used by GAMS to solve the optimization problems. Mixed integer quadratic programming solver is expected like BARON (other solvers can be used). |
MIPOPTCR | Relative convergence tolerance for the optimization problems solved in GAMS. The optimization problem is solved when the optcr is reached. Values of 0.5 to 0.1 % (0.005 to 0.001) are typical. |
MIPOPTCA | Absolute convergence tolerance for mixed-integer optimization problems. This must be a nonnegative scalar. |
Linear error | If true, a linear objective function is used when solving the mixed integer optimization problems; otherwise, a quadratic objective function is used. |
CONREG | Specify whether constraint regression is used or not; if true, bounds on output variables are enforced. |
CRNCUSTOM | If true, Custom constraints are entered in the Variable tab. |
CRNINITIAL | Number of random bounding points at which constraints are sampled initially (must be a nonnegative integer). |
CRNMAXITER | Maximum allowed constrained regressions iterations. Constraints are enforced on additional points during each iteration (must be positive integer). |
CRNVIOL | Number of bounding points added per round per bound in each iteration (must be positive integer). |
CRNTRIALS | Number of random trial bounding points per round of constrained regression (must be a positive integer). |
CUSTOMBAS | A list of user-supplied custom basis functions can be provided by the user. The parser is not case sensitive and allows for any Fortran functional expression in terms of the XLABELS (symbol ^ may be used to denote power). |
Tutorial 2: ACOSSO¶
This tutorial covers the ACOSSO surrogate modeling method. The Adaptive COmponent Selection and Shrinkage Operator (ACOSSO) surface approximation was developed under the Smoothing Spline Analysis of Variance (SS-ANOVA) modeling framework (Storlie et al. 2011). As it is a smoothing type method, ACOSSO works best when the underlying function is somewhat smooth. For functions which are known to have sharp changes or peaks, etc., other methods may be more appropriate. Since it implicitly performs variable selection, ACOSSO can also work well when there are a large number of input variables. The ACOSSO procedure also allows for categorical inputs (Storlie et al. 2013).
This tutorial uses the same flowsheet and sample setup as the ALAMO tutorial in Section Tutorial 1: ALAMO.
The FOQUS file for this tutorial is Surrogate_Tutorial_1.foqus, and this file is located in: examples/tutorial_files/Surrogates
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
The statistics software “R” is also required to use ACOSSO and BSS-ANOVA. Before starting this tutorial, you will need to install R version 3.1 or later (see https://cran.r-project.org/).
Once R is installed, you will need to install the “quadprog” package. ACOSSO requires this package for solving quadratic programming problems. You will only need to perform this step once.
Start R. In Windows, this must be done with administrative privileges. Either run R from an administrator account, or right-click “R x64 3.1.2”, click “Run as administrator”, and type in administrator credentials.
Inside the R console, type:
- install.packages('quadprog')
- library(quadprog)
- q()
The first line installs the package. If prompted for a CRAN mirror, select the one closest to you geographically. The second line loads the package. The last line quits R. If prompted to save workspace image, choose ‘y’.
Once you have done these steps, ACOSSO is ready to be invoked inside FOQUS.
- Set the path to the RScript executable.
- Click the Settings button in the Home window.
- Change the RScript path if necessary. The Browse button opens a file browser that can be used to set the path.
- Complete the ALAMO tutorial in Section Tutorial 1: ALAMO through Step 32, load the FOQUS session saved after completing the ALAMO tutorial, or load the “Surrogate_Tutorial_1.foqus” file from the examples/tutorial_files/Surrogates folder.
- Click the Surrogates button in the Home window (Figure ACOSSO Session Set Up).
- Select “ACOSSO” in the Tool drop-down list.
- Select the Method Settings tab.
- Set “Data Filter” to “Initial.”
- Set “Use Flowsheet Data” to “Yes.”
- Set “FOQUS Model (for UQ)” to “ACOSSO_Tutorial_UQ.py.”
- Set “FOQUS Model (for Flowsheet)” to “ACOSSO_Tutorial_FS.py.”
- Click the Run icon (Figure ACOSSO Session Set Up).
ACOSSO Session Set Up
- The execution window will automatically display. While ACOSSO is running, the execution window may show warnings, but this is normal.
- When the run completes, a UQ driver file is created, allowing the ACOSSO surrogate to be used as a user-defined response surface in UQ analyses. (See Section Tutorial 4: Surrogates with UQ Tools.)
- ACOSSO also produces a flowsheet plugin.
Tutorial 3: BSS-ANOVA¶
This tutorial covers the BSS-ANOVA surrogate modeling method. The Bayesian Smoothing Spline ANOVA (BSS-ANOVA) is essentially a Bayesian version of ACOSSO (Reich et al. 2009). It is a Gaussian Process (GP) model with a non-conventional covariance function that borrows its form from SS-ANOVA. It tackles the high dimensionality (of inputs) on two fronts: (1) variable selection to eliminate uninformative variables from the model and (2) restricting the level of interactions involved among the variables in the model. This is done through a fully Bayesian approach which can also allow for categorical input variables with relative ease. Since it is closely related to ACOSSO, it generally works well in similar settings as ACOSSO. The BSS-ANOVA procedure also allows for categorical inputs (Storlie et al. 2013). In this current implementation, BSS-ANOVA is more computationally intensive than ACOSSO, so ACOSSO is preferred for faster surrogate generation.
This tutorial uses the same flowsheet and sample setup as the ALAMO tutorial in Section Tutorial 1: ALAMO.
The FOQUS file for this tutorial is Surrogate_Tutorial_1.foqus, and this file is located in: examples/tutorial_files/Surrogates
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
The statistics software “R” is also required to use ACOSSO and BSS-ANOVA. Before starting this tutorial, you will need to install R version 3.1 or later (see http://cran.r-project.org/).
- Set the path to the RScript executable.
- Click the Settings button from the Home window.
- Change the RScript path if necessary. The Browse button opens a file browser that can be used to set the path.
- Complete the ALAMO tutorial in Section Tutorial 1: ALAMO through Step 32, load the FOQUS session saved after completing the ALAMO tutorial, or load the “Surrogate_Tutorial_1.foqus” file from the examples/tutorial_files/Surrogates folder.
- Click the Surrogates button from the Home window (Figure BSS-ANOVA Session Set Up).
- Select “BSS-ANOVA” in the Tool drop-down list.
- Select the Method Settings tab.
- Set “Data Filter” to “Initial.”
- Set “Use Flowsheet Data” to “Yes.”
- Set “FOQUS Model (for UQ)” to “bssanova_tutorial_uq.py.”
- Set “FOQUS Model (for Flowsheet)” to “bssanova_tutorial_fs.py.”
- Click the Run icon (Figure BSS-ANOVA Session Set Up).
BSS-ANOVA Session Set Up
- The execution window will automatically display. While BSS-ANOVA is running, the execution window may show warnings, but this is normal.
- When the run completes, a UQ driver file is created, allowing the BSS-ANOVA surrogate to be used as a user-defined response surface in UQ analyses. (See Section Tutorial 4: Surrogates with UQ Tools.)
- BSS-ANOVA also produces a flowsheet plugin.
Tutorial 4: Surrogates with UQ Tools¶
For the purpose of this tutorial, we will use ACOSSO to demonstrate the use of a surrogate within the UQ module. The steps are the same regardless of the surrogate tool chosen.
To perform the UQ analysis, a working Python installation is required to use the “User Regression” response surface; the Python environment used to install FOQUS (see Install Python) is sufficient. In addition, if *.py files have been re-associated with other executables (e.g., editors), please change the association back to python.exe.
The FOQUS file for this tutorial is Rosenbrock_no_vectors.foqus, and this file is located in: examples/tutorial_files/Surrogates
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
- Load a fresh session by clicking the Session button from the Home window. Select Open Session and then navigate to the above-mentioned folder, and select “Rosenbrock_no_vectors.foqus.” This will load a session with a simple flowsheet containing a single node.
- Click Settings and ensure that (1) FOQUS Flowsheet Run Method is set to “Local”, and that (2) proper paths are set for PSUADE and RScript.
- Train an ACOSSO surrogate of this node by clicking the Surrogates button from the Home window.
- Click Add Samples and select “Use Flowsheet”. This will display the Simulation Ensemble Setup dialog.
- Within this dialog, ensure all variables are set to “Variable” type in the Distributions tab. In the Sampling scheme tab, select “Monte Carlo” as your sampling scheme, set the number of samples to 100, and then click Generate Samples to generate the set of input values. Click Done to return to the Surrogates screen.
- Once sample generation completes, click the Uncertainty button from the Home window.
- Click the Launch button to generate the samples.
- Click the Surrogates button from the Home window. The Data tab of the Surrogates screen should now display a Flowsheet Results table that is populated with the values of the new input samples.
- From the Variables tab, select all of the checkboxes. (There should be six checkboxes for input variables and one checkbox for output variable.) Here, you are defining the inputs and outputs for your surrogate function.
- From the Method Settings tab, note the name of the file next to “FOQUS Model (for UQ)”. This will be the name of the UQ driver file that contains the Python code that implements the surrogate function.
- At the top of this screen, select “ACOSSO” as your surrogate tool from the Tool drop-down list and then click on the green arrow to start training the surrogate.
- Once complete, a popup window will display, reminding you of the location of the driver file. Note the location as you will need this information later inside the UQ module.
- Perform a response-surface-based uncertainty analysis by clicking the Uncertainty button from the Home window.
- In the Uncertainty Quantification Simulation Ensembles table, a row corresponding to the ensemble that was just generated for surrogate training should be displayed. This same ensemble can be used, or a new one can be created to serve as the test data set for analysis. In the row corresponding to the ensemble to be analyzed, click the Analyze button to proceed. This action will bring up an analysis dialog.
- Within this analysis dialog, navigate to the “Analysis” section. For Step 1, select “Response Surface”. For Step 3, select “User Regression” in the first drop-down list. Lastly, for “User Regression File”, browse to the same location as the UQ driver file that was generated within the Surrogates module. (This is the same location that was previously noted from the popup message.) At this point, your surrogate function is now set up as a user-defined response surface and all response-surface-based UQ analyses are accessible.
- Click Validate (Step 4) to perform response surface validation. Once complete, a figure with cross-validation results will be displayed: a histogram of errors to the left and a plot of predicted values versus actual values to the right. For more information, refer to the UQ Tutorial in Section[tutorial.uq.rs].
- Once a “Response Surface” has been validated, other UQ analysis options are available. Choose “Uncertainty Analysis” in Step 5 and click Analyze to perform uncertainty analysis using your ACOSSO surrogate.
During validation, if the error “RSAnalyzer: RSTest_hs.m does not exist.” displays, this is likely caused by an incompatibility between the surrogate and the test data. An example scenario might be that your test data has six inputs, but your surrogate assumes five inputs. This is easily fixed by returning to the Surrogates screen, clicking on the Variables tab, and making sure the appropriate selections are made (i.e., check off six inputs instead of just five).
Tutorial 5: Surrogates with the Flowsheet¶
This section provides a brief tutorial for using the flowsheet plugin models generated by surrogate modeling methods. In a future FOQUS release all surrogate modeling methods will produce a model that can be run in a FOQUS flowsheet. Currently iREVEAL (part of the CCSI Toolset) does not produce a flowsheet model.
Before doing this tutorial, complete the ALAMO tutorial in Section Tutorial 1: ALAMO.
- Open FOQUS. If FOQUS has not been closed since completing the ALAMO tutorial, close it and reopen it. There is a known issue where existing flowsheet model plugins may not update until FOQUS is restarted.
- Enter “FS_Plugin_Tutorial” as the Session Name.
- Click the Flowsheet button from the Home window.
- Click the Add Node icon in the left toolbar (see Figure Plugin Flowsheet).
- Click a location for the node in the Flowsheet area.
- Enter “model” for the node name (without quotes).
- Click the Node Editor icon in the left toolbar (see Figure Plugin Flowsheet).
- In the Node Editor, select “Plugin” from the Model Type drop-down list.
- Select “ALAMO_Tutorial_FS” from the Model drop-down list.
- Set the Value of the Input Variables “eq.x1” to 2.
- Set the Value of the Input Variables “eq.x2” to 3.
- Click the Run icon in the left toolbar (see Figure Plugin Flowsheet).
- Wait for the Flowsheet evaluation to complete. It should finish successfully.
- Check the value of the Output Variables; the approximate values should be z1 = 5 and z2 = 13.
Plugin Flowsheet
Sequential Design of Experiments (SDOE)¶
Contents¶
Sequential Design of Experiments (SDOE)¶
Experimenters often begin an experiment with imperfect knowledge of the underlying relationship they seek to model, and may have a variety of goals that they would like to accomplish with the experiment. In this chapter, we describe how sequential design of experiments can help make the best use of resources and improve the quality of learning. We describe the different types of space filling designs that can help accomplish this, define basic terminology, and show a common sequence of steps that are applicable to many experiments. We show the basics for the types of designs supported in the SDoE module, and provide some examples to illustrate the methods.
A sequential design of experiments strategy allows for adaptive learning based on incoming results as the experiment is being run. The SDoE module in FOQUS allows the experimenter to flexibly incorporate this strategy into their designed experimental planning to allow for maximally relevant information to be collected. Statistical design of experiments is an important strategy to improve the amount of information that can be gleaned from the overall experiment. It leverages principles of putting experimental runs where they are of maximum value, the interdependence of the runs to estimate model parameters, and robustness to the variability of results that can be obtained when the same experimental conditions are repeated. There are two major categories of designed experiments: those for which a physical experiment is being run, and designs for a computer experiment where the output from a computer model (based on underlying science or engineering theory) is explored. There are also experimental situations when the goal is to collect both from a physical experiment as well as the computer model to compare them and to calibrate some of the computer model parameters to best match what is observed. The methods available in the SDoE module can be beneficial for all three of these cases. They present opportunities for accelerated learning through strategic selection and updating of experimental runs that can adapt to multiple goals.
The current version of the SDoE module has functionality that can produce flexible space-filling designs. Currently, two types of space-filling designs are supported:
Uniform Space Filling (USF) designs space design points evenly, or uniformly, throughout the user-specified input space. These designs are common in physical and computer experiments where the goal is to have data collected throughout the region. They are well suited to exploration, and being able to predict results at a new input combination, as there will be some data available close by. To use the Uniform Space Filling design capability in the SDoE module, the only requirement for the user is that the candidate set contains a column for each of the inputs and a row for each possible run. It is also recommended (but not required) to have an index column to be able to track which rows of the candidate set are selected in the constructed design.
Non-Uniform Space Filling (NUSF) designs maintain the goal of having design points spread throughout the desired input space, but add the ability to emphasize some regions more than others. This adds flexibility to the experimentation, since the user can tune the design to be as close to uniform as desired or as strongly concentrated in one or more regions as desired. This is a newly developed capability, which has only recently been introduced into the statistical design of experiments literature, and it has been added to the SDoE module. It provides the experimenter with the ability to tailor the design to what is needed. To use the Non-Uniform Space Filling design capability in the SDoE module, the requirements are that the candidate set contains (a) one column for each of the inputs to be used to construct the design, and (b) one column for the weights to be assigned to each candidate point, where larger values are weighted more heavily and will result in a higher density of points close to those locations. The Index column is again recommended, but not required.

Comparison of USF and NUSF designs
Key features of both approaches available in this module are: a) designs are constructed by selecting from a user-provided candidate set of input combinations, and b) historical data that have already been collected can be integrated into the design construction to ensure that new runs are placed with an awareness of where data are already available.
Why Space-Filling Designs?¶
Space-filling designs are a design of experiments strategy that is well suited to both physical experiments with an accompanying model to describe the process and to computer experiments. The idea behind a space-filling design is that the design points are spread throughout the input space of interest. If the goal is to predict values of the response for a new set of input combinations within the ranges of the inputs, then having data spread throughout the space means that there should be an observed data point relatively close to where the new prediction is sought, regardless of the new location.
In addition, if there is a model for the process, then having data spread throughout the input space means that the consistency of the model to the observed data can be evaluated at multiple locations to look for possible discrepancies and to quantify the magnitude of those differences throughout the input space.
Hence, for a variety of criteria, a space-filling design can serve as a good choice for exploration and for understanding the relationship between the inputs and the response without making a large number of assumptions about the nature of the underlying relationship. As we will see in subsequent sections and examples, the sequential approach allows for great flexibility to leverage what has been learned in early stages to influence the later choices of designs. In addition, the candidate-based approach that is supported in this module has the advantage that it can make the space-filling approach easier to adapt to design space constraints and specialized design objectives that may evolve through the stages of the sequential design.
We begin with some basic terminology that will help provide structure to the process and instructions below.
- Input factors – these are the controllable experimental settings that are manipulated during the experiment. It is important to carefully define the ranges of interest for the inputs (e.g., Temperature in [200°C, 400°C]) as well as any logistical or operational constraints on these input factors (e.g., Flue Gas Rate < 1000 kg/hr when Temperature > 350°C).
- Input combinations (or design runs) – these are the choices of settings for each of the input factors for a particular run of the experiment. It is assumed that the implementers of the experiment are able to set the input factors to the desired operating conditions to match the prescribed choice of settings. It is not uncommon for the experimenter to not have perfect control of the input settings, but in a designed experiment, it is important to have a target value for each input and also to record the observed value if in fact it is different than what was intended. This allows for more precise estimation of the model and improved prediction.
- Input space (or design space) – the region of interest for the input factors in which the experiment will be run. This is typically constructed by combining the individual input factor ranges, and then adapting the region to take into account any constraints. Any suggested runs of the experiment will be located in this region. The candidate set of runs used by the SDoE module should provide coverage of all regions of this desired input space.
- Responses (or outputs) – these are the measured results obtained from each experimental run. Ideally, these are quantitative summaries (measured by a numeric value or possibly a vector of numeric values) of a characteristic of interest resulting from running the process at the prescribed set of operating conditions (e.g., CO2 capture efficiency is a typical response of interest for CCSI).
- Design criterion / Utility function – this is a mathematical expression of the goal (or goals) of the experiment that is used to guide the selection of new input combinations, based on the prior information before the start of the experiment and during the running of the experiment. The design criterion can be based on a single goal or multiple competing goals, and can be either static throughout the experiment or evolve as goals change in importance over the course of the experiment. Common choices of goals for the experiment are:
- exploring the region of interest,
- improving the precision (or reducing the uncertainty) in the estimation of model parameters,
- improving the precision of prediction for new observations in the design region,
- assessing and quantifying the discrepancy between the model and data, or
- optimizing the value of responses of interest.
An ideal design of experiment strategy uses the design criterion to evaluate potential choices of input combinations to maximize the improvement in the criterion over the available candidates. If the optimal design strategy is sequential, then the goal is to use early results from the beginning of the experiment to guide the choice of new input combinations based on what has already been learned about the responses.
Matching the Design Type to Experiment Goals¶
At different stages of the sequential design of experiments, different objectives are common. We outline a common progression of objectives for experiments that we have worked with in the CCSI project. Typically, an initial pilot study is conducted to show that the right data can be collected and that measurements can be made with the required precision. Often no designed experiment is used for this small study as it is just to establish viability to proceed.

SDOE sequence of steps
Once the viability of the experimental set-up and measurement system has been established, it is common to proceed to the next step of exploration. This is appropriate if little is known about the response and its characteristics. Hence, a first experiment may have the goal of gaining some preliminary understanding of the characteristics of the response across the input region of interest. Depending on how easy it is to collect and process data, this exploration might be done in a single first experiment, or there may be opportunities to do several smaller stages (this is shown in the figure above with the recursive arrow). It is particularly beneficial to do the exploration step in smaller stages if there is uncertainty about what areas of the input space are feasible. This can help save resources by exploring slowly and eliminating regions where there are problems.
After initial exploration, a common next step in the sequence of experiments is model building or model refinement. For many CCSI experiments, the physical experiments are being collected in conjunction with an underlying science-based model. If a model does not already exist, then one might be developed based on the initial data collected in the previous stage. If a model already exists, then it can be refined by collecting new data where (a) there is maximum uncertainty in prediction, or (b) where there are discrepancies between the data and the model. In this way, the data collection from a physical experiment is used to calibrate the model and provide feedback about where model performance needs improvement (both resolving inaccurate characterization of features and high uncertainty). Often after the first set of data, some regions of the input space perform well, while others have issues. It is ideal to target new data in regions where it can be most beneficially used to improve the model.
After the experimenter has confidence in the model, it can then be used for optimization. This involves using the model to predict regions with desirable values of the response(s) of interest. Often the experiments associated with this stage focus on a smaller region of the input space close to where the optimum lies. The final stage, confirmation, is often a very small experiment located right at the location where the model says the response is optimal. The goal of this stage is to verify that the results predicted by the model match what is observed from experimental data. As with the pilot study, often this final stage involves only a small number of runs and no formal designed experiment is run.
We now illustrate these stages with a simple example involving 2 inputs where the candidate set fills a rectangular region defined by the range of each input. In the first stage, the pilot study (the two orange dots) is used to establish viability of the test method and measurement system. The second stage, an initial exploratory experiment (six blue dots), spreads the points throughout the defined region of interest. Here we start to see the benefit of using a sequential approach, as the blue dots take into account the locations where the orange pilot data were collected.

SDoE Pilot study (orange) and Exploration (blue) stage
Based on this exploration, it may be discovered that one portion of the region (top right) is not viable for data collection, or is not desirable for the observed response values. Hence, in future experiments no data should be collected here. At this point, an initial model is constructed to combine what is known from the experimental data with the underlying science.

New Constraint added (dashed black line)
In the next stage of experimentation, some additional runs are added (red dots) that are used for model refinement. These are placed in regions where there is larger uncertainty in the model predictions and also seek to fill in empty space.

Model Refining stage of experimentation (red dots)
With the updated model based on the additional data, a region where good response values are possible is identified. This becomes the focus of another experiment for optimizing the response. The oval indicates the region of desirable responses, and the three green dots indicate the new input combinations collected to provide additional information.

The optimal region for the responses (oval) with additional runs (green dots)
The final data collection involves two confirmation runs (black dots) at the identified optimal location to verify that results are observed to match what the model predicts.

SDOE confirmation runs (black dots)
To conclude this example, we illustrate the power of the sequential approach to collecting data. In the figure below, we show the 18 runs collected with the sequential approach (on the left) and a typical 18-run space filling design (on the right). Both experiments have the same total budget, but the sequential approach avoids placing much data in the undesirable top right corner and concentrates much more data close to where the overall optimal combination of inputs is located.

A comparison of 2 18-run experiments: On left, the sequential approach. On right, the single experiment approach.
Using the SDoE Module - The Basics¶
In this section, we describe the basic steps for creating a design with this module. We first give details for the Uniform Space Filling design, and then a second set of details for a Non-Uniform Space Filling design.
When you first click on the SDOE button from the main FOQUS homepage, a first window appears. To create a design, the progression of steps takes you through the Ensemble Selection box (top left), then a transition triggered by the Confirm button to the Ensemble Aggregation box, and finally there are optional changes that can be made in the box at the bottom of the window. The final step in this window is to select which type of design you want to construct: Uniform Space Filling or Non-Uniform Space Filling.
Creating a New Candidate Set¶
To create a new candidate set, the user can choose between two options: loading an existing file or generating a new candidate set by providing ranges for each input.
Note
To use this feature you need to install the latest version of PSUADE. For more details go to section Install Optional Software
Loading from File¶
In the Ensemble Selection box, click on the Load from File… button to select the file(s) for the construction of the design. Several files can be selected and added to the box listing the chosen files.
For each of the files selected using the pull-down menu, identify them as either a Candidate file or a History file. Candidate .csv files are comprised of possible input combinations from which the design can be constructed. The columns of the file should contain the different input factors that define the dimensions of the input space. The rows of the file each identify one combination of input values that could be selected as a run in the final design. Typically, a good candidate file will have many different candidate runs listed, and they should fill the available ranges of the inputs to be considered. Leaving gaps or holes in the input space is possible, but generally should correspond to a region where it is not possible (or desirable) to collect data.
History .csv files should have the same number of columns for the input space as the candidate file (with matching column names) and represent data that have already been collected. The algorithm for creating the design aims to place points in different locations from where data have already been obtained, while filling the input space around those locations.
Both the Candidate and History files should be .csv files that have the first row as the Column heading. The Input columns should be numeric. Additional columns are allowed and can be identified as not necessary to the design creation at a later stage.
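As an illustration, a small candidate file could be generated with a script such as the following; the input names, ranges, and grid spacing are hypothetical and should be replaced with those of the actual experiment:

import itertools
import pandas as pd

# Hypothetical two-input candidate set on a regular grid, with an Index column.
temperature = range(200, 401, 25)        # e.g. Temperature in [200, 400]
flue_gas_rate = range(100, 1001, 100)    # e.g. Flue Gas Rate in [100, 1000]
rows = [
    {"Index": i, "Temperature": t, "FlueGasRate": r}
    for i, (t, r) in enumerate(itertools.product(temperature, flue_gas_rate), start=1)
]
pd.DataFrame(rows).to_csv("candidate.csv", index=False)  # first row = column headings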
Generating a New Candidate Set¶
In the Ensemble Selection box, click on the Add New… button to select the file for the construction of the candidate set. The following menu will appear:

The user can select between two options: using a history file or a template file.
- History File. An existing .csv file with historical data is required. If this option is selected, then the inputs to be used in the candidate set are extracted from the columns of the file.
- Template File. The template file should be a simple comma separated values (.csv) file with at least three rows:
- Header with input names
- Minimum values
- Maximum values
An optional fourth row with default values can be added. If a fourth row is not provided, the midpoint between the minimum and maximum becomes the default value for each input.
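As a concrete illustration of this layout, the short sketch below writes a hypothetical three-input template file with pandas; the input names, ranges, default values and file name are only examples.
# Sketch of writing a template .csv for candidate generation.
# Input names, ranges, defaults, and the file name are hypothetical.
import pandas as pd

template = pd.DataFrame(
    [
        [150.0, 0.10, 1000.0],   # row 2: minimum value for each input
        [250.0, 0.30, 3000.0],   # row 3: maximum value for each input
        [200.0, 0.20, 2000.0],   # optional row 4: default values
    ],
    columns=["G", "lldg", "L"],  # row 1: header with input names
)
template.to_csv("my_template.csv", index=False)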
Note
Choosing a History or Template file does not change the subsequent steps.
Once the user has decided on which file to use, click on the OK button and the following dialog will pop up:

Decide on the type of input (variable or fixed) and the probability distribution function desired. If the input is set to variable, the user can modify the minimum and maximum values to define the range over which samples should be drawn. If the input is set to fixed, then the user selects the default value that will be used for all samples. Then click on the Sampling Scheme tab and you will see the following menu:

Choose the sampling scheme to be used. The Monte Carlo and Quasi Monte Carlo options provide candidates that are scattered arbitrarily throughout the input space, while Latin Hypercube, Orthogonal Array and METIS provide different approaches for structured coverage of the input space. Next, choose the number of samples you want to generate and click on the Generate Samples button. The user can preview the samples by clicking on the Preview Samples button.

In the table on the left-hand side, you can explore the generated data for the different inputs, and in the list on the right-hand side you can select which inputs you want to plot.

Once you are happy with the generated samples, click on the Done button; the new candidate set is saved and appears in the Ensemble Selection box.
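For readers who want to see what a structured sampling scheme such as the Latin Hypercube option above produces, the sketch below draws a small Latin Hypercube sample with SciPy and scales it to assumed input ranges. FOQUS relies on PSUADE for this step, so this snippet is only a stand-in for the concept.
# Illustrative Latin Hypercube sample (FOQUS delegates this step to PSUADE).
# The input names and ranges below are assumptions for the example.
from scipy.stats import qmc

sampler = qmc.LatinHypercube(d=2, seed=0)   # two variable inputs
unit_samples = sampler.random(n=50)         # 50 samples in the unit square

# Scale to assumed ranges, e.g. G in [150, 250] and lldg in [0.1, 0.3].
samples = qmc.scale(unit_samples, [150.0, 0.1], [250.0, 0.3])
print(samples[:5])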
Basic Steps for a Uniform Space Design¶
We now consider some details for each of these steps:
1. In the Ensemble Selection box, click on the Load from File button to select the file(s) for the construction of the design. Several files can be selected and added to the box listing the chosen files.

SDOE Home Screen
2. For each of the files selected using the pull-down menu, identify them as either a Candidate file or a History file. Candidate .csv files are comprised of possible input combinations from which the design can be constructed. The columns of the file should contain the different input factors that define the dimensions of the input space. The rows of the file each identify one combination of input values that could be selected as a run in the final design. Typically, a good candidate file will have many different candidate runs listed, and they should fill the available design region to be considered. Leaving gaps or holes in the input space is possible, but generally should correspond to a region where it is not possible (or desirable) to collect data. The flexibility of the candidate set approach allows for linear and non-linear constraints for one or more of the inputs to be incorporated easily.
History .csv files should have the same number of columns for the input space as the candidate file (with matching column names), and represent data that have already been collected. The algorithm for creating the design aims to place points separated from where data have already been obtained, while filling the input space around those locations. If the experiment is being run sequentially, the History file should use the input values that were actually implemented, not the target values from the previous designed experiment.
Both the Candidate and History files should be .csv files that have the first row as the Column heading. The Input columns should be numeric. Additional columns are allowed and can be identified as not necessary to the design creation at a later stage.
3. Click on the View button to open the Preview Inputs pop-up window, to see the list of columns contained in each file. The left hand side displays the first few rows of input combinations from the file. Select the columns that you wish to see graphically in the right hand box, and then click on Plot SDOE to see a scatterplot matrix of the data.

SDOE view candidate set inputs

SDOE plot of candidate set inputs
The plot shows histograms of each of the inputs on the diagonals to provide a view of the distribution of values as well as the range of each input. The off-diagonals show pairwise scatterplots of each pair of inputs. This should provide the experimenter with the ability to assess if the ranges specified and any constraints for the inputs have been appropriately captured by the specified candidate set. In addition, repeating this process for any historical data will provide verification that the already observed data have been suitably summarized.
4. Once the data have been verified for both the Candidate and History files (if a History file has been included), click on the Confirm button to make the Ensemble Aggregation window active.

SDOE Ensemble Aggregation
5. If more than one Candidate file was specified, then the aggregate_candidates.csv file that was created will have combined these files into a single file. Similarly, if more than one History file was specified, then the aggregate_history.csv file will have been created with all runs from all these files. If only a single file was selected for either the Candidate or History files, then the corresponding aggregated file will be the same as the original.
Note
There are options to view the aggregated files for both the candidate and history files, with a similar interface as was shown in step 3. In addition, a single plot of the combined candidate and history files can be viewed. In this plot the points representing the candidate locations and points of already collected data from the history file are shown in different colors.
6. Once the data have been verified as the desired set to be used for the design construction, then click on the Uniform Space Filling button at the bottom right corner of the Ensemble Aggregation window. This opens the second SDoE window, which allows for specific design choices to be made.

SDOE second window
7. The first choice to be made for the design is whether to optimize using minimax or maximin. The first choice, minimax, looks to choose design points that minimize the maximum distance that any point in the input space (as characterized by the candidate set and historical data, if it is available) is away from a design point. Hence, the idea here is that if we want to use data to help predict new outcomes throughout the input space, then we never want to be too far away from a location where data was observed.
The second choice, maximin, looks to choose a design where the design points are as far away from each other as possible. In this case, the design criterion seeks to maximize the distance from each design point to its nearest neighboring design point. In practice the two design criteria often give similar designs, with the maximin criterion tending to push the chosen design points closer to the edges of the specified regions.
Hint
If there is uncertainty about some of the edge points in the candidate set being viable options, then minimax would be preferred. If the goal is to place points throughout the input space with them going right to the edges, then maximin would be preferred. Note that creating the designs is relatively easy, so we recommend trying both approaches to examine them and then choosing which is preferred based on the summary plots that are provided later.
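To make the two criteria concrete, the sketch below computes both values for a randomly chosen design from a candidate set using NumPy and SciPy. It only illustrates the definitions above; it is not the search algorithm used by FOQUS, and the candidate set and design are arbitrary.
# Illustration of the minimax and maximin criteria (not the FOQUS search).
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
candidates = rng.uniform(-1, 1, size=(200, 2))               # assumed 2-D candidate set
design = candidates[rng.choice(200, size=8, replace=False)]  # an arbitrary 8-run design

# Minimax: the largest distance from any candidate point to its nearest
# design point -- smaller is better.
minimax_value = cdist(candidates, design).min(axis=1).max()

# Maximin: the smallest distance between any two distinct design points --
# larger is better.
pairwise = cdist(design, design)
np.fill_diagonal(pairwise, np.inf)
maximin_value = pairwise.min()

print(minimax_value, maximin_value)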
8. The next choice falls under Design Specification, where the experimenter can select the sizes of designs to be created. The Min Design Size specifies the smallest design size to be created. Note that the default value is set at 2, which would lead to choosing the best two design runs from the candidate set to fill the space (after taking into account any historical data that have already been gathered).
The Max Design Size specifies the largest design size to be created. The default value is set at 8, which means that if this combination were used, designs would be created of size 2, 3, 4, 5, 6, 7 and 8. The number of integers between Min Design Size and Max Design Size determines the total number of searches that the SDoE algorithm will perform. Hence, it is prudent to make a thoughtful choice for this range, that balances design sizes that are potentially of interest with the waiting time for the designs to be created. In the figure above, the Min Design Size has been changed to 4, so that only the designs of size 4, 5, 6, 7 and 8 will be created.
9. Next, there are options for the columns of the candidate set to be used for the construction of the design. Under Include? in the box on the right hand side, the experimenter has the option of whether particular columns should be included in the space-filling design search. Uncheck a box if a particular column should not be included in the space filling criterion search.
Next select the Type for each column. Typically most of the columns will be designated as Inputs, which means that they will be used to construct the best uniform space filling design. In addition, we recommend including one Index column which contains a unique identifier for each run of the candidate set. This makes it easier to track which runs are included in the constructed designs. If no Index column is specified, a warning appears later in the process, but this column is not strictly required.
Finally, the Min and Max columns in the box allow the range of values for each input column to be specified. The default is to extract the smallest and largest values from the candidate and history data files, and use these as the Min and Max values, respectively. This approach generally works well, as it scales the inputs to be in a uniform hypercube for comparing distances between the design points.
Hint
The default values for Min and Max can generally be left at their defaults unless: (1) the ranges of some inputs represent very different amounts of change in the process. For example, if temperature is held nearly constant while a flow rate changes substantially, then it may be desirable to extend the range of the temperature beyond its nominal values to make the amount of change in temperature more commensurate with the amount of change in the flow rate. This is a helpful strategy to make the calculated Euclidean distance between any points a more accurate reflection of how much of an adjustment each input requires. (2) Changes are made in the candidate or history data files. For example, if one set of designs is created from one candidate set, and then another set of designs is created from a different candidate set, these designs and the achieved criterion values will not be comparable unless the range of each input has been fixed at matching values.
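As a sketch of the rescaling described above, the snippet below maps each input column onto [0, 1] using its Min and Max values before distances are computed. The file name, column names, and ranges are assumptions for the example.
# Sketch of scaling inputs to a unit hypercube before computing distances.
# File name, column names, and ranges are hypothetical.
import pandas as pd

candidates = pd.read_csv("my_candidates.csv")
ranges = {"G": (150.0, 250.0), "lldg": (0.1, 0.3)}   # assumed Min/Max per input

scaled = candidates.copy()
for col, (lo, hi) in ranges.items():
    # Widening (lo, hi) beyond the observed values de-emphasizes changes in
    # that input relative to the others, as described in the hint above.
    scaled[col] = (candidates[col] - lo) / (hi - lo)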
10. Once the design choices have been made, click on the Test SDOE button. This performs a small number of iterations of the search algorithm to calibrate the timing for constructing and evaluating the designs. The time taken to generate a design is a function of the size of the candidate set, the size of the design, as well as the dimension of the input space. The slider below Test SDOE now indicates an estimate of the time to construct all of the designs across the range of the Min Design Size and Max Design Size specified. The smallest Number of Random Starts is 10^3 = 1000, and is generally too small to produce a good design, but this will run very quickly and so might be useful for a demonstration. However, it would generally be unwise to use a design generated from this small a set of random starts for an actual experiment. Powers of 10 can be chosen with an Estimated Runtime provided below the slider.

SDOE second window after clicking Test SDOE
Hint
The choice of Number of Random Starts involves a trade-off between the quality of the design generated and the time spent waiting to generate the design. The larger the chosen number of random starts, the better the design is likely to be. However, there are diminishing gains for increasingly large numbers of random starts. If running the actual experiment is expensive, it is generally recommended to choose as large a number of random starts as possible for the available time frame, to maximize the quality of the constructed design.
11. Once the slider has been set to the desired Number of Random Starts, click on the Run SDOE button, and initiate the construction of the designs. The progress bar indicates how design construction is advancing through the chosen range of designs between the specified Min Design Size and Max Design Size values.
12. When the SDOE module has completed the design creation process, the left window Created Designs will be populated with files containing the results. The column entries summarize the key features of each of the designs, including Optimality Method (whether minimax or maximin was selected), Design Size (d, the number of runs in the created design), # of Random Starts, Runtime (number of seconds needed to create the design), and Criterion Value (the value obtained for the minimax or maximin criterion for the saved design).

SDOE Created Designs
13. To see details of the design, the View button at the right hand side of each design row can be selected to show a table of the design, as well as a pairwise scatterplot of any subset of the input columns for the chosen design. The table and plot of the design are similar in characteristics to their counterparts described above for the candidate set.

SDOE table of created design

SDOE pairwise plot of created design
14. To access the file with the generated design, go to the SDOE_files folder, and a separate folder will have been created for each of the designs. In the example shown, 5 folders were created for the designs of size 4, 5, 6, 7 and 8, respectively. In each folder, there is a file containing the design, with a name that summarizes some of the key information about the design. For example, candidates_d6_n10000_w+G+lldg+L contains the design created using the candidate set called candidates.csv, with d=6 runs, based on n=10000 random starts, and based on the 4 inputs W, G, lldg and L.

SDOE directory
When one of the design files is opened it contains the details of each of the runs in the design, with the input factor levels that should be selected for that run.

SDOE file containing a created design
To evaluate the designs that have been created, it is helpful to look at a number of summaries, including the criteria values and visualizing the spread of the design points throughout the region. Recall that at the beginning of the design creation process we recommended constructing multiple designs, with different design sizes. By examining multiple designs, it is easier to determine which design is best suited to the requirements of the experiment.
In the Created Designs table, it is possible to see the criterion values for each of the designs. For minimax designs, the goal is to minimize how far away any point in the candidate set is from a design point. Hence, smaller values of this criterion are better. It should be the case that a larger design size will result in smaller values, as there are more design points to distribute throughout the input space, and hence any location should have a design point closer to it. When evaluating between different sizes of design, it is helpful to consider whether the improvement in the design criterion justifies the additional budget of a larger design.
For maximin designs, the goal is to maximize the distance between nearest neighbors for all design points. So for designs of the same size, we want the distance between neighboring points to be as large as possible, as this means that we have achieved near equal spacing of the design points. However, when we are comparing designs of different sizes, then the maximin criterion can be a bit confusing. Adding more runs to the design will mean that nearest neighbors will need to get closer together, and hence we would expect that on average the criterion value would get smaller for larger experiments. As with the minimax designs, we want to evaluate whether the closer packing of the design points from a larger experiment is worth the increase in cost for the additional runs.
Hint
Note that the criterion values for minimax and maximin should not be compared - one is comparing distances between design points and the candidate points, while the other is comparing distances between different design points.
For all of the designs, it is important to use the View option to look at scatterplots of the chosen design. When History points have been incorporated into the design, the plots will show how the overall collection of points fills the input space. When examining the scatterplots, it is important to assess (a) how close the design points have been placed to the edges of the region, (b) whether there are holes in the design space that are unacceptably large, and (c) whether a larger design shows a worthwhile improvement in the density of points to justify the additional expense.
Based on the comparison of the criterion values and the visualization of the spread of the points, the best design can be chosen that balances design performance with an appropriate use of the available budget. Recall that with sequential design of experiments, runs that are not used in the early stages might provide the opportunity for more runs at later stages. So the entire sequence of experimental runs should be considered when making choices about each stage.
Basic Steps for a Non-Uniform Space Design¶
We now consider some details for each of these steps for the second type of design, where we want to have different densities of design points throughout the chosen input region:
1. In the Ensemble Selection box, click on the Load from File button to select the file(s) to be used for the construction of the design. Several files can be selected and added to the box listing the chosen files.

SDOE Home Screen
2. For each of the files selected using the pull-down menu, identify them as either a Candidate file or a History file. Candidate .csv files are comprised of possible input combinations from which the design can be constructed. The columns of the file should contain the different input factors that define the dimensions of the input space, as well as a column that will be used to specify the weights associated with each of the design points. Note that there is a requirement for a column to be used to identify the prioritized regions of the input space. If this is not provided, then a non-uniform space filling design cannot be created.
History .csv files should have the same number of columns for the input space as the candidate file (with matching column names), and represent data that have already been collected. Note that a weight column is also required for the history file, as the calculation of how close each of the points are to each other requires this. The algorithm for creating the design aims to place points farther away from locations where data have already been obtained, while also filling the input space around those locations.
Both the Candidate and History files should be .csv files that have the first row as the Column headings. The Input and Weight columns should be numeric. Additional columns are allowed and can be identified as not necessary to the design creation algorithm at a later stage.
3. Click on the View button to open the Preview Inputs pop-up window, to see the list of columns contained in each file. The left hand side displays the first few rows of input combinations from the file. Select the columns that you wish to see graphically in the right hand box, and then click on Plot SDOE to see a scatterplot matrix of the data.

SDOE plot of candidate set inputs
The plot shows histograms of each of the columns on the diagonals to provide a view of the distribution of values as well as the range of each input. The off-diagonals show pairwise scatterplots of each pair of columns selected. This should provide the experimenter with the ability to assess if the ranges specified and any constraints for the inputs have been appropriately captured for the specified candidate set. In addition, repeating this process for any historical data will provide verification that the already observed data have been suitably characterized.
Note
In this file, the “Values” column contains the numbers that will be used to define the weights. The numeric values contained in this column do not have any restrictions, except that (a) a value must be provided for each row in the candidate set, and (b) larger values correspond to points that the user wishes to emphasize with regions containing a higher density of points in the constructed design.
4. Once the data have been verified for both the Candidate and History files, click on the Confirm button to make the Ensemble Aggregation window active.
5. If more than one Candidate file was specified, then the aggregate_candidates.csv file that was created will have combined these files into a single file. Similarly if more than one History file was specified, then the aggregate_history.csv file has been created with all runs from these files. If only a single file was selected for either of the Candidate or History files, then their corresponding aggregated files will be the same as the original.
There are options to view the aggregated files for both the candidate and history files, with a similar interface as was shown in step 3. In addition, a single plot of the combined candidate and history files can be viewed. In this plot the points representing the candidate locations and points of already collected data from the history file are shown in different colors.
6. Once the data have been verified as the desired set to be used for the design construction, click on the Non-Uniform Space Filling button at the bottom right corner of the Ensemble Aggregation window. This opens the second SDOE window, which allows for specific design choices to be made.

SDOE second window
7. Unlike the Uniform Space Filling designs, the choice of the optimality criterion to be used is fixed at maximin. Recall that a maximin design looks to choose design points that are as far away from each other as possible. In this case, the design criterion seeks to maximize a weighted measure of how far each design point is from its nearest neighbor. Larger weights inflate the calculated distance between points, so that points in highly weighted regions appear farther apart than their standard non-weighted Euclidean distance would suggest, which allows the design to place points more densely in those regions.
8. The next choice to be made falls under Scaling Method, where the experimenter can select how the column specified as the Weight column will be scaled. The scaling maps the values in that column onto the range [1, MWR], where MWR = Maximum Weight Ratio, which will be specified in the next step. The smallest value in the weight column (MinValue) gets mapped to the value 1, while the largest value in the column (MaxValue) gets mapped to the value MWR. For the Direct MWR option, the shape of the histogram of the values is preserved, through the formula:
Scaled Weight = 1 + ((MWR - 1)*(Value - MinValue)/(MaxValue - MinValue))
For the Ranked MWR option, the values are sorted from smallest to largest (ties allowed) and then assigned a rank. Rank = 1 corresponds to the smallest value, while the largest Rank is the number of rows in the candidate set (NumCand). Then the scaled weights are assigned through the formula:
Scaled Weight = 1 + ((MWR - 1)*(Rank - 1)/(NumCand - 1))
Note
The designs created are dependent on the choice of weights selected. The Ranked MWR choice creates a uniformly spaced order of points from “least important” to “most important” that results in a symmetric flat histogram for the weights, while the Direct MWR scaling preserves the shape of the original values. If the user is not sure which of the choices is better suited to their problem, we recommend generating designs for both choices and comparing the results to see which are a better match for desired spacing throughout the input space.
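The two scalings above can be written as short functions. The sketch below is a direct transcription of the formulas using NumPy; the example weight values are arbitrary, and ties are broken by position rather than averaged.
# Direct transcription of the Direct MWR and Ranked MWR scaling formulas.
import numpy as np

def direct_mwr(values, mwr):
    # Linearly map the raw weight values onto [1, MWR].
    v = np.asarray(values, dtype=float)
    return 1 + (mwr - 1) * (v - v.min()) / (v.max() - v.min())

def ranked_mwr(values, mwr):
    # Replace values by their ranks (1 = smallest; ties broken by position),
    # then map the ranks onto [1, MWR].
    v = np.asarray(values, dtype=float)
    ranks = v.argsort().argsort() + 1
    return 1 + (mwr - 1) * (ranks - 1) / (len(v) - 1)

raw_weights = np.array([-14.48, 0.0, 12.5, 50.0])   # arbitrary example values
print(direct_mwr(raw_weights, mwr=5))
print(ranked_mwr(raw_weights, mwr=5))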
9. Next, there are options for the values of the Maximum Weight Ratio (MWR) to be used. This is an important step in the Non-Uniform Space Filling design process, as it gives the user control over how much the density of points varies. Smaller values of MWR (close to 1) result in a nearly uniform design. Larger values result in a design that has a higher density of design points in the more highly weighted regions, and a sparser spread in the lower weighted regions. Since the impact of this value on the density of the design is also a function of the histogram of the values in the Weight column and the choice of the Scaling Method, we recommend constructing designs for several MWR values and comparing their results.
The user can specify up to 5 MWR values, where for each of the MWR boxes there is a set of choices that range from 2 to 60. This range should provide considerable flexibility in choosing how unequal the spacing will be throughout the design space.

Choice of MWR Value and Columns
Note
Here are some recommendations about the role of the MWR value and the choice of scaling:
- Think about changes to the MWR as multiplicative or exponential (e.g. 1 - 2 - 4 - 8 - 16), not linear (e.g. 1 - 2 - 3 - 4 - 5).
- If there are many candidate points that should be weighted approximately equally, the direct weight scaling might be more appropriate. The ranked weighting tends to spread out the final weights for similar values.
- If the original candidate set weight distribution is close to uniformly distributed, then the Ranked MWR and Direct MWR scalings will produce very similar designs.
- The ranked scaling for weights makes it easier to predict what the impact of a choice of MWR value will be (since the initial weight distribution is always approximately the same).
- As the skew of the direct weight distribution increases, the effective MWR becomes consistently smaller than the chosen value (only a small fraction of the candidates use the edges of the [1, MWR] range). Hence, for skewed distributions, a larger MWR might be needed for the Direct scaling to get a design that is similar to a given MWR value for the Ranked weight scaling.
Also in this step, the columns of the candidate set to be used for the construction of the design are identified. Under Include? in the box on the right hand side, the experimenter has the option of choosing whether particular columns should be included in the space-filling design search. Uncheck a box, if a particular column should not be included in the search.
Next select the Type for each column. Typically most of the columns will be designated as Inputs, which means that they will be used to define the input space and to find the best design. For the Non-Uniform Space Filling design, there is a required Weight column, which designates which rows in the candidate set to emphasize (bigger weights) and which to de-emphasize (smaller weights). In addition, we recommend including one Index column which contains a unique identifier for each run of the candidate set. This makes it easier to track which runs are included in the constructed designs. If no Index column is specified, a warning appears later in the process, but this column, while recommended, is not strictly required.
Finally, the Min and Max columns in the box allow the range of values for each input column to be specified. The default is to extract the smallest and largest values from the candidate and history data files, and use these. This approach generally works well, as it scales the inputs to be in a uniform hypercube for comparing distances between the design points.
Note
The default values for Min and Max can generally be left at their defaults unless: (1) the ranges of some inputs represent very different amounts of change in the process. For example, if temperature is held nearly constant while a flow rate changes substantially, then it may be desirable to extend the range of the temperature beyond its nominal values to make the amount of change in temperature more commensurate with the amount of change in the flow rate. This is a helpful strategy to make the calculated distance between any points a more accurate reflection of how much of an adjustment each input requires. (2) Changes are made in the candidate or history data files. For example, if one set of designs is created from one candidate set, and then another set of designs is created from a different candidate set, these designs and the achieved criterion values will not be comparable unless the range of each input has been fixed at matching values.
10. Once the design choices have been made, click on the Test SDOE button. This generates a small number of iterations of the search algorithm to calibrate the timing for constructing and evaluating the designs. The time taken to generate a design is a function of the size of the candidate set, the size of the design, as well as the dimension of the input space.

Test SDOE timing
Note
The number of random starts looks very different from what was done with the Uniform Space Filling design. In that case, the number of random starts was offered in powers of 10. In this case, since a more sophisticated search algorithm is being used, each random start takes longer to run, but generally many fewer starts are needed. There is a set of choices for the number of random starts, which ranges from 10 to 1000. Producing a sample design for demonstration purposes with a small number of random starts (say 10 to 30) should work adequately, but recall that the choice of Number of Random Starts involves a trade-off between the quality of the design generated and the time to generate the design. The larger the chosen number of random starts, the better the design is likely to be. However, there are diminishing gains for increasingly large numbers of random starts. If running the actual experiment is expensive, it is generally recommended to choose as large a number of random starts as possible for the available time frame, to maximize the quality of the design generated.

Number of Random Start choices
11. Once the slider has been set to the desired Number of Random Starts, click on the Run SDOE button, and initiate the construction of the designs. The progress bar indicates how design construction is advancing through the chosen range of designs for each of the MWR values specified.
12. When the SDOE module has completed the design creation process, the left window Created Designs will be populated with files containing the results. The column entries summarize the key features of each of the designs, including MWR, Design Size (d, the number of runs in the created design), # of Random Starts (n), Runtime (number of seconds needed to create the design), and Criterion Value (the value obtained for the maximin criterion for the saved design). Note that the criterion values are specific to the MWR value chosen, and hence should not be considered comparable across different values.

SDOE Created Designs
13. As with the Uniform Space Filling designs, to see details of the design, the View button at the right hand side of each design row can be selected to show a table of the design, as well as a pairwise scatterplot of the input and weight columns for the chosen design. The table and plot of the design are similar in characteristics to their counterparts for the candidate set. If multiple designs were created with different MWR values (or using the different Scaling Method choices), it is helpful to examine the plots to compare their properties to those sought by the experimenter. A final choice should be made based on what is needed for the goals of the study.
14. Similar to the Uniform Space Filling designs, to access the file with the generated design, go to the SDOE_files folder, and a separate folder will have been created for each of the designs. The structure of the folder and files corresponds to what was done in the Uniform Space filling design instructions. The labeling of the files is a bit different to reflect the choices that the user made in creating the design. For example, the file nusf_d10_n1000_m30_Label+w+G+lldg+L+Values.csv contains the design of size 10 (d10), generated from 1000 random starts (n1000), with the maximum weight ratio (MWR) set to 30 (m30). The columns from the file that were used include “Label”, “w”, “G”, “lldg”, “L” and “Values”.
When one of the design files is opened it contains the details of each of the runs in the design, with the input factor levels that should be set for that run.
To evaluate and compare the designs that have been created, it is helpful to look at a number of summaries, including the criteria values and visualizing the spread of the design points throughout the region. Recall that at the beginning of the design creation process we recommended constructing multiple designs, with different MWR values, choosing between the Direct and Ranked weighting strategies, and potentially with different design sizes. By examining multiple designs, it is easier to determine which design is best suited to the requirements of the experiment.
In the Created Designs table, it is possible to see the criterion values for each of the designs. When comparing two designs of the same size with the same MWR value, the maximin criterion should be made as large as possible. However, comparisons between designs with the same MWR value but of different sizes share the same issues that were present in the uniform space filling case. Adding more runs to the design will mean that nearest neighbors will need to get closer together, and hence we would expect that on average the criterion value would get smaller for larger experiments. Hence, we want to evaluate whether the closer packing of the design points from a larger experiment is worth the increase in cost for the additional runs.
Making comparisons for designs with different MWR values based on the design criterion is not recommended, because the distance metric that is embedded in the non-uniform space filling design approach adjusts based on the selected MWR value. Hence, it is not possible to make a direct comparison or easy interpretation of the values from the criterion for this approach.
Hence for the NUSF designs, it is critical to use the View option to look at graphical summaries of the designs. Two plots are produced: The first is the Closest Distance by Weight (CDBW) plot, and the second is the more familiar pairwise scatterplot of the created design.
First, we describe the information that is contained in the CDBW plot. There are two portions to the plot. The lower section shows a histogram of the weights in the candidate set. Note that the range of values goes from 1 to the MWR value selected. For the figure below, we are looking at a design created with a MWR value of 5. The shape of the histogram shows what values were available to be selected from the candidate set. The top portion of the plot has a vertical line for each of the design points selected (in this case 15 vertical lines for 15 design points). The location of each vertical line shows the weight for the selected design point.

A sample Closest Distance by Weight (CDBW) plot for a 15-run design with MWR value of 5
Second, a pairwise scatter plot of the design is provided to see how the design points fill the input space. Since the spread of the points throughout the design space is intentionally non-uniform, it is helpful to see how the distribution matches up with the specified weights provided in the candidate set. Recall that larger values of MWR lead to designs that are less evenly distributed, while MWR values that approach 1 will become closer to uniform.

A sample pairwise scatterplot for the constructed design with 15 runs and a MWR value of 5
When History points have been incorporated into the design, the plots will show how the overall collection of points fills the input space. When examining the scatterplots, it is important to assess (a) whether the increase in concentration of points is located in the desired region, (b) whether the degree of non-uniformity is what was desired, (c) how close the design points have been placed to the edges of the region, (d) whether there are holes in the design space that are unacceptably large, and (e) whether a larger design shows a worthwhile improvement in the density of points to justify the additional expense.
Recall that the effect of different MWR values depends on the size of the design, the spread of weights provided across the candidate points and the shape of the input region of interest. Hence, constructing several designs and comparing them can be an effective approach for obtaining the right design.
Based on the visualization of the spread of the points, the best design can be chosen that balances design performance with an appropriate use of the available budget. Recall that with sequential design of experiments, runs that are not used in the early stages might provide the opportunity for more runs at later stages. So the entire sequence of experimental runs should be considered when making choices about each stage.
Efficient Implementation of Experimental Run Order¶
Once designs have been created, it is often important to optimize the run order so that equilibrium is reached efficiently and the maximum number of runs can be implemented within a constrained budget or time period. While statisticians generally recommend using a randomized order for the experimental runs, an optimized order can sometimes mean the difference between a small randomized experiment and a larger non-randomized experiment.

Comparison of the number of runs possible with an optimized run order (left) versus an inefficient randomized run order (right)
In this section we describe how to generate an efficient run order for a design created using the Uniform Space Filling or Non-Uniform Space Filling design options.
Once we have created a design (USF or NUSF), it appears in the Created Designs table on the left panel. Click on the design that we want to order (it is highlighted in blue as shown below). Then click on the button below named Order Design to order the design points in an efficient run order that sequences the runs to favor having nearby points adjacent to each other in the run order.

How to create an ordered design
A pop-up window confirms the location of the newly ordered file (see below). Click ‘Yes’ to continue.

Message window for new design created
Both design files (located in the designated folder) are saved in .csv format, which can be opened with your preferred application (e.g. Microsoft Excel). You can produce a scatterplot of the ordered design file either using FOQUS or any other external application.
The ordering scheme provides a method for the user to arrange the experimental run order so that it follows the minimal path distance to traverse from one design point to another, i.e., minimal changes to the experimental process between runs. The procedure standardizes the range of each input factor to be between 0 and 1, and then minimizes the sum of the Euclidean distances between consecutive points in the run order. Often this is a preferred operational implementation to increase the efficiency of the experiment, by reducing the time for the process to reach equilibrium. The implementation provided uses the TSP (travelling salesperson) algorithm as implemented in the mlrose Python library (which builds on scikit-learn) for ordering/ranking the design points.
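The sketch below illustrates the idea with a simple greedy nearest-neighbour pass over inputs standardized to [0, 1]. FOQUS itself uses the mlrose TSP solver for this step, so this is only a conceptual stand-in, and the example design values are arbitrary.
# Conceptual sketch of ordering design runs to reduce travel between points.
# FOQUS uses a TSP solver (mlrose); this greedy nearest-neighbour pass is
# only an illustration of the idea, not the actual implementation.
import numpy as np

def greedy_run_order(design):
    # Return a run order that always hops to the nearest unvisited run.
    x = np.asarray(design, dtype=float)
    # Standardize each input factor to [0, 1] before measuring distances.
    x = (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))
    remaining = list(range(len(x)))
    order = [remaining.pop(0)]          # start from the first run
    while remaining:
        last = x[order[-1]]
        dists = [np.linalg.norm(x[i] - last) for i in remaining]
        order.append(remaining.pop(int(np.argmin(dists))))
    return order

design = np.array([[1.0, 0.2], [0.0, 0.0], [0.9, 0.1], [0.1, 0.3]])  # arbitrary runs
print(greedy_run_order(design))   # -> [0, 2, 1, 3]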
An alternative to this approach is a simple sequential ordering (ascending or descending) of the most expensive input factor. This is easily implemented by the user, and can be efficient for the running of the experiment, but should be used cautiously because the run order might confound other changes in the system during the implementation of the experiment.
Examples¶
Next, we illustrate the use of the SDOE capability for several different scenarios. Example USF-1 constructs several uniform space filling designs of size 8 to 10 runs for a 2-dimensional input space based on a regular square region with a candidate set that is a regularly spaced grid. Both minimax and maximin designs are constructed to illustrate the difference in the criteria. Example USF-2 takes one of the designs created in Example 1, and considers how it might be used for sequential updating with additional experimentation. In this case the Example 1 design is considered as historical data, and the goal is to augment the design with several additional runs. Example USF-3 considers a 5-dimensional input space based on a CCSI example, and demonstrates what the process of Sequential Design of Experiments might look like with several iterations of constructing uniform space filling designs.
Example NUSF-1 constructs several non-uniform space filling designs of size 15 in a 2-dimensional regular input space. Several designs are generated using the same weights, but with different Maximum Weight Ratios (MWRs), to illustrate how the concentration of points can be altered to match the experimenter’s preferences. Example NUSF-2 considers a CCSI example, with a non-regular region, and weights that were derived from the width of the confidence interval for prediction based on an existing model. The goal is to concentrate more of the new runs in regions where there is greater uncertainty, and hence where the widths of the confidence intervals are larger. Again, multiple designs are created to show how the MWR influences the concentration of the points in different regions.
The files for these tutorials are located in: examples/tutorial_files/SDOE
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
Example USF-1: Constructing Uniform Space Filling minimax and maximin designs for a 2-D input space¶
For this first example, the goal is to construct a simple space-filling design with between 8 and 10 runs in a 2-dimensional space based on a regular unconstrained square region populated with a grid of candidate points.
- From the FOQUS main screen, click the SDOE button. On the top left side, select Load from File, and select the SDOE_Ex1_Candidates.csv file from examples folder. This identifies the possible input combinations from which the design will be constructed. The more possible candidates that can be provided to the search algorithm used to construct the design, the better the design might be for the specified criterion.

Ex 1 Ensemble Selection
- Next, by selecting View and then Plot it is possible to see the grid of points that will be used as the candidate points. In this case, the range for each of the inputs, X1 and X2, has been chosen to be between -1 and 1.

Ex 1 Candidate Grid
- Next, click on Confirm to advance to the Ensemble Aggregation Window, and then click on Uniform Space Filling to advance to the second SDOE screen, where particular choices about the design can be made. On the second screen, select minimax for the Optimality Method Selection. Change the Min Design Size and Max Design Size to 8 and 10, respectively. This will construct 3 minimax designs of size 8, 9 and 10. Next, change the column called Label to be Index. This will mean that the design is not constructed using this as an input, but rather that the unique identifiers in this column will help identify which runs from the candidate set were chosen for the final designs. Since the ranges of each of X1 and X2 are the bounds that we want to use for creating this design, we do not need to change the entries in Min and Max.

Ex 1 Minimax design choices
- Once the choices for the design have been specified, click on the Test SDOE button to estimate the time taken for creating the designs. For the computer on which this example was developed, if we ran the minimum number of random starts (10^3=1000), it is estimated that the code would take 4 seconds to create the three designs (of size 8, 9 and 10). If we chose 10^4=10000 runs, then the code is estimated to take 44 seconds. It is estimated that 10^5=100000 random starts would take 7 minutes and 11 seconds, while 10^6=1 million random starts would take approximately 1 hour, 12 minutes. In this case, we selected to create designs based on 100000 random starts, since this was a suitable balance between timeliness and giving the algorithm a chance to find the best possible designs. Hence, select 10^5 for the Number of Random Starts, and then click Run SDOE.
- Since we are also interested in examining maximin designs for the same scenario, we click on the Reload Design Specifications button in the Created Design window to repopulate the right window with the same choices that we made for all of the design options.

Ex 1 Minimax created designs
- After changing the Optimality Method Selection to maximin, click on Test SDOE, select 10^5 for the Number of Random Starts, and then click Run SDOE. After waiting for the prescribed time, the Created Designs window will have 6 created designs - three that are minimax designs and three that are maximin designs.

Ex 1 Created designs
- We now consider the choices between the designs to determine which is the best match for our experimental goals. We can see a list of the selected design points by clicking View for any of the created designs, and Plot allows us to see the spread of the design points throughout the input region.
Clearly, there is a trade-off between the cost of the experiment (larger number of runs involve more time, effort and expense) and how well the designs fill the space. When choosing which of the designs is most appropriate for the experiment, it is important to remember that resources spent early in the process cannot be used later, so it is helpful to balance early learning about the process, with the ability to identify and focus on the desired optimal location later.
There is also a small difference in priorities between the minimax and the maximin criteria. Minimax seeks to minimize how far any candidate point (which defines our region of interest) is from a design point. Maximin seeks to spread out the design points and maximize how close the nearest points are to each other. As noted previously, minimax designs tend to avoid putting too many points on the edge of the region, while maximin designs often place a number of points right on the edges of the input space.
By looking at the placement of the points, how well they fill the desired space, and the points’ proximity to the edge of the region, the user can find a good match for their experimental goals. After considering all of the trade-offs between the alternatives, select the design that best matches the goals of the experiment.
- The file for the selected design can be found in the SDOE_files folder. The design can then be used to guide the implementation of the experiment with the input factor levels for each run.
Example USF-2: Augmenting the Example USF-1 design in a 2-D input space with a Uniform Space Filling Design¶
In this example, we consider the sequential aspect of design by building on the first example's results. Consider the scenario where, based on the results of Example 1, the experimenter chose to implement and run the 8-run minimax design.
- In the Ensemble Selection box, click on Load from File to select the candidate set that you would like to use for the construction of the design. This may be the same candidate set that was used in Example 1, or it might have been updated based on what was learned from the first data collection. For example, if it was learned that one corner of the design space might not be desirable, then the candidate set can be updated to remove candidate points that are now considered undesirable. For the File Type leave the designation as Candidate.
To load in the experimental runs that were already collected, click on Load from File again, and select the design file that was created in the SDOE_files folder. This time, change the File Type to History. If you wish to view either of the candidate or history files, click on View to see either a table or plot.
- Click on the Confirm button at the bottom right of the Ensemble Selection box. This will activate the Ensemble Aggregation box.
- After examining that the desired files have been selected, click on the Uniform Space Filling button at the bottom right corner of the Ensemble Aggregation window. This will open the second SDOE window that shows the Sequential Design of Experiments Set-Up window on the right hand side.
- Select Minimax or Maximin for the type of design to create.
- Select the Min Design Size and Max Design Size to match what is desired. If you wish to just generate a single design of the desired size, make Min Design Size = Max Design Size. Recall that this will be the number of additional points that will be added to the existing design, not the total design size.
- Next, select the options desired in the box: a) Should any of the columns be excluded from the design creation? If yes, then uncheck the Include? box. b) For input factors to be used in the construction of the uniform space filling design, make sure that the Type is designated as Input. If there is a label column for the candidates, then designate this as Index. c) Finally, you can optionally change the Min and Max ranges for the inputs to adjust the relative emphasis given to distances in each input dimension.
- Once the set-up choices have been made, click Test SDOE to find out what the anticipated time is for generating designs based on different numbers of random starts.
- Select the number of random starts to use, based on available time. Recall that using more random starts is likely to produce a design that is closer to the overall best optimum.
- After the SDOE module has created the design(s), the left window Created Designs is populated with the new design(s). These can be viewed with the View option, where the plot now shows the History Data with one symbol, and the newly added possible design with another symbol. This allows better assessment of the appropriateness of the new design subject to the data that have already been collected.
- To access the file that contains the created designs, go to the SDOE_files folder. As before, a separate folder will have been created for each design.
- If there is a desire to do another set in the sequential design, then the procedure outlined above for Example 2 can be followed again. The only change will be that this time there will be 3 files that need to be imported: A Candidate file from which new runs can be selected, and two History files. The first of these files will be the selected design from Example USF-1, and the second the newly created design that was run as a result of Example USF-2. When the user clicks on Confirm in the Ensemble Selection window, the two History files will be aggregated into a single Aggregated History file.
Example USF-3: A Uniform Space Filling Design for a Carbon Capture example in a 5-D input space¶
In this example, we consider a more realistic scenario of a sequential design of experiment. Here we explore a 5-dimensional input space with G, lldg, CapturePerc, L and SteamFlow denoting the space that we wish to explore with a space-filling design. The candidate set, Candidate Points 8perc, contains 93 combinations of inputs that have been validated using an ASPEN model as possible combinations for this scenario. The goal is to collect 18 runs in two stages that fill the input space. There are some constraints on the inputs, that make the viable region irregular, and hence the candidate set is useful to avoid regions where it would be problematic to collect data.
- After selecting the SDOE tab in FOQUS, click on Load from File and select the candidate file, Candidate Points 8perc.

Ex 3 Ensemble Selection window
- To see the range of each input and how the viable region of interest is captured with the candidate set, select View and then plot. In this case we have chosen to just show the 5 input factors in the pairwise scatterplot.

Ex 3 plot of viable input space as defined by candidate set
- After clicking Confirm in the Ensemble Selection box, and then Uniform Space Filling from the Ensemble Aggregation box, the SDOE Set-up box will appear on the right side of the second window. Here, select the options desired for the experiment to be run. For the illustrated figure, we selected a Minimax design with 3 potential sizes: 10, 11, 12. We specified that the column Test No. will be used as the Index, and that G, lldg, CapturePerc, L and SteamFlow will define the 5 factors to be used as inputs. We unchecked the Include? box for CO2 captured since we do not want to use it in the design construction.

Ex 3 set-up window for first stage
- After running Test SDOE and selecting the number of random starts to be used, click Run SDOE. After the module has created the requested designs, they can be viewed and compared.

Ex 3 10,11,12 run designs created for first stage
- By clicking View and then Plot, the designs can be viewed. Suppose that the experimenter decides to use the 12 run design in the initial phase, then this would be the design that would be implemented and data collected for these 12 input combinations.

Ex 3 chosen experiment for first stage
- After these runs have been collected, the experimenter wants to collect additional runs. In this case, return to the first SDOE module window, and load in the candidate set (which can be changed to reflect any knowledge gained during the first phase, such as undesirable regions or new combinations to include). The completed experiment should also be included as a History file, by going to the SDOE_files folder and selecting the file containing the appropriate design.

Ex 3 ensemble selection box for second stage
- After clicking Confirm in the Ensemble Selection box, and then Uniform Space Filling from the Ensemble Aggregation box, the SDOE Set-up box will appear on the right side of the second window. Here, select the options desired for the experiment to be run. For the illustrated figure, we selected a Minimax design with a design size of 6 (to use the remaining available budget). We again specified that the column Test No. will be used as the Index, and that G, lldg, CapturePerc, L and SteamFlow will define the same 5 factors to be used as inputs.

Ex 3 setup box for second stage
- After running Test SDOE and selecting the number of random starts to be used, click Run SDOE. After the module has created the requested design, it can be viewed. After selecting View and then Plot, the experimenter can see the new design with the historical runs included. This provides a good plot that allows the complete sequence of two experiments to be examined as a combined set of runs. Note that the first and second stages are shown in different colors and with different symbols.

Ex 3 setup box for second stage
Example NUSF-1: Constructing Non-Uniform Space Filling maximin designs for a 2-D input space¶
For this first Non-Uniform Space Filling design example, the goal is to construct a non-uniform space-filling design with 20 runs in a 2-dimensional space based on a regular unconstrained square region populated with a grid of candidate points. The choice of how to construct the candidate set should be based on: a) the precision with which each of the inputs can be set in the experiment, and b) the timing for generating the designs. Note that the finer the grid that is provided in the candidate set, the longer the search algorithm will take to run for a given number of random starts. In general, a finer grid will give better options for the best design, but with diminishing returns after a large number of candidates have already been provided.
As noted previously in the Basics section, in addition to specifying the candidate point input combinations, it is also required to supply an additional column of weights. This column will provide the necessary information about which regions of the input space should be emphasized more, and which should be emphasized less. The figure below shows some of the characteristics of the candidate set.

Ex NUSF1 Candidate set of points with their associated weights. Left shows the underlying weight function used to generate the weights, and right shows the candidates with the size of each point proportional to its assigned weight.
The candidates are laid out in a regular grid with equal spacing between levels of each of X1 and X2. A contour plot of the weight function that was used to generate the weights is shown on the left side of the plot. The weights range from -14.48 to 50, with the largest values of the weights near the top left corner of the input space. The smallest values lie in the bottom right corner. On the right hand side, we can see a plot where the relative size of the points is proportionate to the size of the weight assigned to that candidate point. This second representation is helpful when the candidate points do not fall on a regular grid, or if the relationship for determining the weights is not smooth.
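For readers who want to build a similar candidate set themselves, the sketch below constructs a regular grid with a weight column using NumPy and pandas. The weight function shown is an arbitrary illustration, not the one used to create the NUSFex1.csv example, and the output file name is hypothetical.
# Generic sketch of building a gridded candidate set with a weight column.
# The weight function is an arbitrary illustration, not the NUSFex1.csv one.
import numpy as np
import pandas as pd

x1, x2 = np.meshgrid(np.linspace(-1, 1, 21), np.linspace(-1, 1, 21))
grid = pd.DataFrame({"X1": x1.ravel(), "X2": x2.ravel()})

# Larger weights near the top-left corner (X1 small, X2 large).
grid["RawWt"] = 50 * np.exp(-((grid["X1"] + 1) ** 2 + (grid["X2"] - 1) ** 2))

grid.to_csv("my_nusf_candidates.csv", index=False)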
Here is the process for generating NUSF designs for this problem:
- From the FOQUS main screen, click the SDOE button. On the top left side, select Load from File, and select the “NUSFex1.csv” file from the examples folder.

Ex NUSF1 choice of file for candidate set
- Next, by selecting View and then Plot it is possible to see the grid of points that will be used as the candidate points. In this case, the range for each of the inputs, X1 and X2, has been chosen to be between -1 and 1.
- Next, click on Confirm to advance to the Ensemble Aggregation Window, and then click on Non-Uniform Space Filling to advance to the second SDOE screen, where particular choices about the design can be made. On the second screen, the first choice, Optimality Method Selection, is automatic, since the non-uniform space filling designs only use the Maximin criterion. The next choice is the Scaling Method, where the options are Direct and Ranked. The default is Direct scaling, which translates the weights provided with a linear transformation so that they lie in the range 1 to whatever MWR value is selected below (a short illustrative sketch of this rescaling is given at the end of this step). For this example, we choose the option for Direct scaling.

Ex NUSF1 Choice of settings for generating NUSF designs
Next select the Design size, where here we have decided to construct a design with 20 runs. The choice of the Maximum Weight Ratio or MWR is one of the more difficult choices that the experimenter will need to make, since it is often one that they do not have much experience with. It is for this reason that we recommend constructing several designs with different MWR values and then comparing the results to see which value is best suited for the experiment to be run. Recall that a value of 1 corresponds to a uniform space filling design, while larger values will place increasing concentration of points near the regions with larger weight values. In this case, we select to generate 3 designs, with MWR values of 5, 10 and 20. This should give a good variety of designs to choose from after they have been constructed.
There are also choices for which columns to include in the analysis. Here we use all 3 columns for creating the design, so all Include? boxes remain checked. In addition, it is possible to see the range of values for each of the columns in the spreadsheet. Here the two input columns range from -1 to 1, while the “RawWt” column ranges from -14.48 to 50. The user can change these values if they wish to rescale the ranges to widen or narrow them, but in general these values can be left as is.
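To make the Direct scaling idea concrete, here is a minimal Python sketch of one way a raw weight column could be mapped linearly onto the range 1 to MWR. The function name and formula are illustrative assumptions; the SDoE module's internal implementation may differ.

import numpy as np

def direct_scale(raw_weights, mwr):
    # Linearly rescale the raw weights so that the smallest value maps to 1
    # and the largest value maps to the chosen MWR.
    w = np.asarray(raw_weights, dtype=float)
    return 1.0 + (mwr - 1.0) * (w - w.min()) / (w.max() - w.min())

# Example: raw weights in the NUSF-1 candidate set span -14.48 to 50.
raw = np.array([-14.48, 0.0, 20.0, 50.0])
print(direct_scale(raw, mwr=5))  # approximately [1.00, 1.90, 3.14, 5.00]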
- Once the choices for the design have been specified, click on the Test SDOE button to estimate the time needed to create the designs. For the computer on which this example was developed, with 30 random starts the algorithm is estimated to take 17:38 minutes to generate the 3 designs with MWR values of 5, 10, and 20. Note that the timing changes linearly, so using 40 random starts would take twice as long as using 20 random starts (a quick worked example is given after the figure below). Recall that the choice of the number of random starts involves a trade-off between getting the designs created quickly and the quality of the designs. For many applications, we would expect that using at least 30 random starts would produce designs that are of good quality.

Ex NUSF1 specification of timing to generate the requested designs.
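As a quick worked example of this linear scaling: on the computer used for this example, 17:38 (roughly 1058 seconds) for 30 random starts corresponds to about 35 seconds per random start for this set of three designs, so 60 random starts would be expected to take roughly twice as long, or about 35 minutes.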
- Once the algorithm has generated the designs, the left box called Created Designs populates with the 3 designs that we have created. Some of the key choices of the designs are summarized in the columns. The size of the design, the MWR value and the number of random starts are all noted. In addition, the time to create the design is also included. The criterion value is provided. Recall from the discussion in the Basics section, that the criterion value can be compared for designs of the same size and with the same MWR value, but should not be compared across design sizes or across different MWR values.

Ex NUSF1 created designs for three MWR values of 5, 10 and 20
- To examine each of the created designs, select View and choose the columns to be included, and click Plot. For this example we included all of the columns. Note that two plots are created for each design. The first is the Closest Distance by Weight (CDBW) plot, and the second is the more familiar pairwise scatterplot of the created design.
First, we describe the information that is contained in the CDBW plot. There are two portions to the plot. The lower section shows a histogram of the weights in the candidate set. Note that the range of values goes from 1 to the MWR value selected. For the figure below, we are looking at a design created with an MWR value of 5. The shape of the histogram shows what values were available to be selected from. The top portion of the plot has a vertical line for each of the design points selected (in this case 20 vertical lines for 20 design points). The location of each vertical line shows the weight for the selected design point. In this case, the smallest weight selected was a bit below 2, while several of the design points chosen have weights close to the maximum possible (the MWR value). This allows the user to see how much emphasis was placed on getting the larger weight values into the design. A minimal sketch of how such a plot can be assembled is given after the figure below.

Ex NUSF1 Closest Distance by Weight (CDBW) plot for the constructed design with an MWR value of 5
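As a rough illustration of how a CDBW-style figure is structured, the following Python sketch draws a two-panel plot from a set of scaled candidate weights and the weights of the selected design points. The data and layout are illustrative assumptions only; the plot produced by the SDoE module contains additional detail.

import numpy as np
import matplotlib.pyplot as plt

# Illustrative data: scaled candidate weights (range 1 to MWR, with MWR = 5)
# and the weights of 20 selected design points.
rng = np.random.default_rng(0)
candidate_weights = 1 + 4 * rng.random(500)
design_weights = np.sort(1 + 4 * rng.random(20))

fig, (top, bottom) = plt.subplots(2, 1, sharex=True, figsize=(6, 4))
# Top panel: one vertical line per selected design point, located at its weight.
top.vlines(design_weights, 0, 1, color="tab:blue")
top.set_yticks([])
top.set_title("Weights of selected design points")
# Bottom panel: histogram of the weights available in the candidate set.
bottom.hist(candidate_weights, bins=20, color="tab:gray")
bottom.set_xlabel("Scaled weight (1 to MWR)")
bottom.set_ylabel("Count")
plt.tight_layout()
plt.show()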
The second plot is the more familiar scatterplot of the design points. It is clear that the non-uniform space filling approach has lived up to its name and has generated a design that has a greater emphasis of points for the larger weights. The design still provides space filling throughout the region, but with very different densities of points for the various regions.

Ex NUSF1 pairwise scatterplot for the constructed design with an MWR value of 5
- The next step is to repeat the process for the other two designs created. In this case we can see that the NUSF designs for MWR values of 10 and 20 create even more concentrated designs in the region with higher weights. The figure below shows the CDBW plots for the designs with MWR values of 10 and 20.

Ex NUSF1 Closest Distance by Weight (CDBW) plot for the constructed designs with MWR values of 10 and 20
When we compare the three CDBW plots for the designs with MWR of 5, 10 and 20, we see that more of the points are shifted to the right closer to the maximum weight value as we increase the MWR value. This gives control to the user to adjust the relative density of points for different weights.

Ex NUSF1 pairwise scatterplot for the constructed designs with MWR values of 10 and 20
When we compare the three designs, we can see that increasing the MWR produces a design that moves more of the points closer to the higher weight regions of the input space. This gives the user the control needed to create a customized design that matches the desired concentration of points in the regions where they are desired. After examining the different summary plots for the three designs, the user can choose the design that is the best match to their experimental needs.
Example NUSF-2: Constructing Non-Uniform Space Filling for a 4-Input Carbon Capture example¶
For this second Non-Uniform Space Filling design example, we consider a carbon capture example with 4 inputs (G, lldg, w, L). In this case the experimenter is interested in constructing a 10 run design that is space filling, but also places a slightly higher emphasis on the region that is expected to contain the optimum of the process. The experimenter’s team of experts identify that the most likely location for that optimum is at G=2200, lldg=0.2, w=0.15 and L=8000. As such, they construct a set of weights that are highest at this location in the input space and taper away the further the inputs are from that optimum (a small illustrative sketch of constructing such weights is given after the figure below). The figure below shows the set of 526 candidate points that take into account the constraints in the region, where running an experiment at those locations would not yield a desirable outcome or perhaps would not even generate any response. The red triangle indicates the identified likely optimum for all pairwise scatterplots above the diagonal. The size of the symbols is scaled to be proportional to the weights at each location, with the largest points near the optimum and tapering away as we move to the extremes of the input space.

Example NUSF2 pairwise scatterplot of the candidate set with the anticipated optimum location shown with red triangles
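One simple way to construct weights of this kind is to compute, for each candidate, a scaled distance to the anticipated optimum and let the weight decay with that distance. The sketch below illustrates the idea only, under assumed column names and an assumed linear taper; the weights supplied in “CCSIex.csv” were produced by the example's authors and need not follow this exact formula.

import numpy as np
import pandas as pd

# Anticipated optimum from the example description (assumed column names).
optimum = {"G": 2200.0, "lldg": 0.2, "w": 0.15, "L": 8000.0}

def taper_weights(candidates, max_weight=10.0):
    # Normalize each input by its range so all inputs contribute comparably,
    # then let the weight decrease linearly with distance from the optimum.
    scaled_sq = 0.0
    for col, opt in optimum.items():
        col_range = candidates[col].max() - candidates[col].min()
        scaled_sq = scaled_sq + ((candidates[col] - opt) / col_range) ** 2
    dist = np.sqrt(scaled_sq)
    return max_weight * (1.0 - dist / dist.max())

# Usage, assuming the candidate file has columns G, lldg, w and L:
# cand = pd.read_csv("CCSIex.csv")
# cand["Weights"] = taper_weights(cand)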
Here is the process for generating NUSF designs for this problem:
- From the FOQUS main screen, click the SDOE button. On the top left side, select Load from File, and select the “CCSIex.csv” file from the examples folder.

Example NUSF2 choice of file for candidate set
- Next, by selecting View and then Plot it is possible to see the pairwise scatterplot of all of the candidate points. Note that in this file there are 6 columns - the Label column will be used to identify which of the candidates are selected in the constructed designs. The Weights column summarizes how desirable a candidate point is by its proximity to the anticipated optimum location.

Example NUSF2 top of file with candidate points
- Next, click on Confirm to advance to the Ensemble Aggregation Window, and then click on Non-Uniform Space Filling to advance to the second SDOE screen, where particular choices about the design can be made. On the second screen, the first choice for Optimality Method Selection is automatic, since the non-uniform space filling designs only use the Maximin criterion.
The next choice is the Scaling Method, where the options are Direct and Ranked. The default is Direct scaling, which translates the weights provided with a linear transformation so that they lie in the range 1 to whatever MWR value is selected below. For this example, we will explore what difference the choice of the scaling method makes on the resulting designs, but we begin by choosing the option for Direct scaling.

Ex NUSF2 Choice of settings for generating NUSF designs
Next select the Design size, where here we have decided to construct a design with 10 runs. The choice of the Maximum Weight Ratio or MWR for this example reflects that we wish to have a design that is still space filling throughout the input region, but with a slightly emphasized concentration near the anticipated optimum. Hence we will select small values that are not too far away from 1 (which represents a uniform space filling design). Because it is not always easy to judge the impact of the choice of MWR value, we recommend constructing several designs with different MWR values and then comparing the results to see which value is best suited for the experiment to be run. In this case, we choose to generate 2 designs, with MWR values of 2 and 5. This should provide some variety of designs to choose from after they have been constructed.
- Once the choices for the design have been specified, click on the Test SDOE button to estimate the time needed to create the designs. For the computer on which this example was developed, with 30 random starts the algorithm is estimated to take 15:08 minutes to generate the 2 designs with MWR values of 2 and 5. Note that the timing changes linearly, so using 40 random starts would take twice as long as using 20 random starts. Recall that the choice of the number of random starts involves a trade-off between getting the designs created quickly and the quality of the designs. For many applications, we would expect that using at least 30 random starts would produce designs that are of sufficient quality.

Ex NUSF2 specification of timing to generate the requested designs.
- Once the algorithm has generated the designs, the left box called Created Designs populates with the 2 designs that we have created. Some of the key choices made by the experimenter for the designs are summarized in the columns. The size of the design, the MWR value and the number of random starts are all noted. In addition, the time to create the design is also included. The criterion value is provided. Recall from the discussion in the Basics section, that the criterion value can be compared for designs of the same size and with the same MWR value, but should not be compared across design sizes or across different MWR values.
- To examine each of the created designs, select View and choose the columns to be included, and click Plot. For this example we included only the 4 input columns to keep each plot to a moderate size. Note that two plots are created for each design. The first is the Closest Distance by Weight (CDBW) plot, and the second is the more familiar pairwise scatterplot of the created design.
Recall that there are two portions to the CDBW plot. The lower section shows a histogram of the weights in the candidate set. Note that the range of values goes from 1 to the MWR value selected. For the figure below, we are looking at a design created with an MWR value of 2. The shape of the histogram shows what values were available to be selected from. The top portion of the plot has a vertical line for each of the design points selected (in this case 10 vertical lines for 10 design points). The location of each vertical line shows the weight for the selected design point. This allows the user to see how much emphasis was placed on getting the larger weight values into the design.

Ex NUSF2 Closest Distance by Weight (CDBW) plot for the constructed 10 run design with an MWR value of 2
In looking at the location of the vertical lines in the top of the CDBW plot, we see that some locations in the input space have been chosen across the majority of the range of the weight values. This reflects the relatively small MWR value of 2 that was selected.
The second plot is the more familiar scatterplot of the design points. This shows the location of the 10 selected design points in the 4 dimensional input space. The points look to cover much the same region as the overall candidate points, but with a slight concentration of points closer to the anticipated optimum.

Ex NUSF2 pairwise scatterplot for the constructed 10 run design with an MWR value of 2
- Next we consider reproducing the same designs, but now selecting the Ranked scaling option to see how this changes the results of the constructed design. We repeat the earlier steps for the SDoE module with the same candidate set file, “CCSIex.csv”, and keep all of the choices for the design the same, except that this time we choose Ranked as the Scaling Method.

Ex NUSF2 Choice of settings for generating NUSF designs with Ranked as the Scaling Method
We again construct designs with MWR values of 2 and 5. The time required to generate these designs will be approximately the same as for the other choice of scaling method.
To compare the designs, we can examine the CDBW plots for all 4 of the constructed designs. The figure below shows the CDBW plots for all 4 designs.

Ex NUSF2 Comparison of the CDBW plots for designs with MWR values of 2 and 5 with both Direct and Ranked scaling
To understand the differences between the choices, we note the following points. (a) Note that for the top row of CDBW plots, those associated with the Direct scaling, the shape of the histogram for the candidate set is the same as for the original unscaled weights provided in the candidate set. In this case, we have a skewed distribution with very few small weights. (b) In contrast, the bottom row of CDBW plots is for the Ranked scaling, and the shape of the histogram is quite different from what was obtained with the Direct weighting. As is typical of the Ranked scaling, we obtain an even histogram with nearly the same count in each bar. (c) Next, when we compare the left (MWR=2) and right (MWR=5) plots, we see that the left plots have a more evenly spread set of weights selected across the entire range of values. For the MWR=5 plots, we see that there is a greater concentration of larger weights that have been selected. (d) To select the design that is best suited for the goal of the experiment, it is helpful to think about how non-uniform the spread of points should be, and how big the gaps are where no runs will be collected. The 4 sets of pairwise scatterplots can be helpful to see where the gaps exist. The scatterplots are slightly harder to interpret as the number of factors increases, but the histograms for each input can give a good idea of how the runs are spread across the range of each input. A small sketch contrasting the two scaling methods is given after the figure below.

Ex NUSF2 Comparison of the pairwise scatterplots for designs with MWR values of 2 and 5 with both Direct and Ranked scaling
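To complement the Direct scaling sketch shown in the NUSF-1 example, here is a minimal illustration of how Ranked scaling flattens a skewed set of raw weights: the weights are replaced by their ranks before being mapped linearly onto the range 1 to MWR, which is why the resulting histogram has nearly the same count in each bar. The function name and ranking convention are illustrative assumptions, not the module's exact implementation.

import numpy as np

def ranked_scale(raw_weights, mwr):
    # Replace each raw weight by its rank (1 = smallest), then map the ranks
    # linearly onto the range [1, MWR].
    ranks = np.argsort(np.argsort(raw_weights)) + 1.0
    return 1.0 + (mwr - 1.0) * (ranks - 1.0) / (ranks.max() - 1.0)

# A skewed set of raw weights: very few small values, many large ones.
raw = np.array([0.1, 6.0, 7.0, 8.0, 9.0, 10.0])
print(ranked_scale(raw, mwr=2))  # evenly spaced values 1.0, 1.2, ..., 2.0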
Heat Integration¶
Tutorial¶
Tutorial: Heat Integration with FOQUS¶
The files for this tutorial are located in: examples/tutorial_files/Heat_Integration
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
Motivation:¶
Methanol Production involves heating and cooling of process streams at different stages of the process, mainly fresh feed intercooling between compressors, mixed feed preheating before the reactor, intermediate cooling before flash, and heating of products and byproducts from the flash.

As shown in the figure above, there are 2 hot streams being cooled in C1 and C2, and 3 cold streams being heated in H1, H2, and H3. Clearly, there is a potential to perform heat integration among these process streams in order to minimize total utility and energy consumption while achieving the target temperatures.
Aim:¶
The aim of this tutorial is to implement heat integration for an Aspen Plus methanol production flowsheet, by using the heat integration plugin within FOQUS, in order to obtain the minimum utility consumption of the process.
Procedure:¶
Firstly, a SimSinter Configuration file must be created for the Aspen Plus backup file located in examples/tutorial_files/Heat_Integration, which contains the simulation model. Note: Ensure that Aspen v10 is used for this example. Select the fresh feed flowrate and temperature as “input variables”, and the inlet and outlet temperatures of the process streams passing through all heaters and coolers (F2, F3, RF1, RF2, RP2, RP3, B2, BY-PROD, P1, PROD), along with the heat duty of each heater and cooler, as “output variables”.
Once the SimSinter file is saved in .json format, upload it to turbine and keep the simulation name as “MethanolHI”.
In the Flowsheet Window, add a node named “methanol_HI” which would contain the simulation.
Open the node editor for the given node, select model type as “Turbine” and model as “MethanolHI”. All the selected input and output variables of the simulation should be visible in the GUI.
Add heat integration tags beside each output variable. In this case, the order of tags for heat duty and temperature variables is as follows (an illustrative example is given below):

Heat Duty of Heaters/Coolers: [“Block name”, “Blk_Var”, “heater”, “Q”], where “Block name” is the block name of each heater/cooler in the Aspen model.

Inlet/Outlet temperatures: [“Block name”, “Port_Material_In/Out”, “heater”, “T”], where “Block name” is the block name of the heater/cooler in the Aspen model associated with the concerned inlet/outlet stream.
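For example, for the cooler labeled C1 in the flowsheet figure above, the tags could look like the following (the block name “C1” is taken from the figure and is only illustrative; use the actual block names from your own Aspen model):

Heat duty of C1: ["C1", "Blk_Var", "heater", "Q"]
Inlet temperature of the stream entering C1: ["C1", "Port_Material_In", "heater", "T"]
Outlet temperature of the stream leaving C1: ["C1", "Port_Material_Out", "heater", "T"]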
NOTE: Ensure that all the variables are of the type “float” in the GUI.
Run the flowsheet simulation node for testing once. The heat integration tags for output variables are seen in the rightmost column of the node editor, as shown below:
Add another node to the flowsheet window named “HI”
Open the node editor for it, and enter the heat integration plugin. In its input variables, enter the number of streams as 5. Keep all other input values at their defaults.
Connect both the nodes through an edge connector.
Run the flowsheet simulation.
PYOMO-FOQUS¶
Tutorial¶
Tutorial: Running PYOMO Optimization Model in FOQUS¶
Consider the following optimization problem to be solved with FOQUS using PYOMO.
\(\min \; y\)

Subject to:

\(a \cdot x_1 + b \cdot x_2 \geq c\)

\(x_1 + x_2 = y\)

\(x_1, x_2 > 0\)
The complete FOQUS file (Pyomo_Test_Example.foqus), with the code written, is located in: examples/tutorial_files/PYOMO
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
Instructions¶
Open FOQUS, and under the Flowsheet Tab, create a Node.
Open the Node Editor, and let the Model Type be “None”.
Add the model parameters a, b, c as “Input Variables” within the Node Editor, with values 1, 2, 3 respectively.

Add x1, x2, y, converged, and optimal as “Output Variables” within the Node Editor. Note that x1, x2 and y correspond to the optimization variable values. converged is meant to be a binary variable that would denote whether the optimization model has converged, by checking the solver status. optimal is meant to be a binary variable that would denote whether the solver returns an optimal solution.

Under Node Script, set Script Mode to “Post”. This will ensure that the node script runs after the node simulation. Enter the following PYOMO code for the optimization model:
 1   from pyomo.environ import (Var,
 2                              Constraint,
 3                              ConcreteModel,
 4                              PositiveReals,
 5                              Objective)
 6   from pyomo.opt import SolverFactory
 7   import pyutilib.subprocess.GlobalData
 8   pyutilib.subprocess.GlobalData.DEFINE_SIGNAL_HANDLERS_DEFAULT = False
 9
10   m = ConcreteModel()
11   m.x1 = Var(within=PositiveReals)
12   m.x2 = Var(within=PositiveReals)
13   m.y = Var()
14   m.c1 = Constraint(expr=x["a"]*m.x1 + x["b"]*m.x2 >= x["c"])
15   m.c2 = Constraint(expr=m.x1 + m.x2 == m.y)
16   m.o = Objective(expr=m.y)
17   opt = SolverFactory("ipopt")
18   r = opt.solve(m)
19   f["x1"] = m.x1.value
20   f["x2"] = m.x2.value
21   f["y"] = m.y.value
22   f["converged"] = (str(r.solver.status) == "ok")
23   f["optimal"] = (str(r.solver.termination_condition) == "optimal")
In the above code, lines 1-6 are used to import the PYOMO package and SolverFactory function to develop the model and solve it by accessing an appropriate solver.
A PYOMO Concrete Model is declared, defining the variables, declaring the constraints using the parameters defined within “Input Variables” of the Node, and defining the objective function with lines 10 to 16.
Line 17 sets the solver to ipopt and line 18 sends the problem to be solved to the solver. Ipopt is a nonlinear optimization solver.
Note
ipopt will need to be available in your environment. To install it into your conda environment you should use the command:
conda install -c conda-forge ipopt
The conda install method is preferred for Windows users.

Once the model is solved, the values of the decision variables x1, x2, y are assigned to the Node Output Variables in lines 19 to 21. The code lines 22 and 23 check the solver status and termination condition. If the solver status is “ok”, it means that the model has converged, and the ‘converged’ variable is assigned the value 1. Else, it is assigned the value 0, which means that the model has not converged. If the solver termination condition is “optimal”, it means that the solver has found an optimal solution for the optimization model. Else, the solution is either feasible if the solver status is “ok”, or infeasible altogether.
Click the Run button to run the python script and check the Node Output Variables section.
It should be noted that the parameter values within Node Input Variables can be changed as per the user’s requirements, to run different cases.
Note
For more information on building and solving pyomo models, refer to the pyomo documentation: https://pyomo.readthedocs.io/en/stable/solving_pyomo_models.html
IDAES-FOQUS¶
Tutorial¶
Tutorial: Running IDAES model in FOQUS¶
The NETL’s Institute for the Design of Advanced Energy Systems (IDAES) is developing an equation-oriented framework for simulation and optimization of energy systems. A library of unit models is available to create and solve process flowsheets. Therefore, a tutorial has been developed in FOQUS to import IDAES unit models, build a flowsheet, and simulate it.
The case study consists of the separation of a Toluene-Benzene mixture (Figure 1). First, the mixture is heated to 370 K, and then separated in the Flash Tank. Consider the following process flowsheet that has been developed in FOQUS, using IDAES:

Figure 1: Heater Flash Flowsheet
Feed Conditions:
Flowrate = 0.277778 mol/s
Temperature = 353 K
Pressure = 101325 Pa
Benzene Mole Fraction = 0.4
Toluene Mole Fraction = 0.6
Heater Specification:
Outlet Temperature (HTOUT stream) = 370 K
Flash Specification:
Heat Duty = 0 W
Pressure Drop = 0 Pa
The following steps show how to import Python, Pyomo, and IDAES libraries and models, build the flowsheet, select input variables, and solve the simulation in FOQUS:
Instructions¶
Open FOQUS, and under the Flowsheet Tab, create a Node named “Flowsheet”.
Open the Node Editor and let the Model Type be “None”.
Add the following input variables with their corresponding values in the Node Editor:
heater_inlet_molflow: 0.277778 mol/s
heater_inlet_pressure: 101325 Pa
heater_inlet_temperature: 353 K
heater_inlet_benzene_molfrac: 0.4
heater_inlet_toluene_molfrac: 0.6
heater_outlet_temperature: 370 K
flash_heat_duty: 0 W
flash_pressure_drop: 0 Pa

Add the following output variables in the Node Editor:

heater_heat_duty (W)
flash_liq_molflow (mol/s)
flash_liq_pressure (Pa)
flash_liq_temperature (K)
flash_liq_benzene_molfrac
flash_liq_toluene_molfrac
flash_vap_molflow (mol/s)
flash_vap_pressure (Pa)
flash_vap_temperature (K)
flash_vap_benzene_molfrac
flash_vap_toluene_molfrac
As stated in previous tutorials, the FOQUS simulation node allows the user to type a python script under the Node Script option. In this node script section, this tutorial shows how to import python libraries, Pyomo libraries, and IDAES libraries and models, and how to build and solve the flowsheet. Note that in this example, process conditions are fixed in order to have 0 degrees of freedom; hence, the optimization actually gets solved as a simulation problem. A critical step is to link the FOQUS variables (input and output) to the IDAES mathematical model, thus setting the inlet conditions of the process before solving the simulation problem. Finally, under Node Script, set Script Mode to “Post”. This will ensure that the node script runs after the node simulation. Enter the following code:
# Import objects from pyomo package
from pyomo.environ import ConcreteModel, SolverFactory, TransformationFactory, value
import pyutilib.subprocess.GlobalData
pyutilib.subprocess.GlobalData.DEFINE_SIGNAL_HANDLERS_DEFAULT = False

# Import the main FlowsheetBlock from IDAES. The flowsheet block will contain the unit model
import idaes
from idaes.core.flowsheet_model import FlowsheetBlock

# Import the BTX_ideal property package to create a properties block for the flowsheet
from idaes.generic_models.properties.activity_coeff_models import BTX_activity_coeff_VLE

# Import heater unit model from the model library
from idaes.generic_models.unit_models.heater import Heater

# Import flash unit model from the model library
from idaes.generic_models.unit_models.flash import Flash

# Import methods for unit model connection and flowsheet initialization
from pyomo.network import Arc, SequentialDecomposition

# Import idaes logger to set output levels
import idaes.logger as idaeslog

# Create the ConcreteModel and the FlowsheetBlock, and attach the flowsheet block to it.
m = ConcreteModel()
m.fs = FlowsheetBlock(default={"dynamic": False})  # dynamic or ss flowsheet needs to be specified here

# Add properties parameter block to the flowsheet with specifications
m.fs.properties = BTX_activity_coeff_VLE.BTXParameterBlock(
    default={"valid_phase": ('Liq', 'Vap'),
             "activity_coeff_model": "Ideal"})

# Create an instance of the heater unit, attaching it to the flowsheet
# Specify that the property package to be used with the heater is the one we created earlier.
m.fs.heater = Heater(default={"property_package": m.fs.properties})
m.fs.flash = Flash(default={"property_package": m.fs.properties})

# Connect heater and flash models using an arc
m.fs.heater_flash_arc = Arc(source=m.fs.heater.outlet, destination=m.fs.flash.inlet)
TransformationFactory("network.expand_arcs").apply_to(m)

# Feed Specifications to heater
m.fs.heater.inlet.flow_mol.fix(x["heater_inlet_molflow"])  # mol/s
m.fs.heater.inlet.mole_frac_comp[0, "benzene"].fix(x["heater_inlet_benzene_molfrac"])
m.fs.heater.inlet.mole_frac_comp[0, "toluene"].fix(x["heater_inlet_toluene_molfrac"])
m.fs.heater.inlet.pressure.fix(x["heater_inlet_pressure"])  # Pa
m.fs.heater.inlet.temperature.fix(x["heater_inlet_temperature"])  # K

# Unit model specifications
m.fs.heater.outlet.temperature.fix(x["heater_outlet_temperature"])  # K
m.fs.flash.heat_duty.fix(x["flash_heat_duty"])  # W
m.fs.flash.deltaP.fix(x["flash_pressure_drop"])  # Pa

# Flowsheet Initialization
def function(unit):
    unit.initialize(outlvl=1)

opt = SolverFactory('ipopt')
seq = SequentialDecomposition()
seq.options.select_tear_method = "heuristic"
seq.run(m, function)

# Solve the flowsheet using ipopt
opt = SolverFactory('ipopt')
solve_status = opt.solve(m)

# Assign the simulation result from IDAES model to FOQUS output values
f["flash_liq_molflow"] = value(m.fs.flash.liq_outlet.flow_mol[0])
f["flash_liq_benzene_molfrac"] = value(m.fs.flash.liq_outlet.mole_frac_comp[0, "benzene"])
f["flash_liq_toluene_molfrac"] = value(m.fs.flash.liq_outlet.mole_frac_comp[0, "toluene"])
f["flash_liq_temperature"] = value(m.fs.flash.liq_outlet.temperature[0])
f["flash_liq_pressure"] = value(m.fs.flash.liq_outlet.pressure[0])
f["flash_vap_molflow"] = value(m.fs.flash.vap_outlet.flow_mol[0])
f["flash_vap_benzene_molfrac"] = value(m.fs.flash.vap_outlet.mole_frac_comp[0, "benzene"])
f["flash_vap_toluene_molfrac"] = value(m.fs.flash.vap_outlet.mole_frac_comp[0, "toluene"])
f["flash_vap_temperature"] = value(m.fs.flash.vap_outlet.temperature[0])
f["flash_vap_pressure"] = value(m.fs.flash.vap_outlet.pressure[0])
f["heater_heat_duty"] = value(m.fs.heater.heat_duty[0])
Note
ipopt will need to be available in your environment. This should be available through the following command during the generic install of IDAES in the environment:
idaes get-extensions
Once the model is solved, the values of flowsheet output variables are assigned to the node output variables.
Click the Run button to run the python script and check the node output variables section, note that their values should have changed.
It should be noted that the values within Node Input Variables can be changed as per the user’s requirements, to run different cases.
Note
For more information on installing IDAES, along with building and solving IDAES models, refer to the IDAES documentation: https://idaes-pse.readthedocs.io/en/stable/index.html
This tutorial demonstrates the capability of simulating IDAES based process models in FOQUS. However, optimization problems can also be solved using IDAES in FOQUS, by providing the required degrees of freedom.
It is recommended that FOQUS and IDAES be installed in the same conda environment for this example to run successfully.
The complete FOQUS file (FOQUS_IDAES_Example.foqus), that includes the IDAES model, is located in: examples/tutorial_files/IDAES. The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
Simulation Standard Interface (SimSinter)¶
Contents¶
SimSinter Configuration¶
SimSinter is the standard interface library that FOQUS and Turbine use to drive the target simulation software.
SimSinter currently supports:
- AspenPlus (versions 8, 9, and 10)
- Aspen Custom Modeler (ACM) (versions 8, 9, and 10)
- gPROMS
- Microsoft Excel
SimSinter is used to: (1) open the simulator, (2) initialize the simulation, (3) set variables in the simulation, (4) run the simulation, and (5) get resulting output variables from the simulation.
To drive a particular simulation, SimSinter must be told which input variables to set and which output variables to read when the simulation is finished (there are generally far too many variables in a simulation to set and read them all). Each simulation must have a “Sinter Config File” which records this information. FOQUS keeps the simulation file and the “Sinter Config File” together and sends them to the Turbine gateway when a simulation run is requested.
The configuration is simplified by a GUI included with the SimSinter distribution called “SinterConfigGUI.” FOQUS can launch the SinterConfigGUI on simulations that have not been configured. To run the “SinterConfigGUI” the user must have:
- SimSinter distribution installed. SimSinter is installed by the FOQUS bundle installer.
- The simulation file the user wants to configure. For example, if the user has an Aspen Custom Modeler simulation called BFB.acmf, that file must be on the user’s computer, and the user should know its location.
- The application used to execute the simulation file. For example, if the user wants to configure an Aspen Custom Modeler simulation called BFB.acmf, Aspen Custom Modeler must be installed on the user’s machine.
The rest of this section details two step-by-step tutorials on configuring a simulation with “SinterConfigGUI.” The first simulation is an Aspen Custom Modeler simulation and the second, Aspen Plus. Please also see the D-RM Builder tutorials for configuring dynamic ACM models. For more details on SimSinter or a tutorial on how to configure a Microsoft Excel file, please see the “SimSinter Technical Manual,” which is included in the FOQUS distribution. The default location is C:\Program Files (x86)\foqus\foqus\doc. It is also available on the CCSI website.
Tutorial¶
Tutorial 1: Aspen Custom Modeler (ACM) Configuration¶
The files (both the ACM and the JSON files) for this tutorial are located in: examples/test_files/Optimization/Model_Files/
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
The “SinterConfigGUI” can be launched from FOQUS, via the Create/Edit button found in Session\(\rightarrow\) Add/Update Model to Turbine or “SinterConfigGUI” may be run on its own by selecting SimSinter \(\rightarrow\) SinterConfigGUI from the Windows Start menu.
The splash window displays, as shown in Figure SinterConfigGUI Splash Screen. The user may click the splash screen to proceed, or wait ten seconds for it to close automatically.
SinterConfigGUI Splash Screen
The SinterConfigGUI Open Simulation window displays (Figure SinterConfigGUI Open Simulation Window). If “SinterConfigGUI” was opened from FOQUS, the filename text box already contains the correct file. To proceed immediately click Open File and Configure Variables or click Browse to search for the file. For this tutorial, the ACM model (BFBv6.2.acmf) for a bubbling fluidized bed adsorber (located in the examples/test_files/Optimization/Model_Files folder) is selected. Once the file is selected, click Open File and Configure Variables. The user can open a fresh ACM simulation (.acmf file) or an existing SimSinter configuration file. For this example, open a fresh simulation.

Note

Opening the simulation may take a few minutes depending on how quickly Aspen Custom Modeler can be opened.
SinterConfigGUI Open Simulation Window
Aspen Custom Modeler starts in the background. This is so the user can observe things about the simulation while working on the configuration file.
The SinterConfigGUI Simulation Meta-Data window displays. (Figure SinterConfigGUI Simulation Meta-Data Page Save Name Text Box). The first and most important piece of metadata is SimSinter Save Location at the top of the window. This is where the sinter configuration file is saved. The system attempts to locate a reasonable file location and file name; however, the user must confirm the correct file location, since it automatically overwrites whatever file name currently exists.
SinterConfigGUI Simulation Meta-Data Page Save Name Text Box
Continue to complete the remaining fields and then click Next (Figure SinterConfigGUI Simulation Meta-Data Page with Data Completed).
SinterConfigGUI Simulation Meta-Data Page with Data Completed
In the SinterConfigGUI Variable Configuration Page, (Figure SinterConfigGUI Variable Configuration Page before Input) notice that the ACM Selected Input Variables: TimeSeries, Snapshot, RunMode, printlevel and homotopy are already included in the input variables. TimeSeries and Snapshot are for dynamic simulations. RunMode can be either “Steady State” or “Dynamic”. The Dynamic mode requires a dynamic ACM model. For this simulation, the RunMode is Steady State. The homotopy variable can be set to “1” so that homotopy is on by default. Notice that the Dynamic column (the first column) in each row contains a checkbox, enabling the user to select if the input variable in the row is a dynamic variable. Also notice that a Variable Type search box is on the left. This search is exactly the same as Variable Find on the Tools menu in Aspen Custom Modeler. Please refer to the ACM documentation for details on search patterns.
SinterConfigGUI Variable Configuration Page before Input
A search for everything in the “BFBadsT” block has been selected. The following Search in Progress dialog is displayed (Figure Search in Progress Bar Page). Sometimes large searches take a while.
Search in Progress Bar Page
First, select the “BFBadsT.A1” scalar variable in the Selected Path field (Figure SinterConfigGUI Variable Configuration Page BFBadsT.A1 Selected).
SinterConfigGUI Variable Configuration Page BFBadsT.A1 Selected
If the user double-clicks, presses Enter, or clicks Preview or Lookup, information displays in the Preview Variable section (Figure SinterConfigGUI Variable Configuration Page BFBadsT.A1 Preview). Here, the user can verify the variable choices.
SinterConfigGUI Variable Configuration Page BFBadsT.A1 Preview
“BFBadsT.A1” is the correct variable; therefore, click Make Input. Information displays in the Selected Input Variables section (Figure SinterConfigGUI Variable Configuration Page BFBadsT.A1 Made Input).
SinterConfigGUI Variable Configuration Page BFBadsT.A1 Made Input
Change the variable name from “BFBadsT.A1” to something more descriptive (e.g., “WaterA”). Set Name, Description and Min/Max as shown in Figure SinterConfigGUI Variable Configuration Page BFBadsT.A1 Change Name.
SinterConfigGUI Variable Configuration Page BFBadsT.A1 Change Name
One input variable is now displayed (Figure SinterConfigGUI Variable Configuration Page Vector Preview). At least one output variable is required. In this example, the vector of calculated bubble sizes is wanted. Scroll down under Search and select “BFBadsT.db.Value,” “BFBadsT.db.Value(0),” “BFBadsT.db.Value(1),” etc. If a name with a number in parenthesis at the end is selected, it is a specific entry in the vector. If a basic name is selected (“BFBadsT.db.Value”), the entire vector is displayed. Select the whole vector and click Preview.
SinterConfigGUI Variable Configuration Page Vector Preview
Click Make Output if the variable the user wants is selected. Notice that this variable has a unit “m” (Figure SinterConfigGUI Variable Configuration Page Vector As Output).
SinterConfigGUI Variable Configuration Page Vector As Output
Change the Name of the variable to “Diameter.” Bubble size is measured in meters; however, meters should be converted to millimeters (mm). Now, the output from the simulation should present bubble diameter in mm (Figure SinterConfigGUI Variable Configuration Page Output Change Units). Internal to the simulation, the unit remains “m.”
SinterConfigGUI Variable Configuration Page Output Change Units
To add a single item in a vector, select “BFBadsT.Ar.Value(1)” and click Make Input (See Figure SinterConfigGUI Variable Configuration Page Removal Demo). To remove the item that was just added, select it and click Remove Variable.
SinterConfigGUI Variable Configuration Page Removal Demo
Select the correct variable vector “BFBadsT.Ar.Value” and make it an input (Figure SinterConfigGUI Variable Configuration Page Read Input). Notice that a Default or Min/Max cannot be set in the GUI for a vector. The correct defaults (from the simulation) are set automatically. To change the Min/Max values, the user must edit the JSON file in a text editor.
SinterConfigGUI Variable Configuration Page Read Input
Click Next to display the SinterConfigGUI Vector Default Initialization window as shown in Figure SinterConfigGUI Vector Default Initialization Input Page. Since the input variable “Value” is a vector, its default values can be modified in the window. In this case there is no need to change the values.
SinterConfigGUI Vector Default Initialization Input Page
The simulation is now setup. Save the configuration file by clicking Finish. The file is saved to the location specified on the SinterConfigGUI Simulation Meta-Data page. Clicking Finish will close the SinterConfigGUI, but NOT Aspen Custom Modeler. The user must close ACM manually.
If “SinterConfigGUI” was launched from FOQUS, the path to the configuration file is automatically passed to FOQUS. The next step in FOQUS is to click OK in the Add/Update Turbine Model window. FOQUS may then be used to upload it to the Turbine gateway. If “SinterConfigGUI” was not launched from FOQUS (e.g., it was launched from the Start menu), the configuration file name must be entered in FOQUS manually.
Tutorial 2: Aspen Plus Configuration¶
The files (both the Aspen Plus file and the JSON file) for this tutorial are located in: examples/tutorial_files/SimSinter/Tutorial_2
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
The initial steps for opening a simulation and entering metadata for an Aspen Plus simulation are similar to ACM. Refer to the SimSinter ACM tutorial Tutorial 1: Aspen Custom Modeler (ACM) Configuration. In this tutorial, a flash model “Flash_Example.bkp” (located in the above-mentioned folder) is used as an example. Open the Aspen Plus file and enter the metadata as shown in Figure SinterConfigGUI Simulation Meta-Data with Data Completed.
SinterConfigGUI Simulation Meta-Data with Data Completed
The SinterConfigGUI Variable Configuration Page displays as illustrated in Figure SinterConfigGUI Variable Configuration Page Empty Variables. Aspen Plus has no settings, so there are no setting variables in the input section. Unlike ACM, AspenPlus displays the Variable Tree on the left side, so the user can explore the tree as is done in Aspen Plus Tools \(\rightarrow\) Variable Explorer. Unfortunately, searching is not possible.
SinterConfigGUI Variable Configuration Page Empty Variables
Variable Tree nodes can be expanded for searching (Figure SinterConfigGUI Variable Configuration Page Expanded Aspen Plus Variable Tree).
SinterConfigGUI Variable Configuration Page Expanded Aspen Plus Variable Tree
The user can type the node address directly into the Selected Path field (this is useful for copy/paste from Aspen Plus’ Variable Explorer) (Figure SinterConfigGUI Variable Configuration Page Aspen Plus Variable Selected). Click Lookup or Preview (which automatically causes the tree to expand and selects selected variables in the Variable Tree).
SinterConfigGUI Variable Configuration Page Aspen Plus Variable Selected
To make the temperature of the Flash chamber an Input Variable, click Make Input. Additionally, the user can Name the variable, fix the Description, and enter the Min/Max fields by clicking on the appropriate text and entering it.
SinterConfigGUI Variable Configuration Page Input Variable
Select an Output Variable, Preview it, and click Make Output. Next, update the fields as with the Input Variable to give a better Name and Description.
SinterConfigGUI Variable Configuration Page Output Variable
The task is complete. Save it by clicking Save or CTRL+S. The file is saved to the location specified in the SinterConfigGUI Simulation Meta-Data page. If the user wishes to save a copy under a different name, navigate back to the SinterConfigGUI Simulation Meta-Data page and change the name.
Tutorial 3: Microsoft Excel Configuration¶
The files (both the Excel and the JSON files) for this tutorial are located in: examples/tutorial_files/SimSinter/Tutorial_3
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
The “SinterConfigGUI” can be launched from FOQUS, via the Create/Edit button found in File\(\rightarrow\) Add/Update Model to Turbine or “SinterConfigGUI” may be run on its own by selecting CCSI Tools \(\rightarrow\) FOQUS \(\rightarrow\) SinterConfigGUI from the Start menu.
The splash window displays, as shown in Figure SinterConfigGUI Splash Screen. The user may click the splash screen to proceed, or wait 10 seconds for it to close automatically.
SinterConfigGUI Splash Screen
The SinterConfigGUI Open Simulation window displays (Figure SinterConfigGUI Open Simulation Window). If “SinterConfigGUI” was opened from FOQUS, the filename text box already contains the correct file. To proceed immediately click Open File and Configure Variables or click Browse to search for the file. For this tutorial, a fresh copy of the BMI (body mass index) test is opened (exceltest.xlsm). It is located in:
examples/tutorial_files/SimSinter/Tutorial_3
SinterConfigGUI Open Simulation Screen
Microsoft Excel starts in the background. This is so the user can observe things about the worksheet while working on the configuration file.
In the “SinterConfigGUI” the SinterConfigGUI Simulation Meta-Data page is now displayed (Figure SinterConfigGUI Simulation Meta-Data Save Text Box). The first and most important piece of metadata is Save Location at the top of the window. This is where the sinter configuration file is saved. The system attempts to locate a reasonable file location and file name; however, the user must confirm the correct file location, since it automatically overwrites whatever filename currently exists.
SinterConfigGUI Simulation Meta-Data Save Text Box
Continue to complete the remaining fields and click Next.
In the SinterConfigGUI Variable Configuration Page (Figure SinterConfigGUI Variable Configuration Page before Input), notice that the Excel setting variable macro is already included in the Selected Input Variables. If the Excel spreadsheet has a macro that should be run after SimSinter sets the inputs, but before SimSinter gets the outputs, enter the macro’s name in the Default text box. If the Default box is left blank, no macro is run (unless a name is supplied in the input variables when running the simulation). If the user needs to run multiple macros (e.g., Macro1 and Macro2), we recommend that the user create a “Master” macro in the Excel file that automatically runs Macro1 and Macro2 using the Call statement. Suppose that the “Master” macro is named MasterMacro; then, in SimSinter, the user will need to type MasterMacro in the Default text box under the Excel setting variable macro.
SinterConfigGUI Variable Configuration Page before Input
The Excel simulation has the same Variable Tree structure as Aspen Plus, as shown in (Figure SinterConfigGUI Variable Configuration Page Selecting a Variable from the Excel Variable Tree). Only the variables in the active section of the Excel spreadsheet appear in the Variable Tree. If, for some reason, a cell does not appear in the tree, the user may manually enter the cell into the Selected Path text box. In this case, select the “height$C$4” variable.
Note: Row is first in the Variable Tree, yet column is first in the Path.
SinterConfigGUI Variable Configuration Page Selecting a Variable from the Excel Variable Tree
If the user double-clicks, presses enter, clicks Preview, or clicks Lookup, the variable will be displayed in the Preview Variable frame. Click the Make Input button to make the variable an input variable. Now the variable is in the Selected Input Variables section, and its meta-data may be edited (Figure SinterConfigGUI Variable Configuration Page Description “Joe’s Height”).
SinterConfigGUI Variable Configuration Page Description “Joe’s Height”
Enter an output variable (such as, “BMI$C$3”), by selecting the variables in the Variable Tree, clicking Preview, and then clicking Make Output (Figure SinterConfigGUI Variable Configuration Page Selecting Excel Output Variables).
SinterConfigGUI Variable Configuration Page Selecting Excel Output Variables
The simulation is now set up. To save the configuration file, click Finish or press CTRL+S. The file is saved to the location that was set on the SinterConfigGUI Simulation Meta-Data window. A user can save a copy under a different name, by navigating back to the SinterConfigGUI Simulation Meta-Data window using Back, and then changing the name. This creates a second version of the file.
Tutorial 4: gPROMS Configuration¶
gPROMS is significantly different from the other simulators SimSinter supports, and the workflow is also significantly different. If you plan to use gPROMS simulations with FOQUS, the CCSI team strongly encourages you to read the SimSinter gPROMS Technical Manual (https://github.com/CCSI-Toolset/SimSinter/blob/master/docs/SimSinter%20gPROMS%20Technical%20Manual.pdf).
Unlike Aspen, changes must be made to the gPROMS simulation process in order for it to work with SimSinter. Therefore, this section consists of a series of tutorials for every step of configuring gPROMS and SimSinter to work together. All the tutorials are required in order for a gPROMS simulation to be runnable with SimSinter. They are divided up to make later reference easier.
The files for this tutorial are located in: examples/tutorial_files/SimSinter/Tutorial_4
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
Configuring gPROMS to Work with SimSinter¶
Unlike Aspen, changes have to be made to the gPROMS simulation process in order to work with SimSinter. In fact, SimSinter does not define the inputs to the simulation, gPROMS does. On the other hand, gPROMS does not determine the outputs, SimSinter does. This odd and counter-intuitive situation is the result of how gPROMS gO:Run_XML is designed.
The modification to the gPROMS simulation must be done by a developer with an intimate understanding of the simulation, usually the simulation writer. In some cases additional variables may need to be added to handle an extra step between taking the input and inserting it into the variable where gPROMS will use the data.
Open the gPROMS simulation file (ends in .gPJ) in ModelBuilder 4.0 or newer. For this example, use the gPROMS install test file “BufferTank_FO.gPJ”, found in:
examples/tutorial_files/SimSinter/Tutorial_4
Double-click on the .gPJ file to open ModelBuilder, as shown in Figure Opening BufferTank in gPROMS Model Builder.
Opening BufferTank in gPROMS Model Builder
This simulation was originally a simple BufferTank simulation. However, it was modified into an example of all the different kinds of variables the user can pass into gPROMS via SimSinter. Therefore, it has a lot of extra variables that do not really do anything, with very generic names, like “SingleInt.” The simulation consists of a single model, “BufferTank”, that contains all the simulation logic, and most of the parameter and variable declarations. The SimSinter simulation will change some of these PARAMETERS and VARIABLES to change the output of the simulation.
Viewing BufferTank in gPROMS Model Builder
The example file contains two Processes. SimSinter can only run gPROMS Processes, so any gPROMS simulation must be driven from a Process. “SimulateTank” is the original BufferTank example with hardcoded values, “SimulateTank_Sinter” contains the example of setting values with Sinter. The “SimulateTank_Sinter” example will be recreated in this tutorial.
Viewing SimulateTank in gPROMS Model Builder
First copy the existing hard-coded Process “SimulateTank”.
Copying SimulateTank
Right-click on Processes and select Paste to make a new process.
Paste SimulateTank
The new process will be named “SimulateTank_1”. Rename the process to “SimulateTank_tutorial” by right-clicking on it and selecting Rename.
Rename SimulateTank
Now open up the new “SimulateTank_tutorial” Process. It has the same hard-coded values as “SimulateTank”.
Opening SimulateTank_tutorial
First, the user needs to add a FOREIGN_OBJECT named “FO” in the PARAMETER section. Then the user needs to set that FOREIGN_OBJECT to “SimpleEventFOI::dummy” in the SET section. This FOREIGN_OBJECT is how inputs are received from SimSinter.
Adding the FOREIGN_OBJECT
This particular simulation has a large number of input variables that simply demonstrate how to set different types. These are named based on their type. Any variable named similarly to “SingleInt” or “ArraySelector” can be safely ignored for this tutorial. For a full list of the methods for setting different types, see the later section specifically covering that. Any variable in the simulation can be an input, whether it is defined in the Process, in one of the models referenced by the Process, or in a model referenced by a model, etc. All inputs take their values from the FOREIGN_OBJECT defined above, followed by the type name, two underscores, the input variable name, an open parenthesis, an optional index variable (for arrays), and closed with a close parenthesis and a semicolon. For a scalar:
FO.<Type>__<InputName>();
SimSinter can only handle arrays inputted in FOR loops such as:
FOR ii := 1 TO <array size> DO
    <ArrayName>(ii) := FO.<Type>1__<InputName>(ii);
END
For this example the user only really needs to set “T101.Alpha” in PARAMETER, “T101.FlowIn” in ASSIGN, and “T101.Height” in INITIAL.
Setting up Input Variables
Now test “SimulateTank_tutorial” by selecting it and clicking the green Simulate triangle. When the simulation runs it will ask for every input variable the user has defined. For the example variables that do not affect the simulation, such as “SingleInt”, any valid value will work. For the values that do affect the simulation, these values work:
REAL__AlphaFO = .08
REAL__FlowInFO = 14
REAL__HeightFO = 7.5
Testing SimulateTank_Tutorial
Exporting an Encrypted Simulation to Run with SimSinter¶
SimSinter can only run encrypted gPROMS simulations. These files have the .gENCRYPT extension. If the additions to the simulation for reading input variables ran correctly in the previous section, the user is ready to export that process for use by SimSinter.
Right-click on the Process to export (“SimulateTank_tutorial”) and select Export.
Select “Export”
In the resulting Export window, select Encrypted input file for simulation by gO:RUN and click OK.
Select “Encrypted Input File”
On the second page, set the Export directory to a directory the user can find, preferably one without any other files in it so the user will not be confused by the output. If the filename or the Encryption password are not changed, SimSinter will be able to guess the password. If either of those values are changed, the user will have to set the correct password in the SinterConfigGUI password setting. A Decryption password is probably unnecessary, as the user has the original file; SimSinter does not use it. When the user has finished setting up these fields, click Export Entity.
Export Entity Page
The resulting .gENCRYPT file will be saved to a subdirectory named “Input” in the save directory specified in Step 3. The first part of the name will be identical to the .gPJ filename. The user should not rename it because the SinterConfig file will guess this name, and currently changing it requires editing the SinterConfig file.
Configuring SimSinter to Work with gPROMS¶
Now that the gPROMS process has been prepared, the SimSinter configuration can be done.
The “SinterConfigGUI” can be launched from FOQUS, via the Create/Edit button found in File\(\rightarrow\) Add/Update Model to Turbine or “SinterConfigGUI” may be run on its own by selecting CCSI Tools \(\rightarrow\) FOQUS \(\rightarrow\) SinterConfigGUI from the Start menu.
The splash window displays, as shown in Figure SinterConfigGUI Splash Screen. The user may click the splash screen to proceed, or wait 10 seconds for it to close automatically.
SinterConfigGUI Splash Screen
The SinterConfigGUI Open Simulation window displays (Figure SinterConfigGUI Open Simulation Screen). If “SinterConfigGUI” was opened from FOQUS, the filename text box already contains the correct file. To proceed immediately click Open File and Configure Variables or click Browse to search for the file.
This tutorial will use the .gPJ file edited in Section 1.1. Remember that SinterConfigGUI cannot read the .gENCRYPT file that is actually run by SimSinter. Instead, the user must open the .gPJ file the ModelBuilder uses.
Once the file is selected, click Open File and Configure Variables.
SinterConfigGUI Open Simulation Screen
The SinterConfigGUI Simulation Meta-Data window displays as shown in (Figure SinterConfigGUI Simulation Meta-Data Save Text Box). Unlike with the other simulation types, gPROMS has not been started in any way; SinterConfigGUI does not get information from gPROMS directly, but must parse the .gPJ file instead.
The first and most important piece of meta-data is the SimSinter Save Location at the top of the window. This is where the Sinter configuration file is saved. The system suggests a file location and name; the user should confirm this is the intended location so that other files are not accidentally overwritten. Enter the remaining fields to provide the meta-data describing the simulation that was just opened, and then click Next.
SinterConfigGUI Simulation Meta-Data Save Text Box
The SinterConfigGUI Variable Configuration Page window displays as shown in Figure SinterConfigGUI gPROMS Settings Configuration. gPROMS has two settings, ProcessName and password, and SimSinter has guessed at both. For this example the password is correct, but the ProcessName is not: SimulateTank is the original process, which has not been configured to work with SimSinter. On the left side is the Variable Tree; its root is connected to the three processes defined in this .gPJ file. First, change the ProcessName to “SimulateTank_tutorial”.
SinterConfigGUI gPROMS Settings Configuration
After changing the ProcessName, press Enter (or click elsewhere). The Selected Input Variables will automatically display all of the available input variables. This is because the input variables have already been configured in gPROMS and SimSinter has parsed them out of the .gPJ file, provided the ProcessName is set correctly. This also means that the user cannot add new input variables in SinterConfigGUI, only in gPROMS. SimSinter also does its best to identify the Default, Min, and Max values of the variables. The default can only be calculated from the file if it was defined purely in terms of actual numbers; SimSinter cannot evaluate other variables or functions. Therefore,
DEFAULT 2 * 3.1415 * 12
will work. However,
DEFAULT 2 * PI * radius
will not work, because SimSinter does not know the value of either PI or radius, and it will simply set the default to 0.
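This rule can be illustrated with a small, purely hypothetical Python sketch (this is not SimSinter’s actual parser): a DEFAULT built only from numeric literals and arithmetic operators can be evaluated, while one that references names such as PI or radius cannot, so it falls back to 0.

import re

def default_value(expr):
    # Only digits, decimal points, arithmetic operators, parentheses, and spaces
    # can be evaluated; anything else (PI, radius, function calls) cannot.
    if re.fullmatch(r"[0-9.+\-*/() ]+", expr):
        return eval(expr)
    return 0.0  # unknown symbols: fall back to a default of 0, as described above

print(default_value("2 * 3.1415 * 12"))  # 75.396
print(default_value("2 * PI * radius"))  # 0.0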
Min and Max values are taken from the variable type, if there is one.
SinterConfigGUI Automatically Displays Input Variables
Now the output values can be entered. Expand the “SimulateTank_tutorial” Process on the Variable Tree, expand the “T101” model, and then double-click on “FlowOut” to make it the Preview Variable. Notice that the Make Input button is disabled. As stated above, the user cannot make new Input Variables in SinterConfigGUI. Only Make Output is allowed.
Preview of the FlowOut Variable
If Make Output is clicked, “FlowOut” will be made an Output Variable as shown in Figure FlowOut as an Output Variable. The Description can be updated, but SimSinter made a good guess in this example; therefore, there is no need to change the description.
FlowOut as an Output Variable
By the same method, make Output Variables “HoldUp” and “Height.”
HoldUp and Height Output Variables
The variable names should be made shorter. Simply click in the Name column and change each name to the preferred, shorter name.
Editing Variable Names
For future testing, make sure the defaults are good values. The only three input variables that matter have the following defaults:
AlphaFO = 0.8
FlowInFO = 14
HeightFO = 7.5
Editing Defaults
When finished making output variables, click Next at the bottom of the variables page. If there were any input vectors, the Vector Default Initialization page will display. Here the default values of the vectors may be edited.
Editing Vectors
Finally, click Finish and save your configuration file. Your gPROMS simulation should now be runnable from FOQUS.
Surrogate Model Based Optimizer¶
Contents¶
Surrogate model-based optimizer - overview¶
Introduction¶
As part of the improvements and new capabilities of FOQUS, the Surrogate Model-based (SM-based) Optimizer provides an automated framework for hybrid simulation-based and mathematical optimization. The SM-based Optimizer in FOQUS leverages the direct link with commercial simulators, the generation of surrogate models, and access to algebraic modeling systems for optimization, and it implements a modified trust region approach for the optimization of advanced process systems.
The motivation behind developing this framework was to combine the advantages of both simulation-based and pure mathematical optimization. Pure mathematical optimization directly leverages the equations describing the physical system to be optimized. Such models are the most accurate and complete representation of the system and thus provide the most accurate optimization results. This approach encounters challenges, however, when large sets of PDEs and highly complex, nonlinear representations are required to sufficiently characterize the process of interest; the mathematical model can then become intractable. Simulation-based optimization, on the other hand, treats the system model as a black box and relies on a heuristic algorithm that uses the results of process simulations to obtain the relationship between the relevant system input and output variables. Although this approach can yield satisfactory results for large-scale, complex systems, it can often be computationally expensive, and hence time consuming, due to the many simulation runs required.
The SM-based optimization algorithm involves generating a simplified representation of the rigorous process model (i.e., one built using advanced commercial simulators like Aspen, gPROMS, Python, etc.) via surrogate models that are more amenable to gradient-based optimization methods and nonlinear programming (NLP) solvers. This approach can overcome the difficulties associated with complex process models, namely intractability and the need for many expensive evaluations, without significantly compromising solution quality or speed, provided that the surrogate modeling method is accurate.
Additional python packages required¶
- Surrogate modeling toolbox - smt: pip install smt
- Experimental design package for python - pyDOE: pip install pyDOE
- Pyomo package for optimization: pip install pyomo
- Mathematical optimization solver ipopt: conda install -c conda-forge ipopt (preferred installation method for Windows users)
Note: The smt package is required for its Latin hypercube sampling method, which is used to generate samples and rebuild the surrogate models in each iteration of the algorithm. The pyDOE package is a dependency of smt, which makes its installation necessary.
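As an optional sanity check (not part of FOQUS itself), the short Python snippet below verifies that the packages listed above can be imported and that Pyomo can locate the ipopt executable:

import importlib

# Confirm the optional packages are importable in the active environment
for pkg in ("smt", "pyDOE", "pyomo"):
    importlib.import_module(pkg)
    print(pkg, "OK")

# Confirm Pyomo can find the ipopt solver installed via conda-forge
from pyomo.environ import SolverFactory
print("ipopt available:", SolverFactory("ipopt").available(exception_flag=False))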
Framework¶

Figure 1 - Framework for surrogate model-based optimization algorithm

Algorithm Steps
As shown in Figure 1, the framework consists of 6 main steps; the first 2 steps require user interaction, while the rest of the algorithm is performed automatically. A detailed description of each step is provided here:
Step 1 – Flowsheet Set Up: First, the user must add a rigorous process simulation to the FOQUS flowsheet and select the input and output variables of interest. Once the simulation node has been tested and the user has provided the input variables with their default values and upper and lower bounds, the user needs to generate simulation samples using the UQ module in FOQUS for the given input space. At this point, the upper and lower variable bounds are taken as the initial trust region, and the samples are used to develop the initial surrogate model.
Step 2 – Surrogate Model Development: This step is simple, but critical to minimizing the number of iterations required by the algorithm. The user must select the number of samples and the ALAMO settings needed to generate the best surrogate possible. Finally, the user generates a surrogate model based on the simulation samples using the FOQUS ALAMO module.
Step 3 – Mathematical Optimization: In the Optimization module, set up the problem by selecting the decision variables, providing the objective function, and adding any additional constraints. Since the FOQUS Optimization module offers multiple derivative-free optimizers (DFO), the user must select the surrogate model-based optimizer as the solver, with appropriate settings for the algorithm (a detailed description of the settings is provided in the tutorial). The SM-based optimizer formulates and solves the optimization problem by creating a Pyomo model (ConcreteModel), adding the input and output variables (as Pyomo variables whose bounds define the trust region), adding the surrogate models as Pyomo constraints, and adding the additional constraints provided by the user (g(x)>0 or h(x)=0). In this step, to avoid discarding feasible solutions due to local optima, a multi-start approach has been implemented, in which the optimization problem is solved from different initialization points. A combination of initial values is used based on the variable bounds, the mid-point, and the user-provided values for the decision variables. The optimal solution chosen corresponds to the case that gives the best (minimum or maximum) value of the objective function; any run that returns infeasible is eliminated. The solution is denoted x* and y_sm, for the optimal decision variables and output variables, respectively. A minimal sketch of this formulation is shown below.
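The following is a minimal, simplified sketch of the Pyomo problem described in this step, not the actual FOQUS implementation: the variable bounds act as the trust region, a placeholder algebraic expression stands in for the ALAMO surrogate, and a generic user constraint and objective are added before the problem is handed to ipopt. All names and expressions are illustrative.

from pyomo.environ import (ConcreteModel, Var, Constraint, Objective,
                           SolverFactory, minimize, value)

m = ConcreteModel()

# Decision variable; its bounds play the role of the current trust region
m.x = Var(bounds=(1.0, 10.0), initialize=5.0)

# Output variable predicted by the surrogate model
m.y = Var()

# Surrogate model added as a Pyomo constraint
# (placeholder expression standing in for the ALAMO-generated model)
m.surrogate = Constraint(expr=m.y == 3.2 - 0.15 * m.x + 0.01 * m.x**2)

# Additional user-provided constraint, written here in a generic g(x) >= 0 form
m.user_constraint = Constraint(expr=m.y - 0.5 >= 0)

# User-provided objective function
m.obj = Objective(expr=-5.0 * m.y, sense=minimize)

# One solve; the multi-start approach repeats this from several initial points
SolverFactory("ipopt").solve(m)
print("x* =", value(m.x), "y_sm =", value(m.y))

In the actual algorithm, this solve is repeated with different initializations (bounds, mid-point, and user-provided values), and the best feasible solution is retained as x* and y_sm.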
Step 4 – Rigorous Process Simulation: In this step, the process simulation is run at the optimal point obtained in Step 3 (x*); evaluating the optimal solution with the rigorous model yields the corresponding output variable values y_sim.
Step 5 – Termination Condition Check: The algorithm includes three termination conditions to determine if the optimal solution has been obtained:
First, Equation 1 checks whether the difference between the objective function value from the surrogate model (z_sm) and the one obtained by evaluating the rigorous model (z_sim) meets the tolerance. Second, Equation 2 checks whether the relative error between the output variables from the optimization problem (y_sm) and the rigorous simulation (y_sim) is within tolerance. Finally, Equation 3 checks that the additional constraints are satisfied at the optimal point. If the conditions in Step 5 are satisfied, the algorithm terminates; otherwise, Step 6 is carried out. A sketch of these checks is shown below.
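The following is a minimal sketch of the three checks, assuming the objective values (z_sm, z_sim), the output vectors (y_sm, y_sim), and a user constraint function g written in g(x) <= 0 form are already available; the tolerance values are user-specified and purely illustrative.

import numpy as np

def converged(z_sm, z_sim, y_sm, y_sim, g, x_opt,
              tol_obj=1e-3, tol_out=1e-3, tol_con=1e-3):
    # Equation 1: surrogate and rigorous objective values agree within tolerance
    obj_ok = abs(z_sm - z_sim) <= tol_obj
    # Equation 2: relative error between surrogate and rigorous outputs within tolerance
    rel_err = np.abs((np.asarray(y_sm) - np.asarray(y_sim)) / np.asarray(y_sim))
    out_ok = bool(np.all(rel_err <= tol_out))
    # Equation 3: additional constraint (assumed g(x) <= 0 form) satisfied at x*
    con_ok = g(x_opt) <= tol_con
    return obj_ok and out_ok and con_ok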
Step 6 – Update Trust Region: In this step, the input variable upper and lower bounds (x_ub and x_lb) are adjusted to shrink the trust region. The extent to which the trust region shrinks (dif_k) depends on the fractional multiplier α. The updated upper and lower bounds (x_ub,k+1 and x_lb,k+1) are calculated around x*, based on dif_k.
Note that if the ratio of the upper and lower bounds is less than or equal to a set bound ratio value, the trust region is not updated further and the algorithm stops.
Further, Latin hypercube samples are generated in the updated trust region. This sampling method ensures that the sample points are uniformly spaced and cover the entire trust region without any skewness. Once the samples are generated, Step 2 is repeated using this new data set and the original ALAMO settings. A sketch of the update and resampling is shown below.
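The following sketch shows one plausible reading of the trust-region update and resampling; the exact update formula is defined by the algorithm’s equations, and the shrink rule used here (reducing the bound width by the fractional multiplier alpha around x*) is an illustrative assumption. The Latin hypercube sampler comes from the smt package listed earlier.

import numpy as np
from smt.sampling_methods import LHS

def update_trust_region(x_opt, x_lb, x_ub, alpha=0.8, n_samples=10):
    # Illustrative shrink rule: reduce the half-width of the region by alpha
    # and re-center it around the current optimum x* (clipped to the old bounds).
    dif = alpha * (np.asarray(x_ub) - np.asarray(x_lb)) / 2.0
    new_lb = np.maximum(np.asarray(x_opt) - dif, x_lb)
    new_ub = np.minimum(np.asarray(x_opt) + dif, x_ub)

    # Latin hypercube samples spread uniformly over the updated trust region
    xlimits = np.column_stack([new_lb, new_ub])
    samples = LHS(xlimits=xlimits)(n_samples)
    return new_lb, new_ub, samples

# Example: one decision variable currently bounded between 1 and 10 bar
new_lb, new_ub, points = update_trust_region(x_opt=[4.2], x_lb=[1.0], x_ub=[10.0])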
Surrogate model-based optimizer - tutorial¶
Flash Optimization¶
Problem Statement: An Ethanol-CO2 mixture at 50 mol % enters a flash column at 100 kg/hr, 25 °C, and 100 bar. The optimum flash column pressure needs to be determined such that maximum revenue can be obtained from the CO2 recovered in the vapor stream and the ethanol recovered in the liquid stream. The optimization is subject to a purity constraint, specifying that the CO2 mass % in the vapor phase should be at least 98.5 %. The system is shown in Figure 2.

Figure 2: Ethanol-CO2 Flash System
Instructions
Step 1 - Flowsheet Setup
Step 1.1 - Set up the Aspen model for the flash column as a FOQUS simulation node: To set up the Aspen model in the FOQUS flowsheet, first create and add the SimSinter json file to Turbine. Then create a node named ‘FLASH’ and load the simulation into the node. The Aspen and json files (along with the FOQUS file) can be found in the folder: examples/tutorial_files/SM_Optimizer/Flash_Optimization.
Note
The examples/ directory refers to the location where the FOQUS examples were installed, as described in Install FOQUS Examples.
Figures 3 and 4 show the FOQUS node with the loaded simulation. Finally, run the flowsheet simulation.
Figure 3: Input variables of the Ethanol-CO2 Flash Simulation Node in FOQUS
Figure 4: Output variables of the Ethanol-CO2 Flash Simulation Node in FOQUS
Step 1.2 - Generate a simulation ensemble by selecting ‘FLASH.PRES’ as a variable with bounds 1-10 bar (in this case, keep the other variables fixed). Select Latin Hypercube Sampling with 20 points, and after the samples are generated, launch the simulations. Figure 5 represents the simulation ensemble generation.
Figure 5: Simulation ensemble generation
For more details on this, refer to the documentation: https://foqus.readthedocs.io/en/latest/chapt_uq/tutorial/sim.html
Step 2 - Surrogate Model Development
Step 2.1 - Select Data Set: In the surrogate modeling module, select ALAMO as the tool and, under the ‘Data’ tab, ensure that the data set corresponds to the correct UQ simulation ensemble. If there are multiple data sets, add filters to select the appropriate set. Figure 6 shows the data selection in the surrogates tab.
Figure 6: Select data for surrogate model generation
Note: If a particular simulation ensemble needs to be used from the UQ module for generating the surrogate model, add a data filter, referring to the instructions in the documentation: https://foqus.readthedocs.io/en/latest/chapt_uq/tutorial/data.html
Step 2.2 - ALAMO input/output variables: Under the ‘Variables’ tab, select ‘FLASH.PRES’ as the surrogate model input variable, and ‘FLASH.CARBOLIQ’, ‘FLASH.CARBOVAP’, ‘FLASH.ETHANLIQ’, ‘FLASH.ETHANVAP’ as the surrogate model output variables. Figure 7 shows the surrogate model variable selection.
Figure 7: Select variables for surrogate model generation
Step 2.3 - ALAMO Settings: Under ‘Method Settings’, an Initial Data Filter can be applied to the full data set to select the data used to develop the surrogate models; if there are no filters, simply select “all”. In this case, select the “uq2” filter. In Figures 8 and 9, settings 3 to 9 are left at their FOQUS default values. Settings 10 to 22 have been selected to explore several basis functions and obtain the best model possible while minimizing the size of the model (selecting the Bayesian Information Criterion as the modeler). The rest of the settings are kept at their default values. For more information about the best settings to use in ALAMO, please see the following documentation: https://foqus.readthedocs.io/en/latest/chapt_surrogates/tutorial/alamo.html
Figure 8: Select appropriate method settings for surrogate model generation
Figure 9: Select appropriate method settings for surrogate model generation continued
Note that setting number 42 is the name of the Python file that is created after ALAMO runs. It contains the Pyomo model for optimization, based on the ALAMO-generated surrogate model. This Python file is accessed by the SM-based optimizer.
Step 2.4 - Under ‘Execution’, run ALAMO, as shown in figure 10:
Figure 10: Run ALAMO to generate surrogate model
Step 3 - Mathematical Optimization:
Step 3.1 - Problem Setup - select optimization variables: In the Optimization module, select ‘FLASH.PRES’ as the decision variable. Keep the other input variables fixed, as shown in figure 11.
Figure 11: Select optimization variables
Step 3.2 - Problem Setup - objective function and additional flowsheet constraint declaration: The objective is to maximize the revenue from the separation; we assume that the selling prices of the vapor and liquid products are $5/kg and $30/kg, respectively. Additionally, the CO2 vapor stream must be at least 98.5 % pure. In the Objective/Constraints tab, under the objective function f(x) expression box, enter
-5*f.FLASH.CARBOVAP -30*f.FLASH.ETHANLIQ
Under the inequality constraints expression box, enter
-f.FLASH.CARBOVAP/(f.FLASH.CARBOVAP + f.FLASH.ETHANVAP) + 0.985
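As a quick, purely illustrative check of these expressions (the flow values below are made up, not taken from the simulation): FOQUS minimizes f(x), so the revenue is entered with a negative sign, and the purity requirement is written so that non-positive values are feasible.

def objective(carbovap, ethanliq):
    # Negative revenue in $/hr: $5/kg for the vapor CO2 and $30/kg for the liquid ethanol
    return -5.0 * carbovap - 30.0 * ethanliq

def purity_constraint(carbovap, ethanvap):
    # <= 0 whenever the CO2 mass fraction in the vapor is at least 98.5 %
    return -carbovap / (carbovap + ethanvap) + 0.985

print(objective(300.0, 50.0))         # -3000.0, i.e. $3000/hr of revenue
print(purity_constraint(300.0, 4.0))  # about -0.0018, so the constraint is satisfied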
Figure 12: Add objective function and constraints to the solver
Step 3.3 - Optimization solver settings: Under the Solver tab, select “SM_Optimizer”.
Figure 13: Select appropriate solver options
Figure 13 shows the solver options. Solver options 1 to 11 are algorithm-specific.
Solver option 1 selects the source of mathematical optimization solver. It can either be “gams” or “pyomo”. It is preferred to keep it at the default setting, “pyomo”.
Solver option 2 selects the mathematical optimization solver which will be used to solve the optimization at each iteration. It is preferred to keep it at the default setting, “ipopt”.
Solver option 3 selects the type of mathematical model that is formulated. This is used when “gams” is selected as the solver source. Depending on the type of problem, it can be non-linear programming “nlp”, linear programming “lp”, or mixed integer non-linear programming “minlp”. The setting would be “nlp” for this case.
Solver option 4 describes the maximum number of iterations that are allowed before the algorithm terminates. It can be set to 10 in this case.
Solver option 5 describes the value of ‘alpha’, a fractional multiplier that affects the extent to which the trust region shrinks at each iteration. The smaller this value, the faster the algorithm converges; however, a very small value might discard the optimal solution. A value of 0.8 is chosen for this case.
Solver option 6 describes the number of Latin hypercube samples used to generate the surrogate model in each iteration. Note that with more samples a more accurate surrogate model can be obtained; however, the algorithm will take longer to converge. A value of 10 is chosen in this case.
Solver option 7 describes the lower limit of the ratio of upper and lower bounds of the decision variables. This condition is imposed while shrinking the trust region, to ensure that the solver converges. A value of 1 is chosen in this case.
Solver option 8 allows the user to display the mathematical optimization solution at each iteration.
Solver options 9, 10, 11 describe the tolerance for the objective value, inequality constraint, and output variable value termination conditions, respectively. A value of 0.001 is chosen in this case.
Solver option 12: if true, the optimization results (i.e., the input and output variable values) are stored in the FOQUS flowsheet.
Since each algorithm iteration includes the generation of surrogate models, a call to the Pyomo solver, and a call to the rigorous process simulation, the results are stored in the flowsheet results data tab, under the set name provided by the user in option 13. Solver option 14 corresponds to the Python file containing the Pyomo model for the initial surrogate model developed in the previous steps; its name should match setting number 42 in the ALAMO settings. The user can select the names of the text and Python files in options 15 to 17. The names should end with the required extension: ‘.txt’ for text files and ‘.py’ for Python files.
Step 3.4 - Under the Run tab, click ‘start’. The main details of each iteration are displayed in the message window as the solver runs, divided by section (i.e., Step 3, Step 4, Step 5, etc.). After the final iteration, once the optimization is successful, the results are displayed as shown in Figure 14 below:
Figure 14: Start the optimization and check results in the message window
Result Analysis:
The optimal solution was obtained in 3 iterations, with a reported revenue of $1677.06/hr, and the problem was solved in 4 min 30 s. The overall implementation of the algorithm required a total of 23 rigorous simulations (Aspen), 9 calls to the Ipopt solver, and two calls to ALAMO. For comparison, a DFO solver obtained the same solution in 6 min 30 s. The final optimization result is loaded into the node input and output variables and stored in the flowsheet results data tab.
Solver option 15 corresponds to the text file that saves the surrogate models generated in each algorithm iteration; solver option 16 corresponds to the Python file containing plots that show the termination condition values at each algorithm iteration. These files are useful for tracking the extent of convergence as the algorithm proceeds. Finally, solver option 17 corresponds to the Python file that contains the data for the parity plot of the final surrogate model.
Note that these extra text and Python files can be found in the “user_plugins” folder of the FOQUS working directory.
MEA Carbon Capture System Optimization¶

Figure 15: MEA Carbon Capture System
Problem Statement: An MEA solvent-based carbon capture system is set up in Aspen Plus v10, as shown in Figure 15, with a design specification of a 90 % carbon capture rate. The flue gas flowrate to the absorber is 2266.1 kg/hr with 17.314 % CO2 by mass. The goal is to minimize the specific reboiler duty associated with the regenerator by varying the CO2 loading in the lean solvent entering the absorber.
Note: The Aspen, json, and FOQUS files for this example can be found in the folder: examples/tutorial_files/SM_Optimizer/MEA_Optimization
Result: After implementing the SM-based optimization solver, the solution is:
Optimum CO2 lean loading = 0.1695 mol CO2/mol MEA
Rigorous model output variable values at the optimum:
Solvent Flowrate = 5438.703 kg/hr
Total CO2 Capture Rate = 353.1799 kg/hr
SRD = 3.6382 MJ/kg CO2
Summary: This tutorial demonstrated the implementation of surrogate model-based optimization. This includes setting up the Aspen model in FOQUS, generating the initial data set (required for surrogate model development) in the UQ module, generating the surrogate model using ALAMO, and then using it to solve the required optimization problem. In each iteration, after the optimization is solved, the rigorous model is evaluated at the optimal decision variable values returned by the optimization solver. Note that the final optimal solution reported by the algorithm corresponds to the solution of the rigorous model when evaluated at those optimal decision variable values. In comparison with the other optimization tools provided by FOQUS, the SM-based optimizer has an advantage over DFO solvers in terms of total solution time and accuracy. For the flash optimization example, the SM-based optimizer took a total of 4 min 30 s, while the NLopt DFO solver took 6 min 30 s to obtain the same solution. For the MEA system example, the SM-based optimizer took a total of 48 min, while the NLopt DFO solver took 1 hr 5 min. Overall, the SM-based optimizer makes it possible to solve optimization problems involving complex flowsheets in a shorter time frame than DFO solvers, without compromising solution accuracy.
Debugging¶
This chapter contains information that may be helpful in resolving a problem or filing a bug report.
How to Debug¶
Log files may contain very useful information when reporting problems. The log files are contained in the logs sub-directory of the FOQUS working directory. To change the log message levels, go to the FOQUS Settings button from the Home window; from there various log settings can be changed. The debugging log level provides the most detailed information.
Almost any error that occurs in FOQUS should be logged. Occasionally, an error may occur that is difficult to find, or that causes FOQUS to crash before it is logged. In that case the “FOQUS Console” application can be used. All output from FOQUS, including messages that cannot be seen otherwise, will be shown in a “cmd” window which will remain open even after FOQUS closes.
Most UQ routines interact with PSUADE via Python wrappers. When PSUADE is running, the stdout is written to psuadelog in the working directory. (At present, only some PSUADE commands write to this log; however, this will be standardized in the near future so that all PSUADE commands write to this log.) Other errors that are due to the Python wrappers or PySide GUI components are written to the logs subdirectory in the working directory.
Known Issues¶
The following are known unresolved issues:
- The FOQUS flowsheet can be edited while a flowsheet evaluation, optimization, or UQ is running. This should not be allowed, and may cause problems.
- With the Windows installer, FOQUS may produce output to standard error, especially if it immediately fails to launch. Output is usually caught, redirected to the FOQUS log, and displayed in dialog boxes within FOQUS, but rare instances may occur where error messages are not caught. Output to standard error is logged in the directory containing foqus.exe, in the file foqus.exe.log. The user does not typically have permission to write to the FOQUS install location, so an error message such as “Cannot write to foqus.exe.log” will be displayed. If this occurs there are two solutions: (1) change the permissions for the FOQUS install directory, or (2) run the “FOQUS Console” application, which will direct FOQUS output to the “cmd” window.
- The win32com module generates Python code, which it needs to run. This code is generated in the FOQUS install location “\dist\win32com\gen_py.” In some cases there may be a problem writing to that directory due to permission settings. This will prevent FOQUS from running simulations locally. If this error is encountered, the solution is to make the “gen_py” directory user writable. So far, in testing, this error seems to occur in Windows 8 and 10, but not 7.
- The user regression analysis features of the UQ tool, iREVEAL and ALAMO, require a separate Python 2.7 installation. Furthermore, Python must be both in the PATH variable and associated with .py files. Details on installing Python and fixing any issues encountered may be found in the iREVEAL Installation Guide and the Known Issues section of the iREVEAL User Manual.
- FOQUS has trouble getting files from Turbine and saving them to the DMF when dealing with files in Turbine involving directories.
- The default port for TurbineLite is 8080. If another program is already using that port, there will be an error in FOQUS when connecting to TurbineLite. In the Turbine tab of the Settings window, there is a tool to change the TurbineLite port. If the TurbineLite port is changed, the configuration file that FOQUS uses to connect to TurbineLite must also be changed.
Reporting Issues¶
To report an issue or ask a question, send an email to: ccsi-support@acceleratecarboncapture.org or open an issue at: https://github.com/CCSI-Toolset/FOQUS
Please include detailed steps on how to reproduce the error, including if possible, screenshots and log files.
Developer Documentation¶
Since the source code for all of FOQUS is publicly available, the more adventurous user may wish to look at the inner workings of FOQUS to get a better understanding of how it works, contribute a fix for a bug, or add new features to the source tree. Other members of our CCSI partnership (national laboratories, industry, and academic institutions) may be more actively involved in the development of FOQUS.
This chapter describes at a high level how any such person can set themselves up for getting, building, running, testing, documenting and contributing to FOQUS development.
Development Tools, Technology and Process¶
FOQUS is primarily written in Python. We use the following software development tools, technologies and processes:
- GitHub is where the FOQUS source code resides.
- We make extensive use of GitHub’s Issue Tracker, Pull Requests, and Project Boards for managing the development tasks using a modified Kanban development process.
- ReadTheDocs is used to generate and host our on-line documentation.
- CircleCI and AppVeyor are the Continuous Integration systems we use.
- Anaconda for isolating the Python runtime and development environment.
Developer Setup¶
Working as a developer is similar to working as a user, with the exception that developers need a copy of the source code to work with. Here is a rough set of steps to get set up:
Download and install Anaconda.
In a terminal create a conda env in which to work:
conda create --name ccsi-foqus python=3.7
conda activate ccsi-foqus
In a terminal, get the FOQUS source:
conda activate ccsi-foqus
cd CCSI-Toolset  # Or a dir of your choice
git clone git@github.com:CCSI-Toolset/FOQUS.git  # Note: clone the FOQUS repo if you expect to contribute
cd FOQUS
Build and Install FOQUS as a developer:
pip install -r requirements-dev.txt  # This will pick up both user and developer required packages.
foqus  # Start the app
Building the Docs locally¶
To build a local copy of the documentation:
cd FOQUS/docs
make clean html
Then open the file FOQUS/docs/build/html/index.html to view the results.
Developer Details¶
More details are listed in our GitHub Wiki pages.
The development team can be contacted via GitHub Issues, PRs or email: ccsi-support@acceleratecarboncapture.org.
References¶
- Tong, “PSUADE User’s Manual, Version 1.2.0,” Tech. Rep. LLNL-SM-407882, Lawrence Livermore National Laboratory, Livermore, CA 94551-0808, May 2011.
- Cozad, N. V. Sahinidis, and D. C. Miller, “Automatic Learning of Algebraic Models for Optimization,” AIChE Journal, vol. 60, pp. 2211–2227, 2014.
- Storlie, H. D. Bondell, B. J. Reich, and H. H. Zhang, “Surface estimation, variable selection, and the nonparametric oracle property,” Statistica Sinica, vol. 21, no. 2, pp. 679–705, 2011.
- Storlie, B. J. Reich, J. C. Helton, L. P. Swiler, and C. J. Sallaberry, “Analysis of computationally demanding models with continuous and categorical inputs,” Reliability Engineering & System Safety, vol. 113, pp. 30–41, 2013.
- Reich, C. B. Storlie, and H. D. Bondell, “Variable selection in bayesian smoothing spline anova models: Application to deterministic computer codes,” Technometrics, vol. 51, no. 2, pp. 110–120, 2009.
- Wegstein, “Accelerating Convergence of Iterative Processes,” j-CACM, vol. 1, no. 6, pp. 9–13, 1958.
- Hansen, Towards a New Evolutionary Computation. Advances in Estimation of Distribution Algorithms, ch. The CMA Evolution Strategy: A Comparing Review, pp. 75–102. Springer, 2006.
- Johnson, “The nlopt nonlinear-optimization package.” http://ab-initio.mit.edu/nlopt, May 2015.
- Jones, T. Oliphant, P. Peterson, et al., “Scipy: Open source scientific tools for python.” http://www.scipy.org/, May 2015.
- Bhat, B. Sherman, K. Ajayi, B. Ng, J. Eslick, J. Ou, and J. Kress, “Solventfit: A calibration tool for solvent-based CO2 capture models,” in 2015 CCSI Industry Advisory Board (IAB) Program Review Meeting, (Reston, VA), September 2015.
Copyright and License¶
Copyright (c) 2012 - 2019
Copyright Notice¶
Foqus was produced under the DOE Carbon Capture Simulation Initiative (CCSI), and is copyright (c) 2012 - 2019 by the software owners: Oak Ridge Institute for Science and Education (ORISE), TRIAD National Security, LLC., Lawrence Livermore National Security, LLC., The Regents of the University of California, through Lawrence Berkeley National Laboratory, Battelle Memorial Institute, Pacific Northwest Division through Pacific Northwest National Laboratory, Carnegie Mellon University, West Virginia University, Boston University, the Trustees of Princeton University, The University of Texas at Austin, URS Energy & Construction, Inc., et al.. All rights reserved.
NOTICE. This Software was developed under funding from the U.S. Department of Energy and the U.S. Government consequently retains certain rights. As such, the U.S. Government has been granted for itself and others acting on its behalf a paid-up, nonexclusive, irrevocable, worldwide license in the Software to reproduce, distribute copies to the public, prepare derivative works, and perform publicly and display publicly, and to permit others to do so.
License Agreement¶
Foqus Copyright (c) 2012 - 2019, by the software owners: Oak Ridge Institute for Science and Education (ORISE), TRIAD National Security, LLC., Lawrence Livermore National Security, LLC., The Regents of the University of California, through Lawrence Berkeley National Laboratory, Battelle Memorial Institute, Pacific Northwest Division through Pacific Northwest National Laboratory, Carnegie Mellon University, West Virginia University, Boston University, the Trustees of Princeton University, The University of Texas at Austin, URS Energy & Construction, Inc., et al. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of the Carbon Capture Simulation Initiative, U.S. Dept. of Energy, the National Energy Technology Laboratory, Oak Ridge Institute for Science and Education (ORISE), TRIAD National Security, LLC., Lawrence Livermore National Security, LLC., the University of California, Lawrence Berkeley National Laboratory, Battelle Memorial Institute, Pacific Northwest National Laboratory, Carnegie Mellon University, West Virginia University, Boston University, the Trustees of Princeton University, the University of Texas at Austin, URS Energy & Construction, Inc., nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
You are under no obligation whatsoever to provide any bug fixes, patches, or upgrades to the features, functionality or performance of the source code (“Enhancements”) to anyone; however, if you choose to make your Enhancements available either publicly, or directly to Lawrence Berkeley National Laboratory, without imposing a separate written license agreement for such Enhancements, then you hereby grant the following license: a non-exclusive, royalty-free perpetual license to install, use, modify, prepare derivative works, incorporate into other computer software, distribute, and sublicense such enhancements or derivative works thereof, in binary and source code form.
FOQUS¶
Overview¶
The Framework for Optimization, Quantification of Uncertainty, and Surrogates (FOQUS) serves as the primary computational platform enabling advanced Process Systems Engineering (PSE) capabilities to be integrated with commercial process simulation software. It can be used to synthesize, design, and optimize a complete carbon capture system while considering uncertainty. FOQUS enables users to effectively screen potential capture concepts in the context of a complete industrial process so that trade-offs can be appropriately evaluated. The technical and economic performance characteristics of the capture process are highly dependent on employing an effective approach for process synthesis. Since large-scale carbon capture processes are outside of current experience, heuristic and evolutionary approaches are likely to be inadequate. Thus, a key aspect of FOQUS is that it bridges this gap by supporting a superstructure-based approach to determine the optimal process configuration and equipment interconnections.
Modules¶
- SimSinter provides a wrapper to enable models created in process simulators to be linked into a FOQUS Flowsheet.
- The FOQUS Flowsheet is used to link simulations together and connect model variables between simulations on the flowsheet. FOQUS enables linking models from different simulation packages.
- Simulations are run through Turbine, which manages the multiple runs needed to build surrogate models, perform derivative-free optimization or conduct an Uncertainty Quantification (UQ) analysis. Turbine provides the capability for job queuing and enables these jobs to be run in parallel using cloud- or cluster-based computing platforms or a single workstation.
- The Surrogates module can create algebraic surrogate models to support large-scale deterministic optimization, including superstructure optimization to determine process configurations. One of the available surrogate models is the Automated Learning of Algebraic Models for Optimization (ALAMO). ALAMO is an external product due to background Intellectual Property (IP) issues.
- The Derivative-Free Optimization (DFO) module enables derivative-free (or simulation-based) optimization directly on the process models linked together on a FOQUS Flowsheet. It utilizes Excel to calculate complex objective functions, such as the cost of electricity.
- The UQ module enables the effects of uncertainty to be propagated through the complete system model, sensitivity of the model to be assessed, and the most significant sources of uncertainty identified to enable prioritizing of experimental resources to obtain additional data.
- The Optimization Under Uncertainty (OUU) module combines the capabilities of the DFO and the UQ modules to enable scenario-based optimization, such as optimization over a range of operating scenarios.
- The Sequential Design of Experiments (SDOE) module currently provides a way to construct flexible space-filling designs based on a user-provided candidate set of input points. The method allows new designs to be constructed, as well as existing data to be augmented, by strategically selecting input combinations that minimize the distance between points. Development of this module is continuing and will soon include other options for design construction.
Application Based Examples¶
FOQUS has been used to solve problems based on comprehensive analysis and optimization of carbon capture systems. Some relevant research work that includes FOQUS can be found in the following publications:
Chen, Y., Eslick, J.C., Grossmann, I.E., Miller, D.C., 2015. Simultaneous process optimization and heat integration based on rigorous process simulations. Computers and Chemical Engineering 81, 180–199.
Gao, Q., Miller, D.C., 2015. Optimization of amine-based solid sorbent chemistry for post-combustion carbon capture. Paper presented at: 2015 International Pittsburgh Coal Conference; 5–8 October 2015; Pittsburgh, PA, USA.
Ma, J., Mahapatra, P., Zitney, S.E., Biegler, L.T., Miller, D.C., 2016. D-RM Builder: A software tool for generating fast and accurate nonlinear dynamic reduced models from high-fidelity models. Computers and Chemical Engineering 94, 60–74.
Miller, D.C., Agarwal, D., Bhattacharyya, D., Boverhof, J., Chen, Y., Eslick, J., Leek, J., Ma, J., Mahapatra, P., Ng, B., Sahinidis, N.V., Tong, C., Zitney, S.E., 2017. Innovative computational tools and models for the design, optimization and control of carbon capture processes, in: Papadopoulos, A.I., Seferlis, P. (Eds.), Process Systems and Materials for CO2 Capture: Modelling, Design, Control and Integration. John Wiley & Sons Ltd, Chichester, UK, pp. 311–342.
Soepyan, F.B., Anderson-Cook, C.M., Morgan, J.C., Tong, C.H., Bhattacharyya, D., Omell, B.P., Matuszewski, M.S., Bhat, K.S., Zamarripa, M.A., Eslick, J.C., Kress, J.D., Gattiker, J.R., Russell, C.S., Ng, B., Ou, J.C., Miller, D.C., 2018. Sequential Design of Experiments to Maximize Learning from Carbon Capture Pilot Plant Testing. In: Eden, M.R., Ierapetritou, M.G., Towler, G.P. (Editors), 13th International Symposium on Process Systems Engineering (PSE 2018). Elsevier, Amsterdam, pp. 283-288.
Additional research work can be found on https://www.acceleratecarboncapture.org/publications