KEMBAR78
GSIUserGuide v3.1 | PDF | Library (Computing) | Compiler
0% found this document useful (0 votes)
426 views180 pages

GSIUserGuide v3.1

This document is a user's guide for version 3.1 of the Gridpoint Statistical Interpolation (GSI) data assimilation system. It provides an overview of GSI, instructions for installing and running GSI, descriptions of diagnostic tools and options for tuning GSI, and examples of applications for assimilating different types of observations with GSI. The guide is intended for both beginners and experienced users and includes details on theory, code structure, observation and background error statistics, and satellite radiance data assimilation.

Uploaded by

KadirOzturk
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
426 views180 pages

GSIUserGuide v3.1

This document is a user's guide for version 3.1 of the Gridpoint Statistical Interpolation (GSI) data assimilation system. It provides an overview of GSI, instructions for installing and running GSI, descriptions of diagnostic tools and options for tuning GSI, and examples of applications for assimilating different types of observations with GSI. The guide is intended for both beginners and experienced users and includes details on theory, code structure, observation and background error statistics, and satellite radiance data assimilation.

Uploaded by

KadirOzturk
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 180

Gridpoint Statistical Interpolation (GSI)

Version 3.1 Users Guide

Developmental Testbed Center


National Center for Atmospheric Research
National Centers for Environmental Prediction, NOAA
Global Systems Division, Earth System Research Laboratory, NOAA

July 2012
Foreword
This Users Guide describes the Gridpoint Statistical Interpolation (GSI) Version 3.1 data
assimilation system, released on July 20, 2012. This version of GSI is based on the GSI
repository from May 2012. As GSI is further developed, this document will be enhanced
and updated to match the released version.

This Users Guide is designed for both beginners and experienced users. It includes eight
chapters and two appendices:

Chapter 1: Overview
Chapter 2: Software Installation
Chapter 3: Running GSI
Chapter 4: GSI Diagnostics and Tuning
Chapter 5: GSI Applications
Chapter 6: GSI Theory and Code Structure
Chapter 7: Observation and Background Error Statistics
Chapter 8: Satellite Radiance Data Assimilation
Appendix A: GSI tools
Appendix B: GSI Namelist

Where Chapter 2 through Chapter 5 introduces basic skills on installing the GSI code,
running cases, and checking analysis results. Chapter 6 through 8 and Appendices A and B
present more details on the use of GSI and other advanced topics.

For the latest version of this document, please visit the GSI Users Website at

http://www.dtcenter.org/com-GSI/users/index.php.

Please send questions and comments to gsi_help@ucar.edu.

Editors: Ming Hu, Hui Shao, Donald Stark, Kathryn Newman

Contributors to this guide:


DTC: Chunhua Zhou, Xiang-Yu Huang
NOAA/NCEP/EMC: John Derber, Russ Treadon, Mike Lueken, Wan-Shu Wu
NCAR: Syed Rizvi, Zhiquan Liu, Tom Auligne, Arthur Mizzi
NOAA/ESRL/GSD: Steve Weygandt, Dezso Devenyi, Joseph Olson

ii
Acknowledgements:
We thank the U.S. Air Force Weather Agency and the National Oceanic and Atmospheric
Administration for their support of this work. . This work is also facilitated by National
Center for Atmospheric Research (NCAR). NCAR is supported by the National Science
Foundation (NSF).

iii
Table of Contents

Table of Contents

Chapter 1: Overview .................................................................................................................. 1


Chapter 2: Software Installation ............................................................................................ 7
2.1 Introduction .................................................................................................................................... 7
2.2 Obtaining the Source Code ......................................................................................................... 7
2.3 Compiling GSI .................................................................................................................................. 8
2.3.1 Environment Variables .......................................................................................................................... 9
2.3.2 Configure and Compile ....................................................................................................................... 10
2.4 System Requirements and External Libraries .................................................................. 12
2.4.1 Compilers .................................................................................................................................................. 12
2.4.2 NetCDF and MPI ..................................................................................................................................... 13
2.4.3 LAPACK and BLAS ................................................................................................................................ 14
2.5 Directory Structure, Source code and Supplemental Libraries .................................. 15
2.6 Getting Help and Reporting Problems ................................................................................ 16
2.7 Modifying the GSI Build Environment ................................................................................. 16
2.7.1 Understanding the Build System .................................................................................................... 16
2.7.2 Configuration Resource File ............................................................................................................. 18
2.7.3 Modification Example ......................................................................................................................... 19
Chapter 3: Running GSI .......................................................................................................... 21
3.1 Input Files ..................................................................................................................................... 21
3.2 GSI Run Script .............................................................................................................................. 24
3.2.1 Steps in the GSI run script ................................................................................................................. 24
3.2.2 Customization of the GSI run script .............................................................................................. 25
3.2.2.1 Setting up the machine environment .................................................................................................... 25
3.2.2.2 Setting up the running environment ..................................................................................................... 27
3.2.2.3 Setting up an analysis case ......................................................................................................................... 27
3.2.3 Description of the sample script to run GSI ............................................................................... 29
3.3 Introduction to Frequently Used GSI Namelist Options ................................................ 41
3.4 Files in GSI Run Directory ........................................................................................................ 45
Chapter 4: GSI Diagnostics and Tuning ............................................................................ 48
4.1 Understanding Standard Output (stdout) .......................................................................... 48
4.2 Single Observation Test ............................................................................................................ 63
4.2.1 Setup a single observation test: ...................................................................................................... 63
4.2.2. Examples of single observation tests for GSI ........................................................................... 64
4.3 Control Data Usage ..................................................................................................................... 65
4.4 Domain Partition for Parallelization and Observation Distribution ........................ 69
4.5 Observation Innovation Statistics ......................................................................................... 70
4.5.1 Conventional observations ............................................................................................................... 71
4.5.2 Satellite radiance ................................................................................................................................... 74
4.6 Convergence Information ........................................................................................................ 77
4.7 Use bundle to configure control, state variables and background fields ................ 82
4.8 Analysis Increments .................................................................................................................. 83
4.9 Running Time and Memory Usage ........................................................................................ 83

iv
Table of Contents

Chapter 5: GSI Applications .................................................................................................. 84


5.1. Assimilating Conventional Observations with GSI: ........................................................ 85
5.1.1: Run script ................................................................................................................................................ 85
5.1.2: Run GSI and check the run status ................................................................................................. 87
5.1.3: Check for successful GSI completion ........................................................................................... 89
5.1.4: Diagnose GSI analysis results ......................................................................................................... 91
5.1.4.1: Check analysis fit to observations ......................................................................................................... 91
5.1.4.2: Check the minimization ............................................................................................................................. 94
5.1.4.3: Check the analysis increment .................................................................................................................. 96
5.2. Assimilating Radiance Data with GSI .................................................................................. 97
5.2.1: Run script ................................................................................................................................................ 97
5.2.2: Run GSI and check run status ......................................................................................................... 99
5.2.3: Diagnose GSI analysis results ......................................................................................................... 99
5.2.3.1: Check file fort.207 ......................................................................................................................................... 99
5.2.3.2: Check the analysis increment ............................................................................................................... 101
5.3. Assimilating GPS Radio Occultation Data with GSI ...................................................... 103
5.3.1: Run script ............................................................................................................................................. 103
5.3.2: Run GSI and check the run status .............................................................................................. 103
5.3.3: Diagnose GSI analysis results ...................................................................................................... 104
5.3.3.1: Check file fort.212 ...................................................................................................................................... 104
5.3.3.2: Check the analysis increment ............................................................................................................... 105
5.4 Introduction to GSI hybrid analysis ................................................................................... 105
Summary ............................................................................................................................................. 107
Chapter 6: GSI Theory and Code Structure ................................................................... 108
6.1 GSI Theory ................................................................................................................................... 108
6.1.1 3DVAR equations: .............................................................................................................................. 108
6.1.2 Iterations to find the optimal results ........................................................................................ 109
6.1.3 Analysis variables .............................................................................................................................. 110
6.2 GSI Code Structure .................................................................................................................... 110
6.2.1 Main process ........................................................................................................................................ 110
6.2.2 GSI background IO (for 3DVAR) .................................................................................................. 113
6.2.3 Observation ingestion ...................................................................................................................... 114
6.2.4 Observation innovation calculation ........................................................................................... 115
6.2.5 Inner iteration ................................................................................................................................... 116
Chapter 7: Observation and Background Error Statistics ........................................ 117
7.1 Conventional Observation Errors ....................................................................................... 117
7.1.1 Getting original observation errors ........................................................................................... 117
7.1.2 Observation error adjustment and gross error check within GSI ................................. 118
7.2 Background Error Covariance ............................................................................................. 120
7.2.1 Processing of background error statistics .............................................................................. 121
7.2.2 Apply background error covariance .......................................................................................... 123
Chapter 8: Satellite Radiance Data Assimilation ......................................................... 125
8.1. Satellite radiance data ingest and distribution ............................................................. 125
8.1.1 link radiance BUFR files to GSI recognized names .............................................................. 125
8.1.2 GSI Code to ingest radiance data ................................................................................................. 128
8.1.3 information on ingesting and distribution .............................................................................. 131
8.2. Radiance observation operator .......................................................................................... 131
8.3. Radiance observation quality control ............................................................................. 132
8.4. Bias correction for radiance observations ..................................................................... 134

v
Table of Contents

8.4.1. Bias correction for satellite observations .............................................................................. 134


8.4.2. The GSI Bias correction procedure and configurations ................................................... 135
8.4.3 Namelist, satinfo, and coefficients for bias correction ....................................................... 137
8.4.4. Utility for Angle Bias Correction ................................................................................................. 140
8.4.5. Discussion of FAQ ............................................................................................................................. 143
8.5. Radiance data analysis monitoring ........................................................................................ 145
8.5.1 Organization and compilation ...................................................................................................... 145
8.5.2 Data Extraction ................................................................................................................................... 146
8.5.3 Image Generation ............................................................................................................................... 149
8.5.4 Additional notes ................................................................................................................................. 151
Appendix A: GSI tools ........................................................................................................... 153
A.1 BUFR Tools and SSRC .............................................................................................................. 153
A.2 Read GSI Diagnostic Files ....................................................................................................... 153
A.3 Read and Plot Convergence Information from fort.220 .............................................. 157
A.4 Plot Single Observation Test Result and Analysis Increment ................................... 158
Appendix B: GSI Namelist .................................................................................................... 159

vi
Overview

Chapter 1: Overview

GSI history and background:

The Gridpoint Statistical Interpolation (GSI) system is a unified variational data


assimilation (DA) system for both global and regional applications. It was initially
developed by the National Oceanic and Atmospheric Administration (NOAA) National
Centers for Environmental Prediction (NCEP) as a next generation analysis system based
on the then operational Spectral Statistical Interpolation (SSI) analysis system. Instead of
being constructed in spectral space like the SSI, the GSI is constructed in physical space
and is designed to be a flexible, state-of-art system that is efficient on available parallel
computing platforms.

After initial development, the GSI analysis system became operational as the core of the
North American Data Assimilation System (NDAS) in June 2006 and the Global Data
Assimilation System (GDAS) in May 2007 at NOAA. Since then, the GSI system has been
adopted in various operational systems, including the National Aeronautics and Space
Administration (NASA) global atmospheric analysis system, the NCEP Real-Time
Mesoscale Analysis (RTMA) system, the Hurricane WRF (HWRF), and the Rapid Refresh
(RR) system. The number of groups involved in operational GSI development has also
been expanded to include more development groups, including NASA Goddard Global
Modeling and Assimilation Office (GMAO), NOAA Earth System Research Laboratory
(ESRL), and National Center for Atmospheric Research (NCAR) Earth System
Laboratorys Mesoscale and Microscale Meteorology Division (MMM).

GSI becomes community code:

Starting in 2007, the Developmental Testbed Center (DTC) began collaborating with major
GSI development groups to transform the operational GSI system into a community system
and support distributed development. The DTC has complemented the development groups
in providing GSI documentation, porting GSI to multiple platforms, and testing GSI in an
independent and objective environment, while still maintaining functionally equivalent to
operational centers. Working with NCEP Environmental Modeling Center (EMC), the
DTC is maintaining a community GSI repository that is equivalent to the operational
developmental repository and facilitates community users to develop GSI. Based on the
repository, the DTC releases GSI code annually with intermediate bug fixes. The first
community version of the GSI system was released by the DTC in 2009. Since then, the
DTC has provided community support through the GSI helpdesk (gsi_help@ucar.edu),
community GSI webpage (http://www.dtcenter.org/com-GSI/users/index.php), and annual
community GSI tutorial/workshop.

1
Overview

GSI code management and Review Committee:

In late 2010, a GSI review committee was formed to coordinate distributed development of
the GSI system. The committee is composed of primary GSI research and operational
centers, including NCEP/EMC, NASA/GMAO, NOAA/ESRL, NCAR/MMM, AFWA and
DTC. As one of the committee members, the DTC represents the research community with
GSI users from all over the world.

The GSI Review Committee primarily steers distributed GSI development and community
code management and support. The responsibilities of the Review Committee are divided
into two major aspects: coordination and code review. The purpose and guiding principles
of the Review Committee are as follows:

Coordination and Advisory


Propose and shepherd new development
Coordinate on-going and new development
Process management
Community support recommendation

GSI Code Review


Establish and manage a unified GSI coding standard followed by all GSI
developers.
Establish and manage a process for proposal and commitment of new
developments to the GSI repository.
Review proposed modifications to the code trunk.
Make decisions on whether code change proposals are accepted or denied for
inclusion in the repository and manage the repository.
Oversee the timely testing and inclusion of code into the repository.

The review committee is committed to facilitating the transition from research to


operations (R2O). Prospective contributors of code should contact the DTC through the
GSI helpdesk (gsi_help@ucar.edu ). The DTC will help the prospective contributors go
through a scientific review from the GSI Review Committee to avoid any potential conflict
in code development. Upon the approval of the GSI review committee, the DTC will work
with the prospective contributors in the preparation and integration of their code into the
GSI repository.

About this GSI release version:

As a critical part of the GSI user support, this GSI Users Guide is provided to assist users
in applying GSI to data assimilation and analysis studies. It was composed by the DTC and
reviewed by the GSI Review Committee members. Please note the DTC currently focuses
on testing and evaluation of GSI for limited area Numerical Weather Prediction (NWP)
applications; however, the long-term plan includes transitioning to global forecast
applications. This documentation describes the July 2012 version 3.1 release, which

2
Overview

includes new capabilities and enhancements as well as bug fixes. This version of GSI is
based on a reversion of the community GSI repository from May 2012.

The GSI version 3.1 is primarily a three-dimensional variational (3D-Var) system with
modules developed for advanced features, such as the four-dimensional variational (4D-
Var), ensemble hybrid variational analysis, and observation sensitivity capabilities.
Coupled with forecast models and their adjoint models, GSI can be turned into a 4D-var
system. Combined with an ensemble system, this GSI can also be used in an ensemble-
variational hybrid data assimilation system (GSI-hybrid) with the appropriate
configuration. An example of such an application is the current Global Data Assimilation
System (GDAS) at NCEP, using the GSI-hybrid capability with NCEPs global ensemble
forecast system.

Observations used by this version GSI:

GSI is being used by various applications on multiple scales. Therefore, the types of
observations GSI can assimilate vary from conventional to aerosol observations. Users
should use observations with caution to fit their specific applications. The GSI version 3.1
can assimilate, but is not limited to, the following types of observations:

Conventional observations: (including satellite retrievals)


Radiosondes
Pibal winds
Synthetic tropical cyclone winds
Wind profilers
Conventional aircraft reports
ASDAR aircraft reports
MDCARS aircraft reports
Dropsondes
MODIS IR and water vapor winds
GMS, JMA, METEOSAT and GOES cloud drift IR and visible winds
GOES water vapor cloud top winds
Surface land observations
Surface ship and buoy observation
SSM/I wind speeds
QuikScat and ASCAT wind speed and direction
SSM/I and TRMM TMI precipitation estimates
Doppler radial velocities
VAD (NEXRAD) winds
GPS precipitable water estimates
GPS Radio occultation refractivity and bending angle profiles
SBUV ozone profiles and OMI total ozone
SST
Tropical storm VITAL

Satellite radiance/brightness temperature observations (instrument/satellite ID)


SBUV: n17, n18, n19
HIRS: metop-a, metop-b, n17, n19

3
Overview

GOES_IMG: g11, g12


AIRS:aqua
AMSU-A: metop-a, metop-b, n15, n18, n19
AMSU-B: metop-b, n17
MHS: metop-a, metop-b, n18, n19
SSMI: f14, f15
SSMIS: f16
AMSRE: aqua
SNDR: g12
IASI: metop-a, metop-b
GOME: metop-a, metop-b,
OMI: aura
SEVIR: m08, m09, m10
ATMS: NPP
CRIS: NPP

What is new in this release version:

The following lists some of the new functions and changes included in the GSI release
version 3.1 versus version 3.0:

1. Development of new data assimilation techniques:


Addition of 4D capability for ensembles to allow several flavors of 4DVAR using
ensembles;
Enhancement of Global hybrid function and implementation. Development of
hybrid capability for NAM and HWRF. Addition of dual resolution capability for
regional hybrid ensemble applications;
Addition of cloudy radiance assimilation;
Addition of the NCEP 4DVAR perturbation model as a tool in utility directory;
Incorporation of hybrid capability to sqrt(B) option.

2. Improvement to general data analysis techniques:


Updated and enhanced the utility Radiance Monitoring package for easily
installation and better monitoring radiance observations;
Improved the use of the significant levels in radiosonde observation;
Extended FGAT capability to NAM and ARW NetCDF interface;
Added GSI Metguess bundle;
Added the Bi-Conjugate Gradient minimization option;
Added radiance bias correction spin-up of new instruments: the new channels are
assigned as passive channels and monitored before they are used in the data
assimilation. While they are passive, bias correction will be performed;
Improved the GSD cloud analysis scheme;
Improved surface data analysis for RAP applications.

4
Overview

3. Improvement of the use of observations and addition of new observations:


Enhanced assimilation of GPS-RO data;
Added climatological monthly zonal mean CO2 fields and allow these fields to
impact the CRTM's radiance computation;
Inclusion of ATMS. Added a new fixed file for the ATMS filtering;
Added capability to do chemical data assimilation, including aerosol, PM2.5, and
MODIS AOD;
Allow assimilation of Doppler wind Lidar data;
Improved data quality control process for avhrr, AMSU-A channel 5;
Modification of the specification of TCVITL surface pressure observational errors;
Added the capability to assimilate MLS ozone BUFR data. Added an option to
revert ozone analysis to uni-variate above specified level in hybrid;
Additional quality control for satellite winds;
Inclusion of JMA water vapor cloud (type 250) winds;
Inclusion of the ability to use SEVIRI data;
Improvement of radar radial wind data analysis:
Added observation subtype to discriminate between level-2, level-2.5, and
level-3 observations;
Turned off the use of level-2.5 radial winds in the CONUS domain but kept
the level-2.5 data for Alaska;
Increased the observation error for level-3 data in the CONUS domain.
Enhancements to RTMA:
Added the analysis of: (i) 10-m wind gust; (ii) visibility; and (iii) planetary
boundary layer height;
Included the new report types (192-195, 292-295) that were created for surface
observations missing a surface pressure report;
Added the Hilbert Curve-based cross-validation.

Please note due to the version update, some diagnostic files and static information (fixed)
files have been changed as well.

The structure of this Users Guide:

The Users Guide is organized as follows. Chapter 1 provides a background introduction of


GSI. Chapter 2 contains basic information about how to get started with GSI including
system requirements; required software (and how to obtain it); how to download GSI; and
information about compilers, libraries, and how to build the code. Chapter 3 focuses on the
input files needed to run GSI and how to configure and run GSI. Chapter 4 includes
information about diagnostics and tuning of the system. Chapter 5 illustrates how to setup
configurations to run GSI with conventional, radiance, and GPSRO data and how to
diagnose the results. A brief introduction to the GSI hybrid application is also included in
this chapter. Chapter 6 introduces the data assimilation and minimization theories behind
the GSI system and the GSI code structure. Chapter 7 includes some basics on the

5
Overview

background and observation errors. Finally, Chapter 8 introduces several topics on satellite
radiance data assimilation using GSI, including configuration, bias correction, and
monitoring. Appendix A introduces the community tools available for GSI users and
Appendix B is a complete list of the GSI namelist with explanations.

6
Software Installation

Chapter 2: Software Installation

2.1 Introduction

The DTC community GSI is a community distribution of NOAAs operational GSI. The
original operational code was developed at NOAA/NCEP for use on IBM supercomputers.
The community GSI expands the portability of the operational code by adding a flexible
build system and run script that allows GSI to be compiled and run on a variety of
platforms. The current version of GSI builds and runs on Linux platforms using Intel or
PGI compilers, IBM supercomputers using the xlf compiler, and Intel Macintosh
computers using the PGI compiler.

This chapter provides build instructions applicable only to the DTC community GSI. They
do not pertain to NCEPs operational version of the code because of the different build
environment and operational requirements. However, it can be easily adopted for the same
purpose.

This chapter describes how to build and install the GSI on your computing resources. The
process consists of a few steps:
Obtain the source code (section 2.2).
o Obtain GSI from the community GSI web page.
o Obtain the WRF model from the WRF project page.
Build the WRF model (see the WRF users guide).
Set the appropriate environment variables for the GSI build (section 2.3.1).
Configure and compile the GSI source code (section 2.3.2).

This chapter is organized as follows: Section 2.2 describes how to obtain the source code.
Section 2.3 covers setting up the build environment, configuring the build system, and
compiling the code. Section 2.4 covers the system requirements (tools, libraries, and
environment variable settings) and currently supported platforms in detail. Section 2.5
covers the directory structure and supplemental NCEP libraries included with the
distribution. Section 2.6 discusses what to do if you have problems with the build and
where to get help. And section 2.7 briefly illustrates how to port GSI to another platform.

For beginning users, sections 2.2 and 2.3 provide the necessary overview needed to build
GSI on most systems. The remainder of the chapter is provided for completeness and in
case the user has any difficulty building the code.

2.2 Obtaining the Source Code

The community GSI resources, including source code, build system, utilities, and
documentation, are available from the DTC community GSI users website, located at

7
Software Installation

http://www.dtcenter.org/com-GSI/users/index.php

The source code is available by first selecting the Download tab on the vertical menu
located on the left of the page, and then selecting the GSI System submenu.

New users must first register before downloading the source code. Returning users only
need their registration email address to log in. After getting in the download page, select
the link to the comGSI v3.1 tarball to download the most recent version, as of July 2012,
of the source code. Selecting the newest release of the community GSI is critical for having
the most recent capabilities, supplemental libraries, and code fixes. To analyze satellite
radiance observations, GSI requires the CRTM coefficients from the most recent version of
the CRTM library. As a convenience to the user, and because of its large size, the latest
CRTM coefficients are provided as a separate tarfile, and can be downloaded by selecting
the link CRTM 2.0.5 coefficients tarball from the web page.

The community GSI version 3.1 comes in a tar file named comGSI_v3.1.tar.gz. The tar
file may be unpacked by using the UNIX commands:

gunzip d comGSI_v3.1.tar.gz
tar -xvf comGSI_v3.1.tar

or the single Linux command

tar -zxvf comGSI_v3.1.tar.gz

This creates the top level GSI directory comGSI_v3.1/.

After downloading the source code, and prior to building, the user should check the known
issues link on the DTC website to determine if any bug fixes or platform specific
customizations are needed.

Since the WRF model is needed for a successful GSI build, the WRF code should be
downloaded and built prior to proceeding with the GSI build. The WRF code, and full
WRF documentation, can be obtained from the WRF Users Page,

http://www.mmm.ucar.edu/wrf/users/

following a registration process similar to that for downloading GSI.

2.3 Compiling GSI

This section provides a quick overview of how to build GSI on most systems. Typically
GSI will build straight out of the box on any system that successfully builds the WRF
model. Should the user experience any difficulties with the default build, check the build
environment against the requirements described in section 2.4.

8
Software Installation

To proceed with the GSI build, it is assumed that the WRF model has already been built on
the current system. GSI reads and writes I/O using the WRF I/O API libraries. These I/O
libraries are created as part of the WRF build, and are linked into GSI during the GSI build
process. In order to successfully link the WRF I/O libraries with the GSI source, it is
crucial that both WRF and GSI are built using the same compilers. This means that if WRF
is built with the Intel Fortran compiler, then GSI must also be built with the same Intel
Fortran compiler.

2.3.1 Environment Variables

Before configuring the GSI code to be built, at least one, and up to three environment
variables may need to be set. The first variable WRF_DIR defines the path to the root of the
WRF build directory. This one is mandatory. It tells the GSI build system where to find the
WRF I/O libraries. The second environment variable specifies the local path to NetCDF
library. The variable for this is called NETCDF. The third variable is LAPACK_PATH,
which specifies the path to the LAPACK library on your system. These variables must be
set prior to configuring the build. Be aware that these paths are dependent on your choice
of compiler. In many cases, the latter two variables may already be defined as part of your
login environment. This is typical on systems with only one compiler option such as IBM
supercomputers using the xlf compiler. On systems with multiple compiler options, such as
many Linux systems, these variables may be set as part of the module settings for each
compiler.

The process for setting the environment variables varies according to the login shell in use.
To set the path variable WRF_DIR for csh/tcsh, type;

setenv WRF_DIR /path_to_WRF_root_directory/

for bash/ksh, the syntax is;

export WRF_DIR=/path_to_WRF_root_directory/

As one of the additional environment variables, the value of the NETCDF variable may be
checked by echoing the variable using:

echo $NETCDF

If the command returns with the response that the variable is undefined, such as

NETCDF: Undefined variable.

it is then necessary to manually set this variable. If your system uses modules or a similar
mechanism to set the environment, do this first. If a valid path is returned by the echo
command, no further action is required.

9
Software Installation

The last environment variable, LAPACK_PATH, defines the path to the LAPACK library.
Typically, this variable will only need to be set on systems without a vender provided
version of LAPACK. IBM systems typically come installed with the LAPACK equivalent
ESSL library that links automatically. Likewise, the PGI compiler often comes with a
vender provided version of LAPACK that links automatically with the compiler. Over the
last three years, the GSI help staff have experienced only two situations where the
LAPACK variable needed to be set.

1. On stripped down versions of the IBM operating system that come without the
ESSL libraries.
2. Linux environments using Intel Fortran compiler.

Of the two, the second of these is the most common. The Intel compiler usually comes with
a vender provided mathematics library known as the Mathematics Kernel Libraries or
MKL for short. While most installations of the Intel compiler typically come with the MKL
libraries installed, the ifort compiler does not automatically load the library. It is therefore
necessary to set the LAPACK_PATH variable to the location of the MKL libraries when
using the Intel compiler. You may need to ask your system administrator for the correct
path to these libraries.

On the NOAA supercomputers, the Intel build environment can be specified through
setting the modules. When this is done, the MKL library path is available through a local
variable, MKL on Jet, and MKL_ROOT on Zeus.

Machine Name MKL library path


Jet MKL
Zeus MKL_ROOT

Supposing that the MKL library path is set to the environment variable MKL, then the
LAPACK environment may be set for csh/tcsh with

setenv LAPACK_PATH $MKL

and for bash/ksh by

export LAPACK_PATH=$MKL

Once the environment variables have been set, the next step in the build process is to first
run the configure script and then the compile script.

2.3.2 Configure and Compile

Once the environment variables have been set, building the GSI source code requires two
steps:

1. Run the configure script

10
Software Installation

2. Run the compile script

Change into the comGSI_v3.1 directory and issue the configure command:

./configure

This command uses user input to create a platform specific configuration file called
configure.gsi. The script starts by echoing the NETCDF and WRF_DIR paths set in the
previous section. It then examines the current system and queries the user to select from
multiple build options. On selecting an option, it reports a successful configuration with the
banner:
------------------------------------------------------------------------
Configuration successful. To build the GSI, type: compile
------------------------------------------------------------------------

Failure to get this banner means that the configuration step failed to complete. The most
typical reason for a failure is an error in one of the paths set to the environment variables.

On Linux 64 bit environments, the supported compiler options are:


1. Linux x86_64, PGI compilers (pgf90 & pgcc) (dmpar,optimize)
2. Linux x86_64, PGI/gnu compiler (pgf90 & gcc) (dmpar,optimize)
3. Linux x86_64, Intel compilers (ifort & icc) (dmpar,optimize)
4. Linux x86_64, Intel/gnu compiler (ifort & gcc) (dmpar,optimize)

It is important to note that on Linux systems the Fortran compiler and the C compiler do
not have to match. This allows the build to use the GNU C-compiler in place of the Intel or
PGI C-compiler.

On IBM computers with the xlf compiler, there are two build options:
1. AIX 64-bit (dmpar,optimize)
2. AIX 32-bit (dmpar,optimize)

You may choose either option depending on your computer platform.

After selecting a build option, run the compile script:

./compile

It is typically useful to capture the build information to a log file by redirecting the output.

To conduct a complete clean, which removes ALL built files in ALL directories, as well as
the configure.gsi, type:

./clean -a

A complete clean is strongly recommended if the compilation failed or if the configuration


file is changed.

11
Software Installation

Following a successful compile, the GSI executable gsi.exe can be found in the run/
directory. If the executable is not found, check the compilation log file for the word Error
with a capital E to determine the failure.

2.4 System Requirements and External Libraries

The source code for GSI is written in FORTRAN, FORTRAN 90, and C. In addition, the
parallel executables require some flavor of MPI and OpenMP for the distributed memory
parallelism. Lastly the I/O relies on the NetCDF I/O libraries. Beyond standard shell
scripts, the build system relies on the Perl scripting language and GNU Make.

The basic requirements for building and running the GSI system are the following:

FORTRAN 90+ compiler


C compiler
MPI v1.2+
OpenMP
Perl
NetCDF V3.6+ or V4.0+
LAPACK and BLAS mathematics libraries
WRF V3.3.1+

Because all but the last of these tools and libraries are typically the purview of system
administrators to install and maintain, they are lumped together here as part of the basic
system requirements.

2.4.1 Compilers

The DTC community GSI system successfully builds and runs on IBM AIX, Linux, and
Mac Darwin platforms. The following compiler/OS combinations are supported:

IBM with xlf Fortran compiler


Linux (both 32 and 64-bit) with
o PGI pgf90 & pgcc
o PGI pgf90 & Gnu gcc
o Intel ifort & icc
o Intel ifort & Gnu gcc
Mac Darwin with PGI

Currently, GSI is only tested on the IBM and Linux compiler configurations due to
availability of testing resources. Unforeseen build issues may occur when using older
compiler and library versions. As always, the best results come from using the most recent
version of compilers.

12
Software Installation

2.4.2 NetCDF and MPI

GSI requires a number of support libraries not included with the source code. These
libraries must be built with the compilers being used to build the remainder of the code.
Because of this, they are typically bundled along with the compiler installation, and are
subsequently referred to as system libraries. For our needs, the most important libraries
among these are NetCDF, MPI and OpenMP.

The NetCDF library is an exception to the rule of using the most recent version of libraries.
GSI will work with either V3.6+ or the new V4. Recent updates to NetCDF V4 support
backward compatibility. The V4 series is significantly more difficult to install, so version
3.6+ may be preferred for that reason. The NetCDF libraries can be obtained from the
Unidata website:

http://www.unidata.ucar.edu

Typically, the NetCDF library is installed in a directory that is included in the users path
such as /usr/local/lib. When this is not the case, the environment variable NETCDF, can
be set to point to the location of the library. For csh/tcsh, the path can be set with the
command:

setenv NETCDF /path_to_netcdf_library/

For bash/ksh, the path can be set with the command:

export NETCDF=/path_to_netcdf_library/

It is crucial that system libraries, such as NetCDF, be built with the same FORTRAN
compiler, compiler version, and compatible flags, as used to compile the remainder of the
source code. This is often an issue on systems with multiple FORTRAN compilers, or
when the option to build with multiple word sizes (e.g. 32-bit vs. 64-bit addressing) is
available.

Many default Linux installations include a version of NetCDF. Typically this version is
only compatible with code compiled using gcc. To build GSI, a version of the library must
be built using your preferred Fortran compiler and with both C and FORTRAN bindings. If
you have any doubts about your installation, ask your system administrator.

GSI requires that a version of the MPI library with OpenMP support be installed. GSI is
written as a distributed memory parallel code and these libraries are required to build the
code. Just as with the NetCDF library, the MPI library must be built with the same
FORTRAN compiler, and use the same word size option flags, as the remainder of the
source code.

13
Software Installation

Installing MPI on a system is typically a job for the system administrator and will not be
addressed here. If you are running GSI on a computer at a large center, check the
machines documentation before you ask the local system administrator. On Linux or Mac
Darwin systems, you can typically determine whether MPI is installed by running the
following UNIX commands:

which mpif90
which mpicc

If any of these tests return with Command Not Found, there may be a problem with your
MPI installation or the configuration of users account. Contact your system administrator
for help if you have any questions.

2.4.3 LAPACK and BLAS

The LAPACK and BLAS are open source mathematics libraries for the solution of linear
algebra problems. The source code for these libraries is freely available to download from
NETLIB at

http://www.netlib.org/lapack/

Most commercial compilers provide their own optimized versions of these routines. These
optimized versions of BLAS and LAPACK provide superior performance to the open
source versions.

On the IBM machines, the AIX compiler is often, but not always, installed with the
Engineering and Scientific Subroutine Libraries or ESSL. In part, the ESSL libraries are
highly optimized parallel versions of many of the LAPACK and BLAS routines. Because
GSI was developed on an IBM platform, all of the required linear algebra routines are
available from the ESSL libraries.

On Linux or Mac Darwin systems, GSI supports both the Intel and PGI compilers. The
Intel compiler has its own optimized version of the BLAS and LAPACK routines called the
Math Kernel Library or MKL. The MKL libraries provide sufficient routines to satisfy the
requirements of GSI. The PGI compiler typically comes with its own version of the BLAS
and LAPACK libraries. Again, the PGI version of BLAS and LAPACK contain the
necessary routines to build GSI. For PGI these libraries are loaded automatically.

Should your compiler come without any vender supplied linear algebra libraries, it will be
necessary to download and build your own local version of the BLAS and LAPACK
libraries. Just as with the NetCDF and MPI libraries, the LAPACK/BLAS libraries must be
built with the same FORTRAN compiler, and use the same word size option flags, as the
remainder of the source code.

14
Software Installation

2.5 Directory Structure, Source code and Supplemental Libraries

The GSI system includes the GSI source code, build system, supplemental libraries, fixed
files, and run scripts. The following table lists the system components found inside of the
main GSI directory.

Directory Name Content


src/main/ GSI source code and makefile
src/libs/ Source directory for supplemental libraries
fix/ Fixed input files required by a GSI analysis, such as
background error covariances, observation error tables;
excluding CRTM coefficients
include/ Include file directory, created by the build system
lib/ Directory containing compiled supplemental libraries
created by the build
run/ Directory for executable gsi.exe and sample run script
arch/ Build options and machine architecture specifics
util/ Tools

The src/ directory contains two subdirectories. The main/ directory holds the main GSI
source code along with its makefiles, and the libs/ holds the source code for the
supplemental libraries. The fix/ directory holds the static or fixed input files used by GSI.
These files include among other things, the background error covariances and the
observation error tables. Starting with version 3 of GSI, the GSI tar file no longer contains
the CRTM coefficients due to their size. The CRTM coefficients are now included as a
separate tar file and may be stored separately. This will be discussed further in the next
chapter. The include/ and lib/ directories are created by the build process to store the
include and library files, respectively. The run/ directory contains the executable and a
sample run script for conducting an analysis run. The arch/ directory contains the machine
specific information used by the build system. The build system is discussed in detail in
section 2.7. Lastly, the util/ directory contains tools for GSI diagnostics.

For the convenience of the user, supplemental NCEP libraries for building GSI are
included in the src/libs/ directory. These libraries are built when GSI is built. These
supplemental libraries are listed in the table below.

Directory Name Content


Source code for supplemental libraries (under src/libs/. except WRF)
bacio/ NCEP BACIO library
bufr/ NCEP BUFR library
crtm_gfsgsi_2.0/ JCSDA community radiative transfer model v2.0.5
gfsio/ Unformatted Fortran record for GFS IO
gsdcloud/ GSD Cloud analysis library
misc/ Misc support libraries
nemsio/ NEMS I/O library

15
Software Installation

sfcio/ NCEP GFS surface file i/o module


sigio/ NCEP GFS atmospheric file i/o module
sp/ NCEP spectral - grid transforms
w3/ NCEP W3 library (date/time manipulation, GRIB)
WRF/ IO API libraries are used by GSI (not with GSI tar file)

The one library in this table, which is not included with the source code, is WRF. GSI
uses the WRF I/O APIs to read NetCDF files created as WRF output. Therefore a copy of
successfully compiled WRF is required for the GSI build. Please note that only
WRFV3.3.1 and WRFV3.4 are tested with this current version of GSI.

2.6 Getting Help and Reporting Problems

Should the user experience any difficulty building GSI on their system, please first confirm
that all the required software is properly installed (section 2.4). Next check that the external
libraries exist and that their paths are correct. Lastly, check the resource file
configure.gsi for errors in any of the paths or settings. Should all these check out, feel
free to contact the community GSI supporters at gsi_help@ucar.edu for assistance.

At a minimum, when reporting code building problems to the helpdesk, please include with
your email a copy of the build log, and the contents of the configure.gsi file.

2.7 Modifying the GSI Build Environment

The GSI build system comes with build settings that should compile on most standard
systems. Typically if the WRF model builds on a system, GSI will build there as well.
Unfortunately because standardization is lacking when it comes to setting up Linux HPC
environments, it may be necessary to customize the GSI build settings in order to account
for eccentric computing environments.

Typically build problems can be traced back to issues with the libraries, the compiler, or
the support utilities such as cpp. These types of issues can usually be solved by
customizing the default configuration file settings. This sets up an iterative process where
the build parameters are modified, the compile script is run, build errors diagnosed, and the
process repeated.

2.7.1 Understanding the Build System

The GSI build system uses a collection of data files and scripts to create a configuration
resource file that defines the local build environment.

16
Software Installation

At the top most level there are four scripts. The clean script removes everything created by
the build. The configure script takes local system information and queries the user to
select from a collection of build options. The results of this are saved into a resource file
called configure.gsi. Once the configure.gsi file is created, the actual build is initiated
by running the compile script. The compile script then calls the top-level makefile and
builds the source code.

Name Content
makefile Top-level makefile
arch/ Build options and machine architecture specifics
clean Script to clean up the directory structure
configure Script to configure the build environment for compilation.
Creates a resource file called configure.gsi
compile Script for building the GSI system. Requires the existence
of the configure.gsi prior to running

The compile script uses the resource file configure.gsi to set paths and environment
variables required by the compile. The configure script generates the resource file
configure.gsi by calling the Perl script Config.pl, located in the arch/ directory. The
script Config.pl combines the build information from the files in the arch/ directory with
machine specific and user provided build information to construct the configure.gsi
resource file.

A clean script is provided to remove the build objects from the directory structure. Running
./clean scrubs the directory structure of the object and module files. Running a clean-all
./clean a removes everything generated by the build, including the library files,
executables, and the configure resource file. Should the build fail, it is strongly
recommended that the user run a ./clean a prior to rerunning the compile script.

The arch/ directory contains a number of files used to construct the configuration resource
file configure.gsi.

File name Description


preamble Uniform requirements for the code, such as word size, etc.
configure.defaults Selection of compilers and options.
Users can edit this file if a change to the compilation
options or library locations is needed. Also, it can be
edited to add a new compilation option if needed.
postamble Standard compilation (make) rules and dependencies

Most users will not need to modify any of these files unless experiencing significant build
issues. Should a user require a significant customization of the build for their local
computing environment, those changes would be saved to the configure.defaults file
only after first testing these new changes in the temporary configure.gsi file.

17
Software Installation

2.7.2 Configuration Resource File

The configuration resource file configure.gsi contains build information, such as


compiler flags and paths to system libraries, specific to a particular machine architecture
and compiler.

To illustrate its contents, lets look at the resource for the Linux Intel build.

The header describes the overall build environment


Linux x86 with 64 bit word size
Uses Intel Fortran and C compilers
# Settings for Linux x86_64, Intel compiler (ifort & icc) (dmpar,optimize)#

The link path points to an Intel version of NetCDF and the OpenMP libraries.
LDFLAGS = -Wl,-rpath,/usr/local/netcdf3-ifort/lib -openmp

The code directory location:


COREDIR = $HOME/comGSI_v3.1

Various environment variables:


INC_DIR = $(COREDIR)/include
BYTE_ORDER = LITTLE_ENDIAN

Compiler definitions
Intel ifort Fortran compiler
Intel icc C compiler
SFC = ifort
SF90 = ifort -free
SCC = icc

The include paths for GSI source code and NetCDF:


INC_FLAGS = -module $(INC_DIR) -I $(INC_DIR) -I /usr/local/netcdf3-ifort/include

The default Fortran compiler flags for the main source code:
FFLAGS_DEFAULT = -fp-model precise -assume byterecl -fpe0 -ftz
FFLAGS = -O3 $(FFLAGS_DEFAULT) $(INC_FLAGS) $(LDFLAGS) -DLINUX

The default Fortran compiler flags for the external libraries:


FFLAGS_BACIO = -O3 $(FFLAGS_DEFAULT)
FFLAGS_BUFR = -O3 $(FFLAGS_DEFAULT) $(FFLAGS_i4r8)
CFLAGS_BUFR = -O3 -DUNDERSCORE
FFLAGS_CLOUD = -O3 $(FFLAGS_DEFAULT)
FFLAGS_CRTM = -O2 $(FFLAGS_DEFAULT)
FFLAGS_GFSIO = -O3 $(FFLAGS_DEFAULT) $(FFLAGS_i4r4)
FFLAGS_SFCIO = -O3 $(FFLAGS_DEFAULT) $(FFLAGS_i4r4)
FFLAGS_SIGIO = -O3 $(FFLAGS_DEFAULT) $(FFLAGS_i4r4)
FFLAGS_SP = -O3 $(FFLAGS_DEFAULT) $(FFLAGS_i4r8)
FFLAGS_W3 = -O3 $(FFLAGS_DEFAULT)
INC_FLAGS_W3 = -module ../sigio
#

18
Software Installation

The default CPP path and flags. If your system has multiple versions of cpp it may be
necessary to specify the correct one here.
CPP = cpp
CPP_FLAGS = -C -P -D$(BYTE_ORDER) -D_REAL8_ -DWRF -DLINUX
CPP_F90FLAGS = -traditional-cpp -lang-fortran

The MPI compiler definitions:


DM_FC = mpif90
DM_F90 = mpif90 -free
DM_CC = icc

The default C compiler flags:


CFLAGS = -O0 -DLINUX -DUNDERSCORE
CFLAGS2 = -DLINUX -Dfunder -DFortranByte=char -DFortranInt=int -DFortranLlong='long
long'

The default library paths and names


Variable LAPACK_PATH needs to point to the MKL library location
The library names may be different on other systems
MYLIBsys = -L$(LAPACK_PATH) -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lguide

NetCDF path information


Older versions of NetCDF only have the single library lnetcdf. If you are using an
older version you may need to remove the first library name.
NETCDFPATH = /usr/local/netcdf3-ifort
NETCDFLIBS = -lnetcdff -lnetcdf $(NETCDF_PATH)

Do not modify anything below these NetCDF environment variables.

2.7.3 Modification Example

To demonstrate how one would go about modifying the configuration resource file, the
generic Linux/Intel configuration will be ported to build on an SGI Linux cluster called
Zeus. Zeus comes with a vender-supplied version of MPI, which necessitates modification
of the MPI paths.

The first change is that Zeus does not use the traditional MPI wrappers such as mpif90 to
invoke the compiler. Instead the Intel compiler is called directly with use of an additional
lmpi flag. Therefore the DM compiler definitions become:

DM_FC = ifort
DM_F90 = ifort -free
DM_CC = icc

Next, additional link flags for MPI are needed. These are in bold.
LDFLAGS = -Wl,-rpath,/usr/local/netcdf3-ifort/lib -L$MPI_ROOT/lib -lmpi -openmp

Then add the path to the MPI include directory, along with the additional Fortran flag.

19
Software Installation

FFLAGS_DEFAULT = -msse2 -fp-model precise -assume byterecl -fpe0 -ftz -I$MPI_ROOT/include

An equivalent include path for the C flags are also needed.


CFLAGS = -O0 -DLINUX -DUNDERSCORE -I$MPI_ROOT/include

Lastly, a difference with the MKL libraries eliminates the need for the lguide library.
MYLIBsys = -L$(LAPACK_PATH) -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core

These changes should be saved to the users configure.gsi resource file and tested. Once
they are confirmed to work, they may be moved into the configure.defaults file located
in the arch/ directory as a new build target.

To save your new build configuration, open the file configure.defaults, located in the
arch/ directory. You will notice that it contains a collection of platform/compiler specific
entries. The first entry is for the IBM platform, using the xlf compiler with 64-bit word
size. This entry is indicated by a label at the top of the block starting with the tag #ARCH.
For the 64-bit IBM build, the tag is:

#ARCH AIX 64-bit #dmpar

The block for the 64-bit IBM build is immediately followed by the 32-bit IBM build entry,
which is indicated by the tag:

#ARCH AIX 32-bit #dmpar

Each subsequent build specification is delineated by a similar tag.

For our port of the generic Intel build to Zeus, locate the tag for the Linux/Intel build with
64 bit words. Its header looks like this:

#ARCH Linux x86_64, Intel compiler (ifort & icc) # (dmpar,optimize)

Duplicate this entry and give it a unique name by modifying the ARCH entry.

#ARCH Linux x86_64, Intel compiler SGI MPT (ifort & icc) #
(dmpar,optimize)

Then update the variables to match the settings in the configure.gsi resource file tested
previously, and save your changes. Now when you run the ./configure script, there will be
a new build option for an SGI MPT build.

20
Running GSI

Chapter 3: Running GSI

This chapter starts with discussing the input files required to run GSI. It then proceeds with
a detailed explanation of a sample run script and some frequently used options from the
GSI namelist. It concludes with samples of GSI products produced by a successful GSI run.

3.1 Input Files


In most cases, three types of input files (background, observation, and fixed files) must be
available to run GSI. In some special idealized cases, such as a pseudo single observation
test, GSI can be run without any observations.

Background or first guess field


As with other data analysis systems, the background or first guess fields may come
from a model forecast conducted separately or from a previous data assimilation cycle.
The following is a list of the types of background fields supported by this release code:

a) WRF NMM input fields in binary format


b) WRF NMM input fields in NetCDF format
c) WRF ARW input fields in binary format
d) WRF ARW input fields in NetCDF format
e) GFS input fields in binary format
f) NEMS-NMMB input fields
g) RTMA input files (2-dimensional binary format)

Please note that WRF is a community model system, including two dynamical cores:
the Advanced Research WRF (ARW) and the nonhydrostatic Mesoscale Model
(NMM). The GFS (Global Forecast System), NEMS (National Environmental
Modeling System)-NMMB (Nonhydrostatic Mesoscale Model B-Grid), and RTMA
(Real-Time Mesoscale Analysis) are operational systems of NCEP. Currently, the DTC
supports GSI for regional community models, such as WRF NMM and ARW.
Therefore, most of the multiple platform tests were conducted using WRF netcdf
background files (b, d). While the formal regression test suite conducts tests of all 7
background file formats (a-g) on an IBM platform, the NetCDF options are the most
thoroughly tested. For the current release, the following tests have been thoroughly
performed:

1. Background (a)-(g) were tested with regression cases on IBM


2. NMM NetCDF (b) and ARW NetCDF (d) were tested with simple cases on
Linux with ifort and pgi, and Mac with pgi.
3. NMM NetCDF (b) and ARW NetCDF (d) were tested with cycled cases

Observations
GSI can analyze many types of observational data, including conventional data,
satellite radiance observations, GSP, and radar data. A list of the default observation

21
Running GSI

file names used in GSI and the corresponding observations included in the files is
provided in the table below:

GSI Name Content Example file names


prepbufr Conventional observations, including ps, t, q, gdas1.t12z.prepbufr
pw, uv, spd, dw, sst, from observation
platforms such as METAR, sounding, et al.
satwnd satellite winds gdas1.t12z.satwnd.tm00.bufr_d
amsuabufr AMSU-A 1b radiance (brightness gdas1.t12z.1bamua.tm00.bufr_d
temperatures) from satellites NOAA-15, 16,
17,18, 19 and METOP-A/B
amsubbufr AMSU-B 1b radiance (brightness gdas1.t12z.1bamub.tm00.bufr_d
temperatures) from satellites NOAA15, 16,17
radarbufr Radar radial velocity Level 2.5 data ndas.t12z. radwnd. tm12.bufr_d
gpsrobufr GPS radio occultation observation gdas1.t12z.gpsro.tm00.bufr_d
ssmirrbufr Precipitation rate observations fromSSM/I gdas1.t12z.spssmi.tm00.bufr_d
tmirrbufr Precipitation rate observations from TMI gdas1.t12z.sptrmm.tm00.bufr_d
sbuvbufr SBUV/2 ozone observations from satellite gdas1.t12z.osbuv8.tm00.bufr_d
NOAA16, 17, 18, 19
hirs2bufr HIRS2 1b radiance from satellite NOAA14 gdas1.t12z.1bhrs2.tm00.bufr_d
hirs3bufr HIRS3 1b radiance observations from gdas1.t12z.1bhrs3.tm00.bufr_d
satellite NOAA16, 17
hirs4bufr HIRS4 1b radiance observation from satellite gdas1.t12z.1bhrs4.tm00.bufr_d
NOAA 18, 19 and METOP-A/B
msubufr MSU observation from satellite NOAA 14 gdas1.t12z.1bmsu.tm00.bufr_d
airsbufr AMSU-A and AIRS radiances from satellite gdas1.t12z.airsev.tm00.bufr_d
AQUA
mhsbufr Microwave Humidity Sounder observation gdas1.t12z.1bmhs.tm00.bufr_d
from NOAA 18, 19 and METOP-A/B
ssmitbufr SSMI observation from satellite f13, f14, f15 gdas1.t12z.ssmit.tm00.bufr_d
amsrebufr AMSR-E radiance from satellite AQUA gdas1.t12z.amsre.tm00.bufr_d
ssmisbufr SSMIS radiances from satellite f16 gdas1.t12z.ssmis.tm00.bufr_d
gsnd1bufr GOES sounder radiance (sndrd1, sndrd2, gdas1.t12z.goesfv.tm00.bufr_d
sndrd3 sndrd4) from GOES 11, 12, 13.
l2rwbufr NEXRAD Level 2 radial velocity ndas.t12z.nexrad. tm12.bufr_d
gsndrbufr GOES sounder radiance from GOES11, 12 gdas1.t12z.goesnd.tm00.bufr_d
gimgrbufr GOES imager radiance from GOES 11, 12
omibufr Ozone Monitoring Instrument (OMI) gdas1.t12z.omi.tm00.bufr_d
observation NASA Aura
iasibufr Infrared Atmospheric Sounding Interfero-meter gdas1.t12z.mtiasi. tm00.bufr_d
sounder observations from METOP-A/B
gomebufr The Global Ozone Monitoring Experiment gdas1.t12z.gome.tm00.bufr_d
(GOME) ozone observation from METOP-A/B
mlsbufr Aura MLS stratospheric ozone data gdas1.t12z. mlsbufr.tm00.bufr_d
tcvitl Synthetic Tropic Cyclone-MSLP observation gdas1.t12z.syndata.tcvitals.tm00
seviribufr SEVIRI radiance from MET-08,09,10 gdas1.t12z. sevcsr.tm00.bufr_d
atmsbufr ATMS radiance from Suomi NPP gdas1.t12z. atms.tm00.bufr_d
crisbufr CRIS radiance from Suomi NPP gdas1.t12z. cris.tm00.bufr_d
modisbufr MODIS aerosol total column AOD
observations from AQUA and TERRA

22
Running GSI

Remarks:
1. Because the current regional models do not have ozone as a prognostic variable,
ozone data are not assimilated at the regional scale.
2. GSI can analyze all of the data types on the list, but each GSI run (for both
operation or case study) only uses a subset of the data based on their quality,
forecast impacts, and availability. Some data may be outdated and not available,
some are on monitoring mode, and some data may have quality issues during
certain period. Users are encouraged to check the data quality issues prior to
running an analysis. The following NCEP links provide resources that include
data quality history:

http://www.emc.ncep.noaa.gov/mmb/data_processing/Satellite_Historical_Documentation.htm
http://www.emc.ncep.noaa.gov/mmb/data_processing/Non-satellite_Historical_Documentation.htm

3. For use with GSI, the observations are saved in the BUFR/PrepBUFR format
(with NCEP specified features). The details on the BUFR/PreBUFR processing
are introduced in the following website, which is supported and maintained by
the DTC:

http://www.dtcenter.org/com-GSI/BUFR/index.php

GSI can be run without any observations to see how the moisture constraint
modifies the first guess (background) field. GSI can be run in a pseudo single
observation mode, which does not require any BUFR observation files. In this
mode, users should specify observation information in the namelist section
SINGLEOB_TEST (see Section 4.2 for details). As more data files are used, additional
information will be added through the GSI analysis.

Fixed file (statistics and control files)

The GSI system has a directory of static or fixed files (called fix/ ), which includes
many files required in a GSI analysis. The following table lists fixed file names
used in the GSI code, the content of the files, and corresponding example files in
the regional test case:

File name used in GSI Content Example files in fix/


anavinfo Information file to set control and anavinfo_arw_netcdf
analysis variables anavinfo_ndas_netcdf
berror_stats background error covariance nam_nmmstat_na.gcv
nam_glb_berror.f77.gcv
errtable Observation error table nam_errtable.r3dv

23
Running GSI

Observation data control file (more detailed explanation in Section 4.3)


convinfo Conventional observation global_convinfo.txt
information file
satinfo satellite channel information file global_satinfo.txt
pcpinfo precipitation rate observation global_pcpinfo.txt
information file
ozinfo ozone observation information global_ozinfo.txt
file
Bias correction used by radiance analysis
satbias_angle satellite scan angle dependent bias global_satangbias.txt
correction file
satbias_in satellite mass bias correction sample.satbias
coefficient file
Radiance coefficient used by CRTM (not under ./fix directory)
EmisCoeff.bin IR surface emissivity coefficients EmisCoeff.bin
AerosolCoeff.bin Aerosol coefficients AerosolCoeff.bin
CloudCoeff.bin Cloud scattering and emission CloudCoeff.bin
coefficients
${satsen}.SpcCoeff.bin Sensor spectral response ${satsen}.SpcCoeff.bin
characteristics
${satsen}.TauCoeff.bin Transmittance coefficients ${satsen}.TauCoeff.bin

Each operation system, such as GFS, NAM, and RTMA, has their own set of fixed
files. Therefore, for each fixed file used in GSI, there may be several optional files in
the fixed file directory fix/. For example, for the background error covariance file, both
nam_nmmstat_na.gcv (which is from the NAM system) and regional_glb_berror.f77.
gcv (which is from the global forecast system) can be used. We also prepared the same
background error covariance files with different byte order for Linux users such as
nam_nmmstat_na.gcv_Little_Endian and nam_glb_berror.f77.gcv_Little_Endian.

3.2 GSI Run Script

3.2.1 Steps in the GSI run script

The GSI run script creates a run time environment necessary for running the GSI
executable. A typical GSI run script includes the following steps:
1. Request computer resources to run GSI.
2. Set environmental variables for the machine architecture.
3. Set experimental variables (such as experiment name, analysis time, background,
and observation).
4. Check the definitions of required variables.
5. Generate a run directory for GSI (sometime called working or temporary
directory).
6. Copy the GSI executable to the run directory.

24
Running GSI

7. Copy the background file and link ensemble members if running the hybrid to the
run directory.
8. Link observations to the run directory.
9. Link fixed files (statistic, control, and coefficient files) to the run directory.
10. Generate namelist for GSI.
11. Run the GSI executable.
12. Post-process: save analysis results, generate diagnostic files, clean run directory.

Typically, users only need to modify specific parts of the run script (steps 1, 2, and 3) to fit
their specific computer environment and point to the correct input/output files and
directories. Section 3.2.2 covers each of these modifications for steps 1 to 3. Section 3.2.3
dissects a sample run script, designed to run on a Linux cluster, an IBM supercomputer,
and a Linux workstation. Users should start with the provided run script and modify it for
their own run environment and case configuration.

3.2.2 Customization of the GSI run script

3.2.2.1 Setting up the machine environment

This section focuses on modifying the machine specific entries in the run script. This
corresponds to step 1 of the run script. Specifically, this consists of setting Unix/Linux
environment variables and selecting the correct parallel run time environment (batch
system with options).

Six specific parallel environments will be considered:

IBM supercomputer using LSF (Load Sharing Facility)


IBM supercomputer using LoadLevel
Linux clusters using PBS (Portable Batch System)
Linux clusters using LSF
Linux workstation (with no batch system)
Intel Mac Darwin workstation with PGI complier (with no batch system)

Let us start with the batch queuing system. On large computing platforms, machine
resources are allocated through a queuing system. A job (along with queuing options) is
submitted to a queue, where the job waits until sufficient machine resources become
available to execute the job. Once sufficient resources become available, the code executes
until it completes or the machine resources have been exhausted. Two queuing systems will
be listed below as examples:

1) IBM with LSF


#BSUB -P ???????? # account number
#BSUB -a poe #
#BSUB -x # exlusive use of node (not_shared)
#BSUB -n 12 # number of total tasks
#BSUB -R "span[ptile=2]" # how many tasks per node (up to 8)

25
Running GSI

#BSUB -J gsi # job name


#BSUB -o gsi.out # output filename (%J to add job id)
#BSUB -e gsi.err # error filename
#BSUB -W 00:02 # wall time
#BSUB -q regular # queue

set -x

# Set environment variables for IBM


export MP_SHARED_MEMORY=yes
export MEMORY_AFFINITY=MCM
export BIND_TASKS=yes

# Set environment variables for threads


export SPINLOOPTIME=10000
export YIELDLOOPTIME=40000
export AIXTHREAD_SCOPE=S
export MALLOCMULTIHEAP=true
export XLSMPOPTS="parthds=1:spins=0:yields=0:stack=128000000"

# Set environment variables for user preferences


export XLFRTEOPTS="nlwidth=80"
export MP_LABELIO=yes

2) Linux Cluster with PBS


#$ -S /bin/ksh
#$ -N GSI_test
#$ -cwd
#$ -r y
#$ -pe comp 64
#$ -l h_rt=0:20:00
#$ -A ??????

The Linux workstation and Mac workstation generally do not run a batch system. In these
cases, step 1 can be skipped.

In both of the examples above, environment variables are set specifying system resource
management, such as the number of processors, the name/type of queue, maximum wall
clock time allocated for the job, options for standard out and standard error, etc. Some
platforms need additional definitions to specify Unix environmental variables that further
define the run environment.

These variable settings can significantly impact the GSI run efficiency and accuracy of the
GSI results. Please check with your system administrator for the optimal settings for your
computer system. Note that while the GSI can be run with any number of processors, using
more than (number_of_levels) times (number_of_variables) of processors, will not scale
well.

26
Running GSI

3.2.2.2 Setting up the running environment

There are only two options to define in this block. The option ARCH selects the machine
architecture. It is a function of platform type, compiler, and batch queuing system. The
option GSIPROC sets the processor count used in the run. This option also decides if the job
is run as a multiple processor job or as a single processor run. We listed several choices of
the option ARCH in the sample run script. One of these choices should be applicable to a
users system. Please check with your system administrator about running parallel MPI
jobs on your system.

Option ARCH Platform Compiler batch queuing system


IBM_LSF IBM AIX xlf, xlc, LSF
IBM_LoadLevel IBM AIX xlf, xlc, LoadLevel
LINUX Linux workstation Intel/PGI mpirun if GSIPROC > 1
LINUX_LSF Linux cluster Intel/PGI LSF
LINUX_PBS Linux cluster Intel/PGI PBS
DARWIN_PGI MAC DARWIN PGI mpirun if GSIPROC > 1

# GSIPROC = processor number used for GSI analysis


#------------------------------------------------
GSIPROC=1
ARCH='LINUX'
# Supported configurations:
# IBM_LSF,IBM_LoadLevel
# LINUX, LINUX_LSF, LINUX_PBS,
# DARWIN_PGI

3.2.2.3 Setting up an analysis case

This section discusses setting up variables specific to the analysis case, such as analysis
time, working directory, background and observation files, location of fixed files and
CRTM coefficients, and the GSI executable file. A few cautions to be aware of are:
The time in the background file must be consistent with the time in the observation file
used for the GSI run (there is a namelist option to turn this check off).
Even if their contents are identical, PrepBUFR/BUFR files will differ if they were
created on platforms with different endian byte order specification (Linux intel vs.
IBM). If users obtain PrepBUFR/BUFR files from an IBM system, these files must be
converted before they can be used on a Linux system. Appendix A.1 discusses the
conversion tool ssrc to byte-swap observation files.

#
#####################################################
# case set up (users should change this part)
#####################################################
#
# ANAL_TIME= analysis time (YYYYMMDDHH)
# WORK_ROOT= working directory, where GSI runs

27
Running GSI

# PREPBURF = path of PreBUFR conventional obs


# BK_FILE = path and name of background file
# OBS_ROOT = path of observations files
# CRTM_ROOT= path of crtm coefficients files
# FIX_ROOT = path of fix files
# GSI_EXE = path and name of the gsi executable
#------------------------------------------------
ANAL_TIME=2008051112
WORK_ROOT=./run_${ANAL_TIME}_arw_single
BK_FILE=./bkARW/wrfout_d01_2008-05-11_12:00:00
OBS_ROOT=./GSI_DATA/obs
PREPBUFR=${OBS_ROOT}/newgblav.gdas1.t12z.prepbufr.nr
CRTM_ROOT=./comGSI/crtm/CRTM_Coefficients
FIX_ROOT=./comGSI_v3.1/fix
GSI_EXE=./comGSI_v3.1/run/gsi.exe

The next part of this block focuses on additional options that specify important aspects of
the GSI configuration. Option bk_core indicates the specific WRF core used to create the
background files and is used to specify the WRF core when generating the namelist. Option
bkcv_option specifies the background error covariance to be used in the script. Two
background error covariance matrices are provided with the release, one from NCEP global
data assimilation (GDAS), and one form NAM data assimilation system (NDAS). Please
check Section 7.2 for more details about GSI background error covariance. Option
if_clean is to set if the run script needs to delete temporal intermediate files in the
working directory after a GSI run is completed.

#------------------------------------------------
# bk_core= which WRF core is used as background (NMM or ARW)
# bkcv_option= which background error covariance and parameter will
# be used (GLOBAL or NAM)
# if_clean = clean :delete temporal files in working directory (default)
# no : leave running directory as is (this is for debug only)
bk_core=ARW
bkcv_option=NAM
if_clean=clean

If using hybrid function, the follow block of parameters related to ensemble forecasts also
need to be set:

#------------------------------------------------
# GSI hybrid options
# if_hybrid = .true. then turn on hybrid data analysis
# ntotmem = number of ensemble members used for hybrid
# mempath = path of ensemble members
#
# please note that we assume the ensemble member are located
# under ${mempath} with name wrfout_d01_${iiimem}.
# If this is not the case, please change the following lines:
#
# if [ -r ${mempath}/wrfout_d01_${iiimem} ]; then
# ln -sf ${mempath}/wrfout_d01_${iiimem} ./wrf_en${iiimem}
# else
# echo "member ${mempath}/wrfout_d01_${iiimem} is not exit"
# fi

28
Running GSI

#
#
if_hybrid=.false.
ntotmem=40
mempath=/ensemble_forecast/2011060212/wrfprd

3.2.3 Description of the sample script to run GSI

Listed below is an annotated run script (Courier New) with explanations on each function
block.

For further details on the first 3 blocks of the script that users need to change, check section
3.2.2.1, 3.2.2.2, and 3.2.2.3:

#!/bin/ksh
#####################################################
# machine set up (users should change this part)
#####################################################
#
#
GSIPROC=1
ARCH='LINUX'
# Supported configurations:
# IBM_LSF,IBM_LoadLevel
# LINUX, LINUX_LSF, LINUX_PBS,
# DARWIN_PGI
#
#####################################################
# case set up (users should change this part)
#####################################################
#
# ANAL_TIME= analysis time (YYYYMMDDHH)
# WORK_ROOT= working directory, where GSI runs
# PREPBURF = path of PreBUFR conventional obs
# BK_FILE = path and name of background file
# OBS_ROOT = path of observations files
# CRTM_ROOT= path of crtm coefficients files
# FIX_ROOT = path of fix files
# GSI_EXE = path and name of the gsi executable
ANAL_TIME=2008051112
WORK_ROOT=./run_${ANAL_TIME}_arw_single
BK_FILE=./bkARW/wrfout_d01_2008-05-11_12:00:00
OBS_ROOT=./GSI_DATA/obs
PREPBUFR=${OBS_ROOT}/newgblav.gdas1.t12z.prepbufr.nr
CRTM_ROOT=./comGSI/crtm/CRTM_Coefficients
FIX_ROOT=./comGSI_v3.1/fix
GSI_EXE=./comGSI_v3.1/run/gsi.exe

#------------------------------------------------
# bk_core= which WRF core is used as background (NMM or ARW)
# bkcv_option= which background error covariance and parameter will
# be used (GLOBAL or NAM)
# if_clean = clean:delete temporal files in working directory (default)

29
Running GSI

# no : leave running directory as is (this is for debug only)


bk_core=ARW
bkcv_option=NAM
if_clean=clean
#
#------------------------------------------------
# GSI hybrid options
# if_hybrid = .true. then turn on hybrid data analysis
# ntotmem = number of ensemble members used for hybrid
# mempath = path of ensemble members
#
# please note that we assume the ensemble member are located
# under ${mempath} with name wrfout_d01_${iiimem}.
# If this is not the case, please change the following lines:
#
# if [ -r ${mempath}/wrfout_d01_${iiimem} ]; then
# ln -sf ${mempath}/wrfout_d01_${iiimem} ./wrf_en${iiimem}
# else
# echo "member ${mempath}/wrfout_d01_${iiimem} is not exit"
# fi
#
#
if_hybrid=.false.
ntotmem=40
mempath=/ensemble_forecast/2011060212/wrfprd

At this point, users should be able to run the GSI for simple cases without changing the
scripts. However, some advanced users may need to change some of the following blocks
for special applications, such as use of radiance data, cycled runs, or running GSI on a
platform not tested by the DTC.
#####################################################
# Users should NOT change script after this point
#####################################################
#
The next block sets byte-order and run command to run GSI on multiple platforms. The
ARCH is set in the beginning of the script. Option BYTE_ORDER tells the script the byte order
of the machine, which will set up the right byte-order files for the CRTM coefficients and
background error covariance. The choices depend on a combination of the platform (e.g.,
IBM vs. Linux).
case $ARCH in
'IBM_LSF')
###### IBM LSF (Load Sharing Facility)
BYTE_ORDER=Big_Endian
RUN_COMMAND="mpirun.lsf " ;;

'IBM_LoadLevel')
###### IBM LoadLeve
BYTE_ORDER=Big_Endian
RUN_COMMAND="poe " ;;

'LINUX')
BYTE_ORDER=Little_Endian

30
Running GSI

if [ $GSIPROC = 1 ]; then
#### Linux workstation - single processor
RUN_COMMAND=""
else
###### Linux workstation - mpi run
RUN_COMMAND="mpirun -np ${GSIPROC} -machinefile ~/mach "
fi ;;

'LINUX_LSF')
###### LINUX LSF (Load Sharing Facility)
BYTE_ORDER=Little_Endian
RUN_COMMAND="mpirun.lsf " ;;

'LINUX_PBS')
BYTE_ORDER=Little_Endian
#### Linux cluster PBS (Portable Batch System)
RUN_COMMAND="mpirun -np ${GSIPROC} " ;;

'DARWIN_PGI')
### Mac - mpi run
BYTE_ORDER=Little_Endian
if [ $GSIPROC = 1 ]; then
#### Mac workstation - single processor
RUN_COMMAND=""
else
###### Mac workstation - mpi run
RUN_COMMAND="mpirun -np ${GSIPROC} -machinefile ~/mach "
fi ;;

* )
print "error: $ARCH is not a supported platform configuration."
exit 1 ;;
esac
#

The next block checks if all the variables needed for a GSI run are properly defined. These
variables should have been defined in the first 3 parts of this script.
#
################################################################
# Check GSI needed environment variables are defined and exist
#

# Make sure ANAL_TIME is defined and in the correct format


if [ ! "${ANAL_TIME}" ]; then
echo "ERROR: \$ANAL_TIME is not defined!"
exit 1
fi

# Make sure WORK_ROOT is defined and exists


if [ ! "${WORK_ROOT}" ]; then
echo "ERROR: \$WORK_ROOT is not defined!"
exit 1
fi

# Make sure the background file exists

31
Running GSI

if [ ! -r "${BK_FILE}" ]; then
echo "ERROR: ${BK_FILE} does not exist!"
exit 1
fi

# Make sure OBS_ROOT is defined and exists


if [ ! "${OBS_ROOT}" ]; then
echo "ERROR: \$OBS_ROOT is not defined!"
exit 1
fi
if [ ! -d "${OBS_ROOT}" ]; then
echo "ERROR: OBS_ROOT directory '${OBS_ROOT}' does not exist!"
exit 1
fi

# Set the path to the GSI static files


if [ ! "${FIX_ROOT}" ]; then
echo "ERROR: \$FIX_ROOT is not defined!"
exit 1
fi
if [ ! -d "${FIX_ROOT}" ]; then
echo "ERROR: fix directory '${FIX_ROOT}' does not exist!"
exit 1
fi

# Set the path to the CRTM coefficients


if [ ! "${CRTM_ROOT}" ]; then
echo "ERROR: \$CRTM_ROOT is not defined!"
exit 1
fi
if [ ! -d "${CRTM_ROOT}" ]; then
echo "ERROR: fix directory '${CRTM_ROOT}' does not exist!"
exit 1
fi

# Make sure the GSI executable exists


if [ ! -x "${GSI_EXE}" ]; then
echo "ERROR: ${GSI_EXE} does not exist!"
exit 1
fi

# Check to make sure the number of processors for


# running GSI was specified
if [ -z "${GSIPROC}" ]; then
echo "ERROR: The variable $GSIPROC must be set to contain the
number of processors to run GSI"
exit 1
fi

The next block creates a working directory (workdir) in which GSI will run. The directory
should have enough disk space to hold all the files needed for this run. This directory is
cleaned before each run, therefore, save all the files needed from the previous run before
rerunning GSI.

32
Running GSI

#
###################################################################
# Create the ram work directory and cd into it

workdir=${WORK_ROOT}
echo " Create working directory:" ${workdir}

echo " Create working directory:" ${workdir}

if [ -d "${workdir}" ]; then
rm -rf ${workdir}
fi
mkdir -p ${workdir}
cd ${workdir}

After creating a working directory, copy the GSI executable, background, observation, and
fixed files into the working directory. If hybrid is chosen, link ensemble members to the
working directory

#
###################################################################

echo " Copy GSI executable, background file, and link observation
bufr to working directory"

# Save a copy of the GSI executable in the workdir


cp ${GSI_EXE} gsi.exe

Note: Copy the background file to the working directory as wrf_inout. The file
wrf_inout will be overwritten by GSI to save analysis result.

# Bring over background field


#(it's modified by GSI so we can't link to it)
cp ${BK_FILE} ./wrf_inout

Note: link ensemble members and rename them with the GSI recognized name:
wrf_en${iiimem}.
# Link ensember members if use hybrid
if [ ${if_hybrid} = .true. ] ; then
echo " link ensemble members to working directory"

let imem=1

while (( $imem <= $ntotmem )); do

if [[ $imem -lt 10 ]]; then


export iiimem=00${imem}
export iimem=0${imem}
else
export iiimem=0${imem}
export iimem=${imem}
fi

let imem="imem + 1"

33
Running GSI

if [ -r ${mempath}/wrfout_d01_${iiimem} ]; then
ln -sf ${mempath}/wrfout_d01_${iiimem} ./wrf_en${iiimem}
else
echo "member ${mempath}/wrfout_d01_${iiimem} is not exit"
exit 1
fi

done
fi

Note: You can link observation files to the working directory because GSI will not
overwrite these files. The possible observations to be analyzed in GSI can be found
in dfile of the GSI namelist section OBS_INPUT in this example script. Most of the
conventional observations are in one single file named prepbufr, while radiance
data are in separate files based on satellite instruments, such as AMSU-A or HIRS.
All these observation files must be linked to GSI recognized file names in dfile.
Please check the table in Section 3.1 under the observation section for a detailed
explanation of links and the meanings of each file name listed below.
# Link to the prepbufr data
ln -s ${PREPBUFR} ./prepbufr

# Link to the radiance data


# ln -s ${OBS_ROOT}/le_gdas1.t12z.1bamua.tm00.bufr_d amsuabufr
# ln -s ${OBS_ROOT}/le_gdas1.t12z.1bamub.tm00.bufr_d amsubbufr
# ln -s ${OBS_ROOT}/le_gdas1.t12z.1bhrs3.tm00.bufr_d hirs3bufr
# ln -s ${OBS_ROOT}/le_gdas1.t12z.1bhrs4.tm00.bufr_d hirs4bufr
# ln -s ${OBS_ROOT}/le_gdas1.t12z.1bmhs.tm00.bufr_d mhsbufr
# ln -s ${OBS_ROOT}/le_gdas1.t12z.gpsro.tm00.bufr_d gpsrobufr

The following block copies constant fixed files from the fix/ directory and links CRTM
coefficients. Please check Section 3.1 for the meanings of each fixed file. For background
error covariances, observation errors, and data information files, we provide two sets of
fixed files here, one set is based on GFS statistics and another is based on NAM statistics.
#
###################################################################

echo " Copy fixed files and link CRTM coefficient files to working
directory"

# Set fixed files


# berror = forecast model background error statistics
# specoef = CRTM spectral coefficients
# trncoef = CRTM transmittance coefficients
# emiscoef = CRTM coefficients for IR sea surface emissivity model
# aerocoef = CRTM coefficients for aerosol effects
# cldcoef = CRTM coefficients for cloud effects
# satinfo = text file with information about assimilation of
# brightness temperatures
# satangl = angle dependent bias correction file (fixed in time)
# pcpinfo = text file with information about assimilation of
# prepcipitation rates

34
Running GSI

# ozinfo = text file with information about assimilation of ozone data


# errtable = text file with obs error for conventional data (regional only)
# convinfo = text file with information about assimilation of conventional data
# bufrtable= text file ONLY needed for single obs test (oneobstest=.true.)
# bftab_sst= bufr table for sst ONLY needed for sst retrieval
(retrieval=.true.)

if [ ${bkcv_option} = GLOBAL ] ; then


echo ' Use global background error covariance'
if [ ${BYTE_ORDER} = Little_Endian ] ; then
BERROR=${FIX_ROOT}/nam_glb_berror.f77.gcv_Little_Endian
else
BERROR=${FIX_ROOT}/nam_glb_berror.f77.gcv
fi
OBERROR=${FIX_ROOT}/prepobs_errtable.global
else
echo ' Use NAM background error covariance'
if [ ${BYTE_ORDER} = Little_Endian ] ; then
BERROR=${FIX_ROOT}/nam_nmmstat_na.gcv_Little_Endian
else
BERROR=${FIX_ROOT}/nam_nmmstat_na.gcv
fi
OBERROR=${FIX_ROOT}/nam_errtable.r3dv
fi

if [ ${bk_core} = NMM ] ; then


ANAVINFO=${FIX_ROOT}/anavinfo_ndas_netcdf
else
ANAVINFO=${FIX_ROOT}/anavinfo_arw_netcdf
fi

SATANGL=${FIX_ROOT}/global_satangbias.txt
SATINFO=${FIX_ROOT}/global_satinfo.txt
CONVINFO=${FIX_ROOT}/global_convinfo.txt
OZINFO=${FIX_ROOT}/global_ozinfo.txt
PCPINFO=${FIX_ROOT}/global_pcpinfo.txt

RTMFIX=${CRTM_ROOT}
RTMEMIS=${RTMFIX}/EmisCoeff/${BYTE_ORDER}/EmisCoeff.bin
RTMAERO=${RTMFIX}/AerosolCoeff/${BYTE_ORDER}/AerosolCoeff.bin
RTMCLDS=${RTMFIX}/CloudCoeff/${BYTE_ORDER}/CloudCoeff.bin

# copy Fixed fields to working directory


cp $ANAVINFO anavinfo
cp $BERROR berror_stats
cp $SATANGL satbias_angle
cp $SATINFO satinfo
cp $CONVINFO convinfo
cp $OZINFO ozinfo
cp $PCPINFO pcpinfo
cp $OBERROR errtable
#
#
## CRTM Spectral and Transmittance coefficients
ln -s $RTMEMIS EmisCoeff.bin
ln -s $RTMAERO AerosolCoeff.bin
ln -s $RTMCLDS CloudCoeff.bin

35
Running GSI

nsatsen=`cat satinfo | wc -l`


isatsen=1
while [[ $isatsen -le $nsatsen ]]; do
flag=`head -n $isatsen satinfo | tail -1 | cut -c1-1`
if [[ "$flag" != "!" ]]; then
satsen=`head -n $isatsen satinfo | tail -1 | cut -f 2 -d" "`
spccoeff=${satsen}.SpcCoeff.bin
if [[ ! -s $spccoeff ]]; then
ln -s $RTMFIX/SpcCoeff/${BYTE_ORDER}/$spccoeff $spccoeff
ln -s $RTMFIX/TauCoeff/${BYTE_ORDER}/${satsen}.TauCoeff.bin
${satsen}.TauCoeff.bin
fi
fi
isatsen=` expr $isatsen + 1 `
done

# Only need this file for single obs test


bufrtable=${FIX_ROOT}/prepobs_prep.bufrtable
cp $bufrtable ./prepobs_prep.bufrtable

# for satellite bias correction


cp ${FIX_ROOT}/sample.satbias ./satbias_in

Set up some constants used in the GSI namelist. Please note that bkcv_option is set for
background error tuning. They should be set based on specific applications. Here we
provide two sample sets of the constants for different background error covariance options,
one set is used in the GFS operations and one for the NAM operations.
#
###################################################################
# Set some parameters for use by the GSI executable and to build
the namelist
echo " Build the namelist "

export JCAP=62
export LEVS=60
export JCAP_B=62
export DELTIM=${DELTIM:-$((3600/($JCAP/20)))}

if [ ${bkcv_option} = GLOBAL ] ; then


vs_op='0.7,'
hzscl_op='1.7,0.8,0.5,'
else
vs_op='1.0,'
hzscl_op='0.373,0.746,1.50,'
fi

if [ ${bk_core} = NMM ] ; then


bk_core_arw='.false.'
bk_core_nmm='.true.'
ensbk_option=2
else
bk_core_arw='.true.'
bk_core_nmm='.false.'
ensbk_option=3
fi

36
Running GSI

The following large chunk of the script is used to generate the GSI namelist called
gsiparm.anl created in the working directory. A detailed explanation of each variable can
be found in Section 3.3 and Appendix B.
&SETUP
miter=2,niter(1)=10,niter(2)=10,
write_diag(1)=.true.,write_diag(2)=.false.,write_diag(3)=.true.,
gencode=78,qoption=2,
factqmin=0.0,factqmax=0.0,deltim=$DELTIM,
ndat=77,iguess=-1,
oneobtest=.false.,retrieval=.false.,
nhr_assimilation=3,l_foto=.false.,
use_pbl=.false.,
/
&GRIDOPTS
JCAP=$JCAP,JCAP_B=$JCAP_B,NLAT=$NLAT,NLON=$LONA,nsig=$LEVS,hybrid=.true.,
wrf_nmm_regional=${bk_core_nmm},wrf_mass_regional=${bk_core_arw},
diagnostic_reg=.false.,
filled_grid=.false.,half_grid=.true.,netcdf=.true.,
/
&BKGERR
vs=${vs_op}
hzscl=${hzscl_op}
bw=0.,fstat=.true.,
/
&ANBKGERR
anisotropic=.false.,an_vs=1.0,ngauss=1,
an_flen_u=-5.,an_flen_t=3.,an_flen_z=-200.,
ifilt_ord=2,npass=3,normal=-200,grid_ratio=4.,nord_f2a=4,
/
&JCOPTS
/
&STRONGOPTS
jcstrong=.false.,jcstrong_option=3,nstrong=0,nvmodes_keep=20,period_max=3.,
baldiag_full=.true.,baldiag_inc=.true.,
/
&OBSQC
dfact=0.75,dfact1=3.0,noiqc=.false.,c_varqc=0.02,vadfile='prepbufr',
/
&OBS_INPUT
dmesh(1)=120.0,dmesh(2)=60.0,dmesh(3)=60.0,dmesh(4)=60.0,dmesh(5)=120,time_window_max=1.5,
dfile(01)='prepbufr', dtype(01)='ps', dplat(01)=' ', dsis(01)='ps', dval(01)=1.0, dthin(01)=0,
dfile(02)='prepbufr' dtype(02)='t', dplat(02)=' ', dsis(02)='t', dval(02)=1.0, dthin(02)=0,
dfile(03)='prepbufr', dtype(03)='q', dplat(03)=' ', dsis(03)='q', dval(03)=1.0, dthin(03)=0,
dfile(04)='prepbufr', dtype(04)='pw', dplat(04)=' ', dsis(04)='pw', dval(04)=1.0, dthin(04)=0,
dfile(05)='satwnd', dtype(05)='uv', dplat(05)=' ', dsis(05)='uv', dval(05)=1.0, dthin(05)=0,
dfile(06)='prepbufr', dtype(06)='uv', dplat(06)=' ', dsis(06)='uv', dval(06)=1.0, dthin(06)=0,
dfile(07)='prepbufr', dtype(07)='spd', dplat(07)=' ', dsis(07)='spd', dval(07)=1.0, dthin(07)=0,
dfile(08)='prepbufr', dtype(08)='dw', dplat(08)=' ', dsis(08)='dw', dval(08)=1.0, dthin(08)=0,
dfile(09)='radarbufr', dtype(09)='rw', dplat(09)=' ', dsis(09)='rw', dval(09)=1.0, dthin(09)=0,
dfile(10)='prepbufr', dtype(10)='sst', dplat(10)=' ', dsis(10)='sst', dval(10)=1.0, dthin(10)=0,
dfile(11)='gpsrobufr', dtype(11)='gps_ref', dplat(11)=' ', dsis(11)='gps', dval(11)=1.0, dthin(11)=0,
dfile(12)='ssmirrbufr',dtype(12)='pcp_ssmi', dplat(12)='dmsp', dsis(12)='pcp_ssmi', dval(12)=1.0, dthin(12)=-1,
dfile(13)='tmirrbufr', dtype(13)='pcp_tmi', dplat(13)='trmm', dsis(13)='pcp_tmi', dval(13)=1.0, dthin(13)=-1,
dfile(14)='sbuvbufr', dtype(14)='sbuv2', dplat(14)='n16', dsis(14)='sbuv8_n16', dval(14)=1.0, dthin(14)=0,
dfile(15)='sbuvbufr', dtype(15)='sbuv2', dplat(15)='n17', dsis(15)='sbuv8_n17', dval(15)=1.0, dthin(15)=0,
dfile(16)='sbuvbufr', dtype(16)='sbuv2', dplat(16)='n18', dsis(16)='sbuv8_n18', dval(16)=1.0, dthin(16)=0,
dfile(17)='omibufr', dtype(17)='omi', dplat(17)='aura', dsis(17)='omi_aura', dval(17)=1.0, dthin(17)=6,
dfile(18)='hirs3bufr', dtype(18)='hirs3', dplat(18)='n16', dsis(18)='hirs3_n16', dval(18)=0.0, dthin(18)=1,
dfile(19)='hirs3bufr', dtype(19)='hirs3', dplat(19)='n17', dsis(19)='hirs3_n17', dval(19)=6.0, dthin(19)=1,
dfile(20)='hirs4bufr', dtype(20)='hirs4', dplat(20)='n18', dsis(20)='hirs4_n18', dval(20)=0.0, dthin(20)=1,
dfile(21)='hirs4bufr', dtype(21)='hirs4', dplat(21)='metop-a', dsis(21)='hirs4_metop-a', dval(21)=6.0, dthin(21)=1,
dfile(22)='gsndrbufr', dtype(22)='sndr', dplat(22)='g11', dsis(22)='sndr_g11', dval(22)=0.0, dthin(22)=1,
dfile(23)='gsndrbufr', dtype(23)='sndr', dplat(23)='g12', dsis(23)='sndr_g12', dval(23)=0.0, dthin(23)=1,

37
Running GSI

dfile(24)='gimgrbufr', dtype(24)='goes_img', dplat(24)='g11', dsis(24)='imgr_g11', dval(24)=0.0, dthin(24)=1,


dfile(25)='gimgrbufr', dtype(25)='goes_img', dplat(25)='g12', dsis(25)='imgr_g12', dval(25)=0.0, dthin(25)=1,
dfile(26)='airsbufr', dtype(26)='airs', dplat(26)='aqua', dsis(26)='airs281SUBSET_aqua',dval(26)=20.0, dthin(26)=1,
dfile(27)='msubufr', dtype(27)='msu', dplat(27)='n14', dsis(27)='msu_n14', dval(27)=2.0, dthin(27)=2,
dfile(28)='amsuabufr', dtype(28)='amsua', dplat(28)='n15', dsis(28)='amsua_n15', dval(28)=10.0, dthin(28)=2,
dfile(29)='amsuabufr', dtype(29)='amsua', dplat(29)='n16', dsis(29)='amsua_n16', dval(29)=0.0, dthin(29)=2,
dfile(30)='amsuabufr', dtype(30)='amsua', dplat(30)='n17', dsis(30)='amsua_n17', dval(30)=0.0, dthin(30)=2,
dfile(31)='amsuabufr', dtype(31)='amsua', dplat(31)='n18', dsis(31)='amsua_n18', dval(31)=10.0, dthin(31)=2,
dfile(32)='amsuabufr', dtype(32)='amsua', dplat(32)='metop-a', dsis(32)='amsua_metop-a', dval(32)=10.0, dthin(32)=2,
dfile(33)='airsbufr', dtype(33)='amsua', dplat(33)='aqua', dsis(33)='amsua_aqua', dval(33)=5.0, dthin(33)=2,
dfile(34)='amsubbufr', dtype(34)='amsub', dplat(34)='n15', dsis(34)='amsub_n15', dval(34)=3.0, dthin(34)=3,
dfile(35)='amsubbufr', dtype(35)='amsub', dplat(35)='n16', dsis(35)='amsub_n16', dval(35)=3.0, dthin(35)=3,
dfile(36)='amsubbufr', dtype(36)='amsub', dplat(36)='n17', dsis(36)='amsub_n17', dval(36)=3.0, dthin(36)=3,
dfile(37)='mhsbufr', dtype(37)='mhs', dplat(37)='n18', dsis(37)='mhs_n18', dval(37)=3.0, dthin(37)=3,
dfile(38)='mhsbufr', dtype(38)='mhs', dplat(38)='metop-a', dsis(38)='mhs_metop-a', dval(38)=3.0, dthin(38)=3,
dfile(39)='ssmitbufr', dtype(39)='ssmi', dplat(39)='f13', dsis(39)='ssmi_f13', dval(39)=0.0, dthin(39)=4,
dfile(40)='ssmitbufr', dtype(40)='ssmi', dplat(40)='f14', dsis(40)='ssmi_f14', dval(40)=0.0, dthin(40)=4,
dfile(41)='ssmitbufr', dtype(41)='ssmi', dplat(41)='f15', dsis(41)='ssmi_f15', dval(41)=0.0, dthin(41)=4,
dfile(42)='amsrebufr', dtype(42)='amsre_low', dplat(42)='aqua', dsis(42)='amsre_aqua', dval(42)=0.0, dthin(42)=4,
dfile(43)='amsrebufr', dtype(43)='amsre_mid', dplat(43)='aqua', dsis(43)='amsre_aqua', dval(43)=0.0, dthin(43)=4,
dfile(44)='amsrebufr', dtype(44)='amsre_hig', dplat(44)='aqua', dsis(44)='amsre_aqua' dval(44)=0.0, dthin(44)=4,
dfile(45)='ssmisbufr', dtype(45)='ssmis', dplat(45)='f16', dsis(45)='ssmis_f16', dval(45)=0.0, dthin(45)=4,
dfile(46)='gsnd1bufr', dtype(46)='sndrd1', dplat(46)='g12', dsis(46)='sndrD1_g12', dval(46)=1.5, dthin(46)=5,
dfile(47)='gsnd1bufr', dtype(47)='sndrd2', dplat(47)='g12', dsis(47)='sndrD2_g12', dval(47)=1.5, dthin(47)=5,
dfile(48)='gsnd1bufr', dtype(48)='sndrd3', dplat(48)='g12', dsis(48)='sndrD3_g12', dval(48)=1.5, dthin(48)=5,
dfile(49)='gsnd1bufr', dtype(49)='sndrd4', dplat(49)='g12', dsis(49)='sndrD4_g12', dval(49)=1.5, dthin(49)=5,
dfile(50)='gsnd1bufr', dtype(50)='sndrd1', dplat(50)='g11', dsis(50)='sndrD1_g11', dval(50)=1.5, dthin(50)=5,
dfile(51)='gsnd1bufr', dtype(51)='sndrd2', dplat(51)='g11', dsis(51)='sndrD2_g11', dval(51)=1.5, dthin(51)=5,
dfile(52)='gsnd1bufr', dtype(52)='sndrd3', dplat(52)='g11', dsis(52)='sndrD3_g11', dval(52)=1.5, dthin(52)=5,
dfile(53)='gsnd1bufr', dtype(53)='sndrd4', dplat(53)='g11', dsis(53)='sndrD4_g11', dval(53)=1.5, dthin(53)=5,
dfile(54)='gsnd1bufr', dtype(54)='sndrd1', dplat(54)='g13', dsis(54)='sndrD1_g13', dval(54)=1.5, dthin(54)=5,
dfile(55)='gsnd1bufr', dtype(55)='sndrd2', dplat(55)='g13', dsis(55)='sndrD2_g13', dval(55)=1.5, dthin(55)=5,
dfile(56)='gsnd1bufr', dtype(56)='sndrd3', dplat(56)='g13', dsis(56)='sndrD3_g13', dval(56)=1.5, dthin(56)=5,
dfile(57)='gsnd1bufr', dtype(57)='sndrd4', dplat(57)='g13', dsis(57)='sndrD4_g13', dval(57)=1.5, dthin(57)=5,
dfile(58)='iasibufr', dtype(58)='iasi', dplat(58)='metop-a', dsis(58)='iasi586_metop-a', dval(58)=20.0, dthin(58)=1,
dfile(59)='gomebufr', dtype(59)='gome', dplat(59)='metop-a', dsis(59)='gome_metop-a', dval(59)=1.0, dthin(59)=6,
dfile(60)='sbuvbufr', dtype(60)='sbuv2', dplat(60)='n19', dsis(60)='sbuv8_n19', dval(60)=1.0, dthin(60)=0,
dfile(61)='hirs4bufr', dtype(61)='hirs4', dplat(61)='n19', dsis(61)='hirs4_n19', dval(61)=6.0, dthin(61)=1,
dfile(62)='amsuabufr', dtype(62)='amsua', dplat(62)='n19', dsis(62)='amsua_n19', dval(62)=10.0, dthin(62)=2,
dfile(63)='mhsbufr', dtype(63)='mhs', dplat(63)='n19', dsis(63)='mhs_n19', dval(63)=3.0, dthin(63)=3,
dfile(64)='tcvitl' dtype(64)='tcp', dplat(64)=' ', dsis(64)='tcp', dval(64)=1.0, dthin(64)=0,
dfile(65)='mlsbufr', dtype(65)='mls', dplat(65)='aura', dsis(65)='mls_aura', dval(65)=1.0, dthin(65)=0,
dfile(66)='seviribufr',dtype(66)='seviri', dplat(66)='m08', dsis(66)='seviri_m08' dval(66)=0.0, dthin(66)=1,
dfile(67)='seviribufr',dtype(67)='seviri', dplat(67)='m09', dsis(67)='seviri_m09', dval(67)=0.0, dthin(67)=1,
dfile(68)='seviribufr',dtype(68)='seviri', dplat(68)='m10' dsis(68)='seviri_m10', dval(68)=0.0, dthin(68)=1,
dfile(69)='hirs4bufr', dtype(69)='hirs4', dplat(69)='metop-b', dsis(69)='hirs4_metop-b', dval(69)=0.0, dthin(69)=1,
dfile(70)='amsuabufr', dtype(70)='amsua', dplat(70)='metop-b', dsis(70)='amsua_metop-b', dval(70)=0.0, dthin(70)=1,
dfile(71)='mhsbufr', dtype(71)='mhs', dplat(71)='metop-b', dsis(71)='mhs_metop-b', dval(71)=0.0, dthin(71)=1,
dfile(72)='iasibufr', dtype(72)='iasi', dplat(72)='metop-b', dsis(72)='iasi616_metop-b', dval(72)=0.0, dthin(72)=1,
dfile(73)='gomebufr', dtype(73)='gome', dplat(73)='metop-b', dsis(73)='gome_metop-b', dval(73)=0.0, dthin(73)=1,
dfile(74)='atmsbufr', dtype(74)='atms', dplat(74)='npp', dsis(74)='atms_npp', dval(74)=0.0, dthin(74)=1,
dfile(75)='crisbufr', dtype(75)='cris', dplat(75)='npp', dsis(75)='cris_npp', dval(75)=0.0, dthin(75)=1,
dfile(76)='modisbufr', dtype(76)='modis', dplat(76)='aqua', dsis(76)='modis_aqua', dval(76)=1.0, dthin(65)=6,
dfile(77)='modisbufr', dtype(77)='modis', dplat(77)='terra', dsis(77)='modis_terra', dval(77)=1.0, dthin(77)=6,

/
&SUPEROB_RADAR
del_azimuth=5.,del_elev=.25,del_range=5000.,del_time=.5,elev_angle_max=5.,
minnum=50,range_max=100000., l2superob_only=.false.,
/
&LAG_DATA
/
&HYBRID_ENSEMBLE
l_hyb_ens=${if_hybrid},uv_hyb_ens=.true.,
aniso_a_en=.false.,generate_ens=.false.,
n_ens=${ntotmem},
beta1_inv=0.5,s_ens_h=110,s_ens_v=-0.3,
regional_ensemble_option=${ensbk_option},
/

38
Running GSI

&RAPIDREFRESH_CLDSURF
/
&CHEM
/
&SINGLEOB_TEST
maginnov=1.0,magoberr=0.8,oneob_type='t',
oblat=38.,oblon=279.,obpres=500.,obdattim=${ANAL_TIME},
obhourset=0.,
/

EOF

Note: EOF indicates the end of GSI namelist.

The following block runs GSI and checks if GSI has successfully completed.
#
#
###################################################
# run GSI
###################################################
echo ' Run GSI with' ${bk_core} 'background'

case $ARCH in
'IBM_LSF'|'IBM_LoadLevel')
${RUN_COMMAND} ./gsi.exe < gsiparm.anl > stdout 2>&1 ;;

* )
${RUN_COMMAND} ./gsi.exe > stdout 2>&1 ;;
esac

##################################################################
# run time error check
##################################################################
error=$?

if [ ${error} -ne 0 ]; then


echo "ERROR: ${GSI} crashed Exit status=${error}"
exit ${error}
fi

The following block saves the analysis results with an understandable name and adds the
analysis time to some output file names. Among them, stdout contains runtime output of
GSI and wrf_inout is the analysis results.
#
##################################################################
#
# GSI updating satbias_in (only for cycling assimilation)
#

# Copy the output to more understandable names


cp stdout stdout.anl.${ANAL_TIME}
cp wrf_inout wrfanl.${ANAL_TIME}
ln fort.201 fit_p1.${ANAL_TIME}
ln fort.202 fit_w1.${ANAL_TIME}
ln fort.203 fit_t1.${ANAL_TIME}
ln fort.204 fit_q1.${ANAL_TIME}

39
Running GSI

ln fort.207 fit_rad1.${ANAL_TIME}

The following block collects the diagnostic files. The diagnostic files are merged and
categorized by outer loop and data type. Setting write_diag to true, directs GSI to write
out diagnostic information for each observation station. This information is very useful to
check analysis details. Please check Appendix A.2 for the tool to read and analyze these
diagnostic files.
# Loop over first and last outer loops to generate innovation
# diagnostic files for indicated observation types (groups)
#
# NOTE: Since we set miter=2 in GSI namelist SETUP, outer
# loop 03 will contain innovations with respect to
# the analysis. Creation of o-a innovation files
# is triggered by write_diag(3)=.true. The setting
# write_diag(1)=.true. turns on creation of o-g
# innovation files.
#

ls -l pe0*.* > listpe


loops="01 03"
for loop in $loops; do

case $loop in
01) string=ges;;
03) string=anl;;
*) string=$loop;;
esac

# Collect diagnostic files for obs types (groups) below


listall="conv amsua_metop-a mhs_metop-a hirs4_metop-a hirs2_n14 msu_n14 \
sndr_g08 sndr_g10 sndr_g12 sndr_g08_prep sndr_g10_prep sndr_g12_prep \
sndrd1_g08 sndrd2_g08 sndrd3_g08 sndrd4_g08 sndrd1_g10 sndrd2_g10 \
sndrd3_g10 sndrd4_g10 sndrd1_g12 sndrd2_g12 sndrd3_g12 sndrd4_g12 \
hirs3_n15 hirs3_n16 hirs3_n17 amsua_n15 amsua_n16 amsua_n17 \
amsub_n15 amsub_n16 amsub_n17 hsb_aqua airs_aqua amsua_aqua \
goes_img_g08 goes_img_g10 goes_img_g11 goes_img_g12 \
pcp_ssmi_dmsp pcp_tmi_trmm sbuv2_n16 sbuv2_n17 sbuv2_n18 \
omi_aura ssmi_f13 ssmi_f14 ssmi_f15 hirs4_n18 amsua_n18 mhs_n18 \
amsre_low_aqua amsre_mid_aqua amsre_hig_aqua ssmis_las_f16 \
ssmis_uas_f16 ssmis_img_f16 ssmis_env_f16"
for type in $listall; do
count=`grep ${type}_${loop} listpe | wc -l`
if [[ $count -gt 0 ]]; then
cat pe*${type}_${loop}* > diag_${type}_${string}.${ANAL_TIME}
fi
done
done

The following scripts clean the temporal intermediate files

# Clean working directory to save only important files


ls -l * > list_run_directory
if [ ${if_clean} = clean ]; then

40
Running GSI

echo ' Clean working directory after GSI run'


rm -f *Coeff.bin # all CRTM coefficient files
rm -f pe0* # diag files on each processor
rm -f listpe # list of diag files on each processor
rm -f obs_input.* # observation middle files
rm -f siganl sigf03 # background middle files
rm -f xhatsave.* # some information on each processor
rm -f fsize_* # delete temperal file for bufr size
fi

The GSI successfully finishes and exits with 0:


exit 0

3.3 Introduction to Frequently Used GSI Namelist Options

The complete namelist options and their explanations are listed in Appendix B. For most
GSI analysis applications, only a few namelist variables need to be changed. Here we
introduce frequently used variables for regional analyses:

1. Set up the number of outer loop and inner loop

To change the number of outer loops and the number of inner iterations in each outer loop,
you need to modify the following three variables in the namelist:
miter: number of outer loops of analysis.
st
niter(1): maximum iteration number of inner loop iterations for the 1 outer
loop. The inner loop will stop when it reaches this maximum number,
reaches the convergence condition, or when it fails to converge.
nd
niter(2): maximum iteration number of inner loop iterations for the 2 outer
loop.
If miter is larger than 2, repeat niter with larger index.

2. Set up the analysis variable for moisture

If you want to change moisture variable used in analysis, the namelist variable is:
qoption = 1 or 2:
If qoption=1, the moisture analysis variable is pseudo-relative humidity.
The saturation specific humidity, qsatg, is computed from the guess and
held constant during the inner loop. Thus, the RH control variable can only
change via changes in specific humidity, q.
If qoption=2, the moisture analysis variable is normalized RH. This
formulation allows RH to change in the inner loop via changes to surface
pressure (pressure), temperature, or specific humidity.

41
Running GSI

3. Set up the background file

The following four variables define which background field will be used in the GSI
analyses:
regional: if true, perform a regional GSI run using either ARW or NMM inputs
as the background. If false, perform a global GSI analysis. If either
wrf_nmm_regional or wrf_mass_regional are true, it will be set to
true.
wrf_nmm_regional: if true, background comes from WRF NMM. When using
other background fields, set it to false.
wrf_mass_regional: if true, background comes from WRF ARW. When using
other background fields, set it to false.
netcdf: if true, wrf files are in NetCDF format, otherwise wrf files are in binary
format. This option only works for performing a regional GSI analysis.

4. Set up the output of diagnostic files

The following variables decide if the GSI will write out diagnostic results in certain loops:
write_diag(1): if true, write out diagnostic data in the beginning of the analysis,
so that we can have information on Observation Background (O-B) .
st nd
write_diag(2) : if true, write out diagnostic data in between the 1 and 2 outer
loop .
nd
write_diag(3): if true, write out diagnostic data at the end of the 2 outer loop
(after the analysis finishes if the outer loop number is 2), so that we can
have information on Observation Analysis (O-A) .
Please check appendix A.2 for the tools to read the diagnostic files.

5. Set up the GSI recognized observation files

The following set up the GSI recognized observation files for GSI observation ingests:
ndat: number of observation variables (not observation types). This number
should be consistent with the observation variable lines in section
OBS_INPUT. If adding a new observation variable, ndat must be
incremented by one and one new line is added in OBS_INPUT. Based on
dimensions of the variables in OBS_INPUT (e.g., dfile), the maximum
value of ndat is 200 in this version.
dfile(01)='prepbufr': GSI recognized observation file name. The
observation file contains observations used for a GSI analysis. This file can
include several observation variables from different observation types. The
file name in this parameter will be read in by GSI. This name can be
changed as long as the name in the link from the BUFR/PrepBUFR file also
changes correspondingly.

42
Running GSI

dtype(01)='ps': analysis variable name that GSI can read in and handle. As an
example here, GSI will read all ps observations from the file prepbufr.
Please note this name should be consistent with that used in the GSI code.
dplat(01): sets up the observation platform for a certain observation, which will
be read in from the file dfile.
dsis(01): sets up data name (including both data type and platform name) used
inside GSI.

Please see Section 4.3 and Section 8.1 for examples and explanations to these
variables.

6. Set up observation time window

In the namelist section OBS_INPUT, use time_window_max to set maximum half time
window (hours) for all data types. In the convinfo file, you can use column twindow to set
the half time window for a certain data type (hours). For conventional observations, only
observations within the smaller window of these two will be kept to further processing. For
others, observations within time_window_max will be kept to further processing.

7. Set up data thinning

1) Radiance data thinning

Radiance data thinning is controlled through two GSI namelist variables in the section
&OBS_INPUT. Below is an example of the section:

&OBS_INPUT
dmesh(1)=120.0,dmesh(2)=60.0,dmesh(3)=60.0,dmesh(4)=60.0,dmesh(5)=120,time_window_max=1.5,
dfile(01)='prepbufr', dtype(01)='ps', dplat(01)=' ', dsis(01)='ps', dval(01)=1.0, dthin(01)=0,
dfile(11)='ssmirrbufr',dtype(11)='pcp_ssmi', dplat(11)='dmsp', dsis(11)='pcp_ssmi', dval(11)=1.0, dthin(11)=-1,
dfile(12)='tmirrbufr', dtype(12)='pcp_tmi', dplat(12)='trmm', dsis(12)='pcp_tmi', dval(12)=1.0, dthin(12)=-1,
dfile(13)='sbuvbufr', dtype(13)='sbuv2', dplat(13)='n16', dsis(13)='sbuv8_n16', dval(13)=1.0, dthin(13)=0,
dfile(14)='sbuvbufr', dtype(14)='sbuv2', dplat(14)='n17', dsis(14)='sbuv8_n17', dval(14)=1.0, dthin(14)=0,
dfile(15)='sbuvbufr', dtype(15)='sbuv2', dplat(15)='n18', dsis(15)='sbuv8_n18', dval(15)=1.0, dthin(15)=0,
dfile(17)='hirs2bufr', dtype(17)='hirs2', dplat(17)='n14', dsis(17)='hirs2_n14', dval(17)=6.0, dthin(17)=1,
dfile(18)='hirs3bufr', dtype(18)='hirs3', dplat(18)='n16', dsis(18)='hirs3_n16', dval(18)=0.0, dthin(18)=1,
dfile(19)='hirs3bufr', dtype(19)='hirs3', dplat(19)='n17', dsis(19)='hirs3_n17', dval(19)=6.0, dthin(19)=1,
dfile(20)='hirs4bufr', dtype(20)='hirs4', dplat(20)='n18', dsis(20)='hirs4_n18', dval(20)=0.0, dthin(20)=1,
dfile(27)='msubufr', dtype(27)='msu', dplat(27)='n14', dsis(27)='msu_n14', dval(27)=2.0, dthin(27)=2,
dfile(28)='amsuabufr', dtype(28)='amsua', dplat(28)='n15', dsis(28)='amsua_n15', dval(28)=10.0,dthin(28)=2,
dfile(34)='amsubbufr', dtype(34)='amsub', dplat(34)='n15', dsis(34)='amsub_n15', dval(34)=3.0, dthin(34)=3,
dfile(35)='amsubbufr', dtype(35)='amsub', dplat(35)='n16', dsis(35)='amsub_n16', dval(35)=3.0, dthin(35)=3,
dfile(41)='ssmitbufr', dtype(41)='ssmi', dplat(41)='f15', dsis(41)='ssmi_f15', dval(41)=0.0, dthin(41)=4,

The two namelist variables that control the radiance data thinning are real array dmesh
in the 1st line and the integer array dthin in the last column. The dmesh gives a set of
the mesh sizes in unit km for radiance thinning grids, while the dthin defines if the
data type it represents needs to be thinned and which thinning grid (mesh size) to use. If
the value of dthin is:

43
Running GSI

an integer less and equal to 0, no thinning is needed


an integer larger than 0, this kind of radiance data will be thinned in a thinning
grid with the mesh size defined as dmesh (dthin).

The following gives several thinning examples defined by the above sample
&OBS_INPUT section:
Data type 1 (ps in prepbufr): no thinning because dthin(01)=0
Data type 11 (pcp_ssmi from dmsp): no thinning because dthin(01)=-1
Data type 17 (hirs2 from NOAA-14): thinning in a 120 km grid because dthin(17)=1
and dmesh(1)=120
Data type 41 (ssmi from f15): thinning in a 60 km grid because dthin(41)=4 and
dmesh(4)=60

2) Conventional data thinning

The conventional data can also be thinned. However, the setup of thinning is not in the
namelist. To give users a complete picture of data thinning, conventional data thinning
is briefly introduced here. There are three columns, ithin, rmesh, pmesh, in the
convinfo file (more details on this file are in Section 4.3) to configure conventional data
thinning:

ithin: 0 = no thinning; 1 = thinning with grid mesh decided by rmesh and


pmesh
rmesh: horizontal thinning grid size in km
pmesh: vertical thinning grid size in mb; if 0, then use background vertical grid.

8. Set up background error factor

In the namelist section BKGERR, use vs to set up scale factor for vertical correlation length
and hzscl to set up scale factors for horizontal smoothing. The scale factors for the
variance of each analysis variables are set in anavinfo file.

9. Single observation test

To do single observation test, the following namelist option has to be set to true first:

oneobtest=.true.

Then go to the namelist section SINGLEOB_TEST to set up the single observation location
and variable to be tested, please see Section 4.2 for an example and details on the single
observation test.

44
Running GSI

3.4 Files in GSI Run Directory

When completed, GSI will create a number of files in the run directory. Below is an
example of the files generated in the run directory from one of the GSI test case runs. This
case was run to perform a regional GSI analysis with a WRF ARW NetCDF background
using conventional (prepbufr), radiance data (AMSU-A, AMSU-B, HIRS3, HIRS4, and
MHS), and GPSRO data. The analysis time is 12Z 22 March 2011. Four processors were
used. To make the run directory more readable, we turned on the clean option in the run
script, which deleted all temporary intermediate files:
amsuabufr fit_p1.2011032212 fort.221
amsubbufr fit_q1.2011032212 gpsrobufr
anavinfo fit_rad1.2011032212 gsi.exe
berror_stats fit_t1.2011032212 gsiparm.anl
convinfo fit_w1.2011032212 hirs3bufr
diag_amsua_metop-a_anl.2011032212 fort.201 hirs4bufr
diag_amsua_metop-a_ges.2011032212 fort.202 l2rwbufr
diag_amsua_n15_anl.2011032212 fort.203 list_run_directory
diag_amsua_n15_ges.2011032212 fort.204 mhsbufr
diag_amsua_n18_anl.2011032212 fort.205 ozinfo
diag_amsua_n18_ges.2011032212 fort.206 pcpbias_out
diag_amsub_n17_anl.2011032212 fort.207 pcpinfo
diag_amsub_n17_ges.2011032212 fort.208 prepbufr
diag_conv_anl.2011032212 fort.209 prepobs_prep.bufrtable
diag_conv_ges.2011032212 fort.210 satbias_angle
diag_hirs3_n17_anl.2011032212 fort.211 satbias_in
diag_hirs3_n17_ges.2011032212 fort.212 satbias_out
diag_hirs4_metop-a_anl.2011032212 fort.213 satinfo
diag_hirs4_metop-a_ges.2011032212 fort.214 stdout
diag_mhs_metop-a_anl.2011032212 fort.215 stdout.anl.2011032212
diag_mhs_metop-a_ges.2011032212 fort.217 wrfanl.2011032212
diag_mhs_n18_anl.2011032212 fort.218 wrf_inout
diag_mhs_n18_ges.2011032212 fort.219
errtable fort.220

There are several important files that hold GSI analysis results and diagnostic information.
We will introduce these files and their contents in detail in the following chapter. The
following is a brief list of what these files contain:
stdout.anl.2011032212/stdout: standard text output file, which is a copy
of stdout with the analysis time appended. This is the most commonly
used file to check the GSI analysis processes as well as basic and important
information about the analyses. We will explain the contents of stdout in
Section 4.1 and users are encouraged to read this file in detail to become
familiar with the order of GSI analysis processing.
wrfanl.2011032212/wrf_inout: analysis results if GSI completes
successfully it exists only if using wrf for background. This is a copy of
wrf_inout with the analysis time appended. The format is the same as the
background file.
diag_conv_anl.(time): binary diagnostic files for conventional and GPS RO
observations at the final analysis step (analysis departure for each
observation, please see Appendix A.2 for more information).
diag_conv_ges.(time): binary diagnostic files for conventional and GPS RO
observations at initial analysis step (background departure for each
observation, please see Appendix A.2 for more information)

45
Running GSI

diag_(instrument_satellite)_anl: diagnostic files for satellite radiance


observations at final analysis step.
diag_(instrument_satellite)_ges: diagnostic files for satellite radiance
observations at initial analysis step.
gsiparm.anl: namelist, generated by the run script.
fit_(variable).(time): copies of fort.2?? with meaningful names (variable
name and analysis time). They are statistic results of observation departures
from background and analysis results according to observation variables.
Please see Section 4.5 for more details.
fort.220: output from the inner loop minimization (in pcgsoi.f90). Please see
Section 4.6 for details.
anavinfo: info file to set up control variables, state variables, and background
variables. Please see Section 4.7 for details.
*info (convinfo, satinfo,): info files that control data usage. Please see
Section 4.3 for details.
berror_stats and errtable: background error file (binary) and observation
error file (text).
*bufr: observation BUFR files linked to the run directory Please see Section 3.1
for details.
satbias_in: the input coefficients of mass bias correction for satellite radiance
observations.
satbias_out: the output coefficients of mass bias correction for satellite
radiance observations after the GSI run.
satbias_angle: the input coefficients of scan angle bias correction for satellite
radiance observations.
list_run_directory : the complete list of files in the run directory before
cleaning the run directory.

Please note that diag_(instrument_satellite)_anl.(time) and


diag_conv_anl.(time) contain important information like analysis increment for each
observation (O-A), also, diag_conv_ges and diag_(instrument_satellite)_
ges.(time) include observation innovation for each observation (O-B). These files can
be very helpful in terms of getting the detailed impact of data on the analysis. A tool is
provided to process these files, which is introduced in Appendix A.2.

There are many intermediate files in this directory during the running stage; the complete
list of files before cleaning is saved in a file list_run_directory. Some knowledge about the
content of these files is very helpful for debugging if the GSI run crashes. Please check the
following list for the meaning of these files: (Note: you may not see all the files in the list
because different observational data are used. Also, the fixed files prepared for a GSI run,
such as CRTM coefficient files, are not included.)

sigf03 Background files in binary format (typically


sigf03,sigf06 and sigf09). This is a temporal file holding
binary background files. When you see this file, at the

46
Running GSI

minimum, a background file was successfully read in.


siganl Analysis results in binary format. When this file exists,
the analysis part has finished.
pe????.(conv or Diagnostic files for conventional and satellite radiance
instrument_satellite)_
(outer loop)
observations at each outer loop and each sub-domains
(????=subdomain id)
obs_input.???? Observation scratch files (each file contains observations
for one observation type within whole analysis domain
and time. ????=observation type id in namelist)
pcpbias_out Output precipitation bias correction file

47
GSI Diagnostics and Tuning

Chapter 4: GSI Diagnostics and Tuning

The purpose of this chapter is to help users understand and diagnose the GSI analysis. The
chapter starts with an introduction to the content and structure of the GSI standard output
(stdout). It continues with the use of a single observation to check the features of the GSI
analysis. An introduction to observation usage control, analysis domain partition, fit files,
and the optimization process will all be presented from information within the GSI output
files (including stdout).

4.1 Understanding Standard Output (stdout)

In Section 3.4, we listed the files present in the GSI run directory following a successful
GSI analysis and briefly introduced the contents of several important files. Of these, stdout
is the most useful because the most critical information about the GSI analysis can be
obtained from the file. From stdout, users can check if the GSI has successfully completed,
if optimal iterations look correct, and if the background and analysis fields are reasonable.
The stdout file also contains the GSI standard output from all the processors.
Understanding the content of this file can also be very helpful for users to find where and
why the GSI failed if it crashes.

The structure of stdout follows the typical steps in a meteorological data analysis system:

1. Read in all data and prepare analysis:


a. Read in configuration (namelist)
b. Read in background
c. Partition domain and data for parallel analysis
d. Read in observation(s)
e. Read in constant fields (fixed files)
2. Optimal iteration (analysis)
3. Save analysis result

In this section, the detailed structure and content of stdout are explained using an on-line
example case. This case uses a WRF-ARW NetCDF file as the background and analyzes
several observations typical for operations, including most conventional observation data,
several radiance data (AMSU-A, AMSU-B, HIRS3, HIRS4, and MHS), and GPSRO data.
The case was run on the Linux cluster supercomputer, using 4 processors. To keep the
output concise and make it more readable, most repeated content was deleted (shown by
the blue dotted line). For the same reason, the accuracy of some numbers has been reduced
to avoid line breaks in stdout.

The following indicates the start of the GSI analysis. It includes some run time information
of the analysis and control variables.

48
GSI Diagnostics and Tuning

* . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .
PROGRAM GSI_ANL HAS BEGUN. COMPILED 1999232.55 ORG: NP23
STARTING DATE-TIME JUL 13,2012 21:55:14.798 195 FRI 2456122

gsi_metguess_mod*init_: 2D-MET STATE VARIABLES:


gsi_metguess_mod*init_: 3D-MET STATE VARIABLES:
cw
gsi_metguess_mod*init_: ALL MET STATE VARIABLES:
cw
INIT_IO: set IO server task to mype_io= 3
state_vectors*init_anasv: 2D-STATE VARIABLES ps sst
state_vectors*init_anasv: 3D-STATE VARIABLES u v
tv tsen q oz cw p3d
state_vectors*init_anasv: ALL STATE VARIABLES u v
tv tsen q oz cw p3d
ps sst
control_vectors*init_anacv: 2D-CONTROL VARIABLES ARE ps sst
control_vectors*init_anacv: 3D-CONTROL VARIABLES ARE sf vp
t q oz cw
control_vectors*init_anacv: MOTLEY CONTROL VARIABLES stl sti
control_vectors*init_anacv: ALL CONTROL VARIABLES sf vp
ps t q oz sst cw
stl sti
INIT_IO: reserve units lendian_in= 15 and lendian_out= 66
for little endian i/o

Next is the content of all namelist variables used in this analysis. The 1st part shows the
4DVAR setups. Please note that while this version of the GSI includes some 4DVAR code,
it is untested in this release. The general set up for the GSI analysis (3DVAR) is located in
the &SETUP section of the GSI namelist. Please check Appendix B for definitions and
default values of each namelist variable.
GSI_4DVAR: nobs_bins = 1
SETUP_4DVAR: l4dvar= F
SETUP_4DVAR: l4densvar= F
SETUP_4DVAR: winlen= 3.00000000000000
SETUP_4DVAR: winoff= 3.00000000000000
SETUP_4DVAR: hr_obsbin= 3.00000000000000
SETUP_4DVAR: nobs_bins= 1
SETUP_4DVAR: ntlevs_ens= 1
SETUP_4DVAR: nsubwin,nhr_subwin= 1 3
SETUP_4DVAR: lsqrtb= F
SETUP_4DVAR: lbicg= F
SETUP_4DVAR: lcongrad= F
SETUP_4DVAR: lbfgsmin= F

&SETUP
GENCODE = 78.0000000000000 ,
FACTQMIN = 0.000000000000000E+000,
FACTQMAX = 0.000000000000000E+000,
DELTIM = 1200.00000000000 ,
DTPHYS = 3600.00000000000 ,
BIASCOR = -1.00000000000000 ,
BCOPTION = 1,
DIURNALBC = 0.000000000000000E+000,
NDAT = 75,
NITER = 0, 2*10, 48*0,
NITER_NO_QC = 51*1000000,
MITER = 2,
QOPTION = 2,
NHR_ASSIMILATION = 3,
MIN_OFFSET = 180,
IOUT_ITER = 220,
NPREDP = 6,

49
GSI Diagnostics and Tuning

RETRIEVAL = F,

0: /
0: &GRIDOPTS

0: &BKGERR

0: &ANBKGERR

0: &JCOPTS

0: &STRONGOPTS

0: &OBSQC

0: &SUPEROB_RADAR

0: &LAG_DATA

0: &HYBRID_ENSEMBLE

0: &RAPIDREFRESH_CLDSURF

0: &CHEM

This version of GSI attempts to read multi-time-level backgrounds, however we only have
provided one. Therefor, there is some error information at the beginning of the background
reading portion:
CONVERT_NETCDF_MASS: problem with flnm1 = wrf_inou1, Status = -1021

We can ignore these errors for missing files wrf_inou1, wrf_inou2, to wrf_inou9 because
we only ran 3DVAR with one background.
Next, the background fields for the analysis are read in and the maximum and minimum
values of the fields at each vertical level are displayed. Here, only part of the variables znu
and T are shown, and all other variables read by the GSI are listed only as the variable
name in the NetCDF file (rmse_var = ). The maximum and minimum values are useful
for a quick verification that the background fields have been read successfully.

dh1 = 3
iy,m,d,h,m,s= 2011 3 22 12 0
0
dh1 = 3
before rmse var T
after rmse var T
dh1 = 3
rmse_var = T
ndim1 = 3
ordering = XYZ

k,znu(k)= 1 0.9990000
k,znu(k)= 2 0.9960001

k,znu(k)= 49 7.1999999E-03
k,znu(k)= 50 2.3500000E-03
rmse_var=ZNW

rmse_var=RDX

rmse_var=RDY

50
GSI Diagnostics and Tuning

rmse_var=MAPFAC_M

rmse_var=XLAT

rmse_var=XLONG

rmse_var=MUB

rmse_var=MU

rmse_var=PHB

rmse_var=T
ordering=XYZ
WrfType,WRF_REAL= 104 104
ndim1= 3
staggering= N/A
start_index= 1 1 1 0
end_index= 348 247 50 0
k,max,min,mid T= 1 310.8738 234.0430 280.8286
k,max,min,mid T= 2 311.2928 235.4056 280.9952
k,max,min,mid T= 3 311.4995 237.6656 281.3005

k,max,min,mid T= 48 702.1189 616.5657 651.2628


k,max,min,mid T= 49 770.3488 684.2148 718.2207
k,max,min,mid T= 50 837.6133 760.6921 792.9601
rmse_var=QVAPOR

rmse_var=U

rmse_var=V

rmse_var=LANDMASK

rmse_var=SEAICE

rmse_var=SST

rmse_var=IVGTYP

rmse_var=ISLTYP

rmse_var=VEGFRA

rmse_var=SNOW

rmse_var=U10

rmse_var=V10

rmse_var=SMOIS

rmse_var=TSLB

rmse_var=TSK

rmse_var=QCLOUD

rmse_var=QRAIN

rmse_var=QSNOW

rmse_var=QICE

rmse_var=QGRAUP

rmse_var=RAD_TTEN_DFI

51
GSI Diagnostics and Tuning

The following example demonstrates how the analysis domain is partitioned into sub-
domains for each processor. 4 processors were used in this example (see Section 4.4 for
more information):
general_DETER_SUBDOMAIN: task,istart,jstart,ilat1,jlon1= 0 1 1 124 174
general_DETER_SUBDOMAIN: task,istart,jstart,ilat1,jlon1= 1 125 1 123 174
general_DETER_SUBDOMAIN: task,istart,jstart,ilat1,jlon1= 2 1 175 124 174
general_DETER_SUBDOMAIN: task,istart,jstart,ilat1,jlon1= 3 125 175 123 174

Information on the horizontal dimensions of the sub-domains and set grid related variables:
in general_sub2grid_create_info, kbegin= 1 77 153
228 303
in general_sub2grid_create_info, kend= 76 152 227
302
INIT_GRID_VARS: number of threads 1
INIT_GRID_VARS: for thread 1 jtstart,jtstop = 1
176

Information on using vertical levels from background fields to replace the number of
vertical levels set in anavinfo and the start index of each state variables in the bundle array:
INIT_RAD_VARS: ***WARNING*** mxlvs from the anavinfo file 30
is different from that of the guess 50
. Resetting maxlvs to match NSIG from guess.
Vars in Rad-Jacobian (dims)
--------------------------
tv 0
q 50
oz 100
u 150
v 151
sst 152

Display the analysis and background file time (they should be the same):
READ_wrf_mass_FILES: analysis date,minutes 2011 3
22 12 0 17472240
READ_wrf_mass_FILES: sigma guess file, nming2 0.000000000000000E+000
12 3 22 2011 17472240
READ_wrf_mass_FILES: sigma fcst files used in analysis : 3
3.00000000000000 1
READ_wrf_mass_FILES: surface fcst files used in analysis: 3
3.00000000000000 1
GESINFO: Guess date is 12 3 22 2011
0.000000000000000E+000
GESINFO: Analysis date is 2011 3 22 12
0 2011032212 3.00000000000000

Read in radar station information and conduct superobs for radar Level-II radial velocity.
This case didnt have radar Level-II velocity data linked. There is warning information
about opening the file but this will not impact the rest of the GSI analysis.
RADAR_BUFR_READ_ALL: problem opening level 2 bufr file "l2rwbufr"

Read in and show the content of the observation info files (see Section 4.3 for details).
Here is part of the stdout shown convinfo:
READ_CONVINFO: tcp 112 0 1 3.00000 0 0 0 75.0000 5.00000 1.00000 75.0000 0.00000 0 0.00000 0.00000 0
READ_CONVINFO: ps 120 0 1 3.00000 0 0 0 4.00000 3.00000 1.00000 4.00000 0.30E-03 0 0.00000 0.00000 0

52
GSI Diagnostics and Tuning

READ_CONVINFO: ps 180 0 1 3.00000 0 0 0 4.00000 3.00000 1.00000 4.00000 0.30E-03 0 0.00000 0.00000 0

READ_CONVINFO: t 120 0 1 3.00000 0 0 0 8.00000 5.60000 1.30000 8.00000 0.10E-05 0 0.00000 0.00000 0
READ_CONVINFO: t 126 0 -1 3.00000 0 0 0 8.00000 5.60000 1.30000 8.00000 0.10E-02 0 0.00000 0.00000 0

READ_CONVINFO: gps 440 0 -1 3.00000 0 0 0 10.0000 10.0000 1.00000 10.0000 0.00000 0 0.00000 0.0000 0

Read in background fields from intermediate binary file sigf03 and partition the whole 2D
field into subdomains for parallel analysis:
READ_WRF_MASS_GUESS: open lendian_in= 15 to file=sigf03
READ_WRF_MASS_GUESS: open lendian_in= 15 to file=sigf03
ifld, temp1(im/2,jm/2)= 4 0.28100E+03
READ_WRF_MASS_GUESS: open lendian_in= 15 to file=sigf03
ifld, temp1(im/2,jm/2)= 2 0.52546E+04
ifld, temp1(im/2,jm/2)= 3 0.28083E+03
gsi_metguess_mod*create_: alloc() for met-guess done
at 0 in read_wrf_mass_guess
at 0.1 in read_wrf_mass_guess
at 1 in read_wrf_mass_guess, lm = 50
at 1 in read_wrf_mass_guess, num_mass_fields= 214
at 1 in read_wrf_mass_guess, nfldsig = 1
at 1 in read_wrf_mass_guess, num_all_fields= 214
at 1 in read_wrf_mass_guess, npe = 4
at 1 in read_wrf_mass_guess, num_loc_groups= 53
at 1 in read_wrf_mass_guess, num_all_pad = 216
at 1 in read_wrf_mass_guess, num_loc_groups= 54
READ_WRF_MASS_GUESS: open lendian_in= 15 to file=sigf03
ifld, temp1(im/2,jm/2)= 1 0.93598E+05
ifld, temp1(im/2,jm/2)= 5 0.28130E+03
ifld, temp1(im/2,jm/2)= 6 0.28213E+03
ifld, temp1(im/2,jm/2)= 7 0.28372E+03

ifld, temp1(im/2,jm/2)= 212 0.37183E+00


ifld, temp1(im/2,jm/2)= 213 0.27570E+03
ifld, temp1(im/2,jm/2)= 214 0.27570E+03
in read_wrf_mass_guess, num_doubtful_sfct_all = 0
in read_wrf_mass_guess, num_doubtful_sfct_all = 0

Display information on the allocation of control variable array.


control_vectors: length= 6710564
control_vectors: currently allocated= 0
control_vectors: maximum allocated= 0
control_vectors: number of allocates= 0
control_vectors: number of deallocates= 0
control_vectors: Estimated max memory used= 0.0 Mb

Show the source of observation error used in the analysis (details see Section 7.1.1):
CONVERR: using observation errors from user provided table

Read in all the observations. The observation ingesting processes are distributed over all
the processors with each processor reading in at least one observation type. To speed up
reading some of the large datasets, one type can be read in by more than one (ntasks)
processors.

Before reading in the data from BUFR files, GSI resets the file status depending on
whether the observation time matches the analysis time and how offtime_date is set. This
step also checks for consistency between the satellite radiance data types in the BUFR files

53
GSI Diagnostics and Tuning

and the usage setups in the satinfo files. The following shows stdout information from this
step:
read_obs_check: bufr file date is 2011032212 prepbufr t
read_obs_check: bufr file date is 2011032212 prepbufr q
read_obs_check: bufr file date is 2011032212 prepbufr ps
read_obs_check: bufr file date is 0 satwnd uv not used
read_obs_check: bufr file date is 0 radarbufr rw not used
read_obs_check: bufr file date is 0 tmirrbufr pcp_tmi not used
read_obs_check: bufr file date is 0 omibufr omi not used
read_obs_check: bufr file date is 0 gimgrbufr goes_img not used
read_obs_check: bufr file date is 2011032212 prepbufr uv
read_obs_check: bufr file date is 2011032212 amsuabufr amsua

read_obs_check: bufr file date is 2011032212 amsuabufr amsua


read_obs_check: bufr file date is 2011032212 amsubbufr amsub
read_obs_check: bufr file date is 0 amsubbufrears amsub
not used

read_obs_check: bufr file date is 0 sbuvbufr sbuv2 not used


data type hirs3_n16 not used in info file -- do not read file
hirs3bufr
data type hirs4_n18 not used in info file -- do not read file
hirs4bufr

read_obs_check: bufr file date is 0 iasibufr iasi not used

The list of observation types that will be read in:


number of extra processors 2
READ_OBS: read 19 hirs3 hirs3_n17 using ntasks= 2 0 2
READ_OBS: read 21 hirs4 hirs4_metop-a using ntasks= 2 2 2
READ_OBS: read 36 amsub amsub_n17 using ntasks= 2 0 2
READ_OBS: read 37 mhs mhs_n18 using ntasks= 2 2 2
READ_OBS: read 38 mhs mhs_metop-a using ntasks= 2 0 2
READ_OBS: read 63 mhs mhs_n19 using ntasks= 2 2 2
READ_OBS: read 1 ps ps using ntasks= 1 0 1
READ_OBS: read 2 t t using ntasks= 1 1 1
READ_OBS: read 3 q q using ntasks= 1 2 1
READ_OBS: read 4 pw pw using ntasks= 1 3 1
READ_OBS: read 6 uv uv using ntasks= 1 0 1
READ_OBS: read 10 sst sst using ntasks= 1 1 1
READ_OBS: read 11 gps_ref gps using ntasks= 1 2 1
READ_OBS: read 28 amsua amsua_n15 using ntasks= 1 3 1
READ_OBS: read 31 amsua amsua_n18 using ntasks= 1 0 1
READ_OBS: read 32 amsua amsua_metop-a using ntasks= 1 1 1
READ_OBS: read 61 hirs4 hirs4_n19 using ntasks= 1 2 1
READ_OBS: read 62 amsua amsua_n19 using ntasks= 1 3 1

Display basic statistics for full horizontal surface fields:


GETSFC: enter with nlat_sfc,nlon_sfc= 0 0 and nlat,nlon=
247 348
GETSFC: set nlat_sfc,nlon_sfc= 247 348
================================================================================
Status Var Mean Min Max
sfcges2 FC10 1.000000000000E+00 1.000000000000E+00 1.000000000000E+00
sfcges2 SNOW 7.733149095057E+00 0.000000000000E+00 3.080491943359E+02
sfcges2 VFRC 5.299080975923E-02 0.000000000000E+00 8.883580780029E-01
sfcges2 SRGH 5.000000000008E-02 5.000000000000E-02 5.000000000000E-02
sfcges2 STMP 2.807239946831E+02 2.168921508789E+02 3.027479553223E+02
sfcges2 SMST 7.959968731205E-01 4.999999888241E-03 1.071443676949E+00
sfcges2 SST 2.807239944456E+02 2.168921356201E+02 3.027479553223E+02
sfcges2 VTYP 1.554879240542E+01 1.000000000000E+00 2.400000000000E+01
sfcges2 ISLI 4.321164316627E-01 0.000000000000E+00 2.000000000000E+00
sfcges2 STYP 1.156151984736E+01 1.000000000000E+00 1.600000000000E+01
================================================================================

54
GSI Diagnostics and Tuning

Loop over all data files to read in observations:


READ_BUFRTOVS: file=hirs4bufr type=hirs4 sis=hirs4_metop-a nread= 189563
ithin= 1 rmesh=120.000000 isfcalc= 0 ndata= 1178 ntask= 2
READ_BUFRTOVS: file=hirs3bufr type=hirs3 sis=hirs3_n17 nread= 202426
ithin= 1 rmesh=120.000000 isfcalc= 0 ndata= 1235 ntask= 2

READ_GPS: time outside window -2.81666666666667 skip this report


READ_GPS: time outside window -2.95000000000000 skip this report

READ_GPS : file=gpsrobufr type=gps_ref sis=gps nread= 22566


ithin= 0 rmesh=120.000000 isfcalc= 0 ndata= 18130 ntask= 1

READ_PREPBUFR: file=prepbufr type=uv sis=uv nread= 91930


ithin= 0 rmesh=120.000000 isfcalc= 0 ndata= 71038 ntask= 1
READ_BUFRTOVS: file=amsuabufr type=amsua sis=amsua_n18 nread= 63825
ithin= 2 rmesh= 60.000000 isfcalc= 0 ndata= 47704 ntask= 1

Using the above output information, many details on the observations can be obtained. For
example, the last line (bold) indicates that subroutine READ_BUFRTOVS was called to
read in NOAA-18 AMSU-A (sis=amsua_n18) from the BUFR file amsuabufr
(file=amsuabufr). Furthermore, this kind of data has 63825 observations in the file
(nread=63825) and 47704 in analysis domain and time-window (ndata=47704). The data
was thinned on a 60 km coarse grid (rmesh=60.000000).

The next step partitions observations into subdomains. The observation distribution is
summarized below by listing the number of observations for each observation variable in
each subdomain (see Section 4.4 for more information):
OBS_PARA: ps 2607 2878 9565 3019
OBS_PARA: t 5172 4743 13902 5590
OBS_PARA: q 4107 4197 11998 4090
OBS_PARA: pw 296 92 475 83
OBS_PARA: uv 6640 5439 18109 5725
OBS_PARA: sst 0 0 6 3
OBS_PARA: gps_ref 3538 5580 2277 6768
OBS_PARA: hirs3 n17 0 0 30 35
OBS_PARA: hirs4 metop-a 0 0 28 35
OBS_PARA: amsua n15 2564 1333 133 195
OBS_PARA: amsua n18 1000 2117 0 90
OBS_PARA: amsua metop-a 0 0 58 67
OBS_PARA: amsub n17 0 0 57 70
OBS_PARA: mhs n18 1438 2942 0 119
OBS_PARA: mhs metop-a 0 0 54 71
OBS_PARA: hirs4 n19 246 1097 0 37
OBS_PARA: amsua n19 656 3503 0 93
OBS_PARA: mhs n19 936 4277 0 116

Information on ingesting background error statistics:


m_berror_stats_reg::berror_read_bal_reg(PREBAL_REG): get balance variables"
berror_stats". mype,nsigstat,nlatstat = 0 60 93
m_berror_stats_reg::berror_read_wgt_reg(PREWGT_REG): read error amplitudes "
berror_stats". mype,nsigstat,nlatstat = 0 60 93
Assigned default statistics to variable oz
Assigned default statistics to variable cw

55
GSI Diagnostics and Tuning

From this point forward in the stdout, the output shows many repeated entries. This is
because the information is written from inside the outer loop. Typically the outer loop is
iterated twice.

For each outer loop, the work begins with the calculation of the observation innovation.
This calculation is done by the subroutine setuprhsall, which sets up the right hand side
(rhs) of the analysis equation. This information is contained within the stdout file, which is
discussed in the following sections:

Start the first outer analysis loop:


GLBSOI: jiter,jiterstart,jiterlast,jiterend= 1 1
2 1

Calculate observation innovation for each data type in the first outer loop:
SETUPALL:,obstype,isis,nreal,nchanl=ps ps 22
0 2607
SETUPALL:,obstype,isis,nreal,nchanl=t t 24
0 5172

Read in CRTM coefficients for the first outer loop, compute radiance observation
innovation, and save diagnostic information in the first outer loop. In stdout, only
information related to available radiance data are printed. The complete information can be
found in the diagnostic files for each observation (for details see Appendix A.2):
SETUPALL:,obstype,isis,nreal,nchanl=amsua amsua_n15 33
15 2564
INIT_CRTM: crtm_init() on path "./"
Read_SpcCoeff_Binary(INFORMATION) : FILE: ./amsua_n15.SpcCoeff.bin; ^M
SpcCoeff RELEASE.VERSION: 7.03 N_CHANNELS=15^M
amsua_n15 AntCorr RELEASE.VERSION: 1.04 N_FOVS=30 N_CHANNELS=15
Read_ODPS_Binary(INFORMATION) : FILE: ./amsua_n15.TauCoeff.bin; ^M
ODPS RELEASE.VERSION: 2.01 N_LAYERS=100 N_COMPONENTS=2 N_ABSORBERS=1 N_CHANNELS=15
N_COEFFS=21600
Read_EmisCoeff_Binary(INFORMATION) : FILE: ./EmisCoeff.bin; ^M
EmisCoeff RELEASE.VERSION: 2.02 N_ANGLES= 16 N_FREQUENCIES= 2223 N_WIND_SPEEDS= 11
SETUPRAD: write header record for amsua_n15 5 30
8 0 0 15 0 13784
to file pe0000.amsua_n15_01 2011032212
INIT_CRTM: crtm_init() on path "./"
Read_SpcCoeff_Binary(INFORMATION) : FILE: ./hirs3_n17.SpcCoeff.bin; ^M
SpcCoeff RELEASE.VERSION: 7.02 N_CHANNELS=19
Read_ODPS_Binary(INFORMATION) : FILE: ./hirs3_n17.TauCoeff.bin; ^M
ODPS RELEASE.VERSION: 2.01 N_LAYERS=100 N_COMPONENTS=5 N_ABSORBERS=3 N_CHANNELS=19
N_COEFFS=84300
Read_EmisCoeff_Binary(INFORMATION) : FILE: ./EmisCoeff.bin; ^M
EmisCoeff RELEASE.VERSION: 2.02 N_ANGLES= 16 N_FREQUENCIES= 2223 N_WIND_SPEEDS= 11
SETUPRAD: write header record for hirs3_n17 5 30
8 0 0 15 0 13784
to file pe0002.hirs3_n17_01 2011032212

SETUPRAD: write header record for mhs_n19 5 30


8 0 0 15 0 13784
to file pe0000.mhs_n19_01 2011032212
obsdiags: Bytes per element= 91
obsdiags: length total, used= 189601 47597
obsdiags: Estimated memory usage= 16.5 Mb

56
GSI Diagnostics and Tuning

The inner iteration of the first outer loop is discussed in the example below. In this simple
example, the maximum number of iterations is 10:
Print Jo components (observation term for each observation type in cost function) at the
beginning of the inner loop:
Begin Jo table outer loop
Observation Type Nobs Jo Jo/n
surface pressure 15600 1.1832900994269770E+04 0.759
temperature 11913 1.9028504337386752E+04 1.597
wind 36726 4.0133809857345797E+04 1.093
moisture 4416 4.6817824733959324E+03 1.060
gps 8709 1.8551360623628108E+04 2.130
radiance 53749 4.8794660264558479E+04 0.908
Nobs Jo Jo/n
Jo Global 131113 1.4302301855058485E+05 1.091
End Jo table outer loop

Print cost function values for each inner iteration (see section 4.6 for more details):
GLBSOI: START pcgsoi jiter= 1
pcgsoi: gnorm(1:2),b= 1.156412696509E+07 1.156412696509277448E+07 0.0000000000E+00
stprat 0.959014637164523
stprat 5.926551062234699E-014
Initial cost function = 1.430230185505848494E+05
Initial gradient norm = 3.400606852473948493E+03
Minimization iteration 0
grepcost J,Jb,Jo,Jc,Jl = 1 0 1.430230185505848494E+05 0.000000000000000000E+00
1.430230185505848494E+05 0.000000000000000000E+00 0.000000000000000000E+00
grepgrad grad,reduction= 1 0 3.400606852473948493E+03 1.000000000000000000E+00
pcgsoi: cost,grad,step = 1 0 1.430230185505E+05 3.4006068524E+03 2.4398954427E-03
pcgsoi: gnorm(1:2),b= 1.870988434476E+06 1.8709884344764E+06 1.617924500590633419E-01
stprat 0.680228405155974
stprat 4.467539812170933E-015
Minimization iteration 1
grepcost J,Jb,Jo,Jc,Jl = 1 1 1.148077578695608245E+05 6.884228595044261567E+01
1.147389155836103746E+05 0.000000000000000000E+00 0.000000000000000000E+00
grepgrad grad,reduction= 1 1 1.367840792810494804E+03 4.022343223284949310E-01
pcgsoi: cost,grad,step = 1 1 1.14807757869E+05 1.3678407928E+03 7.63011937910E-03
pcgsoi: gnorm(1:2),b= 1.701652049920E+06 1.7016520499209E+06 9.094936230309182967E-01
stprat 0.375750738871811
stprat 1.030833603466626E-016

Minimization iteration 10
grepcost J,Jb,Jo,Jc,Jl = 1 10 6.518198181954051688E+04 4.712017911695538714E+03
6.046996390784497635E+04 0.000000000000000000E+00 0.000000000000000000E+00
grepgrad grad,reduction= 1 10 3.526173233561088409E+02 1.036924698012588603E-01
pcgsoi: cost,grad,step = 1 10 6.5181981819E+04 3.526173233E+02 9.3430503337038E-03
Minimization final diagnostics

At the end of the 1st outer loop, print some diagnostics about the guess fields after
adding the analysis increment to the guess and diagnostics about the analysis
increment:
pcgsoi: Updating guess
================================================================================
Status Var Mean Min Max
analysis U 8.687820027834E+00 -5.993215723446E+01 7.019321109527E+01
analysis V -8.389957616632E-01 -7.818069678361E+01 9.543296554002E+01
analysis TV 2.402283737571E+02 1.913718225601E+02 3.039049699223E+02
analysis Q 1.527484398981E-03 1.000000000000E-07 1.836425129904E-02
analysis TSEN 2.399649493443E+02 1.913715400572E+02 3.008258699568E+02
analysis OZ 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00
analysis DUMY 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00
analysis DIV 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00
analysis VOR 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00

57
GSI Diagnostics and Tuning

analysis PRSL 4.097302066597E+01 1.148126437759E+00 1.038437446031E+02


analysis PS 9.917365877107E+01 6.406160090103E+01 1.039817643172E+02
analysis SST 2.807239944456E+02 2.168921356201E+02 3.027479553223E+02
analysis radb -3.450042141774E-03 -3.668133494481E+01 4.186399153656E+01
analysis pcpb 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00
================================================================================
wwww u -2.694876007555E-02 -1.101437973098E+01 9.323673570665E+00
wwww v -2.027691113373E-03 -1.030344877527E+01 1.236183232954E+01
wwww tv -2.574050797903E-01 -5.802385852340E+00 2.846798028189E+00
wwww tsen -2.569174827651E-01 -5.802377539978E+00 2.846838216350E+00
wwww q -3.328819711997E-06 -2.793073305892E-03 3.887865406977E-03
wwww oz 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00
wwww cw 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00
wwww p3d 1.060550728782E-02 -8.703821275248E-02 2.381276521565E-01
wwww ps 2.592449429408E-02 -8.703821275248E-02 2.381276521565E-01
wwww sst 2.945225217239E-03 -5.763484248069E-01 4.716956592383E-01

Information on the state variable array allocation after cleaning up major fields at the
end of the inner loop.
state_vectors: latlon11,latlon1n,latlon1n1,lat2,lon2,nsig= 22176
1108800 1130976 126 176 50
state_vectors: length= 8936928
state_vectors: currently allocated= 0
state_vectors: maximum allocated= 4
state_vectors: number of allocates= 4
state_vectors: number of deallocates= 4
state_vectors: Estimated max memory used= 272.9 Mb

Start the second outer loop.


GLBSOI: jiter,jiterstart,jiterlast,jiterend= 2 1
2 1

Calculate observation innovations for each data type in the second outer loop:
SETUPALL:,obstype,isis,nreal,nchanl=ps ps 22
0 2607
SETUPALL:,obstype,isis,nreal,nchanl=t t 24
0 5172

Read in CRTM coefficients from the fix file for the second outer loop and save diagnostic
information at the beginning of the second outer loop:
SETUPALL:,obstype,isis,nreal,nchanl=amsua amsua_n15 33
15 2564
INIT_CRTM: crtm_init() on path "./"
Read_SpcCoeff_Binary(INFORMATION) : FILE: ./amsua_n15.SpcCoeff.bin; ^M
SpcCoeff RELEASE.VERSION: 7.03 N_CHANNELS=15^M
amsua_n15 AntCorr RELEASE.VERSION: 1.04 N_FOVS=30 N_CHANNELS=15
Read_ODPS_Binary(INFORMATION) : FILE: ./amsua_n15.TauCoeff.bin; ^M
ODPS RELEASE.VERSION: 2.01 N_LAYERS=100 N_COMPONENTS=2 N_ABSORBERS=1 N_CHANNELS=15
N_COEFFS=21600
Read_EmisCoeff_Binary(INFORMATION) : FILE: ./EmisCoeff.bin; ^M
EmisCoeff RELEASE.VERSION: 2.02 N_ANGLES= 16 N_FREQUENCIES= 2223 N_WIND_SPEEDS= 11
INIT_CRTM: crtm_init() on path "./"
Read_SpcCoeff_Binary(INFORMATION) : FILE: ./hirs4_metop-a.SpcCoeff.bin; ^M
SpcCoeff RELEASE.VERSION: 7.02 N_CHANNELS=19
Read_ODPS_Binary(INFORMATION) : FILE: ./hirs4_metop-a.TauCoeff.bin; ^M
ODPS RELEASE.VERSION: 2.01 N_LAYERS=100 N_COMPONENTS=5 N_ABSORBERS=3 N_CHANNELS=19
N_COEFFS=82700
Read_EmisCoeff_Binary(INFORMATION) : FILE: ./EmisCoeff.bin; ^M
EmisCoeff RELEASE.VERSION: 2.02 N_ANGLES= 16 N_FREQUENCIES= 2223 N_WIND_SPEEDS

58
GSI Diagnostics and Tuning

The output from the inner iterations in the second outer loop is shown below. In this
example, the maximum number of iterations is 10:
Print Jo components (observation term for each observation type in cost function) at the
beginning of the inner loop:
Begin Jo table outer loop
Observation Type Nobs Jo Jo/n
surface pressure 15620 6.0353797873191743E+03 0.386
temperature 11913 1.0272593707203929E+04 0.862
wind 36796 2.2776352264954159E+04 0.619
moisture 4416 2.2085428390180150E+03 0.500
gps 8860 8.9698796153000221E+03 1.012
radiance 84607 4.1586556746867529E+04 0.492
Nobs Jo Jo/n
Jo Global 162212 9.1849304960662834E+04 0.566
End Jo table outer loop

Print cost function values for each inner iteration (see section 4.6 for more details):
GLBSOI: START pcgsoi jiter= 2
pcgsoi: gnorm(1:2),b= 6.04436368069045E+06 6.04436368069045E+06 0.000000000000000E+00
stprat 0.952177909610047
stprat 3.022462323621298E-014
Initial cost function = 9.709172928659517493E+04
Initial gradient norm = 2.458528763445824097E+03
Minimization iteration 0
grepcost J,Jb,Jo,Jc,Jl = 2 0 9.709172928659517493E+04 5.242424325932335705E+03
9.184930496066283376E+04 0.000000000000000000E+00 0.000000000000000000E+00
grepgrad grad,reduction= 2 0 2.458528763445824097E+03 1.000000000000000000E+00
pcgsoi: cost,grad,step = 2 0 9.70917292865E+04 2.45852876344E+03 2.0910838314E-03
pcgsoi: gnorm(1:2),b= 1.39687516886273E+06 1.39687516886273E+06 2.311037592468580E-01
stprat 0.326022963208770
stprat 3.222291707173269E-015
Minimization iteration 1
grepcost J,Jb,Jo,Jc,Jl = 2 1 8.445245812257242505E+04 5.411029823593031324E+03
7.904142829897940101E+04 0.000000000000000000E+00 0.000000000000000000E+00
grepgrad grad,reduction= 2 1 1.181894736794584333E+03 4.807325236000348778E-01
pcgsoi: cost,grad,step = 2 1 8.44524581225E+04 1.18189473679E+03 3.1026039720E-03
pcgsoi: gnorm(1:2),b= 6.2897182736819E+05 6.2897182736819E+05 4.5027060498202475E-01
stprat 1.600925505225033E-002
stprat 1.290408460808865E-015

Minimization iteration 10
grepcost J,Jb,Jo,Jc,Jl = 2 10 7.074054129933725926E+04 6.952319149296819887E+03
6.378822215004044119E+04 0.000000000000000000E+00 0.000000000000000000E+00
grepgrad grad,reduction= 2 10 3.275896517354244679E+02 1.332462148119354928E-01
pcgsoi: cost,grad,step = 2 10 7.07405412993E+04 3.27589651735E+02 4.3417883383E-03
Minimization final diagnostics

Because the outer loop is set to 2, the completion of the 2nd outer loop is the end of the
analysis. The next step is to save the analysis results. Again, only a portion of variable T is
shown and all other variables are listed according to variable name in the NetCDF file
(rmse_var = ). The maximum and minimum values are useful information for a quick
check of the reasonableness of the analysis:
pcgsoi: Updating guess
at 2 in wrwrfmassa
at 3 in wrwrfmassa
at 6 in wrwrfmassa
iy,m,d,h,m,s= 2011 3 22 12 0
0
nlon,lat,sig_regional= 348 247 50

59
GSI Diagnostics and Tuning

rmse_var=P_TOP
ordering=0
WrfType,WRF_REAL= 104 104
ndim1= 0
staggering= N/A
start_index= 1 1 1 0
end_index1= 348 247 50 0
p_top= 1000.000
rmse_var=MUB
ordering=XY
WrfType,WRF_REAL= 104 104
ndim1= 2
staggering= N/A
start_index= 1 1 1 0
end_index1= 348 247 50 0
max,min MUB= 99645.99 62950.09
max,min psfc= 103974.1 64035.82
max,min MU= 3974.133 -2036.453
rmse_var=MU
ordering=XY
WrfType,WRF_REAL= 104 104
ndim1= 2
staggering= N/A
start_index= 1 1 1 0
end_index1= 348 247 50 0
k,max,min,mid T= 1 311.0948 234.0167 280.5982
k,max,min,mid T= 2 311.5334 235.6663 280.7569

k,max,min,mid T= 49 762.8171 679.6612 706.3553


k,max,min,mid T= 50 828.8514 754.9369 777.9498
rmse_var=T

rmse_var=QVAPOR

rmse_var=U

rmse_var=V

rmse_var=SEAICE

rmse_var=SST

rmse_var=TSK

Diagnostics of the analysis results after adding the analysis increment to the guess and
diagnostics about the analysis increment:
================================================================================
Status Var Mean Min Max
analysis U 8.684494834819E+00 -5.994932102119E+01 7.041893161310E+01
analysis V -8.368213328561E-01 -7.803787125994E+01 9.484754997846E+01
analysis TV 2.401499796179E+02 1.913068477406E+02 3.039191344395E+02
analysis Q 1.535424180027E-03 1.000000000000E-07 1.869392878573E-02
analysis TSEN 2.398851864729E+02 1.913065654009E+02 3.008630161153E+02
analysis OZ 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00
analysis DUMY 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00
analysis DIV 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00
analysis VOR 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00
analysis PRSL 4.098357903854E+01 1.148194759980E+00 1.038787838788E+02
analysis PS 9.917332267920E+01 6.406096161559E+01 1.039929254169E+02
analysis SST 2.807239944456E+02 2.168921356201E+02 3.027479553223E+02
analysis radb -7.769461831636E-03 -4.254112280406E+01 4.610384289368E+01
analysis pcpb 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00
================================================================================
wwww u -3.325193015686E-03 -2.341066364048E+00 2.228906207570E+00
wwww v 2.174428807068E-03 -2.228629595272E+00 3.099467379467E+00
wwww tv -7.839413921041E-02 -1.677489609269E+00 1.525438097693E+00

60
GSI Diagnostics and Tuning

wwww tsen -7.976378952151E-02 -1.677484898836E+00 1.525436684058E+00


wwww q 7.939568093893E-06 -2.000539224496E-03 2.329500623959E-03
wwww oz 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00
wwww cw 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00
wwww p3d -1.374925497693E-04 -3.162044407889E-02 2.258895669507E-02
wwww ps -3.360918742725E-04 -3.162044407889E-02 2.258895669507E-02
wwww sst 5.598866783354E-03 -4.125074055517E-01 3.498591820290E-01

Information on control variable allocation after cleaning up major fields at the end of the
2nd outer loop:
state_vectors: latlon11,latlon1n,latlon1n1,lat2,lon2,nsig= 22176
1108800 1130976 126 176 50
state_vectors: length= 8936928
state_vectors: currently allocated= 0
state_vectors: maximum allocated= 4
state_vectors: number of allocates= 8
state_vectors: number of deallocates= 8
state_vectors: Estimated max memory used= 272.9 Mb

After completion of the analysis, the subroutine setuprhsall is called again if


write_diag(3)=.true.,to calculate analysis O-A information. Because CRTM
initialization is inside setuprhsall, we can see the section reading in the CRTM coefficients
(the third time to see this information):
SETUPALL:,obstype,isis,nreal,nchanl=ps ps 22
0 2607

SETUPALL:,obstype,isis,nreal,nchanl=amsua amsua_n15 33
15 2564
INIT_CRTM: crtm_init() on path "./"
Read_SpcCoeff_Binary(INFORMATION) : FILE: ./amsua_n15.SpcCoeff.bin; ^M
SpcCoeff RELEASE.VERSION: 7.03 N_CHANNELS=15^M
amsua_n15 AntCorr RELEASE.VERSION: 1.04 N_FOVS=30 N_CHANNELS=15
Read_ODPS_Binary(INFORMATION) : FILE: ./amsua_n15.TauCoeff.bin; ^M
ODPS RELEASE.VERSION: 2.01 N_LAYERS=100 N_COMPONENTS=2 N_ABSORBERS=1 N_CHANNELS=15
N_COEFFS=21600
Read_EmisCoeff_Binary(INFORMATION) : FILE: ./EmisCoeff.bin; ^M
EmisCoeff RELEASE.VERSION: 2.02 N_ANGLES= 16 N_FREQUENCIES= 2223 N_WIND_SPEEDS= 11
SETUPRAD: write header record for amsua_n15 5 30
8 0 0 15 0 13784
to file pe0000.amsua_n15_03 2011032212

INIT_CRTM: crtm_init() on path "./"


Read_SpcCoeff_Binary(INFORMATION) : FILE: ./mhs_n19.SpcCoeff.bin; ^M
SpcCoeff RELEASE.VERSION: 7.03 N_CHANNELS=5^M
mhs_n19 AntCorr RELEASE.VERSION: 1.05 N_FOVS=90 N_CHANNELS=5
Read_ODPS_Binary(INFORMATION) : FILE: ./mhs_n19.TauCoeff.bin; ^M
ODPS RELEASE.VERSION: 2.01 N_LAYERS=100 N_COMPONENTS=2 N_ABSORBERS=1 N_CHANNELS=5
N_COEFFS=10200
Read_EmisCoeff_Binary(INFORMATION) : FILE: ./EmisCoeff.bin; ^M
EmisCoeff RELEASE.VERSION: 2.02 N_ANGLES= 16 N_FREQUENCIES= 2223 N_WIND_SPEEDS= 11
SETUPRAD: write header record for mhs_n19 5 30
8 0 0 15 0 13784
to file pe0000.mhs_n19_03 2011032212
obsdiags: Bytes per element= 91
obsdiags: length total, used= 189601 69044
obsdiags: Estimated memory usage= 16.5 Mb

61
GSI Diagnostics and Tuning

Print Jo components (observation term for each observation type) after the analysis, which
shows the fit of the analysis results to the data if compare to the same section before the 1st
outer loop:
Begin Jo table outer loop
Observation Type Nobs Jo Jo/n
surface pressure 15620 5.6788282318981810E+03 0.364
temperature 11913 9.6271243544834779E+03 0.808
wind 36794 2.1433478710069721E+04 0.583
moisture 4417 2.0790935066632146E+03 0.471
gps 8868 8.1188601604511678E+03 0.916
radiance 100900 3.2527599143479922E+04 0.322
Nobs Jo Jo/n
Jo Global 178512 7.9464984107045690E+04 0.445
End Jo table outer loop

The end of the GSI analysis (a successful analysis must reach this end, but to reach this end
is not necessarily a successful analysis):
ENDING DATE-TIME JUL 13,2012 21:58:55.998 195 FRI 2456122
PROGRAM GSI_ANL HAS ENDED. IBM RS/6000 SP
* . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .

For runs on IBM computers only, additional machine resource statistics are provided. This
section gives very useful information about the computer resources used in the analysis.
Linux computer platforms do not provide this sort of information. The user must look for
other ways to find the time and memory information about the analysis run.
0:*****************RESOURCE STATISTICS*******************************
0:
0:The total amount of wall time = 551.066561
0:The total amount of time in user mode = 544.004598
0:The total amount of time in sys mode = 0.718078
0:The maximum resident set size (KB) = 2087516
0:Average shared memory use in text segment (KB*sec) = 3096668
0:Average unshared memory use in data segment (KB*sec) = 1082940860
0:Average unshared memory use in stack segment(KB*sec) = 0
0:Number of page faults without I/O activity = 34260
0:Number of page faults with I/O activity = 697
0:Number of times process was swapped out = 0
0:Number of times filesystem performed INPUT = 0
0:Number of times filesystem performed OUTPUT = 0
0:Number of IPC messages sent = 0
0:Number of IPC messages received = 0
0:Number of Signals delivered = 0
0:Number of Voluntary Context Switches = 1098
0:Number of InVoluntary Context Switches = 596
0:*****************END OF RESOURCE STATISTICS*************************
0:

62
GSI Diagnostics and Tuning

4.2 Single Observation Test

A single observation test is a GSI run with only one (pseudo) observation at a specific
location of the analysis domain. By examining the analysis increments, one can visualize
the important features of the analysis, such as the ratio of background and observation
variance and the pattern of the background error covariance. Therefore, the single
observation test is the first check that users should do after successfully installing the GSI.

4.2.1 Setup a single observation test:

To perform the single observation test with the GSI, the following GSI namelist variables
need to be set, which should be done through editing the run script (run_gsi.ksh):

Under the &SETUP section, turn on the single observation test:

oneobtest=.true.,

under the &SINGLEOB_TEST section, set up single observation features like:

maginnov=1.0,
magoberr=1.0,
oneob_type='t',
oblat=20.,
oblon=285.,
obpres=850.,
obdattim= 2007081512,
obhourset=0.,

Note:
Please check Section 3.4 for the explanation of each parameter. From these
parameters, we can see that a useful observation in the analysis should include
information like the observation type (oneob_type), value (maginnov), error
(magoberr), location (oblat, oblong, obpres) and time (obdattim,
obhourset). Users can dump out (ncdump) the global attributes from the NetCDF
background file and set oblat=CEN_LAT, oblong=360-CEN_LON to have the
observation in the center of the domain.
In the analysis, the GSI first generates a prepbufr file including only one
observation based on the information given in the namelist &SINGLEOB_TEST
section. To generate this prepbufr file, the GSI needs to read in a PrepBUFR table,
which is not needed when generating a GSI analysis with real observations. The
table can be found in the fix/ directory and needs to be copied to the run directory.
Please check if the following lines are in the GSI run script before running the
single observation test:
bufrtable=${FIX_ROOT}/prepobs_prep.bufrtable
cp $bufrtable ./prepobs_prep.bufrtable

63
GSI Diagnostics and Tuning

4.2.2. Examples of single observation tests for GSI


To give users a taste of the single observation test, a single temperature observation
(oneob_type='t') with 1 degree innovation (maginnov=1.0) and 1 degree observation
error (magoberr=1.0) was used to perform the test.

A horizontal cross section (left column) and a vertical cross section (right column) of
analysis increment of T, U, V, and Q from a single T observation

64
GSI Diagnostics and Tuning

This single observation was located at the center of the domain. The results are shown with
figures of the horizontal and vertical cross sections through the point of maximum analysis
increment. The above figure was generated using NCL scripts, which can be found in the
util/Analysis_Utilities/plots_ncl directory, introduced in Section A.4.

4.3 Control Data Usage

Observation data used in the GSI analysis are controlled by three parts of the GSI system:

1. Link observation BUFR files to working directory in GSI run script:

All BUFR/PrepBUFR observation files need to be linked to the working directory using
GSI recognized names. The run script (run_gsi.ksh) makes these links after locating the
working directory. Turning on or off these links can control the use of all the data types
contained in the BUFR files. In Section 3.1, we provide a table listing all default GSI
recognized observation file names and the corresponding examples of the observation
BUFR files from NCEP. The following is the first 3 rows of the table as an example:

GSI Name Content Example file names


prepbufr Conventional observations, including gdas1.t12z.prepbufr
ps, t, q, pw, uv, spd, dw, sst, from
observation platforms such as METAR,
sounding, et al.
satwnd satellite winds gdas1.t12z.satwnd.tm00.bufr_d
amsuabufr AMSU-A 1b radiance (brightness gdas1.t12z.1bamua.tm00.bufr_d
temperatures) from satellites NOAA-15,
16, 17,18, 19 and METOP-A/B

The left column is the GSI recognized name (bold) and the right column are names of
BUFR files from NCEP. In the run script, the following lines are used to link the BUFR
files in the right column to the working directory using the GSI recognized names shown in
the left column:

# Link to the prepbufr data


ln -s ${PREPBUFR} ./prepbufr

# Link to the radiance data


ln -s ${OBS_ROOT}/gdas1.t12z.1bamua.tm00.bufr_d amsuabufr

The GSI recognized default observation filenames are set up in the namelist section
&OBS_INPUT, which certainly can be changed based on application needs (details see
below).

65
GSI Diagnostics and Tuning

2. In GSI namelist (inside run script), section &OBS_INPUT:

In this namelist section, observation files (dfile) are linked to the observed variables used
in the GSI analysis (dsis), for example:
dfile(01)='prepbufr', dtype(01)='ps', dplat(01)=' ', dsis(01)='ps', dval(01)=1.0, dthin(01)=0,
dfile(02)='prepbufr' dtype(02)='t', dplat(02)=' ', dsis(02)='t', dval(02)=1.0, dthin(02)=0,
dfile(03)='prepbufr', dtype(03)='q', dplat(03)=' ', dsis(03)='q', dval(03)=1.0, dthin(03)=0,

dfile(28)='amsuabufr', dtype(28)='amsua', dplat(28)='n15', dsis(28)='amsua_n15', dval(28)=10.0, dthin(28)=2,


dfile(29)='amsuabufr', dtype(29)='amsua', dplat(29)='n16', dsis(29)='amsua_n16', dval(29)=0.0, dthin(29)=2,

Here, conventional observations ps, t, and q will be read in from file prepbufr and
AMSU-A radiances from NOAA-15 and -16 satellites will be read in from amsuabufr.
Deleting a particular line in &OBS_INPUT will turn off the use of this observation variable in
the GSI analysis but other variables under the same type still can be used. For example, if
we delete:
dfile(28)='amsuabufr', dtype(28)='amsua', dplat(28)='n15', dsis(28)='amsua_n15', dval(28)=10.0, dthin(28)=2,

Then, the AMSU-A observation from NOAA-15 will not be used in the analysis but the
AMSU-A observations from NOAA-16 will still be used.

The observation filename in dfile can be different from the sample script (run_gsi.ksh). If
the filename in dfile has been changed, the link from the BUFR files to the GSI recognized
name in the run script also need to be changed correspondingly. For example, if we change
the dfile(28):
dfile(28)='amsuabufr_n15', dtype(28)='amsua', dplat(28)='n15', dsis(28)='amsua_n15', dval(28)=10.0, dthin(28)=2,
dfile(29)='amsuabufr', dtype(29)='amsua', dplat(29)='n16', dsis(29)='amsua_n16', dval(29)=0.0, dthin(29)=2,

and a new link need to be added in the run script:

# Link to the radiance data


ln -s ${OBS_ROOT}/le_gdas1.t12z.1bamua.tm00.bufr_d amsuabufr
ln -s ${OBS_ROOT}/le_gdas1.t12z.1bamua.tm00.bufr_d amsuabufr_n15

The GSI will read NOAA-16 AMSU-A observations from file amsuabufr and NOAA-15
AMSU-A observations from file amsuabufr_n15 based on the above changes to the run
scripts and namelist. In this example, both amsuabufr and amsuabufr_15 are linked to
the same BUFR file and NOAA-15 AMSU-A and NOAA-16 AMSU-A observations are
still read in from the same BUFR file. If amsuabufr and amsuabufr_15 link to different
BUFR files, then NOAA-15 AMSU-A and NOAA-16 AMSU-A will be read in from
different BUFR files. Clearly, the changeable filename in dfile gives GSI more capability
to handle multiple data resources.

66
GSI Diagnostics and Tuning

3. Use info files

For each variable, observations can come from multiple platforms (data types or
observation instruments). For example, surface pressure (ps) can come from METAR
observation stations (data type 187) and Rawinsonde data (data type 120). There are
several files named *info in the GSI system (located in /fix) to control the usage of
observations based on the observation platform. Below is a list of info files and their
function:

File name Function and Content


in GSI
convinfo Control the usage of conventional data, including tcp, ps, t, q, pw,
sst, uv, spd, dw, radial wind (Level 2 rw and 2.5 srw), gps
satinfo Control the usage of satellite data. Instruments include AMSU-
A/B, HIRS3/4, MHS, ssmi, ssmis, iasi, airs, amsrem sndr, cris, etc.
and satellites include NOAA 16, 17, 18, 19, aqua, GOES 11, 12,
13, METOP-A/B, NPP, etc.
ozinfo Control the usage of ozone data, including sbuv6, 8 from NOAA
14, 16, 17, 18, 19. omi_aura, gome_metop-a, mls_aura
pcpinfo Control the usage of precipitation data, including pcp_ssmi,
pcp_tmi
aeroinfo Control the usage of aerosol data, including modis_aqua and
modis_terra

The header of each info file includes an explanation of the content of the file. Here we
discuss the most commonly used two info files:

convinfo

The convinfo is to control the usage of conventional data. The following is the part of
the content of convinfo:
!otype type sub iuse twindow numgrp ngroup nmiter gross ermax ermin var_b var_pg ithin rmesh pmesh npred
tcp 112 0 1 3.0 0 0 0 75.0 5.0 1.0 75.0 0.000000 0 0. 0. 0
ps 120 0 1 3.0 0 0 0 4.0 3.0 1.0 4.0 0.000300 0 0. 0. 0
ps 132 0 -1 3.0 0 0 0 4.0 3.0 1.0 4.0 0.000300 0 0. 0. 0
ps 180 0 1 3.0 0 0 0 4.0 3.0 1.0 4.0 0.000300 0 0. 0. 0
ps 181 0 1 3.0 0 0 0 3.6 3.0 1.0 3.6 0.000300 0 0. 0. 0
ps 182 0 1 3.0 0 0 0 4.0 3.0 1.0 4.0 0.000300 0 0. 0. 0
ps 183 0 -1 3.0 0 0 0 4.0 3.0 1.0 4.0 0.000300 0 0. 0. 0
ps 187 0 1 3.0 0 0 0 4.0 3.0 1.0 4.0 0.000300 0 0. 0. 0
t 120 0 1 3.0 0 0 0 8.0 5.6 1.3 8.0 0.000001 0 0. 0. 0
t 126 0 -1 3.0 0 0 0 8.0 5.6 1.3 8.0 0.001000 0 0. 0. 0
t 130 0 1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.001000 0 0. 0. 0
t 131 0 1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.001000 0 0. 0. 0
t 132 0 1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.001000 0 0. 0. 0
t 133 0 1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.004000 0 0. 0. 0
t 134 0 -1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.004000 0 0. 0. 0
t 135 0 -1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.004000 0 0. 0. 0
t 180 0 1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.004000 0 0. 0. 0
t 181 0 -1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.004000 0 0. 0. 0
t 182 0 1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.004000 0 0. 0. 0

67
GSI Diagnostics and Tuning

The meaning of each column is explained in the header of the file and listed also in the
following table:

Column Content of the column


Name
otype observation variables (t, uv, q, etc.)
type prepbufr observation type (if available)
sub prepbufr subtype (not yet available)
iuse flag if to use/not use / monitor data
=1, use data, the data type will be read and used in the analysis after quality
control
=0, read in and process data, use for quality control, but do NOT assimilate
=-1, monitor data. This data type will be read in but not be used in the GSI
analysis
twindow time window (+/- hours) for data used in the analysis
numgrp cross validation parameter - number of groups
ngroup cross validation parameter - group to remove from data use
nmiter cross validation parameter - external iteration to introduce removed data
gross gross error parameter - gross error
ermax gross error parameter maximum error
ermin gross error parameter minimum error
var_b variational quality control parameter - b parameter
var_pg variational quality control parameter - pg parameter
ithin Flag to turn on thinning (0, no thinning, 1 - thinning)
rmesh size of horizontal thinning mesh (in kilometers)
pmesh size of vertical thinning mesh
npred Number of bias correction predictors

From this table, we can see that parameter iuse is used to control the usage of data and
parameter twindow is to control the time window of data usage. Parameters gross,
ermax, and ermin are for gross quality control.

satinfo:
The satinfo file contains information about the channels, sensors, and satellites. It
specifies observation error (cloudy or clean) for each channel, how to use the channels
(assimilate, monitor, etc), and other useful information. The following is the part of the
content of satinfo:
!sensor/instr/sat chan iuse error error_cld ermax var_b var_pg
amsua_n15 1 1 3.000 9.100 4.500 10.000 0.000
amsua_n15 2 1 2.000 13.500 4.500 10.000 0.000
amsua_n15 3 1 2.000 7.100 4.500 10.000 0.000
amsua_n15 4 1 0.600 1.300 2.500 10.000 0.000

amsua_n15 14 -1 2.000 1.400 4.500 10.000 0.000


amsua_n15 15 1 3.000 10.000 4.500 10.000 0.000
hirs3_n17 1 -1 2.000 0.000 4.500 10.000 0.000
hirs3_n17 2 -1 0.600 0.000 2.500 10.000 0.000

68
GSI Diagnostics and Tuning

The meaning of each column is explained in the following table:

Column Name Content of the column


sensor/ instr/sat Sensor, instrument, and satellite name
chan Channel number for certain senor
iuse = 1, use this channel data
=-1, dont use this channel data
error Variance for each satellite channel
error_cld Variance for each satellite channel if it is cloudy
ermax Error maximum for gross check to observations
var_b Possible range of variable for gross errors
var_pg Probability of gross error

4.4 Domain Partition for Parallelization and Observation Distribution

In the standard output file (stdout), there is an information block that lists information
regarding each sub-domain partition, including domain number (task), starting point
(istart,jstart), and dimension of the sub-domain (ilat1,jlon1). Here is an
example from the case we showed in Section 4.1. Please note that 4 processors were used
in the analysis:
general_DETER_SUBDOMAIN: task,istart,jstart,ilat1,jlon1= 0 1 1 124 174
general_DETER_SUBDOMAIN: task,istart,jstart,ilat1,jlon1= 1 125 1 123 174
general_DETER_SUBDOMAIN: task,istart,jstart,ilat1,jlon1= 2 1 175 124 174
general_DETER_SUBDOMAIN: task,istart,jstart,ilat1,jlon1= 3 125 175 123 174

The standard output file (stdout) also has an information block that shows the distribution
of different kinds of observations in each sub-domain. This block follows the observation
input section. The following is the observation distribution of the case shown in Section
4.1. From the case introduction, we know the prepbufr (conventional data), radiance BUFR
files, and GPS BUFR files were used. In this list, the conventional observations (ps, t, q,
pw, uv, and sst), GPS (gps_ref), and radiance data (amusa, amsub, hirs3/4, and mhs from
Metop-a, NOAA 15, 17, 18, and 19) were distributed among 4 sub-domains:

Observation type number of observations in each subdomain


OBS_PARA: ps 2607 2878 9565 3019
OBS_PARA: t 5172 4743 13902 5590
OBS_PARA: q 4107 4197 11998 4090
OBS_PARA: pw 296 92 475 83
OBS_PARA: uv 6640 5439 18109 5725
OBS_PARA: sst 0 0 6 3
OBS_PARA: gps_ref 3538 5580 2277 6768
OBS_PARA: hirs3 n17 0 0 30 35
OBS_PARA: hirs4 metop-a 0 0 28 35
OBS_PARA: amsua n15 2564 1333 133 195
OBS_PARA: amsua n18 1000 2117 0 90
OBS_PARA: amsua metop-a 0 0 58 67
OBS_PARA: amsub n17 0 0 57 70
OBS_PARA: mhs n18 1438 2942 0 119
OBS_PARA: mhs metop-a 0 0 54 71
OBS_PARA: hirs4 n19 246 1097 0 37
OBS_PARA: amsua n19 656 3503 0 93
OBS_PARA: mhs n19 936 4277 0 116

69
GSI Diagnostics and Tuning

This list is a good way to quickly check which kinds of data are used in the analysis and
how they are distributed in the analysis domain.

4.5 Observation Innovation Statistics

The GSI analysis gives a group of files named fort.2* (other than fort.220, see explanation
on fort.220 in next section) to summarize observations fitting to the current solution in each
outer loop. The content of each of these files is listed in the following table:

File name Variables in file Ranges/units


fort.201 or fit of surface pressure data mb
fit_p1.analysis_time
fort.202 or fit of wind data m/s
fit_w1.analysis_time
fort.203 or fit of temperature data K
fit_t1.analysis_time
fort.204 or fit of q data percent of
fit_q1.analysis_time guess qsaturation
fort.205 fit of precipitation water data mm
fort.206 fit of ozone observations from sbuv6_n14 (,
_n16, _n17, _n18), sbuv8_n16 (, _n17, _n18,
_n19), omi_aura, gome_metop-a/b, mls_aura
fort.207 or fit of satellite radiance data including:
fit_rad1.analysis_time amsua_n15(, n16, n17, n18, metop-a, aqua,
n19), amsub_n15(, n16, n17,),
hirs3_n16(, n17), hirs4_n18 (, metop-a, n19),
mhs_n18(, metop-a, n19), hirs2_n14,
iasi616_metop-a, airs281SUBSET_aqua,
avhrr3_n16(, n17, n18), amsre_aqua,
ssmi_f13(, f14, f15), ssmis_f16, msu_n14,
sndr_g11(, g12, g13), imgr_g11(, g12, g13),
sndrD1(, 2, 3, 4)_g11(, g12,g13),

fort.208 fit of pcp_ssmi, pcp_tmi


fort.209 fit of radar radial wind (rw)
fort.210 fit of lidar wind (dw)
fort.211 fit of srw1, srw2
fort.212 fit of GPS data fractional
difference
fort.213 fit of conventional sst data C
fort.214 Tropical cyclone central pressure
fort.215 Lagrangian data

To help users understand the information inside these files, the following are some
examples of the contents in these files and corresponding explanations.

70
GSI Diagnostics and Tuning

4.5.1 Conventional observations

Example of files including single level data (fort.201, fort.205, fort.213)


pressure levels (hPa)= 0.0 2000.0
it obs type stype count bias rms cpen qcpen
o-g 01 ps 120 0000 141 0.2326 1.0637 1.6900 1.6613
o-g 01 ps 180 0000 2210 0.2623 1.1016 1.4846 1.2854
o-g 01 ps 181 0000 725 0.1493 1.2380 2.0757 1.8844
o-g 01 ps 187 0000 12524 0.5000 1.0227 0.5437 0.5061
o-g 01 all 15600 0.4476 1.0455 0.7585 0.6910
o-g 01 ps rej 120 0000 2 -7.5859 8.7800 0.0000 0.0000
o-g 01 ps rej 180 0000 16 0.4209 7.0383 0.0000 0.0000
o-g 01 ps rej 181 0000 49 0.5711 57.0891 0.0000 0.0000
o-g 01 ps rej 183 0000 6 -19.2918 19.5258 0.0000 0.0000
o-g 01 ps rej 187 0000 38 -2.2101 7.4631 0.0000 0.0000
o-g 01 rej all 111 -1.6233 38.5608 0.0000 0.0000
o-g 01 ps mon 120 0000 2 0.5141 0.5372 0.4911 0.4911
o-g 01 ps mon 180 0000 115 -0.1094 1.4396 5.4329 4.4811
o-g 01 ps mon 181 0000 257 0.3725 1.2673 4.7674 3.9711
o-g 01 ps mon 183 0000 1378 -0.8872 1.9598 0.0000 0.0000
o-g 01 ps mon 187 0000 245 -0.2670 1.2678 3.5709 2.7001
o-g 01 mon all 1997 -0.6028 1.7815 1.3650 1.1009

Example of files including multiple level data (fort.202, fort.203, fort.204)


ptop 1000.0 900.0 800.0 600.0 100.0 50.0 0.0
it obs type styp pbot 1200.0 1000.0 900.0 800.0 150.0 100.0 2000.0
----------------------------------------------------------------------------------------
o-g 01 uv 220 0000 count 135 428 427 838 688 978 8061
o-g 01 uv 220 0000 bias -0.43 0.93 0.85 0.61 0.31 0.23 0.64
o-g 01 uv 220 0000 rms 3.62 3.68 4.21 4.13 5.61 5.33 4.84
o-g 01 uv 220 0000 cpen 0.69 0.92 1.21 1.30 1.74 1.53 1.39
o-g 01 uv 220 0000 qcpen 0.69 0.92 1.21 1.30 1.72 1.52 1.38
o-g 01 uv 223 0000 count 0 20 122 595 326 93 3169
o-g 01 uv 223 0000 bias 0.00 -1.27 0.38 0.14 0.00 0.92 -0.40
o-g 01 uv 223 0000 rms 0.00 4.83 4.94 4.35 5.56 5.66 4.81
o-g 01 uv 223 0000 cpen 0.00 2.34 2.07 1.14 1.37 1.07 1.48
o-g 01 uv 223 0000 qcpen 0.00 2.34 2.05 1.08 1.34 1.07 1.43

o-g 01 all count 1962 1400 1488 2688 1014 1071 18363
o-g 01 all bias -0.31 0.92 0.56 0.00 0.21 0.29 0.27
o-g 01 all rms 3.02 3.88 4.68 4.65 5.59 5.36 4.74
o-g 01 all cpen 0.45 0.53 0.85 0.94 1.62 1.49 1.09
o-g 01 all qcpen 0.45 0.53 0.84 0.92 1.60 1.48 1.08
o-g 01 uv rej 220 0000 count 0 0 0 0 0 0 292
o-g 01 uv rej 220 0000 bias 0.00 0.00 0.00 0.00 0.00 0.00 2.81
o-g 01 uv rej 220 0000 rms 0.00 0.00 0.00 0.00 0.00 0.00 8.69
o-g 01 uv rej 220 0000 cpen 0.00 0.00 0.00 0.00 0.00 0.00 0.00
o-g 01 uv rej 220 0000 qcpen 0.00 0.00 0.00 0.00 0.00 0.00 0.00

o-g 01 rej all count 34 10 8 19 2 1 429


o-g 01 rej all bias 10.52 4.64 2.94 6.31 24.83 28.94 2.00
o-g 01 rej all rms 17.64 13.49 23.91 44.74 61.69 59.27 22.52
o-g 01 rej all cpen 0.00 0.00 0.00 0.00 0.00 0.00 0.00
o-g 01 rej all qcpen 0.00 0.00 0.00 0.00 0.00 0.00 0.00
o-g 01 uv mon 220 0000 count 3 2 5 3 11 40 145
o-g 01 uv mon 220 0000 bias -1.55 -2.31 7.31 0.04 -3.49 2.02 1.74
o-g 01 uv mon 220 0000 rms 3.92 3.68 15.18 2.94 0.65 8.67 8.93
o-g 01 uv mon 220 0000 cpen 0.00 0.00 0.00 0.00 0.00 0.00 0.00
o-g 01 uv mon 220 0000 qcpen 0.00 0.00 0.00 0.00 0.00 0.00 0.00

o-g 01 mon all count 4091 9546 1653 958 51 63 16727


o-g 01 mon all bias -0.92 -0.25 0.18 -4.74 -0.64 0.15 -0.81
o-g 01 mon all rms 2.92 3.22 5.93 11.09 19.81 12.48 5.11
o-g 01 mon all cpen 1.32 1.28 0.87 0.68 0.00 0.00 1.18
o-g 01 mon all qcpen 1.26 1.24 0.83 0.60 0.00 0.00 1.14

71
GSI Diagnostics and Tuning

Please note 4 layers from 600 to 200 hPa have been deleted to make each row fit into one
line. Only observation type 220 and 223 are shown as an example.
The following table lists the meaning of each item in file fort.201-213 except file fort.207:

Name explanation
it outer loop
= 01: observation background
= 02: observation analysis (after 1st outer loop)
= 03: observation analysis fields (after 2nd outer loop)
obs observation variable (such as uv, ps) and usage of the type, which
include:
blank: used in GSI analysis
mon: monitored, (read in but not assimilated by GSI).
rej: rejected because of quality control in GSI
type prepbufr observation type (see BUFR Users Guide for details)
styp prepbufr observation subtype (not used now)
ptop for multiple level data: pressure at the top of the layer
pbot for multiple level data: pressure at the bottom of the layer
count The number of observations summarized under observation types
and vertical layers
bias Bias of observation departure for each outer loop (it)
rms Root Mean Square of observation departure for each outer loop (it)
cpen Observation part of penalty (cost function)
qcpen nonlinear qc penalty

The contents of the file fort.2* are calculated based on O-B or O-A for each observation.
This detailed information about each observation is saved in the diagnostic files. For the
content of the diagnostic files, please check the content of the array rdiagbuf in one of the
setup subroutines for conventional data, for example, setupt.f90. We provide a tool in
appendix A.2 to help users read in the information from the diagnostic files.

These fit files give lots of useful information on how data are analyzed by the GSI, such as
how many observations are used and rejected, what is the bias and rms for certain data
types or for all observations, and how analysis results fit to the observation before and after
analysis. Again, we use observation type 220 in fort.202 (fit_w1.2011032212) as an
example to illustrate how to read this information. The fit information for observation type
220 (sounding observation) is listed in the next page. Like the previous example, 4 layers
from 600 to 200 hPa were deleted to make each row fit into one line. Also, the qcpen
lines have been deleted because they have the same values as cpen lines. Otherwise, all
fit information of observation type 220 are shown.

72
GSI Diagnostics and Tuning

ptop 1000.0 900.0 800.0 600.0 100.0 50.0 0.0


it obs type styp pbot 1200.0 1000.0 900.0 800.0 150.0 100.0 2000.0
----------------------------------------------------------------------------------------
o-g 01 uv 220 0000 count 135 428 427 838 688 978 8061
o-g 01 uv 220 0000 bias -0.43 0.93 0.85 0.61 0.31 0.23 0.64
o-g 01 uv 220 0000 rms 3.62 3.68 4.21 4.13 5.61 5.33 4.84
o-g 01 uv 220 0000 cpen 0.69 0.92 1.21 1.30 1.74 1.53 1.39

o-g 01 uv rej 220 0000 count 0 0 0 0 0 0 292


o-g 01 uv rej 220 0000 bias 0.00 0.00 0.00 0.00 0.00 0.00 2.81
o-g 01 uv rej 220 0000 rms 0.00 0.00 0.00 0.00 0.00 0.00 8.69
o-g 01 uv rej 220 0000 cpen 0.00 0.00 0.00 0.00 0.00 0.00 0.00

o-g 01 uv mon 220 0000 count 3 2 5 3 11 40 145


o-g 01 uv mon 220 0000 bias -1.55 -2.31 7.31 0.04 -3.49 2.02 1.74
o-g 01 uv mon 220 0000 rms 3.92 3.68 15.18 2.94 0.65 8.67 8.93
o-g 01 uv mon 220 0000 cpen 0.00 0.00 0.00 0.00 0.00 0.00 0.00

ptop 1000.0 900.0 800.0 600.0 100.0 50.0 0.0


it obs type styp pbot 1200.0 1000.0 900.0 800.0 150.0 100.0 2000.0
----------------------------------------------------------------------------------------
o-g 02 uv 220 0000 count 135 428 427 838 688 978 8061
o-g 02 uv 220 0000 bias -0.22 0.80 0.50 0.52 0.17 0.65 0.56
o-g 02 uv 220 0000 rms 3.14 2.93 2.99 2.76 4.75 4.90 3.97
o-g 02 uv 220 0000 cpen 0.49 0.58 0.59 0.59 1.26 1.29 0.90

o-g 02 uv rej 220 0000 count 0 0 0 0 0 0 292


o-g 02 uv rej 220 0000 bias 0.00 0.00 0.00 0.00 0.00 0.00 2.97
o-g 02 uv rej 220 0000 rms 0.00 0.00 0.00 0.00 0.00 0.00 8.94
o-g 02 uv rej 220 0000 cpen 0.00 0.00 0.00 0.00 0.00 0.00 0.00

o-g 02 uv rej 220 0000 count 0 0 0 0 0 0 292


o-g 02 uv rej 220 0000 bias 0.00 0.00 0.00 0.00 0.00 0.00 2.97
o-g 02 uv rej 220 0000 rms 0.00 0.00 0.00 0.00 0.00 0.00 8.94
o-g 02 uv rej 220 0000 cpen 0.00 0.00 0.00 0.00 0.00 0.00 0.00

ptop 1000.0 900.0 800.0 600.0 100.0 50.0 0.0


it obs type styp pbot 1200.0 1000.0 900.0 800.0 150.0 100.0 2000.0
----------------------------------------------------------------------------------------
o-g 03 uv 220 0000 count 135 428 427 838 688 978 8061
o-g 03 uv 220 0000 bias -0.20 0.75 0.50 0.47 0.17 0.63 0.57
o-g 03 uv 220 0000 rms 3.01 2.77 2.82 2.57 4.62 4.82 3.86
o-g 03 uv 220 0000 cpen 0.44 0.51 0.52 0.51 1.19 1.25 0.85

o-g 03 uv rej 220 0000 count 0 0 0 0 0 0 292


o-g 03 uv rej 220 0000 bias 0.00 0.00 0.00 0.00 0.00 0.00 3.02
o-g 03 uv rej 220 0000 rms 0.00 0.00 0.00 0.00 0.00 0.00 8.97
o-g 03 uv rej 220 0000 cpen 0.00 0.00 0.00 0.00 0.00 0.00 0.00

o-g 03 uv mon 220 0000 count 3 2 5 3 11 40 145


o-g 03 uv mon 220 0000 bias -0.78 -1.45 7.89 0.00 -4.70 2.56 1.91
o-g 03 uv mon 220 0000 rms 2.99 2.61 15.59 2.58 11.45 9.15 9.29
o-g 03 uv mon 220 0000 cpen 0.00 0.00 0.00 0.00 0.00 0.00 0.00

In loop section o-g 01, from count line, we can see there are 8061 sounding
observations used in the analysis. Among them, 135 are within the 1000-1200 hPa layer.
Also from the count lines, in the rejection and monitoring section, there are 292
observations rejected and 145 observations being monitored. In the same loop section, from
the bias line and rms lines, we can see the total bias and rms of O-B for soundings is
0.64 and 4.84. The bias and rms of each layer for sounding observation can also be found
in the file.

73
GSI Diagnostics and Tuning

When reading bias and rms values from different loops, as shown with the comparison in
the following three lines:
o-g 01 uv 220 0000 rms 3.62 3.68 4.21 4.13 5.61 5.33 4.84
o-g 02 uv 220 0000 rms 3.14 2.93 2.99 2.76 4.75 4.90 3.97
o-g 03 uv 220 0000 rms 3.01 2.77 2.82 2.57 4.62 4.82 3.86

we can see the rms reduced from 4.84 (o-g 01, which is O-B) to 3.97 (o-g 02, which is O-A
after 1st outer loop) and then to 3.86 (o-g 03, which is O-A after 2nd outer loop, the final
analysis result). The reduction in the rms shows the observation type 220 (sounding) was
used in the GSI analysis to modify the background fields to fit to the observations. Please
note this example only used 10 iterations for each outer loop.

4.5.2 Satellite radiance

The file fort.207 is statistic file for radiance data. Its contents include important
information about the radiance data analysis.

The first part of the file fort.207 lists the content of the file satinfo, which is the info file to
control the data usage for radiance data. This portion begins with the line:
RADINFO_READ: jpch_rad= 2680

This shows there are 2680 channels listed in the satinfo file and the 2680 lines following
this line include the detailed setups in the satinfo file for each channel.

The second part of the file is a list of the coefficients for mass bias correction, which begins
with the line:
RADINFO_READ: guess air mass bias correction coefficients below

Each channel has 5 coefficients listed in a line. Therefore, there are 2680 lines of mass bias
correction coefficients for all channels though some of the coefficients are 0.

The 3rd part of the fort.207 file is similar to other fit files with content repeated in 3
sections to give detailed statistic information about the data in stages before the 1st outer
loop, between 1st and 2nd outer loop, and after 2nd outer loop. The results before the 1st
outer loop are used here as an example to explain the content of the statistic results:

Summaries for various statistics as a function of observation type


sat type penalty nobs iland isnoice icoast ireduce ivarl nlgross
metop-a hirs4 274.88309655 62 1 9 0 17 169 0
qcpenalty qc1 qc2 qc3 qc4 qc5 qc6 qc7
274.88309655 1 52 458 129 0 0 0

sat type penalty nobs iland isnoice icoast ireduce ivarl nlgross
n15 amsua 24874.97687020 4149 673 1475 268 1311 29170 0
qcpenalty qc1 qc2 qc3 qc4 qc5 qc6 qc7
24874.97687020 947 94 1277 183 0 20 46

74
GSI Diagnostics and Tuning

rad total penalty_all= 48794.6602645585226


rad total qcpenalty_all= 48794.6602645585226
rad total failed nonlinqc= 0

The following table lists the meaning of each item in the above statistics:

Name explanation
sat satellite name
type instrument type
penalty contribution to cost function from this observation type
nobs number of good observations used in the assimilation
iland number of observations over land
isnoice number of observations over sea ice and snow
icoast number of observations over coast
ireduce number of observations that reduce qc bounds in tropics
ivarl number of observations tossed by gross check
nlgross number of observation tossed by nonlinear qc
qcpenalty nonlinear qc penalty from this data type
qc1-7 number of observations whose quality control criteria has been
adjusted by each qc method (1-7), details see Section 8.3
rad total summary of penalty for all radiance observation types
penalty_all
rad total summary of qcpenalty for all radiance observation types
qcpenalty_all
rad total summary of observation tossed by nonlinear qc for all radiance
failed nonlinqc observation types

Note: one observation may include multiple channels, not all channels are used in the
analysis.

Summaries for various statistics as a function of channel


1 2 3 4 5 6 7 8 9 10 11
1 1 amsua_n15 1134 157 3.000 -0.6857597 0.4101223 0.2064464 2.1807583 2.1418465
2 2 amsua_n15 1104 186 2.000 -1.0022746 -0.2686506 0.2293813 2.1137854 2.0966439
3 3 amsua_n15 1180 9 2.000 1.3881152 -1.0935757 0.3919319 2.1384668 1.8376976
4 4 amsua_n15 1266 0 0.600 -0.1387697 -0.1315877 0.2929980 0.4427612 0.4227555
5 5 amsua_n15 1266 3 0.300 -0.1162391 -0.2221524 0.9228904 0.3347665 0.2504335
6 6 amsua_n15 2293 4 0.230 -1.5129871 -0.3181822 2.2705673 0.3760536 0.2004405
7 7 amsua_n15 2216 1933 0.250 -0.6290472 -0.4804044 3.3199152 0.5077016 0.1642332
8 8 amsua_n15 1826 2323 0.275 -0.7189088 -0.5710642 3.5927614 0.5909604 0.1520519
9 9 amsua_n15 629 3520 0.340 -1.2802610 -0.8482793 4.5752183 0.8554846 0.1107981
10 10 amsua_n15 19 4130 0.400 -0.9309823 -0.9484701 4.7934543 0.9528062 0.0907966
12 12 amsua_n15 859 3290 1.000 0.7343006 0.0869171 0.0672267 0.5131227 0.5057077
15 15 amsua_n15 760 597 3.000 1.8184975 -0.9903618 0.3098348 2.5077433 2.3039010

61 2 hirs4_metop-a 30 32 0.600 0.0627054 0.8577433 1.4361066 0.8908604 0.2406420


62 3 hirs4_metop-a 43 19 0.530 0.1164134 0.8431586 2.1301347 0.8934206 0.2954387
63 4 hirs4_metop-a 58 3 0.400 -0.1779762 0.1575476 0.4386076 0.3442761 0.3061124
64 5 hirs4_metop- 7 5 0.360 -0.4521126 -0.8968262 2.9409352 0.9013240 0.0899321
65 6 hirs4_metop 7 3 0.460 -0.4749403 -0.8017649 1.4791934 0.8109521 0.1217222
66 7 hirs4_metop-a 8 0 0.570 -0.2872839 -0.8601251 0.9927962 0.9251662 0.3407599
67 8 hirs4_metop-a 8 0 1.000 1.6804195 1.7202181 0.9008130 1.8297700 0.6236248
69 10 hirs4_metop-a 7 1 0.600 1.1985978 0.7861630 0.5144437 0.8915189 0.4204208

75
GSI Diagnostics and Tuning

The following table lists the meaning of each column in above statistics:

Column content
#
1 series number of the channel in satinfo file
2 channel number for certain radiance observation type
3 radiance observation type (for example: amsua_n15)
4 number of observations (nobs) used in GSI analysis within this channel
5 number of observations (nobs) tossed by gross check within this channel
6 variance for each satellite channel
7 bias (observation-guess before bias correction)
8 bias (observation-guess after bias correction)
9 penalty contribution from this channel
10 (observation-guess with bias correction)**2
11 standard deviation

Final summary for each observation type


it satellite instrument # read # keep # assim penalty qcpnlty cpen qccpen
o-g 01 rad n16 hirs3 0 0 0 0.0000 0.0000 0.0000 0.0000
o-g 01 rad n17 hirs3 202426 1235 0 0.0000 0.0000 0.0000 0.0000
o-g 01 rad n18 hirs4 0 0 0 0.0000 0.0000 0.0000 0.0000
o-g 01 rad metop-a hirs4 189563 1178 264 274.88 274.88 1.0412 1.0412
o-g 01 rad g11 sndr 0 0 0 0.0000 0.0000 0.0000 0.0000
o-g 01 rad g12 sndr 0 0 0 0.0000 0.0000 0.0000 0.0000
o-g 01 rad g11 goes_img 0 0 0 0.0000 0.0000 0.0000 0.0000
o-g 01 rad g12 goes_img 0 0 0 0.0000 0.0000 0.0000 0.0000
o-g 01 rad aqua airs 0 0 0 0.0000 0.0000 0.0000 0.0000
o-g 01 rad n14 msu 0 0 0 0.0000 0.0000 0.0000 0.0000
o-g 01 rad n15 amsua 128055 53932 14552 24875. 24875. 1.7094 1.7094

The following table lists the meaning of each column in the above statistics:

Name Explanation
it stage (o-g 01 rad = before 1st outer loop for radiance data)
satellite satellite name (n15=NOAA-15)
instrument instrument name (AMSU-A)
# read number of data (channel value) read in within analysis time window
and domain
# keep number of data (channel value) after data thinning
# assim number of data (channel value) used in analysis (passed all qc process)
penalty contribution from this observation type to cost function
qcpnlty nonlinear qc penalty from this data type
cpen penalty divided by (# assim)
qccpen qcpnlty divided by (# assim)

Same as other fit files, a comparison between results from different outer loops can give us
very useful information on how each channel and data type are analyzed in the GSI.

76
GSI Diagnostics and Tuning

4.6 Convergence Information

There are two ways to check the convergence information for each iteration of the GSI:

1. Standard output file (stdout):

The value of the cost function and norm of the gradient for each iteration are listed in the
file stdout.

The following is an example showing the first two iterations from the first outer loop:
Minimization iteration 0
grepcost J,Jb,Jo,Jc,Jl = 1 0 1.430230185505848494E+05 0.000000000000000000E+00
1.430230185505848494E+05 0.000000000000000000E+00 0.000000000000000000E+00
grepgrad grad,reduction= 1 0 3.400606852473948493E+03 1.000000000000000000E+00
pcgsoi: cost,grad,step = 1 0 1.43023018550E+05 3.40060685247E+03 2.43989544270E-03
pcgsoi: gnorm(1:2),b= 1.87098843447644E+06 1.87098843447644E+06 1.61792450059063E-01
stprat 0.680228405155974
stprat 4.467539812170933E-015
Minimization iteration 1
grepcost J,Jb,Jo,Jc,Jl = 1 1 1.148077578695608245E+05 6.884228595044261567E+01
1.147389155836103746E+05 0.000000000000000000E+00 0.000000000000000000E+00
grepgrad grad,reduction= 1 1 1.367840792810494804E+03 4.022343223284949310E-01
pcgsoi: cost,grad,step = 1 1 1.148077578695E+05 1.36784079281E+03 7.6301193791E-03
pcgsoi: gnorm(1:2),b= 1.7016520499209E+06 1.7016520499209E+06 9.094936230309182967E-01
stprat 0.375750738871811
stprat 1.030833603466626E-016

The following are the first two iterations from the second outer loop:
Minimization iteration 0
grepcost J,Jb,Jo,Jc,Jl = 2 0 9.709172928659517493E+04 5.242424325932335705E+03
9.184930496066283376E+04 0.000000000000000000E+00 0.000000000000000000E+00
grepgrad grad,reduction= 2 0 2.458528763445824097E+03 1.000000000000000000E+00
pcgsoi: cost,grad,step = 2 0 9.70917292865E+04 2.458528763445E+03 2.0910838314E-03
pcgsoi: gnorm(1:2),b= 1.39687516886273E+06 1.39687516886273E+06 2.311037592468580815E-01
stprat 0.326022963208770
stprat 3.222291707173269E-015
Minimization iteration 1
grepcost J,Jb,Jo,Jc,Jl = 2 1 8.445245812257242505E+04 5.411029823593031324E+03
7.904142829897940101E+04 0.000000000000000000E+00 0.000000000000000000E+00
grepgrad grad,reduction= 2 1 1.181894736794584333E+03 4.807325236000348778E-01
pcgsoi: cost,grad,step = 2 1 8.445245812257E+04 1.18189473679E+03 3.102603972078E-03
pcgsoi: gnorm(1:2),b= 6.28971827368193E+05 6.28971827368193E+05 4.502706049820247580E-01
stprat 1.600925505225033E-002
stprat 1.290408460808865E-015

We can see clearly the number of outer loop and the inner loops (Minimization iteration).
The meaning of the names (bold) used in stdout are explained in the following:

J,Jb,Jo,Jc,Jl: the values of cost function (J, or penalty), background term (Jb),
observations term (Jo), dry pressure constraint term (Jc), and negative and excess
moisture term (Jl).
grad: inner product of gradients (norm of the gradient (Y*X))
grad
reduction:
grad of the first step of the loop
cost: the values of cost function, (=J)

77
GSI Diagnostics and Tuning

step: stepsize ()
2 2
gnorm(1:2): 1=(norm of the gradient) , 2= (norm of the gradient)
b: parameter to estimate the new search direction
stprat: convergence in stepsize estimation

2. Convergence information in file fort.220:

In file fort.220, users can find more detailed information about each iterations. The
following example uses the first two iterations to explain the meaning of each value:
1)
J= 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.118329009942697698E+05
0.190285043373867524E+05 0.401338098573457984E+05 0.468178247339593267E+04
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.185513606236281089E+05 0.487946602645584815E+05 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00
2)
S=-0.100000000000000018E-03 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.931339146384645329E-02
0.112072975062044600E-01 0.106814995402123112E-01 0.296198179729126945E-01
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.325642588289663153E-02 0.215772188906240879E-02 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00
3)
b=-0.115641269650927740E+04 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.281704458045555561E+06
0.402962980357327977E+06 0.292918321075311759E+06 0.274591776085362393E+05
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.262337918646139989E+06 0.982394058251658943E+07 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00
4)
c= 0.115641269650927719E+08 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.302472476475514677E+08
0.359554103149527393E+08 0.274229587308945897E+08 0.927054232191691408E+06
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.805600766238803874E+08 0.455292252088399070E+10 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00
5)
stepsize estimates = 0.100000000000000005E-03 0.243989544270763757E-02
0.243989544270778218E-02

78
GSI Diagnostics and Tuning

6)
0.100000E-03 0.900000E-04 0.110000E-03 0.000000E+00 0.243990E-02 0.241550E-02
0.246429E-02
7)
penalties = 0.000000E+00 0.222277E+03 -0.221329E+03 0.226543E+04 -0.259498E+05 -
0.259470E+05 -0.259470E+05
8)
penalty,grad ,a,b= 1 0 0.143023018550584849E+06 0.115641269650927745E+08
0.243989544270778198E-02 0.000000000000000000E+00
9)
pnorm,gnorm, step? 1 0 0.100000000000000000E+01 0.100000000000000000E+01 good

J= 0.688422859504426132E+02 0.000000000000000000E+00 0.000000000000000000E+00


0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.106235467037641590E+05
0.172586295556918779E+05 0.388542992087058658E+05 0.455285388277129181E+04
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.177114761746839726E+05 0.257381100579931913E+05 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00
S=-0.454000845565062042E-02 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.101217294711077131E-01
0.279345794069204220E-01 0.130410935418658683E-01 0.368341097286829303E-01
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.178110078442459234E-01 0.297635097075488013E-02 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00
b=-0.986861745749417622E+04 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.142776494763965690E+06
0.140788735520069243E+06 0.231276409863045459E+06 0.226873427979176697E+05
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.160017159820703326E+06 0.585021953540974826E+06 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00
c= 0.217370023732255860E+07 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.141059386314876834E+08
0.503994470327305877E+07 0.177344337819967319E+08 0.615932975305519791E+06
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.898417210412935433E+07 0.196556776834889866E+09 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
0.000000000000000000E+00 0.000000000000000000E+00
stepsize estimates = 0.243989544270778198E-02 0.763011937910834613E-02
0.763011937910831204E-02
stepsize guesses = 0.243990E-02 0.219591E-02 0.268388E-02 0.000000E+00 0.763012E-02
0.755382E-02 0.770642E-02
penalties = 0.000000E+00 0.635648E+03 -0.606453E+03 0.767027E+04 -0.660560E+04 -
0.660417E+04 -0.660417E+04
penalty,grad ,a,b= 1 1 0.114807757869560824E+06 0.187098843447644310E+07
0.763011937910831189E-02 0.161792450059063342E+00
pnorm,gnorm, step? 1 1 0.802722240329134418E+00 0.161792450059063564E+00 good

79
GSI Diagnostics and Tuning

For each inner iteration, there are 9 different outputs. The 1st iteration is labeled with
numbers 1 to 9, with a detailed explanation below:

1 4) detailed information on the cost function (J=), the stpx (S=bpen(i)/cpen(i)),


bpen (b=), cpen (c=). There are 32 (5 + number of observation types) items
listed in each and the meanings of these items are:

1 contribution from background, satellite radiance bias, and


precipitation bias
2 place holder for future linear linear term
3 contribution from dry pressure constraint term (Jc)
4 contribution from negative moisture constraint term (Jl/Jq)
5 contribution from excess moisture term (Jl/Jq)
6 contribution from negative gust constraint term
7 contribution from negative vis constraint term
8 contribution from negative pblh constraint term

9-32: contributions to Jo from different observation types:

9 contribution from ps observation term


10 contribution from t observation term
11 contribution from w observation term
12 contribution from q observation term
13 contribution from spd observation term
14 contribution from srw observation term
15 contribution from rw observation term
16 contribution from dw observation term
17 contribution from sst observation term
18 contribution from pw observation term
19 contribution from pcp observation term
20 contribution from oz observation term
21 contribution from o3l observation term (not used)
22 contribution from gps observation term
23 contribution from rad observation term
24 contribution from tcp observation term
25 contribution from lagrangian tracer
26 contribution from carbon monoxide
27 contribution from modis aerosol aod
28 contribution from level modis aero aod
29 contribution from in-situ pm2_5 obs
30 contribution from gust monoxide
31 contribution from vis aerosol aod
32 contribution from pb1h modis aero aod

It is suggested that the users check stpcalc.f90 for the code including the above
information.

5) information on the step size estimates


6) information on the step size guesses
7) information on the penalties
8) and 9) information on the cost function and gradient which is explained in the
following:

80
GSI Diagnostics and Tuning

penalty: the cost function of all observations and background


grad: inner product of gradients (norm of the gradient (Y*X))
a: stepsize ()
b: parameter to estimate the new search direction
penalty
pnorm:
penalty of the first step of the loop
grad
gnorm:
grad of the first step of the loop

To evaluate the convergence of the iteration, we usually make plots based on the above
information, such as the value of the cost function and the norm of the gradient. The
following are example plots showing the evolution of the cost function and the norm of
gradient in different outer loops:

Evolution of cost function (left column) and the norm of gradient (right column) in the
first outer loop (top raw) and the second outer loop (bottom raw)

Please see Section A.3 for the information on how to read convergence information from
fort.220 and how to make the above plots.

81
GSI Diagnostics and Tuning

4.7 Use bundle to configure control, state variables and background fields

Since the GSI release version 3.0, the control variables, state variables, and background
fields can be configured through a new info file named anavinfo. Different GSI
applications need a different anavinfo file to setup the control variables, state variables, and
background fields. In the ./fix directory of the release package, there are many example
anavinfo files for different operational applications. Because this is a working in progress,
users should use one of the sample anavinfo files instead of making a new one. The
released GSI run script has added the link for this new info file. Users should be able to use
this without any problem.

Below is an example of an avaninfo file for an ARW case:


met_guess::
!var level crtm_use desc orig_name
cw 30 10 cloud_condensate cw
# ql 30 10 cloud_liquid ql
# qi 30 10 cloud_ice qi
# qr 30 10 rain qr
# qs 30 10 snow qs
# qg 30 10 graupel qg
::

state_vector::
!var level itracer amedge source funcof
u 30 0 no met_guess u
v 30 0 no met_guess v
tv 30 0 no met_guess tv
tsen 30 0 no met_guess tv,q
q 30 1 no met_guess q
oz 30 1 no met_guess oz
cw 30 1 no met_guess cw
p3d 31 0 yes met_guess p3d
ps 1 0 no met_guess p3d
sst 1 0 no met_guess sst
::

control_vector::
!var level itracer as/tsfc_sdv an_amp0 source funcof
sf 30 0 1.00 -1.0 state u,v
vp 30 0 1.00 -1.0 state u,v
ps 1 0 0.50 -1.0 state p3d
t 30 0 0.70 -1.0 state tv
q 30 1 0.70 -1.0 state q
oz 30 1 0.50 -1.0 state oz
sst 1 0 1.00 -1.0 state sst
cw 30 1 1.00 -1.0 state cw
stl 1 0 1.00 -1.0 motley sst
sti 1 0 1.00 -1.0 motley sst
::

There are three sections in this file:


met_guess:: section to configure background fields
state_vector:: section to configure state variables
control_vector:: section to configure control variables

82
GSI Diagnostics and Tuning

In each section, the 1st column sets up the variable name and 2nd column sets up the vertical
levels. The 4th column in the section control_vector is the normalized scale factor for the
background error variance.

4.8 Analysis Increments

Analysis increments are defined as the fields of analysis minus background. A plot of
analysis increments can help users to understand how the analysis procedure modifies the
background fields according to observations, errors (background and observation), and
other constraints. You can either calculate analysisguess and plot the difference field or
use the tools introduced in Appendix A.4 to make analysis increment figures for different
analysis fields.

4.9 Running Time and Memory Usage

In addition to analysis increments, run time and memory usage are other important features
of an analysis system, especially for operational code like the GSI.

Some computer operating systems provide CPU, Wall time, and memory usage after the
job has completed, but an easy way to determine how much wall time is used for the GSI is
to check the time tag between the files generated at the beginning and the end of the run,
for example:

Wall time of GSI analysis = time of wrfanl.2011032212 - time of convinfo

Also, the GSI standard output file (stdout) gives the GSI start time and end time at the top
and the end of the file. For example:
* . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .
PROGRAM GSI_ANL HAS BEGUN. COMPILED 1999232.55 ORG: NP23
STARTING DATE-TIME JUL 13,2012 21:55:14.798 195 FRI 2456122

ENDING DATE-TIME JUL 13,2012 21:58:55.998 195 FRI 2456122


PROGRAM GSI_ANL HAS ENDED. IBM RS/6000 SP
* . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .

This tells us this case started at 21:55:14.798 and ended at 21:58:55.998. Meaning GSI
used 3 minutes and 41.2 second to finish

For a GSI run using an IBM computer, there is a resource statistics section at the end of the
stdout file, which gives information about run time and memory usage for the analysis (see
example at the end of Section 4.1).

83
GSI Applications

Chapter 5: GSI Applications

In this chapter, the knowledge from the previous chapters will be applied to several GSI
cases to show how to setup GSI with various different data sources, and how to properly
check the run status and analysis results in order to determine if a particular GSI
application was successful. Note the examples here only use the WRF ARW system
global experiments or WRF NMM runs are similar, but require different background and
namelist files. It is assumed that the reader has successfully compiled GSI on a local
machine, and has the following data available:

1. Background file
When using WRF, WPS and real.exe will be run to create a WRF input file:
wrfinput_<domain>_<yyyy-mm-dd_hh:mm:ss>
2. Conventional data
Real time NAM PREPBUFR data can be obtained from the server:
ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/nam/prod
Note: NDAS prepbufr data was chosen to increase the amount of data
3. Radiance data and GPS RO data
Real time GDAS BUFR files can be obtained from the following server:
ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod
Note: GDAS data was chosen to get better coverage for radiance and GPS RO

GSI Case Study

The following case study will give users an example of a successful GSI run with various
data sources. Users are welcome to download these example data from the GSI users
webpage (online case for release version 3.1) or create a new background and get real-time
observation data from the above server. The background and observations used in this case
study are as follows:

1. Background files: wrfinput_d01_2011-03-22_12:00:00


The horizontal grid spacing is 30-km with 51 vertical sigma levels

Figure 5.1: The terrain (left) and land mask (right) of the background used in this case study

84
GSI Applications

2. Conventional data: NAM PrepBUFR data from 12UTC 22 March 2011.


File: nam.t12z.prepbufr.tm00.nr

3. Radiance and GPS RO data: GDAS PREPBUFR data from 12 UTC 22 March 2011
Files: gdas.t12z.1bmaua.tm00.bufr_d
gdas.t12z.1bamub.tm00.bufr_d
gdas.t12z.1bhrs4.tm00.bufr_d
gdas.t12z.gpro.tm00.bufr_d

This case study was run on a Linux cluster. The BUFR/PrepBUFR files have been byte-
swapped to little endian format using the tool introduced in section A.1:

le_nam.t12z.prepbufr.tm00.nr
le_gdas.t12z.1bmaua.tm00.bufr_d
le_gdas.t12z.1bamub.tm00.bufr_d
le_gdas.t12z.1bhrs4.tm00.bufr_d
le_gdas.t12z.gpro.tm00.bufr_d

Assume the background file is located at:

/scratch1/NA30km/bk

all observations are located at:

/scratch1/NA30km/obs20110322

and the GSI release version 3.1 is located at

/scratch1/comGSI_v3.1

5.1. Assimilating Conventional Observations with GSI:

5.1.1: Run script

With GSI successfully compiled and background and observational data acquired, move to
the ./run directory under ./comGSI_v3.1 to setup the GSI run following the sample script
run_gsi.ksh. To successfully run the run_gsi.ksh script, it must be modified in several
places:

Set up batch queuing system


To run GSI with multi-processors, a job queuing head has to be added at the beginning
of the run_gsi.ksh script. The set up of the job queue is dependent on the machine and

85
GSI Applications

the jobs control system. More examples of the setup are described in section 3.2.2.1.
The following example is setup to run on a Linux cluster supercomputer with PBS. The
job head is as follows:

# Set the queueing options


#PBS -l procs=4
#PBS -l walltime=0:20:00
#PBS -A ?????
#PBS -N wrf_gsi
#PBS -j oe

In order to find out how to setup the job head, a good method is to use an existing MPI
job script and copy the job head over.

Setup the number of processors and the job queue system used. For this example,
LINUX_PBS and 4 processors were used:

GSIPROC=4
ARCH='LINUX_PBS'

Setup the case data, analysis time, and GSI fix, exe, and CRTM coefficients:

Setup analysis time:

ANAL_TIME=2011032212

Setup a working directory, which will hold all the analysis results. This directory must
have correct write permissions, as well as enough space to hold the output. Also, for the
following example, directory /scratch1 has to exist.

WORK_ROOT=/scratch1/gsiprd_${ANAL_TIME}_prepbufr

Set path to the background file:


BK_FILE=/scratch1/NA30km/bk/wrfinput_d01_2011-03-22_12:00:00

Set path to the observation directory and the PrepBUFR file within the observation
directory. All observations to be assimilated should be in the observation directory.
OBS_ROOT=/scratch1/NA30km/obs20110322
PREPBUFR=${OBS_ROOT}/le_nam.t12z.prepbufr.tm00.nr

Set the GSI system used for this case: the fix and CRTM coefficients directories as well
as the location of the GSI executable:

CRTM_ROOT=/scratch1/crtm/CRTM_Coefficients
FIX_ROOT=/scratch1/comGSI_v3.1/fix
GSI_EXE=/scratch1/comGSI_v3.1/run/gsi.exe

86
GSI Applications

Set which background and background error file to use:


bk_core=ARW
bkcv_option=NAM
if_clean=clean

This example uses the ARW NetCDF background; therefore bk_core is set to ARW.
The regional background error covariance files were also used in this case. Finally, the
run scripts are set to clean the run directory to delete all temporary intermediate files.

5.1.2: Run GSI and check the run status

Once the run scripts are set up properly for the case and machine, GSI can be run through
the run scripts. On our test machine, the GSI run is submitted as follows:

$ qsub < run_gsi.ksh

While the job is running, move to the working directory and check the details. Given the
following working directory setup:

WORK_ROOT=/scratch1/gsiprd_${ANAL_TIME}_prepbufr

Go to directory /scratch1 to check the GSI run directory.

A directory named ./gsiprd_2011032212_prepbufr should have been created. This


directory is the run directory for this GSI case study. While GSI is still running, the
contents of this directory should include:

imgr_g12.TauCoeff.bin ssmi_f15.SpcCoeff.bin
imgr_g13.SpcCoeff.bin ssmi_f15.TauCoeff.bin
imgr_g13.TauCoeff.bin ssmis_f16.SpcCoeff.bin

These files are CRTM coefficients that have been linked to this run directory through the
GSI run scripts. Additionally, many other files are linked or copied to this run directory or
generated during run, such as:

stdout: standard out file


wrf_inout: background file
gsiparm.anl: GSI namelist
prepbufr: PrepBUFR file for conventional observation
convinfo: data usage control for conventional data
berror_stats: background error file
errtable: observation error file

87
GSI Applications

The presence of these files indicates that the GSI run scripts have successfully setup a run
environment for GSI and the GSI executable is running. While GSI is still running,
checking the content of the standard output file (stdout) can monitor the stage of the GSI
analysis:

$ tail -f stdout

grepgrad grad,reduction= 1 0 8.188107512799259666E+02 1.000000000000000000E+00


pcgsoi: cost,grad,step = 1 0 7.024833842077980808E+04 8.188107512799259666E+02
1.788853101746809213E-02
pcgsoi: gnorm(1:2),b= 3.574858851399047417E+05 3.574858851399045670E+05
5.332020690447850653E-01
stprat 0.114387353859176
stprat 4.672705475826555E-016
Minimization iteration 1
grepcost J,Jb,Jo,Jc,Jl = 1 1 5.825495408135202160E+04 2.145440277602700405E+02
5.804041005359175324E+04 0.000000000000000000E+00 0.000000000000000000E+00
grepgrad grad,reduction= 1 1 5.979012335995843159E+02 7.302068672950051686E-01
pcgsoi: cost,grad,step = 1 1 5.825495408135202160E+04 5.979012335995843159E+02
2.019904649670459587E-02
pcgsoi: gnorm(1:2),b= 2.237482294472646608E+05 2.237482294472649228E+05
6.258938849003770066E-01
stprat 0.226513888727740
stprat 1.049725208537512E-015

The above output lines show that GSI is in the inner iteration stage. It may take several
minutes to finish the GSI run. Once GSI has finished running, the number of files in the
directory will be greatly reduced from those during the run stage. This is because the run
script was set to clean the run directory after a successful run. The important analysis
result files and configuration files will remain in the run directory. Please check Section 3.4
for more details on GSI run results. Upon successful completion of GSI, the run directory
looks as follows:

anavinfo fort.206 gsiparm.anl


berror_stats fort.207 l2rwbufr
convinfo fort.208 list_run_directory
diag_conv_anl.2011032212 fort.209 ozinfo
diag_conv_ges.2011032212 fort.210 pcpbias_out
errtable fort.211 pcpinfo
fit_p1.2011032212 fort.212 prepbufr
fit_q1.2011032212 fort.213 prepobs_prep.bufrtable
fit_rad1.2011032212 fort.214 satbias_angle
fit_t1.2011032212 fort.215 satbias_in
fit_w1.2011032212 fort.217 satbias_out
fort.201 fort.218 satinfo
fort.202 fort.219 stdout
fort.203 fort.220 stdout.anl.2011032212
fort.204 fort.221 wrfanl.2011032212
fort.205 gsi.exe wrf_inout

88
GSI Applications

5.1.3: Check for successful GSI completion

It is important to always check for successful completion of the GSI analysis. Completion
of the GSI run without crashing does not guarantee a successful analysis. First, check the
stdout file in the run directory to be sure GSI completed each step without any obvious
problems. The following are several important steps to check:

1. Read in the avaninfo and namelist


The following lines show GSI started normal and has read in the anavinfo and namelist:

state_vectors*init_anasv: 2D-STATE VARIABLES ps sst


state_vectors*init_anasv: 3D-STATE VARIABLES u v
tv tsen q oz cw p3d
state_vectors*init_anasv: ALL STATE VARIABLES u v
tv tsen q oz cw p3d

GSI_4DVAR: nobs_bins = 1
SETUP_4DVAR: l4dvar= F
SETUP_4DVAR: l4densvar= F
SETUP_4DVAR: winlen= 3.00000000000000
SETUP_4DVAR: winoff= 3.00000000000000
SETUP_4DVAR: hr_obsbin= 3.00000000000000

2. Read in the background field


The lines in stdout immediately following the namelist section, indicate that GSI is
reading the background fields. Checking the range of the max and min values will
indicate if certain background fields are normal.

end_index= 348 247 50 0


max,min XLAT(:,1)= 13.41553 2.813988
max,min XLAT(1,:)= 47.02686 2.813988
xlat(1,1),xlat(nlon,1)= 2.813988 4.969986
xlat(1,nlat),xlat(nlon,nlat)= 47.02686 52.26216
rmse_var=XLONG
...............
rmse_var=U
ordering=XYZ
WrfType,WRF_REAL= 104 104
ndim1= 3
staggering= N/A
start_index= 1 1 1 0
end_index= 349 247 50 0
k,max,min,mid U= 1 20.05023 -21.65548 -6.996003
k,max,min,mid U= 2 20.64079 -22.58930 -7.982791
k,max,min,mid U= 3 21.84538 -24.38444 -9.791903
k,max,min,mid U= 4 24.33893 -27.59095 -12.01432
k,max,min,mid U= 5 27.30596 -29.83475 -14.94501
k,max,min,mid U= 6 28.86383 -31.97169 -14.71747
k,max,min,mid U= 7 30.24547 -33.28873 -12.40972

89
GSI Applications

3. Read in observational data


Skipping through a majority of the content towards the middle of the stdout file, the
following lines will appear:

OBS_PARA: ps 2352 2572 8367 2673


OBS_PARA: t 4617 4331 12418 4852
OBS_PARA: q 3828 3908 11096 3632
OBS_PARA: pw 89 31 141 23
OBS_PARA: uv 5704 4835 15025 4900
OBS_PARA: sst 0 0 2 0

This table is important to see if the observations have been read in, which types of
observations have been read in, and the distribution of observations in each sub
domain. At this point, GSI has read in all the data needed for the analysis. Following
this table is the inner iteration information.

4. Inner iteration
The inner iteration step in the stdout file will look as follows:

Minimization iteration 0
grepcost J,Jb,Jo,Jc,Jl = 1 0 7.024833842077980808E+04 0.000000000000000000E+00
7.024833842077980808E+04 0.000000000000000000E+00 0.000000000000000000E+00
grepgrad grad,reduction= 1 0 8.188107512799259666E+02 1.000000000000000000E+00
pcgsoi: cost,grad,step = 1 0 7.024833842077980808E+04 8.188107512799259666E+02
1.788853101746809213E-02
pcgsoi: gnorm(1:2),b= 3.574858851399047417E+05 3.574858851399045670E+05
5.332020690447850653E-01
stprat 0.114387353859176
stprat 4.672705475826555E-016

Following the namelist set up, similar information will be repeated for each inner loop.
In this case, 2 outer loops with 50 inner loops in each outer loop have been set. The
last iteration looks like:

Minimization iteration 40
grepcost J,Jb,Jo,Jc,Jl = 2 40 3.664942287987162126E+04 6.573582409713609195E+03
3.007584047015801480E+04 0.000000000000000000E+00 0.000000000000000000E+00
grepgrad grad,reduction= 2 40 7.094588989162899789E-03 2.206814011091630791E-04
pcgsoi: cost,grad,step = 2 40 3.664942287987162126E+04 7.094588989162899789E-03
4.844607290004876443E-02
PCGSOI: WARNING **** Stopping inner iteration ***
gnorm 0.750736287079361097E-10 less than 0.100000000000000004E-09
Minimization final diagnostics

Clearly, the iteration met the stop threshold before meeting the maximum iteration
number (50). As a quick check of the iteration: the J value should descend through
each iteration. Here, the J has a value of 7.024833842077980808E+04 at the beginning
and a value of 3.664942287987162126E+04 at the final iteration. This means the value
has reduced by almost half, which is an expected reduction.

90
GSI Applications

5. Write out analysis results


The final step of the GSI analysis procedure looks very similar to the portion where the
background fields were read in:

max,min MU= 3993.281 -2043.336


rmse_var=MU
ordering=XY
WrfType,WRF_REAL= 104 104
ndim1= 2
staggering= N/A
start_index= 1 1 1 0
end_index1= 348 247 50 0
k,max,min,mid T= 1 310.9237 233.9436 280.6658
k,max,min,mid T= 2 311.3734 235.8086 280.8300
k,max,min,mid T= 3 311.6932 238.1501 281.0092
k,max,min,mid T= 4 312.7413 243.5058 281.4198
k,max,min,mid T= 5 313.2800 242.0858 282.9984
k,max,min,mid T= 6 313.7712 247.9757 285.4900
k,max,min,mid T= 7 314.9586 250.0384 289.3155

6. As an indication that GSI has successfully run, several lines will appear at the
bottom of the file:

ENDING DATE-TIME JUL 14,2012 23:50:12.367 196 SAT 2456123


PROGRAM GSI_ANL HAS ENDED. IBM RS/6000 SP
* . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .

GSI was mainly developed on an IBM machine, so the IBM RS/6000 SP markers will
still appear on a Linux machine.

After carefully investigating each portion of the stdout file, it can be concluded that GSI
successfully ran through every step and there were no run issues. A more complete
description of the stdout file can be found in Section 4.1. However, it cannot be concluded
that GSI did a successful analysis until more diagnosis has been completed.

5.1.4: Diagnose GSI analysis results

5.1.4.1: Check analysis fit to observations

The analysis uses observations to correct the background fields to fit the observations
under certain constraints. The easiest way to confirm the GSI analysis fits the observations
better than the background is to check a set of files with names fort.2??, where ?? is a

91
GSI Applications

number from 01 to 19. In the run scripts, several fort files also have been renamed as fit_t1
(q1, p1, rad1, w1).YYYYMMDDHH. Please check Section 4.5.1 for a detailed explanation
of the fit files. The following are several examples of these fit files.

fit_t1.2011032212 (fort.203)
This file shows how the background and analysis fields fit to temperature observations.
The contents of this file show three data types were used in the analysis: 120, 130, and
180. Also included are the number of observations, bias and rms of observation minus
background (O-B) on each level for the three data types. The following is a part of the
file:

ptop 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 50.0 0.0
it obs type styp pbot 1200.0 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 2000.0
------------------------------------------------------------------------------------------------------------------------
o-g 01 t 120 0000 count 185 527 573 995 1117 637 226 443 642 979 855 8692
o-g 01 t 120 0000 bias 0.62 0.88 -0.22 -0.20 -0.19 -0.38 -1.07 -0.89 -0.95 -0.99 -1.63 -0.92
o-g 01 t 120 0000 rms 2.62 2.57 2.01 1.31 0.90 1.02 1.66 1.71 1.94 1.86 2.67 2.27
o-g 01 t 130 0000 count 0 0 0 0 0 11 177 626 67 0 0 881
o-g 01 t 130 0000 bias 0.00 0.00 0.00 0.00 0.00 0.18 0.06 -0.02 -1.23 0.00 0.00 -0.09
o-g 01 t 130 0000 rms 0.00 0.00 0.00 0.00 0.00 0.96 0.97 1.48 2.28 0.00 0.00 1.46
o-g 01 t 180 0000 count 1260 28 0 0 0 0 0 0 0 0 0 1288
o-g 01 t 180 0000 bias 0.67 0.80 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.68
o-g 01 t 180 0000 rms 1.76 1.19 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.75
o-g 01 all count 1445 555 573 995 1117 648 403 1069 709 979 855 10861
o-g 01 all bias 0.67 0.88 -0.22 -0.20 -0.19 -0.37 -0.57 -0.38 -0.97 -0.99 -1.63 -0.66
o-g 01 all rms 1.89 2.52 2.01 1.31 0.90 1.02 1.40 1.58 1.97 1.86 2.67 2.16

---------------------------------------------------------------------------------------------------------------------------
o-g 03 t 120 0000 count 185 527 573 995 1117 637 226 443 642 979 855 8692
o-g 03 t 120 0000 bias 0.37 0.51 -0.15 -0.04 -0.02 0.02 -0.24 -0.17 -0.03 -0.17 -0.31 -0.10
o-g 03 t 120 0000 rms 2.12 2.05 1.60 1.00 0.61 0.59 0.90 1.01 1.28 1.36 1.93 1.46
o-g 03 t 130 0000 count 0 0 0 0 0 11 177 626 67 0 0 881
o-g 03 t 130 0000 bias 0.00 0.00 0.00 0.00 0.00 0.19 -0.04 0.07 -0.33 0.00 0.00 0.02
o-g 03 t 130 0000 rms 0.00 0.00 0.00 0.00 0.00 0.66 0.76 1.10 1.47 0.00 0.00 1.07
o-g 03 t 180 0000 count 1260 28 0 0 0 0 0 0 0 0 0 1288
o-g 03 t 180 0000 bias 0.38 0.30 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.38
o-g 03 t 180 0000 rms 1.46 0.76 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.45
o-g 03 all count 1445 555 573 995 1117 648 403 1069 709 979 855 10861
o-g 03 all bias 0.38 0.50 -0.15 -0.04 -0.02 0.03 -0.15 -0.03 -0.06 -0.17 -0.31 -0.03
o-g 03 all rms 1.56 2.01 1.60 1.00 0.61 0.59 0.84 1.06 1.29 1.36 1.93 1.43

For example: data type 120 has 1172 observations in layer 400.0-600.0 hPa, a bias of
-0.19, and a rms of 0.90. The last column shows the statistics for the whole atmosphere.
There are several summary lines for all data types, which is indicated by all in the
data types column. For summary O-B (which is o-g 01 in the file), we have 10861
observations total, a bias of -0.66, and a rms of 2.16.

Skipping ahead in the fort file, o-g 03 columns (under it) show the observation
minus analysis (O-A) information. Under the summary (all) lines, it can be seen that
there were 10861 total observations, a bias of -0.03, and a rms of 1.43. This shows that
from the background to the analysis the bias reduced from -0.06 to -0.03, and the rms

92
GSI Applications

reduced from 2.16 to 1.43. This is about a 34% reduction, which is a reasonable value
for large-scale analysis.

fit_w1.2011032212 (fort.202)
This file demonstrates how the background and analysis fields fit to wind observations.
This file (as well as fit_q1) are formatted the same as the above example. Therefore,
only the summary lines will be shown for O-B and O-A to gain a quick view of the
fitting:

ptop 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 50.0 0.0
it obs type styp pbot 1200.0 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 2000.0
------------------------------------------------------------------------------------------------------------------------
o-g 01 all count 1161 1175 1218 2195 1952 1080 557 1131 653 876 915 14812
o-g 01 all bias -0.31 1.01 0.58 0.08 -0.10 -0.05 -0.09 0.81 1.05 0.20 0.28 0.33
o-g 01 all rms 3.43 3.93 4.60 4.59 4.93 4.88 5.04 5.31 5.74 5.59 5.59 4.82

o-g 03 all count 1163 1173 1223 2197 1955 1096 558 1132 651 881 915 14843
o-g 03 all bias 0.28 0.69 0.32 0.19 0.15 0.17 0.09 0.32 0.47 0.22 0.55 0.37
o-g 03 all rms 2.69 2.92 2.85 2.72 2.87 2.50 2.62 3.02 3.65 4.24 4.90 3.35

O-B: 14812 observations in total, bias is 0.33 and rms is 4.82


O-A: 14843 observations in total, bias is 0.37 and rms is 3.35
The total bias was not reduced, however the rms reduced from 4.82 to 3.35 (~30%
reduction).

fit_q1.2011032212 (fort.204)
This file demonstrates how the background and analysis fields fit to moisture
observations (relative humidity). The summary lines for O-B and O-A are as follows:

ptop 1000.0 950.0 900.0 850.0 800.0 700.0 600.0 500.0 400.0 300.0 0.0 0.0
it obs type styp pbot 1200.0 1000.0 950.0 900.0 850.0 800.0 700.0 600.0 500.0 400.0 300.0 2000.0
------------------------------------------------------------------------------------------------------------------------
o-g 01 all count 686 228 312 343 228 561 423 473 571 415 0 4240
o-g 01 all bias 1.37 -0.29 1.15 0.92 0.93 0.23 -4.64 -7.81 -8.29 -8.55 0.00 -2.84
o-g 01 all rms 12.92 17.73 16.92 15.55 15.36 15.72 19.37 21.03 21.42 17.97 0.00 17.61

o-g 03 all count 686 229 312 343 228 561 423 473 571 415 0 4241
o-g 03 all bias 0.20 -1.54 0.28 0.43 -0.50 0.66 -0.98 -0.71 -1.09 -3.94 0.00 -0.64
o-g 03 all rms 7.31 10.29 10.66 9.83 10.64 11.57 14.67 14.34 14.23 13.01 0.00 11.94

O-B: 4240 observations in total and bias is -2.84 and rms is 17.61
O-A: 4241 observations in total and bias is -0.64 and rms is 11.94
The total bias and rms were reduced.

fit_p1.2011032212 (fort.201)
This file demonstrates how the background and analysis fields fit to surface pressure
observations. Because the surface pressure is two-dimensional, the table is formatted

93
GSI Applications

different than the three-dimensional fields shown above. Once again, the summary
lines will be shown for O-B and O-A to gain a quick view of the fitting:
it obs type stype count bias rms cpen qcpen
o-g 01 all 13986 0.4223 1.0427 0.8230 0.7487

o-g 03 all 14003 0.0172 0.6744 0.3537 0.3379

O-B: 13986 observations in total and bias is 0.4223 and rms is 1.0427
O-A: 14003 observations in total and bias is 0.0172 and rms is 0.6744
Both the total bias and rms were reduced.

These statistics show that the analysis results fit to the observations closer than the
background, which is what the analysis is supposed to do. How close the analysis fit to the
observations is based on the ratio of background error variance and observation error.

5.1.4.2: Check the minimization

In addition to the minimization information in the stdout file, GSI writes more detailed
information into a file called fort.220. The content of fort.220 is explained in section 4.6.
Below is an example of a quick check of the trend of the cost function and norm of
gradient. The value should get smaller with each iteration step.

In the run directory, the cost function and norm of the gradient information can be dumped
into an output file by using the command:

$ grep 'penalty,grad ,a,b=' fort.220 | sed -e 's/penalty,grad ,a,b=//g' > cost_gradient.txt

The file cost_gradient.txt includes 6 columns, however only the first 4 columns are shown
below. The first 6 and last 6 lines read are:
1 0 0.702483384207798081E+05 0.670451046411596704E+06
1 1 0.582549540813520216E+05 0.357485885139904742E+06
1 2 0.510340800654954874E+05 0.223748229447264661E+06
1 3 0.451910522149672179E+05 0.100693261241731772E+06
1 4 0.421994383952682037E+05 0.726689181584937469E+05
1 5 0.403738674444606295E+05 0.565482157493619452E+05

2 35 0.366494229475844040E+05 0.654227325952160183E-03
2 36 0.366494229180964758E+05 0.328958749059858212E-03
2 37 0.366494229007449394E+05 0.215132065808595375E-03
2 38 0.366494228897850917E+05 0.140109917756355446E-03
2 39 0.366494228834806418E+05 0.873987085206934423E-04
2 40 0.366494228798716213E+05 0.503331929251514545E-04

The first column is the outer loop number and the second column is the inner iteration
number. The third column is the cost function, and the forth column is the norm of

94
GSI Applications

gradient. It can be seen that both the cost function and norm of gradient are descending
with each iterations.

To get a complete picture of the minimization process, the cost function and norm of
gradient can be plotted using a provided NCL script located under:

./util/Analysis_Utilities/plot_ncl/GSI_cost_gradient.ncl.

the plot is shown below:

Figure 5.2: Cost function and norm of gradient change with iteration steps

The above plots demonstrate that both the cost function and norm of gradient descend very
fast in the first 10 iterations in both outer loops and drop very slowly after the 10th iteration.

For more details on how to use the tool to make this plot, please see section A.3.

95
GSI Applications

5.1.4.3: Check the analysis increment

The analysis increment gives us an idea where and how much the background fields have
been changed by the observations. Another useful graphics tool that can be used to look at
the analysis increment is located under:

./util/Analysis_Utilities/plot_ncl/Analysis_increment.ncl.

The graphic below shows the analysis increment at the 15th level. Notice that the scales are
different for each of the plots.

Figure 5.3: Analysis increment at the 15th level

It can be clearly seen that the U.S. CONUS domain has many upper level observations and
the data availability over the ocean is very sparse.

For more details on how to use the tool to make this plot, please see section A.4.

96
GSI Applications

5.2. Assimilating Radiance Data with GSI

5.2.1: Run script

Adding radiance data into the GSI analysis is very straightforward after a successful run of
GSI with conventional data. The same run scripts from the above section can be used to
run GSI with radiance with or without PrepBUFR data. The key step to adding the
radiance data is linking the radiance BUFR data files to the GSI run directory with the
names listed in the &OBS_INPUT section of the GSI namelist. The following example adds
the three following radiance BUFR files:

AMSU-A: le_gdas1.t12z.1bamua.tm00.bufr_d
AMSU-B: le_gdas1.t12z.1bamub.tm00.bufr_d
HIRS4: le_gdas1.t12z.1bhrs4.tm00.bufr_d

The location of these radiance BUFR files has been previously saved to the environment
variable OBS_ROOT, therefore the following three lines can be inserted below the link to the
PrepBUFR data in the script run_gsi.ksh:

ln -s ${OBS_ROOT}/gdas1.t12z.1bamua.tm00.bufr_d amsuabufr
ln -s ${OBS_ROOT}/gdas1.t12z.1bamub.tm00.bufr_d amsubbufr
ln -s ${OBS_ROOT}/gdas1.t12z.1bhrs4.tm00.bufr_d hirs4bufr

If it is desired to run radiance data in addition to the conventional PrepBUFR data, the
following link to the PrepBUFR should be left as is:

ln -s ${PREPBUFR} ./prepbufr

Alternatively to analyze radiance data without conventional PrepBUFR data, this line can
be commented out in the script run_gsi.ksh:

## ln -s ${PREPBUFR} ./prepbufr

In the following example, the case study will include both radiance and conventional
observations.

In order to link the correct name for the radiance BUFR file, the namelist section
&OBS_INPUT should be referenced. This section has a list of data types and BUFR file
names that can be used in GSI. The 1st column dfile is the file name recognized by GSI.
The 2nd column dtype and 3rd column dplat are the data type and data platform that are
included in the file listed in dfile, respectively. More detailed information on data usage
can be found in Section 4.3 and Section 8.1 for the radiance data specifically. For example,
the following line tells us the AMSU-A observation from NOAA-17 should be in a BUFR
file named as amsuabufr:

97
GSI Applications

dfile(30)='amsuabufr',dtype(30)='amsua',dplat(30)='n17',dsis(30)='amsua_n17',dval(30)=0.0,dthin(30)=2,

With radiance data assimilation, two important setups, data thinning and bias correction,
need to be checked carefully. The following is a brief description of these two setups:

Radiance data thinning

The radiance data thinning is setup in the namelist section &OBS_INPUT. The following
is a part of namelist in that section:
dmesh(1)=120.0,dmesh(2)=60.0,dmesh(3)=60.0,dmesh(4)=60.0,dmesh(5)=120

dfile(30)='amsuabufr',dtype(30)='amsua',dplat(30)='n17',dsis(30)='amsua_n17',dval(30)=0.0,dthin(30)=2,

The first line of &OBS_INPUT lists multiple thinning grids as elements of the array
demesh. For each line specifying a data type, the last element of that line is a
specification such as, dthin(30)=2. This selects the mesh grid to be used for
thinning. It can be seen that the data thinning option for NOAA-17 AMSU-A
observations is 60 km because the dmesh(2) is set to 60 km. For more information
about data thinning, please refer to section 3.3 and Section 8.1.

Radiance data bias correction

The radiance data bias correction is very important for a successful radiance data
analysis. In the run scripts, there are two files related to bias correction:

SATANGL=${FIX_ROOT}/global_satangbias.txt
cp ${FIX_ROOT}/sample.satbias ./satbias_in

The first file (global_satangbias.txt) provides GSI the coefficients for angle bias
correction. The angle bias coefficients are calculated off line outside of GSI. The
second file (sample.satbias) provides GSI the coefficients for mass bias correction.
They are calculated from within GSI in the previous cycle. In the released version 3.1,
most of the mass bias correction values in sample.satbias are 0. This means there was
not a good estimate base for mass bias correction in this case. The angle bias file
global_satangbias.txt is also out of date. These two files are provided in ./fix as an
example of the bias correction coefficients. For the best results, it will be necessary for
the user to generate their own bias files. The details of the radiance data bias correction
are discussed in Section 8.4 of this document.

For this case, the GDAS bias correction files were downloaded and saved in the
observation directory. The run script should contain two lines to link to the bias
correction coefficient files, in order to do this, change the following lines in run script:

SATANGL=${FIX_ROOT}/global_satangbias.txt
cp ${FIX_ROOT}/sample.satbias ./satbias_in

98
GSI Applications

The first line sets the path to the angle bias coefficient file, and the second copies the mass
bias coefficient file into the working directory. Once these links are set, we are ready to run
the case.

5.2.2: Run GSI and check run status

The process for running GSI is the same as described in section 5.1.2. Once run_gsi.ksh
has been submitted, move into the run directory to check the GSI analysis results. For our
current case, the run directory will look almost as it did for the conventional data case, the
exception being the three links to the radiance BUFR files and new diag files for the
radiance data types used. Following the same steps as in section 5.1.2, check the stdout file
to see if GSI has run through each part of the analysis process successfully. In addition to
the information outlined for the conventional run, the radiance BUFR files should have
been read in and distributed to each sub domain:
OBS_PARA: ps 2352 2572 8367 2673
OBS_PARA: t 4617 4331 12418 4852
OBS_PARA: q 3828 3908 11096 3632
OBS_PARA: pw 89 31 141 23
OBS_PARA: uv 5704 4835 15025 4900
OBS_PARA: sst 0 0 2 0
OBS_PARA: hirs4 metop-a 0 0 28 35
OBS_PARA: amsua n15 2564 1333 133 195
OBS_PARA: amsua n18 1000 2117 0 90
OBS_PARA: amsua metop-a 0 0 58 67
OBS_PARA: amsub n17 0 0 57 70
OBS_PARA: hirs4 n19 246 1097 0 37
OBS_PARA: amsua n19 656 3503 0 93

When comparing this output to the content in step 3 of section 5.1.3, it can be seen that
there are 7 new radiance data types that have been read in: HIRS4 from METOP-A and
NOAA-19, AMSU-A from NOAA-15, NOAA-18, NOAA-19, and METOP-A, and
AMSU-B from NOAA-17. The table above shows that most of the radiance data read in
this case are AMSU-A from NOAA-15.

5.2.3: Diagnose GSI analysis results

5.2.3.1: Check file fort.207

The file fort.207 contains the statistics for the radiance data, similar to file fort.203 for
temperature. This file contains important details about the radiance data analysis. Section
4.5.2 explains this file in detail. Below are some values from the file fort.207 to give a
quick look at the radiance assimilation for this case study.

The fort.207 file contains the following lines:

99
GSI Applications

For O-B, the stage before the first outer loop:


it satellite instrument #read #keep #assim penalty qcpnlty cpen qccpen
o-g 01 rad n15 amsua 128055 53932 14552 24875. 24875. 1.7094 1.7094
o-g 01 rad n17 amsub 213920 254 0 0.000 0.0000 0.0000 0.0000

For O-A, the stage after the second outer loop:


o-g 03 rad n15 amsua 128055 53932 36345 9993.3 9993.3 0.27496 0.27496
o-g 03 rad n17 amsub 213920 254 0 0.0000 0.0000 0.0000 0.0000

From the above information, it can be seen that AMSU-A data from NOAA-15 have
128055 observations within the analysis time window and domain. After thinning, 53932
of this data type remained, and only 14552 were used in the analysis. The penalty for this
data decreased from 24875 to 9993.3 after 2 outer loops. It is also very interesting to see
that the number of AMSU-A observations assimilated in the O-A calculation increase to
37345 from 14552. This is almost triple times larger than the number of O-B. In this
analysis, AMSU-B data from NOAA-17 have 213920 within the analysis time window and
domain. After thinning, only 254 were left and none of them were used in the analysis.

The statistics for each channel can be view in the fort.207 file as well. Below channels
from AMSU-A NOAA-15 are listed as an example:

For O-B, the stage before the first outer loop:


1 1 amsua_n15 1134 157 3.000 -0.6857597 0.4101223 0.2064464 2.1807583 2.1418465
2 2 amsua_n15 1104 186 2.000 -1.0022746 -0.2686506 0.2293813 2.1137854 2.0966439
3 3 amsua_n15 1180 93 2.000 1.3881152 -1.0935757 0.3919319 2.1384668 1.8376976
4 4 amsua_n15 1266 0 0.600 -0.1387697 -0.1315877 0.2929980 0.4427612 0.4227555
5 5 amsua_n15 1266 3 0.300 -0.1162391 -0.2221524 0.9228904 0.3347665 0.2504335
6 6 amsua_n15 2293 491 0.230 -1.5129871 -0.3181822 2.2705673 0.3760536 0.2004405
7 7 amsua_n15 2216 1933 0.250 -0.6290472 -0.4804044 3.3199152 0.5077016 0.1642332
8 8 amsua_n15 1826 2323 0.275 -0.7189088 -0.5710642 3.5927614 0.5909604 0.1520519
9 9 amsua_n15 629 3520 0.340 -1.2802610 -0.8482793 4.5752183 0.8554846 0.1107981
10 10 amsua_n15 19 4130 0.400 -0.9309823 -0.9484701 4.7934543 0.9528062 0.0907966
12 12 amsua_n15 859 3290 1.000 0.7343006 0.0869171 0.0672267 0.5131227 0.5057077
15 15 amsua_n15 760 597 3.000 1.8184975 -0.9903618 0.3098348 2.5077433 2.3039010

For O-A, the stage after the second outer loop:


1 1 amsua_n15 2468 356 3.000 -0.7052460 0.3474635 0.0926576 1.6449965 1.6078814
2 2 amsua_n15 2403 421 2.000 -0.8484674 -0.1059943 0.1397163 1.7843977 1.7812469
3 3 amsua_n15 2642 182 2.000 1.5872484 -0.7332827 0.2142122 1.7468399 1.5854798
4 4 amsua_n15 2824 0 0.600 -0.0563292 0.0350535 0.0958685 0.2964619 0.2943823
5 5 amsua_n15 2824 0 0.300 0.0507068 0.0043107 0.1646666 0.1623035 0.1622463
6 6 amsua_n15 4088 5 0.230 -1.2792988 -0.0009505 0.2258004 0.1226569 0.1226532
7 7 amsua_n15 4139 10 0.250 -0.1862899 -0.0170967 0.2394344 0.1334548 0.1323551
8 8 amsua_n15 4139 10 0.275 -0.2879666 -0.0262668 0.2453775 0.1501916 0.1478769
9 9 amsua_n15 4006 143 0.340 -0.6683869 -0.0845316 0.4433605 0.2511950 0.2365446
10 10 amsua_n15 3520 629 0.400 -0.5653461 -0.2697082 0.8731219 0.4392917 0.3467487
12 12 amsua_n15 1145 3004 1.000 1.7278088 0.1030682 0.0421959 0.4321463 0.4196753
15 15 amsua_n15 2147 677 3.000 2.3030982 -0.4271812 0.1395728 1.7659190 1.7134720

The second column is channel number for AMSU-A and the last column is the standard
deviation for each channel. It can be seen that most of the channels fit to the observation
more or less, but the standard deviation increased for channels 9 and 10. The increase in
the standard deviation seems to be occurring because a much greater number of
observations are used in the O-A calculation (column 3), not because the fit has gotten
worse.

100
GSI Applications

5.2.3.2: Check the analysis increment

The same methods for checking the optimal minimization as demonstrated in section
5.1.4.2 can be used for radiance assimilation. Similar features to the conventional
assimilation should be seen with the minimization. The figures below show detailed
information of how the radiance data impact the analysis results. Using the same NCL
script as in section 5.1.4.3, analysis increment fields are plotted comparing the analysis
results with radiance and conventional to the analysis results with conventional
assimilation only. This difference is only shown at level 49 and level 6, which represent
the maximum temperature increment level (49) and maximum moisture increment level
(6).

Figure 5.4: Analysis increment fields comparing to the analysis with PREPBUFR only at level 6

101
GSI Applications

Figure 5.5: Analysis increment fields comparing to the analysis with PREPBUFR only at level 49

In order to fully understand the analysis results, the following needs to be understood:

1. The weighting functions of each channel and the data coverage at this analysis time.
There are several sources on the Internet to show the weighting function for the
AMSU-A channels. Chanel 1 is the moisture channel, while the others are mainly
temperature channels (Channels 2, 3 and 15 also have large moisture signals). Because
a model top of 10 mb was specified for this case study, the actual impact should come
from channels below channel 12.
2. The usage of each channel is located in the file named satinfo in the run directory.
The first two columns show the observation type and platform of the channels and the
third column tells us if this channel is used in the analysis. Because a lot of amsua_n15
and amsua_n18 data were used, they should be checked in detail. In this case, Channel
11 and 14 from amsua_n15 and channel 9 and 14 from amsua_n18 were turned off.
3. Thinning information: a quick look at the namelist in the run directory: gsiparm.anl
shows that both amsua_n15 and amsu_n18 using thinning grid 2, which is 60 km. In
this case, the grid spacing is 30 km, which indicates to use the satellite observations
every two grid-spaces, which may be a little dense.
4. Bias correction: radiance bias correction was previously discussed. It is very important
for a successful radiance data analysis. The run scripts can only link to the old bias
correction coefficients that are provided as an example in ./fix:
SATANGL=${FIX_ROOT}/global_satangbias.txt
cp ${FIX_ROOT}/ sample.satbias ./satbias_in

102
GSI Applications

Users can also download the operation bias correction coefficients during experiment
period as a starting point to calculate the coefficients suitable for their experiments.

Radiance bias correction for regional analysis is a difficult issue because of the limited
coverage of radiance data. This topic is out of the scope of this document, but this
issue should be considered and understood when using GSI with radiance applications.

5.3. Assimilating GPS Radio Occultation Data with GSI

5.3.1: Run script

The addition of GPS Radio Occultation (RO) data into the GSI analysis is similar to that of
adding radiance data. In the example below, the RO data is used as refractivities. There is
also an option to use the data as bending angles. The same run scripts used in sections 5.1.1
and 5.2.1 can be used with the addition of the following link to the observations:
ln -s ${OBS_ROOT}/le_gdas1.t12z.gpsro.tm00.bufr_d gpsrobufr

For this case study, the GPS RO BUFR file was downloaded and saved in the OBS_ROOT
directory. The file is linked to the name gpsrobufr, following the namelist section
&OBS_INPUT:

dfile(10)='gpsrobufr',dtype(10)='gps_ref',dplat(10)='',dsis(10)='gps',dval(10)=1.0,dthin(10)=0,

This indicates that GSI is expecting a GPS refractivity BUFR file named gpsrobufr. In
the following example, GPS RO and conventional observations are both assimilated.
Change the run directory name in the run scripts to reflect this test:

WORK_ROOT=/scratch1/gsiprd_${ANAL_TIME}_gps_prepbufr

5.3.2: Run GSI and check the run status

The process of running GSI is the same as described in section 5.1.2. Once run_gsi.ksh has
been submitted, move into the working directory, gsiprd_2011032212_gps_prepbufr, to
check the GSI analysis results. The run directory will look exactly the same as with the
conventional data, with the exception of the link to the GPS RO BUFR files used in this
case. Following the same steps as in section 5.1.3, check the stdout file to see if GSI has
run through each part of the analysis process successfully. In addition to the information
outlined for the conventional run, the GPS RO BUFR files should have been read in and
distributed to each sub domain:

103
GSI Applications

OBS_PARA: ps 2352 2572 8367 2673


OBS_PARA: t 4617 4331 12418 4852
OBS_PARA: q 3828 3908 11096 3632
OBS_PARA: pw 89 31 141 23
OBS_PARA: uv 5704 4835 15025 4900
OBS_PARA: sst 0 0 2 0
OBS_PARA: gps_ref 3538 5580 2277 6768

Comparing the output to the content in section 5.1.3, it can be seen that the GPS RO
refractivity data have been read in and distributed to four sub-domains successfully.

5.3.3: Diagnose GSI analysis results

5.3.3.1: Check file fort.212

The file fort.212 is the file for the fit of gps data in fractional difference. It has the same
structure as the fit files for conventional data. Below is a quick look to be sure the GPS RO
data were used:

Observation Background (O-B)


ptop 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 50.0 0.0
it obs type styp pbot 1200.0 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 2000.0
--------------------------------------------------------------------------------------------------------------------------
o-g 01 all count 4 87 155 548 942 675 411 497 618 855 1416 8709
o-g 01 all bias -0.50 -0.27 -0.18 -0.19 -0.11 -0.03 0.12 0.16 0.09 0.07 -0.06 -0.07
o-g 01 all rms 0.77 0.87 0.83 1.02 0.74 0.43 0.44 0.50 0.54 0.62 0.57 0.66

Observation Analysis (O-A)

ptop 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 50.0 0.0
it obs type styp pbot 1200.0 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 2000.0
--------------------------------------------------------------------------------------------------------------------------
o-g 03 all count 5 108 192 615 950 673 413 500 631 868 1417 8891
o-g 03 all bias -0.31 -0.11 -0.06 -0.01 -0.01 -0.04 0.00 0.01 0.01 -0.01 -0.01 0.01
o-g 03 all rms 0.43 0.62 0.79 0.77 0.54 0.29 0.20 0.21 0.24 0.28 0.40 0.48

It can be seen that most of the GPS RO data are located in the upper levels, with a total of
8709 observations used in the analysis during the 1st outer loop, and 8891 used to calculate
O-A. After the analysis, the data bias reduced from -0.07 to 0.01, and the rms was reduced
from 0.66 to 0.48. It can be concluded that the analysis with GPS RO data looks reasonable
from these statistics.

104
GSI Applications

5.3.3.2: Check the analysis increment

The same methods for checking the minimization in section 5.1.4.2 can be used for the
GPS RO assimilation. The following figures give detailed information of how the new
data impacts the analysis result. Using the NCL script used in section 5.1.4.3, analysis
increment fields are plotted comparing the analysis results with GPS RO and conventional
to the analysis results with conventional assimilation only for level 48, which represents the
maximum temperature increment.

Figure 5.6: Analysis increment fields comparing to the analysis with PrepBUFR only at level 48

5.4 Introduction to GSI hybrid analysis

The 3DVAR hybrid analysis is a new analysis option in the GSI system. It provides the
ability to bring the flow dependent background error covariance into the analysis based on
ensemble forecasts. If ensemble forecasts have been generated, setting up GSI to do a
hybrid analysis is straightforward and only requires two changes in the run script in
addition to the current 3DVAR run script:

105
GSI Applications

Change 1: Link the ensemble members to the GSI run directory

This change is to link the ensemble members to the GSI run directory and assign
each ensemble member a name that GSI recognizes. For ensemble forecasts using
WRF, GSI assumes each ensemble member has been named as follows:

wrf_en001, wrf_en002, , wrf_en011, wrf_en012, , wrf_en040,

This naming convention is used for forecast files with WRF NMM binary and
netcdf format as well as WRF ARW netcdf format. This version of GSI does not
support WRF ARW binary as an ensemble member format. Another ensemble
forecast format GSI supports is GEFS.

Change 2 Setup the namelist options in section HYBRID_ENSEMBLE

Users need to set l_hyb_ens=.true. to turn on hybrid ensemble analysis.


Commonly used namelist options for the hybrid analysis are listed below:

l_hyb_ens - if true, turn on hybrid ensemble option


uv_hyb_ens - if true, ensemble perturbation wind variables are u,v;
otherwise, ensemble perturbation wind variables are stream
function, velocity potential functions.

generate_ens - if true, generate internal ensemble based on existing


background error
n_ens - number of ensemble members.
beta1_inv - (1/beta1), the weight given to the static background error
covariance. 0 <= beta1_inv <= 1, should be tuned for optimal
performance; beta2_inv = 1 - beta1_inv is the weight given to
the ensemble derived covariance
=1, ensemble information turned off
=0, static background turned off

s_ens_h - homogeneous isotropic horizontal ensemble localization


scale (km)
s_ens_v - vertical localization scale (grid units for now)
regional_ensemble_option - integer, used to select type of ensemble to
read in for regional applications. Currently takes values from
1 to 4:
=1: use GEFS internally interpolated to ensemble grid.
=2: ensembles are in WRF NMM format
=3: ensembles are in ARW netcdf format.
=4: ensembles are in NEMS NMMB format.

Please note: the parameters s_ens_h, s_ens_v, and beta1_inv are tunable
parameters. They should be tuned for best performance.

106
GSI Applications

In this release version, a new option for the GSI hybrid application is added in the run
script. The run script assumes all the ensemble members are located under the directory
located at the path defined by the parameter ${mempath} and the ensemble members have a
name such as: wrfout_d01_${iiimem}, where ${iiimem} is an integer number indicating the
ensemble member ID. If users have ensemble forecasts in a different file structure with
different names, please change the following lines in the run script:

if [ -r ${mempath}/wrfout_d01_${iiimem} ]; then
ln -sf ${mempath}/wrfout_d01_${iiimem} ./wrf_en${iiimem}
else
echo "member ${mempath}/wrfout_d01_${iiimem} is not exit"
fi

After setup of the namelist parameters and the path and name of the ensemble members,
GSI can be run following the other 3DVAR cases introduced in this Chapter.

Summary

This chapter applied the knowledge from the previous chapter to demonstrate how to set
up, run, and analyze GSI for various applications. It is important to always check for
successful completion of the GSI analysis, as running to completion does not always
indicate a successful run. Using the tools and methods described in this chapter, a
complete picture of the GSI analysis can be obtained.

107
GSI Theory and Code Structure

Chapter 6: GSI Theory and Code Structure

6.1 GSI Theory

The major application of GSI is used as a three-dimensional variational (3DVAR) data


assimilation system. As a reference for users to understand the GSI analysis procedure, a
brief summary of the 3DVAR mathematical theory and the minimization steps used in the
GSI is given in this section.

6.1.1 3DVAR equations:

J = ( xa xb )T B 1 ( xa xb ) + ( Hxa oo )T O 1 ( Hxa oo ) + J c (1)


where:
xa : Analysis fields
xb : Background fields
B : Background error covariance matrix
H : Observation operator
oo : Observations
O : Observation error covariance
Jc: Constraint terms (e.g., dynamical constraint, moisture constraint)

Define an analysis increment (x=) x = xa xb , then equation (1) becomes:

J = xT B 1 x + ( H ( xb + x) oo )T O 1 ( H ( xb + x) oo ) + J c (2)

By assuming the linearity of the observation operator H, equation (2) can be written as:

J = xT B 1 x + ( Hx (oo Hxb ))T O 1 ( Hx (oo Hxb )) + J c (3)

Next, define the observation innovation as o = oo Hxb , equation (3) becomes:

J = xT B 1 x + ( Hx o)T O 1 ( Hx o) + J c (4)

108
GSI Theory and Code Structure

6.1.2 Iterations to find the optimal results

To improve convergence ,GSI preconditions its cost function by defining a new variable
y = B 1 x . Equation (4), in terms of the new variable y, becomes:

J = yT BT y + ( HBy o)T O 1 ( HBy o) + J c (5)

Using the chain rule, the gradients of background and observation parts of the cost function
(4) with respect to x and cost function (5) with respect to y have the form:

x J = B 1 x + H T O 1 ( Hx o) (6)
y J = BT y + BT H T O 1 ( HBy o) = B x J (7)

Equations (6) and (7) are simultaneously minimized by employing an iterative Conjugate
Gradient process.

Start by assuming:

x0 = y 0 = 0

Then iterate over n:

x J n = B 1 x n1 + H T O 1 (Hx n1 o) = y n1 + H T O 1 (Hx n1 o)
y J n = B x J n

Dir x n = y J n + Dir x n 1
Dir y n = x J n + Dir y n 1

x n = x n 1 + Dir x n
y n = y n 1 + Dir y n

Until either the maximum number of iterations has been reached or the gradient is
sufficiently minimized.

During the above iteration, the step size is calculated in subroutine dprodx based on the
equation:
( x J n x J n 1 )T y J n
=
( x J n x J n 1 )T Dir x n

The stepsize ( ) is calculated in subroutine stpcalc.

109
GSI Theory and Code Structure

6.1.3 Analysis variables

Typically, there are seven analysis variables used in GSI analysis:

Stream function ()
Unbalanced velocity potential ()
Unbalanced virtual temperature (T)
Unbalanced surface pressure (P)
Pseudo relative humidity [qoption =1] or normalized relative humidity [qoption=2]
Ozone mixing ratio (only for global GSI)
Cloud condensate mixing ratio (only for global GSI)

With broader application of GSI for chemical data assimilation, some new variables, such
as trace gases, aerosols, and chemistry are added as analysis variables. Also, gust and
visibility were added as analysis variables for RTMA application.

6.2 GSI Code Structure

This section introduces the basic code structure of the GSI. Section 6.2.1 describes the
main processes of the GSI using the three main routines. Sections 6.2.2 to 6.2.5 introduce
the code related to four important parts of GSI: background IO, observation ingestion,
observation innovation calculation, and minimization iteration.

6.2.1 Main process

At the top most level of abstraction, the GSI code is divided into three phases; the
initialization, the run, and the finalize phase. The philosophy behind this division is to
create a modular program structure with tasks that are independent of one another.

The main top-level driver routine is called gsimain and is located in the file gsimain.f90.
Ninety percent of gsimain.f90 is a variety of useful Meta data.

Major change history


List of input files
Subroutines and modules
List of external libraries
Complete list of exit states
A discussion of important namelist options

Possibly the most important of these is the list of exit codes. Should the GSI run fail from
an internal error, the exit code may provide sufficient insight to resolve the issue. The final

110
GSI Theory and Code Structure

lines of gsimain.f90 consist of the three main calls to initialize, run and finalize. The
table below summarizes each of these phases.

gsimain.f90 main steps in each call

gsi_4dcoupler_parallel_init
MPI initialize
Initialize defaults of variables in modules
Read in user input from namelist
4DVAR setup if it is true (not supported)
call gsimain_initialize Check user input for consistency among
(gsimod.F90) parameters for given setups
Optional read in namelist for single observation
run
Write namelist to standard out
If this is a wrf regional run, the run interface
with wrf:
call convert_regional_guess (details in section 6.2.2)
Initialize variables, create/initialize arrays
Initialize values in radinfo and aeroinfo

call gsimain_run Call the main GSI driver routine


(gsimod.F90) call gsisub(mype)
(check next page for steps in gsisub)

If 4DVAR, then:
call gsi_4dcoupler_final_traj

call gsimain_finalize Deallocate arrays


(gsimod.F90) MPI finialize

111
GSI Theory and Code Structure

GSI main process (continue) subroutine glbsoi (glbsoi.F90)


driver for GSI
subroutine gsisub (gsisub.F90) Initialize timer for this procedure
high level driver for GSI If l_hyb_ens is true, then initialize
If not ESMF machinery for hybrid ensemble 3dvar
Allocate grid arrays Check for alternative minimizations
Get date, grid, and other information Initialize observer
from background files Check GSI options against available
Set communicators between sub- number of guess time levels
domain and global/horizontal slabs Read observations and scatter
End if not ESMF Create/setup background error and
If single observation test: background error balance
Create prep.bufr file with single obs If l_hyb_ens is true, then read in
in it ensemble perturbations
If regional analysis: If 4d-var and not 1st outer loop, then
Read in Level 2 land radar winds read output from previous min.
and create radar wind superob file Set error (variance) for predictors
call radar_bufr_read_all (only use guess)
If initial pass: Set errors and create variables for
Read info files for assimilation of dynamical constraint
various observations Main outer analysis loop
If initial pass:
Computer random number for do jiter=jiterstart,jiterlast
precipitation forward model Set up right hand side of analysis equation
call setuprhsall (details in section 6.2.4)
Set up right hand side of adjoint of analysis
Complete setup and execute external equation if forecast sensitivity to observations
and internal minimization loops Inner minimization loop
if (lobserver) then if (laltmin) then
if initial pass: call observer_init if (lsqrtb) call sqrtmin
call observer_run if (lbicg) call bicg
else
if last pass: call observer_finalize if (newpc4pred) call pcinfo
else call pcgsoi (details in section 6.2.5)
call glbsoi(mype) endif
endif Save information for next minimization
Save output of adjoint of analysis equation
end do ! jiter
If last pass:
Deallocate arrays
Calculate and write O-A information
Deallocate arrays
Note: lobserver = if true, calculate Write updated bias correction
coefficients
observation departure
vectors only. Finalize observer
Finalize timer for this procedure

112
GSI Theory and Code Structure

6.2.2 GSI background IO (for 3DVAR)


Read background
Background files Convert to internal format Read in and distribution

convert_regional_guess read_guess (read_guess.F90)

NMM NetCDF convert_netcdf_nmm read_wrf_nmm_netcdf_guess

NMM binary convert_binary_nmm read_wrf_nmm_binary_guess

ARW NetCDF convert_netcdf_mass read_wrf_mass_netcdf_guess

ARW binary convert_binary_mass read_wrf_mass_binary_guess


convert_binary_2d read_2d_guess
Twodvar
nems_nmmb convert_nems_nmmb read_nems_nmmb_guess
read_cmaq_guess
CMAQ
read_bias (bias correction fields)
if ( use_gfs_nemsio ) then
global GFS read_nems
read_nems_chem
else
Note: this chart doesnt include read_gfs
ensemble member ingest for hybrid read_gfs_chem

Output analysis result


Analysis results file
write_all (write_all.F90) write_regional_analysis
wrwrfnmma_netcdf NMM NetCDF
update_netcdf_nmm
wrwrfnmma_binary NMM binary
wrwrfmassa_netcdf
ARW NetCDF
write_regional_analysis update_netcdf_mass
wrwrfmassa_binary ARW binary
wr2d_binary Twodvar
wrnemsnmma_binary nems_nmmb
if ( use_gfs_nemsio ) write_cmaq CMAQ
write_nems
else global GFS
write_gfs
write_bias (bias correction)

113
GSI Theory and Code Structure

6.2.3 Observation ingestion


Data type Observation type Subroutine that reads
(ditype) (obstype) data
t, q, ps, pw, spd, mta_cld, read_prepbufr Note: This table is based on
gos_ctp, gust, vis subroutine read_obs in
uv from satwnd read_satwnd read_obs.F90:
Not from satwnd read_prepbufr
sst from mods read_modsbufr Data type is saved in
not from mods read_prepbufr array ditype
srw read_superwinds Observation type is save
conv in array obstype. In
tcp read_tcps
namelist, the observation
lag read_lag
type is dtype
rw (radar winds Level-2) read_radar
dw (lidar winds) read_lidar
rad_ref read_RadarRef_mosaic
lghtn read_lightning
larccld read_NASA_LaRC
pm2_5 read_anowbufr
pblh read_pblb
amsua
amsub Each observation type uses one or
(platform) msu more processors to read in the
not AQUA mhs read_bufrtovs data and then write the data into a
hirs4,3,2 file called obs_input.*, where * is
ssu (TOVS 1b data) a processor ID that is used to read
(platform) airs in certain observation type.
AQUA amsua read_airs
hsb (airs data)
rad atms read_atms
iasi read_iasi
(satellite
cris read_cris
radiances)
sndr, sndrd1/2/3/4 read_goesndr
(GOES sounder data) Then in subroutine obs_para
ssmi read_ssmi (obs_para.f90), each
amsre_low/mid/hig read_amsre processor reads all
obs_input.* files and save all
ssmis* read_ssmis
observations within its sub-
goes_img read_goesimg
domain into a file called:
seviri* read_seviri
pe*.obs-type_outer-lopp,
avhrr_navy read_avhrr_navy
avhrr read_avhrr
where * is 4 digital processor
ozone subuv2, omi, gome, read_ozone ID.
o3lev, mls
co mopitt read_co
pcp pcp_ssmi, pcp_tmi, read_pcp
pcp_amsu,pcp_stage3
gps gps_ref, gps_bnd read_gps
aero aod read_aerosol

114
GSI Theory and Code Structure

6.2.4 Observation innovation calculation


Data type Observation type Subroutine
Note: this table is based on subroutine
(ditype) (obstype) calculate
setuprhsall in setuprhsall.f90:
innovation
Data type is saved in array ditype
t setupt
Observation type is save in array
uv setupw
obstype
q setupq
ps setupps
pw setuppw
spd setupspd The observation departure from the
sst setupsst background of each outer loop is
conv calculated in subroutine setuprhsall.
srw setupsrw
tcp setuptcp A array (rdiagbuf) that holds
lag setuplag observation innovation for diagnosis is
rw (radar winds Level-2) setuprw generated in each setup routine. (Also
dw (lidar winds) setupdw see A.2)
pm_2_5 setuppm2_5 The index of the data array in setupt is
list below:
gust setupgust
vis setupvis index content
pblh setuppb1h 1 ier obs error
amsua 2 ilon grid relative obs location
amsub (x)
msu 3 ilat grid relative obs location
(platform) mhs (y)
not AQUA 4 ipres pressure
hirs4,3,2
5 itob t observation
ssu
6 id station id
(platform) airs
7 itime observation time in data
AQUA amsua
array
hsb
8 ikxx observation type
atms
9 iqt flag indicating if
iasi
moisture obs available
cris
rad setuprad 10 iqc quality mark
sndr, sndrd1, sndrd2
(satellite 11 ier2 original-original obs
sndrd3, sndrd4
radiances) error ratio
ssmi 12 iuse use parameter
amsre_low/mid/hig 13 idomsfc dominant surface type
ssmis* 14 iskint surface skin temperature
goes_img 15 iff10 10 meter wind factor
seviri* 16 isfcr surface roughness
avhrr_navy 17 ilone longitude (degrees)
avhrr 18 ilate latitude (degrees)
ozone subuv2, omi, gome, setupozlay 19 istnelv station elevation (m)
o3lev, mls setupozlev 20 iobshgt observation height (m)
pcp pcp_ssmi, pcp_tmi, setuppcp 21 izz surface height
pcp_amsu,pcp_stage3 22 iprvd observation provider
co mopitt, subuv2 setupco 23 isprvd observation subprovider
gps gps_ref setupref 24 icat data level category
gps_bnd setupbend 25 iptrb t perturbation

115
GSI Theory and Code Structure

6.2.5 Inner iteration

The inner iteration loop of GSI is where the cost function minimization is computed. GSI
provides several minimization options, but here we will focus on the preconditioned
conjugate gradient method. The inner iteration of the GSI variational analysis is performed
in subroutine pcgsoi (pcgsoi.f90). inside the following loop:

inner_iteration: do iter=0,niter(jiter)

end do inner_iteration

The main steps inside the loop are listed as a table below with the corresponding code and
the terms of equation in Section 6.1.

Steps in inner Code in pcgsoi.f90 Corresponding equations in


iteration Chapter 5
Gradient of call intall H T O 1 ( Hx n 1 o)
observation term
Add gradient of gradx(i)=gradx(i)+yhatsave(i) x J n = H T O 1 ( Hx n 1 o) + y n 1
background term
Apply background call bkerror(gradx,grady) y J n = B x J n
error covariance
Calculate new b=dprodx b=
search direction ( x J n x J n 1 )T y J n
and norm of =
( x J n x J n 1 )T Dir x n
gradients
Calculate new dirx(i)=-grady(i)+b*dirx(i) Dir xn = y J n + Dir xn1
search direction diry(i)=-gradx(i)+b*diry(i)
Dir y n = x J n + Dir y n 1
Calculate stepsize call stpcalc a
stp= =
b
Update solution xhatsave(i)=xhatsave(i)+stp*dirx(i) x n = x n 1 + Dir x n
yhatsave(i)=yhatsave(i)+stp*diry(i) y n = y n 1 + Dir y n

For detailed steps, advanced developers are suggested to read through the code and send
questions to gsi_help@ucar.edu.

116
Observation and Background Error Statistics

Chapter 7: Observation and Background Error Statistics

7.1 Conventional Observation Errors

Each observation type has its own observation errors. In this section, we introduce several
topics related to the conventional observation error processing in GSI. The observation
error for satellite radiance and its adjustment will be discussed in Chapter 8.

7.1.1 Getting original observation errors

For the global GSI analysis, when oberrflg (a namelist option in section &obsqc) is true,
observation errors are generated based on an external observation error table according to
the types of observations. Otherwise, observation errors are read in from the PrepBUFR
file.

For regional GSI runs, GSI forces the use of an external observation error table to get
observation errors no matter what the oberrflg is set to (oberrflg is set to true for
regional runs in gsimod.F90).

The external observation error table file, errtable, includes observation errors for all types
of observations. It is copied from the ~/comGSI_v3.1/fix directory by the run script. This
release package provides two sample observation error table files, nam_errtable.r3dv and
prepobs_errtable.global, in the ./fix directory. The nam_errtable.r3dv is used in the sample
run script as a default observation error table. The observation error file is a text file and
therefore the error values can be tuned easily. The following shows a portion of
nam_errtable.r3dv for rawinsondes and its description highlighted in bold and in the table
following the file:

Column# 1 2 3 4 5 6
120 OBSERVATION TYPE
0.11000E+04 0.12671E+01 0.56103E+00 0.10000E+10 0.68115E+00 0.10000E+10
0.10500E+04 0.13302E+01 0.63026E+00 0.10000E+10 0.68115E+00 0.10000E+10
0.10000E+04 0.14017E+01 0.73388E+00 0.10000E+10 0.68115E+00 0.10000E+10
0.95000E+03 0.14543E+01 0.86305E+00 0.10000E+10 0.71307E+00 0.10000E+10
0.90000E+03 0.14553E+01 0.99672E+00 0.10000E+10 0.74576E+00 0.10000E+10
0.85000E+03 0.13865E+01 0.11210E+01 0.10000E+10 0.77845E+00 0.10000E+10

220 OBSERVATION TYPE


0.11000E+04 0.10000E+10 0.10000E+10 0.17721E+01 0.10000E+10 0.10000E+10
0.10500E+04 0.10000E+10 0.10000E+10 0.20338E+01 0.10000E+10 0.10000E+10
0.10000E+04 0.10000E+10 0.10000E+10 0.22927E+01 0.10000E+10 0.10000E+10
0.95000E+03 0.10000E+10 0.10000E+10 0.24559E+01 0.10000E+10 0.10000E+10
0.90000E+03 0.10000E+10 0.10000E+10 0.25377E+01 0.10000E+10 0.10000E+10
0.85000E+03 0.10000E+10 0.10000E+10 0.25705E+01 0.10000E+10 0.10000E+10

117
Observation and Background Error Statistics

Column # 1 2 3 4 5 6
Content Pressure T q UV Ps Pw
Unit hPa degree C percent/10 m/s mb kg/m2(or mm)

For each type of observation, the error table has 6 columns and 33 rows(levels). The 1st
column prescribes 33 pressure levels, which cover from 1100 hPa to 0 hPa. The columns 2-
6 prescribe the observation errors for temperature (T), moisture (q), horizontal wind
component (UV), surface pressure (Ps), and the total column precipitable water (Pw). The
missing value is 0.10000E+10.

The observation error table for each observation type starts with the PrepBUFR type
number, such as:
120 OBSERVATION TYPE

220 OBSERVATION TYPE

The PrepBUFR data type number 100-199 are for temperature (T), moisture (q), and
surface pressure (Ps) observations, while number 200-299 are horizontal wind component
(UV) observations. Please check the BUFR/PrepBUFR Users Guide for more details on
PrepBUFR, which can be downloaded at the DTC BUFR/PrepBUFR website:

http://www.dtcenter.org/com-GSI/BUFR/index.php

During the GSI analysis, the entire contents of this file are read in by the subroutine
converr_read and stored in a three-dimensional array etabl. When a conventional
observation of a certain type is read in from the prepbufr file (read_prepbufr.f90), the
corresponding observation error profile of this type in array etabl will be vertically
interpolated to the specific observation level and used as the error of this observation.
Consequently, for each conventional type of observation, a matrix (obserr) is generated
holding the corresponding observation errors for the basic five variables at observation
levels. The contents of obserr array are listed below:

obserr(1,k)= Ps, obserr(2,k)=q, obserr(3,k)=T, obserr(5,k)=UV, obserr(7,k)=Pw

Where k is the index of observation levels.

7.1.2 Observation error adjustment and gross error check within GSI

The actual observation errors used in GSI start with the external (either from PrepBUFR
files or an error table file) observation errors in the obserr array and go through multiple
adjustment steps, which are set based on observation quality, vertical sigma location,

118
Observation and Background Error Statistics

observation density, time of the observations, etc. The adjustment occurs in


read_prepbufr.f90 and its main steps are listed as follows:

1. Observation errors for each variable are bonded by their corresponding lower limits.
Currently, these lower limits are hard coded and prescribed in read_prepbufr. The
observation error limits for temperature, moisture, wind, surface pressure and total
precipitable water are: terrmin=0.5, qerrmin=0.1, werrmin=1.0, perrmin=0.5,
pwerrmin=1.0, respectively.
2. Observation errors are adjusted based on the quality markers from the prepbufr data
files. If the quality markers from prepbufr are larger than a threshold value (lim_qm
in read_prepbuf.f90r), the corresponding observation errors are adjusted to a very
large number (1.0x106, which indicates a bad observation and will not make any
impact on the analysis results). If the quality markers are smaller than lim_qm, the
observation errors are adjusted based on the vertical location and vertical
distribution of the observations. Please refer to the BUFR/PrepBUFR Users Guide
for more details on the quality markers and the values of lim_qm.
3. If an observation quality marker is either 3 or 7, the observation error can be
inflated by setting inflate_error as true. The value of the inflation factor may be set
based on observation types. However, currently it is fixed as 1.2.
4. For certain observation types (e.g., T), their observation errors are amplified by a
factor of 1.2 if the observation locations are above 100 hPa.

Besides the above-mentioned steps, observation errors are further inflated during the
observation innovation calculation (e.g., in the subroutine setupt, see section 6.2.4 for
detail) when the observation is located either lower than the lowest analysis level or higher
than the highest analysis level. In the same routine, GSI performs gross error checks and, if
oberror_tune is set to true, observation error tuning (this function is not discussed in this
document).

The gross error check is an important quality check step to exclude questionable
observations that degrade the analysis. Users can adjust the threshold of the gross error
check for each data type within the convinfo file. For example, the following is a part of
convinfo without last three columns:
!otype type sub iuse twindow numgrp ngroup nmiter gross ermax ermin var_b var_pg ithin
ps 120 0 1 3.0 0 0 0 4.0 3.0 1.0 4.0 0.000300 0
ps 180 0 1 3.0 0 0 0 4.0 3.0 1.0 4.0 0.000300 0
t 120 0 1 3.0 0 0 0 8.0 5.6 1.3 8.0 0.000001 0
t 126 0 -1 3.0 0 0 0 8.0 5.6 1.3 8.0 0.001000 0

The gross check for each data type is controlled by gross, ermax, and ermin. If an
observation has observation error: obserror, then a gross check ratio is calculated:

ratio = (Observation-Background)/max(ermin,min(ermax,obserror))

If ratio > gross, then this observation fails the gross check and will not be used in the
analysis. The unused observation is indicated as rejection in the fit files.

119
Observation and Background Error Statistics

7.2 Background Error Covariance

Background error covariance plays a very important role in determining the quality of
variational analysis for NWP models. It controls what percentage of the innovation
becomes the analysis increment, how each observation impacts a broad area, and the
balance among different analysis variables.

Since most of the data assimilation background are model forecasts from a prior time step,
the background error covariance matrix (B) can be defined as the error covariance of model
forecasts:
[Forecast (x) Truth (xtruth)]

Since the actual state of atmosphere (truth) is not known, the forecast errors need to be
estimated. When estimating forecast errors, the most common methods are the NMC
method and ensemble method. In the NMC method, forecast errors are estimated with
the difference of two (typically 12 and 24 hours) forecasts valid for the same time. In the
ensemble method, the forecast errors are estimated with ensemble perturbations
(ensemble - ensemble mean).

Because of the size of the model variables, the full size of a B matrix is extremely large. It
is typically on the order of 106x106, which in its present form cannot be stored in any
computer. This problem is simplified by using an ideal set of analysis variables for which
the analysis is performed. These are generally referred to as analysis control variables.
The analysis control variables are selected such that the cross-correlations between these
variables are minimum, which means less off-diagonal terms in B. The cross dependency is
removed with pre-computed regression coefficients. Further, the forecast errors are
modeled as a Gaussian distribution with pre-computed variances and lengthscale
parameters for each of the analysis control variables. We will use the following sub-
sections to briefly introduce how GSI processes these pre-computed background error
statistics and applies them in a GSI analysis.

To achieve desired regression coefficients, variance and lengthscale parameters, offline


computation should be conducted with a sufficiently large data set for a period of time,
typically, more than one month. For this purpose, a separate utility called gen_be can be
used. It is released as a stand alone tool for the generation of the background error
covariance matrix based on the forecasts from a user defined forecast system. Details about
this utility can be found in the GSI residential tutorial lecture by Rizvi et al. (the lecture
slides are available on-line at the GSI Users Page).

120
Observation and Background Error Statistics

7.2.1 Processing of background error statistics

The GSI package has several files in ~/comGSI_v3.1/fix/ to hold the pre-computed
background error statistics for regional and global domains for different grid
configurations. Since the GSI code has a build-in mechanism to interpolate input
background error statistics to any desired analysis grid, the following two files can be used
to specify the B matrix for any GSI regional application.

nam_nmmstat_na.gcv contains the regional background error statistics, computed using


forecasts from the NCEPs NAM model covering the northern hemisphere. It is a big
endian binary file. In the same directory, nam_nmmstat_na.gcv_Little_Endian is its
little endian version for Linux machines.
nam_glb_berror.f77.gcv contains the global background errors based on the NCEPs
GFS model, a global forecast model. It is a big endian binary file. In the same directory,
nam_glb_berror.f77.gcv_Little_Endian is its little endian version for Linux machines.

All the parameters for the global background error statistics are latitude dependent. In the
case of the regional background error statistics, regression coefficients of velocity potential
as well as variances and horizontal lengthscales for all the control variables are latitude
dependent. The remaining parameters such as regression coefficients for unbalanced
surface pressure, temperature and vertical lengthscales for all the fields do not vary
with latitude. In the GSI code, the background error statistics are initially read in at their
original sigma levels and interpolated vertically in log(sigma) coordinates on the desired
vertical sigma levels. All the subroutines performing such interpolations for global and
regional applications are available in m_berror_stats.F90 and m_berror_stats_reg.f90
modules, respectively.

In subroutines prewgt and prewgt_reg, lengthscales (both horizontal and vertical) and
variance information are read in and then vertically interpolated to analysis grids by calling
berror_read_wgt and berror_read_wgt_reg, while the balance information is read in
and vertically interpolated to analysis grids by calling berror_read_bal and
berror_read_bal_reg, respectively for global and regional applications

The following shows the list of arrays in which the original background error statistics are
read by the various subroutines discussed above:

121
Observation and Background Error Statistics

Category Array Dimension Content


name
agvi 0:mlat+1,1:nsig,1:nsig Regression coefficients for stream
Balance function and temperature
(Horizontal wgvi 0:mlat+1,1:nsig Regression coefficients for stream
regression function and surface pressure
coefficients) bvi 0:mlat+1,1:nsig Regression coefficients for stream
function and velocity potential
hwll 0:mlat+1,1:nsig,1:nc3d horizontal lengthscales for stream
function, unbalanced velocity potential,
Horizontal unbalanced temperature, and relative
and vertical humidity
influence hwllp 0:mlat+1, nc2d horizontal lengthscale for unbalanced
scale surface pressure
vz 1:nsig, 0:mlat+1, 1:nc3d Vertical lengthscale for stream function,
unbalanced velocity potential, unbalanced
temperature, and relative humidity
corz 1:mlat,1:nsig,1:nc3d Square root of variance for stream
function, unbalanced velocity potential,
variance unbalanced temperature, and relative
humidity
corp 1:mlat,nc2d Square root of variance for unbalanced
surface pressure
Note: mlat = number of latitude in original background error coefficient domain,
nsig = number of vertical levels in analysis grid
nc3d = number of 3 dimensional analysis variables
nc2d = number of 2 dimensional analysis variables

Horizontal interpolation of regression coefficients to the desired grid is done for global and
regional applications respectively in subroutines prebal and prebal_reg, residing in the
balmod.f90 module. Horizontally interpolated regression coefficients on the desired grid
are stored in bvz, agvz,wgvz and bvk, agvk, wgvk arrays for global and
regional applications, respectively. These regression coefficients are used in subroutine
balance to build the respective balance part of velocity potential, temperature, and surface
pressure fields.

In subroutines prewgt_reg and prewgt, horizontal and vertical lengthscales (hwll,


hwllp, vz) and variance (corz, corp) information are horizontally interpolated and adjusted
with the corresponding input tuning parameters (vs,hzscl, as3d and as2d) supplied
through gsiparm.anl and anavinfo.txt. Desired information is finally processed and
transformed to new arrays such as slw, sli, dssv and dssvs, which are subsequently
used for recursive filter applications both in the horizontal and vertical directions. The
variance array: dssv is an allocated array for 3D variables with dimensions lat, lon,
nsig, variables. The dssvs is an allocated array for 2D variables with dimensions
lat, lon, variables. For both of these arrays, allocation of variables is decided by the
input parameters supplied via anavinfo and from the background grid configuration.

122
Observation and Background Error Statistics

7.2.2 Apply background error covariance

According to the variational equations used in the GSI, the background error covariance is
used to calculate the gradient of the cost function with respect to y based on the gradient of
the cost function with respect to x, which can be represented below following Section
6.1.2:

y J = B x J (subroutine bkerror(gradx,grady))

Because B is very complex and has a very large dimension in most data analysis domains,
in reality, it must be decomposed into several sub-matrices to fulfill its function step by
step. In GSI, the B matrix is decomposed into the following form:

B = B balanceV B Z ( B x B y B y B x ) B Z V B Tbalance
The function of each sub-matrix is explained in the following table:

Sub-matrix of B Function Subroutine GSI files


Bbalance balance among different variables balance balmod.f90
B T adjoint of balance equation tbalance balmod.f90
balance
V Square root of variance bkgvar bkgvar.f90
BZ vertical smoother frfhvo smoothzrf.f90
Bx B y B y Bx Self-adjoint smoothers in West-East smoothrf smoothzrf.f90
(Bx) and South-North (By) direction

The composition of B is achieved by calling bkerror in following three steps:

T
Step 1. Adjoint of balance equation ( Bbalance ) is done by calling tbalance

Step 2. Apply square root of variances, vertical and horizontal parts of background error
correlation by calling subroutine bkgcov
Multiply by square root of background error variances ( V ) by calling bkgvar;
Apply vertical smoother ( BZ ) by calling frfhvo;
Convert from subdomain to full horizontal field distributed among processors
by calling sub2grid;
Apply self-adjoint smoothers in West-East (Bx) and South-North (By) direction
by calling smoothrf. Smoothing in horizontal is achieved by calling ryxyyx at
each vertical sigma level in a loop over number of vertical sigma levels (nlevs).
Smoothing for three horizontal scales is done with the corresponding weighting
factors (hswgt) and horizontal lengthscale tuning factors (hzscl);
The horizontal field is transformed back to respective subdomains by calling
grid2sub;
Apply vertical smoother ( BZ ) by calling frfhvo;

123
Observation and Background Error Statistics

Multiply by the square root of background error variances ( V ) by calling


bkgvar.

Step 3. Application of balance equation ( Bbalance ) is done by calling balance


In this step the balance part of velocity potential, temperature and surface pressure is
computed from the stream function filled by using the corresponding regression
coefficients as follows:

velocity potential = unbalanced velocity potential + BBalance ( st vp ) stream function


temperature = unbalanced temperature + BBalance ( st t ) stream function
surface pressure = unbalanced surface pressure + BBalance ( st p ) stream function

124
Satellite Radiance Data Assimilation

Chapter 8: Satellite Radiance Data Assimilation

Satellite radiance data analysis is one of the most advanced and important features in the
GSI system. GSI has developed complex functions and code components to ingest,
analyze, bias correct, and monitor radiance observations from various satellite instruments.
In this Chapter, we will discuss these satellite radiance analysis related aspects from the
users point of view, including how to correctly setup and run GSI with radiance
observations, how to check and understand the radiance analysis results, bias correction,
and monitoring radiance observations. Related code structure will also be described to help
advanced users to further investigate and apply radiance data analysis with GSI.

8.1. Satellite radiance data ingest and distribution

8.1.1 link radiance BUFR files to GSI recognized names

All radiance observations used by the GSI are saved in the BUFR format. For detailed
information on the BUFR format and its processing techniques, please see the
BUFR/PrepBUFR Users Guide, which is available on line:

http://www.dtcenter.org/com-GSI/BUFR/docs/index.php

In the Section 3.1 of this users guide, we introduced all GSI BUFR/PrepBUFR
observation files and the GSI recognized observation file names in a table format. From
this table, we can see most of the BUFR files are used for satellite radiance data. Here, we
will use a small part of the table to explain the link between the GSI name and the file
name:

GSI Name Content Example file names


amsuabufr AMSU-A 1b radiance (brightness gdas1.t12z.1bamua.tm00.bufr_d
temperatures) from satellites NOAA-15,
16, 17,18, 19 and METOP-A
amsubbufr AMSU-B 1b radiance (brightness gdas1.t12z.1bamub.tm00.bufr_d
temperatures) from satellites NOAA15,
16,17

The right column of the table gives example radiance BUFR files that can be downloaded
from the NCEP data servers (please see BUFR/PrepBUFR users guide for the naming
convention for these files), while the left column is the data file name that GSI expects
during observation data ingestion. The middle column is a brief explanation of the data
content in each file.

125
Satellite Radiance Data Assimilation

As explained in section 5.2.1, running radiance data analysis with GSI could be as simple
as linking the radiance BUFR files to the GSI run directory with the GSI recognized name
in the run script. For example, if we add the following two lines to the GSI run script:

# Link to the radiance data


ln -s ${OBS_ROOT}/gdas1.t12z.1bamua.tm00.bufr_d amsuabufr
ln -s ${OBS_ROOT}/gdas1.t12z.1bamub.tm00.bufr_d amsubbufr

we should see AMSU-A and AMSU-B observations are analyzed in the GSI analysis, as
illustrated in the rest of Section 5.2. Here, we will give more detail on the setup and usage
of the GSI recognized observation file name (GSI name) in the left column of the section
3.1 table.

The GSI names, amsuabufr and amsubbufr, are actually decided by the parameters in the GSI
namelist section OBS_INPUT. As an example, the part of OBS_INPUT is:
dfile(28)='amsuabufr', dtype(28)='amsua', dplat(28)='n15', dsis(28)='amsua_n15',
dfile(29)='amsuabufr', dtype(29)='amsua', dplat(29)='n16', dsis(29)='amsua_n16',
dfile(30)='amsuabufr', dtype(30)='amsua', dplat(30)='n17', dsis(30)='amsua_n17',
dfile(31)='amsuabufr', dtype(31)='amsua', dplat(31)='n18', dsis(31)='amsua_n18',
dfile(32)='amsuabufr', dtype(32)='amsua', dplat(32)='metop-a', dsis(32)='amsua_metop-a',
dfile(33)='airsbufr', dtype(33)='amsua', dplat(33)='aqua', dsis(33)='amsua_aqua',
dfile(34)='amsubbufr', dtype(34)='amsub', dplat(34)='n15', dsis(34)='amsub_n15',
dfile(35)='amsubbufr', dtype(35)='amsub', dplat(35)='n16' dsis(35)='amsub_n16',
dfile(36)='amsubbufr', dtype(36)='amsub', dplat(36)='n17', dsis(36)='amsub_n17',

Please note that the last two columns of the OBS_INPUT have been excluded for
conciseness. From this list, we can see the content of dfile is the GSI name, which is the
observation file name recognized by GSI, while dtype and dplat indicate the radiance
instruments and the satellite name associated with the GSI name in dfile. The dsis is the
radiance observation type that is the combination of the instruments and satellite names.
This list tells us that the GSI expects NOAA-15 AMSU-A radiance observations from a
BUFR file with name amsuabufr. It also reads in the NOAA-18 AMSU-A observations
from the same file. For NOAA-17 AMSU-B observations, GSI will read them in from a
file named amsubbufr.

It is possible to change the GSI name in dfile to a user specified name (for example,
amsuagsi rather than 'amsuabufr') as long as the GSI name (amsuabufr)in the link from
the BUFR file (gdas1.t12z.1bamua.tm00.bufr_d) to the GSI name has also been
changed. The following demonstrates the process required to change the name in dfile.

Set new name in namelist section:


dfile(28)='amsuagsi', dtype(28)='amsua', dplat(28)='n15', dsis(28)='amsua_n15',
dfile(29)='amsuagsi', dtype(29)='amsua', dplat(29)='n16', dsis(29)='amsua_n16',

Then change the GSI name in the run script:


ln -s ${OBS_ROOT}/gdas1.t12z.1bamua.tm00.bufr_d amsuagsi

126
Satellite Radiance Data Assimilation

It is advised to use the GSI names provided in the released run script because they describe
the contents of the file well and are used by many users. However, the flexibility to setup a
different GSI name does give GSI more ability to analyze radiance observations from
different resources. For example, if we want GSI to assimilate NOAA-15 AMSU-A
observations from a BUFR file named gdas1.t12z.1bamua.tm00.bufr_d and
NOAA-16 AMSU-A observations from another BUFR file named
gdas2.t12z.1bamua.tm00.bufr_d, we can setup the run script and namelist as
follows:

Set the GSI names in the namelist section:


dfile(28)='amsuabufr', dtype(28)='amsua', dplat(28)='n15', dsis(28)='amsua_n15',
dfile(29)='amsuagsi', dtype(29)='amsua', dplat(29)='n16', dsis(29)='amsua_n16',

And then, link them in the run script:


ln -s ${OBS_ROOT}/gdas1.t12z.1bamua.tm00.bufr_d amsuabufr
ln -s ${OBS_ROOT}/gdas2.t12z.1bamua.tm00.bufr_d amsuagsi

Now, GSI will read in NOAA-15 AMSU-A observations from the GSI file amsuabufr,
which is the BUFR file gdas1.t12z.1bamua.tm00.bufr, and read in NOAA-16
AMSU-A observations from another GSI file amsuagsi, which is the BUFR file
gdas2.t12z.1bamua.tm00.bufr.

A common user mistake in the setup of the radiance data analysis is forgetting to add the
radiance observation type the user wants to use into the OBS_INPUT. Some users may
notice that NOAA-19 AMSU-A is not on the list of the OBS_INPUT setups in release
version 3. To use GSI to analyze NOAA-19 AMSU-A observations with the run script and
name list from release version 3, users need to add one more line in OBS_INPUT, for
example:
dfile(79)='amsuabufr', dtype(79)='amsua', dplat(79)='n19', dsis(79)='amsua_n19',

Where index 79 for this new line is from the existing number of parameter ndat in
namelist section SETUP plus 1. The ndat should also be set to 79.

In this case, NOAA-19 AMSU-A observations should be included in the BUFR file that
amsuabufr is linked to. The released run script will be continually updated to include new
satellite platforms, however users are suggested to double-check the content of the BUFR
file and the setup of the namelist if desired data types are missing from the analysis.

The radiance data normally need to be thinned in the analysis, the last column (dthin(26)=1,)
in the namelist section OBS_INPUT is used to setup radiance data thinning. The details of
radiance data thinning are described in section 3.3 under item 7.

More detailed control on how to use each channel of certain radiance observation types in
the GSI analysis can be achieved by setting up the satinfo file. The use of the satinfo file

127
Satellite Radiance Data Assimilation

was previously introduced in section 4.3. Please note the satinfo file is structured
differently in release version 3.1 from previous released versions.

8.1.2 GSI Code to ingest radiance data

GSI has a set of files (subroutines) named read_*.f90 to read in different types of
observations, including satellite radiance. The table in Section 6.2.3 gives a complete list of
such subroutines. Below is an excerpt of the table that applies to radiance data:

Data type Observation type Subroutine that reads data


(ditype) (obstype)
amsub
amsua
(platform) msu read_bufrtovs
not AQUA mhs
rad hirs4,3,2 (TOVS 1b data)
(satellite ssu
radiances)

sndr, sndrd1, sndrd2 read_goesndr


sndrd3, sndrd4 (GOES sounder data)

ssmi read_ssmi

From this table, we can see TOVS 1b observations from the NOAA and METOP satellites
are read in by subroutine read_bufrtovs and the GEOS sounder and SSMI observations are
read in by subroutine read_goesndr and read_ssmi.

In general, GSI reads in radiance observations from external BUFR files, picks the
observations within the analysis domain and time window, performs thinning based on the
coarse grid setup in OBS_INPUT, and saves them into an intermediate binary file using a
general data format across all observation types.

In this users guide, we will use subroutine read_bufrtovs (read_bufrtovs.f90) as an


example to introduce some important aspects of GSI radiance observation ingesting. All
these aspects can be extended to other radiance ingesting subroutines because they share
the same code structure and BUFR techniques. We hope these points can help advanced
users learn the detailed content inside the GSI radiance observation process and add new
observations for their GSI application.

BUFR file ingesting

The basic structure of BUFR file ingesting is two loops to read in every message
(read_subset) from the BUFR file and then reads in all observations (read_loop) from

128
Satellite Radiance Data Assimilation

each message. In the subroutine read_bufrtovs, the two loops are marked by the following
code:

! Loop to read bufr file


next=0
read_subset: do while(ireadmg(lnbufr,subset,idate)>=0)
...
read_loop: do while (ireadsb(lnbufr)==0)

...

! End of bufr read loops


enddo read_loop
enddo read_subset
call closbf(lnbufr)

The content of each observation needed by GSI can be found by searching the BUFR
mnemonics (bold in following code sample), for example, the following lines of the code
give a list of mnemonics included in the subroutine:
hdr1b ='SAID FOVN YEAR MNTH DAYS HOUR MINU SECO CLAT CLON CLATH CLONH HOLS'
if (atms) hdr1b ='SAID FOVN YEAR MNTH DAYS HOUR MINU SECO CLAT CLON CLATH CLONH HMSL'

hdr2b ='SAZA SOZA BEARAZ SOLAZI'

call ufbrep(lnbufr,data1b8,1,nchanl,iret,'TMBR')
call ufbrep(lnbufr,data1b8,1,nchanl,iret,'TMBRST')

An explanation of each mnemonic can be easily found from the BUFR table used to
generate this BUFR file. Users can get this BUFR table on-line, from decoding the BUFR
file, or checking the BUFR file bufrtab.012 in the fix directory of the release package. For
example, a search for SAZA SOZA in bufrtab.012, we found the following two lines:
| SAZA | 007024 | SATELLITE ZENITH ANGLE |
| SOZA | 007025 | SOLAR ZENITH ANGLE |

These lines tell us that GSI needs to read in satellite zenith angle and solar zenith angle for
each observation profile.

Data selection in reading process

In the data ingesting subroutine, only observations within the analysis domain (for regional
applications) and time window are processed for the thinning. After establishing a coarse
grid based on the setups in the parameters from OBS_INPUT, GSI starts a smart selection
of radiance profiles for the coarse grid. This processing of radiance data thinning not only
selects the nearest radiance profile in a coarse grid, but also considers the quality of the
radiance profiles. Each observation needs to go through the following steps in the data
thinning to only keep the radiance profile with the best quality for analysis:

1. Remove the profiles where the key channels are bad


2. Select a profile that has a larger number of good channels

129
Satellite Radiance Data Assimilation

3. Skip profiles that the Field of View (FOV) are out of range
4. Select profiles that are over better surface fields. Pick the observation in such
order based on surface features: sea, sea ice, snow and land, mixed surface.
5. Select profile based on the data quality predictor

Internal observation data format

After data thinning, the best quality radiance observation for each coarse grid is then saved
with surface status (calculated from the background) in a two-dimensional array called
data_all. The 1st dimension of the array saves all information about one observation and
the 2nd one loops through the observations. The code that assigns the content of the array
starts like:
data_all(1 ,itx)= rsat ! satellite ID
data_all(2 ,itx)= t4dv ! time
data_all(3 ,itx)= dlon ! grid relative longitude
data_all(4 ,itx)= dlat ! grid relative latitude

and ends like:


do i=1,nchanl
data_all(i+nreal,itx)=data1b8(i)
end do

The code itself gives clear notation on the content of the 1st dimension of the array except
for the last three lines. For example, it clearly tells us the first 4 items in the array are
satellite ID (rsat), observation time (t4dv), and grid relative longitude (dlon) and latitude
(dlat). However, there is no clear notation for data_all(i+nreal,itx), a little search for
the array data1b8 indicates it contains the brightness temperature from all channels in an
observation profile.

After reading and processing all observations in the BUFR file and saving them in the data
array data_all, this array is written to an intermediate binary file at the end of the
subroutine read_bufrtovs.

Observation count in stdout file

From the stdout file, we can see the following information counting the data during the data
ingesting stage, an example from the case in Chapter 5:
READ_BUFRTOVS: file=amsuabufr type=amsua sis=amsua_n15 nread= 128055
ithin= 2 rmesh= 60.000000 isfcalc= 0 ndata= 53932 ntask= 1

This tells us that the subroutine read_bufrtovs is reading NOAA-15 AMSU-A observations
from file amsuabufr. There are 128055 observations (profile number * channels number)
read in from the BUFR file and 53932 observations kept for further processing after data
selection and thinning.

130
Satellite Radiance Data Assimilation

8.1.3 information on ingesting and distribution

The analysis in GSI is done in each subdomain for MPI runs. The observation number in
each sub-domain can be found in the stdout file. All data types are listed in the stdout file
as shown in the following example, using the same example as section 5.2.2:
3:OBS_PARA: ps 2352 2572 8367 2673
3:OBS_PARA: t 4617 4331 12418 4852
3:OBS_PARA: q 3828 3908 11096 3632
3:OBS_PARA: uv 7859 6216 17360 6126
3:OBS_PARA: sst 96 239 564 150
3:OBS_PARA: pw 89 31 141 23
3:OBS_PARA: hirs4 metop-a 0 0 28 35
3:OBS_PARA: amsua n15 2564 1333 133 195
3:OBS_PARA: amsua n18 1000 2117 0 90
3:OBS_PARA: amsua metop-a 0 0 58 67
3:OBS_PARA: amsub n17 0 0 58 70
3:OBS_PARA: hirs4 n19 245 1097 0 37
3:OBS_PARA: amsua n19 656 3503 0 93

Please note the number in each subdomain is the number of the radiance profiles, not the
number of observed channels. Each profile includes many channels. For example, each
HRIS observation has 19 channels, each MSU has 4 channels, each AMSU-A has 15 and
AMSU-B has 5, each MHS has 5 and SSU has 3.

8.2. Radiance observation operator

The observation operator for radiance observations is very complex and out of the scope of
this users guide. Here, we only briefly introduce some features of the radiance observation
operator. The Community Radiative Transfer Model (CRTM) developed by JCSDA is
employed by the GSI system to transform control variables into simulated radiance or
brightness temperatures. This operator can be illustrated by the following equation:

y=K(x,z)
where:
y are simulated radiance observations;
x are profiles of temperature, moisture, and ozone;
K is the radiative transfer equation (CRTM);
z are unknown parameters such as the surface emissivity, CO2 profile, methane
profile, etc.

In GSI, x (including surface condition) are calculated based on the background fields and
then are put into the CRTM functions (K) to calculate the simulated radiance observations
y. When unknowns in K(x, z) are too large, which may be from the formulation of K or
unknown variables (z), observed radiance data cannot be reliably used and must be
removed during quality control. Examples of this include when clouds, trace gases, or
aerosols exist in the observed column. The description of radiance data quality control can
be found in the next section. For advanced users needing to learn the details of the radiance

131
Satellite Radiance Data Assimilation

observation operator in GSI, please check the corresponding subroutine listed in the right
column of the section 6.2.4 table.

Because GSI uses the CRTM functions as part of the radiance observation operator, the
CRTM coefficients have to be available during the radiance data analysis. In the GSI
release package, these CRTM coefficients are linked to the running directory by the run
script before the GSI starts to run. The details of linking CRTM coefficients can be found
in Chapter 3 in the introduction of the GSI run scripts. Please note that the GSI run script
does not know which kind of radiance observations will be used in the analysis. The script
links all the CRTM coefficients for the radiance observation types listed in the satinfo file.
After reading in radiance observations from BUFR files, GSI recognizes which kind of
radiance observations to be used and only reads in the corresponding coefficients needed.
Therefore, users only need to check whether the CRTM coefficients of the user interested
radiance data types are linked correctly. At the same time, users can ignore the warning
information on the missing CRTM coefficients if those coefficients are for the radiance
data types that are not used in the application.

8.3. Radiance observation quality control

The quality control (QC) may be the most important aspect of satellite data assimilation.
Unlike conventional observations from a prepbufr file, which includes the quality markers
from the NCEP quality control process, the satellite radiance BUFR file does not include
observation quality information. Instead, the quality control for radiance observations is
inside the GSI.

The GSI radiance data quality control starts right after the radiance observations are read in
(such as in read_bufrtovs.f90). We can think of the processing of radiance data thinning as
a part of the quality control because the thinning process selects the best quality
observations. The major radiance data quality control step is after the calculation of the
radiance observation departure in file setuprad.f90. Many QC steps are employed to
capture problematic satellite data, which mainly come from the following 4 sources:

Instrument problems
Clouds and precipitation simulation errors
Surface emissivity simulation errors.
Processing errors (e.g., wrong height assignment, incorrect tracking, etc.)

In GSI, each instrument has its own quality control subroutine. All these subroutines are in
the file qcmod.f90 and are listed as follows for reference:

132
Satellite Radiance Data Assimilation

subroutine name Quality Control for


qc_ssmi ssmi, amsre, and ssmis
qc_seviri seviri data
qc_ssu ssu data
qc_goesimg GOES image
qc_msu msu data
qc_irsnd ir sounder data(hirs, goessndr, airs, iasi, cris)
qc_avhrr avhrr and avhrr_navy
qc_amsua amsua data
qc_mhs amsub, mhs and hsb data
qc_atms atms data

After calculating the radiance observation departure from the background and bias
correction, these QC functions are called for each instrument to either toss the bad
(questionable) observations or inflate the low confidence observations. The number of
filtered observations by these QC functions is summarized in the radiance fit file (fort.207)
as 7 QC categories (steps). To help users understand the meanings of these numbers in the
radiance fit file, we will briefly introduce these QC steps in subroutine qc_amusa in the
following table. Please note these QC categories may have different meaning for different
instruments:

Category Quality Control steps Action to observations


QC1 Cloud affected profile, (factch4 > 0.5) Toss channel 1-6, 16
QC2 Inaccurate emissivity /surface temperature Toss channel 1-6, 16
estimate over sea
QC3 Cloud affected profile (Scattering index Toss channel 1-7, 16
factch6 > 1.0)
QC4 Inflate observation error over high terrain Inflate channel 7
(>2000m) observation error
QC5 Inflate observation error over high terrain Inflate channel 8
(>4000m) observation error
QC6 Retrieved could liquid water path > 1.0 Count profiles
QC7 Part of Scattering index > 1.0 Count profiles

Using the same example as section 4.5.2:


sat type penalty nobs iland isnoice icoast ireduce ivarl nlgross
n15 amsua 19769.16042371 4149 673 1475 268 1311 30453 0
qcpenalty qc1 qc2 qc3 qc4 qc5 qc6 qc7
19769.16042371 883 63 2127 183 0 20 46

Using the above table, we can understand numbers listed under qc1 to qc7. Listed below
also includes the explanation of the numbers not in the above table, for a complete
understanding of this part of the radiance fit file. For other portions of the fit file, please see
the introduction in section 4.5.2.

133
Satellite Radiance Data Assimilation

From the above example, we see there are 4149 NOAA-15 AMSU-A profiles after
thinning, among which there are:

673 profiles over land (iland)


1475 profiles over snow or ice (isnoice)
268 profiles over coast (icoast)
1311 profiles within tropics that has reduced qc bounds (ireduce)
30453 channels that failed in the general gross check (ivarl)
0 channels that passed the general gross check but failed the nonlinear gross check (nlgross)

883 profiles were tossed because of cloud affect based on factch4


63 profiles were tossed because of inaccurate emissivity /surface temperature estimate over sea
2127 profiles were tossed because of cloud affect based on factch6
183 profiles have inflated observation error because of high terrain (>2000m)
0 profiles have inflated observation error because of high terrain (>4000m)
20 profiles meet criterion QC6 (part of qc 1)
46 profiles meet criterion QC7 (part of qc 3)

So, nearly of the observations were tossed because of cloud affects.

8.4. Bias correction for radiance observations

Using bias correction to correct the system bias in the satellite radiance observations is one
of the key steps to get a successful satellite radiance data assimilation. This section will
introduce the basic theory of the GSI bias correction, the procedures and configurations of
the bias correction in the GSI system, an explanation of the namelist, satinfo, and
coefficients for bias correction, the use of the angle bias correction utility, and discussions
of some common issues users encounter in the application of the GSI bias correction.

8.4.1. Bias correction for satellite observations

Observation bias can systematically damage the data assimilation results and,
consequently, the quality of the forecasting system. Biases in satellite observations are of
particular concern because they may larger than the signal and damage the numerical
weather prediction system in a very short period of time.

Biases between the satellite observations and the model may come from the following
sources:
satellite instrument itself (e.g. poor calibration or characterization, or adverse
environmental effects);
radiative transfer model (RTM) linking the atmospheric state to the radiation
measured by the satellite (e.g. errors in the physics or spectroscopy, or from non-
modeled atmospheric processes);
systematic errors in the background atmospheric state provided by the NWP model
used for monitoring.

134
Satellite Radiance Data Assimilation

In GSI, satellite observation bias is represented as a linear regression based on N state-


dependent predictors Pi(x), with associated coefficients i :

Since the bias correction is applied to the radiance departures, this is equivalent to using the
modified definition of the observation operator:

H (x, )=H(x)+ BC(, x)

The training of the bias correction consists in finding the vector that allows the best fit
between the NWP fields x and the observations. This is obtained by minimizing the
following cost function:

For more details on the bias correction, please see the references listed below:

1. Auligne T., A. P. McNally and D. P. Dee. 2007. Adaptive bias correction for satellite data in a numerical
weather prediction system. Q. J. R. Meteorol. Soc. 133: 631-642.
2. Derber JC, Wu W-S. 1998. The use of TOVS cloud-cleared radiances in the NCEP SSI analysis system.
Mon. Weather Rev. 126: 22872299.
3. Harris BA, Kelly G. 2001. A satellite radiance-bias correction scheme for data assimilation. Q. J. R.
Meteorol. Soc. 127: 14531468.
4. Dee, D. P., 2004: Variational bias correction of radiance data in the ECMWF system. Pp. 97112 in
Proceedings of the workshop on assimilation of high-spectral-resolution sounders in NWP. 28 June1
July 2004, ECMWF, Reading, UK.
5. Dee, D. P. and S. M. Uppala, 2009, Variational bias correction of satellite radiance data in the ERA-
Interim reanalysis. Q. J. R. Meteorol. Soc. 135, 18301841.

8.4.2. The GSI Bias correction procedure and configurations

In GSI, the bias correction for satellite radiance has two parts: one part is air mass bias
correction, also called the predictive part of the bias correction; another part is angle
dependent bias correction. Each part of bias correction has its own bias correction
coefficient file:

The satbias_angle file contains the angle dependent part of the brightness
temperature bias for each channel/instrument/satellite. Also included in this file is
the mean temperature lapse rate for each channel weighted by the weighting
function for the given channel/instrument.
The satbias_in file contains the coefficients for the predictive part of the bias
correction.

135
Satellite Radiance Data Assimilation

GSI will read in the coefficients from both satbias_angle and satbias_in files, combine
them together with predictors to generate a system bias value for each channel, and then
subtract this system bias from the observation innovation during the radiance observation
operator calculation. During the minimization process, GSI will calculate the updated
coefficients for the predictive part of the bias correction and save the updated coefficients
in another file called satbias_out. The angle dependent bias coefficients are updated
outside of GSI using a utility named gsi_angleupdate in the release package. These new
mass and angle dependent bias coefficients should be used for the bias correction in the
next cycle of the GSI analysis.

To set up the bias correction for satellite radiance in the GSI system, users need to link the
right coefficient files in the run directory and keep the coefficient files updated in cycles:

Step 1, Link coefficient files for both mass bias correction and angle dependent bias into
the GSI run directory before running the GSI executable.

The coefficient files should come from the previous data assimilation cycle. However, if
there is no previous data analysis cycle, the sample coefficient files can be copied from the
directory ./fix within the community release version as a cold start. When using the run
script with the released version, the following lines in the run script copy the coefficient
files:

SATANGL=${FIX_ROOT}/global_satangbias.txt
SATINFO=${FIX_ROOT}/global_satinfo.txt
...
cp $SATANGL satbias_angle
cp $SATINFO satinfo

# for satellite bias correction


cp ${FIX_ROOT}/sample.satbias ./satbias_in

Within the directory ./fix, the sample angle dependent bias correction coefficients file is
called global_satangbias.txt, and the file for mass bias correction coefficients is
sample.satbias. Here, we also include the copy to the satinfo file because the bias
correction needs information from the satinfo file.

Step 2, Run GSI and save the output from the mass bias correction for next cycle

After running the GSI, an updated coefficient file for the mass bias correction is generated
in the run directory. This file is called satbias_out, which should be saved for the next
cycle of the GSI analysis. There is a line commented out in the released GSI run script
reserved for this purpose. The user should choose how to save the file for the next cycle:
# GSI updating satbias_in
#
# cp ./satbias_out ${FIX_ROOT}/sample.satbias

Step 3, Run the angle dependent bias correction utility after GSI runs and save updated
coefficients of angle dependent bias correction for use in the next cycle

136
Satellite Radiance Data Assimilation

The update of the coefficients for angle dependent bias correction is done by a stand alone
application named gsi_angleupdate located under the directory ./util, but outside the GSI
itself. This application reads in the diag files from the GSI analysis results and the old
angle dependent bias coefficients, updates the coefficients and saves them as a new file
called satbias_ang.out. We will introduce how to apply this utility in the next section.

Please note the cycling of the coefficients to let the bias information accumulate during the
data assimilation cycle is the key to getting the right bias correction.

8.4.3 Namelist, satinfo, and coefficients for bias correction

To conduct the bias correction, GSI needs several pieces of information from different
files:

The satellite platform information from the GSI namelist


The usage information for each channel from the satinfo file
The coefficients from both mass and angle dependent bias correction coefficient
files

The following is a brief introduction to these files to help the user to understand the
contents of each file and know how to check if the user interested satellite channels are
correctly configured in these files.

The satellite platform information from the GSI namelist

The complete explanation of the GSI run script and most often used namelist options can
be found in Chapter 3 of this guide. More details of setting up radiance data analysis in the
run script are described in section 1 of this chapter. Users should make sure that user
interested satellite instruments and platforms are in the list in &OBS_INPUT and have
been correctly linked to the BUFR files.

Also, the following is a list of GSI namelist options related to the bias correction:

Variable name Default value Description


diag_rad .true. logical to turn off or on the diagnostic radiance file
(true=on)
passive_bc .false. logical to turn off or on radiance bias correction for
monitored channels
adp_anglebc .false. option to perform variational angle bias correction

137
Satellite Radiance Data Assimilation

The usage information for each channel from the satinfo file

The GSI uses an information file called satinfo to control how to use each radiance
channel. Detailed information about satinfo can be found in the GSI Users Guide Section
4.3. The following is an example:
!sensor/instr/sat chan iuse error error_cld ermax var_b var_pg
amsua_n15 1 1 3.000 9.100 4.500 10.000 0.000
amsua_n15 2 1 2.000 13.500 4.500 10.000 0.000
amsua_n15 3 1 2.000 7.100 4.500 10.000 0.000
amsua_n15 4 1 0.600 1.300 2.500 10.000 0.000
amsua_n15 5 1 0.300 0.550 2.000 10.000 0.000
amsua_n15 6 1 0.230 0.230 2.000 10.000 0.000
amsua_n15 7 1 0.250 0.195 2.000 10.000 0.000
amsua_n15 8 1 0.275 0.232 2.000 10.000 0.000
amsua_n15 9 1 0.340 0.235 2.000 10.000 0.000
amsua_n15 10 1 0.400 0.237 2.000 10.000 0.000
amsua_n15 11 -1 0.600 0.270 2.500 10.000 0.000
amsua_n15 12 1 1.000 0.385 3.500 10.000 0.000
amsua_n15 13 1 1.500 0.520 4.500 10.000 0.000
amsua_n15 14 -1 2.000 1.400 4.500 10.000 0.000
amsua_n15 15 1 3.000 10.000 4.500 10.000 0.000
hirs3_n17 1 -1 2.000 0.000 4.500 10.000 0.000
hirs3_n17 2 -1 0.600 0.000 2.500 10.000 0.000

Users can easily understand the first 2 columns are sensor/instrument/satellite and channel
number information. The 3rd column is the usage information, which has the following
meanings:

iuse Channel usage in GSI


-2 do not use
-1 monitor if diagnostics produced
0 monitor and use in QC only
1 use data with complete bias correction
2 use data with no air mass bias correction
3 use data with no angle dependent bias correction
4 use data with no bias correction

For bias correction purposes, please make sure user interested channels are listed in the
satinfo file and have been set to the correct usage flag.

The coefficients from both mass and angle dependent bias correction coefficient
files

As previously introduced in this section, there are two bias correction coefficient files.
These files include the bias correction coefficients for each channel:

138
Satellite Radiance Data Assimilation

1) satbias_in

This file contains the coefficients for the predictive part of the bias correction (air mass
bias correction coefficients). There is a sample for this file named sample.satbias in the
GSI release package under the directory ./fix. All coefficients in this sample file are 0.
Here, we use NOAA-15 AMSU-A from the satbias_out file from the radiance application
case in Chapter 5 as example:
1 amsua_n15 1 0.472353 -0.231512 0.291223 0.000634 -0.148959
2 amsua_n15 2 -0.677697 0.382025 1.424922 -0.000061 0.016514
3 amsua_n15 3 -2.631062 0.134578 2.968469 -0.004946 1.213581
4 amsua_n15 4 -0.470401 2.121855 5.764014 0.006496 1.333609
5 amsua_n15 5 10.996354 -0.762965 1.787372 0.082404 -1.661531
6 amsua_n15 6 -22.026905 -1.543174 -1.397403 0.175626 -11.384948
7 amsua_n15 7 -1.954080 -0.293421 0.029899 -0.064129 3.039958
8 amsua_n15 8 -9.468913 -1.490995 -0.856006 -0.013090 -0.945916
9 amsua_n15 9 -22.737061 -2.195735 0.247890 -0.357354 -15.298422
10 amsua_n15 10 -0.875332 2.212551 -0.392323 -0.337414 -8.785395
11 amsua_n15 11 0.000000 0.000000 0.000000 0.000000 0.000000
12 amsua_n15 12 2.800800 -4.042608 0.060067 0.913834 15.980004
13 amsua_n15 13 0.000000 0.000000 0.000000 0.000000 0.000000
14 amsua_n15 14 0.000000 0.000000 0.000000 0.000000 0.000000
15 amsua_n15 15 -0.439501 0.539856 0.412582 -0.001741 0.158646

The first 3 columns are series number, the sensor/instrument/satellite, and channel number
of each instrument. Columns 4 through 8 are 5 coefficients for the predictive (air mass) part
of the bias correction, which has 5 predictors.

2) satbias_angle

The satbias_angle contains the angle dependent part of the brightness temperature bias.
There are two sample files for this in the GSI release package under the directory ./fix:
global_satangbias.txt and nam_global_satangbias.txt. Here we only give two channels as
examples from the file global_satangbias.txt:
1 amsua_n15 1 0.528768E-02
0.063 -0.200 -0.411 -0.588 -0.638 -0.523 -0.493 -0.466 -0.482 -0.475
-0.666 -0.587 -0.593 -0.602 -0.766 -0.955 -1.080 -1.218 -1.149 -1.374
-1.553 -1.635 -1.715 -1.783 -1.689 -1.507 -1.473 -1.244 -1.233 -1.259
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
2 amsua_n15 2 0.290661E-02
-2.769 -2.880 -2.583 -2.449 -2.218 -1.810 -1.536 -1.242 -0.882 -0.788
-0.676 -0.697 -0.508 -0.464 -0.544 -0.790 -0.945 -1.108 -1.002 -1.364
-1.404 -1.315 -1.318 -1.151 -0.826 -0.219 0.086 0.631 1.121 1.807
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

139
Satellite Radiance Data Assimilation

The first line for each channel looks like:


1 amsua_n15 1 0.528768E-02
2 amsua_n15 2 0.290661E-02

The columns are series number, the sensor/instrument/satellite, channel number of each
instrument, and T lap mean, respectively. The next 90 numbers are coefficients for angle
dependent bias correction. These numbers correspond to the number of the FOV per scan.
For example, AMSU-A has 30 FOV per scan, which is using the first 30 numbers to
represents the bias correction coefficients for each FOV position, while the AMSU-B has
90 FOV per scan, all 90 numbers are used to do bias correction.

If there are some instruments in satinfo but not in satbias_in and satbias_angle, GSI will
set 0 as the initial value for these instruments and write out updated coefficients for these
new instruments in coefficient results files.

8.4.4. Utility for Angle Bias Correction

The coefficients for correcting the angle dependent part of the brightness temperature bias
in the GSI are calculated after each GSI run. The NCEP has developed a tool to calculate
these coefficients and, for the first time, this community GSI release (v3.1) started to
include this tool as a part of the release package. This tool is released as a directory named
./gsi_angupdate within ./comGSIv3.1/util .

Please note the community GSI release package version 3.0 doesnt have this tool. If users
are using the GSI release 3.0 and need this tool, please download the tar file
gsi_angupdate.tar from the GSI download page on-line. Once untarred, under the GSI
directory ./comGSIv3/util, you will see a new directory: ./gsi_angupdate, which includes
the tool to update coefficients for radiance angle dependent bias correction.

Compile

Inside the directory ./gsi_angupdate, type the following command:

./make

then check if executable gsi_angupdate.exe exists in the same directory.

Please note that before compiling this utility, the community GSI should already be
compiled successfully. Please refer to Chapter 2 of this Users Guide on how to configure
and compile the community GSI.

Run

Before running gsi_angupdate.exe, make sure that GSI with radiance data have finished
successfully and the diagnostic files that hold O-B information have been generated in the

140
Satellite Radiance Data Assimilation

GSI run directory. The executable gsi_angupdate.exe will read in O-B information from
the diagnostic files for each sensor to update coefficients for angle dependent bias
correction of the sensor.

To help users easily run this tool, a sample run script named run_gsi_angupdate.ksh is
provided within the ./comGSIv3.1.run directory. If a user uses the tar file downloaded
separately on-line, a similar run script can be found in the directory ./comGSI/util with
the code.

This script is modified based on the GSI run script. The script has a similar structure.
Please check section 3.2.2.1 for instructions on setting up the machine environment, and
section 3.2.2.2 for setting up the run environment. The run environment portion is
illustrated below:
#####################################################
# machine set up (users should change this part)
#####################################################
#
# GSIPROC = processor number used for GSI analysis
#------------------------------------------------
GSIPROC=1
ARCH='LINUX_PBS'
# Supported configurations:
# IBM_LSF,
# LINUX, LINUX_LSF, LINUX_PBS,
# DARWIN_PGI

In this script, only four parameters need to be set for a case study. These parameters have
been explained clearly in the run script and illustrated below:
#
#####################################################
# case set up (users should change this part)
#####################################################
#
# ANAL_TIME= analysis time (YYYYMMDDHH)
# WORK_ROOT= working directory, where angupdate executable runs
# GSI_WORK_ROOT= GSI working directory, where GSI runs
# GSI_ANGUPDATE_EXE = path and name of the gsi angupdate executable
ANAL_TIME=2011032212
WORK_ROOT=./comGSIv3.1/run/angupdate_${ANAL_TIME}
GSI_WORK_ROOT=./comGSIv3.1/run/arw_2011032212
GSI_ANGUPDATE_EXE=./comGSIv3.1/util/gsi_angupdate/gsi_angupdate.exe
#

These parameters tell the analysis case time, where to find the GSI run directory and
gsi_angupdate.exe, and where to run gsi_angupdate.exe.

The run time information can be found in the stdout file. A successful run should end with
the following information:
PROGRAM GLOBAL_ANGUPDATE HAS ENDED. IBM RS/6000 SP

141
Satellite Radiance Data Assimilation

After a successful run, an updated coefficients file named satbias_ang.out should be


found in the run directory.

Namelist

The namelist for gsi_angupdate.exe has two sections: setup and obs_input. Here, we
only show and illustrate part of the namelist as an example.
&setup
jpch=2680,nstep=90,nsize=20,wgtang=0.008333333,wgtlap=0.0,
iuseqc=1,dtmax=1.0,
iyy1=${iy},imm1=${im},idd1=${id},ihh1=${ih},
iyy2=${iy},imm2=${im},idd2=${id},ihh2=${ih},
dth=01,ndat=50
/
&obs_input
dtype(01)='hirs3', dplat(01)='n17', dsis(01)='hirs3_n17',
dtype(02)='hirs4', dplat(02)='metop-a', dsis(02)='hirs4_metop-a',
dtype(03)='goes_img', dplat(03)='g11', dsis(03)='imgr_g11',
dtype(04)='goes_img', dplat(04)='g12', dsis(04)='imgr_g12',
dtype(05)='airs', dplat(05)='aqua', dsis(05)='airs281SUBSET_aqua',
dtype(06)='amsua', dplat(06)='n15', dsis(06)='amsua_n15',

The section obs_input only has three columns, which have the same meaning as their
counterparts in the GSI namelist, i.e., dtype and dplat specify the radiance instrument and
the satellite name, respectively, and dsis indicates the radiance observation type with a
name combining both the instrument and the satellite names.

Most of the parameters in the section setup are different from the section setup in GSI
namelist. We will explain these parameters below:

jpch: total channel number in coefficients file : satbias_ang.in


nstep: maximum number of FOV per scan
iyy1,imm1,idd1,ihh1: start date: year, month, day, and hour
iyy2,imm2,idd2,ihh2: end date: year, month, day, and hour
dth: time interval between start and end date. If start date is not equal to
end date, the code will loop based on dth through the period to
process multiple cycles.
ndat: Number of radiance observation types that can be processed, which
is the dimension for parameters in section: obs_input
iuseqc: >0 (i.e., 1), check variance. If errinv= (1 /(obs error)) is small
(small = less than 1.e-6), the observation did not pass quality
control. In this case, do not use this observation in computing
the update to the angle dependent bias.
<=0 (i.e., 0 or -1), ensure (o-g)<dtmax. If the user says to ignore the
qc flag, check that the o-g difference falls within the user
specified maximum allowable difference. If the o-g lies outside

142
Satellite Radiance Data Assimilation

this bound, do not use this observation in computing the update


to the angle dependent bias.
dtmax: user specified maximum allowable difference for o-g difference
nsize: the sample size number. If sample size is less than this number, the
updating weight will be reduced based on sample size
wgtang: weight for updating the mean temperature lapse rate
wgtlap: weight for updating angle dependent bias coefficients. The update
will be faster as this number gets bigger.

8.4.5. Discussion of FAQ

In this section, we will discuss some frequently asked questions on satellite radiance bias
correction.

Where to get bias correction coefficient files for the NCEP operational system.

The real-time satellite bias correction coefficients used for the NCEP operational
system is available on-line from the same website that holds observation
BUFR/PrepBUFR files:

For GDAS: http://nomads.ncep.noaa.gov/ pub/data/n ccf/com/gfs/prod


Once in the sub-directory, look for files with name similar to:

gdas1.t00z.abias for coefficients of mass bias correction.

For NAM: http://nomads.ncep.noaa.gov/ pub/data/n ccf/com/nam/prod


Once in the sub-directory, look for files with name similar to:

nam.t00z.satbias.tm00 for coefficients of mass bias correction.

Right now, the coefficient files for angle dependent bias correction are not available in
these web sites.

Notes on released satbias_in and satbias_angle

As mentioned in this section, the released version provides sample files for these
coefficients under the directory ./fix:

satbias_in: sample.satbias
satbias_angle: global_satangbias.txt and nam_global_satangbias.txt

These files are provided as a sample only. Users need to generate their own coefficients
based on their experiments. Usually, these coefficients need to be cycled for a period
(weeks or months) to get to a stage to do the right bias correction.

143
Satellite Radiance Data Assimilation

What if the user has no bias correction coefficients and only runs short experiments
(e.g., a week) for radiance data assimilation?

Following the suggestion from NCEP experts, the following may help some users
to improve their radiance data assimilation experiments:
1) Start with coefficient files as close as possible to your cases.
2) Run a single GSI analysis with mass bias and angle dependent bias correction.
You can get updated mass bias and angle dependent bias correction coefficient
files.
3) Run the same GSI analysis as step 2 using the same background and
observations but supply GSI with updated mass and angle dependent bias
correction coefficient files.
4) Repeat step 3 about 10 times to spin up the mass and angle dependent
coefficients.
5) Move on to the next cycle or analysis time and repeat steps 2 to 4.
6) After one or two days, the mass coefficient should be ready for the real case
test. Angle dependent bias correction will spin up slowly.

By starting two days prior to your real case period to spin up the coefficients, you
should be able to get better bias correction results.

Channel lists in satinfo, satbias_in and satbias_angle do not match

The radiance channels in satinfo should match the channels in satbias_in and
satbias_angle. If they do not match, GSI will match satbias_in based on channels in
satinfo:

If radiance channels only exist in satinfo but not in satbias_in, these channels will
be added to the updated coefficient files with 0 as the initial values.

If radiance channels are not in satinfo but are in satbias_in, the extra channels in
satbias_in will be removed from the updated files.

If channels in satinfo and satbias_angle do not match, GSI will use the channels in both
files, but the angle dependent update tool will crash due to the mismatch. Therefore,
users need to make sure the channels in satinfo and satbias_angle match.

How to select suitable satellite radiance channels when assimilating radiance data with
GSI:
This question is not only for bias correction.
1) Model top and instrument weighting functions:
Each channel has its own weighting function. If part of the weighting function is
above the model top, you may need to exclude this channel because your model
cannot obtain the correct simulated radiance from background.

144
Satellite Radiance Data Assimilation

2) Bias correction:
If a particular channel cannot be bias-corrected, for reasons such as the channel
is not correctly calibrated or due to instrument failure, you need to turn that
channel off. You may be able to check the time-series of bias for a certain
channel to get an idea of the status of the channel bias correction.
3) Test:
Try to view the data impact of each channel on the forecast to decide which
channel(s) are best for your application. You can monitor and perform bias
correction on each channel for a certain period and then turn that particular
channel from monitoring to usage in order to check the impact of the channel.

8.5. Radiance data analysis monitoring

The NCEP operational GSI system includes a Radiance Monitoring Package to extract
certain radiance data from the GSI radiance diagnostic files and produce images as an aid
to monitor GSI radiance data assimilation performance and diagnose assimilation
problems. This package has been used at NCEP to support the following Radiance
Assimilation Monitoring web site:

http://www.emc.ncep.noaa.gov/gmb/gdas/radiance/index.html

As discussed in the previous sections of this Chapter, radiance data assimilation is a


complex process, in which data quality control and bias correction are key steps for a
successful GSI application with radiance observations. To help users to monitor their
radiance data assimilation with the GSI system, the DTC ported this useful package into
the community GSI system for the Linux platform and included it as one of the utility tools
in the release version 3.1. Within the DTC, this package was tested on a Linux for a single
case, which is very different from the NCEP environment of using an IBM platform and
data assimilation cycles. At this stage, we can only guarantee that most of the functions in
this package can successfully run through and generate image files. Still, this package can
be very helpful for users as a starting point and reference to build their own radiance data
monitoring system.

8.5.1 Organization and compilation

This Radiance Monitoring Package includes a set of shell and Perl scripts, and small
programs. The package has two major components: data extraction and image generation.
Both components include scripts and programs. The package is also divided according to
data sources; it supports both global (glb) and regional (rgn) sources. During porting and
testing, the data sources are always set as regional because the DTC supports the GSI with
the regional WRF model.

145
Satellite Radiance Data Assimilation

To install the package, users need to download the community GSI version 3.1 tar file and
untar the file. The Radiance Monitoring Package is located in a directory named
./Radiance_Monitor under the directory comGSIv3.1/util. There are two subdirectories and
one shell script in the directory ./Radiance_Monitor:

./data_extract : holds scripts and programs for data extraction;


./image_gen : holds scripts and programs for image generation;
makeall.sh : script to manage the compilation of the Fortran programs for data
extraction

Before using makeall.sh to compile all of the programs for data extraction, GSI itself needs
to be compiled first. After a successful GSI compilation, users can simply go to the
directory:

comGSIv3.1/util/Radiance_Monitor

and type:

./makeall.sh

This shell script will setup the compiling environment based on parameters in configure.gsi
and then moves to source code subdirectories under ./data_extract/sorc and
./image_gen/src to compile each of them. Finally, all of the executables are copied to the
directory ./data_extract/exec. The following is a list of the executables generated after
running makeall.sh:

horiz.glb.x radmon_angle.glb radmon_bcoef.rgn radmon_time.glb


horiz.rgn.x radmon_angle.rgn radmon_bcor.glb radmon_time.rgn
make_base radmon_bcoef.glb radmon_bcor.rgn

Basically, the executable with name including the tag glb is for global resources and with
rgn is for regional resources. The focus of the next section will be using the regional
resources.

8.5.2 Data Extraction

The first step to generate radiance monitoring images is to extract the data from the GSI
radiance diagnostic files and save the data into small, binary files that are ordered for use
with GrADS plotting scripts.

The DTC has provided two sample run scripts, vrfyrad.sh and run.ksh, to help users to run
these data extraction executables. Both scripts are in the directory
./Radiance_Monitor/data_extract/scripts.

146
Satellite Radiance Data Assimilation

Before using these scripts, users need to configure several parameters located in the top
part of the script vrfyrad.sh:

COMOUT=.../comGSIv3.1/run/arw_2011032212
HOMEgfs=...comGSIv3.1/util/Radiance_Monitor/data_extract
TANKverf=.../comGSv3.1I/run/verf

Where, COMOUT is the GSI run directory that has the GSI radiance data analysis results;
HOMEgfs is the full path of the subdirectory ./data_extract is the radiance monitoring
package; TANKverf is the location that users want to save the data extraction results. The
above example configurations tell the script that there is a GSI case under the community
GSI run directory in a subdirectory named ./arw_2011032212; the radiance monitoring
package is still located under community GSI system util directory; and results will be
saved under the directory ./run/verf.

Also, the default data resource is regional, which is set up by the following parameter in
script vrfyrad.sh:

RAD_AREA=${RAD_AREA:-rgn}

If users need to apply this package with global data, please change -rgn to -glb.
Please note the global resources have not been tested due to lack of data.

After setting the above environmental variables, users can run the vrfyrad.sh using:

./run.ksh

which only has one line:

vrfyrad.sh YYYYMMDD HH

Where the two input parameters gives the year, month, day, and hour of the GSI analysis
results that will be used as data resources for the data extraction. For our sample case, we
use:

vrfyrad.sh 20110322 12 .

The script vrfyrad.sh will create a working subdirectory called ./run_radmon under the
directory that runs vrfyrad.sh, copy radiance diagnostic files and bias correction coefficient
files into the working directory ./run_radmon, and call five child scripts to run executables
with names including angle, bcor, bcoef, time, and horiz. These names correspond to the
type of images that can be generated from the files. Examples of all types of plots are
available on the NCEP radiance assimilation monitoring web site given at the beginning of
this section. Here we briefly list the result data from each executable in the following table:

147
Satellite Radiance Data Assimilation

Image Executable name Result variables for certain radiance observation type
type
angle for each FOV and each channel:
radmon_angle.glb
radmon_angle.rgn
angle bias correction coefficient
number of observations
contribution to penalty
summary of o-g without bias correction
summary of total bias correction
summary of o-g with bias correction
summary of fixed angle dependent
summary of lapse rate
summary of (lapse rate)**2
summary of mean
summary of scan angle
summary of cloud liquid water
bcor bias correction term for each channel:
radmon_bcor.glb
radmon_bcor.rgn
number of observations
contribution to penalty
bias and standard deviation of total bias
correction
bias and standard deviation of fixed angle
dependent
bias and standard deviation of lapse rate
bias and standard deviation of (lapse rate)**2
bias and standard deviation of mean
bias and standard deviation of scan angle
bias and standard deviation of cloud liquid
water
bcoef bias correction coefficient related data:
radmon_bcoef.glb
radmon_bcoef.rgn
contribution to penalty from each channel
mass bias correction coefficients for each
channel and predictor
time radiance observation for each channel:
radmon_time.glb
radmon_time.rgn
number of observations
contribution to penalty
summary of o-g without bias correction
summary of total bias correction
summary of o-g with bias correction
horiz horiz.glb.x for each radiance observation profile:
horiz.rgn.x
observations
total bias correction
simulated observation with bias correction
simulated observation without bias correction

148
Satellite Radiance Data Assimilation

After successfully running of each child script in the working directory for each executable,
the result data are copied and stored in subdirectories named ./radmon.[yyyymmdd] under
the directory set up by parameter TANKverf.

All the result data files from the same day are saved under the same directory:
./radmon.[yyyymmdd]. The file starts with a name corresponding to an image type such as:
angle, bcor, bcoef, time, and horiz. For each radiance data type, each image type has three
files:

the binary data file: with a file extension of .ieee_d. The full name of the file
looks like: image-type.radiance_data_type.yyyymmddhh.ieee_d. For
example: angle.mhs_n18.2011032212.ieee_d
the GrADS data descriptor file: with a file extension .ctl. The full name of the
file looks like: image-type.radiance_data_type.ctl. For example:
angle.mhs_n18.ctl
the standard out file from the executable: with name stdout in it. The full
name of the file looks like: image-type.stdout.radiance_data_type.
For example: angle.stdout.mhs_n18

Both the GrADS data descriptor file and standard out file are text files that can be used to
check if the data extraction step completed successfully.

8.5.3 Image Generation

The original image generation portion of the package generates images and transfers them
to a web server. The scripts were only ported and tested to generate the image files,
whereas the image display portion was left up to the users. To generate images from the
files under ./radmon.[yyyymmdd], users need to run scripts in the directory:
./Radiance_Monitor/image_gen/ush. There are many scripts in this directory to generate
different kinds of images. We provide a single control script, run_rgnl.sh, as a starting
point to run most of these scripts.

The script run_rgnl.sh is very simple and only does two things: set up environment
variables and run child scripts to generate the images.

The environmental variables needed by image generation are similar to those for data
extraction. Here is a list of variables at the top of the script run_rgnl.sh and their
explanations:

PDATE=2011032212
export RAD_AREA=rgn
MY_TANKDIR=.../comGSv3.1/run
MY_RADMON=.../comGSIv3.1/util/Radiance_Monitor
export IMGNDIR=${MY_TANKDIR}/imgn

149
Satellite Radiance Data Assimilation

export TANKDIR=${MY_TANKDIR}/verf
export PLOT_WORK_DIR=${MY_TANKDIR}/plot_${RAD_AREA}_rad

Where PDATE indicates the date and time of GSI analysis in YYYYMMDDHH;
RAD_AREA indicates the data source, where rgn is for regional and glb for global; the
MY_TANKDIR is the working directory for both data extraction (has default subdirectory
name ./verf) and the image generation (has default subdirectory name ./imgn); the
MY_RADMON is the path of the radiance monitoring package; PLOT_WORK_DIR is working
directory for plotting. Please note these scripts need to use the result data files and data
descriptor files from the data extraction step, the variable RAD_AREA has to set up with the
same data sources as the data extraction step and TANKDIR has to point to the same path as
the variable TANKverf in the script./Radiance_Monitor/data_extract/scripts/ vrfyrad.sh.

Because the plotting script stops right after image generation, the generated image files are
still in the working directory: PLOT_WORK_DIR . In the above example, the image files are in
the directory: .../comGSIv3.1/run/plot_rgn_rad.

After setting up these parameters, users can start image generation by running:

./run_rgnl.sh

Similar to the data extract step, the script run_rgnl.sh calls five child scripts to plot the
following images for each radiance observation type from the data resources:

Time and Angle dependent plots


Horizontal data plots
Bias correction coefficients
Bias correction terms
Summary plots

Please see the Radiance Assimilation Monitoring web site for detailed examples.

As we discussed before, the generated image files are saved under the plot work directory:
.../comGSIv3.1/run/plot_rgn_rad. Under this directory, we can see many subdirectories
corresponding to the image types. The image file name represents the content of the image.
They all have the format:

radiance-data-type.image-content_frequency-category.png

For example: mhs_n18.penalty_fr1.png.

150
Satellite Radiance Data Assimilation

8.5.4 Additional notes

Please note we only have tested the scripts using a single case, however the scripts
themselves were built based on assimilation cycles with 30 days worth of data files (4
cycles per day). Therefore, the images generated here are mainly for testing on the Linux
platform and can only be used with further modifications according to users application.
For example, users may notice the time series plots are empty because this type of plot
must have at least two files from two GSI analysis cycles.

The adjustment for the users application could involve both the run scripts and the code.
Below are several suggested points that may be adapted by users:

1). In ./Radiance_Monitor/data_extract/scripts/run.ksh, set run.ksh to extract data


from cycles.

2). In ./Radiance_Monitor/data_extract/ush, the child scripts for each image type


include namelist:
cat << EOF > input
&INPUT
satname='${type}',
iyy=${iyy},
imm=${imm},
idd=${idd},
ihh=${ihh},
idhh=720,
incr=6,
nchanl=${nchanl},
suffix='${SUFFIX}',
imkctl=${MAKE_CTL},
imkdata=${MAKE_DATA},
gesanl='${gesanl}',
/
EOF

Need to set idhh to the number of cycles accumulated for the calculation (for
example, 720 is the cycle number for 30 day cycles, 4 cycles per day, which is the
GDAS setup) and incr to the cycling interval in hours (for example: 6 for GFS).

3). In ./Radiance_Monitor/image_gen/ush/ . For scripts like: plot_angle.sh,


plot_bcor.sh, plot_bcoef.sh, plot_time.sh, the user may need to turn on the
following line and set the right time calculation command to copy multiple time
data files into the plot work directory:

## while [[ $cdate -le $edate ]]; do

151
References

References

1. Wan-Shu Wu, R. James Purser, and David F. Parrish, 2002: Three-Dimensional


Variational Analysis with Spatially Inhomogeneous Covariances. Monthly Weather
Review, 130, 29052916.

2. R. James Purser, Wan-Shu Wu, David F. Parrish, and Nigel M. Roberts, 2003:
Numerical Aspects of the Application of Recursive Filters to Variational Statistical
Analysis. Part I: Spatially Homogeneous and Isotropic Gaussian Covariances.
Monthly Weather Review, 131, 15241535.

3. R. James Purser, Wan-Shu Wu, David F. Parrish, and Nigel M. Roberts, 2003:
Numerical Aspects of the Application of Recursive Filters to Variational Statistical
Analysis. Part II: Spatially Inhomogeneous and Anisotropic General Covariances.
Monthly Weather Review, 131, 15361548.

4. David F. Parrish and John C. Derber, 1992: The National Meteorological Center's
Spectral Statistical-Interpolation Analysis System. Monthly Weather Review, 120,
17471763.

5. R. James Purser, 2005: Recursive Filter Basics. 1st GSI User Orientation. 4th-5th
January 2005. Camp Springs, MD

6. Russ Treadon, 2005: GSI Compilation, Coding, and Updates. 1st GSI User
Orientation. 4th-5th January 2005. Camp Springs, MD

7. John Derber, 2005: minimization and Preconditioning. 1st GSI User Orientation. 4th-
5th January 2005. Camp Springs, MD

8. Wu, Wan-Shu, 2005: Background error for NCEPs GSI analysis in regional mode.
Fourth WMO International Symposium on Assimilation of Observations in
Meteorology and Oceanography, 18-22 April 2005, Prague, Czech Republic

9. Seung-Jae Lee, David F. Parrish, and Wan-Shu Wu, 2005: Near-Surface Data
Assimilation in the NCEP Gridpoint Statistical Interpolation System: Use of Land
Temperature Data and a Comprehensive Forward Model. NOAA/NCEP Office Note
446, 46 pp

152
GSI tools

Appendix A: GSI tools

A.1 BUFR Tools and SSRC

Under ./util/ bufr_tools, there are many Fortran examples to illustrate basic
BUFR/PrepBUFR file process skills such as encoding, decoding, and appending and a
small C program called ssrc.c to covert Big Endian BUFR file to Little Endian file. For
details of these examples and the BUFR format, please see the BUFR/PrepBUFR Users
Guide, which is free available on-line

http://www.dtcenter.org/com-GSI/BUFR/docs/index.php

The observation BUFR files generated by NCEP (for example, PrepBUFR and BUFR files
from NCEP ftp server or gdas1.t12z.prepbufr.nr in tutorial case) are in Big Endian binary
format. When running the GSI on a Linux or Darwin system, these files must be converted
to Little Endian binary format. The C program ssrc.c under ./util/ bufr_tools is to perform
this conversion.

The program ssrc.c may be compiled using any standard C compiler. After compiling
ssrc.c to generate executable: ssrc.exe, you can byte-swap a Big Endian BUFR/PrepBUFR
file to a Little_Endian file by:

ssrc.exe < name of Big Endian bufr file > name of Little Endian bufr file

In the on-line tutorial, the file little_endian.gdas1.t12z.prepbufr.nr has the same content as
PrepBUFR file as gdas1.t12z.prepbufr.nr, but has been byte-swapped to Little Endian for
use on Linux systems.

A.2 Read GSI Diagnostic Files

Lots of useful information about one observation used in the analysis such as innovation,
observation values, observation error and adjusted observation error, and quality control
information, may be output in diagnostic files. To generate the diagnosis files, users need
to the set namelist variable write_diag to true in namelist section &SETUP. The
write_diag variable has a dimension equal to the number of outer loops in the
minimization plus 1. The first element of write_diag controls the write of a diagnosis
file before the first outer loop (O-B), while the last element controls the write after the last
outer loop (O-A). The remaining elements control the write after the (n-1) outer loop.

For example, if we set the number of outer loops to 2, and set the write_diag namelist
variable to the following:
write_diag(1)=.true.,write_diag(2)=.false.,write_diag(3)=.true.,

153
GSI tools

the GSI will write out diagnostic files before the start of the 1st outer loop start (O-B) and
after the completion of the 2nd outer loop finish (O-A). We dont want GSI to write out
diagnosis files after the 1st outer loop because we set write_diag(2)=.false.

This is what we set in our example case described in section 5.2. From this case, we can
see the following diagnosis files generated from the GSI analysis:

diag_amsua_metop-a_anl.2011032212 diag_amsub_n17_anl.2011032212
diag_amsua_metop-a_ges.2011032212 diag_amsub_n17_ges.2011032212
diag_amsua_n15_anl.2011032212 diag_conv_anl.2011032212
diag_amsua_n15_ges.2011032212 diag_conv_ges.2011032212
diag_amsua_n18_anl.2011032212 diag_hirs4_metop-a_anl.2011032212
diag_amsua_n18_ges.2011032212 diag_hirs4_metop-a_ges.2011032212

All files are identified with a filename containing three elements. The first element diag
indicates these are combined diagnostic files. The second element identifies the
observation type (here, conv means conventional observation from prepbufr and
amsua_n15 is radiance observation from AMUS-A in NOAA 15). The last element
identifies which steps of outer loop the files were generated for. Here, anl means the
contents were written after the last outer loop (from write_diag(3)=.true.) and ges
means the contents were written before the first output loop (from write_diag(1)=
.true.).

To help users to read the information from these diagnosis files, we have provided two
Fortran programs in the ./util/Analysis_Utilities/read_diag/ directory:

read_diag_conv.f90 : Reads the diagnosis files for conventional observations. For


example: diag_conv_anl.2011032212 and diag_conv_ges.2011032212

read_diag_rad.f90 : Reads the diagnosis files for radiance observation. For


example: diag_amsua_n15_anl.2011032212
diag_amsua_n15_ges.2011032212
diag_amsub_n17_anl.2011032212 diag_amsub_n17_ges.2011032212

To compile the programs, use the makefile provided:

./make

Note: since information in the GSI include directory is required, the GSI must have been
compiled first.

To run read_diag_conv.exe, two things need to be in the directory along with the
executable. The first is the following namelist file namelist.conv, and the second is the
diagnosis file to be processed.

154
GSI tools

namelist.conv: Only two parameters in this namelist

&iosetup
infilename='./diag_conv_anl', : GSI diagnosis file
outfilename='./results_conv_anl', : Text file used to save the content
of diagnosis file
/

The user should copy or link the diagnosis file, for example diag_conv_anl.2011032212
in section 5.2, from the test case directory to the directory with the executable and name it
with the name specified by the entry infilename in the namelist. In this case that would be
diag_conv_anl . With everything in place, then run the executable

./read_diag_conv.exe

The results are placed in the file specified by the outfilename entry in the namelist. In this
case that would be a file results_conv_anl located in the directory where the executable
was run.

Similarly, to run read_diag_rad.exe, the namelist file namelist.rad is needed. It contains


the same parameters as namelist.conv. Using the same case from section 5.2, copy
diag_amsua_n18_ges.2011032212 to directory ./util/Analysis_Utilities/read_diag
(executable sits in this directory) and then change the namelist.rad to:

&iosetup
infilename='./diag_amsua_n18_ges.2011032212',
outfilename='./results_amsua_n18_ges',
/

Running the executable creates the text file results_amsua_n18_ges specified by the
namelist in the directory read_diag_rad.exe runs.

For the conventional observations, the data is stored in two arrays: rdiagbuf and cdiagbuf.
Their contents are listed as follows:
Cdiagbuf = station id
rdiagbuf(1) = observation type
rdiagbuf(2) = observation subtype
rdiagbuf(3) = observation latitude (degrees)
rdiagbuf(4) = observation longitude (degrees)
rdiagbuf(5) = station elevation (meters)
rdiagbuf(6) = observation pressure (hPa)
rdiagbuf(7) = observation height (meters)
rdiagbuf(8) = observation time (hours relative to analysis time)
rdiagbuf(9) = input prepbufr qc or event mark
rdiagbuf(10) = setup qc or event mark (currently qtflg only)
rdiagbuf(11) = read_prepbufr data usage flag
rdiagbuf(12) = analysis usage flag (1=use, -1=not used)
rdiagbuf(13) = nonlinear qc relative weight
rdiagbuf(14) = prepbufr inverse obs error (K**-1)
rdiagbuf(15) = read_prepbufr inverse obs error (K**-1)

155
GSI tools

rdiagbuf(16) = final inverse observation error (K**-1)


rdiagbuf(17) = temperature observation (K)
rdiagbuf(18) = obs-ges used in analysis (K)
rdiagbuf(19) = obs-ges without bias correction (K)

For wind observations, the content after index 16 is:


rdiagbuf(17) = u wind component observation (m/s)
rdiagbuf(18) = u obs-ges used in analysis (m/s)
rdiagbuf(19) = u obs-ges w/o bias correction (m/s)
rdiagbuf(20) = v wind component observation (m/s)
rdiagbuf(21) = v obs-ges used in analysis (m/s)
rdiagbuf(22) = v obs-ges w/o bias correction (m/s)
!

The read_diag_conv.exe reads these arrays and outputs important information in the text
file results_conv_anl specified by the user in the &iosetup namelist. Example:

station obs obs obs obs obs usag obs O-B


ID type time latitude longitude pressure value
ps @ 21997 : 180 0.00 42.65 190.98 1013.20 1 1013.2 -0.35
t @ PADK : 187 -0.07 51.88 183.35 993.34 1 276.0 0.33
uv @ RD01709C : 242 -1.48 40.50 189.00 886.30 1 14.04 3.12 -11.80 -0.24

For wind, the last 4 columns are the wind components in the order of: U observation, O-B
for U, V observation, O-B for V.

For radiance observations, the data is stored in two arrays: diagbuf and diagbufchan.
Their contents are listed as following:

diagbuf(1) = observation latitude (degrees)


diagbuf(2) = observation longitude (degrees)
diagbuf(3) = model (guess) elevation at observation location
diagbuf(4) = observation time (hours relative to analysis time)

diagbuf(5) = sensor scan position


diagbuf(6) = satellite zenith angle (degrees)
diagbuf(7) = satellite azimuth angle (degrees)
diagbuf(8) = solar zenith angle (degrees)
diagbuf(9) = solar azimuth angle (degrees)
diagbuf(10) = sun glint angle (degrees) (sgagl)

diagbuf(11) = surface fractional coverage by water


diagbuf(12) = surface fractional coverage by land
diagbuf(13) = surface fractional coverage by ice
diagbuf(14) = surface fractional coverage by snow
diagbuf(15) = surface temperature over water (K)
diagbuf(16) = surface temperature over land (K)
diagbuf(17) = surface temperature over ice (K)
diagbuf(18) = surface temperature over snow (K)
diagbuf(19) = soil temperature (K)
diagbuf(20) = soil moisture
diagbuf(21) = surface land type
diagbuf(22) = vegetation fraction
diagbuf(23) = snow depth
diagbuf(24) = surface wind speed (m/s)

156
GSI tools

if (.not.microwave) then
diagbuf(25) = cloud fraction (%)
diagbuf(26) = cloud top pressure (hPa)
else
diagbuf(25) = cloud liquid water (kg/m**2)
diagbuf(26) = total column precip. water (km/m**2)
endif

diagbufchan include loop through channel i from 1 to nchanl:


diagbufchan(1,i)= observed brightness temperature (K)
diagbufchan(2,i)= observed - simulated Tb with bias corrrection (K)
diagbufchan(3,i)= observed - simulated Tb with no bias correction (K)
diagbufchan(4,i)= inverse observation error
diagbufchan(5,i)= quality control mark or event indicator
diagbufchan(6,i)= surface emissivity
diagbufchan(7,i)= stability index
diagbufchan(8,i)= d(Tb)/d(Ts)
do j=1,npred+1
diagbufchan(7+j,i)= Tb bias correction terms (K)
end do

In the sample output file results_amsua_n18_ges, only the observation location and time
are written in the file. Users can write out other information based on the list.

A.3 Read and Plot Convergence Information from fort.220

In section 4.6, we introduced how to check the convergence information in the fort.220 file.
Here, we provide tools to filter this file and to plot the values of the cost function and the
norm of gradient during each iterations.

These tools - one ksh script and one ncl script are in the ./util/plot_cost_grad directory:

The ksh script, filter_fort220.ksh, only has one line:


grep 'penalty,grad ,a,b=' fort.220 | sed -e 's/penalty,grad ,a,b=//g' > cost_gradient.txt

To run filter_fort220.ksh, the fort.202 needs to be in the same directory. The script will
filter out the values of the cost function and the norm of gradient at each iteration from
fort.220 into a text file called cost_gradient.txt.

Before making plots, there are two lines in ncl script GSI_cost_gradient.ncl that needs to
be changed based on content of cost_gradient.txt:

line 6: nloop1= inner iteration number in 1st outer loop


line 7: nloop2= inner iteration number in 2nd outer loop

Then, in the same directory as the file cost_gradient.txt, run the ncl script:

ncl GSI_cost_gradient.ncl

157
GSI tools

The pdf file GSI_cost_gradient.pdf is created. The pdf file contains plots of the
convergence of the GSI analysis like those in section 4.6.

A.4 Plot Single Observation Test Result and Analysis Increment

In Section 4.2, we introduced how to do a single observation test for GSI. Here we provide
users with the ncl scripts to plot the single observation test results.

There are three ncl scripts in the . util/Analysis_Utilities/plots_ncl directory:

GSI_singleobs_arw.ncl Plot single observation test with ARW NetCDF background


GSI_singleobs_nmm.ncl Plot single observation test with NMM NetCDF background
fill_nmm_grid2.ncl E grid to A grid convertor

The main difference between the ARW and NMM core used in GSI is that ARW is on a C
grid, while NMM is on an E grid. GSI_singleobs_nmm.ncl calls fill_nmm_grid2.ncl to
convert the E grid to an A grid for plotting, while GSI_singleobs_arw.ncl itself includes a
C grid to A grid convertor.

Before running ncl scripts, users need to set up two links:

cdf_analysis Link to analysis result in NetCDF format (like wrf_inout)


cdf_bk Link to background file in netCDF format

These scripts read in the analysis and background fields of temperature (T), U component
of wind (U), V component of wind (V), and moisture (Q) and calculate the difference of
the analysis field minus the background field. Then XY sections (left column) and XZ
sections (right column) are plotted for T, U, V, and Q through the point that has maximum
analysis increment of single observation. Here the default single observation test is T for
NMM and U for ARW. If the user conducts other single observation tests, the
corresponding changes should be made based on the current scripts.

In the same directory are the two ncl scripts Analysis_increment.ncl and
Analysis_increment_nmm.ncl. These are provided to plot the analysis increment for the
ARW NetCDF and NMM NetCDF case. These scripts are very similar the one for the
single observation.

For more information on how to use ncl, please check the NCL website at:

http://www.ncl.ucar.edu/

158
GSI Namelist

Appendix B: GSI Namelist

The following are lists and explanations of the GSI namelist variables. You can also find
them in the source code gsimod.F90.

Variable name Default value Description


&SETUP General control namelist
gencode 80 source generation code
factqmin 1 weighting factor for negative moisture constraint
factqmax 1 weighting factor for supersaturated moisture
constraint
deltim 1200 model timestep
dtphys 3600 physics timestep
biascor -1 background error bias correction coefficient
bcoption 1 0=ibc (no bias correction to bkg);
1= sbc(original implementation)
diurnalbc 0 1= diurnal bias; 0= persistent bias
ndat 100 number of observations datasets
niter(0:50) 0, Maximum number of inner loop iterations for
each outer loop
niter_no_qc 1000000 Inner loop iteration at which to turn on variational
(0:50) quality control

miter 1 number of outer loops


qoption 1 option for moisture analysis variable; 1:q/qsatg
2:normalized RH
nhr_assimilation 6 assimilation time interval (currently 6 hours for
global, 3 hours for regional )
min_offset 3 time of analysis in assimilation window
iout_iter 220 output file number for iteration information
npredp 6 number of predictors for precipitation bias
correction
retrieval .false. logical to turn off or on the SST physical retrieval
nst_gsi 0 indicator to control the Tr Analysis mode:
0 = no nst info ingsi at all;
1 = input nst info, but used for monitoring only
2 = input nst info, and used in CRTM simulation,
but no Tr analysis

159
GSI Namelist

3 = input nst info, and used in CRTM simulation


and Tr analysis is on

nst_tzr 0 indicator to control the Tzr_QC mode:


0 = no Tz retrieval;
1 = Do Tz retrieval and applied to Q

nstinfo 0 number of nst variables


fac_dtl 0 index to apply diurnal thermocline layer or not: 0
= no; 1 = yes
fac_tsl 0 index to apply thermal skin layer or not: 0 = no; 1
= yes.
tzr_bufrsave .false. logical to turn off or on the bufr Tz retrieval file
true=on
diag_rad .true. logical to turn off or on the diagnostic radiance
file (true=on)
diag_pcp .true. logical to turn off or on the diagnostic
precipitation file (true=on)
diag_conv .true. logical to turn off or on the diagnostic
conventional file (true=on)
diag_ozone .true. logical to turn off or on the diagnostic ozone file
(true=on)
diag_aero .false. logical to turn off or on the diagnostic aerosol file
(true=on)
diag_co .false. logical to turn off or on the diagnostic carbon
monoxide file (true=on)
iguess 1 flag for guess solution (currently not working)
-1 do not use guess file
0 write only guess file
1 read and write guess file
2 read only guess file
reduce_diag .false. namelist logical to produce reduced radiance
diagnostic files
write_diag .false., logical to write out diagnostic files for outer
iteration
oneobtest .false. one observation test flag true=on
sfcmodel .false. if true, then use boundary layer forward model for
surface temperature data.
dtbduv_on .true. logical for switching on (.true.) sensitivity of uv
winds to microwave brightness temperatures
if true, use d(microwave brightness

160
GSI Namelist

temperature)/d(uv wind) in inner loop


ifact10 0 flag for recomputing 10m wind factor
= 1 compute using GFS surface physics
= 2 compute using MM5 surface physics
= 0 or any other value - DO NOT recompute - use
value from guess file
l_foto .false. option for First-Order Time extrapolation to
observation
offtime_data .false. if true, then allow use of obs files with ref time
different from analysis time. default value =
.false., in which case analysis fails if observation
file reference time is different from analysis time.
npred_conv_max 0 maximum number of conventional observation
bias correction coefficients
id_bias_ps 0 prepbufr id to have conv_bias added for testing
id_bias_t 0 prepbufr id to have conv_bias added for testing
id_bias_spd 120 prepbufr id to have conv_bias added for testing
conv_bias_ps 0 magnitude of ps bias(mb)
conv_bias_t 0 magnitude of t bias(deg K)
conv_bias_spd 0 magnitude of spd bias(m/sec)
stndev_conv_ps 1.0
stndev_conv_t 1.0
stndev_conv_spd 1.0
use_pbl .false. Logical flag to include PBL effects in tendency
model.
use_compress .false. option to turn on the use of compressibility
factors in geopotential heights
nsig_ext 13 number of layers above the model top which are
necessary to compute the bending angle for gpsro
gpstop 30.0 maximum height for gpsro data assimilation.
Reject anything above this height. (km)
perturb_obs .false. logical flag to perturb observation (true=on)
perturb_fact 1 magnitude factor for observation perturbation
oberror_tune .false. logical to tune (=true) oberror
preserve_restart .false. if true, then do not update regional restart file
_date date.
crtm_coeffs_path ./ path of directory w/ CRTM coeffs files
berror_stats berror_stats filename if other than "berror_stats"

161
GSI Namelist

newpc4pred .false. option for additional preconditioning for pred


coeff
adp_anglebc .false. option to perform variational angle bias correction
angord 0 order of polynomial for variational angle bias
correction
passive_bc .false. option to turn on bias correction for passive
(monitored) channels
use_edges .true. option to exclude radiance data on scan edges
lobsdiagsave .false. write out additional observation diagnostics
l4dvar .false. turn 4D-Var on/off (default=off=3D-Var)
lbicg .false. use B-precond w/ bi-conjugate gradient for
minimization
lsqrtb .false. Use sqrt(B) preconditioning
lcongrad .false. Use conjugate gradient/Lanczos minimizer
lbfgsmin .false. Use L-BFGS minimizer
ltlint .false. Use TL inner loop (ie TL intall)
nhr_obsbin -1 length of observation bins
nhr_subwin -1 length of weak constraint 4d-Var sub-window
intervals
nwrvecs -1 Number of precond vectors (Lanczos) or pairs of
vectors (QN) being saved
iorthomax 0 max number of vectors used for orthogonalization
of various CG options
ladtest .false. Run adjoint test
lgrtest .false. Run gradient test
lobskeep .false. keep obs from first outer loop for subsequent OL
lsensrecompute .false. does adjoint by recomputing forward solution
jsiga -1 calculate approximate analysis errors from
lanczos for jiter=jsiga
ltcost .false. calculate true cost when using Lanczos (this is
very expensive)
lobsensfc .false. compute forecast sensitivity to observations
lobsensjb .false. compute Jb sensitivity to observations
lobsensincr .false. compute increment sensitivity to observations
lobsensadj .false. use adjoint of approx. Hessian to compute obs
sensitivity

162
GSI Namelist

lobsensmin .false. use minimisation to compute obs sensitivity


iobsconv 0 compute convergence test in observation space
=1 at final point, =2 at every iteration
idmodel .false. uses identity model when running 4D-Var (test
purposes)
lwrtinc .false. when .t., writes out increments instead of analysis
jiterstart 1 first outloop iteration number
jiterend 1 last outloop iteration number
lobserver .false. when .t., calculate departure vectors only
lanczosave .false. save lanczos vectors for forecast sensitivity
computation
llancdone .false. use to tell adjoint that Lanczos vecs have been
pre-computed
lferrscale .false. Something related to forecast error
print_diag_pcg .false. logical turn on of printing of GMAO diagnostics
in pcgsoi.f90
tsensible .false. option to use sensible temperature as the analysis
variable. Works only for twodvar_regional=.true.
lgschmidt .false. option for re-biorthogonalization of the {gradx}
and {grady} set from pcgsoi when
twodvar_regional=.true.
lread_obs_save .false. option to write out collective obs selection info
lread_obs_skip .false. option to read in collective obs selection info
use_gfs_ozone .false. option to read in gfs ozone and interpolate to
regional model domain
check_gfs_ozone_date .false. option to date check gfs ozone before
interpolating to regional model domain
regional_ozone .false. option to turn on ozone in regional analysis
lwrite_predterms .false. option to write out actual predictor terms instead
of predicted bias to the radiance diagnostic files
lwrite_peakwt .false. option to writ out the approximate pressure of the
peak of the weighting function for satellite data to
the radiance diagnostic files
use_gfs_nemsio .false. option to use nemsio to read global model
NEMS/GFS first guess
liauon .false. treat 4dvar CV as tendency perturbation
(default=false)
use_prepb_satwnd .false. allow using satwnd's from prepbufr (historical)
file

163
GSI Namelist

l4densvar .false. logical to turn on ensemble 4dvar


ens4d_nstarthr 3 start hour for ensemble perturbations (generally
should match min_offset)
NOTE: for now, if in regional mode, then iguess=-1 is forced internally.

& GRIDOPTS Grid setup variables, including regional


specific variables
jcap 62 spectral resolution of the analysis
jcap_b 62 spectral resolution of background (model guess field)
nsig 42 number of sigma levels
nlat 96 number of latitudes
nlon 384 number of longitudes
hybrid .false. logical hybrid data file flag (true=hybrid)
nlat_regional 0 Number of y grid point in whole regional domain
nlon_regional 0 Number of x grid point in whole regional domain
diagnostic_reg .false. logical for regional debugging
update_regsfc .false. logical to write out updated surface fields to the
regional analysis file (default = false)
netcdf .false. if true, then wrf files are in netcdf format,
otherwise wrf files are in binary format.
regional .false. logical for regional GSI run
wrf_nmm_regional .false. logical for input from WRF NMM
nems_nmmb_regional .false. logical for input from NEMS NMMB
wrf_mass_regional .false. logical for input from WRF MASS-CORE (ARW)
twodvar_regional .false. logical for regional 2d-var analysis
filled_grid .false. logical to fill in points on WRF-NMM E-grid
half_grid .false. logical to use every other row of WRF-NMM E-Grid
nvege_type 24 number of types of vegetation; old=24, IGBP=20
nlayers(100) 1 number of sub-layers to break indicated model layer
into prior to calling radiative transfer model
cmaq_regional .false. Background input is from CMAQ model
nmmb_reference_grid H ='H', then analysis grid covers H grid domain
= 'V', then analysis grid covers V grid domain
grid_ratio_nmmb sqrt(2) ratio of analysis grid to nmmb model grid in nmmb
model grid units.

164
GSI Namelist

&BKGERR Background error related variables


vs 1/1.5 scale factor for vertical correlation lengths for
background error
hzscl(3) 1, 1, 1 scale factor for horizontal smoothing, n=1,number of
scales (3 for now)
specifies factor by which to reduce horizontal scales
(i.e. 2 would then apply 1/2 of the horizontal scale)
hswgt(3) 1/3, 1/3, 1/3 empirical weights to apply to each horizontal scale
norh 2 order of interpolation in smoothing
ndeg 4 degree of smoothing in recursive filters
noq 3 1/4 of accuracy in compact finite differencing
bw 0 factor in background error calculation
norsp 0 order of interpolation for smooth polar cascade routine
default is norsp=0, in which case norh is used with
original polar cascade interpolation (global only).
fstat .false. logical to separate f from balance projection
pert_berr .false. logical to turn on random inflation/deflation of
background error tuning parameters
pert_berr_fct 0 factor for increasing/decreasing berror parameters, this
is multiplied by random number
bkgv_flowdep .false. flag to turn on flow dependence to background error
variances
bkgv_rewgtfct 0 factor used to perform flow dependent reweighting of
error variances
bkgv_write .false. flag to turn on=.true. /off=.false. generation of binary
file with reweighted variances
fpsproj .true. controls full nsig projection to surface pressure

&ANBKGERR Anisotropic background error related


variables
anisotropic .false. if true, then use anisotropic background error
covariance
ancovmdl 0 covariance model settings - 0: pt-based, 1: ensemble
based
triad4 .true. for 2d variables, if true, use blended triad algorithm
ifilt_ord 4 filter order for anisotropic filters

165
GSI Namelist

npass 1 2npass = number of factors in background error


normal 200 number of random vectors to use for filter
normalization ( if < 0 then slightly slower, but results
independent of number of processors)
binom .true. if true, weight correlation lengths of factors using
binomial distribution, with shortest scales on outside,
longest scales on inside. This can help to produce
smoother correlations in the presence of strong
anisotropy
ngauss 3 number of Gaussians to add together in each factor
rgauss 0 multipliers on reference aspect tensor for each
Gaussian factor
anhswgt 1.0 empirical weights to apply to each gaussian
an_vs 1 scale factor for background error vertical scales
(temporary carry over from isotropic inhomogeneous
option)
grid_ratio 2.0 ratio of coarse to fine grid in fine grid units
grid_ratio_p 0 ratio of coarse to fine grid in fine grid units for polar
patches
nord_f2a 4 order of interpolation for transfer operators between
filter grid and analysis grid
an_flen_u 1 coupling parameter for connecting horizontal wind to
background error
an_flen_t 1 coupling parameter for connecting grad(potential
temperature) to background error
an_flen_z 1 coupling parameter for connecting grad(terrain) to
background error
rtma_subdomain .false. if true, then call alternative code which calls recursive
_option filter directly from subdomain mode, bypassing
transition to/from horizontal slabs. This is mainly to
improve efficiency for 2d rtma analysis. at the
moment, this only works for twodvar_regional=.true.
rtma_subdomain_option will be forced to false when
twodvar_regional=.false.
lreadnorm .false. if true, then read normalization from fixed files
nsmooth 0 number of 1-2-1 smoothing passes before and after
background error application
nsmooth_shapiro 0 number of 2nd moment preserving (shapiro)
smoothing passes before and after background error
application.
NOTE: default for nsmooth and nsmooth_shapiro is 0.

166
GSI Namelist

if both are > 0, then nsmooth will be forced to zero.


afact0 0.0 anistropy effect parameter, the range must be in 0.0-
1.0.
covmap .false. if true, covariance map would be drawn

&JCOPTS Constraint term in cost function (Jc)


ljcdfi .false. if .false., uses original formulation based on wind, temp,
and ps tends
when .t. uses digital filter initialization of increments
(4dvar)
alphajc 10.0 parameter for digital filter
switch_on_ .false., if true, then compute horizontal derivatives of all state
derivatives variables (to be used eventually for time derivatives,
dynamic constraints and observation forward models that
need horizontal derivatives)
tendsflag .false. if true, compute time tendencies
ljcpdry .false. when .t. uses dry pressure constraint on increment
bamp_jcpdry 0.0 parameter for pdry_jc
eps_eer -1.0 Errico-Ehrendofer parameter for q-term in energy norm
ljc4tlevs .false. when true and in 4D mode, apply any weak constraints
over all time levels instead of just at a single time

&STRONGOPTS Strong dynamic constraint


jcstrong .false. if .true., strong constraint on
jcstrong_option 1 =1 for slow global strong constraint
=2 for fast global strong constraint
=3 for regional strong constraint

nstrong 0 if > 0, then number of iterations of implicit normal


mode initialization to apply for each inner loop
iteration
period_max 1000000.0 cutoff period for gravity waves included in implicit
normal mode initialization (units = hours)
period_width 1.0 defines width of transition zone from included to
excluded gravity waves
nvmodes_keep 0 number of vertical modes to use in implicit normal
mode initialization
baldiag_full .false. flag to toggle balance diagnostics for the full fields

167
GSI Namelist

baldiag_inc .false. flag to toggle balance diagnostics for the analysis


increment
hybens_inmc_ 0 integer flag for strong constraint (various capabilities
option for hybrid):
=0: no TLNMC (consistent with jcstrong=.false.)
=1: TLNMC on static increment only (or if non-hybrid
run)
=2: TLNMC on total increment for single time level
only (or 3DHybrid)
if 4D mode, TLNMC applied to increment in center of
window
=3: TLNMC on total increment over all time levels (if
in 4D mode)

&OBSQC Observation quality control variables


Parameters used for gross error checks are set in file convinfo (ermin, ermax, ratio)
Parameters below used for nonlinear (variational) quality control
dfact 0 factor for duplicate observation at same location for
conventional data
dfact1 3.0 time factor for duplicate observation at same location
for conventional data
erradar_inflate 1 radar error inflation factor
oberrflg .false. logical for reading in new observation error table (if set
to true)
vadfile 'none' character(10) variable holding name of VAD wind bufr
file
noiqc .false logical flag to bypass OI QC (if set to true)
c_varqc 1 constant number to control variance qc turning on speed
blacklst .false. logical for reading in raob blacklist (if set to true)
use_poq7 .false. Logical to toggle accept (.true.) or reject (.false.)
SBUV/2 ozone observations flagged with profile ozone
quality mark
hilbert_curve .false. option for hilbert-curve based cross-validation. works
only with twodvar_regional=.true.
tcp_refps 1000.0 reference pressure for tcps oberr calculation (mb)
tcp_width 50.0 parameter for tcps oberr inflation (width, mb)
tcp_ermin 0.75 parameter for tcps oberr inflation (minimum oberr, mb)
tcp_ermax 5.0 parameter for tcps oberr inflation (maximum oberr, mb)

168
GSI Namelist

qc_noirjaco3 .false. controls whether to use O3 Jac from IR instruments


qc_noirjaco3_pole .false. controls wheter to use O3 Jac from IR instruments near
poles

& OBS_INPUT Controls input data


dfile(ndatmax) input observation file name
dtype(ndatmax) observation type
dplat(ndatmax) satellite (platform) id (for satellite data)
dsis(ndatmax) sensor/instrument/satellite flag from satinfo files
dthin(ndatmax) 1 satellite group
dval(ndatmax) 1.0 relative value of each profile within group relative
weight for observation = dval/sum(dval) within grid
box
dmesh(ndatmax) 1.0 thinning mesh for each group
mesh size (km) for radiance thinning grid (used in
satthin)
dsfcalc(10) 0,0, specifies method to determine surface fields within a
FOV. when equal to one, integrate model fields over
FOV. when not one, bilinearly interpolate model fields
to FOV center.
time_window time_window time window for each input data file
(ndatmax) _max

time_window_max 3 upper limit on time window for all input data


ext_sonde .false. logical for extended forward model on sonde data
NOTE: current value for ndatmax is 200.

&SINGLEOB_TEST Single observation test case setup


maginnov 1 magnitude of innovation for one observation
magoberr 1 magnitude of observational error
oneob_type observation type (t, u, v, etc.)
oblat 0 observation latitude
oblon 0 observation longitude
obpres 1000.0 observation pressure (hPa)
obdattim 2000010100 observation date (YYYYMMDDHH)
obhourset 0 observation delta time from analysis time

169
GSI Namelist

pctswitch .false. if .true. innovation & oberr are relative (%) of


background value (level ozone only)

SUPEROB_RADAR Level 2 bufr file to radar wind superobs


del_azimuth 5.0 azimuth range for superob box (default 5 degrees)
del_elev 0.25 elevation angle range for superob box (default .05
degrees)
del_range 5000.0 radial range for superob box (default 5 km)
del_time 0.5 1/2 time range for superob box (default .5 hours)
elev_angle_max 5.0 max elevation angle (default of 5 deg)
minnum 50 minimum number of samples needed to make a
superob
range_max 100000.0 max radial range in meters to use in constructing
superobs (default 100km)
l2superob_only .false. if true, then process level 2 data creating superobs,
then quit. (added for easier retrospective testing, since
level 2 bufr files are very large and hard to work with)

LAG_DATA Lagrangian data assimilation related


variables
lag_accur 1.0e-6 Accuracy used to decide whether or not a balloon is
on the grid
infile_lag inistate_lag.dat File containing the initial position of the balloon
lag_stepduration 900.0 Duration of one time step for the propagation
model
lag_nmax_bal 1000 Maximum number of balloons at starting time
lag_vorcore_stderr_a 2.0e3 Observation error for vorcore balloon
lag_vorcore_stderr_b 0.0 error = b + a*timestep(in hours)

HYBRID_ENSEMBLE Parameters for use with hybrid ensemble


option
l_hyb_ens .false. if true, then turn on hybrid ensemble option
uv_hyb_ens .false. if true, then ensemble perturbation wind variables are
u,v,
otherwise, ensemble perturbation wind variables are
stream, pot. Functions.
aniso_a_en .false. if true, then use anisotropic localization of hybrid
ensemble control variable a_en.

170
GSI Namelist

generate_ens .true. if true, then generate internal ensemble based on


existing background error
n_ens 0 number of ensemble members.
nlon_ens 0 number of longitudes on ensemble grid (may be
different from analysis grid nlon)
nlat_ens 0 number of latitudes on ensemble grid (may be
different from analysis grid nlat)
jcap_ens 0 for global spectral model, spectral truncation
pseudo_hybens .false. if true, turn on pseudo ensemble hybrid for HWRF
merge_two_grid_ .false. if true, merge ensemble perturbations from two
ensperts forecast domains to analysis domain (one way to deal
with hybrid DA for HWRF moving nest)
regional_ensemble_ 0 integer, used to select type of ensemble to read in for
option regional application. Currently takes values from 1 to
4
=1: use GEFS internally interpolated to ensemble grid.
=2: ensembles are WRF NMM format
=3: ensembles are ARW netcdf format.
=4: ensembles are NEMS NMMB format.
full_ensemble .false. if true, first ensemble perturbation on first guess
istead of on ens mean
betaflg .false. if true, use vertical weighting on beta1_inv and
beta2_inv
pwgtflg .false. if true, use vertical integration function on ensemble
contribution of Psfc
jcap_ens_test 0 for global spectral model, test spectral truncation (to
test dual resolution)
beta1_inv 1 - 1/beta1, the weight given to static background error
covariance
- 0 <= beta1_inv <= 1, tuned for optimal
performance
- =1, then ensemble information turned off
- =0, then static background turned off
- beta2_inv = 1 - beta1_inv is weight given to
ensemble derived covariance
s_ens_h 2828 homogeneous isotropic horizontal ensemble
localization scale (km)
s_ens_v 30 vertical localization scale (grid units for now)
s_ens_h, s_ens_v, and beta1_inv are tunable
parameters.
readin_ .false. flag to read (.true.)external localization information
localization file

eqspace_ensgrid .false. if .true., then ensemble grid is equal spaced, staggered

171
GSI Namelist

1/2 grid unit off ploes.


if .false., then gaussian grid assumed for ensemble
(global only)
grid_ratio_ens 1 for regional runs, ratio of ensemble grid resolution to
analysis grid resolution
default value = 1 (dual resolution off)
oz_univ_static .false. if true, decouple ozone from other variables and
defaults to static B (ozone only)
write_ens_sprd .false. writing global ensemble spread in byte addressable
format for plotting with grads

rapidrefresh_cldsurf Options for cloud analysis and surface


enhancement for RR appilcation
dfi_radar_latent_ 30.0 DFI forward integration window in minutes
heat_time_period
metar_impact_radius 10.0 metar cloud observation impact radius in grid
number
metar_impact_radius 4.0 impact radius for METAR cloud observation that
_lowCloud indicate low cloud base

l_gsd_terrain_match .false. if .true., GSD terrain match for surface temperature


_surfTobs observation

l_sfcobserror_ .false. namelist logical for adjusting surface temperature


ramp_t observation error

l_sfcobserror_ .false. namelist logical for adjusting surface moisture


ramp_q observation error

l_PBL_pseudo_ .false. if .true. produce pseudo-obs in PBL layer based on


SurfobsT surface obs T

l_PBL_pseudo_ .false. if .true. produce pseudo-obs in PBL layer based on


SurfobsQ surface obs Q

l_PBL_pseudo_ .false. if .true. produce pseudo-obs in PBL layer based on


SurfobsUV surface obs UV

pblH_ration 0.75 percent of the PBL height within which to add


pseudo-obs
pps_press_incr 30hPa pressure increase for each additional pseudo-obs on
top of previous level
l_gsd_limit_ocean_q .false. if .true. do GSD limitation of Q over ocean

172
GSI Namelist

l_pw_hgt_adjust .false. if .true. do GSD PW adjustment for model vs. obs


station height
l_limit_pw_innov .false. if .true. do GSD limitation of PW obs
max_innov_pct 0.1 sets limit of PW ob to a percent of the background
value (0-1)
l_cleanSnow_WarmTs .false. if .true. do GSD limitation of using retrieved snow
over warn area (Ts >
r_cleanSnow_WarmTs_threshold)
l_conserve_thetaV .false. if .true. conserve thetaV during moisture adjustment
in cloud analysis
r_cleanSnow_WarmTs 8.0 threshold for using retrieved snow over warn area
_threshold
i_conserve_thetaV_ 3 iteration number for conserving thetaV during
iternum moisture adjustment

l_cld_bld .false. if .true. do GSD GOES cloud building


cld_bld_hgt 1200m sets limit below which GOES cloud building occurs

CHEM Chemistry data assimialtion


berror_chem .false. if berror file is supplied for chemistry
oneobtest_chem .false. single observation test for chemistry
maginnov_chem 30.0 if oneobtest_chem=T magnitude of innovation for
chemistry
magoberr_chem 2.0 if oneobtest_chem=T magnitude of observation
error for chemistry
oneob_type_chem pm2_5 if oneobtest_chem=T type of chemical observation
oblat_chem 45.0 if oneobtest_chem=T latitude of the observation
oblon_chem 270.0 if oneobtest_chem=T longitude of the observation
obpres_chem 1000.0 if oneobtest_chem=T pressure of the observation
diag_incr .false. if user wishes to output to a binary file increment
elev_tolerance 500.0 for surface chemical observation sometimes
elevation (elev_obs) of the measurement is
available (sometimes not).
tunable_error 0.5 tuning parameter to specify representativeness error
for in-situ observations
in_fname cmaq_input.bin name of background file for cmaq
out_fname cmaq_output.bin name analysis file for cmaq

173
GSI Namelist

incr_fname chem_increment. if diag_incr=T name of the binary dump for pm2_5


bin

174

You might also like