Developer Guide and Reference

ID 767251
Date 10/31/2024
Public

Sample Programs and Traceback Information

The following sections provide sample programs that show how to use traceback information to locate the cause of an error.

The hex program counters (PCs) and contents of registers displayed in these program outputs are meant as representative examples of typical output. The program counters will change over time, as the libraries and other tools used to create an image change.

Example: End-of-File Condition, Program teof

In the following example, a READ statement encounters an end-of-file condition that the application does not handle:

  program teof
  integer*4 i,res
  i=here( )
  end
 integer*4 function here( )
  here = we( )
  end
 integer*4 function we( )
  we = go( )
  end
 integer*4 function go( )
  go = again( )
  end
 integer*4 function again( )
  integer*4 a
  open(10,file='xxx.dat',form='unformatted',status='unknown')
  read(10) a
  again=a
  end

The diagnostic output that results when this program is built with traceback enabled, optimization disabled, and linked against the shared Fortran runtime library on the Intel® 64 architecture platform is similar to the following:

forrtl: severe (24): end-of-file during read, unit 10, file E:\USERS\xxx.dat
Image            PC                Routine               Line     Source
libifcorert.dll  000007FED2B232D9  Unknown               Unknown  Unknown
libifcorert.dll  000007FED2B6CEE0  Unknown               Unknown  Unknown
teof.exe         000000013F931193  AGAIN                      17  teof.f90
teof.exe         000000013F93109B  GO                         12  teof.f90
teof.exe         000000013F931072  WE                          9  teof.f90
teof.exe         000000013F931049  HERE                        6  teof.f90
teof.exe         000000013F93101E  MAIN__                      3  teof.f90
teof.exe         000000013F96ADCE  Unknown               Unknown  Unknown
teof.exe         000000013F96B64C  Unknown               Unknown  Unknown
kernel32.dll     0000000076CC59BD  Unknown               Unknown  Unknown
ntdll.dll        0000000076EFA2E1  Unknown               Unknown  Unknown

If optimization is not disabled by option /Od (Windows) or option -O0 (Linux), procedure inlining may collapse the call stack and make it more difficult to locate a problem.

The first line of the output is the standard Fortran runtime error message. What follows is the result of walking the call stack in reverse order to determine where the error originated. Each line of output represents a call frame on the stack. Since the application was compiled with the traceback option, the Program Counters that fall in Fortran code are correlated to their matching routine name, line number and source module. Program Counters that are not in Fortran code are not correlated and are reported as Unknown.

The first two frames show the calls to routines in the Fortran runtime library (in reverse order). Because the application was linked against the shared version of the library, the image name reported is either libifcore.so (Linux) or libifcorert.dll (Windows). These are the runtime routines that were called to perform the READ and, upon detecting the end-of-file condition, to report the error. For an unhandled I/O programming error, there will always be a few frames of runtime code like this at the top of the call stack.

The stack frame of real interest to the Fortran developer is the first frame in image teof.exe, which shows that the error originated in the routine named AGAIN in source module teof.f90 at line 17. Looking in the source code at line 17, you can see the Fortran READ statement that incurred the end-of-file condition.

The next four frames show the trail of calls in the Fortran user code that led to the routine that got the error (TEOF->HERE->WE->GO->AGAIN). The main program TEOF is reported under the routine name MAIN__.

Finally, the bottom four frames are routines which handled the startup and initialization of the program.

If this program had been linked against the static Fortran runtime library, the output would then look like:

forrtl: severe (24): end-of-file during read, unit 10, file E:\USERS\xxx.dat
Image            PC                Routine               Line     Source
teof.exe         000000013F941FFB  Unknown               Unknown  Unknown
teof.exe         000000013F9380A0  Unknown               Unknown  Unknown
teof.exe         000000013F931193  AGAIN                      17  teof.f90
teof.exe         000000013F93109B  GO                         12  teof.f90
teof.exe         000000013F931072  WE                          9  teof.f90
teof.exe         000000013F931049  HERE                        6  teof.f90
teof.exe         000000013F93101E  MAIN__                      3  teof.f90
teof.exe         000000013F96ADCE  Unknown               Unknown  Unknown
teof.exe         000000013F96B64C  Unknown               Unknown  Unknown
kernel32.dll     0000000076CC59BD  Unknown               Unknown  Unknown
ntdll.dll        0000000076EFA2E1  Unknown               Unknown  Unknown

Note that the initial two stack frames now show routines in image teof.exe, not libifcore.so (Linux) or libifcorert.dll (Windows).

These are the same two runtime routines that were reported in the shared library case. Because the application was linked against the archive library libifcore.a (Linux) or the static Fortran runtime library libifcore.lib (Windows), the object modules containing these routines were linked into the application image (teof.exe). You can use a map file (see Generating Listing and Map Files) to determine the locations of uncorrelated program counters.

Now suppose the application was compiled without traceback enabled and, once again, linked against the static Fortran library. The diagnostic output would appear as follows:

forrtl: severe (24): end-of-file during read, unit 10, file E:\USERS\xxx.dat
Image            PC                Routine               Line     Source
teof.exe         000000013F851FFB  Unknown               Unknown  Unknown
teof.exe         000000013F8480A0  Unknown               Unknown  Unknown
teof.exe         000000013F841193  Unknown               Unknown  Unknown
teof.exe         000000013F84109B  Unknown               Unknown  Unknown
teof.exe         000000013F841072  Unknown               Unknown  Unknown
teof.exe         000000013F841049  Unknown               Unknown  Unknown
teof.exe         000000013F84101E  Unknown               Unknown  Unknown
teof.exe         000000013F87ADCE  Unknown               Unknown  Unknown
teof.exe         000000013F87B64C  Unknown               Unknown  Unknown
kernel32.dll     0000000076CC59BD  Unknown               Unknown  Unknown
ntdll.dll        0000000076EFA2E1  Unknown               Unknown  Unknown

Without the correlation information that the traceback option previously placed in the image, the Fortran runtime system cannot correlate program counters to routine name, line number, and source file. You can still use a map file (see Generating Listing and Map Files) to determine at least the routine names and the modules that contain them.
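The way the two kinds of output above are read can be sketched in a few lines of Fortran. This is an illustrative sketch only, not part of the runtime library: the program name first_correlated_frame and the subroutine report are hypothetical, and the frame lines are copied from the sample outputs above. It scans the frames from the top and reports the first one whose Routine column is not Unknown.

```fortran
! Sketch only: find the first correlated frame in traceback text.
program first_correlated_frame
  implicit none
  character(len=96) :: frames(3)

  ! Top three frames of the static-library output built with traceback.
  frames(1) = 'teof.exe         000000013F941FFB  Unknown               Unknown  Unknown'
  frames(2) = 'teof.exe         000000013F9380A0  Unknown               Unknown  Unknown'
  frames(3) = 'teof.exe         000000013F931193  AGAIN                      17  teof.f90'
  call report(frames)

  ! Top three frames of the output built without traceback.
  frames(1) = 'teof.exe         000000013F851FFB  Unknown               Unknown  Unknown'
  frames(2) = 'teof.exe         000000013F8480A0  Unknown               Unknown  Unknown'
  frames(3) = 'teof.exe         000000013F841193  Unknown               Unknown  Unknown'
  call report(frames)

contains

  subroutine report(lines)
    character(len=*), intent(in) :: lines(:)
    character(len=32) :: image, pc, routine, lineno, source
    integer :: k
    do k = 1, size(lines)
      ! A list-directed read splits the five blank-separated columns.
      read(lines(k), *) image, pc, routine, lineno, source
      if (trim(routine) /= 'Unknown') then
        print *, 'first correlated frame: ', trim(routine), &
                 ' line ', trim(lineno), ' in ', trim(source)
        return
      end if
    end do
    print *, 'no correlated frames; resolve the PCs with a map file'
  end subroutine report

end program first_correlated_frame
```

For the first set of frames the sketch reports AGAIN at line 17 of teof.f90. For the second set it finds no correlated frame, which is the case where a map file is needed.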

Remember that compiling with the traceback option increases the size of your application's image because of the extra program counter correlation information included in the image. To check whether an image contains this information, look for a .trace section in the output of one of the following commands:

Linux:    objdump -h your_app.exe
Windows:  link -dump -summary your_app.exe

Build your application with and without traceback and compare the file size of each image. Check the file size with a simple directory command.

For this simple teof.exe example, the traceback correlation information adds about 512 bytes to the image size. In a real application, this would probably be much larger. For any application, the developer must decide if the increase in image size is worth the benefit of automatic Program Counter correlation or if manually correlating Program Counters with a map file is acceptable.

If an error occurs when traceback was requested during compilation, the runtime library will produce the correlated call stack display.

If an error occurs when traceback was disabled during compilation, the runtime library will produce the uncorrelated call stack display.

If you do not want to see the call stack information displayed, you can set the environment variable FOR_DISABLE_STACK_TRACE to true (see Supported Environment Variables). You will still get the Fortran runtime error message:

forrtl: severe (24): end-of-file during read, unit 10, file E:\USERS\xxx.dat
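For contrast with the unhandled case shown in this example, the following is a minimal sketch of how the READ in routine AGAIN could handle the end-of-file condition itself, so that the runtime library neither reports the error nor produces a traceback. The names teof_handled and again_handled, and the fallback values 0 and -1, are hypothetical choices made for illustration. The sketch relies on the standard IOSTAT= specifier and the Fortran 2003 intrinsic IS_IOSTAT_END.

```fortran
! Sketch only: a variant of AGAIN that handles end-of-file itself.
program teof_handled
  implicit none
  integer(4) :: i
  integer(4), external :: again_handled
  i = again_handled()
  print *, 'result = ', i
end program teof_handled

integer(4) function again_handled()
  implicit none
  integer(4) :: a
  integer :: ios
  a = 0
  open(10, file='xxx.dat', form='unformatted', status='unknown')
  ! With IOSTAT= present, an I/O condition sets ios instead of
  ! terminating the program with a runtime error message.
  read(10, iostat=ios) a
  if (is_iostat_end(ios)) then
    print *, 'end-of-file on unit 10; using fallback value'
    a = 0           ! hypothetical fallback value
  else if (ios /= 0) then
    print *, 'read error on unit 10, iostat = ', ios
    a = -1          ! hypothetical error marker
  end if
  close(10)
  again_handled = a
end function again_handled
```

Whether to handle the condition or let the runtime report it is a design choice: handling it keeps the program running, while the unhandled form gives you the traceback described above.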

Example: Machine Exception Condition, Program ovf

The following program generates a floating-point overflow exception when compiled with the fpe option value 0:


  program ovf
  real*4 a
  a=1e37
  do i=1,10
    a=hey(a)
  end do
  print *, 'a= ', a
  end
 real*4 function hey(b)
  real*4 b
  hey = watch(b)
  end
 real*4 function watch(b)
  real*4 b
  watch = out(b)
  end
 real*4 function out(b)
  real*4 b
  out = below(b)
  end
 real*4 function below(b)
  real*4 b
  below = b*10.0e0
  end

Assume this program is compiled with the following:

  • Option fpe value 0

  • Option traceback

  • Option -O0 (Linux) or /Od (Windows)

On a system based on Intel 64 architecture, the traceback output is similar to the following:

forrtl: error (72): floating overflow
Image              PC                Routine            Line        Source
libc.so.6          00007F374B510DB0  Unknown               Unknown  Unknown
a.out              0000000000405348  below                      23  opt.f90
a.out              0000000000405329  below_.t110p                0  opt.f90
a.out              0000000000405305  out                        19  opt.f90
a.out              00000000004052E9  out_.t89p                   0  opt.f90
a.out              00000000004052C5  watch                      15  opt.f90
a.out              00000000004052A9  watch_.t68p                 0  opt.f90
a.out              0000000000405285  hey                        11  opt.f90
a.out              0000000000405269  hey_.t35p                   0  opt.f90
a.out              00000000004051C3  ovf                         5  opt.f90
a.out              000000000040516D  Unknown               Unknown  Unknown
libc.so.6          00007F374B4FBE50  Unknown               Unknown  Unknown
libc.so.6          00007F374B4FBEFC  __libc_start_main     Unknown  Unknown
a.out              0000000000405085  Unknown               Unknown  Unknown

Notice that unlike the previous example of an unhandled I/O programming error, the stack walk can begin right at the point of the exception. There are no Fortran runtime library routines on the call stack to dig through. The overflow occurs in routine BELOW at PC 0000000000405348, which is correlated to line 23 of the source file opt.f90.
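The traceback shows where the overflow is raised, but not on which pass of the DO loop. The following is a minimal sketch of the arithmetic, assuming REAL*4 is IEEE single precision. The program name ovf_range and the variable first_pass are hypothetical. It compares the running value with HUGE and does not itself overflow.

```fortran
! Sketch only: check on which loop pass the multiplication by 10
! leaves the REAL*4 range, without triggering the exception.
program ovf_range
  implicit none
  real(4) :: a, first_pass
  a = 1e37
  first_pass = a * 10.0e0      ! value produced by the first call to HEY
  print *, 'largest REAL*4 value : ', huge(a)
  print *, 'after first pass     : ', first_pass
  ! A further multiplication by 10 stays in range only if the
  ! current value is at most huge(a)/10.
  print *, 'second pass in range?: ', first_pass <= huge(a) / 10.0e0
end program ovf_range
```

Under that assumption the last line prints F. The first pass yields about 1e38, which is below HUGE (about 3.4e38), while the second pass would need about 1e39, so the exception is raised during the second iteration of the loop.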

When the program is compiled at a higher optimization level of O2, along with option fpe value 0 and the traceback option, the traceback output appears as follows:

forrtl: error (72): floating overflow
Image              PC                Routine            Line        Source
libc.so.6          00007F1E43356DB0  Unknown               Unknown  Unknown
a.out              00000000004051CB  ovf                        23  opt.f90
a.out              000000000040516D  Unknown               Unknown  Unknown
libc.so.6          00007F1E43341E50  Unknown               Unknown  Unknown
libc.so.6          00007F1E43341EFC  __libc_start_main     Unknown  Unknown
a.out              0000000000405085  Unknown               Unknown  Unknown

With option O2 in effect, the entire program has been inlined.

The main program, OVF, no longer calls routine HEY. While the output is not quite what one might have expected intuitively, it is still entirely correct. You need to keep in mind the effects of compiler optimization when you interpret the diagnostic information reported for a failure in a release image.

If the same image were executed again, this time with the environment variable called TBK_ENABLE_VERBOSE_STACK_TRACE set to True, you would also see a dump of the exception context record at the time of the error. Here is an excerpt of how that might appear on a system using Intel 64 architecture:

forrtl: error (72): floating overflow

Hex Dump of User Context at Exception:

Alternate Signal Stack Content:
SS_SP:   000000000048C620  SS_FLAGS:  00000000  SS_SIZE: 0000000000014000

General Registers From Machine Context:
R8:      00007FFE70BFE090  R9:      0000000000000000
R10:     0000000000000030  R11:     00007FBC3AFC33C0
R12:     00007FFE70BFE478  R13:     0000000000405150
R14:     00007FBC3B563AA0  R15:     000000000048B9C8
RDI:     000000000046E010  RSI:     0000000000000000
RBP:     00007FFE70BFE320  RBX:     0000000000000000
RDX:     0000000000000000  RCX:     0000000000000002
RAX:     0000000000000002  RSP:     00007FFE70BFE2B0
RIP:     00000000004051CB  EFL:     0000000000010297
CSGSFS:  002B000000000033  ERR:     0000000000000000
TRAPNO:  0000000000000013

Floating Point Control Registers From Machine Context:
CWD:     00000372  SWD:     00000000  FTW:     00000000  FOP:     00000000
RIP:     0000000000000000  RDP:     0000000000000000
MXCSR:   00009968  MXCSR MASK: 0000FFFF

Floating Point Register Stack From Machine Context:
ST       EXPONENT       SIGNIFICAND
--       --------  ----------------
0           0000  0000000000000000
1           0000  0000000000000000
2           0000  0000000000000000
3           0000  0000000000000000
4           0000  0000000000000000
5           0000  0000000000000000
6           0000  0000000000000000
7           0000  0000000000000000

Floating Point XMM Registers From Machine Context:
---------------------------------------------
XMM0     00000000 00000000 00000000 41200000
XMM1     00000000 00000000 00000000 7E967699
XMM2     00000000 00000000 00000000 447A0000
XMM3     00000000 00000000 00000000 00000000
XMM4     00000000 00000000 00000000 00000000
XMM5     00000000 00000000 00000000 00000000
XMM6     00007FFE 70BFDDA0 00000000 00000000
XMM7     00007FFE 70BFE33C B91287F7 B42C5A00
XMM8     706D6F63 2F686372 61726C70 6D632F6C
XMM9     00000000 00000000 00000000 00000000
XMM10    00000000 00000000 00000000 00000000
XMM11    00000000 00000000 00000000 00000000
XMM12    00000000 00000000 00000000 00000000
XMM13    00000000 00000000 00000000 00000000
XMM14    00000000 00000000 00000000 00000000
XMM15    00000000 00000000 00000000 00000000


In-Memory Floating Point Control Registers:
CWD:     00000000  SWD:     00000000  FTW:     00000000  FOP:     00000000
RIP:     0000000000000000  RDP:     0000000000000000
MXCSR:   00000372  MXCSR MASK: 00000000

In-Memory Floating Point Register Stack:
ST       EXPONENT       SIGNIFICAND
--       --------  ----------------
0           0000  0000000000000000
1           0000  0000000000000000
2           0000  0000000000000000
3           0000  0000000000000000
4           0000  0000000000000000
5           0000  0000000000000000
6           0000  0000000000000000
7           0000  0000000000000000

In-Memory Floating Point XMM Registers:
---------------------------------------------
XMM0     00000000 00000000 00000000 41200000
XMM1     00000000 00000000 00000000 7E967699
XMM2     00000000 00000000 00000000 447A0000
XMM3     00000000 00000000 00000000 00000000
XMM4     00000000 00000000 00000000 00000000
XMM5     00000000 00000000 00000000 00000000
XMM6     00007FFE 70BFDDA0 00000000 00000000
XMM7     00007FFE 70BFE33C B91287F7 B42C5A00
XMM8     706D6F63 2F686372 61726C70 6D632F6C
XMM9     00000000 00000000 00000000 00000000
XMM10    00000000 00000000 00000000 00000000
XMM11    00000000 00000000 00000000 00000000
XMM12    00000000 00000000 00000000 00000000
XMM13    00000000 00000000 00000000 00000000
XMM14    00000000 00000000 00000000 00000000
XMM15    00000000 00000000 00000000 00000000

Additional User Context:
UC_FLAGS:  0000000000000007
UC_LINK: 0000000000000000


Traceback symbolic or hex stack dump follows:

--------- Frame # 0 ---------------------------------------

Image:         libc.so.6
PC:            0x00007fbc3ae78db0
Routine name:  Unknown
Source file:   Unknown
Line number:   Unknown

--------- Frame # 1 ---------------------------------------

Image:         a.out
PC:            0x00000000004051cb
Routine name:  ovf
Source file:   opt.f90
Line number:   23

--------- Frame # 2 ---------------------------------------

Image:         a.out
PC:            0x000000000040516d
Routine name:  Unknown
Source file:   Unknown
Line number:   Unknown

--------- Frame # 3 ---------------------------------------

Image:         libc.so.6
PC:            0x00007fbc3ae63e50
Routine name:  Unknown
Source file:   Unknown
Line number:   Unknown

--------- Frame # 4 ---------------------------------------

Image:         libc.so.6
PC:            0x00007fbc3ae63efc
Routine name:  __libc_start_main
Source file:   Unknown
Line number:   Unknown

--------- Frame # 5 ---------------------------------------

Image:         a.out
PC:            0x0000000000405085
Routine name:  Unknown
Source file:   Unknown
Line number:   Unknown
