IRIX Advanced Site and Server Administration Guide
This chapter describes the basics of tuning the IRIX operating system for the best possible performance for your particular needs. Information provided includes the following topics:
General information on system tuning and kernel parameters. See "Theory of System Tuning".
Tuning applications under development. See "Application Tuning".
Observing the operating system to determine if it should be tuned. See "Monitoring the Operating System".
Tuning and reconfiguring the operating system. See "Tuning the Operating System".
The standard IRIX System configuration is designed for a broad range of uses, and adjusts itself to operate efficiently under all but the most unusual and extreme conditions. The operating system controls the execution of programs in memory and the movement of programs from disk to memory and back to disk.
The basic method of system tuning is as follows:
monitor system performance using various utilities,
adjust specific values (for example, the maximum number of processes),
reboot the system if necessary, and
test the performance of the new system to see if it is improved.
Note that performance tuning cannot expand the capabilities of a system beyond its hardware capacity. You may need to add hardware, in particular another disk or additional memory, to improve performance.
Table 5-1 lists the files/directories used for tuning and reconfiguring a system.
File/Directory | Purpose
---|---
/var/sysgen/system/* | Files defining software modules
/var/sysgen/mtune/* | Files defining tunable parameters
/var/sysgen/stune | File defining default parameter values
/var/sysgen/boot/* | Directory of object files
/unix | File containing the kernel image
Typically you tune a parameter in one of the files located in the mtune directory (for example, the kernel file) by using the systune(1M) command.
Tunable parameters control characteristics of processes, files, and system activity. They set various table sizes and system thresholds to handle the expected system load. If certain system structures are too large, they waste memory space that would otherwise be used for other processes and can increase system overhead due to lengthy table searches. If they are set too low, they can cause excessive I/O, process aborts, or even a system crash, depending on the particular parameter.
This section briefly introduces the tunable parameters. Appendix A, "IRIX Kernel Tunable Parameters," describes each parameter, gives its default value, provides suggestions on when to change it, and describes problems you may encounter.
Tunable parameters are specified in separate configuration files in the /var/sysgen/mtune directory. See the mtune(4) reference page.
The default values for the tunable parameters are usually acceptable for most configurations in a single-user workstation environment. However, if you have a lot of memory or your environment has special needs, you may want to adjust parameter values to meet those needs (see Appendix A, "IRIX Kernel Tunable Parameters," for the parameters you are most likely to adjust).
You can often increase system performance by tuning your applications to follow your system's resource limits more closely. If you are concerned about a decrease in your system's performance, first check your application software to see if it is making the best use of the operating system. If you are using an application you developed yourself, there are steps you can take to improve its performance. Even if a commercially purchased application is degrading system performance, you can identify the problem and use that information to make decisions about system tuning, new hardware, or simply when and how to use the application. The following sections explain how to examine and tune applications. The rest of this chapter assumes that your applications have been tuned as much as possible according to these suggestions.
If your system seems slow (for example, an application runs slowly), first check the application itself. A poorly designed application can make the whole system appear slow, while an efficiently written application means reduced code size and execution time.
A good utility for determining the source of the problem is timex(1), which reports how a particular application uses its CPU processing time. The format is:
timex -s program
which shows the program's real time (actual elapsed time), user time (time the process spent executing its own code), and sys time (time spent in kernel services for system calls). For example:
timex -s ps -el
The above command executes the ps -el command and then displays that program's time spent as:
real 0.95
user 0.08
sys 0.41
There are many reasons why an application spends a majority of its time in either user or sys space. For our purposes, suspect excessive system calls or poor locality of code.
Typically, you can only tune applications that you are developing. Applications purchased for your system cannot be tuned in this manner, although there is usually a facility to correspond with the application vendor to report poor performance.
If the application is primarily spending its time in user space, the first approach to take is to tune the application to reduce its user time by using the pixie(1) and prof(1) commands. See the respective reference pages for more information about these commands. To reduce high user time, make sure that the program:
makes only the necessary number of system calls. Use timex -s to find out the number of system calls per second the program is making; the key is to keep scall/s to a minimum. System calls are calls such as read(2) and exec(2); they are listed in Section 2 of the reference pages.
uses buffers of at least 4K for read(2) and write(2) system calls, or uses the standard I/O library routines fread(3) and fwrite(3), which buffer user data (see the sketch after this list).
uses shared memory rather than record locking where possible. Record locking checks for a record lock on every read and write to a file. To improve performance, use shared memory and semaphores to control access to common data (see shmop(2), semop(2), and usinit(3P)).
defines efficient search paths ($PATH variable). Specify the most-used directory paths first, and use only the required entries, so that infrequently used directories aren't searched every time.
eliminates polling loops (see select(2)).
eliminates busy wait (use sginap(0)).
eliminates system errors. Look at /var/adm/SYSLOG, the system error log, to check for errors that the program generated, and try to eliminate them.
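To make the buffering guideline concrete, here is a minimal C sketch (the copyfile routine and the 4 KB size are illustrative, not part of this guide). Each read(2) and write(2) moves a full 4 KB block, so copying a file costs a few hundred system calls per megabyte instead of the enormous number an unbuffered, byte-at-a-time loop would make:

/* Copy src to dst through a 4 KB buffer; error handling abbreviated. */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BUFSZ 4096

int
copyfile(const char *src, const char *dst)
{
    char buf[BUFSZ];
    int in, out;
    ssize_t n;

    if ((in = open(src, O_RDONLY)) < 0)
        return -1;
    if ((out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0) {
        close(in);
        return -1;
    }
    while ((n = read(in, buf, BUFSZ)) > 0)   /* one system call per 4 KB */
        write(out, buf, n);
    close(in);
    close(out);
    return 0;
}

Running timex -s on such a program before and after the change shows the scall/s figure drop sharply.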
Run timex again. If the application still shows a majority of either user or sys time, suspect excessive paging due to poor "locality" of text and data. An application that has locality of code executes instructions in a localized portion of text space by using program loops and subroutines. In this case, try to reduce high user/sys time by making sure that the program:
groups its subroutines together. If often-used subroutines in a loaded program are mixed with seldom-used routines, the program could require more of the system's memory resources than if the routines were loaded in the order of likely use, because the seldom-used routines might be brought into memory as part of a page.
has a working set that fits within physical memory. This minimizes the amount of paging and swapping the system must perform.
has correctly ported FORTRAN-to-C code. FORTRAN arrays are structured differently from C arrays: FORTRAN is column major, while C is row major. If you don't port the program correctly, the application will have poor data locality (see the sketch after this list).
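The following hypothetical fragment shows the difference (the array and function names are illustrative). Because C is row major, the first loop touches adjacent memory locations and stays within a few pages at a time; the second loop, ordered the way a FORTRAN programmer would order the subscripts, strides through memory and touches many more pages:

#define N 1024
static float a[N][N];

void
good_locality(void)         /* row-major order: sequential access */
{
    int i, j;
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            a[i][j] = 0.0;
}

void
poor_locality(void)         /* column order: stride of N floats per access */
{
    int i, j;
    for (j = 0; j < N; j++)
        for (i = 0; i < N; i++)
            a[i][j] = 0.0;
}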
After you tune your program, run timex again. If sys time is still high, tuning the operating system may help reduce this time.
There are a few other things you can do to improve the application's I/O throughput. If you are on a single-user workstation, make sure that the application:
gains I/O bandwidth by using more than one drive (if applicable). If an application needs to do I/O concurrently on more than one file, try to set things up so that the files are in different file systems, preferably on different drives and ideally on different controllers.
obtains unfragmented layout of a file. Try to arrange an application so that there is only one file currently being written to the file system where it resides. That is, if you have several files you need to write to a file system, and you have the choice of writing them either one after another or concurrently, you will actually get better space allocation (and consequently better I/O throughput) by writing these files singly, one after another.
If you are on a multi-user server, however, it's hard to control how other applications access the system. Use a large I/O size (16 KB or more). You may also be able to set up separate file systems for different users. If timex shows high sys time, you need to monitor the operating system to determine why that time is high.
Many applications have routines that are executed over and over again. You can optimize program performance by modifying these heavily used routines in the source code. The following paragraphs describe the tools that will help tune your programs.
Profiling allows you to monitor program behavior during execution and determine the amount of time spent in each of the routines in the program. There are two types of profiling:
program counter (PC) sampling
basic block counting
PC sampling is a statistical method that interrupts the program frequently and records the value of the program counter at each interrupt. Basic block counting, on the other hand, is done by using the pixie(1) utility to modify the program module by inserting code at the beginning of each basic block (a sequence of instructions containing no branch instructions) that counts the number of times that each block is entered. Both types of profiling are useful. The primary difference is that basic block counting is deterministic and PC sampling is statistical. To do PC sampling, compile the program with the -p option. When the resulting program is executed, it will generate output files with the PC sampling information that can then be analyzed using the prof(1) utility.
To do basic block counting, compile the program and then execute pixie on it to produce a new binary file that contains the extra instructions to do the counting. When the resulting program is executed, it will produce output files that are then used with prof to generate reports of the number of cycles consumed by each basic block. You can then use the output of prof to analyze the behavior of the program and optimize the algorithms that consume the majority of the program's time. Refer to the cc(1), f77(1), pixie(1), and prof(1) reference pages for more information about the mechanics of profiling.
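For example, a PC-sampling session might look like the following sketch, assuming a C source file named myprog.c (the names are illustrative; the profiled run writes a profiling data file that prof then reads):

cc -p -o myprog myprog.c
myprog
prof myprog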
User program text is demand-loaded a page (currently 4K) at a time. Thus, when a reference is made to an instruction that is not currently in memory and mapped to the user's address space, the encompassing page of instructions is read into memory and then mapped into the user's address space. If often-used subroutines in a loaded program are mixed with seldom-used routines, the program could require more of the system's memory resources than if the routines were loaded in the order of likely use. This is because the seldom-used routines might be brought into memory as part of a page of instructions from another routine.
Tools are available to analyze the execution history of a program and rearrange the program so that the routines are loaded in most-used order (according to the recorded execution history). These tools include pixie, prof, and cc. By using these tools, you can maximize the cache-hit ratio (checked by running sar -b) or minimize paging (checked by running sar -p), and effectively reduce a program's execution time. The following steps illustrate how to reorganize a program named fetch:
Execute the pixie command, which will add profiling code to fetch:
pixie fetch
This creates an output file, fetch.pixie, and a file that contains basic block addresses, fetch.Addrs.
Run fetch.pixie (created in the previous step) on a normal set or sets of data. This creates a file named fetch.Counts, which contains the basic block counts.
Next, create a feedback file that the compiler will pass to the loader. Do this by executing prof:
prof -pixie -feedback fbfile fetch fetch.Addrs fetch.Counts
This produces a feedback file named fbfile.
Compile the program with the original flags and options, and add the following two options:
-cord -feedback fbfile
For more information, see the prof and pixie reference pages.
You cannot usually tune commercially available applications to any great degree. If your monitoring has told you that a commercially purchased application is causing your system to run at unacceptably slow levels, you have a few options:
You can look for other areas to reduce system overhead and increase speed, such as reducing the system load in other areas to compensate for your application. Options such as batch processing of files and programs when system load levels permit often show a noticeable increase in performance. See "Automating Tasks with at(1), batch(1), and cron(1M)".
You can use the nice(1) and renice(1) utilities to change the priority of other processes to give your application a greater share of CPU time. See "Prioritizing Processes with nice" and "Changing the Priority of a Running Process".
You can undertake a general program of system performance enhancement, which can include maximizing operating system I/O through disk striping and increased swap space. See "Logical Volumes and Disk Striping" and "Swap Space".
You can add additional memory or disk space, or even upgrade to a faster CPU.
You can find another application that performs the same function but that is less intensive on your system. (This is the least preferable option, of course.)
Before you make any changes to your kernel parameters, you should know which parameters should be changed and why. Monitoring the functions of the operating system will help you determine if changing parameters will help your performance, or if new hardware is necessary.
In rare instances a table overflows because it isn't large enough to meet the needs of the system. In this case, an error message appears on the console and in /var/adm/SYSLOG. If the console window is closed or stored, you'll want to check SYSLOG periodically.
Some system calls return an error message that can indicate a number of conditions, one of which is that you need to increase the size of a parameter. Table 5-2 lists the error messages and parameters that may need adjustment. These parameters are in /var/sysgen/master.d/kernel.
Message | System Call | Parameter
---|---|---
EAGAIN: No more processes | fork(2) | increase nproc or swap space
ELIBMAX: linked more shared libraries than limit | exec(2) | increase shlbmax
E2BIG: Arg list too long | shell(1), make(1), exec(2) | increase ncargs
Be aware that there can be other reasons for the errors in the previous table. For example, EAGAIN may appear because of insufficient virtual memory. In this case, you may need to add more swap space. For other conditions that can cause these messages, see the Owner's Guide appendix titled "Error Messages".
Other system calls will fail and return error messages that may indicate IPC (interprocess communication) structures need adjustment. These messages and the parameters to adjust are listed in Appendix A, "IRIX Kernel Tunable Parameters."
Two utilities you can use to monitor system performance are timex and sar. They provide very useful information about what's happening in the system.
The operating system has a number of counters that measure internal system activity. Each time an operation is performed, an associated counter is incremented. You can monitor internal system activity by reading the values of these counters.
These utilities monitor the value of the operating system counters, and thus sample system performance. Both utilities use sadc, the sar data collector, which collects data from the operating system counters and puts it in a file in binary format. The difference is that timex takes a sample over a single span of time, while sar takes a sample at specified time intervals. The sar program also has options that allow sampling of a specific function, such as CPU usage (-u option) or paging (-p option). In addition, the utilities display the data somewhat differently.
When would you use one utility over the other? If you are running a single application or a couple of programs, use timex. If you have a multi-user/multi-processor system, and/or are running many programs, use sar.
As in all performance tuning, be sure to run these utilities at the same time you are running an application or a benchmark, and be concerned only when figures are outside the acceptable limits over a period of time.
The timex utility is a useful troubleshooting tool when you are running a single application. For example:
timex -s application
The -s option reports total system activity (not just that due to the application) that occurred during the execution interval of application. To redirect timex output to a file (assuming you use the Bourne shell, sh(1)), enter:
timex -s application 2> file
The same command, entered using the C shell (csh(1)), looks like this:
timex -s application >& file
The sar utility is a useful troubleshooting tool when you're running many programs and processes and/or have a multi-user system such as a server. You can take a sample of the operating system counters over a period of time (for a day, a few days, or a week).
Depending on your needs, you can choose the way in which you wish to examine system activity. You can monitor the system:
during daily operation
consecutively with an interval
before and after an activity under your control
during the execution of a command
You can set up the system so sar will automatically collect system activity data and put it into files for you. Just use the chkconfig(1M) command to turn on sar's automatic reporting feature, which generates a sar -A listing. A crontab entry instructs the system to sample the system counters every 20 minutes during working hours and every hour at other times for the current day (data is kept for the last 7 days). To enable this feature, type:
/etc/chkconfig sar on
The data that is collected is put in /var/adm/sa in the form sann and sarnn, where nn is the date of the report (sarnn is in ASCII format). You can use the sar(1M) command to output the results of system activity.
You can use sar to generate consecutive reports about the current state of the system. On the command line, specify a time interval and a count. For example:
sar -u 5 8
This prints information about CPU use eight times at five-second intervals.
You may find it useful to take a snapshot of the system activity counters before and after running an application (or after running several applications concurrently). To take a snapshot of system activity, instruct sadc (the data collector) to dump its output into a file. Then run the application(s) either under normal system load or restricted load, and when you are ready to stop recording, take another snapshot of system activity. Then compare results to see what happened.
The following is an example of commands that will sample the system counters before and after the application:
/usr/lib/sa/sadc 1 1 file
Run the application(s) or perform any work you want to monitor, then type:
/usr/lib/sa/sadc 1 1 file
Then, to examine the difference between the two snapshots, type:
sar -f file
If file does not exist, sadc will create it. If it does exist, sadc will append data to it.
Often you want to examine system activity during the execution of a command or set of commands. The method just described will allow you to do this. For example, to examine all system activity while running nroff(1), type:
/usr/lib/sa/sadc 1 1 sa.out
nroff -mm file.mm > file.out
/usr/lib/sa/sadc 1 1 sa.out
sar -A -f sa.out
By using timex, you can do the same thing with a single command line:
timex -s nroff -mm file.mm > file.out
Note that timex also reports the real, user, and system time spent executing the nroff request.
There are two minor differences between timex and sar. The sar program can limit its output (for example, the -u option reports only CPU activity), while timex always prints the full -A listing. Also, sar works in a variety of ways, as discussed previously, but timex works only by executing a command; however, that command can be a shell file.
If you are interested in system activity during the execution of two or more commands running concurrently, put the commands into a shell file and run timex -s on the file. For example, suppose the file nroff.sh contained the following lines:
nroff -mm file1.mm > file1.out &
nroff -mm file2.mm > file2.out &
wait
To get a report of all system activity after both of the nroff requests (running concurrently) finish, invoke timex as follows:
timex -s nroff.sh
Now that you have learned when and how to use sar and timex, you can choose one of these utilities to monitor the operating system. Then examine the output and try to determine what's causing performance degradation. Look for numbers that show large fluctuation or change over a sustained period; don't be too concerned if numbers occasionally go beyond the maximum.
The first thing to check is how the system is handling the disk I/O process. After that, check for excessive paging/swapping. Finally look at CPU use and memory allocation.
The sections immediately following assume that the system you are tuning is active (with applications/benchmark executing).
The system uses disks to store data and transfers data between the disk and memory. This input/output (I/O) process consumes a lot of system resources, so you want the operating system to be as efficient as possible when it performs I/O.
If you are going to run a large application or have a heavy system load, the system will benefit from disk I/O tuning. Run sar -A or timex -s and look at the %busy, %rcache, %wcache, and %wio fields. To see if your disk subsystem needs tuning, check your output of sar -A against the figures in the following table. (Note that in the tables that follow, the right column lists the sar option that prints only the selected output, for example, output for disk usage (sar -d) or CPU activity (sar -u).)
Table 5-3 lists sar results that indicate an I/O-bound system.
Field | Value | sar Option
---|---|---
%busy (% time disk is busy) | >85% | sar -d
%rcache (reads in buffer cache) | low, <85% | sar -b
%wcache (writes in buffer cache) | low, <60% | sar -b
%wio (idle CPU waiting for disk I/O) | development system: >30; file server: >80 | sar -u
Notice that for the %wio figures (which indicate the percentage of time the CPU is idle while waiting for disk I/O), there are examples of two types of systems:
a development system whose users run programs such as make. In this case, if %wio > 30, check the breakdown of %wio (sar -u). By looking at %wfs (waiting for file system) and %wswp (waiting for swap), you can pinpoint exactly what the system is waiting for.
an NFS system that is serving NFS clients and is running as a file server. In this case, if %wio > 80 and %wfs > 90, the system is disk I/O bound.
There are many other factors to consider when you tune for maximum I/O performance. You may also be able to increase performance by:
using logical volumes
using partitions on different disks
adding hardware (a disk, controller, memory)
By using logical volumes, you can:
grow an existing file system to a larger size without having to disturb the existing file system contents.
stripe file systems across multiple disks. You may be able to obtain up to a 50% improvement in your I/O throughput by creating striped volumes on different disks.
Striping works best on disks that are on different controllers. Logical volumes give you more space without remaking the first file system. Disk striping gives you more space with increased performance potential, but at a risk: if you lose one of the disks holding striped data, you lose all the data on the file system, because the data is interspersed across all the disks.
Contiguous logical volumes fill up one disk and then write to the next. Striped logical volumes write to all disks in the volume equally, spreading each file across them. It is therefore impossible to recover from a bad disk if the data is striped, but it may be possible if the data is in a contiguous logical volume. For information on creating a striped disk volume, see "Logical Volumes and Disk Striping".
There are some obvious things you can do to increase your system's throughput, such as limiting the number of programs that can run at peak times, shifting processes to non-peak hours (run batch jobs at night), and shifting processes to another machine. You can also set up partitions on separate disks to redistribute the disk load.
Before continuing with the discussion about partitions, let's look at how a program uses a disk as it executes. Table 5-4 shows various reasons why an application may need to access the disk.
Application | Disk Access
---|---
executes object code | text and data
uses swap space for data, stack | /dev/swap
writes temporary files | /tmp and /var/tmp
reads/writes data files | data files
You can maximize I/O performance by using separate partitions on different disks for some of the aforementioned disk access areas. In effect, you are spreading out the application's disk access routines, which will speed up I/O.
By default, disks are partitioned to allow access in two ways, either:
three partitions: partitions 0, 1 and 6, or
one large partition, partition 7 (encompasses the three smaller partitions)
On the system disk, partition 0 is for root, 1 is for swap, and 6 is for /usr.
For each additional disk, you need to decide if you want a number of partitions or one large one and what file systems (or swap) you want on each disk and partition. It's best to distribute file systems in the disk partitions so that different disks are being accessed concurrently.
The configuration depends on how you use the system, so it helps to look at a few examples.
Consider a system that typically runs a single graphics application that often reads from a data file. The application is so large that its pages are often swapped out to the swap partition.
In this case, it might make sense to have the application's data file on a disk separate from the swap area.
If after configuring the system this way, you find that it doesn't have enough swap space, consider either obtaining more memory, or backing up everything on the second hard disk and creating partitions to contain both a swap area and a data area.
Changing the size of a partition containing an existing file system may make any data in that file system inaccessible. Always make a complete and current backup (with verification) and document partition information before making a change. If you change the wrong partition, you can change it back, provided you do not run mkfs on it or overwrite it. It is recommended that you print a copy of the prtvtoc command output after you have customized your disks, so that the partitions may be more easily restored in the event of severe disk damage.
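For example, you might save the layout of a second SCSI disk like this (the device name is an assumption; substitute the volume header device for your own disk):

prtvtoc /dev/rdsk/dks0d2vh > /var/adm/dks0d2.vtoc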
Also, if you have a very large application and have three disks, consider using partitions on the second and third disks for the application's executables (/bin and /usr/bin) and for data files, respectively.

Next, consider a system that mostly runs as a "compile engine."
In this case, it might be best to place the /tmp directory on a disk separate from the source code being compiled. Make sure that you check and mount the file system prior to creating any files on it. (If this is not feasible, you can instruct the compiler to use a directory on a different disk for temporary files: set the TMPDIR environment variable to the new directory, as shown below.)
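For example, assuming a scratch file system mounted at /disk2 (an illustrative name), a C shell user would enter:

setenv TMPDIR /disk2/tmp

Under the Bourne shell, the equivalent is:

TMPDIR=/disk2/tmp; export TMPDIR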
Finally, consider a system that mainly runs many programs at the same time and does a lot of swapping. In this case, it might be best to distribute the swap area in several partitions on different disks.
If improved I/O performance still does not occur after you have tuned as described previously, you may want to consider adding more hardware: disks, controllers, or memory.
If you are going to add more hardware to your system, how do you know which disk/controller to add? You can compare hardware specifications for currently supported disks and controllers by turning to your hardware Owner's Guide and looking up the system specifications. By using this information, you can choose the right disk/controller to suit your particular needs.
By balancing the most active file systems across controllers/disks, you can speed up disk access.
Another way to reduce the number of reads and writes that go out to the disk is to add more memory. This will reduce swapping and paging.
The CPU can only reference data and execute code that are loaded into memory. Because the CPU executes multiple processes, there may not be enough memory for all the processes. If you have very large programs, they may require more memory than is physically present in the system. So processes are brought into memory in pages; if there's not enough memory, the operating system frees memory by writing pages temporarily to a secondary memory area, the swap area.
Table 5-5 shows indications of excessive paging and swapping on a smaller system. Large servers will show much higher numbers.
Field | Value | sar Option
---|---|---
bswot/s (transfers from memory to disk swap area) | >200 | sar -w
bswin/s (transfers to memory) | >200 | sar -w
%swpocc (time swap queue is occupied) | >10 | sar -q
rflt/s (page reference faults) | >0 | sar -t
freemem (average free pages for user processes) | <100 | sar -r
There are several things you can do to reduce excessive paging/swapping:
limit the number of programs that run at peak times, and shift processes to non-peak hours (run batch jobs at night; see at(1)) or to another machine.
run multiple jobs in sequence, not in parallel.
if you increased various parameters (for example, nproc), decrease them again.
reduce page faults. Construct programs with "locality" in mind (see "Tuning an Application").
use shared libraries.
reduce resident set size limits with systune. See "System Limits Tunable Parameters" for the names and characteristics of the appropriate parameters.
add more memory.
You can also improve performance by adding swap partitions to spread swap activity across several disks or adding swap files on several partitions. For more information on swapping to files, see "Swap Space".
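For instance, an unused partition can be added as additional swap space on a running system (a sketch; the device name is illustrative, and swap(1M) and "Swap Space" cover the details):

swap -a /dev/dsk/dks0d2s1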
Why does a system sometimes page excessively? The amount of memory any one program needs at any given instant (its working set) should fit within physical memory. If, over a 5 to 10 second period, routines access more pages than can fit in physical memory, there will be excessive thrashing and paging. For example, a SCSI drive can read at a rate of 1.5 MB (384 pages) per second. Allowing for overhead and for other processes contending for memory, 50-100 pages per second is a reasonable rate at which to bring pages into memory. However, 100 pages per second over a sustained period will result in poor performance.
In summary, there will be excessive paging and thrashing if (1) the number of pages brought into physical memory is over about 100 pages per second for a sustained period of time, and (2) the working set size of all processes is larger than physical memory.
After looking at disk I/O and paging, next check CPU activity and memory allocation.
A CPU can execute only one process at any given instant. If the CPU becomes overloaded, processes have to wait instead of executing. You can't change the speed of the CPU (although you may be able to upgrade to a faster CPU or add CPU boards to your system if your hardware allows it), but you can monitor CPU load and try to distribute the load. Table 5-6 shows the fields to check for indications that a system is CPU bound.
Field | Value | sar Option
---|---|---
%idle (% of time CPU has no work to do) | <5 | sar -u
runq-sz (processes in memory waiting for CPU) | >2 | sar -q
%runocc (% of time run queue is occupied by processes not executing) | >90 | sar -q
You can also use the top(1) or gr_top(1) commands to display processes having the highest CPU usage. For each process, the output lists the user, process state flags, process ID and group ID, CPU cycles used, processor currently executing the process, process priority, process size (in pages), resident set size (in pages), amount of time used by the process, and the process name. For more information, see the top(1) or gr_top(1) reference pages.
To increase CPU performance, you'll want to make the following modifications:
off-load jobs to non-peak times or to another machine, set efficient paths, and tune applications.
eliminate polling loops (see select(2)).
increase the slice-size parameter (the length of a process time slice). For example, change slice-size from Hz/30 to Hz/10. However, be aware that this may slow interactive response time.
upgrade to a faster CPU or add another CPU.
"Checking for Excessive Paging and Swapping" described what happens when you don't have enough physical (main) memory for processes.
This section discusses a different problem - what happens when you don't have enough available memory (sometimes called virtual memory), which includes both physical memory and logical swap space.
The IRIX virtual memory subsystem allows programs that are larger than physical memory to execute successfully. It also allows several programs to run even if the combined memory needs of the programs exceed physical memory. It does this by storing the excess data on the swap device(s).
The allocation of swap space is done after program execution has begun. This allows programs with a large virtual address space to run as long as the actual amount of virtual memory allocated does not exceed the memory and swap resources of the machine.
Usually it's very evident when you run out of memory, because a message is sent to the console that begins:
Out of logical swap space...
If you see this message, one of the following is true:
the process has exceeded ENOMEM or UMEM.
there is not enough physical memory for the kernel to hold the required non-pageable data structures.
there is not enough logical swap space.
You can add virtual swap space to your system at any time. See "Swap Space" if you wish to add more swap. You need to add physical swap space, though, if you see the message:
Process killed due to insufficient memory
The following system calls will return EAGAIN if there is insufficient available memory: exec, fork, brk, sbrk (called by malloc), mpin, and plock. Applications should check the return status and exit gracefully with a useful message.
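A minimal C sketch of such a check around fork(2) follows (the message wording is illustrative):

/* Exit gracefully if fork(2) fails for lack of memory or swap. */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int
main(void)
{
    pid_t pid = fork();

    if (pid < 0) {
        if (errno == EAGAIN)
            fprintf(stderr, "fork: insufficient memory or swap; try again later\n");
        else
            perror("fork");
        exit(1);
    }
    if (pid == 0)
        _exit(0);       /* child work would go here */
    return 0;           /* parent continues */
}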
To check the size (in pages) of a process that is running, execute ps -el (you can also use top(1)). The SZ:RSS field will show you very large processes.
By checking this field, you can determine the amount of memory the process is using. A good strategy is to run very large processes at less busy times.
To see the amount of main memory, use the hinv(1) command. It displays data about your system's configuration. For example:
Main memory size: 64 Mb
To increase the amount of virtual memory, increase the amount of real memory and/or swap space. Note that most of the paging/swapping solutions also apply to ways to conserve available memory. These include:
limiting the number of programs
using shared libraries
adding more memory
decreasing the size of system tables
However, the most dramatic way to increase the amount of virtual memory is to add more swap space. The previous section that described using partitions explained how to do this and it is covered completely in "Swap Space".
The process of tuning your operating system is not difficult, but it should be approached carefully. Make complete notes of your actions in case you need to reverse your changes later on. Understand what you are going to do before you do it, and do not expect miraculous results; IRIX has been engineered to provide the best possible performance under all but the most extreme conditions. Software that provides a great deal of graphics manipulation or data manipulation also carries a great deal of overhead for the system, and can seriously affect the speed of an otherwise robust system. No amount of tuning will change these situations.
To tune a system, you first monitor its performance with various system utilities as described in "Monitoring the Operating System". This section describes the steps to take when you are tuning a system.
Determine the general area that needs tuning (for example, disk I/O or the CPU) and monitor system performance using utilities like sar(1) and osview(1M). If you have not already done so, see "Monitoring the Operating System".
Pinpoint a specific area and monitor performance over a period of time. Look for numbers that show large fluctuation or change over a sustained period; don't be too concerned if numbers occasionally go beyond the maximum.
Modify one value/characteristic at a time (for example, change a parameter, add a controller) to determine its effect. It's good practice to document any changes in a system notebook.
Use the systune(1M) command to change parameter values.
Remeasure performance and compare the before and after results. Then evaluate the results (is system performance better?) and determine if further change is needed.
Keep in mind that the tuning procedure is more an art than a science; you may need to repeat the above steps as necessary to fine tune your system. You may find that you'll need to do more extensive monitoring and testing to thoroughly fine tune your system.
Before you can tune your system, you need to know the current values of the tunable parameters. To find the current value of your kernel parameters, use the systune(1M) command. This command, entered with no arguments, prints the current values of all tunable parameters on your system. For complete information on this command, see the systune(1M) reference page.
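For example, to inspect a single parameter such as nproc, you can filter the full listing through grep (a convenience sketch, not a documented systune mode):

systune | grep nproc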
After determining the parameter or parameters to adjust, you must change the parameters, and you may need to reconfigure the system for the changes to take effect. The systune(1M) utility tells you, when you make parameter changes, whether you must reboot to activate those changes. There are several steps to the reconfiguration procedure:
back up the system
copy your existing kernel to unix.save
make your changes
reboot your system, if necessary
Before you reconfigure the system by changing kernel parameters, it's a good idea to have a current and complete backup of the system. See Chapter 6, "Backing Up and Restoring Files," for back-up procedures.
Caution: Always back up the entire system before tuning.
After determining the parameter you need to change (for example, you need to increase nproc because you have a large number of users), you must first back up the system and the kernel. Give the command:
cp /unix /unix.save
This command creates a copy of your kernel. Through the rest of this example, this is called your old saved kernel. If you make this copy, you can always go back to your original kernel if you are not satisfied with the results of your tuning.
Once your backups are complete, you can execute the systune(1M) command. An invocation of systune(1M) to increase nproc looks something like this:
systune -i
Updates will be made to running system and /unix.install
systune-> nproc
nproc = 400 (0x190)
systune-> nproc = 500
nproc = 400 (0x190)
Do you really want to change nproc to 500 (0x1f4)? (y/n) y
In order for the change in parameter nproc to become effective
/unix.install must be moved to /unix and the system rebooted
systune-> quit
To boot the kernel you just made, give the command:
mv /unix.install /unix
Then reboot your system. Also, be sure to document the parameter change you made in your system log book.
The systune command creates a new kernel automatically. However, if you changed parameters without using systune, or if you have added new system hardware (such as a new CPU board on a multiprocessor system), you will need to use autoconfig to generate a new kernel. To build a new kernel after reconfiguring the system, follow these steps:
Become the Superuser by giving the command:
su
Give the command:
/etc/autoconfig -f
This command creates a new kernel and places it in the file /unix.install.
Make a copy of your current kernel with the command:
cp /unix /unix.save
Reboot your system with the command:
reboot
Caution: When you issue the reboot command, the system overwrites the current kernel (/unix) with the kernel you have just created (/unix.install). This is why you should always copy the current kernel to a safe place before rebooting.
An autoconfiguration script, found in /etc/rc2.d/S95autoconfig, runs during system start-up. This script asks you if you would like to build a new kernel under the following conditions:
A new board has been installed for which no driver exists in the current kernel.
There have been changes to object files in /var/sysgen/mtune, master files in /var/sysgen/master.d, or the system files in /var/sysgen/system. This is determined by the modification dates on these files and the kernel.
If any of these conditions is true, the system prompts you during startup to reconfigure the operating system:
Automatically reconfigure the operating system? y
If you answer y to the prompt, the script runs lboot and generates /unix.install with the new image. You can disable the autoconfiguration script by renaming /etc/rc2.d/S95autoconfig to a name that does not begin with the letter S, for example, /etc/rc2.d/wasS95autoconfig, as shown below.
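For example (the paths follow the convention just described):

mv /etc/rc2.d/S95autoconfig /etc/rc2.d/wasS95autoconfig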
The following procedure explains how to recover from an unbootable /unix, and describes how to get a viable version of the software running after an unsuccessful reconfiguration attempt. If you use the systune(1M) utility, you should never have to use this information, since systune will not allow you to set your parameters to unworkable values.
If the system fails to reboot, try to reboot it a few more times. If it still fails, you need to interrupt the boot process and direct the boot PROM to boot from your old saved kernel (unix.save).
Press the reset button. You will see the System Maintenance Menu:
System Maintenance Menu
1) Start System.
2) Install System Software.
3) Run Diagnostics.
4) Recover System.
5) Enter Command Monitor.
Select option 5 to enter the Command Monitor. You see:
Command Monitor. Type "exit" to return to the menu. >>
Now at the >> prompt, tell the PROM to boot your old saved kernel. The command is:
boot unix.save
The system will then boot the old saved kernel.
Once the system is running, use the following command to move your old saved kernel to the default /unix name. This method also keeps a copy of your old saved kernel in unix.save:
cp /unix.save /unix
Then you can normally boot the system while you investigate the problem with the new kernel. Try to figure out what went wrong. What was changed that stopped the kernel from booting? Review the changes that you made.
Did you increase/decrease a parameter by a huge amount? If so, make the change less drastic.
Did you change more than one parameter? If so, make a change to only one parameter at a time.