Software Installation Administrator's Guide

Appendix A. Installation Troubleshooting


Resolving Errors

This section discusses Inst error messages in detail. Error messages, possible reasons for the error, and possible solutions are provided. Errors are grouped according to how they are generated.

Error messages are shown in a typewriter-style font and are followed by indented explanatory text. For example:

Example of an error message

This is text describing the possible causes and solutions to the condition which produced the error.

Variables within the text of the error message (for example, hostnames) are shown in italics.

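This section contains the following subsections: "Errors Loading the Miniroot", "Errors While Starting an Installation Session", "Pre-Installation Check Errors", "Errors While Installing and Removing Software", and "Errors On Leaving an Installation Session (RQS Errors)".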

Errors Loading the Miniroot

This section discusses three types of errors that can occur when you are loading the miniroot: errors loading the miniroot from a local CD, errors loading the miniroot from a remote host, and other errors.

Errors Loading the Miniroot From Local CD

dk<unit> error: unrecognized scsi disk drive
dksc(0,<unit>,8)sash.<cpu>: Command not found

If you see one of these errors while you are using a local CD during a miniroot installation, perform these actions to correct the problem:


  1. Check to make sure a caddy with a CD is in the drive.

  2. Press the Reset button on the workstation main unit and begin the installation process again.

  3. Follow the procedure in "Verifying That a CD-ROM Drive Is Recognized".

CD-ROM drive not recognized

If you see this error while using local CD-ROM during an IRIX Installation, a possible cause is that IRIX doesn't recognize that the CD-ROM drive is present.

Perform these actions to correct the problem:


  1. Shut down the workstation, verify that the CD-ROM drive is connected and turned on, and start the installation process over again.

  2. If shutting down the workstation does not correct the problem, try turning the CD-ROM drive off and then on again.

  3. See the section "Resolving Problems With CDs".

dks0d3s8: Unexpected blank media: ASC=0x64
dks0d3s8: Can't read volume header
Error 20 while loading scsi(0)cdrom(3)partition(8)sashARCS

If you see these errors while using a CD distribution source during a miniroot installation, it is possible that the program cdman(1) was terminated, which left the CD-ROM drive in audio mode rather than in data mode.

To correct this problem, use the following procedure:


  1. Quit Inst.

  2. Return to the PROM Monitor.

  3. Press the Reset button on the workstation.

  4. Begin the installation again.

  5. If problems persist, refer to the section "Resolving Problems With CDs".

Errors Loading the Miniroot From Remote Host

In addition to the errors discussed in this section, refer to "Network Problem Diagnosis During Miniroot Installation" for a discussion of network problems that may occur during a miniroot installation.

No server for server:path(sash.cpu)
Unable to load bootp()server:path(sash.cpu):file not found

or

No server for server:CDdir/dist(sash.cpu)
open(bootp)server:CDdir/dist(sash.cpu) failed, errno = 6
Unable to load bootp()server:CDdir/dist(sash.cpu):file not found

If you see either of these errors during a miniroot installation, the cause might be an incorrect specification of the remote distribution source. To correct the error, enter the setenv command again. Specify the full, correct path to the distribution source, and be sure to include the /sa at the end of your specification. Then, enter the boot command again.
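The specification in these messages follows the pattern bootp()server:path/sa(sash.cpu). As a minimal sketch, assuming a POSIX shell is available on another system for composing the string, the helper below appends the /sa component when it is missing. The function name build_boot_spec is hypothetical and is not part of Inst or the PROM Monitor; it only assembles text to compare against what you entered.

```shell
# Hypothetical helper: compose a remote boot specification of the form
# bootp()server:path/sa(sashname), as it appears in the messages above.
build_boot_spec() {
    server=$1
    path=$2
    sash=$3
    # Strip one trailing slash, then append /sa unless it is already there.
    path=${path%/}
    case $path in
        */sa) ;;
        *) path=$path/sa ;;
    esac
    printf 'bootp()%s:%s(%s)\n' "$server" "$path" "$sash"
}
```

Calling build_boot_spec with a server name, a distribution path, and a sash name prints one specification string. A path that already ends in /sa is left unchanged.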

TFTP error: I/O error (code 0)
Unable to load bootp()server:path(sash.cpu): ''bootp()server:path/sa(sash.cpu)'' is not a valid file to boot.

or

TFTP error: Access violation (code 2)
bootp()server:path/sa(sash.cpu): invalid
Unable to load bootp()server:path/sa(sash.cpu): ''bootp()server:path/sa(sash.cpu)'' is not a valid file to boot.

or

bootp()server:path/sa(sash.cpu): invalid
Unable to load bootp()server:path/sa(sash.cpu): ''bootp()server:path/sa(sash.cpu)'' is not a valid file to boot.

If you see any of these errors after specifying a remote distribution during a miniroot installation, work through the following checks to find the cause:


  1. Check server (the installation server name), path (the distribution directory), and cpu (the CPU number) to make sure that you have spelled them correctly and that they exist.

  2. Try to load the miniroot using the instructions in Chapter 3; you may see additional error messages that help you determine the cause of the problem.

  3. Check the inetd.conf file on the installation server. The line containing tftp should be modified and inetd should be restarted, as explained in "Setting Up an Installation Server".

  4. Check the inetd.conf file on each router between the target and installation server systems to verify that it has been modified, as explained in "Setting Up an Installation Server".

  5. Check Ethernet or other network cables and connections on the local and installation servers.

  6. Check the netaddr variable on the target to make sure that it is set correctly (see Step 3 on page 190).

  7. If possible, check the network connection to the installation server from a different system on the same network (see "Checking Network Connections").

  8. If the target system is a router (has multiple network connections) you might need to change its network connections so that the "normal" network device is connected to the same network as the installation server. Booting the miniroot is not supported over FDDI.

  9. If necessary, bring up IRIX on the target system and check the network connection (see "Checking Network Connections").

  10. Check the distribution directory as described in "Checking Distribution Directories and CDs".

  11. Perform any additional procedures that are necessary to identify the problem. See the section "Resolving Network Problems".
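The inetd.conf check in items 3 and 4 can be partly automated. The sketch below, which assumes a POSIX shell and grep on the server or router, only confirms that an active (not commented-out) tftp entry exists in the file you name. It does not verify the specific modification described in "Setting Up an Installation Server". The function name tftp_enabled is hypothetical.

```shell
# Hypothetical helper: succeed if an inetd.conf-style file has an
# active (not commented-out) tftp entry.
tftp_enabled() {
    conf=$1
    # An active entry begins with the service name; a leading '#' disables it.
    grep '^[[:space:]]*tftp[[:space:]]' "$conf" >/dev/null 2>&1
}
```

The function returns zero when an active entry is found and nonzero otherwise, so it can be used directly in an if statement.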

file file not found in server:path/sa; directory contains:
...
Unable to load bootp()...

or

File CDdir/dist/sa not found on server server
Unable to load bootp()server:CDdir/dist/sa(sash.cpu): no such file or directory

If you see either of these errors during a miniroot installation, take the following actions:


  1. Examine the last command you entered and look for a spelling or capitalization error, the wrong CPU in the sash.cpu portion of the command, sash.IP12 rather than sashIP12, sash.IP17 rather than sashIP17, or any sash command with IP19, IP20, or IP22 rather than sashARCS. Enter the command again with the correct spelling.

  2. Check /var/adm/SYSLOG on the installation server to see whether it contains bootp messages. If SYSLOG contains bootp messages, bootp is running. The likely cause of the problem is that netaddr is set incorrectly on the target system.

  3. If the installation server has multiple network interfaces, try specifying the hostname for each interface in turn. This sometimes resolves routing problems. To display the hostname for each interface, give this command:

    % /usr/etc/netstat -i

     The Address column in the output contains hostnames.

  4. Perform additional checks, as described in "Resolving Network Problems".
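The sash naming rules in item 1 can be written down as a small lookup. This is a sketch covering only the CPU types named in that item (IP12, IP17, IP19, IP20, and IP22); the function name sash_name is hypothetical, and any other CPU type is reported as unknown and not guessed.

```shell
# Hypothetical helper: map a CPU type to the sash name listed in item 1.
sash_name() {
    case $1 in
        IP12|IP17) printf 'sash%s\n' "$1" ;;
        IP19|IP20|IP22) printf 'sashARCS\n' ;;
        *) printf 'unknown CPU type: %s\n' "$1" >&2; return 1 ;;
    esac
}
```

Comparing the printed name with the name in the command you entered makes a stray period or a wrong CPU type easier to spot.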

Installation tools not found at server:CDdir/dist

or

Installation tools not found at server:path

If you see either of these errors during a miniroot installation from a remote distribution source, the CD or distribution directory that you specified might not contain installation tools. To correct this problem, confirm that the distribution source contains the installation tools (the sa file).
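On the installation server, the presence of the sa file can be confirmed from a shell. The following is a minimal sketch, assuming the distribution is an ordinary directory; the function name has_install_tools is hypothetical.

```shell
# Hypothetical helper: succeed if the distribution directory contains
# the installation tools file named sa.
has_install_tools() {
    [ -f "$1/sa" ]
}
```

The check says nothing about whether the sa file itself is intact, only that it exists.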

bootp()server:path/sa/(sash.cpu) is not in a.out format

If you see this error message after you initiate miniroot loading, it may have one of the following causes:

Other Errors Loading the Miniroot

Unable to load dksc(cntlr,unit,8)sashcpu: file not found

or

dksc(cntlr,unit,8)sashcpu: invalid
Unable to load dksc(cntlr,unit,8)sash.cpu: file not found

or

open(bootp()server:CDdir/dist/sa(sash.cpu)) failed, errno=2
Unable to load bootp()server:CDdir/dist/sa(sash.cpu): file not found

If you see any of these errors during miniroot installation from CD, it may be that you are trying to load the miniroot from a CD that does not contain installation tools.

Switch to a CD that includes installation tools to load the miniroot, then switch back to your original CD.

root and swap are on the same partition. Either the system is misconfigured or a previous installation failed. If you think the miniroot is still valid, you may continue booting using the current miniroot image. If you are unsure about the current state of the miniroot, you can reload a new miniroot image. Finally, you may abort the installation and return to the PROM; in this case you will need to use the `fx' program to correct the disk label information. See the `Software Installation Guide' chapter on Troubleshooting for more information.
Enter `c' to continue booting the currently loaded miniroot.
Enter `r' to reload the miniroot.
Enter `a' to abort the installation.
Enter your selection and press ENTER (c, r, or a)

This error message only occurs with the Indy workstation. If you are not using an Indy workstation but see a similar error, refer to the discussion of the next error. This error (or the next one) occurs when you try to load the miniroot after a power failure or system restart has occurred during an installation. If you had attempted a system restart instead of loading the miniroot, you would have automatically been placed in the version of Inst that is in the previously-installed miniroot.

Take one of the following corrective actions:


  1. Enter c if you want to install software with the currently loaded miniroot.

  2. Enter r if you want to reload the miniroot. You might want to do this if, for example, the current version of the miniroot is corrupt, or if you want to load another version of the miniroot.

  3. Enter a to abort the installation and to go back to the command monitor. You can do this, for example, if you want to use the fx command to correct boot information and boot from the root partition, or if you want to abort the installation and restart the system.

Note: Entering c and quitting Inst fixes the boot information. You can then restart the system after Inst is loaded without using the fx command. If you are familiar with fx and want to use it, refer to the procedure documented in "Using fx to Restore the Swap Partition".

root and swap are on the same partition. This is most likely because a previous installation was in progress. If so, you may continue the boot into the miniroot. Otherwise the partition info needs to be corrected. Do you wish to continue booting (y or n)

This is the version of the previous error message for systems that predate the Indy. If you see this error after you have given the command to copy the miniroot to the swap partition, it may be that the power failed or the system was reset during installation, and the miniroot is still in the swap partition.

First decide whether you need to return to Inst to complete your installation, or are ready to restart your system. If you need to return to Inst, answer y to the question. The Inst Main Menu should appear, and you can finish your installation. To restart your system and cause it to boot normally from the root partition, enter n and you will be returned to the PROM Monitor where you can choose to start the system.

Errors While Starting an Installation Session

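The error discussions in this section are grouped as follows: "Wrong Diskless Modes", "Errors Starting Live Installation", "Inst Library libinst.so Errors", and "Errors in the Distribution".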

In general, check what Inst is using as the default distribution. You may have to use the from command to point it at the desired distribution.

Wrong Diskless Modes

ERROR : Unable to start inst: /root appears to be a diskless client tree, since the file /root/var/inst/.client is present. If you are certain that /root is not a diskless client tree, remove the file /var/inst/.client and restart inst, otherwise restart in client mode using client_inst(1m).
ERROR : Unable to start inst: / appears to be a share tree for diskless client since the file /var/inst/.share is present. If you are certain that / is not a diskless share tree, remove the file /var/inst/.share and restart inst, otherwise restart in share mode using share_inst(1m). 

These messages mean that Inst believes that the target is a diskless client tree, because in a previous installation Inst was invoked in "diskless" mode reserved for the diskless installation tools share_inst(1M) and client_inst(1M).

If the target has been previously created as a diskless tree, then continuing with a normal (non-diskless) installation would severely corrupt the installed software. You should only attempt diskless installations using share_inst(1M) and client_inst(1M).

However, if you are certain that the target is not used for diskless installations, remove the files /var/inst/.share and /var/inst/.client (or, if in the miniroot, /root/var/inst/.share and /root/var/inst/.client). Then restart Inst.

If you are performing a miniroot installation, Inst will exit abnormally and prompt you to restart the system (y), enter Inst (n), or start a shell (sh). Choose sh:

Ready to restart the system? (y, n, sh) sh
# rm /root/var/inst/.share 
# rm /root/var/inst/.client
# exit

(You use only the /root prefix to the path for miniroot installations.) Then return to Inst:

Ready to restart the system? (y, n, sh) n
...
Inst> 

Errors Starting Live Installation

These errors occur when starting Inst from IRIX.

Sorry! The system is not set up for non-miniroot installations of all the selected subsystems, since the configuration file /var/inst/inst_special is missing. Try the installation again from the miniroot.

You may not perform a live installation of some subsystems (labeled with b by the list command) without the inst_special configuration file present. If you are unable to obtain this file from another system, you must perform the installation from the miniroot.

Another inst is currently running

You may not have two copies of Inst running in read/write mode to the same target simultaneously. The second session is run in read-only mode.

Inst determines this by looking for a file called $rbase/var/inst/inst.lock. ($rbase is the root directory for the current software installation.) In rare cases, it may be necessary to remove this file by hand.
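Before removing the lock file by hand, it can help to confirm where it is. The sketch below assumes a POSIX shell and takes the installation root ($rbase in the text) as its argument; the function name inst_lock_present is hypothetical. It only reports and never deletes anything.

```shell
# Hypothetical helper: report whether an Inst lock file exists under
# the given installation root (rbase). Nothing is removed.
inst_lock_present() {
    rbase=$1
    lock=$rbase/var/inst/inst.lock
    if [ -e "$lock" ]; then
        printf 'lock file present: %s\n' "$lock"
        return 0
    fi
    printf 'no lock file at %s\n' "$lock"
    return 1
}
```

For a live installation the argument would be empty, so that the path resolves to /var/inst/inst.lock; for a miniroot installation it would be /root.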

A previous installation session was not completed successfully.

This error means a previous version of Inst was interrupted or killed before it completed all the actions requested by the user. Information on the state of the last session has been saved in the file $rbase/var/inst/.checkpoint. For more information on recovering from the checkpoint file, see "If Inst Is Interrupted".

Inst Library libinst.so Errors

The Inst products (inst, swmgr, showfiles, and showprods) all link with the libinst.so dynamic object. If, when starting one of these programs, an rld error appears regarding libinst.so, it is probable that you have an incompatibility between the binary and libinst.so. In this situation, it is best to reinstall eoe1.sw.unix from the miniroot to get the latest versions of these products.

26379:inst: rld: Fatal Error: cannot map soname 'libinst.so' using any of the filenames
/usr/lib/libinst.so:/lib/libinst.so:/lib/cmplrs/cc/libinst.so:/usr/lib/cmplrs/cc/libinst.so:
-- either the file does not exist or the file is not mappable (with reason indicated in previous msg)

This error message means the libinst.so file is missing.

852:swmgr: rld: Error: unresolvable symbol in swmgr:
post__15VkDialogManagerFPCcPFP10_WidgetRecPvT2_vN22PvT1P14VkSimpleWindow

This error message indicates that the libinst.so file is present but not the right version.

Errors in the Distribution

ERROR : No such host: host

This error can appear after executing a command that requires access to a distribution through the network.

The most likely cause is a bad hostname. Check the hostname and use the from command to set the correct distribution location.

If the host name appears correct and there was a delay before the error message appeared, it is possible that your system is experiencing network problems. See the section "Resolving Network Problems" for information on resolving this problem.

ERROR : The distribution dist:/pathname does not exist.

This error occurs when a command attempts to reference the distribution, but the distribution path references a non-existent directory or a product file. For example:

Inst> from dist:/sgi/baddir
Connecting to dist ...
ERROR : The distribution dist:/sgi/baddir does not exist.
Inst> from dist:/sgi/hacks/badprod
Connecting to dist ...
ERROR : The distribution dist:/sgi/hacks/badprod does not exist.
Inst> from /host/dist/sgi/baddir
ERROR : The distribution /host/dist/sgi/baddir does not exist.

Determine the correct pathname and use the from command to set the correct distribution location.

ERROR : The product host:/path/sc is bad.

This error occurs if the distribution specified references a file that is not a valid product file. For example:

Inst> from /usr/tmp/file
ERROR : The product /usr/tmp/file is bad.

Note that when referencing an individual product, the product file must be used. In the following error, the product was incorrectly specified using the idb file:

Inst> from dist:/sgi/hacks/sc.idb
Connecting to dist ...
ERROR : The product dist:/sgi/hacks/sc.idb is bad.

The product sc should be specified as follows:

Inst> from dist:/sgi/hacks/sc
Connecting to dist ...

ERROR : The distribution host:/path does not contain any products.

This error results when the distribution directory specified does not contain any product files. You must specify the correct distribution directory.

Missing products in listing

If a product prod appears in a distribution directory along with its idb file (prod.idb) and image files (prod.image ...), but does not appear in the product listing in Inst, then the product files may be corrupt.

Use ls to make sure that the product files are in the distribution directory. Make certain that you are viewing all of the products in the distribution by executing the following commands:

Inst> view dist
Current Location : distribution
Inst> view filter all
Inst> list

If the product is still not visible, the product was not read in and the product files are probably corrupt. See the section "Checking Distribution Directories and CDs" for more information.
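The ls check described above can be scripted on the system that holds the distribution. This is a minimal sketch that looks only for the product file and its idb file; image files are not checked because their names vary by product. The function name check_product_files is hypothetical.

```shell
# Hypothetical helper: report a missing product file or idb file for
# one product in a distribution directory.
check_product_files() {
    dir=$1
    prod=$2
    rc=0
    for f in "$dir/$prod" "$dir/$prod.idb"; do
        if [ ! -f "$f" ]; then
            printf 'missing: %s\n' "$f"
            rc=1
        fi
    done
    return $rc
}
```

A zero return status with no output means both files exist. It does not show that they are uncorrupted.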

Pre-Installation Check Errors

When you give the go command, Inst executes the pre-installation check before installing any files. If any errors are detected during this check, Inst lists the problems and returns to the main menu without installing or removing software.

Not enough space on / for the new unix kernel
Not enough space on /usr for requickstart overhead (see rqs(1))
Not enough space on /usr for the installation overhead
Not enough space on / (additional 85kbytes required)

These errors mean that you need to make more disk space available (in these examples on the / and /usr filesystems), or select fewer subsystems for installation.

Note: A live installation usually requires extra temporary disk space. Because some of the files to be upgraded are currently in use, either by the operating system or by running applications, Inst must maintain multiple copies of these files during a live installation and, in some cases, until you reboot the computer.

If you are running a live installation, you may encounter a situation where there is enough available disk space for all the new software, but not enough additional temporary disk space to accomplish the installation. In this situation, try closing some applications, and then give the go command again. If there is still not enough space, you may have to run the installation in the miniroot.
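When looking for disk space to free, a search for large files is a reasonable first step. The sketch below assumes a find command that follows the POSIX rules for -size (counted in 512-byte blocks) and supports -xdev; whether every system's find behaves this way is an assumption. The function name large_files and the default threshold are hypothetical.

```shell
# Hypothetical helper: list regular files larger than a given size
# (in kilobytes) under a directory, without crossing mount points.
large_files() {
    dir=$1
    kbytes=${2:-1024}
    # POSIX find counts -size in 512-byte blocks, so double the kilobytes.
    find "$dir" -xdev -type f -size +"$((kbytes * 2))" -print
}
```

From a miniroot shell started with sh, the directory would carry the /root prefix, for example large_files /root/usr/tmp.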

The installation request will install or remove files in the following nfs-mounted filesystems:
 /filesystem
Please cancel or confirm the request.
1. Cancel the installation request
2. Continue the installation request
Please enter a choice [1]: 

Inst issues this warning to protect against accidental installation of files into NFS-mounted directories. Normally, software installations are made on the local host. If you really want to install across an NFS mount, choose 2. Otherwise, cancel the installation (1), return to the Main Menu, and use the keep command to install fewer subsystems.

Note: To disable this confirmation, set the preference confirm_nfs_installs to off.

directory /pathname is write-protected
nfs-mounted directory /pathname is write-protected
filesystem /pathname is mounted read-only
nfs-mounted filesystem /pathname is read-only

Any of these messages means that you lack the appropriate permission to install all the files in the selected products.

This is usually an indication that you are using NFS to share filesystems on a remote host, and some of the subsystems selected for installation install files into those remote filesystems.

Check your selections to make sure you are not installing or removing "shared" software such as online books or manual pages. Use the keep command to deselect those products.

Errors While Installing and Removing Software

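This section contains the following subsections: "Disk Space Errors", "Sub-Command and Exitop Errors", "Network Timeout Errors", "Archive Corrupt Errors", and "Device Busy Errors".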

These errors cause the following Error/Interrupt menu to appear automatically:

Error/Interrupt Menu
 1. retry              Retry the failed operation
 2. stop               Terminate current command
 3. continue           Continue current command
 4. set [preferences]  List all preferences or set/clear a preference
 5. help [topic]       Get help in general or on a specific word
 6. sh [cmd]           Escape to a shell or run a command
 7. shroot [cmd]       Escape to a chrooted shell or run a command
Interrupt>

If the pre-installation check completes without errors, Inst begins installing and removing files. If an error occurs after this point, Inst stops and presents the interrupt menu. First try to correct the cause of the error, and then choose retry from the interrupt menu.

If this doesn't work, or you are unable to correct the problem, you can choose stop to cancel the installation immediately and return to the main menu.

Caution: If you stop the installation, the current image in progress (such as eoe1.sw) will be in an inconsistent state (partially installed/removed). The installation history will not have been updated for these subsystems (eoe1.sw.*). You are strongly advised to either re-install these products (just select Go at the main menu to re-start the installation from the beginning of the partial image) or, for products not marked "required," remove them completely.

Disk Space Errors

Despite efforts to accurately predict the required disk space, Inst may occasionally fail during the installation with an error such as:

ERROR : An error occurred while Installing new versions of selected product subsystems
Write of pathname failed: No space left on device

This produces the Error/Interrupt menu (see above). Use the shroot command to enter the shell. Remove or compress unnecessary large files, exit the shell, and retry the operation. If you are unable to locate any expendable files, stop the installation and choose fewer subsystems for installation. For example:

Interrupt> shroot
# df
Filesystem  Type  blocks     use  avail %use Mounted on
/dev/root    efs 1939714 1939702     12 100% /
# ls -l /usr/tmp/core.*
-rw------- 1 guest guest 20971520 Oct 20 01:00 /usr/tmp/core.0
-rw------- 1 guest guest 0        Oct 20 01:00 /usr/tmp/core.1
-rw------- 1 guest guest 3145728  Oct 20 01:01 /usr/tmp/core.3
# rm /usr/tmp/core.0 /usr/tmp/core.1
# compress /usr/tmp/core.3
# df
Filesystem Type   blocks     use avail  %use Mounted on
/dev/root   efs  1939714 1892566 47148  98%  /
# exit
Interrupt> retry
Installing new versions of selected pv.man subsystems
Installing new versions of selected pv.sw subsystems
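The df output in the transcript above can also be filtered for filesystems that are nearly full. The sketch below assumes awk is available and that df prints a header line with a %use column followed by the mount point, as in this guide's examples; other df formats are not handled. The function name full_filesystems and the default threshold of 95 are hypothetical.

```shell
# Hypothetical helper: read df output on standard input and print the
# mount point and usage of each filesystem at or above a threshold.
# Assumes a header line that contains a "%use" column.
full_filesystems() {
    awk -v limit="${1:-95}" '
        NR == 1 {
            # Locate the %use column by name in the header line.
            for (i = 1; i <= NF; i++) if ($i == "%use") col = i
            next
        }
        col {
            pct = $col
            sub(/%/, "", pct)
            # The mount point is the field after the %use column.
            if (pct + 0 >= limit + 0) print $(col + 1), pct "%"
        }
    '
}
```

A typical use is df | full_filesystems 95, run from the shell reached with shroot.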

If there is still not enough disk space, consider the possibility that you may not need some large files on your workstation. The list below gives filenames relative to root, but remember that if you are doing a miniroot installation, /root must be prepended to each of the filenames if you escape to the shell with sh. If you escape to the shell with shroot or are using IRIX Installation, use the filenames as given. Look for these large files:

Sub-Command and Exitop Errors

As part of the installation procedure, Inst executes sub-commands. These are UNIX shell commands that perform special initialization functions specific to each product. For example, some products use sub-commands to install a custom icon in the system Icon Catalog. Some sub-commands, called exit-commands, or exitops, run at the end of the installation, and sometimes originate from more than one subsystem.

Stderr: Cannot create pathname: No such file or directory 
ERROR : An error occurred while Installing new versions of selected product subsystems
Command "command"

If a sub-command fails during the installation of a specific product, an interrupt menu is also presented. The sub-commands that run at the end of the installation, during the "Exit-Commands" phase, may affect multiple subsystems. Inst displays any errors from these "exitops" but does not present the interrupt menu.

If an interrupt menu is presented, try to gauge from the error message the cause and severity of the problem. The error could indicate that the affected product won't function completely or correctly, or that the system might fail to boot. Decide whether to ignore the error and continue, to fix the problem and retry, or stop and return to the Inst main menu.

Consult the release notes of any affected product for further information. For example, the release notes may specify a particular order in which the software subsystems must be installed in order to function properly.

Network Timeout Errors

Connecting to host ...
host.domain: Interrupted system call
Host host is not responding, retrying
host.domain: Interrupted system call
Host host is not responding, retrying
host.domain: Interrupted system call
ERROR : Timed-out waiting for host

Inst presents the Error/Interrupt menu. See the section "Resolving Network Problems" to determine the cause of the network failure. You may need to continue the installation at a later time, depending on the availability of that host.

If the network is merely slow, or the server is heavily loaded, use the set command to raise the value of the timeout and/or network_retry preferences.

Archive Corrupt Errors

File filename not in compressed format
Compressed input file is corrupt (internal overflow)
Unexpected EOF
Can't open archive: archive
Archive archive is in an unrecognized format
Archive archive is corrupt

Inst is unable to properly extract files from the software distribution, which is compressed in a special format. If you are installing over a network, check the system logs for signs of network errors (see "Resolving Network Problems").

If you are performing a live installation, you may need to use a newer version of the installation tools. When the distribution format is upgraded, older versions of Inst cannot always read more recent software distributions, although newer versions can read older ones. Use Inst from the miniroot, preferably the miniroot that accompanies the software upgrade you are trying to install.

Device Busy Errors

filesystem: Device Busy

There may be a file open in the named filesystem if you get this error. Quit Inst and then re-invoke it to force it to close the open file. For example, if you were trying to unmount all filesystems from Inst Admin:

Admin> umount -a
</root/usr: Device Busy error messages>
Admin> return
Inst> quit
Ready to restart the system. Restart? { (y)es. (n)o, (sh)ell, (h)elp } n
Inst> admin
Admin> umount -a

Errors On Leaving an Installation Session (RQS Errors)

An error has occurred while requickstarting your system. No loss of functionality occurred.

A requickstart failure simply indicates that some files were not requickstarted. The net effect is that the startup time of the failed binary will be slightly slower than had it been successfully requickstarted. The error message will also provide the name of a log file where there is a detailed explanation of the RQS error(s). See rqs(1) for a detailed explanation of requickstart.

Sproc of /usr/etc/rqsread failed
Sproc of /usr/etc/rqsall failed
/usr/etc/rqsread terminated abnormally
/usr/etc/rqsall terminated abnormally

These messages indicate that you probably need to upgrade your system to get newer versions of these files.

/usr/etc/rqsread terminated abnormally due to signal #
/usr/etc/rqsall terminated abnormally due to signal #

These messages indicate that the named process was killed due to a signal. The relevant signal number will be provided so it will be possible to determine the cause of the termination.
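A signal number from these messages can be translated into a signal name from a shell. The sketch below relies on the POSIX form kill -l followed by a number; whether the shell on a given system supports that form is an assumption. The function name signal_name is hypothetical.

```shell
# Hypothetical helper: translate a signal number into its name
# using the POSIX "kill -l number" form.
signal_name() {
    kill -l "$1"
}
```

The printed name can then be looked up in the system's signal documentation to learn what kind of event ended the process.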




Copyright © 1997, Silicon Graphics, Inc. All Rights Reserved.