Portlock Storage Manager

NetWare SnapShot Backup and Portlock Online Imaging

Contents

  • Introduction
  • Novell SnapShot Backup
  • Portlock Online Imaging
  • Common reasons that NetWare SnapShot Backup or Portlock Online Imaging fails
  • NSS Pools must be in a stable and warning free state for NetWare's Snapshot feature to work correctly
  • If you are seeing pool errors during an Online Image
  • Portlock Storage Manager Log files
  • NSS /PoolVerify Log files
  • NSS /PoolRebuild Log files
  • Reporting issues to Portlock

Introduction

Data backup and recovery is probably one of the most essential tasks of a system administrator. In the event of a disaster, a successful recovery of data and services will determine whether you get your raise or start looking for a new job. Yet many administrators do not pay much attention to backup and recovery strategy. Many put absolute faith in the backup software logs that declare that the day's backup completed successfully. However, we forget that a successful backup is not the end goal; a successful restore in the event of a disaster is. Herein lies the trap of false security. In the event of a disaster, without a clear and tested recovery plan, an administrator may spend the next two days trying to figure out why data cannot be recovered. It does not help that your users and boss are breathing down your neck.

Novell SnapShot Backup

The Novell SnapShot Backup feature alleviates problems encountered when backing up open files. SnapShot creates a virtual image, or snapshot, of a volume at a particular point in time. Once the snapshot is created, further changes are stored as "deltas" from the snapshot, so that the delta data plus the active volume can create a complete replica of data storage. Novell's innovative SnapShot approach allows a pool snapshot to be created in 10 to 20 percent of the size of the original pool and in a minimum amount of time. NetWare can support up to 500 active snapshots on a given NSS storage pool (500 different snapshots in time on the disk).
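The delta idea can be illustrated with a toy copy-on-write model. This is only a conceptual sketch, not the NSS on-disk format: the `Pool` class and its methods are hypothetical names, and it assumes that a delta holds the original contents of each block that changed after the snapshot was taken.

```python
class Pool:
    """Toy block store with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # the active pool data
        self.snapshots = []          # one delta map per active snapshot

    def create_snapshot(self):
        # Creating a snapshot copies nothing: it starts with an empty delta map.
        delta = {}
        self.snapshots.append(delta)
        return delta

    def write(self, index, data):
        # Before a block is overwritten, preserve its original contents in
        # every snapshot that has not already saved that block.
        for delta in self.snapshots:
            delta.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, delta, index):
        # Snapshot view = saved original if the block changed, else active data.
        return delta.get(index, self.blocks[index])


pool = Pool(["a", "b", "c", "d"])
snap = pool.create_snapshot()
pool.write(1, "B")
snapshot_view = [pool.read_snapshot(snap, i) for i in range(4)]
print(snapshot_view)   # ['a', 'b', 'c', 'd']
print(pool.blocks)     # ['a', 'B', 'c', 'd']
print(len(snap))       # 1
```

In this model the snapshot costs space only for blocks that change afterwards, which is why a snapshot can be much smaller than the pool it describes.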

Snapshots provide an instant copy of a volume's data that would otherwise be difficult to back up because of open files. Novell's "freeze/thaw" technology provides a consistent data set that facilitates non-disruptive backups. In contrast to a traditional, full-data copy of the pool, the block-level copy takes only a moment to create and occurs transparently to the end user.

The Novell freeze/thaw interface manages snapshot events so that applications (databases, GroupWise®, etc.) are informed that a snapshot is about to take place. Applications then prepare file system data by making it consistent and flushing pending transactions. Each application indicates its frozen or ready state to the NSS system, where buffers are flushed and open disk files are rendered consistent. The snapshot then takes place in less than two seconds, and the system indicates that the applications are free to "thaw" and continue. Administrators no longer have to take down a database or mail server to get a consistent backup. Freeze/thaw interfaces are published and are being consumed by Novell and its third-party partners so that snapshot solutions provide consistent data when done in a SAN or storage array as well as at a host.
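The order of events in a freeze/thaw cycle can be sketched as a small simulation. Everything here is hypothetical (the `App` class, `consistent_snapshot`, and the list standing in for the disk); it does not use or describe the real Novell interface, only the sequence of freeze, snapshot, and thaw described above.

```python
class App:
    """Hypothetical application that takes part in freeze/thaw."""

    def __init__(self, name):
        self.name = name
        self.pending = []     # transactions not yet written to disk
        self.frozen = False

    def freeze(self, disk):
        # Flush pending transactions so on-disk data is consistent,
        # then report the frozen (ready) state.
        disk.extend(self.pending)
        self.pending.clear()
        self.frozen = True

    def thaw(self):
        self.frozen = False


def consistent_snapshot(apps, disk):
    """Freeze every app, copy the disk state, then thaw every app."""
    for app in apps:
        app.freeze(disk)
    try:
        snapshot = list(disk)     # stands in for the real snapshot step
    finally:
        for app in apps:
            app.thaw()
    return snapshot


disk = ["row1"]
db = App("database")
db.pending.append("row2")
snap = consistent_snapshot([db], disk)
print(snap)        # ['row1', 'row2']
print(db.frozen)   # False
```

The point of the sketch is that the pending transaction reaches the disk before the snapshot is taken, so the snapshot never captures a half-written state.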

Snapshots can be used as part of a disaster recovery plan and archived for online or near-line access. Applications or users can proceed to work from the snapshot in the event that the original files are inaccessible.

Portlock Online Imaging

Portlock Online Imaging provides block-based imaging for NetWare 6.5 SP3 or later NSS Pools while they are active with mounted volumes. Portlock Online Imaging depends upon Novell SnapShot Backup. An NSS Pool, which can be empty, is used by Novell SnapShot Backup to store the deltas during an image command. Portlock Storage Manager automatically freezes the selected pools, creates the snapshots, images the pools, thaws the selected pools and finally deletes the snapshots.

Common reasons that Online Imaging fails

  • The Pool is corrupted.
  • The Pool holding the snapshot is corrupted.
  • The Pool has too many deleted files. Consider purging the volumes within the pool.
  • The Pool does not have sufficient free space. Clean up the volumes within the pool.
  • There is too much I/O activity on the Pool. NetWare must duplicate data that is modified on the pool during an online image. Consider moving some applications' datasets to other volumes to balance pool I/O.
  • There is not enough free space on the pool holding the snapshot.
  • There are hardware problems causing I/O errors on either the Pool or the pool holding the snapshot.

NSS Pools must be in a stable and warning free state for NetWare's Snapshot feature to work correctly

  • Check your Pool:
    • Run "Check Pool" from Portlock Storage Manager.
      • This command is located under the "Pool Commands" menu.
      • Add the command line option "-logfile=filename" (without the quotes) when starting Portlock Storage Manager.
      • Specify the full path to the log file (example: -logfile=C:/STORMGR.LOG).
      • You can specify a floppy so that nothing on the server is modified (-logfile=A:/STORMGR.LOG).
      • Do not specify a volume located on a pool that you are "checking" as the log file would be closed before the full results of the Pool Check could be written.
  • Verify your Pool:
    • Execute Novell's "nss /poolverify" command from the console. Select the pool to be verified. See the notes below about log files.
  • Rebuild your Pool:
    • If there are any warnings or errors from the above commands, run "nss /poolrebuild" and then repeat the check and the verify.

If you are seeing pool errors during an Online Image

  • Reboot the server so that everything is in a stable state.
  • Consider purging the volumes in the pool. We have seen a number of issues (NetWare bugs) when there are a lot of "unpurged" files.
  • Manually create a snapshot and verify both the original pool and the snapshot pool:
  • Assuming that your problem pool is called "SYS" and you have another pool called "TEST" to store the snapshot, execute the following commands from the NetWare console:
    • mm snap list - This will display any snapshots on the server. This should be an empty list.
    • mm snap create sys test sys_snap - This creates a new snapshot called "SYS_SNAP" and stores the temporary pool data on pool "TEST". The original pool is called "SYS". Change the names according to your setup.
    • mm snap list - Verify that your snapshot was created successfully.
    • mm snap activate sys_snap - This activates the snapshot pool called "SYS_SNAP".
    • nss /PoolVerify=SYS - This will verify the active pool "SYS".
    • nss /PoolVerify=SYS_SNAP - This will verify the snapshot of pool "SYS".
    • When Portlock Storage Manager is performing an "Online Image" of a pool, SYS_SNAP is the pool being imaged.

Make sure that you don't have pool corruption in the pool that is holding the snapshot. In the above example, complete an "nss /PoolVerify" on the "TEST" pool.

If you see any warnings or errors with the above (and they are not corrected by a Pool Rebuild), report this to Novell as you have a setup that does not support snapshots correctly or there is a bug in Novell's code supporting snapshots.

To see the help screen for NetWare's "mm" commands, type "help mm" at the console.

To delete the above snapshot: "mm snap delete sys_snap".
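When the pool names differ from the example, it is easy to mistype one of them partway through the sequence. The sketch below simply builds the same console commands listed above with the names substituted, so they can be reviewed before being typed at the NetWare console. The function name and its parameters are hypothetical, and the script does not run anything on the server.

```python
def snapshot_verify_commands(pool, snap_store, snap_name):
    """Return the console commands, in order, for a manual snapshot test."""
    return [
        "mm snap list",                                     # expect an empty list
        f"mm snap create {pool} {snap_store} {snap_name}",  # create the snapshot
        "mm snap list",                                     # confirm it exists
        f"mm snap activate {snap_name}",                    # activate snapshot pool
        f"nss /PoolVerify={pool}",                          # verify the original pool
        f"nss /PoolVerify={snap_name}",                     # verify the snapshot
        f"nss /PoolVerify={snap_store}",                    # verify the holding pool
        f"mm snap delete {snap_name}",                      # clean up afterwards
    ]


for command in snapshot_verify_commands("SYS", "TEST", "SYS_SNAP"):
    print(command)
```

Called with "SYS", "TEST" and "SYS_SNAP", it prints the commands from the example in the order they are meant to be entered, ending with the delete.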

Portlock Storage Manager Log files

Portlock Storage Manager supports creating a log file for details of various warnings and errors. This is very important for commands such as Pool Check. Add the command line option "-logfile=filename", without the quotes. Specify the full path to the log file (example: -logfile=C:/STORMGR.LOG). You can specify a floppy so that nothing on the server is modified (-logfile=A:/STORMGR.LOG). Do not specify a volume located on a pool that you are "checking", as the log file would be closed before the full results of the Pool Check could be written.

NSS /PoolVerify Log files

NetWare stores the results of a Pool Verify in a log file that starts with the pool name and ends with the suffix VLF (Verify Log File). Each time you run a Pool Verify, the results are appended to this log file. If you are sending this log file to Portlock, delete any existing log file before running the Pool Verify so that the log contains only information about the issue being analyzed. We will not review log files that contain the results of multiple Pool Verifies.

NSS /PoolRebuild Log files

NetWare stores the results of a Pool Rebuild in a log file that starts with the pool name and ends with the suffix RLF (Rebuild Log File). Each time you run a Pool Rebuild, the results are appended to this log file. If you are sending this log file to Portlock, delete any existing log file before running the Pool Rebuild so that the log contains only information about the issue being analyzed. We will not review log files that contain the results of multiple Pool Rebuilds.

Reporting issues to Portlock

  1. Send the NSS /PoolVerify log file. If reporting an Online Image issue, perform a Pool Verify on both the original pool and the snapshot pool. See above for manually setting up a snapshot.
  2. Send the NSS /PoolRebuild log file.
  3. Send the Portlock Storage Manager log file from the Pool Check command.
  4. In most cases we will also need an image of the problem pool. Contact Portlock Support to set up a debugging session and an upload account for you. You will need to create an "offline image" using Portlock Storage Manager and then upload this to our FTP server.
    • Create a file-based image of the problem pool.
    • This image should be an "offline image" meaning that the pool will be deactivated during the image.
    • Do not purge the volumes in the pool prior to creating the image (this would significantly modify the pool data structures). If you do decide to purge the volumes, perform the Pool Check and Pool Verify after purging and verify that you still have a problem.
    • Do not use image compression during the image command.
    • Set the file span size to be 50 MB or less so that the image files are manageable.
    • Verify that you have a valid image set. Perform a Restore, but press F-5 for each image object. Portlock Storage Manager will then read back the data from the image files and verify that there is no corruption. (We have had a number of customers who sent only the first file of an image set, and we wasted time trying to figure out what went wrong.)
    • Compress each image file using PKZIP after the image completes (this helps us verify that there was no file corruption during upload / download).
    • Upload the image files to our FTP server (you will need an FTP upload account from Portlock).
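If the image files have been copied to a workstation, the compression step can be scripted. The sketch below is one possible approach and makes several assumptions: it uses Python's standard zipfile module to produce ZIP archives rather than PKZIP itself, it assumes all files of the image set sit in a single directory, it treats 50 MB as 50 * 1024 * 1024 bytes, and the function name is hypothetical.

```python
import zipfile
from pathlib import Path

MAX_SPAN_BYTES = 50 * 1024 * 1024  # 50 MB span size suggested above


def zip_image_files(image_dir):
    """Zip each file in image_dir separately and return the archive paths."""
    archives = []
    # sorted() lists the directory once, before any archive is created.
    for image_file in sorted(Path(image_dir).iterdir()):
        if not image_file.is_file() or image_file.suffix.lower() == ".zip":
            continue
        if image_file.stat().st_size > MAX_SPAN_BYTES:
            raise ValueError(f"{image_file.name} is larger than 50 MB")
        archive = image_file.with_name(image_file.name + ".zip")
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            zf.write(image_file, arcname=image_file.name)
        # testzip() re-reads the archive and returns the first bad member,
        # or None when every CRC matches.
        with zipfile.ZipFile(archive) as zf:
            if zf.testzip() is not None:
                raise OSError(f"{archive.name} failed its CRC check")
        archives.append(archive)
    return archives
```

Comparing the number of archives returned with the number of files in the image set is a quick way to confirm that every file, not just the first, is ready to upload.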