Note of the day...

I have a laptop on which I run Linux. When I first bought it, it had Windows pre-installed on it and, in general, the manufacturer only guaranteed support for the Windows operating system. However, due to the nature of my work, I simply had to wipe it and install Linux instead. I picked the Ubuntu distribution simply because I have been using it for some time, but I strongly believe most of the things mentioned below would be the same or very similar with any other Linux distribution, Debian or Red Hat based.

Even though almost everything worked right out of the box, I did have some issues with the graphics card. The card I have is an NVIDIA GeForce GT 540M (yep, the laptop is a bit old by now). During system installation I was able to find, download and install the proper legacy binary drivers for it, and it performed beautifully.

However, from time to time an update comes along that messes up the graphics card drivers and renders the card unusable. The symptom is almost always the same: the system starts booting normally, even the login screen is displayed perfectly, but then, after a successful login (you can hear the appropriate sounds confirming it), the screen goes completely blank, as if you had simply switched off the monitor. That is a clear indicator that the graphics card drivers are not fine.

It has happened to me a few times already so, when it happened again last night, I decided to record the quick steps to recover from this problem, so that next time I can fix it within minutes.

Of course, there are several different ways to solve this problem, but after dealing with it several times, I found the following procedure to be the quickest.

Step 1: Reboot the machine and boot in text mode

To boot the machine in text mode, wait for the bootloader to display the boot options, use the arrow keys to select the latest option available, and then press the 'e' key to edit that boot entry.

In the editor, find the line with the kernel options and replace the 'quiet splash' option with 'text'.

In some cases the 'nomodeset' option might work instead of 'text' and should be able to start the graphical interface in default mode, but in a situation like this, when I have a problematic graphics driver, I always prefer text mode only.
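To make the edit from Step 1 concrete, here is a minimal sketch that applies the same change to a sample kernel line. The real edit is done by hand in the bootloader's editor; the function name `to_text_mode` and the sample line (kernel version, root device) are made up for illustration.

```shell
#!/bin/sh
# Rewrite a kernel command line the way Step 1 does by hand:
# drop "quiet" and "splash", then append "text".
to_text_mode() {
    printf '%s\n' "$1" |
        sed -e 's/ quiet//' -e 's/ splash//' -e 's/$/ text/'
}

# Sample line only; the kernel version and root device are hypothetical.
sample='linux /boot/vmlinuz-3.8.0-generic root=/dev/sda1 ro quiet splash'
to_text_mode "$sample"
# prints: linux /boot/vmlinuz-3.8.0-generic root=/dev/sda1 ro text
```

Note that a change made in the bootloader's editor applies to that one boot only, which is exactly what is wanted for a recovery session.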

Step 2: Remove the graphics drivers you have on the machine

Since the problem is a wrong or faulty graphics driver, my quickest solution is to simply remove all installed drivers. In my case, I simply do:

sudo apt-get remove 'nvidia-*'

If you wish, you can always dig a bit deeper, find exactly which driver is to blame and remove only that installation. For me, the last time the culprit was nvidia-319-updates, so I could have fixed the problem with:

sudo apt-get purge nvidia-common nvidia-settings-319-updates nvidia-319-updates

Whichever option you choose, once the wrong or faulty drivers are removed, you should be able to move on.
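If you want to see what is installed before removing anything, something like the following sketch can help. It filters `dpkg -l` style output for installed packages whose name starts with "nvidia-". The function name `installed_nvidia_packages` is my own, and the versions in the sample input are made up.

```shell
#!/bin/sh
# Print the names of installed packages whose name starts with "nvidia-".
# Reads `dpkg -l` style text on stdin: status, name, version, arch, description.
# Only rows with status "ii" (installed) are reported.
installed_nvidia_packages() {
    awk '$1 == "ii" && $2 ~ /^nvidia-/ { print $2 }'
}

# Sample input; the versions are hypothetical.
installed_nvidia_packages <<'EOF'
ii  nvidia-319-updates           319.32-0ubuntu1  amd64  NVIDIA binary Xorg driver
ii  nvidia-settings-319-updates  319.32-0ubuntu1  amd64  Tool for configuring the NVIDIA driver
ii  nano                         2.2.6-1ubuntu1   amd64  small, friendly text editor
rc  nvidia-304                   304.88-0ubuntu1  amd64  NVIDIA legacy binary driver
EOF
```

On a real system the input would come from `dpkg -l | installed_nvidia_packages`.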

Step 3: Reboot the machine and install the correct drivers

Simply reboot the machine. It should be able to boot fully using the default graphics drivers. Sure, the picture you get might not be perfect, but at least it will get you back on home ground with all the familiar GUI tools you use.

Use any tool to locate and install the correct drivers for your graphics card (e.g. Additional Drivers). After installing and applying the correct drivers, everything should get back to normal. I also tend to do one more reboot just to make sure my changes were not temporary and that the system will indeed now work as expected.
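After that final reboot, one way to check which graphics kernel module is actually in use is to look at the `lsmod` output. The sketch below assumes the proprietary driver's module name starts with "nvidia" and that the open-source fallback is "nouveau"; the function name `gpu_kernel_driver` and the sample numbers are made up.

```shell
#!/bin/sh
# Report the first loaded graphics module whose name starts with
# "nvidia" or "nouveau", or "none" if neither is loaded.
# Reads `lsmod` style text on stdin.
gpu_kernel_driver() {
    awk '$1 ~ /^(nvidia|nouveau)/ { print $1; found = 1; exit }
         END { if (!found) print "none" }'
}

# Sample input; module sizes are hypothetical.
gpu_kernel_driver <<'EOF'
Module                  Size  Used by
snd_hda_intel          44397  3
nouveau               887034  2
EOF
```

On a real system: `lsmod | gpu_kernel_driver`.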

Veritas Cluster Server (also known as VCS) is high-availability cluster software for Unix, Linux and Microsoft Windows computer systems, created by Veritas Software (now part of Symantec). It provides application cluster capabilities to systems running other applications, including databases, network file sharing, and electronic commerce. As I'm working with it, here is a short list of the most important commands with short descriptions...

The list is compiled more or less from the official VCS pages and documents, so you won't find anything new here, merely a short, organized overview of what I found most useful in my case.

Basic commands

Cluster daemons and log files

Command	Description
had	High Availability Daemon
hashadow	Companion Daemon
Agent	Resource Agent daemon
CmdServer	Web Console cluster management daemon
/var/VRTSvcs/log	Log Directory
/var/VRTSvcs/log/engine_A.log	Primary log file (engine log file)

Cluster status

Command	Description
hastatus	Continually monitor cluster and display relevant information
hastatus -sum	Display cluster summary
hastatus -display	Verify the cluster is operating

Cluster details

Command	Description
haclus -display	Display information about a cluster
haclus -value	Display value for a specific cluster attribute
haclus -modify	Modify a cluster attribute
haclus -enable LinkMonitoring	Enable LinkMonitoring
haclus -disable LinkMonitoring	Disable LinkMonitoring

Starting and stopping the cluster

Command	Description
hastart [-stale|-force]	"-stale" instructs the engine to treat the local config as stale; "-force" instructs the engine to treat a stale config as a valid one
hastart [-onenode]
hasys -force	Bring the cluster into running mode from a stale state using the configuration file from a particular server
hastop -local	Stop the cluster on the local server but leave the application/s running, do not failover the application/s
hastop -local -evacuate	Stop cluster on local server but evacuate (failover) the application/s to another node within the cluster
hastop -all -force	Stop the cluster on all nodes but leave the application/s running

System operations

Command	Description
hasys -add	Add a system to the cluster
hasys -delete	Delete a system from the cluster
hasys -modify	Modify a system's attributes
hasys -state	List a system state
hasys -force	Force a system to start
hasys -display [-sys]	Display the system's attributes
hasys -list	List all the systems in the cluster
hasys -load	Change the load attribute of a system
hasys -nodeid	Display the value of a system's node ID (/etc/llthosts)
hasys -freeze [-persistent][-evacuate]	Freeze a system (no offlining of the system, no onlining of groups). Note: main.cf must be in write mode
hasys -unfreeze [-persistent]	Unfreeze a system (re-enable groups and bring resources back online). Note: main.cf must be in write mode

User operations

Command	Description
hauser -add	Add a user
hauser -update	Modify a user
hauser -delete	Delete a user
hauser -display	Display all users

Dynamic Configuration Commands

The VCS configuration must be in read/write mode in order to make changes. While in read/write mode the configuration is considered stale, and a .stale file is created in $VCS_CONF/conf/config. When the configuration is put back into read-only mode, the .stale file is removed.

Command	Description
haconf -makerw	Change configuration to read/write mode
haconf -dump -makero	Change configuration to read-only mode
haclus -display | grep -i 'readonly'	Check what mode the cluster is running in (0 = write mode; 1 = read-only mode)
hacf -verify /etc/VRTSvcs/conf/config	Check the configuration file. Note: you can point to any directory as long as it has main.cf and types.cf
hacf -cftocmd /etc/VRTSvcs/conf/config -dest /tmp	Convert a main.cf file into cluster commands
hacf -cmdtocf /tmp -dest /etc/VRTSvcs/conf/config	Convert a command file into a main.cf file
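Since every change has to sit between `haconf -makerw` and `haconf -dump -makero`, it can be handy to wrap that cycle in a small helper. The following is only a sketch: the names `run`, `with_rw_config` and the `DRY_RUN` switch are my own and not part of VCS. With `DRY_RUN=1` the commands are printed instead of executed, so it is safe to try on a machine without VCS.

```shell
#!/bin/sh
# Execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run one VCS command between "haconf -makerw" and "haconf -dump -makero".
with_rw_config() {
    run haconf -makerw || return 1
    run "$@"
    status=$?
    # Always write the configuration back and return to read-only mode.
    run haconf -dump -makero
    return "$status"
}

# Dry run: prints the three commands, one per line, without touching a cluster.
DRY_RUN=1 with_rw_config hagrp -add groupw
```

The helper returns the exit status of the wrapped command, so a failed change is still visible to the caller even though the configuration is put back into read-only mode.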

Service groups

Command	Description
haconf -makerw; hagrp -add groupw; hagrp -modify groupw SystemList sun1 1 sun2 2; hagrp -autoenable groupw -sys sun1; haconf -dump -makero	Add a service group
haconf -makerw; hagrp -delete groupw; haconf -dump -makero	Delete a service group
haconf -makerw; hagrp -modify groupw SystemList sun1 1 sun2 2 sun3 3; haconf -dump -makero	Change a service group
hagrp -list	List the service groups
hagrp -dep	List the group's dependencies
hagrp -display	List the parameters of a group
hagrp -resources	Display a service group's resources
hagrp -state	Display the current state of the service group
hagrp -clear [-sys]	Clear a faulted non-persistent resource in a specific group
hagrp -modify grp_zlnrssd SystemList -delete; hagrp -modify grp_zlnrssd SystemList -add 1; hagrp -modify grp_zlnrssd AutoStartList	Change the system list in a cluster: remove the host, add the new host (don't forget to state its position), then update the autostart list
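The SystemList value in the rows above is a sequence of "system position" pairs. As a small sketch, the helper below builds those pairs from a plain list of host names. The function name `system_list_args` is made up, and numbering from 1 simply follows the examples in this table. It only prints the command and sends nothing to the cluster.

```shell
#!/bin/sh
# Turn "sun1 sun2 sun3" into "sun1 1 sun2 2 sun3 3", the name/position
# pairs that follow SystemList in the examples above.
system_list_args() {
    position=1
    pairs=
    for host in "$@"; do
        pairs="${pairs:+$pairs }$host $position"
        position=$((position + 1))
    done
    printf '%s\n' "$pairs"
}

# Only prints the command; nothing is executed against VCS.
echo "hagrp -modify groupw SystemList $(system_list_args sun1 sun2 sun3)"
# prints: hagrp -modify groupw SystemList sun1 1 sun2 2 sun3 3
```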

Service group operations

Command	Description
hagrp -online -sys	Start a service group and bring its resources online
hagrp -offline -sys	Stop a service group and take its resources offline
hagrp -switch -to	Switch a service group from one system to another
hagrp -enableresources	Enable all the resources in a group
hagrp -disableresources	Disable all the resources in a group
hagrp -freeze [-persistent]	Freeze a service group (disable onlining and offlining)
hagrp -unfreeze [-persistent]	Unfreeze a service group (enable onlining and offlining)
haconf -makerw; hagrp -enable [-sys]; haconf -dump -makero	Enable a service group. Only enabled groups can be brought online.
haconf -makerw; hagrp -disable [-sys]; haconf -dump -makero	Disable a service group. A disabled group cannot be brought online.
hagrp -flush -sys	Flush a service group and enable corrective action

Resources

Command	Description
haconf -makerw; hares -add appDG DiskGroup groupw; hares -modify appDG Enabled 1; hares -modify appDG DiskGroup appdg; hares -modify appDG StartVolumes 0; haconf -dump -makero	Add a resource
haconf -makerw; hares -delete; haconf -dump -makero	Delete a resource
haconf -makerw; hares -modify appDG Enabled 1; haconf -dump -makero	Change a resource
hares -global	Change a resource attribute to be global (same value on all systems)
hares -local	Change a resource attribute to be local (per-system value)
hares -display	List the parameters of a resource
hares -list	List the resources
hares -dep	List the resource dependencies

Resource operations

Command	Description
hares -online [-sys]	Online a resource
hares -offline [-sys]	Offline a resource
hares -state	Display the state of a resource (offline, online, etc.)
hares -display	Display the parameters of a resource
hares -offprop -sys	Offline a resource and propagate the command to its children
hares -probe -sys	Cause a resource agent to immediately monitor the resource
hares -clear [-sys]	Clear a resource (automatically initiates onlining)

Resource types operations

Command	Description
hatype -add	Add a resource type
hatype -delete	Remove a resource type
hatype -list	List all resource types
hatype -display	Display a resource type
hatype -resources	List the resources of a particular type
hatype -value	Display the value of a particular resource type's attribute (use hatype -modify to change it)

LLT and GAB

VCS uses two components, LLT and GAB, to share data over the private networks among systems. These components provide the performance and reliability required by VCS.

LLT (Low Latency Transport) provides fast, kernel-to-kernel communications and monitors network connections. The system admin configures LLT by creating a configuration file (llttab) that describes the systems in the cluster and the private network links among them. LLT runs at layer 2 of the network stack.

GAB (Group membership and Atomic Broadcast) provides the global message order required to maintain a synchronised state among the systems, and monitors disk communications such as those required by the VCS heartbeat utility. The system admin configures the GAB driver by creating a configuration file (gabtab).

LLT and GAB files

File	Description
/etc/llthosts	The file is a database, containing one entry per system, that links the LLT system ID with the host name. The file is identical on each server in the cluster.
/etc/llttab	The file contains information that is derived during installation and is used by the utility lltconfig.
/etc/gabtab	The file contains the information needed to configure the GAB driver. This file is used by the gabconfig utility.
/etc/VRTSvcs/conf/config/main.cf	The VCS configuration file. The file contains the information that defines the cluster and its systems.
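To illustrate how /etc/llthosts links an LLT system ID to a host name, here is a minimal lookup sketch. It assumes each line has the form "ID hostname"; the function name `llt_node_id` and the sample hosts are made up, and the sample file is a temporary one rather than the real /etc/llthosts.

```shell
#!/bin/sh
# Print the LLT system ID for a host name, reading an llthosts-style file
# where each line is assumed to be "<id> <hostname>".
# Usage: llt_node_id <hostname> <file>; exits non-zero if the host is missing.
llt_node_id() {
    awk -v host="$1" '$2 == host { print $1; found = 1; exit }
                      END { exit !found }' "$2"
}

# Sample file with two hypothetical systems.
sample=$(mktemp)
printf '0 sun1\n1 sun2\n' > "$sample"
llt_node_id sun2 "$sample"   # prints 1
rm -f "$sample"
```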

Gabtab entries

Example entries

/sbin/gabdiskconf -i /dev/dsk/c1t2d0s2 -s 16 -S 1123

/sbin/gabdiskconf -i /dev/dsk/c1t2d0s2 -s 144 -S 1124

/sbin/gabdiskhb -a /dev/dsk/c1t2d0s2 -s 16 -p a -S 1123

/sbin/gabdiskhb -a /dev/dsk/c1t2d0s2 -s 144 -p h -S 1124

/sbin/gabconfig -c -n2

Command	Description
gabdiskconf	-i: initialise the disk region; -s: start block; -S: signature
gabdiskhb (heartbeat disks)	-a: add a GAB disk heartbeat resource; -s: start block; -p: port; -S: signature
gabconfig	-c: configure the driver for use; -n: number of systems in the cluster

LLT and GAB commands

Command	Description
lltstat -n	Verify that links are active for LLT
lltstat -nvv | more	Verbose output of the lltstat command
lltstat -p	Open ports for LLT
lltstat -c	Display the values of LLT configuration directives
lltstat -l	Lists information about each configured LLT link
lltconfig -a list	List all MAC addresses in the cluster
lltconfig -U	Stop the LLT
lltconfig -c	Start the LLT
gabconfig -a	Verify that GAB is operating. Note: port a indicates that GAB is communicating, port h indicates that VCS is started
gabconfig -U	Stop the GAB
gabconfig -c -n	Start the GAB
gabconfig -c -x	Override the seed values in the gabtab file
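Since the -n value in gabtab is the number of systems in the cluster, a quick sanity check is to compare it with the number of entries in llthosts. This is a sketch under two assumptions: llthosts has one "ID hostname" line per system, and gabtab contains a `gabconfig -c -nN` line like the example entry earlier. The function names and the temporary sample files are made up.

```shell
#!/bin/sh
# Number of systems listed in an llthosts-style file (non-empty lines).
llt_host_count() {
    awk 'NF { n++ } END { print n + 0 }' "$1"
}

# Value of -n on the gabconfig line of a gabtab-style file.
gab_seed_count() {
    sed -n 's/.*gabconfig.* -n *\([0-9][0-9]*\).*/\1/p' "$1"
}

# Sample files in a temporary directory; the real ones live in /etc.
dir=$(mktemp -d)
printf '0 sun1\n1 sun2\n' > "$dir/llthosts"
printf '/sbin/gabconfig -c -n2\n' > "$dir/gabtab"

if [ "$(llt_host_count "$dir/llthosts")" = "$(gab_seed_count "$dir/gabtab")" ]; then
    echo "seed count matches"
else
    echo "seed count differs"
fi
rm -rf "$dir"
```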