Solaris Express - Static IPs the Right Way

If you search the Internet, you'll likely find dozens of articles describing how to set static IPs on Solaris Express by disabling NWAM (Network Auto-Magic) and enabling the legacy physical interface service. I've been building a Solaris Express server to act as clustered storage for a test environment, and no matter what I did, NWAM would be re-enabled after a reboot and I'd have no connectivity. I have a better suggestion: don't do that; do it properly.

Instead, run up the nwamcfg utility and configure your interfaces there as your very own Network Configuration Profile (NCP). You'll need a physical interface definition (controls your packet sizes, interface speeds, etc.) and an IP interface definition (controls IPv4/IPv6, IP address, subnet mask and gateway).

The nwam toolset lets you configure almost any reasonable environment. When you create an object, you walk through what amounts to a text-mode wizard that sets the object's properties. Unless a value is typed after the prompt, the example here accepts the default for each property (the one listed in round brackets). With that in mind, let's configure a machine with a single Intel Pro/1000 CT interface:

# nwamcfg
nwamcfg> create ncp CorpNet
nwamcfg:ncp:CorpNet> create ncu phys e1000g0
Created ncu 'e1000g0'.  Walking properties ...
activation-mode (manual) [manual|prioritized]> prioritized
enabled (true) [true|false]>
priority-group> 0
priority-mode [exclusive|shared|all]> shared
link-mac-addr>
link-autopush>
link-mtu>
nwamcfg:ncp:CorpNet:ncu:e1000g0> end
Committed changes
nwamcfg:ncp:CorpNet> create ncu ip e1000g0
Created ncu 'e1000g0'.  Walking properties ...
enabled (true) [true|false]>
ip-version (ipv4,ipv6) [ipv4|ipv6]>
ipv4-addrsrc (dhcp) [dhcp|static]> static
ipv4-addr> 10.0.1.2
ipv4-default-route> 10.0.1.1
ipv6-addrsrc (dhcp,autoconf) [dhcp|autoconf|static]> dhcp,autoconf,static
ipv6-addr> fd27:c503:709b:9d46::11:a4d4
ipv6-default-route> fd27:c503:709b:9d46::11:1
nwamcfg:ncp:CorpNet:ncu:e1000g0> end
Committed changes
nwamcfg:ncp:CorpNet> end

Then activate the User NCP using nwamadm - and hey presto, your IP configuration is set:

# nwamadm enable -p ncp CorpNet
Enabling ncp 'CorpNet'

As a bonus - you can easily disable IPv6 by setting the "ip-version" property to "ipv4". Or enable only IPv6, if that's your preference. nwamcfg will only ask questions that make sense (so you might see different prompts if you're building a different configuration). If you don't configure an interface with nwam, it stays down.

Total time to configure a simple network with static IPs? About 30 seconds, once you're used to it (and yes, it's all scriptable, naturally).

My iSCSI interfaces are now IPv4 only with 9K jumbo frames enabled. My NAS interface is a Link Aggregation (LACP) of two NICs, using dladm to create/manage, and static IPv4 and IPv6 addresses. And it all starts up with the right configuration. Finally.

Further Reading

  1. Manual page for nwamcfg - http://download.oracle.com/docs/cd/E19963-01/html/821-1462/nwamcfg-1m.html
  2. Manual page for nwamadm - http://download.oracle.com/docs/cd/E19963-01/html/821-1462/nwamadm-1m.html
  3. NWAM Configuration and Administration (Overview) - http://download.oracle.com/docs/cd/E19963-01/html/821-1458/giyfo.html#scrolltoc
  4. NWAM Configuration Tasks - http://download.oracle.com/docs/cd/E19963-01/html/821-1458/giwtf.html#scrolltoc

Things You DON'T Want To Do

  1. Disable the nwam service - svcadm disable svc:/network/physical:nwam
  2. Enable the network physical service - svcadm enable svc:/network/physical:default

Posted by davidr with no comments

Improve your Hyper-V Virtual Availability - Live Migrate VMs on Shutdown

Hyper-V clustering is a pretty rock solid thing, and Live Migration (introduced, as we all know, with Server 2008 R2) is virtually identical to VMware's long-available VMotion technology - pick up a running VM and move it to another host in the cluster without users noticing. Generally speaking you might see a small hiccup - one ping lost as the machine stops on one host and starts on another:

Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Request timed out
Reply from 10.67.1.141: bytes=32 time=133ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127

But if you shut down a cluster host, say, because you're deploying a Windows update, or a new version of a backup or monitoring client, the situation is different. Windows will use Quick Migration to move the virtual machines from one host to another - and Quick Migration is nothing like VMotion and Live Migration.

Instead of copying the VM memory and processor state across the network, the virtual machine is saved (to your SAN) on one host, then restored from that saved state on another. The difference is obvious:

Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
Reply from 10.67.1.141: bytes=32 time=133ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127
Reply from 10.67.1.141: bytes=32 time<1ms TTL=127

That save and restore process can take anywhere from 25 to 90 seconds, during which time the VM is off the network.
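To put a rough number on the outage in a ping log like the two above, you can count the longest run of timeouts and multiply by the per-probe timeout. The sketch below is in Python and assumes the log came from the Windows ping command with its default 4-second timeout (the timeout_s parameter is that assumption; change it if you used -w). The function name is made up for this example, and the result is only an estimate, since the VM can come back partway through a probe.

```python
def longest_outage_seconds(ping_log, timeout_s=4.0):
    """Estimate the longest outage in a ping log, in seconds.

    Counts the longest run of consecutive "Request timed out" lines and
    multiplies it by the per-probe timeout (assumed: 4 s, the Windows default).
    """
    longest = current = 0
    for line in ping_log.splitlines():
        text = line.strip().lower()
        if not text:
            continue  # ignore blank lines rather than treating them as replies
        if text.startswith("request timed out"):
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # any reply ends the current run of timeouts
    return longest * timeout_s


if __name__ == "__main__":
    reply = "Reply from 10.67.1.141: bytes=32 time<1ms TTL=127\n"
    live_migration = reply * 8 + "Request timed out\n" + reply * 7
    quick_migration = reply * 2 + "Request timed out\n" * 18 + reply * 4
    print("Live Migration :", longest_outage_seconds(live_migration), "seconds")
    print("Quick Migration:", longest_outage_seconds(quick_migration), "seconds")
```

Under that assumption, the single lost ping of the Live Migration works out to about 4 seconds at most, and the 18 lost pings of the Quick Migration to roughly 72 seconds, which sits inside the 25 to 90 second range.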

Your remote desktop session? Dropped.

Your Outlook client? Offline.

Your open files on the file server? Connection lost, better hope you can save elsewhere.

You'd think there would be a better way. And there is - PowerShell.

Twelve lines of PowerShell (excluding white space) and a Group Policy setting are all you need to take all the virtual machines on your host and distribute them across the cluster. The script works in four steps:

First we get the local computer name and use it to suspend the Cluster service on this computer. This prevents it from taking over other cluster resources.

Then, we get a list of all the other nodes in the cluster with a state of "Up" meaning that they will be able to accept live migration requests. We also need a counter variable ($i) - we'll use this to keep track of the host to which we will move a VM.

Next, we get a list of all the Virtual Machine resource groups that are currently hosted by the server we're shutting down.

Finally we cycle through the list, moving virtual machines to each of the other hosts in order. Once we're done, we resume the cluster service (note that because this is intended to run as a shutdown script, resuming presents very little risk of having groups moved TO this node in the short time between this script finishing and the host restarting).
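The distribution logic in those steps is a plain round-robin. Here is a minimal sketch of just that part in Python, not the PowerShell script itself: the node names, the state strings and the function name are invented for illustration, and nothing here talks to a real cluster.

```python
def plan_evacuation(local_node, node_states, vm_groups):
    """Pair each VM group with a target node, round-robin.

    local_node  -- name of the host being shut down
    node_states -- dict of node name -> state string (e.g. "Up", "Paused")
    vm_groups   -- names of the VM resource groups owned by local_node
    """
    # Only other nodes that are "Up" can accept a live migration.
    targets = [name for name, state in node_states.items()
               if state == "Up" and name != local_node]
    if not targets:
        raise RuntimeError("no other cluster node is Up")

    plan = []
    i = 0  # counter that tracks which host receives the next VM
    for vm in vm_groups:
        plan.append((vm, targets[i]))
        i = (i + 1) % len(targets)  # wrap around to the first host
    return plan


if __name__ == "__main__":
    # Hypothetical four-node cluster; HV01 is the host being shut down.
    states = {"HV01": "Paused", "HV02": "Up", "HV03": "Down", "HV04": "Up"}
    for vm, target in plan_evacuation("HV01", states, ["VM-A", "VM-B", "VM-C"]):
        print(f"{vm} -> {target}")
```

Wrapping the counter back to zero is what spreads the VMs evenly instead of piling them all onto the first available host.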

Having written the script, we save it to a known location on each cluster node (perhaps C:\Scripts\Evac-VMs.PS1).  At this point you have a choice to make; there are two options:

  1. Sign the script (you'll need a Code Signing certificate and private key, then: $cert = @(Get-ChildItem cert:\CurrentUser\My -CodeSigningCert)[0]; Set-AuthenticodeSignature Evac-VMs.ps1 $cert);
  2. Set the script execution policy to Unrestricted (Set-ExecutionPolicy -ExecutionPolicy Unrestricted)

I strongly recommend option 1 for security, but in a lab or low security environment (i.e. you WANT your hosts to be compromised) option 2 might be acceptable.

Finally, you need to configure Group Policy or Local Policy for the host to have a Shutdown Script. You'll find these settings under Computer Configuration > Windows Settings > Scripts (Startup/Shutdown).

On the PowerShell Scripts tab (not the default Scripts tab), set PowerShell scripts to run first, so that the live migrations are the first tasks executed during shutdown.

Then add a new script and set the script path to your saved file.

It's worth noting that scripts executed by Group Policy do not need to be signed - they bypass the script execution policy settings. Nevertheless you're going to want to sign it so that:

  1. If someone changes the script you will know when you run it manually;
  2. You can test the script;
  3. You can use it for reasons other than shutting down a host.

That's all that needs to be done - now it's testing time. I've attached the (unsigned) script so you can just save it, sign it and test it.

SCCM 2007 and Windows 2008 - WebDAV Fix

Installing SCCM 2007 SP2 + R2 today on a Windows 2008 R2 server I came across what appears to be a common problem; the SMS Site Component Manager fails to install the SMS_MP_CONTROL_MANAGER component due to a WebDAV error:

Severity: Error
Type: Detail
Component:  SMS_MP_CONTROL_MANAGER
Message ID: 4970
Description: SMS Site Component Manager failed to install component SMS_MP_CONTROL_MANAGER on server SCCMHOST. The WebDAV server extension is either not installed or not configured properly. Solution: Make sure WebDAV is installed and enabled. Make sure there is an authoring rule that allow "All users" read access to "All content". Make sure the WebDAV settings "Allow anonymous property queries" and "Allow property queries with infinite depth" are set to "true" and "Allow Custom Properties" is set to false.

"Ah-ha! I remember this one, I'll just go to the IIS Manager and set those properties" - but it doesn't work. Nothing, in fact, seems to help - setting at the site level, setting at the web server level, SCCM is still sitting there complaining that the properties are not configured correctly.

At this point you hit the Internet. Literally dozens of sites recommend going to the %SystemRoot%\System32\InetSrv\Config\Schema folder, taking ownership of the schema file WebDAV_Schema.XML and modifying the file with Notepad.

Stop! Put down the text editor and back away slowly.

Your first reaction on taking ownership of a system file should be to question whether you're doing the right thing or not.

Your second reaction should be about the same as the first, only in slow motion and with a distinctly curious expression on your face.

And the fact is that you don't need to do that, and you shouldn't, because editing the schema changes the default for every single web site on the server, now and forever more. Now, I'm not recommending that you run a web server on the same host as your SCCM environment, but think about this for a second: if Microsoft updates that schema file in a service pack or patch, you and your SCCM environment are screwed. Key point: you do not "own" the definition of the schema; you own the configuration that uses said schema.

Instead, do the right thing and take one of the following two actions.

  1. Think for a moment about the server you're working with, recall that Windows 2008 and Windows 2008 R2 enforce UAC; run the IIS console as an Administrator and make your changes to the WebDAV configuration. This is definitely the preferred recommendation!
  2. Use Notepad to edit the configuration file for the server rather than the schema - that'd be %SystemRoot%\System32\InetSrv\Config\applicationHost.config. Inside you will find the XML configuration for IIS, and you can change the section that reads:

        <location path="Default Web Site">
            <system.webServer>
                <webdav>
                    <authoring enabled="true">
                    </authoring>

    to:

        <location path="Default Web Site">
            <system.webServer>
                <webdav>
                    <authoring enabled="true">
                        <properties allowAnonymousPropfind="true" allowInfinitePropfindDepth="true" allowCustomProperties="false" />
                    </authoring>

Note: If there is an existing properties element, just set your configuration - don't add extra properties elements; it doesn't work that way.
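If you would rather script the edit than open Notepad, the same change can be sketched in a few lines of Python with the standard library's ElementTree. This is an illustration only: the function name is made up, it works on a string you have read from a copy of the file, and ElementTree drops XML comments when it rewrites a document, so don't point it at your only copy of applicationHost.config. It follows the note above by updating an existing properties element instead of adding a second one.

```python
import xml.etree.ElementTree as ET

# Values taken from the SCCM error message quoted earlier in this post.
WEBDAV_PROPERTIES = {
    "allowAnonymousPropfind": "true",
    "allowInfinitePropfindDepth": "true",
    "allowCustomProperties": "false",
}


def set_webdav_properties(config_xml, site="Default Web Site"):
    """Return config_xml with the WebDAV <properties> element set for one site."""
    root = ET.fromstring(config_xml)
    for location in root.iter("location"):
        if location.get("path") != site:
            continue
        authoring = location.find("system.webServer/webdav/authoring")
        if authoring is None:
            continue
        # Reuse an existing <properties> element; only create one if missing.
        properties = authoring.find("properties")
        if properties is None:
            properties = ET.SubElement(authoring, "properties")
        for name, value in WEBDAV_PROPERTIES.items():
            properties.set(name, value)
        return ET.tostring(root, encoding="unicode")
    raise LookupError(f"no WebDAV authoring section found for site {site!r}")


if __name__ == "__main__":
    # Cut-down stand-in for applicationHost.config, not a complete file.
    sample = """<configuration>
  <location path="Default Web Site">
    <system.webServer>
      <webdav>
        <authoring enabled="true">
        </authoring>
      </webdav>
    </system.webServer>
  </location>
</configuration>"""
    print(set_webdav_properties(sample))
```

Whichever way you make the change, run the editor or the script elevated, for the same UAC reason as option 1.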

Posted by davidr with no comments

Low Cost Cluster Storage for Windows Server 2008

Since the release of Windows Server 2008, there has been no way to build a cluster without purchasing dedicated, proprietary storage (such as an array from HP, Dell, EMC or NetApp, to name just a few). There are open source storage platforms like OpenFiler and FreeNAS; there is an iSCSI target in almost any Linux and BSD platform. Until very recently, none of these have passed the cluster storage tests that Windows 2008 executes before forming the cluster.

Without passing those tests, the cluster cannot be formed at all - and the culprit is normally the "SCSI-3 Persistent Reservation" tests, without which the cluster cannot operate (because it cannot guarantee that the disks are owned by only a single node at a time).

The forums for OpenFiler, OpenSolaris, FreeNAS, BSD and Linux distributions seem to be full of requests for the iSCSI targets to support Windows Server 2008 clustering (and also enable Solaris and OpenSolaris clustering). OpenFiler has been developing the feature, but the latest news I found is that they will be making it a premium (i.e. paid) feature. OpenSolaris has claimed to have support for months, but I was unable to make it work (despite updating the system past snv_115, the build in which support was added).

And finally, today, I noticed that FreeNAS has released a new beta build, 0.7 RC1; and one of the key features is that it supports SCSI-3 Persistent Reservations.

I had to try it.

I set up a Hyper-V server (a Dell 2900 with 4GB of RAM and a 500GB RAID 10 set for VMs).

Then, I set up 3 VMs. Here's the plan for what it will look like, if it all works:



First I created a FreeNAS VM. 512MB of RAM, two CPUs, two legacy NICs (as FreeBSD does not yet support the Hyper-V extensions), a 1GB VHD for the "system" and a DVD-ROM drive from which to install. The installation was painless - two or three questions and it's done. Setting up the networks was confusing for a moment - the UI changes the text without highlighting the changes, so it took a few goes to sort it all out. I assigned a public IP to the public NIC and a private IP to the NIC on a new private network for iSCSI.

Then I deployed two Windows Server 2008 R2 Enterprise Edition hosts (the RC). Each has two legacy NICs with the same configuration (one public, one iSCSI). I configured IP addresses and ensured I could ping everything on both networks, then installed the File Server role and the Failover Cluster feature.

While they installed I switched back to configuring FreeNAS. I removed the DVD-ROM drive and added three new 128GB disks on the 3 available IDE controller slots. Then I loaded the FreeNAS config page and configured a RAIDZ1 (RAID 5 ZFS) device, a ZFS Pool using the device, a Portal group, an Initiator group and three extents, each of which provides a single iSCSI target (Quorum, File Server 1 and File Server 2). It sounds like a lot of work but it took less than 15 minutes.

Finally I was ready to connect the drives and hopefully build a cluster. On each server I connected the iSCSI targets, one at a time. On node 2 I brought each device online, initialized the disk and created a simple volume, assigning drive letters as needed. Then after taking the drives offline, I replicated the drive letter assignments on node 1.

With all those steps completed, the Failover Cluster Wizard can run. Normally the storage tests fail after a few moments, but in the FreeNAS case they succeeded - I have a working, validated file server cluster using free software for the storage and Windows 2008 Enterprise for the compute power.

The proof, as they say, is in the image:

Now you probably won't want to use this for a critical production system - FreeNAS is, after all, beta software. But for your lab environments, or for running up a virtual cluster, this might just fit the bill perfectly.

Posted by davidr with 1 comment(s)

Replacing the Windows 7 Network Indicator

One of the things missing from Windows 7 is the network activity indicator in the system tray. It's been replaced with a static icon, supposedly to reduce power consumption on laptops and reduce visual clutter.

I'm not sure I believe the first reason, and the second is invalid because it could be left disabled as it was by default in Windows Vista.

Nevertheless, I've written a tiny .NET replacement for the icon. It's not identical - it's more visual than the original icons. It's also configurable, so that the two indicators can show disk activity instead of the network, or show error-type conditions - for example, show the light if the CPU is running at more than 95%.

The download is attached to this post - it's a small MSI. It's really an early beta version, but seems to run OK on my Windows 7 workstation and on the Vista development workstation I built it on.

Please post any bugs in the comments!

Posted by davidr with 2 comment(s)