Cablecom hispeed business blocks GRE packets

Post by: Lukas Beeler on August 17th, 2008 | File Under Uncategorized

This weekend, my plan was to upgrade our internet connection from an aging ADSL line to a new ADSL2+ line from Cablecom. At the same time, i also replaced our aging, self-built Linux Firewall/Reverse-Proxy/etc. with a SonicWALL NSA3500.

Up until now, we’ve been using PPTP for our VPN needs. PPTP is easy and painless to set up, but it can cause problems at customer sites because it needs GRE, and many overzealous firewalls block GRE.

In the future, we intend to use SonicWALL’s Global VPN Client, which uses IPsec with its NAT traversal over UDP. The SonicWALL GVC solution can also plug directly into Active Directory for central authentication.

I intended to keep PPTP running for some time after the migration, in order to ease the transition. But as it looks now, Cablecom blocks OUTBOUND GRE packets. Mighty strange, because inbound GRE-Packets work.

Here’s how this looks in tcpdump:

10:58:13.927888 IP 77.59.216.227 > 194.88.212.200: off 0x5858 [|gre]
10:58:13.947131 IP 77.59.216.225 > 77.59.216.227: icmp 52: host 194.88.212.200 unreachable

.225 is the Cablecom CPE, and .227 is the Linux machine running the PPTP server.
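
For anyone who wants to check captures like this programmatically, here is a minimal Python sketch that reads the protocol number out of a raw IPv4 header. GRE is IP protocol 47, while PPTP's control channel runs over TCP (protocol 6) on port 1723. The sample header is hand-built from the addresses in the tcpdump output above. The function names and the zeroed checksum are illustrative assumptions, not data taken from the wire.

```python
import socket
import struct

IPPROTO_GRE = 47  # IANA-assigned IP protocol number for GRE
IPPROTO_TCP = 6   # PPTP's control channel uses TCP port 1723

def ip_protocol(header: bytes) -> int:
    """Return the protocol field (byte 9) of a raw IPv4 header."""
    if len(header) < 20:
        raise ValueError("an IPv4 header needs at least 20 bytes")
    if header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    return header[9]

def build_header(src: str, dst: str, proto: int) -> bytes:
    """Hand-build a minimal 20-byte IPv4 header (checksum left at 0)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,   # version 4, header length 5 words
        0,      # TOS
        20,     # total length (header only in this sketch)
        0,      # identification
        0,      # flags / fragment offset
        64,     # TTL
        proto,  # protocol number
        0,      # checksum (not computed here)
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )

# Addresses taken from the tcpdump output above.
sample = build_header("77.59.216.227", "194.88.212.200", IPPROTO_GRE)
print("GRE packet:", ip_protocol(sample) == IPPROTO_GRE)
```

A firewall that passes TCP 1723 but drops protocol 47 lets the PPTP control connection come up while the tunnelled data never arrives, which is why GRE blocking is easy to miss at first.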

I’ve already opened a support case with Cablecom, in the hope of having this issue sorted out quickly. So far, i haven’t heard back from them, even though i reported the issue almost a day ago. It’s not like we pay 180 CHF a month for 24/7 support.

Update: Cablecom was able to resolve the issue today. Apparently, it was a config issue on the router.

ESXi - A perspective from the Microsoft World

Post by: Lukas Beeler on August 14th, 2008 | File Under Uncategorized

I’ve written a bit about ESXi before in a comparison to other free virtualization products from an SMB perspective.

I’ve seen the “big” ESX in a few places and worked a bit with it, but i decided to refresh my knowledge on VMware a bit. For this, i first had to scrounge up a machine that was able to pass the rigorous HCL from VMware.

Unfortunately i didn’t find something that was really a Small Business machine - i used an HS21 Blade from my BladeCenter S testing environment.

The HS21 blade has 4GB RAM, a 2.66 GHz quad-core CPU and two 500GB SATA hard disks attached to an LSI1064 SAS Controller. Fortunately, this configuration is supported.

Installing ESXi

Similar to the installation of Windows Server 2008 or Windows Vista, the ESXi installation is extremely streamlined. All you have to do is pop the CD in, select the disk where you want to install ESXi and then let it continue. The whole setup took around 15 minutes, most of the delay owed to the extremely slow Laptop CD Drive installed in the BladeCenter S.

After installation, the Blade rebooted and i was greeted by an extremely simple interface that allows you to change basics like the ESXi password, reconfigure the management network interface and display a few logfiles. On first startup, it also showed me a web address where i could download the VI Client that is used to manage ESXi.

A very pleasant experience.

Installing the VI Client

After accessing the ESXi host through HTTP, i could then download the VI Client. Installation on another Blade running WS2008 was smooth. It also installed an Update Service that allows me to upgrade ESXi.

Configuring ESXi for the first time

After logging on to ESXi using the VI Client, i was greeted with nicely detailed instructions telling me that i would need to create a datastore. After a few clicks i had a datastore created on the RAID1 that ESXi was installed on.

The VI Client looks very impressive and neat. It looks like ESXi can read diagnostic information from the Blade, and can monitor RAID, fan and other statuses easily. One of the things i really like about this is that you get a standardized interface for monitoring your hardware - on Windows you usually have to use tools like IBM Director that are just one big mess to handle. Here, i didn’t have to configure anything - it just worked.

After entering licensing information, configuring a static IP address, and changing hostname and DNS information, i rebooted the blade.

Creating the first Virtual Machine

I decided to create a first virtual machine - the blade i killed for running ESXi was previously running Exchange 2007. As this is just a demonstration setup, i decided to recover the preexisting Exchange server into a VM, in order to continue having a full featured demo setup.

So i created a new Virtual Machine, configured for running Windows Server 2008 x64. Now, i didn’t have WDS set up in the Demo Environment, so i had to find a way to boot the VM from an ISO. Previously i used scp to copy the ISO to the ESX Management Partition, but that didn’t work on ESXi. Luckily, the VI Client has a “Datastore Browser” that allowed me to upload files to the vmfs3 filesystem.

After uploading the ISO, i booted from it. The installation was pretty slow, but comparisons to my Hyper-V hosts aren’t fair as those run 10kRPM 147GB SAS Disks in a RAID5 configuration instead of the slow-as-molasses 500GB 7.2kRPM SATA Disks.

After OS installation, i immediately installed the VMware tools. One reboot later, i had a working Windows Server 2008 machine.

One of the things i noticed: When running WS08 virtualized on Hyper-V with 4 virtual CPUs on a quad-core machine, WS08 thinks i have one quad-core CPU. On VMware, WS08 thinks i have 4 real CPUs (sockets). This can bite you if you want to give a WS08 Std machine more than 4 cores - as WS08 Std is only licensed for four sockets.

The next step obviously is restoring the Exchange server, but that doesn’t really have to do all that much with ESXi.

Conclusion

ESXi is great. One of the biggest advantages over Hyper-V is the VI Client, which consolidates a lot of information that is strewn about in Windows. For example, it has built-in performance metrics, RAID status monitors, etc. You can get all the same information with a machine running Hyper-V, but you’ll have to use other tools for that (of course you can customize an MMC to include Perfmon, but it’s not exactly the same).

VMware shows that they have gained long term experience with Virtual Machines, and the VI Client clearly shows the maturity of their product.

Permission management seems much better than Hyper-V, but i didn’t find a way to use Active Directory integration. Maybe Virtual Center is required for this, or i just wasn’t able to find it in ESXi - it exists, because there are numerous references on the Web.

I’ll certainly consider using ESXi when i have to run non-Windows guests. For Windows guests, Hyper-V with its VMbus architecture seems better suited. For non-Windows guests, VMware can’t be beaten right now.

Hyper-V vs. ESXi in the Small Business space

Post by: Lukas Beeler on August 11th, 2008 | File Under Uncategorized

Disclaimer: I work for a Microsoft Partner. So i’m probably biased.

Virtualization has always been a topic with a lot of hype, but as of today we have a single customer that is using it (out of 150).

Why? Because virtualization is still expensive. For larger companies, it was possible to save money by using virtualization; for smaller companies that wasn’t really the case. You’ll still need to license the guest OS. You’ll still need to maintain it.

Most customers decided to just buy a Windows Small Business Server, and run all apps from that machine. Though that usually required a technician that knew what he was doing to get all the apps running together on a single machine, it saved money in licensing cost and hardware - and the most important application ran on a separate machine anyway (our ERP software on the IBM i).

With the release of Hyper-V and its inclusion in SBS 2008 Premium (on the second machine), Virtualization will probably get picked up even in small businesses. But is it the right way?

I’ve started gearing up my knowledge on virtualization as it will become a topic for our customers. For that, the most important other factor is VMware. VMware has offered virtualization products for longer than Microsoft, and i’ve been using their Workstation product for a long time.

Microsoft’s desktop product Virtual PC is lackluster at best. The performance is awful and it doesn’t offer many features. There was also Virtual Server 2005, which we’ve used internally since mid-2005 (when you still had to purchase GSX Server and we got VS2005 through the MSPP for free).

Now VMware has an offering that is free, Microsoft has an offering that is included into most Windows Server licenses and Citrix offers a very limited edition of their product for free (Max 4GB RAM, Max 4 VM).

And the big question would be - what product should a small business use today, and why?

I’ve found a few good blog posts on ESXi:

What’s the difference between free ESXi and licensed ESXi?

And on Xen:

Citrix XenServer for the ESX Engineer

And on Hyper-V:

Hyper-V for the ESX Engineer
More on Hyper-V for the ESX Engineer

On ESXi

ESXi Installable Edition Free (short: ESXi) only runs on certain certified systems. Of course you can still build a whitebox machine that runs ESXi, but that would be a rather stupid decision. Running supported hardware is important even in a small business.

ESXi doesn’t support many systems; in particular our bestselling system, the IBM x3650, is not supported with ESXi Installable Edition. I expect the list of machines supported by ESXi to grow steadily, though.

On the other hand, ESXi supports a wide variety of guest operating systems that are supported by VMware. This is one of the main advantages VMware has over Hyper-V. However, most Small Businesses struggle with the complexity of using one operating system. They are unlikely to use multiple ones. On the other hand, VMware offers preconfigured appliances, which sounds like a good use. Important to know: Microsoft does not directly support running Windows on VMware unless you pay big for a Premium support contract.

ESXi can be managed by the VMware “VI Client”. This allows you to do all the everyday tasks of configuring and setting up virtual machines.

ESXi doesn’t have any restrictions that would prohibit production usage, but the management features are a bit limited - you can’t monitor it using SNMP, you can’t script it using the RCLI. If you want those features, you’ll have to pay.

VirtualCenter, which is VMware’s variant of System Center Virtual Machine Manager, is quite expensive. Of course, SCVMM is also quite expensive. So i doubt that either will be used in a Small Business. The disadvantage i see here over Hyper-V is the fact that it can’t be scripted or automated. While not a showstopper, it’s important to consider this.

On XEN

The free XEN version supports a maximum of 4 VMs and 4GB of RAM. With that, i think everything is said and done. These restrictions do not allow production usage. It’s more like a demo version for the full products.

On Hyper-V

Hyper-V only works on 64bit installations of Windows Server 2008 Standard, Enterprise or Datacenter. In SBS 2008 Premium, one license for Windows Server 2008 Standard is included. This allows small businesses to get started with Hyper-V. WS2008 Standard x64 supports up to 32GB RAM. If you use “just” Hyper-V on a WS2008 Standard installation, you can also install a single guest VM with WS2008 Standard without having to purchase an additional license. Be aware that it does not work this way if you run any other software like SQL Server on the Hyper-V host.

Hyper-V can run on a lot of hardware, as described in the Windows Server Catalog. It is also a lot more flexible when it comes to storage configurations, as Windows supports more disk controllers than ESXi.

Hyper-V can be automated using WMI. There is no direct PowerShell support, though you can use PowerShell’s WMI support.

You can deploy Hyper-V on Windows Server Core, as a dedicated VM host. Managing Hyper-V in this scenario requires a machine running Windows Server 2008 or Windows Vista with the Hyper-V management tools installed. This is the recommended deployment mode.

You can also install Hyper-V on a full Windows installation. Though not recommended, this allows you to log on to the machine using RDP and manage the VMs directly on the server using the same Hyper-V management tools.

Here is one of the biggest advantages Hyper-V has over ESXi. For example, if you set up the WS2008 Standard Server as a SQL Server, you can install Hyper-V after the fact with a simple reboot. Though this is not what Microsoft recommends, the reality is that most Small Businesses have to achieve a lot with less equipment. Running such a configuration can help fix business problems without having to reinstall a machine.

System Center Virtual Machine Manager allows you to manage Hyper-V centrally. It’s quite expensive, so i doubt many small businesses will start using it. Maybe the next version of System Center Essentials will include a subset of SCVMM functionality.

Conclusion

Hyper-V supports more hardware, and is more flexible when it comes to its deployment. For me, this makes Hyper-V the better choice for a Small Business than ESXi. XEN Express is absolutely unusable in a production deployment.

Now, Enterprise admins will probably slap me for the “flexible” deployment of Hyper-V, and they are right. But for most small businesses, being able to cut corners in IT is more important than running “recommended” configurations.

I’m using Hyper-V standalone on a machine in a hosting center to run my private infrastructure (where i plan on moving this blog to), and it’s also a full Windows installation. Hyper-V runs flawlessly in such a scenario.

I also didn’t talk about Vmotion, HA, DRS and all the other fancy features that VMware has and Hyper-V doesn’t have yet - simply because they do not matter to a small business.

BackupExec Installation on a Windows Server 2008 RODC fails with V-225-212

Post by: Lukas Beeler on August 9th, 2008 | File Under Uncategorized

In our branch office in Lyss, BE i run an RODC - not because it’s needed, but a production environment is always better to gain experience than a few VMs.

As almost all data from that RODC is replicated through DFS-R, backing it up wasn’t that important, but we had a few more business needs that couldn’t be solved by using DFS-R to back up to our HQ in Horgen.

So we purchased a BackupExec Media Server license, and i tried installing BackupExec. It reminded me that installing on an RODC requires a separate Windows installation that runs SQL Server. Well, we have Hyper-V and enough Windows licenses to do this, so i didn’t think of this as a big deal.

I’ve set up a VM with WS08, installed SQL Server Express with an instance called “BKUPEXEC” and tried installing BackupExec, pointing it at the remote SQL Server Express (which was configured to allow remote connections).

The RODC is called LYS-RODC-01. The SQL Server Express VM is called LYS-SQLE-01, with a SQL Server Instance called BKUPEXEC.

It didn’t work:

08-08-2008,23:22:58 : There is no MSSQL$BKUPEXEC Service
08-08-2008,23:22:58 : V-225-212: Unable to connect to SQL Server. ***To search for information about this error, click here
08-08-2008,23:22:58 : Failed to configure SQL instance LYS-RODC-01\BKUPEXEC SQL instance to allow updates.
08-08-2008,23:22:58 : Action ended 23:22:58: InstallFinalize. Return value 2.
08-08-2008,23:22:59 : Action 23:22:59: Rollback. Rolling back action:

The error message seems strange. Why does it connect to the RODC - there is no SQL Server on the RODC, and i configured it correctly in the setup.

I read through the logfile multiple times. Didn’t find a mistake. Reinstalled the SQL Server VM a few times using a variety of SQL Server and OS combinations.

I contacted Symantec Support (which was a bit of a letdown: first i had to talk to someone in one of the Eastern European countries who could barely speak German, and next i had to talk to someone from India who could barely speak English, much less German). After almost a month, i still wasn’t anywhere near a solution.

I’ve spent a few more days playing around until i finally tried something that worked.

I changed the name of the SQL Server instance from BKUPEXEC to SQLEXPRESS.

This fixed the problem.

I’m still baffled by this.

Finally - Microsoft Gold Certified Partner

Post by: Lukas Beeler on August 7th, 2008 | File Under Uncategorized

As our Microsoft Partner Program Renewal Date is coming up, i decided to do some work to get everything together and go Gold Certified.

With just a bit of work, this has worked out (the number of sales we made also played into that).

If you’re wondering how we got 120 points with us being a rather small company, it’s quite easy:

  • Get two competencies - for this you’ll need:
    • Two MCPs with relevant certs
    • The Information Worker & Network Infrastructure competencies are the easiest
    • Three customer references per competency minimum - 10 customer references total to get full points
  • A total of 10 customer references - you’ll need only 6 if you have 18 points for Sales Performance
  • Microsoft Small Business Specialist for 5 extra Points - for this you’ll need 70-282 and an online exam.
  • A minimum of 7 points through MCPs - two MCITP/MCSE and one MCP will give you that
  • A minimum of 10 points of Sales Performance

This should give you a total of 122 points - more than enough for a Gold Certified Partner!

As a side note, my current employer Acommit AG also has a job opening in a Systems Engineer position. If you have strong Windows, Exchange and IBM i skills and are thinking of working for a company near Lake Zurich, apply now!

IBM POWER Model 520 9407-M15

Post by: Lukas Beeler on July 25th, 2008 | File Under Uncategorized

[Image: The Front of a POWER 520]

The IBM POWER Model 520 9407-M15, or M15 for short, is a single-core, 4.2 GHz POWER6 server.

It’s the successor of several System p systems (which i know nothing about), and of the System i Model 515 (9407-515). As such, it targets small businesses.

Yesterday i received the first M15, to be installed for our SaaS (Software as a Service, the IBM slang for Application Service Provider) project. This is the first standalone POWER System that i got my hands on, but i’ve already tested running IBM i on Blades.

The M15 is a standard 19″ 4U Server at half depth that can hold dual power supplies for redundant power, has a variety of expansion slots and supports up to 6 internal 3.5″ SAS Disks. The integrated SAS RAID Controller has support for a battery backed write cache. You can also install several PCI-X and PCI-E cards. The system has two 8x PCI-E slots, one 16x PCI-E slot and two PCI-X slots.

[Image: POWER 520 M15 Front without Cover]

In this case, the system came with an HMC - a normal System x3550 configured with a single 320GB SATA disk drive. Interestingly, we had ordered the same HMC (7310-CR4) half a year ago. Back then, it shipped with an 80GB SATA drive and an external modem. This unit shipped with increased capacity in the SATA disk drive, and an internal modem. Though i have no idea why anyone would still use the modem.

The HMC isn’t very interesting from a hardware perspective either, so the focus is purely on the M15.

[Image: IBM POWER 520 M15 Control Panel]

The first view of the front shows a new green bar that symbolizes POWER6. I think it could use a bit of improvement; it doesn’t look that nice. Much more interesting is that the control panel has essentially vanished. Like the light path diagnostics panel in the System x, it has to be pulled out from the system to be of any use. This was probably done to save space.

The new control panel isn’t an improvement, unfortunately. Of course on systems with an HMC attached you don’t really need it anymore - but most of the systems we are going to ship will not have an HMC. The buttons are hard to use and hard to reach - it’ll be interesting doing procedures like 65+21 on those. This isn’t a deal breaker - but from IBM i expect them to get even details like this right - after all, this isn’t a 1k Dell Server - it’s a 20k High-End IBM Server.

Much better in my opinion is that with the POWER Systems, the IBM i has finally moved to the year 2008 in regard to IO technologies. PCI-E and SAS are finally here. What i do not understand is why the M15 uses 3.5″ SAS drives, and not 2.5″ SAS drives. The latter would allow fitting more arms into the same chassis. E.g. the 2U System x3650 ships with the possibility to install up to 8 2.5″ arms. A 4U machine could have up to 16 arms - without needing more space.

The power supplies have been moved to the front of the unit, similar to the PCI Expansion units. I like it - the new power supplies are a bit smaller than the older ones, and replacement is easier, thanks to them sitting in the front.

[Image: IBM POWER 520 Model M15 Internals]

The machine internals still seem a bit sketchy to me. The M15 wastes a lot of space that is used for the second CPU in larger machines. But it also has a completely new fan design, with four centrifugal fans in the back of the machine.

The new fan hot plug mechanism is very sturdy, and is comparable to the high quality fan design used in the System x3650. This was one of the biggest downfalls of the POWER5/5+ 520/515/525 hardware platforms that has been fixed in the new hardware. RAM accessibility still isn’t optimal in my opinion - you’ll still need to remove the fans to access the memory. IBM has better solutions for this - just look at the x3550 and x3650.

The fans are very, very loud. The unit we have here is a rack unit and the fact that there is no conversion option like for the POWER5 models might mean that the tower and rack units have different acoustic configurations or different dampeners. While loudness is a complete non-issue in a server room, smaller customers sometimes have the machine in their office. As soon as i get the first tower model i’ll write about their loudness level.

[Image: POWER 520 9407-M15 HEA]

One of the things that’s completely new to the POWER platform is the HEA - the Host Ethernet Adapter. It allows sharing a physical NIC with other partitions - that’s a very good feature, but i wasn’t able to play with it yet - this machine is not partitioned.

The BBWC in this machine is now hot pluggable. It’s great to see this, but in my opinion it wasn’t really necessary. There’s a reminder directly on top of it that you need to set the disk cache into an error state before replacing it - and that reminder is very important. If you don’t pay attention to it, you might have to make an unexpected test of your DR strategy.

[Image: IBM POWER 520 M15 PCI Slots]

The expansion and console capabilities of the M15 are artificially restricted - you cannot have a PCI-Expansion Unit (no HSL/12X Ports) and you cannot have IOPs in the base unit. The conclusion: No Twinax, No U320 Tapedrives.

Especially the Twinax bit is, in my opinion, a good move. It will force customers stuck in the AS/400 days to get current with all their other hardware like printers. On the other hand, it might also cause those customers to stay stuck with their model 270 or model 800. As for me personally, i’ll have to deal less with Twinax - which has to be a good thing.

The Thin Console, a very good option for Small Businesses, is gone (because HP bought Neoware). With Twinax also gone you now have the choice to either get an HMC for 6k or get a Windows PC and use LAN Console. Neither option is really what a Small Business needs - a console that “just works”. The TC and Twinax console fit those criteria. The LAN console has issues of its own (the whole System i Access package is … aging) and the HMC requires a boatload of expertise that SMB operators just don’t have. We will go with the LAN console, mainly due to the HMC pricing, but i’m not really content with the console situation on the new systems.

[Image: POWER 520 M15 Rear]

This picture shows the rear of the unit. As you can see, the cabling in my lab environment is always top notch. The HEA offers 4 ports by default, a bit much for a standalone system. I’ve implemented a Virtual IP Address setup to create redundant network connections. Not as cool as native Teaming/Bonding support, but it works well enough.

In general, the new model has improved several things on the old hardware, left one or two things in the same state, and has two new issues (the control panel, noise level). All in all, a good, solid piece of hardware.

Questions? Comments?

Fuck Symantec

Post by: Lukas Beeler on July 23rd, 2008 | File Under Uncategorized

Customer is running two Terminal Servers on Windows 2000 Server. 32bit. 4GB of RAM.

Recently upgraded to Symantec Endpoint Protection 11, around 1 Month ago. A week ago, the customer complained that one of the Terminal Servers crashed constantly, requiring a reboot to recover.

Quick investigation showed that the machine was running out of paged pool.

Event ID 2020
Event Type: Error
Event Source: Srv
Event Category: None
Event ID: 2020
Description:
The server was unable to allocate from the system paged pool because the pool was empty.

I’m not proficient with Terminal Servers or Windows 2000, but debugging this issue was mostly similar to what you do when debugging pool issues on Windows Server 2003. First you need to enable Pool Tagging, which is enabled by default on Windows Server 2003 but not on Windows 2000. KB177415 explains how.

After that, install the Windows 2000 Support Tools, and run poolmon /p /p /b.

In my case, the output looked like this:

The limit for Windows 2000 Terminal Servers is 160 MB. As you can see, the machine here is idle and without any users on it. And we’re already at 132MB utilisation.

There are two culprits: “CM” and “SavE”. The Pooltag “SavE” is the Symantec Endpoint Protection Virus Scanner Driver. It clocks in at 50MB. The other Pooltag “CM” stands for “Configuration Manager”, and is the registry. It is 67MB big.

This is not normal - on the other Terminal Server, the CM tag is a lot smaller, only 35MB. The “SavE” tag is still 50MB. This explains why the other TS does not have the same problems as this one. But we don’t know why one registry is so much bigger than the other.
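
To put the numbers quoted so far side by side, here is a small arithmetic sketch in Python. All figures are the ones from the text above (160 MB limit, 132 MB in use, 67 MB and 50 MB for the two tags, 35 MB for CM on the healthy server). The variable and function names are made up for illustration.

```python
PAGED_POOL_LIMIT_MB = 160  # limit quoted above for Windows 2000 Terminal Servers
POOL_IN_USE_MB = 132       # usage on the idle machine, no users logged on

# The two largest pool tags on the affected server.
tags_mb = {"CM": 67, "SavE": 50}
CM_ON_HEALTHY_TS_MB = 35   # CM tag on the other Terminal Server

def share_of_pool(tag_mb: float, in_use_mb: float) -> float:
    """Percentage of the pool currently in use that a tag accounts for."""
    return 100.0 * tag_mb / in_use_mb

headroom_mb = PAGED_POOL_LIMIT_MB - POOL_IN_USE_MB
top_two_mb = sum(tags_mb.values())
cm_excess_mb = tags_mb["CM"] - CM_ON_HEALTHY_TS_MB

print(f"headroom before the limit: {headroom_mb} MB")
print(f"CM + SavE: {top_two_mb} MB "
      f"({share_of_pool(top_two_mb, POOL_IN_USE_MB):.1f}% of the pool in use)")
print(f"CM is {cm_excess_mb} MB larger than on the healthy server")
```

With less than 30 MB of headroom on an idle machine, it does not take many user sessions to exhaust the pool.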

This can be found out by using the dureg.exe tool, which can help us resolve the issue.

As you can see from the picture above, the enlarged registry is also caused by Symantec.

C:\Programme\Resource Kit>dureg /lm "SOFTWARE\Symantec\Symantec Endpoint Protection\AV\Quarantine"
Size of HKEY_LOCAL_MACHINE\SOFTWARE\Symantec\Symantec Endpoint Protection\AV\Quarantine: 26111494

Clocks in at 26MB. The Quarantine key contained around 20,000 subkeys, each with a simple number below. Each was about the same .doc file.
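
As a quick sanity check on those figures, the byte count reported by dureg can be converted and divided by the approximate number of subkeys. This is only back-of-the-envelope arithmetic in Python: the subkey count of 20,000 is the rough figure from the text, not an exact count, and the names are illustrative.

```python
QUARANTINE_BYTES = 26111494  # size reported by dureg above
APPROX_SUBKEYS = 20000       # "around 20,000" per the text, not an exact count

size_mb = QUARANTINE_BYTES / 1_000_000  # decimal megabytes
size_mib = QUARANTINE_BYTES / 2**20     # binary mebibytes
bytes_per_subkey = QUARANTINE_BYTES / APPROX_SUBKEYS

print(f"{size_mb:.1f} MB ({size_mib:.1f} MiB)")
print(f"roughly {bytes_per_subkey:.0f} bytes of registry per quarantine entry")
```

So each quarantine entry costs a bit more than 1 KB of registry space, which is small on its own but adds up quickly when the same file gets quarantined thousands of times.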

After deleting the Quarantine key, the CM pooltag went down from 67MB to 35MB - just like the other TS.

The next step was obvious: replace Symantec Endpoint Protection with something that doesn’t suck as badly, McAfee AntiVirus Enterprise. I downloaded an evaluation version and installed it.

And the results were obvious:

Do you see the pooltag “NAI0” somewhere in this list? I don’t. It’s there, but somewhere around Page 400, and surely not eating away 50MB of my paged pool.

So if you have problems with your machines running out of paged pool, frequently showing Event 2020 with Source Srv, check the registry size and replace Symantec Endpoint Protection with something that doesn’t suck that much.

IBM i on Blade - How to save?

Post by: Lukas Beeler on July 22nd, 2008 | File Under Uncategorized

Disclaimer: These are my personal experiences, not a “How to i on Blade”. If you’re looking for decent i on Blade documentation, look at the i on Blade Readme.

Setting up the hardware

Finally, after almost two months i was able to get my hands on a SAS cable with the correct pinout to connect the TS3100 library to the SAS Connectivity Module. After plugging in the cable, both link lights went up. The TS3100 was connected - physically at least.

I logged into the Storage Configuration Manager, and dared to assign the HS21 Windows Blade and the JS12 Blade to the external SAS Port. When looking at the Windows Device Manager, it immediately recognized the Tape Drive, but didn’t recognize the media changer. No big deal, as i haven’t installed BackupExec yet.

It’s important to notice that the IBM i OS will never, ever see your tape drive. You cannot connect the tape drive to the IBM i OS, only to VIOS. This means that you always will do a disk to disk backup to the VIOS partition, and then use VIOS to save the D2D Image to tape. Restoring works the other way around - you restore from tape to VIOS disk, then load that disk into IBM i, and then run your restore or even IPL from the virtual optical medium.

This is not optimal, as this means that you cannot use BRMS to manage media (you can still use it for saving, though). It also adds another layer of indirection that makes automating backups more difficult. Another important point is that other i machines will not be able to read the VIOS created tapes, and that your i Blade will not be able to read tapes created by standalone POWER machines running IBM i.

Configuring VIOS

So i went to log on to VIOS, running on the JS12 blade. I ran “lsdev | grep rmt0”, but apparently there was no tape drive to be seen. I ran “cfgdev”, to let the operating system configure devices, but that wasn’t met with success either.

This time, i chose the easy way out. I just rebooted the entire blade, and finally, the tapedrive showed up:

$ lsdev -dev rmt0 -attr
attribute      value            description                               user_settable

block_size     262144           BLOCK size (0=variable length)            True
delay          45               Set delay after a FAILED command          True
density_set_1  0                DENSITY setting #1                        True
density_set_2  0                DENSITY setting #2                        True
extfm          yes              Use EXTENDED file marks                   True
mode           yes              Use DEVICE BUFFERS during writes          True
res_support    no               RESERVE/RELEASE support                   True
ret_error      no               RETURN error on tape change or reset      True
rwtimeout      144              Set timeout for the READ or WRITE command True
var_block_size 0                BLOCK SIZE for variable length support    True
ww_id          5000e1111c878001 World Wide Identifier                     False

Okay. So as the TS3100 is an LTO4 Tape Library, the next step obviously was to load a tape into the tape drive. Now this can be done through the Web Interface or the Control Panel of the library, but that’s not the way you want to go during day to day backup - media moving should be handled by the backup software (called BRMS on IBM i).

But alas, this is not as easy on VIOS. Basically, you can get an AIX root shell on VIOS by typing “oem_setup_env” - and then there are several AIX commands to manage a tape library.

There was only one problem:

# mtlib
ksh: mtlib:  not found.
# tapeutil
ksh: tapeutil:  not found.

They’re not there. A quick Google search revealed nothing. I don’t know how IBM expects us to use a tape library. In sequential mode? Or is there some way to manage a tape library in VIOS? If you know, tell me!

So, the next step was to create writable optical media, so i could start with creating a Save 21 of the system. This seemed like a sane first step.

Initializing the media on the i side

First, i needed to create a virtual optical volume in a volume library on VIOS - i already had the volume library from the IBM i installation, so all i needed to do was to add a writable optical volume. The system used about 40GB DASD, so i created a virtual optical volume with a size of 80GB to leave room for future growth. This process took around 2 hours, probably because VIOS pre-blanked the file (it effectively used up 80GB on disk). That is a rate of roughly 0.6GB/min. During this operation, the VIOS web interface ground to a halt (didn’t respond), but SSH was still available, though it responded very slowly to each command. I suppose this is some issue with the disk controller, either the driver or firmware.
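
The rate is easy to recompute. The sketch below assumes exactly 120 minutes for the "around 2 hours" and 1 GB = 1024 MB, so the result is only a ballpark figure in the same range as the one quoted above. The function name is made up for illustration.

```python
def throughput(size_gb: float, minutes: float) -> tuple:
    """Return (GB per minute, MB per second), taking 1 GB as 1024 MB."""
    gb_per_min = size_gb / minutes
    mb_per_sec = size_gb * 1024 / (minutes * 60)
    return gb_per_min, mb_per_sec

# 80GB virtual optical volume, assumed to take exactly 2 hours to create.
gb_per_min, mb_per_sec = throughput(80, 120)
print(f"{gb_per_min:.2f} GB/min, about {mb_per_sec:.1f} MB/s")
```

Either way, a little over 10 MB/s for a sequential write to local disks is very slow, which fits the suspicion about the disk controller.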

According to vmstat, most of the time is spent in system or IO wait context:

kthr    memory              page              faults              cpu
----- ----------- ------------------------ ------------ -----------------------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa    pc    ec
 1  1 203749  6107   0   0   0 2485 3279   0 150 29483 6542  6  9 77  7  0.18  17.5
 1  1 203732  6167   0   0   0 2464 2869   0 153 30479 6333  7  9 74 10  0.18  17.6

The next step was to load the newly created virtual media into the virtual optical drive that was attached to the IBM i partition. This was rather easy to do, just click through the web interface.

Now, we need to initialize the optical volume in IBM i OS:

INZOPT NEWVOL('Save21')
DEV(OPT01)
CHECK(*NO)
TYPE(*PRIMARY)

This finished rather quickly, and i then started the Save 21.

The Save 21

GO SAVE/21. The performance wasn’t much better, though. The vmstat output looked the same as above, indicating the same problem.

Further debugging using iostat revealed that a volume group is not a RAID1 array - but nonetheless, the disk subsystem is behaving oddly:

hdisk0, hdisk1: VIOS installation - one part is the mirrored VIOS install, the other part is a volume group that spans these two disks
hdisk2-5: SCSI passthrough to the IBM i

tty:      tin         tout    avg-cpu: % user % sys % idle % iowait physc % entc
          0.0        614.0                0.0   7.7   72.2     20.1   0.1    8.3

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk0           1.0       8.0       2.0          0         8
hdisk1          85.0     20036.0     179.0      10080      9956
hdisk2           4.0     926.0      50.0        922         4
hdisk3          14.0     4481.0      98.0       4477         4
hdisk4          13.0     3730.0      96.0       3726         4
hdisk5           3.0     652.0      36.0        648         4

This is pure sequential IO - why is it reading from the disk? A similar picture was seen throughout the whole backup - even when backing up image catalogs from the IFS. hdisk1 consistently showed strong write activity. No idea why.

My AIX skills are weak, and i didn’t know a way to see on which files the write IO happened - however, it’s important to know that read and write always showed the same numbers. To me, this looks like a problem - either in my config, in the firmware levels, or even a problem on IBM’s side.
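To make the read/write observation easier to check, here is a minimal sketch that parses the disk rows of the iostat sample above. The parsing helper is my own illustration, not an AIX tool, and it only looks at this single sample. It compares what hdisk1 read and wrote with what was read from the passthrough disks hdisk2-5.

```python
# Disk rows copied from the iostat sample taken during the Save 21.
IOSTAT_SAMPLE = """\
hdisk0           1.0       8.0       2.0          0         8
hdisk1          85.0     20036.0     179.0      10080      9956
hdisk2           4.0     926.0      50.0        922         4
hdisk3          14.0     4481.0      98.0       4477         4
hdisk4          13.0     3730.0      96.0       3726         4
hdisk5           3.0     652.0      36.0        648         4
"""


def parse_disks(text):
    """Return {disk: (kb_read, kb_written)} from iostat disk rows."""
    disks = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank lines and anything that is not a disk row
        name, _tm_act, _kbps, _tps, kb_read, kb_wrtn = fields
        disks[name] = (int(kb_read), int(kb_wrtn))
    return disks


disks = parse_disks(IOSTAT_SAMPLE)
hdisk1_read, hdisk1_written = disks["hdisk1"]
# Total read from the disks passed through to the IBM i partition.
passthrough_read = sum(disks[d][0] for d in ("hdisk2", "hdisk3", "hdisk4", "hdisk5"))

print(f"hdisk1 read / written: {hdisk1_read} / {hdisk1_written} KB")
print(f"hdisk2-5 read in total: {passthrough_read} KB")
```

In this sample, the amount written to hdisk1 is close to what was read from hdisk2-5, which is what one would expect from a save. The unexplained part is that hdisk1 reads about as much as it writes.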

Either way, the Save 21 completed in 45 minutes. At around 40GB, this brings us to 0.9GB/min.

Backing up the virtual image to tape

The next step is to back up to our LTO4 tape.

Here’s how the backup itself looks:

# find /var/vio/VMLibrary/D2D_1 -print | backup -ivqf /dev/rmt0 -b 512
Backing up to /dev/rmt0.
Cluster 262144 bytes (512 blocks).
Volume 1 on /dev/rmt0
Backup finished on Mon Jul 21 20:45:18 CEST 2008; there are 167772672 blocks on 1 volumes.

And the sequential IO performance is much more reasonable:

tty:      tin         tout    avg-cpu: % user % sys % idle % iowait physc % entc
          0.0        615.0               11.2  29.0   58.2      1.6   0.4   40.9

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk0           0.0       0.0       0.0          0         0
hdisk1          99.0     81920.0     160.0      81920         0
hdisk2          17.0     2250.0      29.0          0      2250
hdisk3          12.0     2236.0      28.0          0      2236
hdisk4          12.0     2385.0      30.0          0      2385
hdisk5          16.0     2250.0      29.0          0      2250

That’s roughly 5GB per Minute. A very decent performance.
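The 5GB per minute figure follows from the Kbps column of iostat. As a small sketch of the conversion (assuming Kbps means kilobytes per second and counting 1GB as 1024 * 1024 KB), the hdisk1 rate during the tape backup can be put next to the hdisk1 rate seen during the Save 21:

```python
def kbps_to_gb_per_min(kbps):
    """Convert KB/s to GB/min, counting 1 GB as 1024 * 1024 KB."""
    return kbps * 60 / (1024 * 1024)


backup_rate = kbps_to_gb_per_min(81920.0)  # hdisk1 during the backup to tape
save21_rate = kbps_to_gb_per_min(20036.0)  # hdisk1 during the Save 21

print(f"backup to tape: {backup_rate:.2f} GB/min")
print(f"Save 21:        {save21_rate:.2f} GB/min")
```

With decimal gigabytes the backup figure comes out slightly above 5GB/min, with binary ones slightly below, so "roughly 5" holds either way. Note that this is the rate of a single iostat sample, not an average over the whole backup.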

Interoperability

Now, this is where it gets interesting. Attached to the BladeCenter S, we have a TS3100 with a single drive. In the BladeCenter S itself, we have three Intel Blades running Windows Server 2008 and one POWER Blade running IBM i.

We need to back up all this to the TS3100 - on the Windows Side, i’ll be using BackupExec 12, on the i Side VIOS. How do i make sure that the tape drive can be used from both sides, without too much interaction?

The SAS Connectivity Module can attach the same port to multiple Blades. So i installed BackupExec on one of the Windows Blades, just to see how interoperability would work out.

I ran a backup & restore on another tape, from BackupExec. This worked fine. The next step was to load the tape from the i save back into the drive, and then run a test restore from it. Unfortunately, i couldn’t use BackupExec to just move the tape into the drive, so i had to use the TS3100 Web Interface again.

I looked at the tape drive from VIOS, which also seemed okay. I also saw how much space was used on the VIOS tape after checking it with BackupExec (this is stored on a small RFID Chip on the Tape itself). But the real test was yet to come:

Restoring from Tape

I started the restore from tape.

# restore -xvqf /dev/rmt0 -b 512 /var/vio/VMLibrary/D2D_1
New volume on /dev/rmt0:
Cluster size is 262144 bytes (512 blocks).
The volume number is 1.
The backup date is: Mon Jul 21 20:24:04 CEST 2008
Files are backed up by name.
The user is root.
x  85899345920 /var/vio/VMLibrary/D2D_1
The total size is 85899345920 bytes.
The number of restored files is 1.
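The numbers printed by backup and restore can be cross-checked with a bit of arithmetic. This is only a sketch: it assumes that the "backup date" in the restore header marks the start of the backup, which i haven’t verified. It compares the block count on tape with the size of the image file and derives an average rate from the two timestamps.

```python
from datetime import datetime

BLOCK_SIZE = 512               # bytes per block, as used by backup
blocks_on_tape = 167772672     # from the backup summary
image_bytes = 85899345920      # from the restore listing

tape_bytes = blocks_on_tape * BLOCK_SIZE
overhead_bytes = tape_bytes - image_bytes  # what backup added on top of the file

# Timestamps copied from the output; both are CEST, so the zone cancels out.
started = datetime(2008, 7, 21, 20, 24, 4)    # "The backup date is" (assumed start)
finished = datetime(2008, 7, 21, 20, 45, 18)  # "Backup finished on"
elapsed_s = (finished - started).total_seconds()

# Average rate over the whole backup, counting 1 GB as 1024**3 bytes.
avg_gb_per_min = tape_bytes / elapsed_s * 60 / 1024**3

print(f"image size:   {image_bytes / 1024**3:.0f} GB")
print(f"overhead:     {overhead_bytes} bytes")
print(f"elapsed:      {elapsed_s:.0f} s")
print(f"average rate: {avg_gb_per_min:.2f} GB/min")
```

The image is exactly 80GB, and the overhead on tape is a single 262144 byte cluster. Under the start-time assumption, the average over the whole backup is somewhat lower than the rate seen in the single iostat sample.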

iostat also told me that the performance on restore was a bit worse than when backing up:

tty:      tin         tout    avg-cpu: % user % sys % idle % iowait physc % entc
          0.0        611.0               14.7  85.0    0.0      0.2   1.0  101.9

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk0           0.0       0.0       0.0          0         0
hdisk1         100.0     48216.0      68.0          0     48216
hdisk2           0.0       0.0       0.0          0         0
hdisk3           0.0       0.0       0.0          0         0
hdisk4           0.0       0.0       0.0          0         0
hdisk5           0.0       0.0       0.0          0         0

So after having restored the image file to VIOS, i could IPL from it and run a complete system restore. Nice. I didn’t want to scratch my whole setup and wait for a full restore, so i tried something simpler.

In the end, the restore from tape turned out to run at around 2GB/min. This was a lot faster than the creation of the file, which seems really odd to me.

Running a Test restore on the IBM i side

After restoring the file in the VMLibrary, it automatically appeared again in VIOS. I only had to mount it to my IBM i partition. This could easily be done through the VIOS web interface.

I ran a simple restore:

RSTLIB SAVLIB(AVNEDIAS) DEV(OPT01)

With simple results:

12 Objekt(e) von AVNEDIAS nach AVNEDIAS zurückgespeichert.

I also tried IPLing from the virtual optical media, which brought me into the limited paging DST. Nice!

Is this good?

As you can see, this is not your father’s AS/400. This is a POWER Blade running IBM i. You’ll need to learn a bit about VIOS and AIX in order to make sense of how this whole stuff works. But it’s not rocket science - i only know a bit about Linux, and was able to figure out the tasks i needed to do.

But now, how should one run this in production?

The current configuration i have seems unsuitable for production to me.

  • VIOS/AIX can’t handle the tape library in random mode. This is a big letdown.
  • Backing up to tape and restoring seems very, umm, basic to me
  • The fact that you are using virtual optical devices, with no ability on the i side to change media, makes a usable backup procedure hard to implement
  • Automation on the VIOS side could be implemented by the i running the ssh client in command execution mode (similar to how this is used with the HMC)
  • Integration between Windows and VIOS seems cumbersome

Solutions

Two Half-Height Tapedrives in a partitioned library

The TS3100 can be partitioned, and we could install two half-height LTO4 tape drives, reserving 12 slots for Windows in Random Mode and 12 slots for VIOS in Sequential Mode. This would work, but there’s one huge downside: the price.

Backing up on Windows only

Instead of using VIOS and clumsy hacks to get halfway decent functionality, you can use Windows to back up everything. Save 21 and system saves would run through VIOS in order to create bootable media, and would then be retrieved on the Windows side using SFTP.

For daily backups, we can use savefiles directly on the i, which is probably easier to deal with for most IBM i admins. These can be retrieved from the Windows side using FTP/TLS. The downside is organizational: if you have i and Windows people that work well together, i don’t see much of an issue. But if not, you’ve got a big mess on your hands.

SAS Passthrough to IBM i

Unfortunately, this option does not exist yet. This is something that IBM should work on intensively. It would allow i admins to use well-established backup processes with full library integration using BRMS.

Conclusions

Backing up the IBM i on Blade isn’t exactly easier than backing up a standalone POWER machine. In fact, it’s more difficult and requires additional skills.

Before buying a JS12 blade running IBM i, make sure that you think your disaster recovery strategy through completely. Your business partner should be able to help you with this.

Planning is crucial - Backup & Restore on the blade is different, and you’ll need to deal with VIOS when creating a procedure for fully automated Save 21 backups.

Any questions? What do you think about the situation? Want me to test something for you? Just leave a comment!

I also created an “i on Blade” category. Look at it if you want to see all my posts about this subject.


i on Blade - More details and installing software on the JS12 Blade

Post by: Lukas Beeler on July 10th, 2008 | File Under Uncategorized

The i Blade is up and running, and i’ve received quite a bit of feedback on the Installing the JS12 Blade post.

In the meantime i wasn’t just wasting my time on trivial things such as getting actual customer work done, but also playing a bit further with the JS12 Blade.

I’ve installed the software my company produces (DIAS-iS) on the JS12 blade, and ran a few very unscientific benchmarks. But first let’s talk about the disk situation with IBM i on a JS12 blade in a BladeCenter S (i really like those convoluted product names).

Managing disks under IBM i on a JS12 Blade

As i found out the hard way during the initial BladeCenter setup, the JS12 blade only supports SAS disks, and can cause issues if you have SATA disks zoned to it.

There are a few important considerations when thinking about the IBM i/JS12/BladeCenter S combination: First off, disks are directly attached to VIOS, and then virtualized by VIOS for the IBM i as SCSI disks. It’s important to note here that you do not have any (supported) options of RAIDing the disks before the IBM i sees them. So all disks are mapped through 1:1 to the IBM i OS, and then mirrored using IBM i mirrored protection. This is entirely different from the approach you would use in a BladeCenter H with a FC attached SAN.

Just to be clear: There is no cache on the JS12 and there is no way to use any disk protection except IBM i mirrored protection. You can’t use RAID5, RAID6 or hotspares. You can’t use VIOS mirrored volume groups either, because that’s unsupported.

I’m thinking about removing one of the disks from the BladeCenter in order to test what recovery from a disk failure would look like, but i’m afraid of wasting a lot of work that i’ve already invested in this system - i’ll try this shortly before i have to give everything back.

I’m not sure what the virtualization by VIOS exactly entails, but i would assume it’s fairly similar to what Hyper-V/ESX do when you create “Passthrough disks”. This probably means that things like Predictive Failure Analysis (PFA) will not work.

Another, rather obvious, drawback is that you cannot install any expansion cards (well, there is the odd one you can install into the blade). But it also means there is no Twinax, no SNA directly over Ethernet, no Modems, etc. Not a big issue for us, as we’re urging our customers to stay current on technology, but not everyone is an IBM i shop - there are still lots of AS/400 shops out there.

If you access the System i Navigator’s SST/Disk management function, it will not be able to help you with disk locations. I haven’t found out how to look up disk locations in IVM/VIOS, but then again i don’t really know much about IVM/VIOS.

Installing DIAS-iS on the JS12 Blade

I’ve installed our software without a hitch, and loaded our 30GB Test/Benchmarking database on it. I ran several benchmarks, and the JS12 with its four SAS 15kRPM 147GB Arms in a mirrored configuration and one core and 13GB RAM in the IBM i LPAR was a bit slower (less than 5%) than our System i515 with four U320 15kRPM 70GB Arms in a RAID5 configuration and 3.5 GB in the IBM i LPAR.

Unfortunately, i do not have an M15 to pit against the JS12, as these two would be using comparable technology. It would also be interesting to see an M15 with four 147GB SAS disks in a mirrored configuration to compare the systems 1:1, especially regarding disk performance.

Next steps to go?

What’s next? Well, Backup obviously. If you’ve read the i on Blade manual you’ll see that saving to tape will be interesting to say the least. I already have a TS3100 ready to go, but i’m currently missing a SAS cable for attaching it. As soon as i have the cable, expect a big post about saving and restoring.

Questions? Suggestions? Any specific questions about i on Blade? Want me to test something for you?

Leave a comment or drop me a mail. I’ll be happy to help.


i on Blade - Number of CPUs licensed

Post by: Lukas Beeler on July 9th, 2008 | File Under Uncategorized

I noticed something interesting when installing the IBM license keys on the JS12 blade running IBM i V6R1. I received Message CPF9E2D when trying to install one of the 5761-SS1 license keys. Turns out this one controls the number of cores licensed in the machine.

Luckily, this was easy to remedy. Just access IVM and remove one of the cores assigned to the LPAR running IBM i.

IBM licenses its software based on the number of cores, not the number of sockets like most other vendors do. This is important to note, because i thought the message was wrong as the blade only has a single CPU.
