Friday, December 11, 2009

New RHQ Beta Release Is Out!

The RHQ project is pleased to announce the first developer release of the RHQ 1.4.0 management platform. This developer release, 1.4.0-B01, provides an early look at the new features which are being planned for the 1.4.0 general release.

Note that we've pulled the Jopr plugins into this RHQ release. So, consider this new RHQ release a merging of both the RHQ project and the Jopr project.

Note: do not upgrade any existing Jopr or RHQ installs with this build! Specifically, the Jopr plugins in this new RHQ build are versioned differently and will not upgrade your existing Jopr plugins. We'll fix this in a later release.

Now, here are the particulars if you are interested in learning more:

Saturday, December 5, 2009

RHQ Server Plugins - Innovation Made Easy

RHQ is introducing a major new feature - server plugins. I know what you are saying... "ho-hum, another 'plugin mechanism'". It is true that this is analogous to the RHQ Agent plugin functionality - just as you can write your own agent plugins to extend the types of resources an agent can manage, you now have similar capabilities in the RHQ Server.

But when you examine the capabilities of this new server plugin subsystem, you will notice that you now have access to the full breadth and depth of functionality that the core RHQ Server code has, without having to modify core code. Using server plugins, you can plug your own management code into the RHQ Server, allowing you to add to the existing core management feature set. With as little as a single Java POJO class bundled with an XML descriptor, you can now extend RHQ Server functionality in areas you couldn't with the agent plugins. Innovating new features and capabilities will be much easier.

I put together a demo for an illustration of the potential that server plugins open up. If you are interested, the demo's plugin code can be found here. This sample plugin has a single Java POJO and an XML plugin descriptor.

The core of the server plugin container infrastructure is checked in. More work will be done to enhance it further, but it works today. What the RHQ team is currently working on is adding support for different types of server plugins.

For example, the demo mentioned above uses a "generic plugin" - that is just one type of server plugin. This is a "catch-all" type of plugin - all other plugin types will essentially build on top of the same capabilities of a "generic plugin". This generic functionality allows the server plugin developer to write Java code that gets embedded in the server and runs. Each plugin can define a plugin component (much like an agent resource component in an agent plugin). The plugin component is notified when lifecycle events occur (such as when the plugin is started and stopped). The generic plugin can also utilize the server's scheduler to run jobs on a periodic basis. For example, if I want to generate management reports, I can write a server plugin that has a job that runs every day, generating reports when it runs (in fact, this is what the demo above illustrates).
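To make the lifecycle-plus-scheduled-job idea concrete, here is a minimal sketch in plain Java. It does not use the real RHQ server plugin API: the SketchPluginComponent interface, the ReportPluginSketch class and the use of a ScheduledExecutorService as a stand-in for the server's scheduler are all assumptions made for illustration.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical lifecycle contract; the real RHQ server plugin interfaces may differ.
interface SketchPluginComponent {
    void start();
    void stop();
}

// A generic plugin sketch: lifecycle callbacks plus one periodically scheduled job.
class ReportPluginSketch implements SketchPluginComponent {
    private final List<String> reports = new ArrayList<>();
    private ScheduledExecutorService scheduler;

    @Override
    public void start() {
        // Stand-in for the server's scheduler: run the report job once a day,
        // with the first run one day after the plugin starts.
        scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::generateReport, 1, 1, TimeUnit.DAYS);
    }

    @Override
    public void stop() {
        // Release the scheduler thread when the plugin is stopped.
        if (scheduler != null) {
            scheduler.shutdownNow();
            scheduler = null;
        }
    }

    // The job body; a real plugin would query the server for management data here.
    synchronized String generateReport() {
        String report = "Management report for " + LocalDate.now();
        reports.add(report);
        return report;
    }

    synchronized int reportCount() {
        return reports.size();
    }

    boolean isRunning() {
        return scheduler != null;
    }
}
```

In this sketch the container would call start() when the plugin is deployed and stop() when it is shut down; generateReport() can also be called directly to produce a report on demand.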

Other types of plugins that are actively being developed are:

  • Content Plugins - these will allow you to define remote content repositories so you can pull in software from those remote locations to be pushed out to remote machines.

  • Alert Plugins - these will allow you to plug in remote alert destinations. Heiko has some great ideas on this and has already demoed a few of them. The one I like in particular is the alert plugin that integrates with Mobicents so when an RHQ alert is triggered, a voice message is sent to an administrator's phone. Very cool :)

  • Perspective Plugins - this is an extremely powerful type of plugin that will enable a developer to add additional graphical user interfaces to the core RHQ Server. There is an entire perspectives design, which will be built within the server plugin infrastructure.



A lot more will be communicated about these server plugins as they are built out. But you can still do server plugin development today. Documentation is available, and we'll be building out more documentation as we get further along.

Thursday, December 3, 2009

New Source Repository for RHQ

Recently some changes have been made to the infrastructure used by the RHQ and Jopr projects.

We have consolidated the work into a single fedorahosted.org project.

We have also changed from the use of JIRA to Bugzilla. All JIRA issues have been ported over to the RHQ Bugzilla project.

You may see changes to my previous blog entries - I'll go through them all to edit the links so they point to the new places, as opposed to the old, outdated SVN and JIRA locations.

Tuesday, December 1, 2009

Configuration Remediation

There were previously some requests for RHQ to support "configuration remediation". In short, it would be nice to "freeze" a resource configuration such that if that configuration ever changes for whatever reason, it immediately reverts to its "frozen" configuration state.

I spent a couple of hours coding up a prototype to show this feature in RHQ. See the flash demo here.

There are a few things to say about this.

First, specifically about configuration remediation - this demonstrates how easy it was for RHQ to support such a feature.

Second, and more generally, this effort illustrates the benefits of the RHQ architecture and the concept of "management at the edges". The code was added to the core configuration subsystem - it did not involve any one specific type of managed resource. Because of that, the couple of hours that I spent adding this feature provided a wide array of functionality - I now have configuration remediation support for any managed resource that supports configuration. The storage of the configuration on the managed resources is handled at the edges by each managed resource's plugin. But the remediation logic is in the core server, independent of any managed resource that is out in the environment. So, not only was I able to quickly add a new configuration feature to RHQ, but that feature was immediately available across all types of resources without any additional effort.

What does this mean? It means, for example, that I can now freeze configuration of a JBossAS data source just as easily as I can freeze the configuration of my OpenSSH daemon (or any other managed resource that I can configure). On top of that, my RHQ users won't have to learn many different user interfaces or different mechanisms depending on what resource they want to configure - learn one user interface and you will know how to do this for JBossAS data sources, or OpenSSH daemons, or anything else.
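As a rough illustration of why this logic can live in the core and stay independent of resource type, here is a minimal sketch. It is not the prototype's code: the class name, the use of plain string maps for configurations and the integer resource ids are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical core-side remediation logic; not the actual RHQ implementation.
class ConfigurationRemediatorSketch {
    // Frozen configurations keyed by resource id.
    private final Map<Integer, Map<String, String>> frozen = new HashMap<>();

    // Record the configuration that a resource must keep.
    void freeze(int resourceId, Map<String, String> configuration) {
        frozen.put(resourceId, Map.copyOf(configuration));
    }

    // Called when a plugin reports the live configuration of a resource.
    // Returns the configuration to push back if the live one has drifted.
    Optional<Map<String, String>> remediate(int resourceId, Map<String, String> live) {
        Map<String, String> wanted = frozen.get(resourceId);
        if (wanted == null || wanted.equals(live)) {
            return Optional.empty(); // not frozen, or nothing changed
        }
        return Optional.of(wanted); // drift detected: revert to the frozen state
    }
}
```

Nothing in the sketch knows whether resource 10001 is a data source or an SSH daemon; reading and writing the actual configuration would remain the job of each resource's plugin.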

Wednesday, October 28, 2009

Byteman Plugin for RHQ

I read Andrew Dinn's post on Byteman 1.1.1 and was very intrigued. His post and the Byteman documentation were very quick and easy to read and I found out how easy it is to use Byteman. Just one (albeit long) command line option to your Java VM and you can easily inject code into existing Java classes using Byteman's byte-code instrumentation. I plan to use Byteman to do things like "fault-injection" testing - inject code that purposefully causes errors in my application to see how my application handles the errors (this is especially useful for forcing XA/2PC transaction errors to see how the application and transaction manager handle them).

I ended up doing some quick but useful enhancements to Byteman (thanks, Andrew, for the commit access :-) that allow any Java application to perform remote byte-code injection in a JVM where Byteman is running. There is now a Java API to the Byteman "submit" client (the new-and-improved Submit class).

I then took a few hours to write up a quick Byteman RHQ plugin prototype. I recorded a Flash demo to show how it works. Still a lot more that can be done with some additional enhancements to this plugin, but hopefully this gets across how easy it is to write RHQ plugins to manage anything. This plugin will allow you to view deployed rule definitions files and their rules and allow you to remove rules. With some enhancements, it could allow you to upload and deploy additional rules as well.

Tuesday, August 4, 2009

Interactive Shell / CLI for Jopr

Work is ongoing at a feverish pace to complete the new Jopr CLI. It is filling out quite nicely. What is this CLI, you ask? It's the long-awaited and often-requested command line interface utility for the Jopr Server.

Up until now, the only user interface into the Jopr system has been the web-based GUI. The new CLI now provides a scripting interface that allows users to tap into the Jopr system from their shell.

The CLI is actually a two-in-one utility - it can act as what people typically think of as a CLI but it is also an interactive shell (this is analogous to, say, Perl or Python). The underlying scripting language is JavaScript - it's the default Java scripting implementation found in the Java 6 runtime.

Under the covers, the CLI utilizes the remote client API - this remote client API can be used by developers to write their own Jopr clients (anyone out there interested in writing a CLI GUI? :-) You can use the Java client or you can rely on the Web Services API that is exposed by the Jopr Server, allowing a client to consume the Jopr WSDLs and perform remote commands over Web Services.

I'll briefly show you just a few things you can do with the CLI. This should give you a taste for how you can interact with the CLI and some of its basic features.

First, here's the "long way" to query a list of resources and filter the results based on some custom criteria. Here I will ask the Jopr Server for all resources that have "mazzthink" in their name:


[mazz@mazzfedora bin]$ ./rhq-cli.sh -u rhqadmin -p rhqadmin -s mazzthink
RHQ - RHQ Enterprise Remote CLI 1.3.0-SNAPSHOT
Login successful
rhqadmin@mazzthink:7080$ criteria = new ResourceCriteria();
ResourceCriteria:

rhqadmin@mazzthink:7080$ criteria.addFilterName('mazzthink');

rhqadmin@mazzthink:7080$ ResourceManager.findResource

findResourceComposites findResourceLineage findResourcesByCriteria
rhqadmin@mazzthink:7080$ ResourceManager.findResourcesByCriteria(criteria);
id name version currentAvailability resourceType
------------------------------------------------------------------------------------------
10001 mazzthink Win32 5.1 UP Windows
10002 Trapd (mazzthink) DOWN SnmpTrapd
10004 mazzthink RHQ Agent 1.3.0-SNAPSHOT UP RHQ Agent
10005 mazzthink Jopr Server, JBossAS 4.2.3.GA default (0.0.0.0:2099) 4.2.3.GA UP JBossAS Server
10007 mazzthink Apache 2.2.9 (C:\mazz\apache-httpd\) 2.2.9 DOWN Apache HTTP Server
10008 mazzthink Apache 2.2.9 (C:\apache\) 2.2.9 DOWN Apache HTTP Server
10035 mazzthink File System (local) C:\ UP File System
10059 mazzthink Embedded JBossWeb Server 2.0.1.GA (0.0.0.0) 2.0.1.GA UP Embedded Tomcat Server
8 rows


A couple of things to note from above. First, I created a ResourceCriteria that allows me to set filters, sorting and paging controls. Here I just filter on the name "mazzthink".

Second, I used the CLI interactive shell's auto-complete feature to help me figure out what methods are available on the ResourceManager. I simply typed "ResourceManager.findResource" then hit the Tab key and you see the suggestions printed out by the auto-completer. I then chose to use the findResourcesByCriteria method.

Since this is one of the most common things to want to do in the CLI (that is, query for resources), there is a shortcut method you can use for this - findResources. Below you will see me doing the same query as above, only using the "short way":


rhqadmin@mazzthink:7080$ findResources('mazzthink')
id name version currentAvailability resourceType
------------------------------------------------------------------------------------------
10001 mazzthink Win32 5.1 UP Windows
10002 Trapd (mazzthink) DOWN SnmpTrapd
10004 mazzthink RHQ Agent 1.3.0-SNAPSHOT UP RHQ Agent
10005 mazzthink Jopr Server, JBossAS 4.2.3.GA default (0.0.0.0:2099) 4.2.3.GA UP JBossAS Server
10007 mazzthink Apache 2.2.9 (C:\mazz\apache-httpd\) 2.2.9 DOWN Apache HTTP Server
10008 mazzthink Apache 2.2.9 (C:\apache\) 2.2.9 DOWN Apache HTTP Server
10035 mazzthink File System (local) C:\ UP File System
10059 mazzthink Embedded JBossWeb Server 2.0.1.GA (0.0.0.0) 2.0.1.GA UP Embedded Tomcat Server
8 rows


Now that I have my list of resources, I can use the results to help perform additional work. Suppose I want to know how much free memory I have on my "mazzthink" machine. I can see from my query above that the "mazzthink" Windows platform has an id of 10001. I'll pass that to the ProxyFactory.getResource method to retrieve a resource object that I can use to access information about that resource (such as its free memory metric):


rhqadmin@mazzthink:7080$ myResource = ProxyFactory.getResource(10001);
rhqadmin@mazzthink:7080$ myResource.

OSName OSVersion
architecture children
createdDate description
freeMemory freeSwapSpace
getChild getMeasurement
handler hostname
id idle
manualAutodiscovery measurementMap
measurements modifiedDate
name operations
pluginConfiguration pluginConfigurationDefinition
systemLoad toString
totalMemory totalSwapSpace
usedMemory usedSwapSpace
userLoad version
viewProcessList waitLoad
rhqadmin@mazzthink:7080$ myResource.freeMemory
Measurement:
name: Free Memory
displayValue: 872.8MB
description: The total free system memory


Notice above I illustrate the usage of the auto-completer again. I hit Tab after typing in "myResource." and the CLI showed me all the valid measurements, operations and other methods that are available for that resource. I chose to get the "freeMemory" measurement, and you can see above that this resource has 872.8MB of free memory.

In addition to getting measurement data, another thing I can use the CLI for is to execute operations on managed resources. Continuing our example with the mazzthink platform resource, suppose I wanted to get a list of all the processes running on that box. The platform resource has a "viewProcessList" operation available to it (as you can see above from the auto-completer output). If you are unsure what operations are available on a resource, you can use the "operations" method on your resource proxy object to get the list:


rhqadmin@mazzthink:7080$ myResource.operations
name description
------------------------------------------------------------------------------------------
----------------------------------------------------------------------
viewProcessList View running processes on this system
manualAutodiscovery Run an immediate discovery to search for resources
2 rows


Using the CLI, we can invoke that viewProcessList operation on the mazzthink Windows platform and have the results shown on the CLI console:


rhqadmin@mazzthink:7080$ myResource.viewProcessList()
Invoking operation viewProcessList
Configuration [12186] - null
processList [99] {
process [5] {
kernelTime = 25140
name = winlogon.exe
pid = 1836
userTime = 8156
size = 95723520
}
process [5] {
kernelTime = 31
name = httpd.exe
pid = 516
userTime = 31
size = 25780224
}
process [5] {
kernelTime = 2296
name = postgres.exe
pid = 1572
userTime = 937
size = 75284480
}
process [5] {
kernelTime = 159421
name = javaw.exe
pid = 2776
userTime = 380347
size = 1215180800
}
process [5] {
kernelTime = 91190
name = firefox.exe
pid = 6088
userTime = 251285
size = 876466176
}
...
}


And finally, to illustrate that the CLI is really a CLI and not just an interactive shell, you can pass a script command directly on the command line via the -c option:


[mazz@mazzfedora bin]$ ./rhq-cli.sh -u rhqadmin -p rhqadmin -s mazzthink -c "ProxyFactory.getResource(10001).freeMemory;"
RHQ - RHQ Enterprise Remote CLI 1.3.0-SNAPSHOT
Login successful
Measurement:
name: Free Memory
displayValue: 884.5MB
description: The total free system memory


The CLI also has a -f option that allows you to specify a script file to be executed - allowing you to write and prepackage scripts for execution later.

That's the CLI in a nutshell. It's still a work in progress - the remote API is still being fleshed out and enhanced to support the many use-cases we expect people are going to want to support.

The community documentation is also currently being worked on. But there is some documentation now, with more to come at the following pages:

CLI Install Documentation
Running the CLI documentation

Friday, July 10, 2009

Monitoring Generic Processes

A previous blog of mine talked about how you can monitor an application or generic operating system process by going through its Script/CLI interface (you can watch a demo of the Script plugin to see it in action).

There is now another way to do this in Jopr, thanks to the new process monitoring feature introduced in the platform plugin.

Even if you do not have a CLI interface for the operating system process you want to monitor, you can still ask Jopr to monitor it - you do so by pointing Jopr to a pidfile containing the process's pid, or you can give Jopr a PIQL process query to execute to find the process (PIQL = process information query language). Note that this is all cross-platform - you can monitor processes on Windows just as easily as you can on any UNIX-flavored operating system, so long as there is native support for your platform and it is enabled (which is normally the case, so you typically don't have to worry about this - you'll get this stuff for free).

What this means is that you can monitor any process running anywhere out on your network. This process monitoring mechanism can be used to inform you if your managed process is up or down, and it can collect some metrics on the managed process, such as CPU utilization, memory consumption, and file descriptor usage.
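The pidfile approach boils down to "read a pid, then ask the operating system whether that process is alive". Here is a minimal sketch of that idea using only the standard Java library. It is not how the platform plugin does it (the post says the plugin relies on native support); the class and method names are hypothetical, and ProcessHandle requires Java 9 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; not the platform plugin's actual code.
class PidFileMonitorSketch {
    // Returns true if the pidfile names a process that is currently alive.
    static boolean isProcessUp(Path pidFile) {
        try {
            return isPidAlive(Files.readString(pidFile));
        } catch (IOException e) {
            return false; // missing or unreadable pidfile: treat the process as down
        }
    }

    // Parses the pid text and asks the JVM whether such a process is running.
    static boolean isPidAlive(String pidText) {
        try {
            long pid = Long.parseLong(pidText.trim());
            return ProcessHandle.of(pid).map(ProcessHandle::isAlive).orElse(false);
        } catch (NumberFormatException e) {
            return false; // malformed pidfile content
        }
    }
}
```

A monitor would call isProcessUp on a schedule and report UP or DOWN accordingly; collecting CPU, memory or file descriptor metrics is outside what this sketch covers.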

You can watch a demo of this process monitoring feature to get a feel for how it can be used and what it does.

Saturday, June 27, 2009

Managed Resources Exposed Via WebDAV

Thanks to Greg Hinkle, Jopr has recently introduced an experimental feature that exposes your managed environment as a WebDAV repository.

This means that you can use any WebDAV client to browse your managed environment as if it were a simple file system and peek into your resource hierarchy, obtaining information such as a resource's availability, its configuration and measurement trait data (watch this WebDAV demo to see it in action; Greg's blog post and demo are here).

If you aren't familiar with WebDAV, all this means is that your managed environment will look like a simple file system (in the Microsoft Windows vernacular, it is called a "Web Folder"). Directories in this "file system" or "Web Folder" represent managed resources and files found in those directories represent data about those resources.

So, for example, if you had a machine called "comp.xyz.com", and on that machine you have installed a JBossAS server, a PostgreSQL server and a network adapter, your WebDAV file system paths to access those resources could look like this (notice how they look like simple file system paths):






Machine itself:   /webdav/resource/comp.xyz.com
Network adapter:  /webdav/resource/comp.xyz.com/eth0
JBossAS server:   /webdav/resource/comp.xyz.com/Banking%20App
PostgreSQL DB:    /webdav/resource/comp.xyz.com/My%20Postgres
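The paths above follow a simple pattern: a fixed root, then one percent-encoded segment per resource name going down the hierarchy. A small sketch of that mapping follows; the helper class is hypothetical and not part of Jopr, and it assumes the path layout is exactly what the examples show.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that mirrors the example paths; not part of Jopr.
class WebDavPathSketch {
    static final String ROOT = "/webdav/resource";

    // Builds a path from resource names, walking from the platform down to the child.
    static String pathFor(String... resourceNames) {
        StringBuilder path = new StringBuilder(ROOT);
        for (String name : resourceNames) {
            // URLEncoder targets form encoding and emits '+' for a space,
            // so swap it for the %20 that URL paths use.
            String segment = URLEncoder.encode(name, StandardCharsets.UTF_8).replace("+", "%20");
            path.append('/').append(segment);
        }
        return path.toString();
    }
}
```

For example, pathFor("comp.xyz.com", "Banking App") yields /webdav/resource/comp.xyz.com/Banking%20App, matching the JBossAS server entry above.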


This is currently in the "experimental" phase; however, by watching my WebDAV demo, it should be easy to see how this can prove to be a very powerful feature. Using any WebDAV client, you could potentially obtain a lot of information about your managed resources (measurement data, alerts, events, logs, configuration and much more). It's pretty easy to add functionality to this WebDAV interface, so providing access to these kinds of data should not be hard to accomplish. In about 30 minutes, I added the ability to view a resource's measurement traits (you will see that feature in the demo).

6/29/2009 note: I spent another couple of hours coding and was able to incorporate authentication/authorization into the WebDAV tier so it utilizes the normal Jopr authz layer. I also added a new WebDAV resource, "measurement_data.xml", which is an XML file of the numeric measurement data for a resource (min, max, avg values, etc.).

Now we just need to find the time to flesh out this functionality and make it production ready with additional features. If you are interested in working with the Jopr team and willing to get your hands dirty writing code, this might not be a bad place to start. You really don't need to know too much about the internals of the Jopr Server - you just have to interface with the internal Jopr stateless session EJB3 beans to obtain data about resources. The WebDAV API seems pretty easy to use - it's all based on the third party library called Milton. Don't hesitate to ask the team for help in getting started as a Jopr developer; we are usually around on freenode at #jopr, or you can send an email to "jopr-dev at lists.jboss.org".

Friday, June 5, 2009

Jopr Is Now Easier To Install and Demo

Well, Joe did it again. I don't know how he does these things so fast but here you go:

http://josephmarques.wordpress.com/2009/06/03/jopr-has-embedded-database-support/

With this new H2 embedded database support, you can now install Jopr much more easily. You no longer have to install your own PostgreSQL or Oracle database separately. You now have the option to use the "embedded mode" which tells Jopr to use the embedded H2 database and the embedded agent.

This means that installing and trying out Jopr involves these simple steps (oh, and these are cross-platform instructions - it works the same on Windows as it does on UNIX or Linux):

  1. Unzip the Jopr distribution
  2. Run the Jopr server startup script (rhq-server.[sh, bat] start)
  3. Point your browser to http://localhost:7080
  4. Click the "Embedded Mode" button
  5. Click the "Install!" button
That's it! Jopr will finish the installation for you, creating your H2 database and starting up both the server and embedded agent. The agent will automatically begin discovering and monitoring resources it finds on your box. The browser will provide you with a link to get started in the Jopr UI.

Use this quick and easy method to try out Jopr. You can try it out and demo it on anything from a laptop to a mainframe - so long as you have a Java virtual machine available.

(To Joe - thanks, job well done! Joe did the heavy lifting, I just enhanced the installer to be able to install Jopr using the embedded database.)

Monday, May 18, 2009

Supporting Events In Your Custom Plugin

This blog is in response to a question from user "earthling" about my Correlating Events with Jopr blog entry.

"How do i use this feature [Events] in a custom plugin?"

It's easiest if you just copy the code from one plugin to your custom plugin. It isn't that much code.

First, you need to declare that your plugin's custom resource type supports events. Look at how the JBossAS plugin descriptor does this - you can probably copy the <plugin-configuration>...<c:group name="event" displayName="Events"> section verbatim.

Then, your custom ResourceComponent subclass needs to add an event poller to the plugin container. You can do this within your start() method, as the JBossASServerComponent does, like this:

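public void start(ResourceContext context) throws Exception {
    // Keep the context; the event helper needs it to set up its pollers.
    this.resourceContext = context;
    this.logFileEventDelegate = new LogFileEventResourceComponentHelper(this.resourceContext);
    // Begin polling the configured log files for events.
    this.logFileEventDelegate.startLogFileEventPollers();
    ...
}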


Note that the "LogFileEventResourceComponentHelper" class is a new one that Ian added after 2.2. So, you'll have to build trunk/HEAD to be able to use it. If you have 2.2 or earlier, just grab that class and use it in your plugin (or at least copy-n-paste its code).

In your component's stop() method, you need to remove the event poller, too:

public void stop() {
    // Stop the pollers that were started in start().
    this.logFileEventDelegate.stopLogFileEventPollers();
    ...
}


Once your resource has started, it will start emitting events that the Jopr Server will show you! You get all the event functionality by adding that little bit of metadata and code. Of course, the example above assumes your events come from log4j log files. But events can come from anywhere. For example, the Windows platform plugin component can emit events from the Windows Events subsystem and Greg Hinkle has updated the Linux plugin component so that it emits syslog messages from his Linux box (he should probably check that in soon :).

Tuesday, April 28, 2009

Managing Resources That Have A CLI

Sometimes, you may find yourself wanting to manage a resource for which there is no current Jopr plugin or, perhaps, the resource doesn't even have a management API or interface.

I have developed a new "Script plugin" that may help manage these "unmanageable" resources. You can watch a demo that shows the Script plugin in action.

This plugin will assume there is a command line executable or script that can be used to interface with the managed resource. What you do is manually add your "Script" resource to the Jopr inventory - defining where the command line executable or script is, and how its results/output should be used to determine the state of the managed resource. To be clear, the "CLI" executable/script is not really the managed resource - it is only used to interface with the real managed resource. The typical example I give people is: "apachectl" is the script, Apache Web Server is the real managed resource.

There is the ability to show RED or GREEN availability for any Script resource. For example, I can configure the Script plugin to assume the resource is UP (aka GREEN) if:
  • the executable merely exists on the file system, or...
  • the executable was able to run or...
  • the executable ran and returns a certain exit code and/or...
  • the executable ran and output a certain text string

The idea is that I have a command line executable or script that needs to get executed with a certain set of arguments. A successful execution would be determined by a regular expression that needs to match the exit code or output from that CLI.

You can also use the same type of mechanism to define how to invoke the executable to obtain a certain set of metrics. You define the arguments and a regular expression; whatever the regex captures from the executable's output becomes the value of the metric.
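The availability and metric rules described above can be sketched as two small functions over an executable's exit code and output. This is only an illustration, not the Script plugin's real classes or configuration names: the class, the method signatures and the choice of the first capture group as the metric value are assumptions. Actually running the executable (for instance with ProcessBuilder) is left out.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the checks described above; not the Script plugin's code.
class ScriptResultSketch {
    // UP when the exit code and the output each satisfy their regex (null = don't care).
    static boolean isUp(int exitCode, String output, String exitCodeRegex, String outputRegex) {
        boolean exitOk = exitCodeRegex == null
                || Pattern.matches(exitCodeRegex, Integer.toString(exitCode));
        boolean outputOk = outputRegex == null
                || Pattern.compile(outputRegex, Pattern.DOTALL).matcher(output).find();
        return exitOk && outputOk;
    }

    // The first capture group of the regex becomes the metric value, if it is numeric.
    static OptionalDouble metric(String output, String regex) {
        Matcher m = Pattern.compile(regex).matcher(output);
        if (m.find() && m.groupCount() >= 1 && m.group(1) != null) {
            try {
                return OptionalDouble.of(Double.parseDouble(m.group(1)));
            } catch (NumberFormatException e) {
                // Captured text was not a number; fall through to "no value".
            }
        }
        return OptionalDouble.empty();
    }
}
```

With this shape, a status script that exits with 0 and prints "is running" can be marked UP by passing "0" and "is running" as the two patterns, and a line such as "Total accesses: 1234" can feed a metric through a pattern with one capture group around the digits.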

This plugin should be useful because it is very generic - you can use it as-is for your own generic resources that have command line interfaces, a developer could extend the plugin classes, or a developer could just write their own plugin descriptor and have it pick up the Script plugin's ability to execute any executable and get the results. I think we'll be able to extend this to build a Nagios plugin, for example.

If you have any thoughts or ideas on what features such a Script plugin should have, feel free to comment. I would be interested to know what kinds of CLIs people run that manage their resources.

If you are interested in helping develop a Nagios plugin, let me know - I'm very desperate for help on that. :)

[note: this blog post was updated July 2009 to reflect the new name of the plugin. This plugin was previously called the "CLI" plugin, but in order to avoid confusion with Jopr's new remote client/CLI, it was renamed to "Script" plugin.]

Sunday, April 26, 2009

Applying Patches, Updates and Other Content via Jopr

We like to say Jopr is a management platform, not just a monitoring tool. This is because Jopr does more than just monitor the health of managed resources. It can be used to configure resources and control resources, too. In addition, Jopr has a content subsystem that allows it to deploy content to any managed resource that supports the content facet.

What does this mean? It means that you can set up your Fedora boxes to be able to "yum install" packages directly from Jopr; it means you can install patches to your JBossAS Servers; it means just about anything you want it to mean (in the context of pushing and pulling content) because Jopr is extensible in such a way that you can write your own plugins to do what you need it to do with respect to pulling down content for deployment to your custom resources.

I've created a Jopr content demo that shows how you can aggregate content from multiple remote repositories into Jopr and then have Jopr serve that content to resources it is managing.

Here is an architectural diagram that discusses how content flows from remote repositories to the Jopr Server through to the managed resources via the Jopr Agent. This diagram, coupled with the demo, should provide some good insight into the basics of the Jopr content subsystem.


Tuesday, April 21, 2009

Jopr and Embedded Jopr - What Are They?

Jopr and Embedded Jopr. Aside from each having Jopr in the name, how are they related? How are they different? Why do these two projects exist? I'll try to explain that here.

Jopr is a management platform. It allows you to manage an entire network of systems and products hosted in your IT environment.

Embedded Jopr is a management console that is embedded in a JBoss Application Server to allow you to manage that particular JBossAS instance.

Jopr and Embedded Jopr use what are known as "plugins" to do the real management work - each plugin is specific to a particular product or software component that needs to be managed. For example, today there exist plugins to manage JBossAS, Tomcat, Hibernate, and PostgreSQL, among other things. These "plugins" live in what is called the "Jopr Plugin Container". This plugin container is responsible for controlling the lifecycle of the plugins.

Jopr and Embedded Jopr are related in one important aspect - they share a lot of the same code! The beauty of the architectural design of Jopr is such that you can take the Jopr Plugin Container and its plugins and embed them in any Java virtual machine - including a virtual machine that is running a JBossAS instance.

Embedded Jopr uses the Jopr Plugin Container and embeds it directly inside a web application (aka .war) that is deployed inside JBossAS, allowing you to directly manage that JBossAS instance. In other words, Embedded Jopr is managing the very JBossAS instance in which it lives.

Jopr, on the other hand, has a standalone Jopr Agent and it is this Jopr Agent that embeds the Jopr Plugin Container. The Jopr Agent is able to do a few things Embedded Jopr cannot do but the main difference is that the Jopr Agent can communicate with a Jopr Server cloud, allowing it to participate in a full-fledged management enterprise environment.

The next couple of diagrams illustrate these two different models. First is Embedded Jopr. Notice that everything here is directly hosted within a managed JBossAS instance. Inside JBossAS are, of course, its own internal services, one of which is Embedded Jopr. Embedded Jopr is used to manage this JBossAS instance. It is analogous to the jmx-console, only with more powerful features.

You will notice that inside Embedded Jopr is code for the actual user interface as well as the Jopr Plugin Container. This Jopr Plugin Container code is identical to the plugin container used by Jopr - not a single line of code is different in this Jopr Plugin Container as compared to the plugin container deployed within the Jopr Agent (which we'll talk about later on). The plugins are also identical. In fact, this is designed in a way that allows others to write custom Jopr plugins and deploy them, not only in Jopr Agents, but in Embedded Jopr as well.

The hope is that Embedded Jopr will be able to support the management of other services deployed in its JBossAS instance, thus allowing you to enhance and extend Embedded Jopr with off-the-shelf plugins or your own custom plugins to manage your own custom services. There are many opportunities to extend Embedded Jopr in this way - think of all the different components that you could deploy in JBossAS that you'd want to manage (Portal, Seam, jBPM, Drools, JBossTS, JBossCache, etc.); if you had plugins for them, you'd just deploy them inside Embedded Jopr's plugin container and you could immediately begin to manage them. You could then take those plugins and later deploy them inside a Jopr Agent (without the need to touch or even rebuild your plugins) and have them be used within a full-fledged Jopr management enterprise. This concept has been proven to work by the mere fact that Embedded Jopr exists. Embedded Jopr already ships with some plugins common with the Jopr Agent (the JMX plugin is one of them).

Now let's look closer at the deployment model of Jopr. First, notice that the Jopr Agent, at its core, is the same Jopr Plugin Container that we saw in Embedded Jopr. It's the same code. Reusability is a wonderful thing. And remember, not only is the Jopr Plugin Container reused, but the plugins themselves are 100% reusable, too. But now notice the difference. The Jopr Agent is standalone - it's separate from any managed JBossAS instance. In fact, the Jopr Agent can manage multiple JBossAS instances! Not only that, but it can also manage any number of products or components - PostgreSQL databases, remote Java Virtual Machines, operating system services, and anything else you want (even your own custom products or components), as long as you have a plugin that is capable of managing them (I won't get into plugin development topics; see the Plugin Development wiki to learn how you can write your own plugins).

The next major difference you see here is that the Jopr Agent can communicate with the Jopr Server cloud (which consists of one or more Jopr Servers with persistent storage backing those servers). The Jopr Server provides additional capabilities not available to Embedded Jopr. These include the ability to: persist historical metric data from your managed resources, provide an alerting mechanism to notify IT administrators when something goes wrong within your managed environment, provide a persisted audit trail for security tracking, provide persistent storage of events occurring within your managed resources, and many other things.

Hopefully, I've answered the basic question, "What is Embedded Jopr and how is it different from Jopr?".

I also hoped to convey a bit of an architectural overview of the Jopr Plugin Container and its management plugins, and how their reusability is exploited to make development and use of both Embedded Jopr and Jopr much easier.

Saturday, April 4, 2009

SVN Statistics

I wanted to collect some SVN statistics for some codebases and found a very nice project that generates useful SVN reports and graphs. The project is called StatSVN.

It's very easy to use. I'll include both a UNIX and Windows script here so no matter what platform you are on, you can easily use this. Each script is only about 20 lines long.

First, download the statsvn.jar from the StatSVN download page.

Now, put the run-svnstats script (which I give below) in the same directory as the statsvn.jar file. On UNIX, use the first script; on Windows, use the second:

UNIX Script



#!/bin/sh

# Edit these three variables to match your environment
WORKING_COPY=/home/me/perf/src/rhq/trunk/modules
STATS_GEN_DIR=/home/me/svnstats-gen
START_END_REVISIONS=10:HEAD

SVN_LOGFILE="$STATS_GEN_DIR/svn.log"
STATSSVN_JAR="$STATS_GEN_DIR/statsvn.jar"

# Update the working copy and dump its verbose XML log for the revision range
cd "$WORKING_COPY" || exit 1
svn up
svn log -v --xml -r "$START_END_REVISIONS" > "$SVN_LOGFILE"

# Start from an empty svnstats directory and run StatSVN from inside it
cd "$STATS_GEN_DIR" || exit 1
rm -rf svnstats
mkdir -p svnstats
cd svnstats || exit 1
java -jar "$STATSSVN_JAR" -concurrency-threshold 2000 -threads 50 "$SVN_LOGFILE" "$WORKING_COPY"

echo "SVN Statistics are now located here: $(pwd)"


Windows Script



@echo off

rem Edit these three variables to match your environment
set WORKING_COPY=C:\source\rhq\trunk\modules
set STATS_GEN_DIR=C:\svnstats-gen
set START_END_REVISIONS=10:HEAD

set SVN_LOGFILE=%STATS_GEN_DIR%\svn.log
set STATSSVN_JAR=%STATS_GEN_DIR%\statsvn.jar

rem Update the working copy and dump its verbose XML log (/d also switches drives)
cd /d "%WORKING_COPY%"
svn up
svn log -v --xml -r %START_END_REVISIONS% > "%SVN_LOGFILE%"

rem Start from an empty svnstats directory and run StatSVN from inside it
cd /d "%STATS_GEN_DIR%"
if exist svnstats rmdir /S /Q svnstats
mkdir svnstats
cd svnstats
java -jar "%STATSSVN_JAR%" -concurrency-threshold 2000 -threads 50 "%SVN_LOGFILE%" "%WORKING_COPY%"

echo SVN Statistics are now located here: %STATS_GEN_DIR%\svnstats


Lastly, you just have to edit the script to match your environment. Change the three variables defined at the top:


  • WORKING_COPY: the full path to the SVN working copy as found on your local file system. This SVN working copy will be svn updated as part of the script, so make sure you don't mind having this working copy get updated to the latest revision.

  • STATS_GEN_DIR: the full path of the location where you stored the statsvn.jar and the scripts. The generated SVN log and the final HTML pages, reports and images will get stored here as well.

  • START_END_REVISIONS: the starting revision number and ending revision number of your SVN working copy. Only SVN statistics for that revision range will be reported on. This should be in SVN format (e.g. 1234:5678 or 1:HEAD)



That's it. Run the script and out comes an "svnstats" directory with a full set of HTML pages and images (this svnstats directory will be located under the STATS_GEN_DIR directory). Just copy the svnstats directory to a webserver (e.g. Apache's htdocs directory) and point your browser to the svnstats location and you can now peruse your SVN reports.

You can even run this script as a cron job to automatically update your stats periodically. Just have STATS_GEN_DIR point to a directory that your webserver can serve up and make sure your START_END_REVISIONS range ends with HEAD (so new checkins will get reported).

I thought this was a very simple yet cool way to get SVN stats for any SVN project; hope you find this as helpful as I did.

Thursday, March 26, 2009

Cross-Facet Correlation Provided by Jopr

I'm surprised you are still reading this, given the nebulous title of this blog :). But I'm glad you are here. I want to further explain what my last blog was really trying to convey.

Jopr is essentially an abstract management framework - at its core, Jopr knows nothing about the concrete managed resources it is actually managing. Specific, concrete knowledge of the managed resource implementations is pushed out to the edges - into the agent plugins themselves.

But what Jopr knows, at its core, are the different management "facets" that those managed resources support. What is a "management facet"? Simple - a facet is a subset of functionality that any given managed resource may or may not support. Facets are orthogonal - a managed resource and its plugin can support any number of facets in any combination.

Jopr supports several types of facets - for example, the measurement facet (for collecting metric data), the operation facet (for controlling managed resources), the configuration facet (for retrieving and setting resource configurations), the content facet (for pushing and pulling file content to/from managed resources), the event facet (to emit asynchronous event information such as a managed resource's log file messages) and others.
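
To make the idea of orthogonal facets a bit more concrete, here is a minimal Java sketch. The interface and class names (MeasurementSupport, ConfigurationSupport, ContentSupport, FacetInspector and the two sample components) are made up for illustration and are not the real Jopr plugin API; the point is only that the core asks which facets a component implements, never which product it manages.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical facet interfaces: each is an independent slice of management functionality.
interface MeasurementSupport { double collect(String metricName); }
interface ConfigurationSupport { String loadConfiguration(); }
interface ContentSupport { List<String> discoverPackages(); }

// A made-up component that supports measurement and configuration, but not content.
class SampleAgentComponent implements MeasurementSupport, ConfigurationSupport {
    public double collect(String metricName) { return 42.0; }
    public String loadConfiguration() { return "heartbeat=30s"; }
}

// Another made-up component that supports measurement and content, but not configuration.
class SampleAppServerComponent implements MeasurementSupport, ContentSupport {
    public double collect(String metricName) { return 7.0; }
    public List<String> discoverPackages() {
        List<String> packages = new ArrayList<String>();
        packages.add("example.jar");
        return packages;
    }
}

final class FacetInspector {
    // The core only checks which facet interfaces a component implements.
    static List<String> supportedFacets(Object component) {
        List<String> facets = new ArrayList<String>();
        if (component instanceof MeasurementSupport) facets.add("measurement");
        if (component instanceof ConfigurationSupport) facets.add("configuration");
        if (component instanceof ContentSupport) facets.add("content");
        return facets;
    }

    public static void main(String[] args) {
        System.out.println(supportedFacets(new SampleAgentComponent()));
        System.out.println(supportedFacets(new SampleAppServerComponent()));
    }
}
```

A UI built on something like supportedFacets() could decide which tabs to show for a resource without any product-specific code.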

There are numerous advantages for having an abstract management platform that can handle all of these different management facets for many different kinds of managed resources. One of these advantages is the ability to correlate and process information across all the different facets.

Look at Jopr's summary timeline and what "cross-facet correlation" means soon becomes very apparent. This timeline (with time moving from left to right) correlates, in one view, information such as:
  • when the resource was up and down; notice the background color - light green means the resource was up at that time, red means the resource was down, and grey means the state of the resource was unknown (measurement facet)
  • when events (e.g. log messages) happened, how many of them occurred and how severe these events were (severity is shown in the badging of the event icons) (event facet)
  • when alerts were raised and how severe those alerts were (alerts are shown as flag icons with their severity indicated by their color). Alerting isn't an actual management facet per se; the alert subsystem is provided to all managed resources with no effort needed on the part of the plugin - alerts are provided "for free".
  • when a resource's configuration was changed and whether that change succeeded or failed (configuration facet)
  • when an operation was invoked on a resource and whether that attempt to control the resource succeeded or failed (operation facet)
This is what "cross-facet correlation" is all about. How is this helpful? Well, if I see a gap of events in my timeline, where that gap has a background of red, that tells me very quickly that I am not receiving any events because the resource is down! If I see that I changed the configuration of a resource, and soon after I see a flood of warning events, perhaps followed by a red background, I can immediately begin to suspect that the configuration change had an adverse effect on the behavior of my resource. If I executed an operation and soon after see one or more alerts trigger, I should start by investigating whether that operation caused the resource to act strangely.
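
The kind of reasoning described above can be sketched in a few lines of Java. This is only an illustration under my own assumptions: TimelineEntry, TimelineSketch and the facet names used as strings are hypothetical and say nothing about how Jopr actually stores or queries its history. The sketch merges per-facet histories into one time-ordered list and then picks out entries that occurred shortly after a configuration change.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical timeline entry; the field names are illustrative only.
final class TimelineEntry {
    final long timestamp;  // milliseconds
    final String facet;    // e.g. "configuration", "event", "alert", "operation"
    final String detail;

    TimelineEntry(long timestamp, String facet, String detail) {
        this.timestamp = timestamp;
        this.facet = facet;
        this.detail = detail;
    }

    public String toString() { return timestamp + " [" + facet + "] " + detail; }
}

final class TimelineSketch {
    // Merge the histories of several facets into a single list ordered by time.
    static List<TimelineEntry> merge(List<List<TimelineEntry>> histories) {
        List<TimelineEntry> merged = new ArrayList<TimelineEntry>();
        for (List<TimelineEntry> history : histories) {
            merged.addAll(history);
        }
        Collections.sort(merged, new Comparator<TimelineEntry>() {
            public int compare(TimelineEntry a, TimelineEntry b) {
                return Long.compare(a.timestamp, b.timestamp);
            }
        });
        return merged;
    }

    // Collect non-configuration entries that happened at most windowMillis after some configuration change.
    static List<TimelineEntry> afterConfigChange(List<TimelineEntry> timeline, long windowMillis) {
        List<TimelineEntry> suspects = new ArrayList<TimelineEntry>();
        for (TimelineEntry entry : timeline) {
            if ("configuration".equals(entry.facet)) continue;
            for (TimelineEntry change : timeline) {
                long delta = entry.timestamp - change.timestamp;
                if ("configuration".equals(change.facet) && delta >= 0 && delta <= windowMillis) {
                    suspects.add(entry);
                    break; // report each entry once, even if several changes precede it
                }
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        List<TimelineEntry> events = new ArrayList<TimelineEntry>();
        events.add(new TimelineEntry(500L, "event", "INFO started"));
        events.add(new TimelineEntry(1200L, "event", "WARN bind failed"));
        List<TimelineEntry> changes = new ArrayList<TimelineEntry>();
        changes.add(new TimelineEntry(1000L, "configuration", "changed port"));

        List<List<TimelineEntry>> histories = new ArrayList<List<TimelineEntry>>();
        histories.add(events);
        histories.add(changes);

        List<TimelineEntry> timeline = merge(histories);
        System.out.println(timeline);
        System.out.println(afterConfigChange(timeline, 300L));
    }
}
```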

Cross-facet correlation occurs in other areas of the Jopr UI (and the sky's the limit to where we can take Jopr from here in the future). For example, here is the summary tab for my agent. Notice how all the different facets are combined into a single view so at a glance you can see what is currently going on with this agent resource.

I can see that I had one operation, out of the past three, that failed. I can see that I recently updated the configuration of this resource several times, one of which failed. I can see what the current measurements are for this resource, and what events and alerts have been logged, when they happened and their severity.

All of this data is linked to their respective pages in the UI - if I want to see more about the alerts, I click the alert links. More about the operations? Click the operation links. And so on.

And because Jopr has abstracted the facets so they are applicable across any number of managed resources, we can manage all different kinds of resources - JBoss Application Servers, Apache Web Servers, Tomcat Web Application Servers, hardware boxes, operating system services, even Jopr itself - and we reuse the same UI pages, the same code, and the same look & feel - no additional code needs to be introduced to the server to support additional types of managed resources!

For example, above you see the summary for the Jopr agent. What does the summary information look like for a JBoss Application Server that I am managing? You can see this here. Notice that the same look & feel, the same UI pages and code, and the same SQL queries are used - everything is reused. No additional integration is needed on the server side. But notice the difference. The agent resource supports the "configuration facet" - which is why you saw the "Configure" tab and the configuration update information earlier. But the agent does not support the "content facet" (the agent does not push or pull content over the Jopr content subsystem). Now look at the JBossAS resource - it does not support the configuration facet (so you don't see the Configure tab and there are no recent config updates) but it does support the content facet (you can see the Content tab along with some recent package history). The JBossAS resource component in the agent will send up to the Jopr server information about what packages (e.g. jar libraries) it has installed. If a resource supports it, you can even ship down updated packages to the resource (for example, to send down new jars that incorporate bug fixes).

The above just talks about correlating cross-facet information for a specific resource. What if I want to see information across my entire inventory? What if I want to see all of the alerts that were triggered by Jopr, regardless of the resource that triggered the alert? What if I want to see all the configuration changes made to my environment, regardless of which resource was reconfigured?

Again, because Jopr is abstract in nature, we can scan the inventory history and aggregate this kind of information. Below you see that we can view all the alerts and all configuration changes - you can even filter the results if you only care about a subset of the data.




I think this clearly shows how an abstract management platform that supports multiple management facets can provide tremendous value to anyone managing a network of hardware and software products.

Well, that's all I have to say on this subject (for now). I hope this makes it a bit more clear what Jopr brings to the table and the value it can add to your IT environment.

Wednesday, March 25, 2009

Correlating Events with Jopr

Today, I rediscovered how nice Jopr really is when I enhanced the agent plugin so it can track the agent log files.

In just an hour or two, I added a feature allowing you to enable log tracking for the agent itself. If you enable this feature, you can view the agent's log messages directly in the Jopr UI. This enables you to see what's going on inside an agent, and you can correlate those log events with other types of changes happening to your agent (e.g. configuration changes, monitoring data, alerts, etc).

Let me discuss some of the nice things this allows you to do. Remember, even though this is a concrete example using the agent itself as the managed resource, everything I'm about to discuss can be done for your own managed resources because all of these features are abstract and can be utilized by any plugin, should the plugin developer choose to use them. This is what an abstract management framework provides you and is what Jopr is all about.

First, I took between one and two hours to add code to the agent plugin in order for it to utilize the event subsystem provided by Jopr. A little bit of Java code, a little bit of XML and I went from nothing to being able to fully integrate the agent log files into the Jopr events subsystem.

OK, so, what does this get you? First, and the most obvious, is you can now view the log message events from within the Jopr UI. You do not have to remotely log into the machine where the agent is running to view its log messages. See the image here on the left - this is the event history view. Because the agent emits its log messages as events, this event history view is essentially browsing the log file, with the added bonus of being able to filter the view based on the log message content and the severity of the messages (INFO, WARN, ERROR, etc).

Second, you can view the events correlated with monitoring data - this allows you to see what your resource's measurements looked like at the time the resource was logging messages (which might help you in diagnosing a problem). Take a look at the bottom of the graphs and you can see the different colored icons that indicate the highest severity of events that occurred in any timeslice shown on the graph. Here you can see INFO (green), WARN (yellow) and DEBUG (blue) messages occurring at different times. And notice how you can correlate those times with measurement data and event activity.

You can even drill down into the log file directly from this view so you can read the first several log messages that occurred within a narrow span of time. Again, this might help you to diagnose a problem if, using this agent resource as an example, you can see the actual ERROR log messages that occurred in or around the same time you saw the average execution time for sent commands starting to go up.



Jopr can be more proactive with these events/log messages, as well. You can define an alert definition such that you will get notified (via email for example) if the agent emits a log message at the ERROR severity level. You can even be alerted when a specific log message is emitted (e.g. have you ever wanted to be emailed when your application spits out an OutOfMemoryError log message? Now you can!). The alert definition UI page allows you to set this up.
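
As a rough illustration of the two kinds of conditions just described (a severity threshold and a message match), here is a small Java sketch. Everything in it (LogSeverity, LogEvent, EventAlertCondition) is a hypothetical stand-in that I made up for this example; in Jopr itself you set this up through the alert definition UI page rather than in code.

```java
import java.util.regex.Pattern;

// Hypothetical severity levels, ordered from least to most severe.
enum LogSeverity { DEBUG, INFO, WARN, ERROR }

// Hypothetical event emitted for one log message.
final class LogEvent {
    final LogSeverity severity;
    final String message;

    LogEvent(LogSeverity severity, String message) {
        this.severity = severity;
        this.message = message;
    }
}

// A condition fires when the event is severe enough and, optionally, its message matches a regex.
final class EventAlertCondition {
    private final LogSeverity minimumSeverity;
    private final Pattern messagePattern; // null means "any message"

    EventAlertCondition(LogSeverity minimumSeverity, String messageRegex) {
        this.minimumSeverity = minimumSeverity;
        this.messagePattern = (messageRegex == null) ? null : Pattern.compile(messageRegex);
    }

    boolean matches(LogEvent event) {
        if (event.severity.compareTo(minimumSeverity) < 0) {
            return false; // not severe enough
        }
        return messagePattern == null || messagePattern.matcher(event.message).find();
    }

    public static void main(String[] args) {
        EventAlertCondition anyError = new EventAlertCondition(LogSeverity.ERROR, null);
        EventAlertCondition outOfMemory = new EventAlertCondition(LogSeverity.DEBUG, "OutOfMemoryError");
        LogEvent event = new LogEvent(LogSeverity.ERROR, "java.lang.OutOfMemoryError: Java heap space");
        System.out.println("any ERROR condition fires: " + anyError.matches(event));
        System.out.println("OutOfMemoryError condition fires: " + outOfMemory.matches(event));
    }
}
```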

And finally, you can use the summary timeline to further correlate log message events with other things that have happened to this agent resource. For example, notice that I can see when my event messages occurred (and what their highest severities were) on this timeline, correlated with other things that happened to this agent, such as when its configuration changed and when alerts were triggered. You would also see operations on this timeline, had any been executed during the timeframe that the timeline is showing. In effect, you get a holistic view of what happened to this agent resource, across all the different management facets: configuration changes, control executions, alerts, events, etc!


Once again, I must emphasize the fact that you can get all of this functionality, too - and all you need is to write a plugin that talks to your managed resource and provides the raw data to Jopr. You get everything else for free - the correlated timeline, the monitoring graphs, alerting and more.

So, even though the above screenshots show this functionality for the agent resource, the UI and all of these capabilities would be the same for any resource you want to manage, so long as that managed resource has a plugin that provides the same kind of raw information. In my case, I just had to spend a couple of hours to get the agent to report events from its log files, and Jopr took care of the rest. For example, I did not have to do anything to enable the alerting capabilities or the correlating timeline. That is the value that Jopr brings to the table!

Monday, March 23, 2009

The Mighty Embeddable Plugin Container

Heiko has just demonstrated another way that the agent-side plugin container can be embedded in any Java VM.

We've already proven that this concept works because the plugin container has already been embedded in a few places: not only does the agent itself embed the plugin container, but our unit tests do it when the validity of plugins needs to be tested and also the Embedded Jopr project does it by embedding the plugin container directly in a JBossAS5 application server!

But what Heiko has done is go a step further by providing a very small, yet useful, wrapper around the plugin container to support plugin developers (it is called the "standalone plugin container"). It is "standalone" because you no longer need to install and run a full Jopr environment (a server and a database) in order to test your plugin's functionality.

If you are writing a custom plugin, just use the standalone plugin container to deploy and execute your plugin. This means you just take an existing agent distribution and use a very simple script to start the standalone plugin container (there are Windows and UNIX versions of the script). These scripts are simple to the point of being trivial. Under the covers, all this does is run a new main class that embeds the plugin container, as opposed to the original AgentMain class (which does all the complex agent-to-server communications). This new standalone plugin container will accept commands on the prompt to help you exercise your plugin code - a great help for those writing plugins. See the README for some install instructions.

This type of capability sort of existed in the old JBoss ON 1.x code base - but its old (now obsolete) plugin model was never this modular and could never have been this easily embedded in so many different ways.

Saturday, February 28, 2009

Jopr 2.2 Sneak Peek

Jopr 2.2 is almost ready for release. I was going to build some nice flash demos, but there is so much new to cover, I didn't know where to begin! I wanted to get the message out quickly though, so, rather than wait for my flash demos to be completed, I decided to simply take some screen snapshots and post them up on the Jopr wiki.

Take a sneak peek here at what's coming - I think you will find Jopr 2.2 to be a great leap forward. It is much better than anything JBoss ON 1.x ever could or did deliver, and it is a major milestone even from the latest releases of JBoss ON/Jopr.

I am very proud of what we have been able to accomplish as a team - this release is definitely more feature-rich than any previous JBoss ON/Jopr release.

Sunday, February 15, 2009

Quick Java Heap Analysis

I had a need to do some quick profiling of a Java application. I didn't want to spend a lot of time (or money) setting up a profiler and my app to run within that profiler's environment. But it turns out that SUN's Java 6 distribution ships with some very handy tools that you can use to do what I wanted without setting up anything special - I didn't even have to pass in any special set of -D system property definitions.

First, run your application and get it into a state that you want to analyze. Then perform the following steps:

  1. Run "jps" to determine the process ID (aka pid) of your Java application. BTW: this is a cool nugget in and of itself. I've been so used to doing "ps -elf | grep java" or "pgrep java" to find my running Java applications that I never bothered looking for a simpler way. "jps" dumps the pid and the simple name of the Java main class - and most of the time this is all you are looking for. But you can also get the fully qualified main class names and the argument lists - use "jps -help" to see the usage syntax. This is a very helpful tool all by itself.

  2. Run "jmap -dump:format=b,file=dump.dat <pid>" where <pid> is your Java application's pid that you found in the previous step. This will dump information about your Java VM's memory into the file "dump.dat", which will be used by jhat in the next step.

  3. Run "jhat -J-Xmx512m dump.dat" to start the Java heap analysis tool (jhat). This will start an HTTP listener on port 7000. The -J option lets me configure the jhat VM, which is needed if you have a large dump to analyze.

  4. Point a browser to http://localhost:7000 and begin browsing around.



This jhat tool lets you examine the objects and classes currently found within your VM. It's nothing glorious and certainly not as well featured as many of the other commercial and open-source profilers out there. But it comes with the JDK and requires absolutely no setup. It can't get much easier to perform a quick heap analysis than this.

Credit goes to Frank Kieviet, whose blog brought my attention to these tools.

Monday, February 9, 2009

If Your Computer Is Going To Blow Up, Let Jopr Warn You

Well, OK, Jopr won't let you know if there is a cherry bomb strapped to your laptop, but it can alert you if something is physically wrong with your hardware that may cause it to malfunction, assuming your hardware can provide Jopr with relevant data.

Take for example, the premature ending of Greg Hinkle's Jopr demo a few weeks ago. Greg's computer crashed for what seemed like an unknown reason - but because he was running Jopr on his laptop (which had his experimental hardware plugin deployed), he was actually able to use Jopr to figure out what the problem was. Turns out, his laptop was slowly overheating - which ended up crashing his box. Check out his blog to see what the graph looked like.

Too bad he didn't have any alerts set up on the temperature metric - he could have received an email from Jopr telling him his machine was about to blow up because his hardware reached 100 degrees :)

Sunday, January 25, 2009

Classloaders Keeping Jar Files Open

If you write code that creates classloaders, you need to know about this bug:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5041014

It is very insidious and something I just came across myself in some code.

You normally only have to worry about this if you are writing code that creates and destroys classloaders (for example, if you have some kind of pluggable architecture where a pluggable component found in a jar file gets its own classloader, and you want that pluggable component to be hot-deployable - that is, you want to be able to overwrite or modify that jar file with updated code). In Jopr's case, this happens on the agent - each "product plugin" (e.g. the JBossAS plugin, or the Postgres plugin, etc) has its own classloader, managed separately and kept independent of other plugin classloaders (there is a dependency model in place, but ignore that for this discussion).

Well, this VM bug is so bad that it seems anytime a classloader loads a jar file, that jar file's file descriptor remains open for the lifetime of that VM (in other words, the classloader never calls JarFile.close() on the jar files it previously streamed content from). At least that's what the bug report implies and what I saw when I was debugging this. There is a nifty tool from Timothy Quinn that he used to track issues in Glassfish, but this tool is useful for tracking this kind of problem in any application, not just Glassfish - in fact, I used it to debug the issue in the Jopr agent. This bug manifested itself in the Jopr agent when hot-deploying agent plugins on Windows (Windows has the "feature" of not being able to manipulate files that are locked by others). I suspect similar issues will occur on UNIX because, even though UNIX doesn't do the file locking that Windows does, the file descriptors are still open and copying a file with the same name over the opened file will probably just create a second file descriptor.

The worst part about this is - there is no real workaround. The Jopr agent has its own classloader implementation - it is very basic and extends java.net.URLClassLoader to reuse most of its functionality. But the Java classloader API has no public, protected or package-scoped method or data field that you can override or access within URLClassLoader to help work around the problem.

To actually fix the problem, it is simple - when you know you are done with a classloader, you just need to have that classloader close all .jar files it previously had opened. Alas, there is no "close" type method on the classloader object - there is absolutely no way to tell a classloader "I am done with you, clean up any resources you have open".

Once a classloader opens a jar file, that jar file's file descriptor remains open by the operating system for the lifetime of the VM. I find this completely unacceptable - this is clearly a design flaw that slipped through the cracks when the Java API was conceived and implemented. In order to support hot-deployable Java code, one would need to destroy and recreate classloaders. The current Java implementation does not make it easy to do this (requiring people to write their own classloader implementations from scratch does not meet the definition of "easy-to-do" and doesn't that defeat the purpose of OO and code reuse anyway?).

So, how do you support hot-deployable code and not see this bug? There are two main ways to do this as I see it:

1) write your own classloader implementation that allows you to close the open file descriptors when the classloader is no longer needed
2) copy the jar files that a classloader needs to a temporary location and put the temporary jars in the classloader (NOT the original jar files). When you need to hot-deploy an updated jar file, simply copy that new jar to a new temporary location, throw away the old classloader (which still has the file descriptor open, but it's the old temporary jar file) and create a new classloader that opens the new temporary jar file. This sucks because if you hot-deploy frequently, you may run into your limit of the number of allowed open file descriptors (along with the problem that Windows presents - that being you can't delete the old temporary jar files until your VM exits).
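
To show what option 2 can look like, here is a minimal Java sketch. It is not the Jopr agent's code; the class name TempJarClassLoaderFactory, the "plugin-" temp file prefix and the hello.txt entry are all hypothetical, and the sketch uses java.nio.file, which needs a newer JDK than Java 6. The classloader is built from a temporary copy, so the original jar is never opened and can be overwritten or deleted at any time.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

final class TempJarClassLoaderFactory {
    // Copy the original jar to a fresh temporary file and hand only the copy to the classloader.
    static URLClassLoader createFromCopy(File originalJar, ClassLoader parent) {
        try {
            Path copy = Files.createTempFile("plugin-", ".jar");
            Files.copy(originalJar.toPath(), copy, StandardCopyOption.REPLACE_EXISTING);
            // Best-effort cleanup; the copy may stay locked until the VM exits.
            copy.toFile().deleteOnExit();
            return new URLClassLoader(new URL[] { copy.toUri().toURL() }, parent);
        } catch (IOException e) {
            throw new IllegalStateException("could not copy " + originalJar, e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Build a tiny jar with one resource so there is something to load.
        File original = File.createTempFile("original-", ".jar");
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(original))) {
            out.putNextEntry(new JarEntry("hello.txt"));
            out.write("v1".getBytes("UTF-8"));
            out.closeEntry();
        }

        URLClassLoader loader = createFromCopy(original, null);
        System.out.println("resource found via the copy: " + (loader.getResource("hello.txt") != null));

        // The classloader never touched the original, so it is free to be replaced.
        System.out.println("original deleted: " + original.delete());
    }
}
```

To hot-deploy, you would call createFromCopy() again with the updated jar and drop all references to the old classloader.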

Anyway, here is some code you can use to "work around" this issue. It is a major hack - it only works if you are running in a SUN VM and, because it relies on the implementation of internal SUN classes and code, it may break in the future should SUN decide to change how these classes are implemented (however, the good thing about this code is it has no compile time dependencies on any SUN-specific classes). I tested this code on SUN's Java6 JRE.

This method needs to be placed in your classloader that extends URLClassLoader. It uses reflection to iterate over the set of currently opened jar files as found in a private data member (URLClassLoader.ucp.loaders) of the classloader you want to discard. After running this code, I verified that no more jar files are left open.


public void close() {
    try {
        // URLClassLoader keeps its search path in a private field named "ucp"
        Class<?> clazz = java.net.URLClassLoader.class;
        java.lang.reflect.Field ucp = clazz.getDeclaredField("ucp");
        ucp.setAccessible(true);
        Object sun_misc_URLClassPath = ucp.get(this);

        // the URLClassPath object tracks its opened loaders in a private "loaders" field
        java.lang.reflect.Field loaders =
            sun_misc_URLClassPath.getClass().getDeclaredField("loaders");
        loaders.setAccessible(true);
        Object java_util_Collection = loaders.get(sun_misc_URLClassPath);

        for (Object sun_misc_URLClassPath_JarLoader :
                ((java.util.Collection<?>) java_util_Collection).toArray()) {
            try {
                // a JAR loader holds its open JarFile in a private "jar" field
                java.lang.reflect.Field loader =
                    sun_misc_URLClassPath_JarLoader.getClass().getDeclaredField("jar");
                loader.setAccessible(true);
                Object java_util_jar_JarFile =
                    loader.get(sun_misc_URLClassPath_JarLoader);
                ((java.util.jar.JarFile) java_util_jar_JarFile).close();
            } catch (Throwable t) {
                // if we got this far, this is probably not a JAR loader so skip it
            }
        }
    } catch (Throwable t) {
        // probably not a SUN VM
    }
}



If you happen to be using JNI (native libraries), you might also have to play games like the above to close the JNI jars too (the same caveats as above apply regarding this needing to access the SUN implementation code). You can add this code to the close() method above:



        // now do native libraries
        clazz = ClassLoader.class;
        java.lang.reflect.Field nativeLibraries = clazz.getDeclaredField("nativeLibraries");
        nativeLibraries.setAccessible(true);
        java.util.Vector java_lang_ClassLoader_NativeLibrary =
            (java.util.Vector) nativeLibraries.get(this);
        for (Object lib : java_lang_ClassLoader_NativeLibrary) {
            java.lang.reflect.Method finalize =
                lib.getClass().getDeclaredMethod("finalize", new Class[0]);
            finalize.setAccessible(true);
            finalize.invoke(lib, new Object[0]);
        }



But even if you do this, I'm still not sure everything will work due to yet more SUN VM bugs (well, I think these are all basically the same bug):

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4299094
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4642062
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4286309

In the end, the Jopr agent didn't really need to do the above. I found that the Jopr agent code was creating temporary classloaders unnecessarily, which was locking the plugin jars. Once I stopped creating the unnecessary classloaders, the agent hot-deployment worked just fine since the plugin jars no longer got locked. For the record, the Jopr agent uses method #2 as described above to do its hot-deployment.

Saturday, January 10, 2009

Jopr Agent Auto Update Complete

I have completed the agent auto-update functionality. This gives a Jopr agent running in an environment the ability to automatically detect that it needs to be updated and then update itself without the need for manual intervention.

The cool thing about this is it is completely cross platform! I've tested on Windows and Linux and I see no reason why this wouldn't work on other UNIX flavors such as HP-UX, AIX and MacOS.

Here's the basics of how it works:

When a Jopr Agent tries to connect or register with a Jopr Server, that server verifies the version of that agent. If the agent is not a compatible version, the server will forbid that agent from connecting/registering and will tell the agent it needs to update itself.

At this point, the agent will shut down all of its internals, download the latest agent update binary (either from the server or some other download location previously configured in the agent), and fork another Java VM that will unpackage the new agent binary and update the old agent with the new binary. The old agent will shut down its VM and the new agent VM will be started.
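
The flow above can be summarized in a small Java sketch. This is purely illustrative: AgentVersionGate, AgentConnectSketch, the version strings and the "exact match against a supported list" rule are all assumptions of mine, since how the server actually decides compatibility is not described here. The step names simply restate the sequence from the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical server-side check; the real compatibility rule is an assumption here.
final class AgentVersionGate {
    enum Decision { CONNECT_ALLOWED, UPDATE_REQUIRED }

    private final Set<String> supportedAgentVersions;

    AgentVersionGate(String... supportedAgentVersions) {
        this.supportedAgentVersions = new HashSet<String>(Arrays.asList(supportedAgentVersions));
    }

    Decision check(String agentVersion) {
        return supportedAgentVersions.contains(agentVersion)
            ? Decision.CONNECT_ALLOWED
            : Decision.UPDATE_REQUIRED;
    }
}

final class AgentConnectSketch {
    // Returns the ordered steps an agent would go through when it tries to connect.
    static List<String> connect(String agentVersion, AgentVersionGate server) {
        List<String> steps = new ArrayList<String>();
        if (server.check(agentVersion) == AgentVersionGate.Decision.CONNECT_ALLOWED) {
            steps.add("connected");
            return steps;
        }
        // The server refused the connection, so the agent updates itself.
        steps.add("shut down agent internals");
        steps.add("download agent update binary");
        steps.add("fork updater VM to unpack and apply the binary");
        steps.add("shut down old agent VM");
        steps.add("start new agent VM");
        return steps;
    }

    public static void main(String[] args) {
        AgentVersionGate server = new AgentVersionGate("2.2.0"); // hypothetical version
        System.out.println(connect("2.2.0", server));
        System.out.println(connect("2.1.0", server));
    }
}
```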

From an administrator's point of view, this all happens under the covers and automatically and the agent just looks like it goes offline for a minute or two before coming back online.

Tangential to this is the addition of several features to the agent plugin. The agent resource metadata now includes several more child services that allow you to configure your agent without having to manually log onto the agent box (i.e. we are using Jopr to manage Jopr!). You can now even change agent JVM settings and restart the agent with those new settings (in case you need to change a -Xmx option, for example). You can read this wiki page to learn about these new plugin features.

All of this code will be forthcoming in our next RHQ/Jopr release.

Here's some additional documentation you can read if curious:

http://www.rhq-project.org/display/JOPR2/RHQ+Agent+Installation#RHQAgentInstallation-PreparingYourAgentToBeAutoUpdatable
http://www.rhq-project.org/display/RHQ/Design-AgentAutoUpdate