Saturday, December 6, 2008

Bundling the Jopr Agent For Deployment

One of the most requested enhancements for Jopr is an easier way to perform agent installations. Because of this, efforts are underway to ease the pain of agent deployment.

For small Jopr environments, you can take the agent distributions as-is, install them, and set up each agent individually by answering the setup questions at startup. This gets more tedious the more machines you have. It would be nice if you could bundle your own "golden distro", push it out to all your machines and, with no additional configuration or manual setup, have the agents "just work".

With the current Jopr code, this is now possible. Following these steps, you can bundle your own agent into one "golden distro". We call it the "golden distro" because it's the one and only distro you will need in order to install all the agents in your environment. That distro will be able to install all your agents and have them start and configure themselves with no further setup required.

  1. Unpackage the agent distribution that comes out-of-box. This is the starting point to build your own "golden distro".

  2. Next, consider what, if any, customized environment variables you need to set for your agents. If there are any, edit the agent's -env script (rhq-agent-env.bat for agents that are to run on Windows). For example, if there are any -XX options you want to pass to your agent's JVM, set RHQ_AGENT_ADDITIONAL_JAVA_OPTS in the -env script appropriately.

  3. The next several steps involve editing the conf/agent-configuration.xml file, so load that file in an editor. Here is where you will preconfigure your agents so that they can start up and successfully configure and initialize themselves, without requiring an admin to answer the setup questions.

  4. Set the configuration preference "rhq.agent.configuration-setup-flag" to "true". This tells the agent that, when it starts up, it should not ask any setup questions. Instead, it will immediately use whatever configuration preferences it has and attempt to initialize itself automatically. Of course, setting this to true implies that the rest of your agent's configuration in agent-configuration.xml is complete. But that's what we are going to make sure we do next.

  5. Make sure that the agent name configuration preference is left undefined (out-of-box, this setting is commented out in agent-configuration.xml; make sure you keep it that way). Leaving this undefined will force the agent to attempt to auto-generate its own name. It does this by looking up the agent machine's fully qualified domain name and using that as the agent name. This should ensure that all agents will obtain a unique agent name (since by definition, a fully qualified domain name or IP address is unique within a network).

  6. Next, determine which Jopr Server your agents will use as their "Registration Server". When new agents start up, they must communicate with a Jopr Server in order to register themselves into the Jopr environment. You must decide which of your Jopr Servers will be used to register newly installed agents (we'll call it the "Registration Server"). There is nothing special or different about a "Registration Server" compared to your other servers in your server cloud, i.e. you won't see any configuration settings or UI controls that turn on or off some "registration feature". Any Jopr Server can register any Jopr Agent. However, you must specify something in your golden distro's agent configuration so the agent knows where a server is so the agent can bootstrap itself into the Jopr environment. Once you determine which of your Jopr Servers will be the one to handle all new agent registrations, set that server's endpoint information in your golden distro's agent-configuration.xml settings:

    • rhq.agent.server.transport

    • rhq.agent.server.bind-port

    • rhq.agent.server.bind-address

    • rhq.agent.server.transport-params

    Note that this will not necessarily be the Jopr Server that will be assigned as the agent's primary server. Once the registration is complete, the agent will be assigned a server failover list, with the first server in the list to be designated as its "primary". This primary server may or may not be the same as the settings you provide here.

  7. If you wish to assign multiple "Registration Servers" to your agent, you may do so by prepopulating a failover list and putting it in your golden distro. This allows you to have more than one Jopr Server assigned to all of your agents as Registration Servers. If the main registration server is down or is in maintenance mode, the agents will be able to fail over to your secondary servers as defined in your failover list. Create a directory "data" in your distribution and place a file called "failover-list.dat" in it. Each line in that file must be of the form "address:port" where address is the IP or hostname of a Jopr Server and port is the port number the server is listening on (each server must require the same transport and transport parameters, so the "rhq.agent.server.transport[-params]" settings will be used for all servers). If you prepackage a failover list in your golden distro, you should place your main Registration Server (the one you configured in the previous step) as the top-most server in the list, followed by your secondary servers. If the servers at the top of the list are down, the agent will still be able to register because it just moves down the list until it finds a server it can talk to. Note that this prepopulated failover list is only temporary and is used only the first time the agent starts. Once the agent registers, it will be given a new failover list which will overwrite the list shipped in the golden distro. This is what you want because the server maintains an up-to-date failover list for each agent and you want the agent to refresh its list every time it registers and starts up.

  8. Make sure "rhq.communications.connector.bind-address" is left undefined (out-of-box, it is commented out in agent-configuration.xml; make sure you keep it that way). Leaving this undefined tells the agent that it needs to look up its local IP address and use that as its bind address. It does so by using the Java API "InetAddress.getLocalHost().getCanonicalHostName()". This looks at whatever network adapters are installed on the box and chooses one of them to determine which IP to use - usually the first network adapter that the operating system reports. (Side note: this may choose an IP from the list of available IPs that is different from the one you actually want to use. You usually have to do some special configuration in your network adapters to get InetAddress.getLocalHost() to return the one you want.)

  9. Repackage the agent installation in a new jar - this is your "golden distro". Take this distro and push it out to all of your agent machines and they can all start up without any additional configuration or setup needed.
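
Before repackaging, it can be worth sanity-checking the failover-list.dat file from step 7, since a malformed line would otherwise be shipped to every machine. The following is only a sketch of the "address:port" format and the top-to-bottom walk described above; it is not the agent's own parsing code. The class name FailoverListCheck, its methods, the host names and the decision to skip blank lines are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical pre-bundling check for failover-list.dat; not part of Jopr. */
final class FailoverListCheck {

    /** One "address:port" line from the file. */
    static final class Entry {
        final String address;
        final int port;

        Entry(String address, int port) {
            this.address = address;
            this.port = port;
        }
    }

    /** Parses file content, one "address:port" per line. Skipping blank lines is an assumption. */
    static List<Entry> parse(String content) {
        List<Entry> entries = new ArrayList<>();
        for (String raw : content.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            int colon = line.lastIndexOf(':');
            if (colon <= 0 || colon == line.length() - 1) {
                throw new IllegalArgumentException("Expected address:port but got: " + line);
            }
            int port;
            try {
                port = Integer.parseInt(line.substring(colon + 1));
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Port is not a number in: " + line, e);
            }
            if (port < 1 || port > 65535) {
                throw new IllegalArgumentException("Port out of range in: " + line);
            }
            entries.add(new Entry(line.substring(0, colon), port));
        }
        return entries;
    }

    /** Walks the list from the top and returns the first entry the probe accepts. */
    static Optional<Entry> firstReachable(List<Entry> entries, Predicate<Entry> canTalkTo) {
        return entries.stream().filter(canTalkTo).findFirst();
    }

    public static void main(String[] args) {
        // Example values only; use your own server addresses and ports.
        List<Entry> entries = parse("jopr1.example.com:7080\njopr2.example.com:7080\n");
        // Pretend the main Registration Server is down, so the walk moves to the next line.
        Optional<Entry> chosen =
            firstReachable(entries, e -> !e.address.equals("jopr1.example.com"));
        System.out.println(chosen.map(e -> e.address + ":" + e.port).orElse("no server reachable"));
    }
}
```

The probe is passed in as a predicate so the same walk can be tried with a real socket check or, as here, with a stand-in that simulates a server being down.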

RHQ-496 now allows the agent to determine its name at runtime if one was not specifically given to it at setup time. This code is in trunk, but not in any current Jopr releases. So to get the ability to bundle the "golden distro" and deploy it to multiple agents, you must use a trunk build, until we release our next version.
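
If you want to preview what name a machine is likely to report before you roll out the golden distro, a small standalone check can help. This is a rough sketch only: it assumes the name lookup behaves like the InetAddress.getLocalHost().getCanonicalHostName() call mentioned in step 8, which the agent code may or may not use for its name. The class AgentNameSketch and the resolveName method are hypothetical and are not part of the agent.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical preview of the "use the configured name, else the host name" idea. */
final class AgentNameSketch {

    /** Returns the configured name when one is set, otherwise the supplied host name. */
    static String resolveName(String configuredName, String canonicalHostName) {
        if (configuredName != null && !configuredName.trim().isEmpty()) {
            return configuredName.trim();
        }
        return canonicalHostName;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Assumption: the fully qualified domain name is obtained the same way
        // as the bind address lookup described in step 8.
        String host = InetAddress.getLocalHost().getCanonicalHostName();
        // No name configured, as in the golden distro, so the host name is used.
        System.out.println(resolveName(null, host));
    }
}
```

Running this on a few target machines shows quickly whether any of them report something unhelpful such as "localhost", which would mean the names would not be unique.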

After you have deployed your golden distro to the machines in your network, you are left with the question: how do I upgrade my agents? This leads to the desire for automatic upgrades of agents already deployed and running in your environment. This feature is not fully complete, but most of the work is done and exists in trunk (see my earlier blog on this topic). To follow the development of this agent auto-update feature, watch the JIRA RHQ-110. The finished implementation will hopefully look like the design described on the RHQ wiki.

Thursday, December 4, 2008

Configuration Change Detection in Jopr

A new feature has been added to trunk, a feature so interesting that it deserves its own blog.

I am sure most security-conscious administrators configure their IT infrastructure in a very specific way and they do not want anyone going onto any machine and re-configuring the machine or any of its software components willy-nilly. In fact, if something is reconfigured outside of a business' normal change-control processes, I would think administrators would want to be notified about it. It could be an innocent user mistakenly modifying something they should not be, or it could be an intruder trying to hack into the system. Being notified of configuration changes sounds like it could be a very useful thing.

Jopr now has this feature. If a plugin supports the configuration subsystem (i.e. it can retrieve configuration from its managed resource), the alert subsystem will have the ability to detect changes made in that remote managed resource and send notifications when that happens.

I've put together a demo that shows this feature in action. The scenario is quite simple - I have a Fedora box running sshd, and I do not want that sshd daemon process' configuration to change. If, for whatever reason, the configuration of sshd on the box does change, I want to be notified.

And because this config-change-notification feature is built into the core engine, any plugin that supports configuration gets this feature for free. So, if Jopr does not have a plugin that supports a particular resource whose configuration you want to monitor for changes, you can quite simply write your own plugin and deploy it into your Jopr environment and have this capability.

I can envision people finding it helpful to watch the following for configuration changes (and some of these you can already do today thanks to existing Jopr plugins):

  • JBossAS's main jboss-service.xml configuration file

  • JBossAS's authentication configuration (login-config.xml)

  • JBossAS's datasource configuration

  • /etc/hosts

  • Jopr Agent's own configuration

  • ...and many more...

And configuration does not have to be stored in a file on a filesystem. The Jopr configuration subsystem makes no distinction between configuration stored in a file, in a database, an LDAP server or whatever you can think of. It's the plugin's job to translate the resource's configuration into configuration data that conforms to the plugin's metadata. Once the configuration data makes it into the core engine, it is treated the same.
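
To make the "treated the same" point concrete, here is a minimal sketch of change detection over configuration that has already been reduced to name/value pairs. It is not Jopr's implementation: the class ConfigDiffSketch, the flat string map and the sshd-style property names are assumptions chosen for illustration. Once configuration is in this shape, the comparison does not care whether the values originally came from a file, a database or an LDAP server.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

/** Hypothetical source-agnostic comparison of two configuration snapshots. */
final class ConfigDiffSketch {

    /** Returns one description per property that was removed, changed, or added. */
    static Map<String, String> diff(Map<String, String> baseline, Map<String, String> current) {
        // TreeMap keeps the report in a stable, sorted order.
        Map<String, String> changes = new TreeMap<>();
        for (Map.Entry<String, String> e : baseline.entrySet()) {
            String key = e.getKey();
            if (!current.containsKey(key)) {
                changes.put(key, "removed (was " + e.getValue() + ")");
            } else if (!Objects.equals(e.getValue(), current.get(key))) {
                changes.put(key, e.getValue() + " -> " + current.get(key));
            }
        }
        for (Map.Entry<String, String> e : current.entrySet()) {
            if (!baseline.containsKey(e.getKey())) {
                changes.put(e.getKey(), "added (" + e.getValue() + ")");
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        // Example snapshots only; a real plugin would supply these values.
        Map<String, String> baseline = Map.of("PermitRootLogin", "no", "Port", "22");
        Map<String, String> current = Map.of("PermitRootLogin", "yes", "Port", "22");
        Map<String, String> changes = diff(baseline, current);
        if (!changes.isEmpty()) {
            // A non-empty result is the point at which a notification would be sent.
            System.out.println("Configuration changed: " + changes);
        }
    }
}
```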

And finally, if a configuration change is detected, and that change was unauthorized, the Jopr user has the ability to immediately rollback that change by reverting to an earlier configuration set. This configuration-rollback feature is orthogonal to the change-notification feature, but you can see how both can be used hand-in-hand to keep a tight grip on your IT infrastructure's configuration.