[Yum] Not quite an idea.

Carwyn Edwards carwyn at carwyn.com
Mon Aug 4 02:07:28 UTC 2003


seth vidal wrote:

>>These kind of considerations come into the frame when you are thinking 
>>about accountability and traceability - e.g. an ISP that wants to show 
>>that it _did_ have a security patch installed when that DDoS happened.
>>    
>>
>
>That's what logs are for and specifically the rpm cron job which dumps
>rpm -qa to a file nightly.
>  
>
We're thinking a bit more about the theoretical side - specifically in 
terms of being able to do things like meeting SLAs. Logs are fine after 
the event, but they assume your logs have not been tampered with. In 
addition, what happens on a machine is quite often not exactly what you 
think happened. Comparing logs against a declarative statement of what 
you actually wanted to happen makes it much simpler to spot the bit 
that went wrong.

If you give your client a "machine configuration definition" along with 
an SLA things get, erm, interesting (I'm not saying it's sensible :-)).

>> As long as you can do things like rollback updates and uninstall 
>> software then functionally it would do most things we have found useful.
>  
>
>
>do you mean rpm rollbacks? Have you done them?
>
No, I mean being able to make a machine into a web server on a Monday, 
then into a database server on a Tuesday, then back into a web server on 
Wednesday. Why? Think about IBM's "computing power as a utility" project. 
We want to be able to hold configuration definitions for a "development 
environment" that has all the data required to configure a network of 
machines and all the software. Just like you do something like 
File->New->Project in a development IDE, we want to be able to do 
Config->New->Software Development Team and have all the infrastructure 
allocated from a pool of hot swap office workstations. No human 
intervention required. We sort of have this ability already, but it's 
not very clean in implementation terms.

Using a procedural model means having to "code" the logic of the state 
transitions from Web Server (A) to Database Server (B) back to Web 
Server (A) = A -> B -> A.

A declarative model describing information about the states is much 
simpler than describing the transitions (the procedure required to move 
from A to B). Writing a tool to do the hard work is better. It gets 
really complicated when you consider that a transition contains ordering 
information.

install foo, install bar, uninstall foo, uninstall bar

can have different results to

install foo, install bar, uninstall bar, uninstall foo

This is not such a problem on a single machine, but if you have two 
machines that you want to make 100% identical, a tool will be more 
consistent than a human when it comes to "coding" the transition.
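To make the ordering point concrete, here is a minimal sketch in Python 
(the dependency relation and package names are invented for illustration): 
if bar depends on foo, removing foo before bar leaves bar broken, while 
the reverse order is safe - exactly the kind of bookkeeping a tool gets 
right more reliably than a hand-written transition script.

```python
# Hypothetical example: bar depends on foo.
deps = {"bar": {"foo"}}

def apply_ops(ops):
    """Apply (action, pkg) operations in order; return any
    broken-dependency errors encountered along the way."""
    installed, errors = set(), []
    for action, pkg in ops:
        if action == "install":
            installed.add(pkg)
        else:
            installed.discard(pkg)
            # any still-installed package that needed pkg is now broken
            for p in installed:
                if pkg in deps.get(p, set()):
                    errors.append(f"{p} broken by removing {pkg}")
    return errors

# install foo, install bar, uninstall foo, uninstall bar
bad = apply_ops([("install", "foo"), ("install", "bar"),
                 ("uninstall", "foo"), ("uninstall", "bar")])
# install foo, install bar, uninstall bar, uninstall foo
good = apply_ops([("install", "foo"), ("install", "bar"),
                  ("uninstall", "bar"), ("uninstall", "foo")])
```

The same four operations, reordered, produce an error in one case and 
none in the other.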


>So you want to make a list of what the machine should look like and have
>the client tool figure out what needs to happen to make that so?
>  
>
Yes :-)
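The core of that idea can be sketched in a few lines of Python 
(a naive illustration with invented package sets, not how any real 
tool implements it): each role is just a declared set of packages, and 
the transition between roles falls out as a set difference.

```python
# Hypothetical role definitions: a role is a desired package set.
WEBSERVER = {"httpd", "mod_ssl", "openssl"}
DBSERVER = {"postgresql", "openssl"}

def transition(current, desired):
    """Return (to_install, to_remove) to move from `current`
    to `desired` - the tool derives the procedure."""
    return sorted(desired - current), sorted(current - desired)

# Monday -> Tuesday: web server becomes database server.
install, remove = transition(WEBSERVER, DBSERVER)
```

Going back on Wednesday is just `transition(DBSERVER, WEBSERVER)` - 
nobody has to hand-code A -> B and B -> A separately.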

>Is this what I'm hearing gridweaver does?

Yes - although that's not gridweaver itself; gridweaver is a research 
project between HP, Informatics and EPCC. Updaterpms is the software 
tool, and LCFG and SmartFrog are the configuration tools.

>It seems a pain to keep up with specific version numbers in this file
>though. That's one of the reasons I like just being able to say 'this
>package, whatever version is newest and available'
>
At the moment we do both:

foo-1.2.3-4
bar-*-*

.. where * is a wildcard matching the latest available version.
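A toy sketch of how such specs might be resolved, using Python's 
standard glob matching (the package lists are invented, and real RPM 
version comparison is subtler than "last in a sorted list" - this just 
illustrates the two spec styles):

```python
from fnmatch import fnmatch

# Hypothetical repository contents, assumed sorted oldest -> newest
# within each package name.
available = ["foo-1.2.3-4", "foo-1.2.4-1", "bar-2.0-1", "bar-2.1-3"]
specs = ["foo-1.2.3-4", "bar-*-*"]

def select(specs, available):
    """For each spec, pick the newest matching package
    (naively: the last match in the list)."""
    picks = []
    for spec in specs:
        matches = [p for p in available if fnmatch(p, spec)]
        if matches:
            picks.append(matches[-1])
    return picks

picks = select(specs, available)
```

The pinned spec resolves to exactly one version while the wildcard 
floats to the newest, which is the pain/flexibility trade-off described 
above.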

Yes, keeping the version numbers matched _is_ a pain :-) This is why I'm 
looking for other ways of doing fine-grained control over updates. What 
we have is very fine-grained, but not very user friendly. We'd rather 
contribute our ideas (and effort, when I have time) to a good general 
tool like yum than keep maintaining our own tool.

I must admit that some of my comments only really carry weight when 
combined with a central system for configuration information, not just 
centralised software management.

Carwyn
