Answers about Puppet

DevOps, Automation

Universe and Everything


NextGen Modules Lessons Learned

Two years have passed since work first started on Example42's NextGen modules, and I think it's time to review what has been done, what has worked and what hasn't.

The NextGen modules layout introduced various solutions and approaches to reusable module design, which have been more or less successful: some have also been adopted by other module authors, some have been made obsolete by Puppet's evolution, and some have remained practically unused.

Some of the ideas were already around or emerging, others were quite new and unexplored; some are still valid, others would probably be done differently now.

What I can say is that the main goal of that module set has been achieved: I and, afaik, various other people effortlessly use the same unmodified modules on different environments and infrastructures. That's what I want from a reusable module.

What worked, what failed, what could be better

Params lookup

This is a function that is used on every main class parameter and lets users choose where data is defined: on an ENC (as top scope variables), in Hiera, or directly passed as parameters to the class. Dan Bode gave me the original idea, to which I added some frills, like the possibility to look for a global variable after a module specific one.
Basically it is the same concept as Puppet 3's data bindings, with the difference that it was introduced before the release of Puppet 3 and works on any version of Puppet > 2.6.

The point is that now, with Puppet 3 more and more widely used, such a function is quite redundant and introduces some extra computation time that could be avoided.
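As a rough sketch of how this looks in a class signature: the class name example and its default values are hypothetical, and the params_lookup function is assumed to be available from the puppi module. The exact lookup order is described in the comments as I understand it, not as a guaranteed specification.

```puppet
# Hypothetical params class holding the module's built-in defaults.
class example::params {
  $template = ''
  $absent   = false
  $monitor  = false
}

# Hypothetical main class: every parameter default goes through
# params_lookup, so the value may come from Hiera, from a top scope
# variable set by an ENC, or from the params class as a last resort.
class example (
  $template = params_lookup( 'template' ),
  $absent   = params_lookup( 'absent' ),
  # With 'global', a site-wide variable (e.g. $::monitor) is also looked
  # for after the module specific one (e.g. $::example_monitor).
  $monitor  = params_lookup( 'monitor' , 'global' )
  ) inherits example::params {
  # Resources of the module would be declared here.
}
```

With Puppet 3's data bindings the same class could simply default each parameter to the matching example::params variable, and a Hiera key such as example::template would be picked up automatically.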

Alternatives for configuration files

All the NextGen modules have params like template and source that let users decide how to provide configuration files. A parameter like template is now common and practically required for a reusable module; at the time it wasn't. I think this kind of param (possibly together with a more general "content" one) must stay in a module that aims to be considered reusable.

There's also the possibility to manage whole configuration directories with source_dir and source_dir_purge, which, even if not generally recommended when there is a large number of files to synchronize, can be a valid solution in some cases.
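A minimal sketch of how these parameters can drive file resources. The class name, resource titles and paths are hypothetical, and the sketch assumes that the user sets either source or template, not both, since Puppet rejects a file with both source and content.

```puppet
# Hypothetical class: users pick a static source, a template,
# or a whole directory to sync.
class example (
  $source           = '',
  $template         = '',
  $source_dir       = '',
  $source_dir_purge = false
) {
  # An empty string means "not set": undef makes Puppet ignore the attribute.
  $manage_file_source = $source ? {
    ''      => undef,
    default => $source,
  }
  $manage_file_content = $template ? {
    ''      => undef,
    default => template($template),
  }

  # Main configuration file (assumes /etc/example already exists,
  # e.g. because a package created it).
  file { 'example.conf':
    ensure  => file,
    path    => '/etc/example/example.conf',
    source  => $manage_file_source,
    content => $manage_file_content,
  }

  # Optional management of the whole configuration directory.
  if $source_dir != '' {
    file { 'example.dir':
      ensure  => directory,
      path    => '/etc/example',
      source  => $source_dir,
      recurse => true,
      purge   => $source_dir_purge,
    }
  }
}
```

If neither source nor template is set, the file is only ensured to exist and its content is left alone.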

Decommissioning support

Providing parameters that allow the removal of the managed resources (like 'absent') is another option that wasn't widespread at the time and is becoming more and more common. My only concern here is the unfortunate naming choice: something like 'ensure' is definitely a better and clearer name.

Service management options

Params like disable, disableboot and service_autorestart were introduced to let users decide how to manage service startup and what happens when a config file changes. I've seen them used in other modules too, which confirms to me that they make sense, even if in this case too I think the naming choice was quite poor.
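To make the decommissioning and service parameters more concrete, here is a simplified sketch. The class name, resource titles and path are hypothetical, the parameters are assumed to be real booleans (not strings like 'false'), and the exact meaning given to disableboot here (service running but not enabled at boot) is my assumption for the example, not necessarily what each module does.

```puppet
# Hypothetical class showing absent, disable, disableboot
# and service_autorestart.
class example (
  $absent              = false,
  $disable             = false,
  $disableboot         = false,
  $service_autorestart = true
) {
  # absent => true removes the package and the configuration file.
  $manage_ensure = $absent ? {
    true    => 'absent',
    default => 'present',
  }

  if $absent or $disable {
    # Removed or disabled: stopped now and not started at boot.
    $manage_service_ensure = 'stopped'
    $manage_service_enable = false
  } elsif $disableboot {
    # Assumed semantics: running now, but not enabled at boot.
    $manage_service_ensure = 'running'
    $manage_service_enable = false
  } else {
    $manage_service_ensure = 'running'
    $manage_service_enable = true
  }

  # Restart the service on config changes only if requested.
  $manage_service_autorestart = $service_autorestart ? {
    true    => Service['example'],
    default => undef,
  }

  package { 'example':
    ensure => $manage_ensure,
  }

  service { 'example':
    ensure => $manage_service_ensure,
    enable => $manage_service_enable,
  }

  file { 'example.conf':
    ensure => $manage_ensure,
    path   => '/etc/example.conf',
    notify => $manage_service_autorestart,
  }

  # Install before starting, but stop before removing.
  if $absent {
    Service['example'] -> Package['example']
  } else {
    Package['example'] -> Service['example']
  }
}
```

A single ensure parameter with values like 'present' and 'absent' would replace the absent boolean with the clearer name mentioned above.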

Multi OS support and user's override options

Multi OS support was not new for modules even at the time. I consider it a condicio sine qua non for a reusable module, and the params class pattern was probably the best way to manage it.
Now this is going to change with Puppet 3.3.0, Hiera 2 and data in modules, which open a brand new world for a module's internal data management.
NextGen modules also expose all the OS specific params as class parameters, so that the OS default values set in params.pp can be overridden by users. This gives more reusability options for edge cases, but also adds a bunch of parameters which are rarely used, such as: package, service, config_file, config_dir, config_file_owner, config_file_group, config_file_mode ... Probably I'd keep only the first 3 or 4 of these parameters now.
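The params class pattern, with its defaults exposed as overridable class parameters, can be sketched like this. The package, service and file names are invented for the example, and the $::operatingsystem fact is the one commonly used in the Puppet 2.x/3.x era.

```puppet
# Hypothetical params class: OS specific defaults live in one place.
class example::params {
  $package = $::operatingsystem ? {
    /(?i:Debian|Ubuntu|Mint)/ => 'example-server',
    default                   => 'example',
  }
  $service = $::operatingsystem ? {
    /(?i:Debian|Ubuntu|Mint)/ => 'example',
    default                   => 'exampled',
  }
  $config_file = $::operatingsystem ? {
    /(?i:Debian|Ubuntu|Mint)/ => '/etc/example/example.conf',
    default                   => '/etc/example.conf',
  }
}

# The main class exposes the same values as parameters,
# so users can override the OS defaults for edge cases.
class example (
  $package     = $example::params::package,
  $service     = $example::params::service,
  $config_file = $example::params::config_file
) inherits example::params {
  package { $package:
    ensure => present,
  }
}
```

A user on an unsupported OS could then declare the class with, for instance, a custom package name instead of patching params.pp.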

Integrated monitoring and firewalling options

I firmly believe that a module should provide the possibility to automatically monitor and firewall the resources it installs, and do it in a "tool neutral" way, that is, it should not contain parameters related to specific monitoring or firewalling tools. I've seen this concept (mostly the firewalling integration) reused in other modules, like the PuppetLabs ones, and I think it's definitely worth following. The current implementation, based on meta classes like Example42's monitor and firewall ones, still doesn't fully satisfy me, and I'm also wondering if and where it makes sense to expose in the module all the parameters required to make this work (monitor, monitor_tool, monitor_target, firewall, firewall_tool, firewall_src, firewall_dst and other ones like port, protocol, pid_file, process, process_args).

Probably the wonders of the evolving Puppet Future Parser, which will likely be the default in Puppet 4, will allow better management of params like these: using and manipulating configuration hashes would limit the number of exposed parameters, while leaving the possibility to freely extend them with tool specific settings.

Puppi integration

This has remained incomplete work. One of Puppi's aims is to use Puppet's data to feed a CLI command. "Puppet Knowledge to the Shell" was my mantra, and I still think this can lead to powerful results. The problem is that the current Puppi still doesn't work with the NextGen Puppi integration (which is quite ridiculous, I admit) and Puppi 2, which should support it, is still incomplete. Also, what is missing from the whole picture is a sane web front end for all the data that Puppi might collect on the system.

It's all in the TODO list, but it has been there for quite a long time, so I would rate the Puppi integration in the NextGen modules a failure.
The Puppi integration also brings a bunch of other hardly used parameters, like data_dir, log_dir and log_file, and various ones already used for other functions.

Note that you can still use Puppi for local puppi checks (setting it as monitor_tool) and for its other main function, application deployment, and that works quite well.

Templates + Options Hash pattern

I don't like the idea of adding a parameter to a class for each/most configuration option of the managed application, as you might end up adding a large amount of parameters to your module.
For this reason a quite open solution, if you really want all your configurations as data, is to provide a custom template and feed it with a single configuration hash via the options parameter.
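A small self-contained sketch of the idea follows. The class name, path and option keys are hypothetical, and an inline_template stands in for the custom ERB file that a user would normally pass via the template parameter.

```puppet
# Hypothetical class: a single options hash instead of one
# parameter per configuration setting.
class example (
  $options = {}
) {
  file { 'example.conf':
    ensure  => file,
    path    => '/etc/example.conf',
    # One "key value" line per hash entry, sorted to keep the content stable.
    content => inline_template("<% @options.sort.each do |key, value| %><%= key %> <%= value %>\n<% end %>"),
  }
}

# Usage: all the application settings are passed as data.
class { 'example':
  options => {
    'Port'        => '2222',
    'MaxSessions' => '10',
  },
}
```

The hash can equally come from Hiera, which keeps the whole application configuration as data without adding any parameter to the class.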

The puppi module (which you can consider the stdlib for Example42 modules) also provides a useful function, options_lookup, contributed by Mike Novak, which allows easy usage of the options hash in an ERB template. I'd like to see such a function in the stdlib.

Personally I've not used this solution much (generally I just provide a custom template that hardcodes most of the specific settings and uses variables only for the most important or qualifying ones), and I suppose it is the same for others, also because the modules didn't ship sample templates that could show some usage patterns.

Debug, audit and noops

Some extra juice was added to the modules:
- A debug parameter, which dumps the whole class scope to a file. Useful, even if I rarely use it.
- An audit_only parameter, which was supposed to let the user only audit the changes the module would make. Never used it.
- A noops parameter, introduced later, which is supposed to run noops per module (the noop metaparameter on the class has no effect on the contained resources). The idea makes sense, imho, but the implementation was buggy and, incidentally, I spent last weekend fixing it in the modules that have this parameter, so it's time to update them!
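One possible way to implement a per module noops parameter is to pass the noop metaparameter to every resource of the class. This is only a sketch with hypothetical names, not necessarily what the fixed modules do. It maps false to undef because, as far as I know, an explicit noop => false on a resource would override an agent-wide noop run.

```puppet
# Hypothetical class with a per module noop switch.
class example (
  $noops = false
) {
  # true  => resources of this class are only simulated
  # undef => the agent-wide noop setting is left untouched
  $manage_noop = $noops ? {
    true    => true,
    default => undef,
  }

  file { 'example.conf':
    ensure => file,
    path   => '/etc/example.conf',
    noop   => $manage_noop,
  }

  service { 'example':
    ensure => running,
    noop   => $manage_noop,
  }
}
```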

Custom and dependency classes

In all the modules there's a my_class parameter which allows you to define a custom class where you can place extra resources related to the module. Strictly speaking this is not needed, as you can place these resources directly in the role/service class that uses the module, but the idea is to have whatever is related to a module (config files, extra resources and so on) defined in a single point. Not essential but not harmful, imho.

Recently I've also started to introduce a parameter like dependency_class, which allows the definition of a custom class where resources needed by the module, but provided by other modules, can be placed. The idea is to give users the possibility to use modules other than the Example42 ones to manage the required resources (database operations, extra repos, virtual host defines ...). I think that such an option is useful to allow better integration with other modules and to allow usage of single Example42 modules without entering dependency hell with modules from other authors.
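Both parameters boil down to an optional include of a user provided class. A minimal sketch, where the class names example, example::dependencies and site::example_dependencies are hypothetical and the default value for dependency_class is an assumption:

```puppet
# Hypothetical default dependency class shipped with the module.
class example::dependencies {
  # Resources provided by other modules (repos, databases ...) go here.
}

class example (
  $my_class         = '',
  $dependency_class = 'example::dependencies'
) {
  # Resources the module needs, possibly provided by other modules.
  # Set to '' to manage them elsewhere, or to a custom class name.
  if $dependency_class != '' {
    include $dependency_class
  }

  # Extra, site specific resources related to this module.
  if $my_class != '' {
    include $my_class
  }
}
```

Pointing dependency_class to something like site::example_dependencies lets a user satisfy the prerequisites with modules from other authors.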

Modules cloning via templates / blueprints 

All the NextGen modules are made in such a way that it is easy to rename them, do some sed work and have a brand new, fully featured module with limited effort.
Actually there's a script available to clone Example42 modules, which allows easy creation of a new module based on an existing one.

More than once someone has told me that if a module can be generated from a blueprint then there should be a saner way to inject data into a single module layout without making a new module every time.
Actually the idea behind a structure that allows quick cloning is not to have modules that are all the same (with some changes in the params class) but to have a solid and standard base from which to build features specific to the application managed.

This approach has also allowed other authors to create modules based on the NextGen layout, and several of them, like the ones from the Netmanagers guys, Marcus Burger and others, have been integrated and linked in the NextGen module set and are managed directly by the original authors. That's a model I like and hope to see grow.

Docs, Lints and Spec tests

All the modules have PuppetDoc compliant documentation and are tested via Puppet Lint and Rspec Puppet. This is basically a good thing, and it's effortlessly cloned from module to module. What I must admit here is that I haven't always given much care to the documentation or the test coverage of the new, module specific features. The result is that the documentation often looks the same everywhere and the tests don't add much value.
This is something I'd like to take better care of in the future.

What's next

The future parser and Hiera 2 may radically change the way we design our modules, and there's so much development in the Puppet ecosystem that it is clear we are moving towards another transition, where the DSL is radically opened to new solutions and patterns.

If I had to rewrite the NextGen modules now I'd do some things differently: no params_lookup, better naming choices, fewer parameters. Still, I think this layout works well with current setups, and I'm actually using it in production in many different places.

There will be sooner or later a third iteration of Example42 modules:
- The first one was done in the pre-2.6 era. It has some interesting concepts, for the times, and various drawbacks. I'd not use it anywhere now. Actually I should remove the OldGen modules that remain in the official Example42 module set.
- The second one, the NextGen, is strongly based on parametrized classes, requires Puppet > 2.6 and works well with Puppet 3, even if params_lookup is redundant there.
- The third one has yet to be done, and it might be compliant with Puppet 4 only (or at least Puppet > 3.3), trying to fully use the future parser, data in modules and whatever else will be available. It will also probably use rspec-system, adhere to stdmod naming standards, whatever they will be, and be based on stdlib.

Also, I'll definitely try to foster a more shared and open development effort, so that application specific experts can create and maintain the modules they need and use.

There's time to experiment on that: some patterns reveal their degree of success or failure only when they have been used and tested in different conditions.
Puppet 4 is expected by the end of the year, but some time will pass before it is widely used in production, so I presume the NextGen modules are here to stay for many months.

I'd love to discuss this and what has been written here directly on the Example42 Puppet Modules Google group (I have definitively given up on the idea of managing comments on this spam-infested blog).

And thanks for reading up to here! 

I'm not good at writing short blog posts :-)