Puppet in action
Client-server communication uses REST-like API calls over an SSL socket; in practice, it is all HTTPS traffic from clients to the server's port 8140/TCP.
The first time we execute Puppet on a node, its x509 certificates are created and placed in ssldir, and then the Puppet Master is contacted in order to retrieve the node's catalog.
On the Puppet Master, unless we have autosign enabled, we must manually sign the clients' certificates using the cert subcommand:
puppet cert list              # List the unsigned clients' certificates
puppet cert list --all        # List all certificates
puppet cert sign <certname>   # Sign the given certificate
Once the node's certificate has been recognized as valid and been signed, a trust relationship is created and a secure client-server communication can be established.
If we happen to recreate a new machine with an existing certname, we have to remove the certificate of the old client from the server:
puppet cert clean <certname> # Remove a signed certificate
At times, we may also need to remove the certificates on the client; a simple move command is safe enough:
mv /etc/puppetlabs/puppet/ssl /etc/puppetlabs/puppet/ssl.bak
After that, the whole directory will be recreated with new certificates when Puppet is run again (never do this on the server—it'll remove all client certificates previously signed and the server's certificate, whose public key has been copied to all clients).
A typical Puppet run is composed of different phases. It's important to know them in order to troubleshoot problems:
- Execute Puppet on the client. On a root shell, run:
puppet agent -t
- If pluginsync = true (the default since Puppet 3.0), the client retrieves any extra plugins (facts, types, and providers) present in the modules on the Master's $modulepath. Client output is as follows:
Info: Retrieving pluginfacts
Info: Retrieving plugin
- The client runs facter and sends its facts to the server. Client output is as follows:
Info: Loading facts in /var/lib/puppet/lib/facter/... [...]
- The server looks for the client's certname in its nodes list.
- The server compiles the catalog for the client using its facts. Server logs are as follows:
Compiled catalog for <client> in environment production in 8.22 seconds
- If there are syntax errors in the processed Puppet code, they are exposed here and the process terminates; otherwise, the server sends the catalog to the client in the PSON format. Client output is as follows:
Info: Caching catalog for <client>
- The client receives the catalog and starts to apply it locally. If there are dependency loops, the catalog can't be applied and the whole run fails. Client output is as follows:
Info: Applying configuration version '1355353107'
- All changes to the system are shown on stdout or in logs. If there are errors (in red or pink, depending on the Puppet version), they relate to specific resources and do not block the application of the other resources (unless those depend on the failed ones). At the end of the Puppet run, the client sends a report of what has been changed to the server. Client output is as follows:
Notice: Applied catalog in 13.78 seconds
- The server sends the report to a report collector if enabled.
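To make the dependency-loop failure mentioned in the steps above concrete, here is a minimal sketch of a manifest that compiles correctly but cannot be applied. The two file paths are hypothetical and chosen only for illustration.

```puppet
# Hypothetical example: two resources that require each other.
# The server can compile this catalog, but the client cannot
# find a valid order in which to apply the resources.
file { '/tmp/loop_a':
  ensure  => file,
  require => File['/tmp/loop_b'],
}

file { '/tmp/loop_b':
  ensure  => file,
  require => File['/tmp/loop_a'],
}
```

Removing either of the two require attributes breaks the cycle and lets the catalog be applied.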
Resources
When dealing with Puppet's DSL, most of the time we work with resources, which are single units of configuration that express the properties of objects on the system. A resource declaration is always composed of the following parts:
- type: This includes package, service, file, user, mount, exec, and so on
- title: This is how it is called and may be referred to in other parts of the code
- Zero or more attributes:
type { 'title':
  attribute       => value,
  other_attribute => value,
}
Inside a catalog, for a given type, there can be only one title; there cannot be multiple resources of the same type with the same title, otherwise we get an error like this:
Error: Duplicate declaration: <Type>[<name>] is already declared in file <manifest_file> at line <line_number>; cannot redeclare on node <node_name>.
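As a sketch of how this error typically arises, consider two classes that both declare the same package. The class names ssh_client and ssh_server are hypothetical and exist only for this illustration.

```puppet
# Hypothetical example: both classes declare Package['openssh'].
class ssh_client {
  package { 'openssh':
    ensure => present,
  }
}

class ssh_server {
  package { 'openssh':   # same type and same title as above
    ensure => present,
  }
}

# Including both classes on the same node puts two resources with
# the same type and title in one catalog, so compilation fails
# with a "Duplicate declaration" error.
include ssh_client
include ssh_server
```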
Resources can be native (written in Ruby), or defined by users in Puppet DSL.
These are examples of common native resources; what they do should be quite obvious:
file { 'motd':
  path    => '/etc/motd',
  content => "Tomorrow is another day\n",
}
package { 'openssh':
  ensure => present,
}
service { 'httpd':
  ensure => running, # Service must be running
  enable => true,    # Service must be enabled at boot time
}
For inline documentation about a resource, use the describe subcommand, for example:
puppet describe file
Note
For a complete reference of the native resource types and their arguments check: http://docs.puppetlabs.com/references/latest/type.html
The resource abstraction layer
From the previous resource examples, we can deduce that the Puppet DSL allows us to concentrate on the types of objects (resources) to manage, without having to worry about how these resources are applied on different operating systems.
This is one of Puppet's strong points: resources are abstracted from the underlying OS. We don't have to care about or specify how, for example, to install a package on Red Hat Linux, Debian, Solaris, or Mac OS; we just have to provide a valid package name. This is possible thanks to Puppet's Resource Abstraction Layer (RAL), which is engineered around the concept of types and providers.
Types, as we have seen, map to an object on the system. There are more than 50 native types in Puppet (some of them applicable only to specific OSes); the most commonly used are augeas, cron, exec, file, group, host, mount, package, service, and user. To have a look at their Ruby code, and learn how to make custom types, check these files:
ls -l $(facter rubysitedir)/puppet/type
For each type, there is at least one provider, which is the component that enables that type on a specific OS. For example, the package type is known for having a large number of providers that manage the installation of packages on many OSes, which are aix, appdmg, apple, aptitude, apt, aptrpm, blastwave, dpkg, fink, freebsd, gem, hpux, macports, msi, nim, openbsd, pacman, pip, pkgdmg, pkg, pkgutil, portage, ports, rpm, rug, sunfreeware, sun, up2date, urpmi, yum, and zypper.
We can find them here:
ls -l $(facter rubysitedir)/puppet/provider/package/
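In a manifest, the provider is normally chosen automatically for the current OS, but it can also be selected explicitly with the provider attribute. The following is a minimal sketch; the gem name rake is used purely as an illustration.

```puppet
# Let Puppet pick the default provider for the current OS
# (for example, yum on Red Hat or apt on Debian).
package { 'openssh':
  ensure => present,
}

# Explicitly select one of the providers listed above.
# 'rake' is only an illustrative package name.
package { 'rake':
  ensure   => present,
  provider => 'gem',
}
```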
The Puppet executable offers a powerful subcommand to interrogate and operate with the RAL: puppet resource.
For a list of all the users present on the system, type:
puppet resource user
For a specific user, type:
puppet resource user root
Other examples that might give glimpses of the power of RAL to map systems' resources are:
puppet resource package
puppet resource mount
puppet resource host
puppet resource file /etc/hosts
puppet resource service
The output is in the Puppet DSL format; we can use it in our manifests to reproduce that resource wherever we want.
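As a rough sketch, a resource printed for the root user might look like the following and could be pasted into a manifest as it is. The attribute values here are assumptions for illustration; the real output depends on the system and on the Puppet version.

```puppet
# Illustrative values only: the attributes and values actually printed
# by `puppet resource user root` vary by system and Puppet version.
user { 'root':
  ensure => 'present',
  home   => '/root',
  shell  => '/bin/bash',
  uid    => '0',
  gid    => '0',
}
```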
The puppet resource subcommand can also be used to modify the properties of a resource directly from the command line. Since it uses the Puppet RAL, we don't have to know how to do that on a specific OS. For example, to enable the httpd service:
puppet resource service httpd ensure=running enable=true
Nodes
We can place the preceding resources in our first manifest file (/etc/puppetlabs/code/environments/production/manifests/site.pp) or in files included from there, and they will be applied to all our Puppet-managed nodes. This is okay for quick samples out of books, but in real life things are very different: we have hundreds of different resources to manage, and dozens, hundreds, or thousands of different systems to apply different logic and properties to.
To help organize our Puppet code, there are two different language elements: with node, we can confine resources to a given host and apply them only to it; with class, we can group different resources (or other classes), which generally have a common function or task.
Whatever is declared in a node definition is included only in the catalog compiled for that node. The general syntax is:
node $name [inherits $parent_node] {
  [ Puppet code, resources and classes applied to the node ]
}
$name is the certname of the client (by default its FQDN) or a regular expression. A node can inherit whatever is defined in a parent node (note that node inheritance was removed in Puppet 4), and inside the curly braces we can place any kind of Puppet code: resource declarations, class inclusions, and variable definitions. An example is given as follows:
node 'mysql.example.com' {
  package { 'mysql-server':
    ensure => present,
  }
  service { 'mysql':
    ensure => 'running',
  }
}
But generally, in nodes we just include classes, so a better real life example would be:
node 'mysql.example.com' {
  include common
  include mysql
}
The preceding include statements do what we might expect; they include all the resources declared in the referred class.
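Since a node name can also be a regular expression, one definition can cover a whole group of hosts. The following sketch assumes a hypothetical apache class and a hypothetical naming scheme for web servers; it also uses the special default node name, which is not covered in the text above and matches any client without a more specific definition.

```puppet
# Matches web01.example.com, web02.example.com, and so on.
# The 'apache' class and the host naming scheme are hypothetical.
node /^web\d+\.example\.com$/ {
  include common
  include apache
}

# Fallback for clients whose certname matches no other node definition.
node default {
  include common
}
```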
Note that there are alternatives to the usage of the node statement: we can use an External Node Classifier (ENC) to define which variables and classes to assign to nodes, or we can have a nodeless setup, where the resources applied to nodes are defined in a case statement based on the hostname or a similar fact that identifies a node.
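A nodeless setup could be sketched roughly as follows, with the case statement placed directly in site.pp. The hostnames and the apache class are assumptions made for the example.

```puppet
# Nodeless sketch: classes are selected by a fact instead of node blocks.
# Hostnames and the 'apache' class are hypothetical.
case $::hostname {
  'mysql': {
    include common
    include mysql
  }
  /^web\d+$/: {
    include common
    include apache
  }
  default: {
    include common
  }
}
```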
Classes and defines
A class can be defined (resources provided by the class are defined for later usage but are not yet included in the catalog) with this syntax:
class mysql {
  $mysql_service_name = $::osfamily ? {
    'RedHat' => 'mysqld',
    default  => 'mysql',
  }
  package { 'mysql-server':
    ensure => present,
  }
  service { 'mysql':
    name   => $mysql_service_name,
    ensure => 'running',
  }
  […]
}
Once defined, a class can be declared (the resources provided by the class are actually included in the catalog) in multiple ways:
- Just by including it (we can include the same class many times, but it's evaluated only once):
include mysql
- By requiring it, which makes all resources in the current scope require the included class:
require mysql
- By containing it, which makes all resources requiring the parent class also require the contained class. In the next example, all resources in mysql and in mysql::service will be resolved before exec:
class mysql {
  contain mysql::service
  ...
}
include mysql
exec { 'revoke_default_grants.sh':
  require => Class['mysql'],
}
- Using the parameterized style (available since Puppet 2.6), where we can optionally pass parameters to the class, if available (we can declare a class with this syntax only once for each node in our catalog):
class { 'mysql':
  root_password => 's3cr3t',
}
A parameterized class has a syntax like this:
class mysql (
  $root_password,
  $config_file_template = undef,
  ...
) {
  […]
}
Here, we can see the expected parameters defined between parentheses. Parameters with an assigned value have it as their default, as is the case here with undef for the $config_file_template parameter.
The declaration of a parameterized class has exactly the same syntax as a normal resource (note that parameter names are written without the leading $ here):
class { 'mysql':
  root_password => 's3cr3t',
}
Puppet 3.0 introduced a feature called data binding: if we don't pass a value for a given parameter, then, before falling back to the default value (if present), Puppet does an automatic lookup of a Hiera key named $class::$parameter. In this example, it would be mysql::root_password.
This is an important feature that radically changes the approach of how to manage data in Puppet architectures. We will come back to this topic in the following chapters.
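A minimal sketch of data binding from the manifest side could look like this. The Hiera key is shown only as a comment, since where the data lives depends on the Hiera configuration, and the file resource inside the class is a hypothetical placeholder.

```puppet
# Assumed Hiera data (YAML, shown here as a comment; its location
# depends on the Hiera configuration):
#   mysql::root_password: 's3cr3t'

class mysql (
  $root_password,
  $config_file_template = undef
) {
  # Hypothetical resource, only to show the parameter being used.
  file { '/root/.my.cnf':
    mode    => '0600',
    content => "[client]\npassword=${root_password}\n",
  }
}

# No parameters are passed, so Puppet looks up mysql::root_password
# in Hiera. The parameter has no default, so compilation fails if
# the key cannot be found.
include mysql
```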
Besides classes, Puppet also has defines, which can be considered classes that can be used multiple times on the same host (with a different title). Defines are also called defined types, since they are types that can be defined using Puppet DSL, contrary to the native types written in Ruby.
They have a syntax similar to this:
define mysql::user (
  $password,            # Mandatory parameter, no defaults set
  $host = 'localhost',  # Parameter with a default value
  [...]
) {
  # Here all the resources
}
They are used in a similar way:
mysql::user { 'al':
  password => 'secret',
}
Note that defines (also called user-defined types, defined resource types, or definitions) like the preceding one, even if written in Puppet DSL, have exactly the same usage pattern as native types written in Ruby (such as package, service, file, and so on).
In defines, besides the parameters explicitly exposed, there are two variables that are automatically set: $title (which is the title used in the declaration) and $name (which defaults to the value of $title and can be set to an alternative value).
Since a define can be declared more than once inside a catalog (with different titles), it's important to avoid declaring resources with a static title inside a define. For example, this is wrong:
define mysql::user ( ...) {
  exec { 'create_mysql_user':
    [ … ]
  }
}
This is because, when there are two different mysql::user declarations, it will generate an error like:
Duplicate definition: Exec[create_mysql_user] is already defined in file /etc/puppet/modules/mysql/manifests/user.pp at line 2; cannot redefine at /etc/puppet/modules/mysql/manifests/user.pp:2 on node test.example42.com
A correct version could use the $title variable, which is inherently different each time:
define mysql::user ( ...) {
  exec { "create_mysql_user_${title}":
    [ … ]
  }
}
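The following self-contained sketch shows the pattern with two declarations in the same catalog. The echo command is a placeholder assumption standing in for whatever would actually create the user, and the second user name is hypothetical.

```puppet
define mysql::user (
  $password,
  $host = 'localhost'
) {
  # The title is interpolated, so each declaration yields a distinct Exec.
  exec { "create_mysql_user_${title}":
    # Placeholder command: a real define would call the mysql client
    # and make use of $password.
    command => "/bin/echo create ${name}@${host}",
  }
}

# Two declarations with different titles produce
# Exec['create_mysql_user_al'] and Exec['create_mysql_user_bob'].
mysql::user { 'al':
  password => 'secret',
}

mysql::user { 'bob':
  password => 'other_secret',
}
```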
Class inheritance
We have seen that, in Puppet, classes are just containers of resources and have nothing to do with classes in object-oriented programming, so the meaning of class inheritance is limited to a few specific cases.
When using class inheritance, the parent class (puppet in the sample below) is always evaluated first, and all the variables and resource defaults it sets are available in the scope of the child class (puppet::server).
Moreover, the child class can override the arguments of a resource defined in the parent class:
class puppet {
  file { '/etc/puppet/puppet.conf':
    content => template('puppet/client/puppet.conf'),
  }
}
class puppet::server inherits puppet {
  File['/etc/puppet/puppet.conf'] {
    content => template('puppet/server/puppet.conf'),
  }
}
Note the syntax used: when declaring a resource, we use a syntax such as file { '/etc/puppet/puppet.conf': [...] }; when referring to it, the syntax is File['/etc/puppet/puppet.conf'].
Even when possible, class inheritance is usually discouraged in Puppet style guides except for some design patterns that we'll see later in the book.
Resource defaults
It is possible to set default argument values for a resource type in order to reduce code duplication. The general syntax to define a resource default is:
Type { argument => default_value, }
Some common examples are:
Exec {
  path => '/sbin:/bin:/usr/sbin:/usr/bin',
}
File {
  mode  => '0644', # Quoted: Puppet 4 requires the file mode to be a string
  owner => 'root',
  group => 'root',
}
Resource defaults can be overridden when declaring a specific resource of the same type.
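As a short sketch of such an override, the second file below changes only the mode and keeps the other defaults. The script path and the backup module source are hypothetical.

```puppet
File {
  owner => 'root',
  group => 'root',
  mode  => '0644',
}

# Picks up owner, group, and mode from the defaults above.
file { '/etc/motd':
  content => "Tomorrow is another day\n",
}

# Overrides only the mode; owner and group still come from the defaults.
# The script path and the module source are hypothetical.
file { '/usr/local/bin/backup.sh':
  mode   => '0755',
  source => 'puppet:///modules/backup/backup.sh',
}
```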
It is worth noting that the area of effect of resource defaults might bring unexpected results. The general suggestion is as follows:
- Place the global resource defaults in site.pp outside any node definition
- Place the local resource defaults at the beginning of a class that uses them (mostly for clarity's sake, as they are parse-order independent)
We cannot expect a resource default defined in a class to be working in another class, unless it is a child class, with an inheritance relationship.
Resource references
In Puppet, any resource is uniquely identified by its type and its name. We cannot have two resources of the same type with the same name in a node's catalog.
We have seen that we declare resources with a syntax such as:
type { 'name': arguments => values, }
When we need to reference them (typically when we define dependencies between resources) in our code, the syntax is (note the square brackets and the capital letter):
Type['name']
Some examples are as follows:
file { 'motd': ... }
apache::virtualhost { 'example42.com': .... }
exec { 'download_myapp': .... }
These resources are referenced, respectively, with the following code:
File['motd']
Apache::Virtualhost['example42.com']
Exec['download_myapp']
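References are most often seen as values of relationship attributes such as require and notify. The sketch below assumes an sshd service name and a hypothetical ssh module that provides the configuration file.

```puppet
package { 'openssh':
  ensure => present,
}

file { 'sshd_config':
  path    => '/etc/ssh/sshd_config',
  source  => 'puppet:///modules/ssh/sshd_config', # hypothetical module file
  require => Package['openssh'],  # manage the file after the package
  notify  => Service['sshd'],     # restart the service when the file changes
}

service { 'sshd':
  ensure => running,
  enable => true,
}
```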