The Role of Jenkins in the DevOps Toolchain
The traditional role of Jenkins in the DevOps toolchain is to build application artifacts from source code and share them as a common resource among various engineering teams. The common build on Jenkins serves different purposes depending on the team:
It helps the Development team to identify and resolve conflicts in code changes.
The QA team uses the artifacts built by Jenkins for testing, both automated and manual.
The Production Engineering team uses the verified or released artifacts for deployments in various application environments.
If the artifacts built by Jenkins are tested automatically using features provided by Jenkins, then any code change can be tested automatically soon after it is checked in. That process is called Continuous Integration (CI), and Jenkins is touted as a CI platform. The following are the core features that qualify it as a CI platform:
Options to execute and analyze unit tests.
Chaining of jobs, with which build and test dependencies can be defined easily.
Support for slave nodes that can be configured to run automated tests.
A Pipeline feature that can be used to implement complex build and test workflows as code.
A Jenkins feature that is not very well advertised (but heavily used regardless) is the set of scheduling options it offers. Typically, those options are used to poll a code repository for changes and trigger fresh builds.
Luckily, a Jenkins job doesn't have to be tied to a code repository; that part is optional. That means you can set up a Jenkins job to do pretty much anything you can dream up and use its scheduling options to run it whenever you want. Jenkins supports named macros like @daily as well as all the features of crontab scheduling.
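For instance, the schedule field of Jenkins' "Build periodically" trigger accepts crontab-style five-field specs along with named aliases and the H token, which hashes the start time per job to spread load (the schedules below are examples):

```
# Run once a day, at a hashed time chosen per job to spread load
@daily

# Run at a hashed minute within the 2 AM hour, every day
H 2 * * *

# Run every 15 minutes
H/15 * * * *
```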
This article primarily looks at the advantages of moving some of your cron jobs to Jenkins.
Automated Jobs on crontab
If a job is run from crontab, a couple of things are certain: the job is automated and it is periodically run. The main requirements of using a crontab are:
You need a server to set it up.
The crontab is specific to a user, and the job is defined in the crontab of one of the users.
The crond daemon must be running on the server for the job to be executed on schedule.
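For comparison, a classic crontab entry bundles the schedule, the command, and (optionally) local log redirection into one line on a specific server; the paths below are illustrative:

```
# min hour day-of-month month day-of-week  command
30 2 * * * /opt/scripts/db_backup.sh >> /var/log/db_backup.log 2>&1
```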
In a cloud-based computing infrastructure, with cloud storage like S3, serverless computing services like Lambda, and the increased use of REST APIs to integrate and query systems, a cron job might have nothing to do with the server where it runs. That begs for moving the scheduled jobs to an infrastructure that is better suited to cloud computing.
Normally, there are two broad categories of cron jobs:
Jobs that are part of the application, like ETL jobs running in a big data application.
General purpose jobs that are not tied to any application, such as backup and system maintenance jobs.
The general purpose jobs are prime candidates for running from Jenkins instead of from some random server that can be hard to even locate at times.
Disadvantages of Using crontab
Lack of Visibility
To locate a job run from crontab, you need to locate the related server and the user. No amount of documentation makes this task much easier. Configuration Management (CM) systems, especially Puppet, offer excellent ways to define crontab jobs as code. Yet you might still end up with many cron jobs that were set up manually and that nobody wants to disturb.
Difficult or No Access to Logs
The output from a cron job can be saved to a log file locally on the server where it runs. If an issue happens in production, engineers from different teams may want to look at it. Typically, only Operations staff has access to the server where the log file lives, and it is a nightmare to grant temporary access, especially when someone wants to watch the live log and tail it.
Dependency on the cron Daemon
For the jobs scheduled on crontab to be executed, the cron daemon (crond) must be running. Though it is not very common, the daemon can crash, so for higher reliability, that daemon process has to be monitored.
Code Not in SCCS
The scripts run from crontab are not usually checked into a Source Code Control System (SCCS) like Git. If the server hosting the cron jobs crashes, the scripts are lost too, and I can vouch that such mishaps happen in production.
Jenkins as a Scheduler
These are some of the advantages of using Jenkins for scheduling periodic jobs:
The jobs defined on Jenkins are highly visible, with the option to group similar jobs in a view. If you have ever had to deal with an unwieldy number of cron jobs in a single crontab, the Jenkins features for organizing jobs are a boon.
The logs are readily accessible, and access can be granted based on the user's role. There is no need to grant a user access to the actual server machine just to look at logs.
Though it is possible to create a log per job run with crontab, too, that feature comes out of the box with Jenkins. Log output organized per run of the job improves readability and facilitates triaging issues if something breaks.
If the cron jobs are consolidated on Jenkins, the monitoring requirement is minimal: you only need to make sure that the Jenkins instance is up and running, instead of making sure that crond is running on every server.
Jenkins provides the option to check out the latest version of a script from SCCS and run it. Even if inline shell scripting is used, that code can easily be preserved by backing up the job's config file.
There are also advantages that cannot be replicated with crontab at all.
One-off Invocation of Jobs
A job that is scheduled to run periodically can also be executed manually anytime, as needed. Such manual executions of cron jobs are clumsy and not directly supported; the script has to be run by hand from a shell.
UI Form for Job With Parameters
Though jobs with parameters are designed to be run interactively (in which case, the user inputs values for the job parameters before executing it), the input values can also be injected by upstream jobs, letting such jobs run automatically with the help of suitable plugins. Jobs with parameters can also be invoked via the REST API, with the parameter values posted in the request.
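As a sketch, assuming a parameterized job named nightly-backup on a hypothetical Jenkins server, the REST invocation amounts to a POST against the job's buildWithParameters endpoint; the server URL, job name, credentials, and parameter below are all placeholders:

```shell
#!/bin/sh
# Compose the REST endpoint for triggering a parameterized Jenkins job.
# The server URL, job name, and parameter are illustrative placeholders.
build_trigger_url() {
    jenkins_url="$1"
    job="$2"
    printf '%s/job/%s/buildWithParameters' "$jenkins_url" "$job"
}

URL="$(build_trigger_url https://jenkins.example.com nightly-backup)"
echo "$URL"

# The actual trigger would POST the parameters, authenticated with an API token:
#   curl -X POST "$URL" --user admin:API_TOKEN --data TARGET_ENV=staging
```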
Handling of Secrets
In a cron job, secrets such as user passwords and API keys have to be hard-coded either in the scripts or in the config files the scripts use. Jenkins offers good options to manage secrets and read them from encrypted stores.
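As a minimal sketch, assuming the secret is injected into the job's environment (for example, by a credentials binding), the job script reads it from a variable instead of hard-coding it; DB_PASSWORD is an illustrative name:

```shell
#!/bin/sh
# In a real job, Jenkins injects DB_PASSWORD into the environment; the
# default below only simulates that injection so the sketch runs standalone.
DB_PASSWORD="${DB_PASSWORD:-example-secret}"

# Fail fast if the secret is missing rather than running with an empty value
[ -n "$DB_PASSWORD" ] || { echo "DB_PASSWORD was not injected" >&2; exit 1; }

# Use the secret without ever printing its value to the job log
echo "secret present (length ${#DB_PASSWORD} characters)"
```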
Execute Jobs Using API Calls
Jenkins jobs can be run programmatically using API calls or the Jenkins CLI. That opens up the opportunity to implement complex workflows that integrate very disparate systems in the company. The newly available Pipeline feature augments these workflow capabilities.
What Jobs Should Be on Jenkins
In my opinion, all general purpose automated jobs that need to run periodically should be moved to Jenkins. These are typical jobs from that category:
Backup jobs in general: backing up databases, application configs, cloud resources, etc.
Periodic updates and monitoring of system resources: applying patches, enforcing security settings, etc.
I think that periodic jobs related to applications should still run from crontab. However, if any of the advantages of using Jenkins discussed earlier is an important factor in your decision-making, Jenkins can simply be configured as a scheduler in the application stack.
Configuring Jenkins to Run Jobs Remotely
There are two types of jobs that you may have to schedule from Jenkins: those with no remote server access requirement and those that require remote server access.
Jobs With No Remote Access Requirement
These could be jobs that use only REST APIs, native application APIs, or CLIs to get the job done. REST APIs hardly need any additional setup beyond the availability of curl or wget on the Jenkins server. Native APIs and CLIs need some client applications and libraries installed on the Jenkins server, such as the MySQL client or the AWS CLI.
Jobs With Remote Server Access Requirement
Normally, these are jobs that run on the target host itself, in which case they could also be run locally from the crontab on that host. If the job is managed from Jenkins, it has to be launched remotely on the target host. There are two readily available methods to do this: one native to Jenkins and the other using SSH.
Target Host as a Jenkins Slave
If the target host is configured as a slave, any script can be launched on it with no additional configuration. To add a machine as a slave to a Jenkins server, a Java process using slave.jar must be running on the slave machine. Beyond that, there is not much more to do. So, if several jobs have to run remotely on a target host, it might make sense to configure it as a Jenkins slave.
Launching Jobs Over SSH
Jenkins can launch jobs on the target host over SSH. As the job is automated, the SSH access from the Jenkins server to the target host must be configured to be passwordless. The details of setting this up in various ways are discussed in another article. The main points are:
Set up passwordless SSH access for the Jenkins user jenkins from the Jenkins server to the target host, so that the following command works without a password:
jenkins> ssh -i /path/to/private-key remote_user@remote-host
The public key corresponding to that private key must be in ~remote_user/.ssh/authorized_keys on the target host.
remote_user should have permission to run the script without a password. Usually, this means remote_user having passwordless sudo access on the remote host.
Disable requiretty on the target host by commenting out the Defaults requiretty entry in its /etc/sudoers file.
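The key setup itself can be sketched as follows; the key path, user, and host names are placeholders, and the ssh-copy-id step is shown commented out because it is a one-time interactive action:

```shell
#!/bin/sh
# Generate a dedicated, passphrase-less key pair for the jenkins user
KEYDIR="$(mktemp -d)"
ssh-keygen -q -t ed25519 -N '' -f "$KEYDIR/jenkins_ci_key"

# One-time step: install the public key on the target host so that
# "ssh -i $KEYDIR/jenkins_ci_key remote_user@remote-host" needs no password.
#   ssh-copy-id -i "$KEYDIR/jenkins_ci_key.pub" remote_user@remote-host

ls "$KEYDIR"
```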
Pointers for Maintenance
Use scripts checked into SCCS to execute a job. Keep inline code in the Jenkins job configuration to a minimum.
Take a backup of (at least) the JENKINS_HOME/jobs/JOB/config.xml files. A full backup of JENKINS_HOME (usually /var/lib/jenkins) is better: you will end up installing important plugins, and those plugins and the global configuration settings are saved when all of JENKINS_HOME is backed up.
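A minimal sketch of such a backup, using a throwaway directory in place of the real JENKINS_HOME (usually /var/lib/jenkins) so it can run anywhere:

```shell
#!/bin/sh
# Throwaway JENKINS_HOME with one job, standing in for /var/lib/jenkins
JENKINS_HOME="$(mktemp -d)"
mkdir -p "$JENKINS_HOME/jobs/nightly-backup"
echo '<project/>' > "$JENKINS_HOME/jobs/nightly-backup/config.xml"

# Archive only the per-job config.xml files; a full JENKINS_HOME backup
# would also capture plugins and global configuration.
BACKUP="$JENKINS_HOME/job-configs.tar.gz"
( cd "$JENKINS_HOME" && find jobs -maxdepth 2 -name config.xml | tar -czf "$BACKUP" -T - )

tar -tzf "$BACKUP"   # lists jobs/nightly-backup/config.xml
```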
Use Jenkins to manage secrets and inject them into a job's runtime environment instead of hard-coding them. Hard-coded secrets are far more visible on Jenkins because job logs are easily accessible, so any chance of exposing secrets should be avoided.