Monday, April 12, 2021

GitHub Actions: Quirks, Issues, and Tips for Real-world CI/CD Pipelines

Continuous integration and delivery (CI/CD) is an important part of any modern software development cycle. It helps keep code quality high, helps keep applications secure, and bridges the gap between everyday work and your visitors’ experience.

Nowadays it’s a given that a CI/CD pipeline will be part of a workflow, but choosing a provider or platform can be difficult. Oomph has made use of a number of CI/CD tools over the years: DeployBot, Jenkins, and Travis CI have all made appearances. Most of our projects in the last few years have used Travis, but more recently we’ve found it to be unreliable. Just as we began searching for a new provider, full CI/CD support was announced for GitHub Actions.

We immediately added Actions to the list of providers we were interested in, and after some comparison, we began migrating projects to it. Overall we’ve found it to be beneficial: the syntax is well-designed, workflows are extensible and modular, the platform is reliable and performant, and we’ve experienced no major trouble. There are already plenty of good guides and articles on how to use GitHub Actions; we won’t repeat that here. Instead, we’ll look at a few gotchas and issues that we’ve encountered while using the platform, to give an accurate picture of things you may come across while implementing GitHub Actions.

Considerations

The team behind GitHub Actions knew what they were doing, and it’s clear they learned from and improved on previous CI/CD implementations. This is most obvious in the clear structure of the syntax, the straightforward pricing model, and the useful feature set. However, Actions’ in-progress state is apparent in some areas.

Artifact Storage and Billing

GitHub provides a generous amount of free build time for all repositories and organizations. Storage, though, is much more limited: only 2GB is included for GitHub Teams organizations. If you want to store build artifacts for all of your CI/CD jobs (a good idea for testing and repeatability) you may need to configure a “spending limit”, i.e. a maximum amount you’re willing to spend each month on storage. GitHub charges $0.25/GB for storage beyond the included 2GB.

Artifact storage is still rudimentary. Jobs can upload artifacts for download by other jobs later in the workflow, but the lifetime of those artifacts cannot be configured; they expire after 90 days, and before then they can only be deleted manually. Manual deletions also take some time to free up storage space.
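As a rough sketch of the hand-off described above, one job can upload a build directory and a later job can download it. The job names, the artifact name app-build, the build/ path, the make build command, and the @v4 version tags are illustrative assumptions, not details taken from our own pipelines.

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make build            # hypothetical build command that writes to build/
      - uses: actions/upload-artifact@v4
        with:
          name: app-build          # assumed artifact name
          path: build/
  test:
    runs-on: ubuntu-latest
    needs: build                   # run only after the artifact has been uploaded
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: app-build          # must match the name used at upload
          path: build/
      - run: ls -R build/          # placeholder for real tests
```

Every uploaded artifact counts toward the storage allowance, so it is worth uploading only the directories that later jobs actually need.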

We also experienced an issue where our reported usage for Actions storage was greatly exaggerated (by roughly 500%), putting us far past our spending limit and breaking builds. When we reached out to GitHub’s support, though, they responded quickly to let us know this was a system-wide issue and they were working on it; the issue was resolved some days later and we were not charged for the extra storage. We were able to work around it in the meantime by raising our spending limit.

Restarting and Debugging Jobs

If a workflow fails or is canceled, it can be restarted from the workflow page. However, it’s not yet possible to restart individual jobs; the entire workflow has to be run again. GitHub is working on support for job-specific restarts.

Debugging job failures is also not yet officially supported, but various community projects make this possible. We’ve used Max Schmitt’s action-tmate to debug our builds, and that does the job. In fact, I prefer this approach to the Travis method; with this we can specify the point of the workflow where we want to start debugging, whereas Travis always starts debugging at the beginning of the build.
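A minimal sketch of that placement, assuming the action is published as mxschmitt/action-tmate and that the @v3 tag is current; the test script name is hypothetical. Because the debug step sits after the test step and is guarded by failure(), the session opens only at the point where something has gone wrong.

```yaml
steps:
  - uses: actions/checkout@v4
  - name: Run tests
    run: ./run-tests.sh            # hypothetical test script
  - name: Open debug session
    if: ${{ failure() }}           # run only when an earlier step has failed
    uses: mxschmitt/action-tmate@v3
```

Moving the debug step earlier or later in the list changes where the session starts, which is the flexibility described above.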

Log Output

GitHub Actions has an excellent layout for viewing the output of jobs. Each job in a workflow can be viewed, and within that each step can be expanded on its own. The output from the current step can also be seen in near-real-time. Unfortunately, this last feature has been somewhat unreliable for us, lagging behind or failing to show the output for short steps. (To be fair to GitHub, I have never used a CI/CD platform where the live output worked flawlessly.) Viewing the logs after completion has never been a problem.

Configuring Variables/Outputs

GitHub Actions allows you to configure outputs for an action, so a later step can use some value or outcome from an earlier step. However, this only applies to packaged actions that are included with the uses keyword.

Doing something similar with a free-form step is more convoluted. First, the step must use some odd syntax to set an output parameter, e.g.:

- name: Build
  id: build
  run: |
    echo "::set-output name=appsize::$(du -csh --block-size=1G  build/ | tail -n1 | cut -d$'\t' -f1)"

Then a later step can reference this parameter with the steps context:

- name: Provision server
  run: terraform apply -var "app_ebs_volume_size=${{ steps.build.outputs.appsize }}"

However, the scope of the above is limited to the job it takes place in. To reference values across jobs you must also set the values within the outputs map in the jobs context, e.g.:

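jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      appsize: ${{ steps.build.outputs.appsize }}
    steps:
    - name: Build
      id: build
      run: |
        echo "::set-output name=appsize::$(du -csh --block-size=1G  build/ | tail -n1 | cut -d$'\t' -f1)"
  # The name of this second job is assumed; only "needs: build" is given.
  deploy:
    runs-on: ubuntu-latest
    needs: build
    steps:
    - run: terraform apply -var "app_ebs_volume_size=${{ needs.build.outputs.appsize }}"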

Importantly, the outputs map from a previous job is only made available to jobs that require it with the needs directive.

While this setup is workable, the syntax feels a little weird, and the lack of documentation on it makes it difficult to be certain of what you’re doing. This is evolving, as well; the jobs.<job_id>.outputs context was only released in early April. Before that was added, persisting data across jobs required the use of build artifacts, which was clunky and precluded its use for sensitive values.
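The syntax has kept evolving since then. To the best of my knowledge, newer runner versions deprecate the ::set-output command in favor of appending name=value lines to the file named by the GITHUB_OUTPUT environment variable. A sketch of the Build step in that style, assuming a runner recent enough to provide that variable:

```yaml
- name: Build
  id: build
  run: |
    # Same size calculation as before, appended to the step's outputs file
    echo "appsize=$(du -cs --block-size=1G build/ | tail -n1 | cut -d$'\t' -f1)" >> "$GITHUB_OUTPUT"
```

Later steps and jobs would still read the value through steps.build.outputs.appsize and the job-level outputs map, so only the line that writes the value changes.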

Self-hosted Runners

Sometimes security or access requirements prohibit a cloud-hosted CI/CD runner from reaching into an environment to deploy code or provision resources, or some sensitive data needs to be secured. For these scenarios, GitHub provides the ability to self-host Actions runners. Self-hosted runners run the CI/CD process from an arbitrary VM or container within the secured network or environment. You can use them alongside cloud-hosted runners; as an example, in some situations we use cloud-hosted runners to test and validate builds before having the self-hosted runners deploy those builds to an environment.

This feature is currently in beta, but it has proven reliable and extremely useful in the places we’ve needed it.
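A minimal sketch of that split between cloud-hosted testing and self-hosted deployment, assuming a runner has already been registered with the repository under the default self-hosted label; the job names and script names are hypothetical.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest         # GitHub-hosted runner
    steps:
      - uses: actions/checkout@v4
      - run: ./run-tests.sh        # hypothetical test script
  deploy:
    runs-on: self-hosted           # runner inside the secured network
    needs: test                    # deploy only if the test job succeeded
    steps:
      - uses: actions/checkout@v4
      - run: ./deploy.sh           # hypothetical deploy script
```

Only the deploy job ever touches the secured environment, so credentials and network access can stay on the self-hosted machine.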

Reliability and Performance

Overall GitHub Actions has been very reliable for us. There have been periods of trouble here and there, but GitHub is open about the issues and generally addresses them in short order. We have not (yet) been seriously impeded by any outages or degradation, which is a significant improvement over our previous CI/CD situation.

Overall Experience

In general, the switch to GitHub Actions has been a positive experience. It has significantly improved our CI/CD workflows; the platform has some great features and has certainly been beneficial for our development lifecycle. While Actions may have a few quirks or small issues here and there, we wouldn’t hesitate to recommend it as a CI/CD platform.