Gentoo Infrastructure

== Binary package requirements ==
* Ability to build and install binary packages with the same version but different USE flags, for example the MySQL server package (<code>-minimal</code>) and the MySQL client & libs package (<code>minimal</code>). A sketch of such a "minimal" build environment is shown at the end of this section.
** don't go there: this imposes a significant amount of maintenance work and may still break. Rather provide large enough base sets and accept that some packages install too much (you can still disable them at runtime), and build the few deviations from the rule on the servers from source --[[User:Tiziano|Tiziano]] ([[User talk:Tiziano|talk]]) 14:39, 3 January 2014 (CET)
*** Yes, we need to and can go there :-) I agree with you that we should do this only if necessary; apache, for example, can be built once and has the ability to turn features (module loading) on/off via its configuration. Other software does not provide such run-time configuration, which results in unwanted server software and dependencies on the installed hosts (<code>net-analyzer/zabbix</code> for example). I clearly do not want to have a dedicated build environment for each of those packages; I would rather see a build env, called minimal for example, which is used to build all those database packages with only libs and clients enabled (use the same env for PostgreSQL, OpenLDAP, MySQL etc.). As stated before, the whole build process needs to be automated, so I don't see a considerable increase in maintenance work coming up here. The dependency problem is mitigated by the fact that we have a frozen portage tree for all our build envs and therefore use the same versions everywhere. --[[User:Chrigu|Chrigu]] ([[User talk:Chrigu|talk]]) 12:04, 6 January 2014 (CET)
*** Yes and no on this one. We clearly need to keep the list of packages that require this to a bare minimum. <code>net-analyzer/zabbix</code> for instance doesn't warrant this, we just won't start the server on non-server nodes. Easy as cake. The server code and its deps won't do any harm on, say, a desktop or other server box. Even though I can't think of an example, I do believe we will be needing this possibility when we encounter packages that need to be built using different profiles for different use cases, things like having a PHP with curlwrappers vs. one with the curl module sans curlwrappers. The important point I take from this is that creating new profiles with small deviations from our default must be very easy (i.e. not much work). Basically we need the infra's support for n different build profiles to be fully automated and well documented. [[User:Lucas|Lucas]] ([[User talk:Lucas|talk]]) 19:52, 9 January 2014 (CET)
**** <code>net-analyzer/zabbix</code> is definitely a good example; I don't want to install and maintain MySQL, Apache, PHP, snmpd (including all the deps) etc. on hosts which just need a Zabbix agent. I would also like to pragmatically avoid unused deps, in order to minimize reverse-dependency updates and security updates (which must be provided regardless of whether the software is in use or not). --[[User:Chrigu|Chrigu]] ([[User talk:Chrigu|talk]]) 13:20, 10 January 2014 (CET)
* Providing binary packages for different major (and sometimes minor) versions, for example: <code>dev-db/mysql-5.X.Y</code> and <code>dev-db/mysql-6.X.Y</code>.
* Provide binary packages for pre-compiled Linux kernels and modules (not just a binary package of <code>sys-kernel/gentoo-sources</code>)
** This makes it possible to build stage4 images from binary packages.
** Most likely there will be separate packages for servers and desktops built with different genkernel configs.
* Handle reverse dependency updates and ABI changes
* Support for multiple environments (development, staging and production)
* Support for multiple architectures (such as x86, amd64 etc.)
* Support for multiple portage build profiles
** system (or base) profile, such as desktop or server (stage3): all the packages defined by <code>/etc/portage/make.profile</code> or installed via <code>emerge @system</code>
** application profiles, such as php5-app, django-app etc.
** simple inheritance is used for things like python-app -> django-app
** stacks consist of one system profile and multiple application profiles
* All build profiles will use a system profile as their base profile
** don't do this: Gentoo itself has only a few profiles and even there issues arise when combining them (for example desktop + selinux-hardened) --[[User:Tiziano|Tiziano]] ([[User talk:Tiziano|talk]]) 14:40, 3 January 2014 (CET)
*** Those are build profiles (for example chroots or some sort of overlay-fs), not Gentoo (portage) profiles; we definitely need to clarify those terms ;) --[[User:Chrigu|Chrigu]] ([[User talk:Chrigu|talk]]) 20:01, 5 January 2014 (CET)
* Ability to update an existing build profile, without the need to build it from scratch
* Ability to do fully automated clean builds (i.e. for new archs or new stacks)
** provide an old version of the tree
** cherry pick updates
*** this should be avoided at all costs since it can lead to various sorts of breakages (ebuild <-> ebuild, ebuild <-> eclass, ebuild <-> profile, eclass <-> profile interaction) --[[User:Tiziano|Tiziano]] ([[User talk:Tiziano|talk]]) 14:24, 3 January 2014 (CET)
**** Yes, I agree. Nonetheless, we need the ''possibility'' to do cherry picking, for example to react to zero-day exploits. --[[User:Chrigu|Chrigu]] ([[User talk:Chrigu|talk]]) 19:53, 5 January 2014 (CET)
* Support for a development, staging and production branch
** Ability to automatically sync from upstream
* A profile sets all the default values for the client's [http://dev.gentoo.org/~zmedico/portage/doc/man/make.conf.5.html <code>make.conf</code>], such as USE flags, PORTAGE_BINHOST, GENTOO_MIRRORS, CFLAGS, CHOST etc.
** '''Warning''': many such variables are not incremental and therefore need duplication of Gentoo base profile variables (requiring that someone tracks changes in those variables) --[[User:Tiziano|Tiziano]] ([[User talk:Tiziano|talk]]) 14:29, 3 January 2014 (CET)
* keep the profiles (and the inheritance structure) as simple as possible; rather duplicate than inherit for small deviations, to avoid inheritance issues --[[User:Tiziano|Tiziano]] ([[User talk:Tiziano|talk]]) 14:33, 3 January 2014 (CET)
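A minimal sketch, not a decided layout, of how such a deviating "minimal" build environment could look, assuming a dedicated build env of that name, the stock <code>minimal</code> USE flag of <code>dev-db/mysql</code> and the <code>agent</code>/<code>server</code> USE flags of <code>net-analyzer/zabbix</code> (paths are illustrative):

<pre>
# /etc/portage/make.conf inside the hypothetical "minimal" build env
FEATURES="buildpkg"                  # always produce binary packages
PKGDIR="/var/cache/binpkgs/minimal"  # separate binhost directory for this env

# /etc/portage/package.use/minimal
dev-db/mysql minimal                 # client & libs only, no server
net-analyzer/zabbix agent -server -proxy -frontend

# build the binary packages inside this env
emerge --buildpkg dev-db/mysql net-analyzer/zabbix
</pre>

The regular build environments would carry the opposite USE settings, so the same package versions end up in two different binhost directories.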
== Package host requirements ==
* Serving files via HTTPS
** Binary packages for all the clients (<code>PORTAGE_BINHOST</code>), which were built by the [[#Build_host_requirements|build host]]
*** Binary packages will be accessible via an HTTPS URL such as <code>https://packages.example.com/gentoo/ENVIRONMENT/ARCH/BUILD-PROFILE-NAME/latest</code>, which is a symlink to the current snapshot <code>https://packages.example.com/gentoo/ENVIRONMENT/ARCH/BUILD-PROFILE-NAME/YYYY-MM-DD</code>.
*** Clients will have <code>PORTAGE_BINHOST="https://packages.example.com/gentoo/ENVIRONMENT/ARCH/SYSTEM-PROFILE-NAME/latest https://packages.example.com/gentoo/ENVIRONMENT/ARCH/APP-STACK-PROFILE-NAME/latest ..."</code> set in their <code>/etc/portage/make.conf</code> (see the example sketch after this list).
* Support for all three environments (development, staging and production)
* Possibility to authenticate clients either via HTTP basic auth or client certificates.
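A hedged sketch of what a client's <code>make.conf</code> could then look like, using the placeholder host name above and illustrative environment/profile names:

<pre>
# /etc/portage/make.conf on a production amd64 host
# (one URL per profile: system profile "server" plus app profile "php5-app")
PORTAGE_BINHOST="https://packages.example.com/gentoo/production/amd64/server/latest https://packages.example.com/gentoo/production/amd64/php5-app/latest"

# prefer binary packages from the binhosts over building from source
EMERGE_DEFAULT_OPTS="--usepkg --getbinpkg"
</pre>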
== File mirror host requirements ==
* Hosts all the files required to build a package (<code>GENTOO_MIRRORS=mirror.example.com/public/gentoo/distfiles</code>; see the sketch after this list)
** Acts as a caching mirror for already downloaded packages from an official mirror
** Serves fetch-restricted files (<code>dev-java/oracle-jdk-bin</code> for example) to authorized clients
* Files are served via HTTPS
* Distinguishes between three groups of files
** '''public''': Files which are available to all clients (theoretically even to the entire internet)
** '''site-local''': Files which are only available to authenticated clients belonging to the same infrastructure (for example those which would put us into [http://www.bettercallsaul.com/ legal trouble] if available to the public)
** '''stack-local''': Files which are only available to authenticated clients belonging to the same infrastructure and the software stack group (private files of a specific customer)
* Provides an easy way to let an administrator manually upload new files, for example via WebDAV-CGI, SFTP or a similar mechanism.
* Possibility to authenticate clients either via HTTP basic auth or client certificates.
* Old or no longer supported files will be removed automatically
* Can be implemented on the [[#Build_host_requirements|build host]] (see also [[Mirror Server#Requirements]])
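A short sketch of the corresponding client-side setting, assuming the mirror URL named above plus a hypothetical authenticated site-local area:

<pre>
# /etc/portage/make.conf
# public distfiles first, then the authenticated site-local area
# (basic-auth credentials or client certificates, depending on the final setup)
GENTOO_MIRRORS="https://mirror.example.com/public/gentoo/distfiles https://mirror.example.com/site-local/gentoo/distfiles"
</pre>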
== Puppet requirements ==
* Support for all three environments (development, staging and production); a minimal sketch of how a node could select its environment is shown after this list
* Version controlled via Git
* ENC support
* Puppet recipes for
** installing, updating, removing and (re-)configuring specific software belonging to an application stack (see [[#Build_host_requirements|build host]])
** (re-)configuring software belonging to a system stack
** updating the system stack (<code>emerge @system</code>), a.k.a. a system update
** installing, updating and removing kernel packages (including the handling of the ensuing reboot)
* use best-of-breed tools like hiera and augeas (this might mean targeting 3.3.x due to module data support in [https://github.com/puppetlabs/armatures/blob/master/arm-9.data_in_modules/index.md ARM-9])
* Use a sane pre-existing puppet architecture concept
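As a minimal illustration of the environment requirement above (assuming stock Puppet config environments; the environment names are just examples), a node could be pinned to one environment like this:

<pre>
# /etc/puppet/puppet.conf on a staging node
[agent]
environment = staging

# or as a one-off run from the shell, e.g. while testing a change
puppet agent --test --environment development
</pre>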
== Install host requirements ==
= Implementation proposal =
== Build host farm proposal ==
The build host farm consists of a system of multiple VMs which build binary packages for multiple environments, architectures and build profiles.
* Git webhook on the internal gitlab installation pushes changes to the Jenkins master.
* The Jenkins master dishes out jobs to Jenkins slave machines for the needed architecture and build profile (a sketch of such a slave job is shown at the end of this section).
* Jenkins slaves only get used once and wipe/reprovision themselves after the master has stored the build artefacts.
* We have build-slave templates available for each architecture/build profile combo.
* Upon use those get provisioned to the needed environment using puppet.
* All of this is set up using puppet and fully automated, even the building of new build-slave templates and the whole releng on those.
* The build farm also keeps old templates and stable boxes on hold so it can use them to build differentials.
* Artefacts the slaves will be producing:
** "vagrant"-style boot boxes
** full binpkg repos for a given env/arch/build profile combo
** stage3 tarballs for each arch/build profile
** stage4 tarballs for each environment
** build logs
** <code>/var/db/pkg</code>
** puppet report data
** test results and code analysis results
* When we get to continuous deployment the Jenkins master will also be able to trigger puppet when merges to master happen.
* This rolls out releases to the sub-system that was signed off by a merge to a master branch (see the branching strategy in the git proposal).

=== Links ===

==== build orchestration ====
* [http://mesos.apache.org/ Apache Mesos] cluster manager that provides efficient resource isolation and sharing across distributed applications or frameworks. Can run, for instance, Jenkins.

==== package building ====
* [http://www.chromium.org/chromium-os/developer-guide/chromite-shell-quick-start chromite] build utility from Chromium OS ([https://chromium.googlesource.com/chromiumos/chromite/ source repo])
** as far as I recall Chromium OS does highly parallel building, making their build really fast with a slight trade-off in long-term stability (i.e. builds might fail due to dependencies being built out of order)
** the [http://www.chromium.org/chromium-os/developer-guide chromium os developer guide] might also be of interest; among other things it shows that Google splits the build into a package building part and an image creation part.
* [https://wiki.sabayon.org/?title=En:Entropy entropy] is Sabayon's portage replacement; it focuses on binaries due to Sabayon being a binary distribution
** their [https://github.com/Sabayon/build build system "Matter"] might be of interest, it seems to automate large parts of tracking gentoo portage with its tinderbox subsystem
** Sabayon has <code>kernel-switcher</code> for updating kernels
** kernel ebuilds live [https://github.com/Sabayon/sabayon-distro/tree/master/sys-kernel/linux-sabayon here] and probably rely on the [https://github.com/Sabayon/sabayon-distro/blob/master/eclass/sabayon-kernel.eclass sabayon-kernel eclass].

==== "stage4"/box/iso building ====
* [http://packer.io packer.io] can be used to build stage4 (containing a kernel) images and seems to work for gentoo. Packer often gets used to build Vagrant boxes.
** [https://github.com/pierreozoux/packer-warehouse/blob/master/var-files/gentoo/generate_latest.sh gentoo script from packer-warehouse] used with packer to create a minimal gentoo vagrant box
** currently packer and packer-warehouse do not seem capable of building gentoo machines out of the box; I tested this with OS X/VirtualBox using gentoo stage3 and portage snapshots [[User:Lucas|Lucas]] ([[User talk:Lucas|talk]]) 11:19, 11 January 2014 (CET)
* [https://github.com/jedi4ever/veewee veewee] vagrant box builder (builds stage4 images in a manner similar to packer)
** has support for a massive number of guest OS types
*** installs puppet/chef using gem due to the oldish versions in gentoo (and probably elsewhere)
** supports kvm and others as host OS
** while testing with OS X/VirtualBox I was able to build and export a vagrant box from gentoo stage3 and portage snapshots without any hiccups [[User:Lucas|Lucas]] ([[User talk:Lucas|talk]]) 11:19, 11 January 2014 (CET)
** is in dire need of DRY ([https://github.com/jedi4ever/veewee/pull/690]) to make it worth forking
* [http://blinkeye.ch/dokuwiki/doku.php/projects/mkstage4 mkstage4]
** aimed at creating backup stage4 tarballs of gentoo systems
** written in bash
** pretty simple, might come in handy as an automation tool

==== kernel ====
* at the moment we build tarballs for the kernel+initramfs and the modules using <code>genkernel</code> and have a separate ebuild which installs them
* ideally we would like to have an ebuild which takes the kernel sources (like the ebuild for <code>sys-kernel/gentoo-sources</code> does), builds them according to some default configuration or a user configuration if available (<code>savedconfig.eclass</code>), and then installs the kernel and the modules as well as some minimal headers+configuration to build other packages requiring the sources to be present
* TODO: check whether dracut has some advantages regarding module loading over a genkernel-generated initramfs
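A minimal sketch of what such a Jenkins slave job could run inside a freshly provisioned build VM, assuming the frozen portage tree is already in place and that <code>ENVIRONMENT</code>, <code>ARCH</code> and <code>PROFILE</code> are job parameters handed in by the master (everything here, including the per-profile package set and kernel config, is illustrative rather than a final job definition):

<pre>
#!/bin/bash
set -euo pipefail

# job parameters from the Jenkins master (hypothetical)
: "${ENVIRONMENT:?}" "${ARCH:?}" "${PROFILE:?}"

# collect binary packages for this env/arch/profile combination
export FEATURES="buildpkg"
export PKGDIR="/var/cache/binpkgs/${ENVIRONMENT}/${ARCH}/${PROFILE}"

# rebuild the base system and the profile's package set against the frozen tree
emerge --quiet --update --deep --newuse @system
emerge --quiet --update --deep --newuse "@${PROFILE}"

# kernel + initramfs for the stage4 images (current genkernel-based approach;
# a per-profile kernel config is an assumption)
genkernel --kernel-config="/etc/kernels/${PROFILE}.config" all

# archive artefacts for the master: binpkgs and the /var/db/pkg snapshot
tar -C /var/cache -czf "binpkgs-${ENVIRONMENT}-${ARCH}-${PROFILE}.tar.gz" binpkgs
tar -C /var -czf "vardbpkg-${ENVIRONMENT}-${ARCH}-${PROFILE}.tar.gz" db/pkg
</pre>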
== Portage tree clone proposal ==
** replace that instance with one native to the infra when it is ready for that
* [http://ipxe.org/ iPXE]
 
=== Links ===
* Tools that run puppet on freshly installed machines (and also do some provisioning)
** [https://forge.puppetlabs.com/puppetlabs/razor puppetlabs razor] bare metal/cloud provisioning tool
** [http://www.vagrantup.com/ vagrant] provisioning tool aimed at developer boxes (with VirtualBox). Has 3rd-party support for various cloud systems. Vagrant might be interesting for creating dev clouds; I've seen it being used on production sites.
== Public key infrastructure proposal ==
== git hosting proposal ==
 
* adhere to git-flow for all the things. Automate said usage as far as possible (a plain-git sketch of this flow follows the table below).
{| class="wikitable"
! colspan=4 | git-flow branching
|-
! Branch
! Environment
! Merge from
! Description
|-
| <code>master</code>
| production
| <code>release/</code> or <code>hotfix/</code>
| Released code with a <code>git tag</code> for each merge.
|-
| <code>release/v0.0.0</code>
| staging
| <code>develop</code>
| Contains final releasing work like updating versioning and changelog. This is where we keep semver concerns in check if they were not taken care of already.
|-
| <code>hotfix/v0.0.0</code>
| staging
| <code>master</code>
| Only for critically urgent fixes. In most cases doing a release from <code>develop</code> is preferred.
|-
| <code>develop</code>
| development
| <code>feature/</code> or <code>master</code>
| Only feature branches that are ready for production should get merged here. <code>master</code> gets merged here after each merge to it. Merging is done with pull requests and review.
|-
| <code>feature/featurename</code>
| development
| <code>develop</code>
| New features get implemented here until they are considered ready for production and merged to <code>develop</code>.
|-
| <code>support/v0.0.0</code>
| LTS
|
| Marked experimental in most implementations and unused for now.
|}
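A minimal plain-git sketch of this branching model (branch and version names are examples only):

<pre>
# start a feature from develop
git checkout -b feature/binhost-sync develop
# ... commit work, then open a pull request against develop ...
git push origin feature/binhost-sync

# cut a release once develop is ready for staging
git checkout -b release/v0.2.0 develop
# ... bump version, update changelog ...
git checkout master
git merge --no-ff release/v0.2.0
git tag v0.2.0                       # every merge to master gets a tag
git checkout develop
git merge --no-ff master             # master flows back into develop
</pre>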
 
* Install gitlab on a VM and integrate external mirrors from GitHub and LDAP users from stoney-ldap.
** keep repo of public mirrors in hieradata so we can configure them from puppet.
** each organisation in stoney-ldap automatically gets a private project in gitlab.
* Configure web hook infrastructure and integrate it with the continuous integration system.
* Make continuous integration show feedback back in gitlab.
** check for <code>git annotate</code> support or use img badges.
 
'''On organization projects in gitlab'''
* Each project comes with default repos.
{| class="wikitable"
! Repo
! Description
|-
| <code>puppet</code>
| Set up using a template, contains a Puppetfile and Puppetfile.lock and a hieradata directory.
|-
| <code>role</code>
| Read only copy of global role module for reference.
|-
| <code>profile</code>
| Read only copy of global profile module for reference.
|}
* Everything in the latter two modules is configurable through hieradata in the first repo.
* The default setup automatically updates <code>role</code> and <code>profile</code> when they get new merges.
* A software agent (CI) regularly clones <code>develop</code>, does a full build and pushes the results back to <code>feature/tinderbox</code> (see the sketch after this list).
* This agent automatically creates pull requests if tinderbox builds did not fail.
* Org leaders may then merge these PRs and bake them into a local release.
* Some kind of UI helps them do this without much technical knowledge.
* More repos may be added by the customer.
* project organizations are private, per customer.
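A rough sketch of what the tinderbox agent described above could do, assuming it has push access to the organisation's <code>puppet</code> repo and some <code>do_full_build</code> wrapper around the actual build (both the repo URL and the wrapper are assumptions for illustration):

<pre>
#!/bin/bash
set -euo pipefail

# hypothetical per-customer repo on the gitlab instance
REPO="git@gitlab.example.com:customer-org/puppet.git"

git clone --branch develop "$REPO" workdir
cd workdir
git checkout -B feature/tinderbox

# full build of the stack; only push results back if it succeeded
if do_full_build; then
    git commit --allow-empty -am "tinderbox: successful build $(date -I)"
    git push --force origin feature/tinderbox
    # creating the pull request against develop would go through the gitlab API
fi
</pre>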
 
=== Links ===
* [http://gitlab.org/ gitlab] seems nice even though it is Ruby on Rails under the hood
* [https://github.com/sag47/gitlab-mirrors gitlab-mirrors] is a companion app to gitlab for adding readonly mirror repos to gitlab. We might consider hacking it to not use <code>git remote prune</code>.
* [http://www.javacodegeeks.com/2014/01/git-flow-with-jenkins-and-gitlab.html git-flow with jenkins and gitlab]
* [https://wiki.jenkins-ci.org/display/JENKINS/Gitlab+Hook+Plugin gitlab hook for jenkins]
= Links =
* [http://wiki.gentoo.org/wiki/Preserve-libs Gentoo preserve-libs]
* [http://swift.siphos.be/aglara/ A Gentoo Linux Advanced Reference Architecture]
* [http://www.gentoo.org/proj/en/gentoo-alt/prefix/ Gentoo Prefix]
* man pages
** [http://dev.gentoo.org/~zmedico/portage/doc/man/portage.5.html portage(5)]