Latest revision as of 13:16, 2 December 2011

The Mer and Nemo projects rely heavily on the MeeGo community OBS and its infrastructure, but that will be going away "soon".

Overview

The Mer project needs to provide 3 areas of service for the Mer Core and any 'incubated' community projects such as Nemo and others.

Web and community

Basic web services for Mer, Nemo and maybe others (1 VM per service per project, possibly a DB VM)
- Wikis
- Bugzillas
- Web
- Mailing lists
Code hosting : gerrit (1 medium VM)
Download: (1 medium VM with disk space and bandwidth)
Infrastructural services : DNS, LDAP, ssh, sysadmin wiki, backup, monitoring etc (~6 small VMs)

The services above would ideally be hosted across 2 physical hosts with good interconnection for resilience.

Core QA and Release build service

Mer Code building : OBS (heavy - 3+ large RAM physical hosts)
QA Automation : BOSS (1 VM)
Reference image building : IMG (1 physical host)

We anticipate this needing 3 or 4 physical hosts (depending on spec).

Large amounts of RAM and virtualisation are the most important factors here. SSSE3 would be nice too. (We'd like to experiment with overcommitted tmpfs and swap on SSD as a way to make the best use of RAM.)

Community OBS

We also would like to provide a community build service to fill the same needs as the MeeGo Community OBS did:

Community developer code building for Mer, Nemo, Plasma and other 'incubated' projects : OBS (heavy - 3+ large RAM physical hosts - build.pub.meego.com has 8 at the moment)
QA Automation : BOSS (1 VM)
Image building : IMG (1 or more physical hosts)

Again the focus is on RAM and physical hosts: each VM worker runs an 8GB tmpfs and 4GB of RAM, so multiples of 12GB in the phost make sense. We slightly overcommit the tmpfs and risk occasional swap. As mentioned, SSD may be a good technology here as it may be cheaper than RAM.

In C.OBS we have one worker with a 12GB tmpfs and 8GB of VM RAM to handle Qt, Qt WebKit and Chrome.
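
As a rough illustration of the "multiples of 12GB" rule, the following Python sketch counts how many VM workers fit into a phost. The function name and the `tmpfs_overcommit` parameter are hypothetical and only model the sizing described above; the sketch ignores RAM needed by the host OS itself.

```python
def workers_per_phost(phost_ram_gb, tmpfs_gb=8, vm_ram_gb=4, tmpfs_overcommit=1.0):
    """Return how many VM workers fit into a phost with the given RAM.

    Defaults follow the text: an 8GB tmpfs plus 4GB of VM RAM per worker.
    tmpfs_overcommit is an assumed knob: 1.0 means no overcommit, while
    1.25 means the tmpfs is budgeted at 1/1.25 of its nominal size.
    """
    if tmpfs_overcommit < 1.0:
        raise ValueError("tmpfs_overcommit must be >= 1.0")
    per_worker_gb = vm_ram_gb + tmpfs_gb / tmpfs_overcommit
    return int(phost_ram_gb // per_worker_gb)


if __name__ == "__main__":
    print(workers_per_phost(24))  # 2 standard workers in a 24GB phost
    print(workers_per_phost(64))  # 5 standard workers in a 64GB phost
    # The larger Qt/WebKit/Chrome worker: 12GB tmpfs + 8GB VM RAM
    print(workers_per_phost(64, tmpfs_gb=12, vm_ram_gb=8))  # 3
```

With no overcommit a 64GB phost holds five standard workers and leaves 4GB unused, which is why RAM sizes that are exact multiples of 12GB are the most efficient.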

Supporting Mer

Our intended governance model is here: Governance draft

As we become more established the advisory board would handle resource planning and funding. We have not formulated any sponsorship arrangements but would expect to put sponsored-by logos in prominent positions. We would prefer to fully disclose sponsorship arrangements but the AB will discuss this if the need arises.

Contribution Commitments

If you'd like to contribute to the Mer infrastructure then we need to know how long we can rely on services being available. We ask that you estimate how long you anticipate providing the service initially (realistically, a year is a sensible minimum term). If you are going to cease providing the service to the project, we ask that you let us know 3 months in advance.

Current Deployment

From a sysadmin/deployment point-of-view we're keeping security fairly tight and prefer to run on dedicated hardware with very limited root access.

So where are we at the moment?

Carsten and I have bought 3 physical hosts with 24GB RAM, which we've almost finished setting up. These now host the QA/Release OBS and the web and infrastructure services; they can support VPN connections to additional phosts. They're pretty much at full capacity.
the MeeGo community OBS provides 8x 64GB/16cpu workers and a 64GB/16cpu api/web/scheduler - these are already seeing heavy utilisation and provide an indicative target for a community service
our infrastructure is almost totally virtualised to permit easy migration and we have rapid-VM-deployment tools and good IT policy from the meego.com deployment
we have approached OSUOSL and RackSpace for sponsored hosting
Stefan Werden has made a concrete offer of some hardware and we're working out the details.

We have designed the infrastructure to be able to spread out services over multiple hosting environments. For example we plan to permit QA builds over multiple OBS installations in order to scale better as we support more targets.

VMs

Most of these VMs run on one very stretched phost!

Central Admin

We have internal admin VMs:

DNS, Audit, VM buildmaster, CA, generally a 'secure' host - not reachable from outside
puppetadm
mail, 2ndry DNS, internal wiki
ssh access
ldap internal accounts, ssh keys, VM access control
ldap external accounts
openvpn - provides a virtual management lan for LDAP, DNS, ntp etc

Services

shell - user homes
bz1
bz2
web
boss
gerrit
CI OBS fe (api/webui)
CI OBS be (scheduler, repo, src)
wiki

CI OBS workers : 2x 24GB phosts

Missing

Community OBS
Community workers
Image building
Additional CI OBS Workers
Public wiki
Monitoring
Backup

Technology

We use:

openSUSE 11.4 on the physical hosts (a legacy from the MeeGo infra - OBS workers have to use this so we use it everywhere for consistency. It is a good choice for customers as it offers supported variants and we like to support SUSE)
Debian Squeeze on VMs unless SUSE is needed (our admins prefer Debian)
KVM virtualisation (we used Xen on meego.com and it was fine - KVM is supposed to have better I/O caching)

From a sysadmin PoV:

DCP - 'homegrown' mix of rsync and git to make a distributed etckeeper
Puppet for common file management (SSL keys, pam/ldap conf, resolv.conf, ntp, sshd ...)
LDAP for accounts
ssh via jump machine

Contributing Hardware

What are we looking for? Hopefully we can take any reasonable contribution and would suggest any of:

Useful contributions

A large collection of physical hosts to use as the community build service

The target here is to scale up to the current MeeGo C.OBS size: 8 worker machines with 16 cpus/64GB RAM and one similar machine to host the web/api/download and similar services.

The hosting facility should be able to support vlans or other secure network connections from worker machines to the scheduler machines.

We'd expect to support users of Plasma Active, Nemo, Cordia and other such projects.

Starting smaller with plans to scale if needed is absolutely fine. Virtualisation makes this fairly painless.

A small collection of physical hosts as QA/build/Image or other services

The target here is to support the QA building of one or more architectures: 1-2 worker machines with 8 cpus/24GB RAM and one similar machine to host the web/api/download and similar services.

This would be connected to the main Mer CI OBS.

Building images to validate commits (and then running tests on them) is an important goal of the Mer systems.

Individual physical hosts to use as remote workers or service hosts

The target here is to supplement the QA or c.OBS building capacity: a single worker machine with 8 cpus/24GB RAM and a good network connection.

Service support (eg drupal, email, nagios, puppet etc)

This is currently handled by one 8-core 24GB machine in a hosted DC.

Data backup services

We have no "proper" backup service though we do of course duplicate config and data.

Mirrors (although this is not a critical issue)

Donations to OSUOSL, Open-SLX or other hosting provider (subject to an agreement with them)

If you can help with any of this we would be extremely grateful.

Specifications

General specifications are discussed throughout - and without question RAM is the most important thing for the physical hosts. Each admin VM uses 768MB-1GB RAM and we overcommit the CPUs. The service VMs have 1-2GB RAM each. The build hosts usefully use 12-16GB per 4 cores.

Actual disk requirements are fairly low - most VMs run at around 12GB disk. Running LVM on a software RAID1 on most hosts makes sense. We allocate data volumes in 100GB chunks to the OBS store and git. On the MeeGo c.obs we used 250GB on a 500GB LV for the backend/src and the same on downloads. The build hosts can probably make do with <250GB disk.

The amount of RAM and disk you need on a physical host (phost) depends on how many VMs you want to run there - and vice-versa.
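
The relationship between VM count and phost size can be sketched in a few lines of Python. This is only an estimate under stated assumptions: the function and constant names are hypothetical, the upper end of each RAM range quoted above is used, and host OS overhead and RAID mirroring are not counted.

```python
import math

# Figures taken from the text above (upper end of each quoted range).
ADMIN_VM_RAM_GB = 1     # admin VMs use 768MB-1GB
SERVICE_VM_RAM_GB = 2   # service VMs use 1-2GB
VM_DISK_GB = 12         # most VMs run at around 12GB disk
DATA_CHUNK_GB = 100     # data volumes are allocated in 100GB chunks


def phost_budget(admin_vms, service_vms, data_gb=0):
    """Estimate the RAM and disk (in GB) a phost needs for the given VMs.

    data_gb is the expected data volume usage (OBS store, git); it is
    rounded up to whole 100GB chunks.
    """
    ram_gb = admin_vms * ADMIN_VM_RAM_GB + service_vms * SERVICE_VM_RAM_GB
    data_chunks = math.ceil(data_gb / DATA_CHUNK_GB)
    disk_gb = (admin_vms + service_vms) * VM_DISK_GB + data_chunks * DATA_CHUNK_GB
    return {"ram_gb": ram_gb, "disk_gb": disk_gb}


if __name__ == "__main__":
    # Hypothetical mix: 6 admin VMs, 9 service VMs, 250GB of data.
    print(phost_budget(6, 9, data_gb=250))  # {'ram_gb': 24, 'disk_gb': 480}
```

Run in reverse, the same arithmetic shows how many VMs a donated machine of a given size could reasonably host.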

We also have to bear in mind growth and network traffic - we don't want to end up with systems that need high speed connectivity running over a slow interlink. There's a fair amount of inter-machine data exchange during builds.

So, for example, a center which has room for just 3 or 4 smallish machines isn't the best place to start a long-term community OBS deployment (but it's a lot better than nothing!). It is, however, a great place for a CI-OBS instance or a QA image build/verification farm.

If we provide flashable images for Nemo, disk usage would grow, so we should probably be aiming for a couple of phosts with >750GB RAID1.

SSD technology may prove to be effective in backing tmpfs to allow more effective use of RAM.

Contribution Social Contract

We need to think about contributions to the Mer project.

Questions to ask a contributor:

how long do we get "the contribution" for?
for tangible contributions (eg hardware) who does it belong to?
what happens when it goes away?

There needs to be some kind of agreement like:

if you need to cease providing the service to the project you will let us know 3 months in advance
project data will be treated as confidential

Infrastructure Requirements