Minutes from the Tue Jul 17 infra call

Guilhem Moulin

Minutes from the Tue Jul 17 infra call

Participants
============

1. guilhem
2. cloph
3. Brett

(On our Jitsi instance today: https://jitsi.documentfoundation.org/infra .)

Agenda
======

 * [rdm#2463] gerrit upgraded to 2.13.11, but no OS upgrade yet, and the
   highstate wasn't applied either
   + Need some refactoring of the salt states (vm148 isn't using the new
     baseline)
   + Any preferred time when to do the upgrade?
   + AI guilhem: upgrade any friday/saturday night
 * Prometheus
   + Brett: Which version of prometheus-node-exporter are we going to
     use? monitoring.df.org is using the backports version. 0.16 was
     recently uploaded into backports and changes a number of metric
     names.
     - guilhem: Fair point, for the baseline we should stick to Debian
       proper if possible as backports don't have security support from
       the Debian Security Team
     - I pulled the backport sources since I wanted prometheus 2.x, and
       prometheus-node-exporter bpo came with it ^^
     - Do we have reasons not to downgrade?  Unlike prometheus,
       prometheus-node-exporter is (should be) in baseline
     - TODO downgrade
 * Alert system update
   + Brett: 'prometheus' branch ready for review, it includes some alert
     rules
   + AI guilhem: review and merge
 * Status page:
   + CachetHQ test instance at https://vm182.documentfoundation.org/
     (snakeoil cert)
   + Works fine with MariaDB 10.1 and PHP 7.0
   + Can subscribe to events by mail or RSS/atom feeds
   + Doesn't monitor directly but another service can use the API to
     send metric points and automatically change state from/to
     failed/operational: https://docs.cachethq.io/v1.0/docs/addons
   + AI guilhem: Deploy a prod instance next to monitoring.tdf (Need PHP
     & MariaDB)
   + monitoring.tdf is hosted at filoo, most of the prod boxes are at
     manitu, so different ASes and datacenters
 * IPMI
   + new ADMIN password does seem to stick
   + iKVM is a real pain to use… would rather use the serial console
     `ipmitool -I lanplus -H $IP -e'&' -U ADMIN sol activate`
   + OK to start with “console=ttyS1” in the kernel command line and
     start an agetty(8) on that TTY?  Cf. berta
   + Same thing for the VMs but with ttyS0: `virsh console $vmXYZ`; we
     don't really need SPICE do we?
   → OK fine to stick to the console :-)
 * Hypervisor upgrades:
   + charly upgraded to 9.4 on June 30, adapted the salt config for
     hypervisors accordingly
   + 9.5 released on July 14, can upgrade another hypervisor this
     week-end (excelsior and/or dauntless)
     - cloph: can shut down vm176 & vm177 during the upgrade
 * How about moving the calls to Jitsi?  OK for everyone \o/
 * Next call: Tuesday August 21 2018 at 18:30 Berlin time (16:30 UTC)
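
Regarding the status page item above (another service pushing metric
points and flipping component state through the Cachet API), a minimal
sketch of what such a client might look like. Assumptions: the endpoint
paths, the X-Cachet-Token header and the status codes are the ones I
recall from the Cachet v1 API documentation and should be checked
against the deployed version; the token, metric ID and component ID are
hypothetical placeholders. The requests are only built, not sent.

```python
import json
import urllib.request

# Component status codes as documented for the Cachet v1 API (assumption:
# unchanged in the version we deploy).
OPERATIONAL = 1
MAJOR_OUTAGE = 4


def _json_request(url, token, method, payload):
    """Build (but do not send) an authenticated JSON request."""
    return urllib.request.Request(
        url=url,
        data=json.dumps(payload).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json", "X-Cachet-Token": token},
    )


def metric_point_request(base_url, token, metric_id, value):
    """Request that adds one data point to an existing metric."""
    url = f"{base_url.rstrip('/')}/api/v1/metrics/{metric_id}/points"
    return _json_request(url, token, "POST", {"value": value})


def component_status_request(base_url, token, component_id, healthy):
    """Request that marks a component as operational or failed."""
    url = f"{base_url.rstrip('/')}/api/v1/components/{component_id}"
    status = OPERATIONAL if healthy else MAJOR_OUTAGE
    return _json_request(url, token, "PUT", {"status": status})


if __name__ == "__main__":
    # Hypothetical token and IDs; the host is the test instance from the minutes.
    base = "https://vm182.documentfoundation.org/"
    for req in (
        metric_point_request(base, "CHANGEME", 1, 0.42),
        component_status_request(base, "CHANGEME", 3, healthy=False),
    ):
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Actually sending a request would be urllib.request.urlopen(req); with
the snakeoil cert on the test instance that call would fail certificate
verification unless the cert is explicitly trusted.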

--
Guilhem.

--
To unsubscribe e-mail to: [hidden email]
Problems? https://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: https://wiki.documentfoundation.org/Netiquette
List archive: https://listarchives.libreoffice.org/global/website/
Privacy Policy: https://www.documentfoundation.org/privacy
Florian Effenberger

Re: Minutes from the Tue Jul 17 infra call

Hi Guilhem,

thanks a lot for driving this and for sharing the minutes!

Guilhem Moulin wrote on 2018-07-17 at 21:26:

>   * [rdm#2463] gerrit upgraded to 2.13.11, but no OS upgrade yet, and the
>     highstate wasn't applied either
>     + Need some refactoring of the salt states (vm148 isn't using the new
>       baseline)
>     + Any preferred time when to do the upgrade?
>     + AI guilhem: upgrade any friday/saturday night

Please remember to announce it on both website@ and projects@ in time
(so people are prepared and can object if there's a collision), and
avoid doing it before your holiday :-)

>   * Status page:
>     + CachetHQ test instance at https://vm182.documentfoundation.org/
>       (snakeoil cert)

Great initiative!

>   * Hypervisor upgrades:
>     + charly upgraded to 9.4 on June 30, adapted the salt config for
>       hypervisors accordingly
>     + 9.5 released on July 14, can upgrade another hypervisor this
>       week-end (excelsior and/or dauntless)
>       - cloph: can shut down vm176 & vm177 during the upgrade

Same here, please announce in time for a longer downtime, and avoid
having it run into your absence. :-)

>   * How about moving the calls to Jitsi?  OK for everyone \o/

Happy to hear that! :)

Florian
