@ wrote... (14 years, 5 months ago)

BurgundyWall is located in Calgary, Canada, and the inspiration for the domain name was located at the end of my living room. If you're really curious you can read (a very little) about me.

If you're interested in any of my articles, or if you need a skilled developer, send me an email or check out my resume.

I am available for part-time contract work and have varied professional experience. So if you need help with embedded Linux application development, backend development, or general Linux (and networking) system administration, please get in touch.

If you have some IT-related problems that you want to go away, I can probably make that happen.

Category: uncategorized
Comments: 0
@ wrote... (1 day, 15 hours ago)

Here's a crappy little script that automates the steps in https://www.burgundywall.com/post/fix-proxmox-containers-wont-start

#!/bin/bash

containers=$(pct list | grep stopped | awk '{print $1}')

echo "restarting containers: $containers"

for m in $containers; do
    lxc-start -n $m &
    echo "force started $m"
done

echo "sleeping for 2 minutes while containers start"
sleep 120

for m in $containers; do
    echo "shutting down $m"
    pct shutdown $m;
    echo "starting $m"
    pct start $m &
done
Category: tech, Tags: lxc, proxmox
Comments: 0
@ wrote... (3 weeks, 2 days ago)

With version v0.10.0 of Nomad, a very important new feature has landed: network namespaces. Network namespaces allow integration with Consul Connect, which lets you connect to remote services without having to know their address and port. Plus some security stuff, but I care more about my services working than being securely broken.
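
As a quick illustration of "without having to know their address and port": once an upstream is declared in the job file, Nomad hands the task a local address for it in an environment variable, so the task only ever talks to localhost. The service name below is made up.

# the upstream declared as "count-api" is reachable at a local address
# that the sidecar proxy tunnels to the real service, wherever it runs
curl "http://${NOMAD_UPSTREAM_ADDR_count_api}/"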

You can read more official information in the Nomad and Consul documentation.

The Consul Connect proxies in Nomad are called sidecars. However… sidecars are an advanced feature and the docs and examples are very confusing. At least I found them very confusing. So here is a slightly modified version of the example job file, with lots of comments, that hopefully explains how to use this great new feature.

more…

Category: tech, Tags: consul, nomad
Comments: 0
@ wrote... (2 months ago)

My cluster was throwing the warning Legacy BlueStore stats reporting detected, and we could just not abide that.

Here's a simple way to upgrade:

cd /var/lib/ceph/osd
ls                                   # note your osd numbers

ceph osd set noout

for n in 7 8 9 10 11 12 ; do 
  systemctl stop ceph-osd@$n.service; 
  ceph-bluestore-tool repair --path ceph-$n; 
  systemctl start ceph-osd@$n.service; 
  systemctl status ceph-osd@$n.service;
done

# I like to wait until the cluster goes back to green 
# before doing the same on the next host
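# e.g. keep an eye on it with something like:
watch -n 10 ceph status     # wait until all PGs are active+clean again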

ceph osd unset noout
Category: tech, Tags: ceph
Comments: 0
@ wrote... (2 months, 1 week ago)

Sometimes an lxc container will refuse to start, usually after something goes wrong on the host. The key is to manually start the container and wait for some timeout or something, then the container will start properly.

Update: the following does more or less the same thing but with less work.

pct list   # note the vmid, replace 100 101 below as appropriate

for m in 100 101; do lxc-start -n $m & done

# after a minute or so you'll see the background tasks finish in the console
pct list   # lxc containers should be running

# Note that `systemd` will still show the tasks as failed since it didn't
# start them, so now let's stop/start them properly
for m in 100 101; do pct shutdown $m; pct start $m & done

The following is still valid but I now use the above method.

[root@proxmox1 ~]
# pct start 125
Job for pve-container@125.service failed because a timeout was exceeded.
See "systemctl status pve-container@125.service" and "journalctl -xe" for details.
command 'systemctl start pve-container@125' failed: exit code 1

[root@proxmox1 ~]
# /usr/bin/lxc-start -n 125 -F

# good long wait, over a minute
# login and then `poweroff`

[root@proxmox1 ~]
# pct start 125 && echo $?
0

tl;dr

  1. lxc-start -n 125 -F
  2. login and poweroff
  3. pct start 125
Category: tech, Tags: lxc, proxmox
Comments: 0
@ wrote... (4 months, 3 weeks ago)

Before upgrading to Proxmox 6, you need to upgrade to Corosync 3. Here's an ansible playbook that will automate that…
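
The playbook is after the jump; for reference, the manual steps it wraps are roughly the following (the repository line is per the Proxmox 5-to-6 upgrade guide, double-check it before running):

# add the corosync 3 repository for PVE 5.x (stretch), then pull in the new packages
echo "deb http://download.proxmox.com/debian/corosync-3/ stretch main" \
    > /etc/apt/sources.list.d/corosync3.list
apt update
apt dist-upgrade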

more…

Category: tech, Tags: ansible, proxmox
Comments: 0
@ wrote... (8 months ago)

We recently upgraded our network to 10 Gbit and were really hoping to see monumental speed increases in our ceph cluster.

One of our benchmarks was pgbench, and to say we were sad would be an understatement…
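
For reference, the kind of pgbench run we're talking about looks something like this (the scale factor and client counts here are illustrative, not the exact ones from our tests):

createdb bench
pgbench -i -s 100 bench            # initialize a roughly 1.5 GB test database
pgbench -c 16 -j 4 -T 120 bench    # 16 clients for 2 minutes, reports TPS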

more…

Category: tech, Tags: ceph, postgresql
Comments: 0
@ wrote... (9 months, 1 week ago)

There are lots of posts about setting up CD with Jenkins and Kubernetes, but I haven't found any describing how to do it with Nomad and Gitlab.

So here's how I did it…

more…

Category: tech, Tags: cd, ci, gitlab, hashistack, nomad
Comments: 5
@ wrote... (11 months, 2 weeks ago)

I also found the docs for consul connect to be confusing. They don't clearly explain the difference between the client and server proxy.

Some declarations that are worth stating explicitly:

  • consul acl needs to be set up first, see consul acl for more info
  • acl and intention are used somewhat interchangeably here
  • client-side consul connect proxies can only talk to other consul connect proxies
  • client-side consul connect proxies cannot talk directly to a service
  • the docs explaining -service vs -listen vs -upstream are terrible (see the sketch after this list)
  • I'll use the term proxy to mean a consul connect process
  • the term service refers to the actual service (eg. redis)
  • the term server proxy refers to the proxy that connects to a real service
  • the term client proxy refers to the proxy that clients connect to
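
Since I complained about -service vs -listen vs -upstream, here's a rough sketch with the built-in proxy showing which side each flag belongs to (the service names, ports, and the 10.0.0.5 address are made up):

# server proxy: represents the real service (redis) and exposes it to the mesh;
# redis itself listens on 127.0.0.1:6379, the proxy accepts mTLS traffic on 8443
consul connect proxy -register \
    -service redis \
    -service-addr 127.0.0.1:6379 \
    -listen 10.0.0.5:8443

# client proxy: gives the local app a plain local port that tunnels to redis;
# the app connects to 127.0.0.1:9191 and never needs redis's real address
consul connect proxy \
    -service my-app \
    -upstream redis:9191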

Having said all that, service meshes sound like they're worth having.

Mitchell Hashimoto at least partly agrees with me.

more…

Category: tech, Tags: consul
Comments: 0
@ wrote... (11 months, 2 weeks ago)

I found the otherwise great consul docs to be very obtuse and confusing and maybe even wrong.

I'm running these commands against my home setup which only has a single consul server. In a more realistic setup you'll need to duplicate the config changes on all your consul servers and then restart them one at a time.

Ran against consul 1.4.0
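
To give a flavour of where it starts (the full set of config changes and commands is after the jump), bootstrapping the new ACL system on 1.4 looks roughly like this; it assumes ACLs are already enabled in the agent config, and the policy and file names are made up:

# bootstrap the ACL system; this prints the initial management token
consul acl bootstrap

# use that token for everything that follows
export CONSUL_HTTP_TOKEN=<management token from the bootstrap output>

# create a policy from a rules file, then a token attached to that policy
consul acl policy create -name agent-policy -rules @agent-policy.hcl
consul acl token create -description "agent token" -policy-name agent-policy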

more…

Category: tech, Tags: consul
Comments: 2