Piero Bosio Social Web Site Personale

Social forum federated with the rest of the world. It's not the instances that count, it's the people.

Spent my morning figuring out why Nginx was dead on a server with many days of uptime.

Uncategorized

25 Posts 10 Posters 89 Views

monospace@floss.social

@stefano Wait, no, I see your point. I had to edit in "post-install restarts", and realized that systemd taking care of that is indeed something special. I will still put the blame on certbot for taking over port 80 even though instructed to use nginx. That should have resulted in a fatal error.
stefano@mastodon.bsd.cafe

wrote on last edited by

#13

This post is deleted!
monospace@floss.social

@stefano I agree, it's certbot's behaviour that caused the issue in the end, not systemd doing a good job at system maintenance.
stefano@mastodon.bsd.cafe

wrote on last edited by

#14

This post is deleted!
stefano@mastodon.bsd.cafe

Spent my morning figuring out why Nginx was dead on a server with many days of uptime. No reboot, no kernel panic. Just... down. Ubuntu 24.04.
The cause? An automatic unattended-upgrade of libc6. This prompted systemd to work its magic, wisely deciding to restart every running service to apply the patch. Fine.
The problem is, in the exact same minute, the systemd timer for certbot decided it was time to renew certificates.
The result:
- systemd stops Nginx.
- Port 80 becomes free.
- certbot, in standalone mode, immediately grabs it for validation.
- systemd tries to restart Nginx, which fails with "Address already in use".
The web server was knocked offline by its own certificate renewal script.
I swear, this is the kind of cascading failure that has never happened to me in years of running *BSD. With a classic cron job, certbot would have failed, logged an error, and tried again the next day. The web server would have remained untouched.
systemd was doing its job, but something failed because of the interactions.
Sometimes, too much automation and too many interconnected parts just create more spectacular ways for things to break.
#SysAdmin #Linux #SystemD #Rant #KISS
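
The port conflict described in the post above can be reproduced in miniature. The sketch below is only an illustration under stated assumptions: it uses plain Python sockets as stand-ins for certbot's temporary listener and for Nginx, it binds to an OS-chosen loopback port instead of port 80 so it runs without root, and `try_bind` is a hypothetical helper name, not part of any tool mentioned in the thread.

```python
import errno
import socket
from typing import Optional


def try_bind(port: int, host: str = "127.0.0.1") -> Optional[socket.socket]:
    """Return a listening socket on (host, port), or None if the port is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen()
        return sock
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            # Same condition Nginx reports as "Address already in use".
            return None
        raise


# Stand-in for the validation listener that grabbed the freed port.
# Port 0 asks the OS for any free port (an assumption made for the demo).
validator = try_bind(0)
port = validator.getsockname()[1]

# Stand-in for the web server being restarted while the listener is still up.
web_server = try_bind(port)
print("web server started:", web_server is not None)

# Once the listener exits, the very same bind goes through.
validator.close()
web_server = try_bind(port)
print("web server started after listener exit:", web_server is not None)
web_server.close()
```

On a typical Linux host the first attempt should be refused and the second should succeed, which matches the sequence in the post: whichever process binds first wins, and the loser has to be started again later.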
lanodan@queer.hacktivis.me

wrote on last edited by

#15

@stefano Says more about certbot than systemd though.
The web server can just stay up if you use one of the other ACME challenge setups (DNS, or reverse-proxying to the ACME client), so it never has to go down.
stefano@mastodon.bsd.cafe

wrote on last edited by

#16

This post is deleted!
farooqkz@cr8r.gg

wrote on last edited by

#17

@stefano
I agree about the Ubuntu problem here, but I don't think certbot's behavior is fine either.
I don't think running certbot --nginx and having it fall back to standalone without an explicit request from the user (here, the sysadmin) aligns well with the Unix philosophy and its designs. To be honest, certbot itself doesn't align very well with the Unix philosophy, IMO.
stefano@mastodon.bsd.cafe

wrote on last edited by

#18

This post is deleted!
pertho@mastodon.bsd.cafe

wrote on last edited by

#19

This post is deleted!
ricardo@mastodon.bsd.cafe

wrote on last edited by

#20

This post is deleted!
chebra@mstdn.io

wrote on last edited by

#21

@stefano How did you even trace this down??
stefano@mastodon.bsd.cafe

wrote on last edited by

#22

This post is deleted!
chebra@mstdn.io

wrote on last edited by

#23

@stefano Oh I need to step up my logging game... a lot...
stefano@mastodon.bsd.cafe

wrote on last edited by

#24

This post is deleted!
stratacast@mastodon.bsd.cafe

wrote on last edited by

#25

@stefano Stuff like this is why I disable unattended upgrades on my servers :) thank you for the reminder