Piero Bosio Social Web Site

A social forum federated with the rest of the world. It is not the instances that count, it is the people.

Stefano Marinelli

@stefano@mastodon.bsd.cafe

    Spent my morning figuring out why Nginx was dead on a server with many days of uptime. No reboot, no kernel panic. Just... down. Ubuntu 24.04.

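    The cause? An automatic unattended-upgrade of libc6. That triggered a restart of every running service still using the old library, so the patch would actually take effect (on Ubuntu the decision typically comes from needrestart, with systemd carrying out the restarts). Fine.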

    The problem is, in the exact same minute, the systemd timer for certbot decided it was time to renew certificates.

    The result:

    - systemd stops Nginx.
    - Port 80 becomes free.
    - certbot, in standalone mode, immediately grabs it for validation.
    - systemd tries to restart Nginx, which fails with "Address already in use".

    The web server was knocked offline by its own certificate renewal script.
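The sequence above can be reproduced in miniature. What follows is only a sketch in Python, not anything taken from the server in question: `try_bind` and `simulate_race` are hypothetical helper names, the "web server" and "renewal client" are plain sockets standing in for Nginx and certbot, and an OS-assigned port on 127.0.0.1 is assumed in place of port 80 so that it runs without root.

```python
import errno
import socket
from typing import List, Optional


def try_bind(host: str, port: int) -> Optional[socket.socket]:
    """Return a listening socket, or None if the address is already in use."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen()
        return sock
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return None
        raise  # any other error is unexpected in this sketch


def simulate_race(host: str = "127.0.0.1") -> List[str]:
    """Replay the stop / grab / failed-restart sequence with plain sockets."""
    events: List[str] = []

    # Stand-in for the running web server (port 0 lets the OS pick a free port).
    web = try_bind(host, 0)
    assert web is not None
    port = web.getsockname()[1]
    events.append("web server listening")

    # The service manager stops the web server: the port becomes free.
    web.close()
    events.append("web server stopped, port free")

    # The renewal client, in standalone mode, grabs the port first.
    renewer = try_bind(host, port)
    events.append("renewal client bound the port" if renewer
                  else "renewal client could not bind")

    # The service manager now tries to start the web server again.
    restarted = try_bind(host, port)
    events.append("web server restarted" if restarted
                  else "web server restart failed: address already in use")

    # Clean up whichever sockets are still open.
    for sock in (renewer, restarted):
        if sock is not None:
            sock.close()
    return events


if __name__ == "__main__":
    for line in simulate_race():
        print(line)
```

Whichever process binds first after the stop wins, and the loser gets `EADDRINUSE`. Nothing in the sketch is specific to systemd or certbot; it only illustrates why the ordering of the stop, the renewal, and the restart decided the outcome.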

    I swear, this is the kind of cascading failure that has never happened to me in years of running *BSD. With a classic cron job, certbot would have failed, logged an error, and tried again the next day. The web server would have remained untouched.

    systemd was doing its job, and so was certbot. What failed was the interaction: a service restart and a timer-triggered renewal racing for the same port in the same minute.

    Sometimes, too much automation and too many interconnected parts just create more spectacular ways for things to break.

    #SysAdmin #Linux #SystemD #Rant #KISS
