A 5-minute grace period begins (allowing for deployment recoveries)
If the service recovers within 5 minutes, the error is cleared (normal deployment scenario)
If still failing after 5 minutes, an automatic power-cycle is triggered via Home Assistant
The machine powers off for 10 seconds, then powers back on
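
The power-cycle step above could be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of thinkcenter_monitor.sh: the helper names `ha_call` and `power_cycle` and the `CURL` and `OFF_SECONDS` overrides are hypothetical. The `/api/services/switch/turn_off` and `/api/services/switch/turn_on` endpoints are part of the Home Assistant REST API.

```shell
# Sketch only: helper names and the CURL / OFF_SECONDS overrides are assumptions.
CURL="${CURL:-curl}"              # overridable so the sketch can be dry-run
OFF_SECONDS="${OFF_SECONDS:-10}"  # how long the machine stays powered off

# Call a Home Assistant switch service ("turn_off" or "turn_on") for HA_ENTITY.
ha_call() {
    "$CURL" -fsS -X POST \
        -H "Authorization: Bearer ${HA_TOKEN}" \
        -H "Content-Type: application/json" \
        -d "{\"entity_id\": \"${HA_ENTITY}\"}" \
        "${HA_URL}/api/services/switch/$1"
}

# Power off, wait, power back on.
# turn_on is always attempted so a failed turn_off response cannot leave
# the machine switched off; a turn_off failure is still reported.
power_cycle() {
    ha_call turn_off
    off_status=$?
    sleep "$OFF_SECONDS"
    ha_call turn_on || return 1
    return "$off_status"
}
```

With HA_URL, HA_TOKEN and HA_ENTITY exported, calling `power_cycle` would send the two requests ten seconds apart.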

All activity is logged with timestamps for monitoring and troubleshooting.

Prerequisites

Docker and Docker Compose installed
Home Assistant instance running with network access
A power switch entity configured in Home Assistant
Long-lived access token from Home Assistant

Installation

1. Download/Organize Files

Clone or download this repository to your machine:

git clone <repository-url>
cd Thinkcentre-watchdog

The directory should contain:

Dockerfile - Container definition
thinkcenter_monitor.sh - Monitoring script
docker-compose.yml - Docker Compose configuration
.env.example - Environment variable template
README.md - This file

2. Create Configuration File

Copy the example environment file and edit it with your actual values:

cp .env.example .env

Edit .env and configure:

# Your target service URL
TARGET_URL=http://your-kubernetes-service:8080

# Home Assistant configuration
HA_URL=http://homeassistant:8123
HA_TOKEN=your_long_lived_access_token_here
HA_ENTITY=switch.your_power_switch_entity

# Optional: Adjust timing if needed
GRACE_PERIOD=300      # 5 minutes
CHECK_INTERVAL=30     # Check every 30 seconds

3. Generate Home Assistant Token

Open Home Assistant web interface
Open your user profile (click your username at the bottom of the sidebar) and find Long-Lived Access Tokens (on the Security tab in recent versions)
Click Create Token
Name it (e.g., "Thinkcentre Watchdog")
Copy the token and paste it in your .env file as HA_TOKEN
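
Before starting the container, it may help to confirm that the token is accepted. The sketch below queries the Home Assistant REST API root (`/api/`), which answers 200 for a valid token and 401 for a rejected one. The function names `ha_token_status` and `explain_status` and the `CURL` override are hypothetical and only serve this example.

```shell
# Sketch only: function names and the CURL override are assumptions.
CURL="${CURL:-curl}"

# Print the HTTP status returned by the Home Assistant API root.
ha_token_status() {
    "$CURL" -s -o /dev/null -w '%{http_code}' \
        -H "Authorization: Bearer ${HA_TOKEN}" \
        "${HA_URL}/api/"
}

# Translate that status into a human-readable verdict.
explain_status() {
    case "$1" in
        200) echo "token accepted" ;;
        401) echo "token rejected (check HA_TOKEN)" ;;
        000) echo "no HTTP response (check HA_URL and network)" ;;
        *)   echo "unexpected HTTP status $1" ;;
    esac
}
```

With HA_URL and HA_TOKEN exported, `explain_status "$(ha_token_status)"` prints the verdict.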

4. Configure Power Switch in Home Assistant

Ensure you have a switch entity in Home Assistant that controls the machine's power. Common options:

Smart Outlet/Relay: If using a smart power outlet
IPMI/Redfish: For datacenter machines
Smart Plug: Like Tasmota, Zigbee, or Z-Wave devices

Configure the entity ID in your .env as HA_ENTITY (e.g., switch.thinkcentre_power)
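
To check that the entity ID resolves, one option is to read its current state through the REST API endpoint `/api/states/<entity_id>`. The sketch below is an assumption-laden helper, not part of the repository: `ha_entity_state` and the `CURL` override are hypothetical names, and the `sed` expression is a simple text match rather than a real JSON parser, so it could be confused by an attribute that is itself named "state".

```shell
# Sketch only: function name and the CURL override are assumptions.
CURL="${CURL:-curl}"

# Print the state of HA_ENTITY (for a switch, typically "on" or "off").
# Prints nothing if the request fails or the entity does not exist.
ha_entity_state() {
    "$CURL" -fsS -H "Authorization: Bearer ${HA_TOKEN}" \
        "${HA_URL}/api/states/${HA_ENTITY}" |
        sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

An empty result usually means the entity ID is misspelled or the token was rejected.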

5. Build and Run

Start the monitoring container:

docker compose up -d

The container will:

Build from the Dockerfile
Start with restart: unless-stopped policy
Mount logs to a named volume
Apply resource limits (0.1 CPU, 64MB memory)

6. View Logs

Monitor real-time logs:

docker compose logs -f thinkcenter-monitor

Or view persistent logs from the volume:

docker volume inspect thinkcenter_logs
# Look at the Mountpoint directory
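
Docker Compose normally prefixes named volumes with the project name (by default the directory name), so the volume may not be called exactly thinkcenter_logs. A small helper along these lines could list the mountpoint of every matching volume. The function name `log_volume_paths` and the `DOCKER` override are hypothetical.

```shell
# Sketch only: function name and the DOCKER override are assumptions.
DOCKER="${DOCKER:-docker}"

# Print the host path of every volume whose name contains "thinkcenter_logs".
log_volume_paths() {
    "$DOCKER" volume ls -q --filter name=thinkcenter_logs |
    while IFS= read -r vol; do
        "$DOCKER" volume inspect --format '{{ .Mountpoint }}' "$vol"
    done
}
```

Reading files under the printed path usually requires root on the Docker host.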

7. Stop or Restart

Stop and remove the container (the named log volume is kept):

docker compose down

Restart the container:

docker compose restart thinkcenter-monitor

Deploying Multiple Instances

To monitor multiple machines:

For Machine 2:

Create a separate directory:

mkdir thinkcentre-watchdog-machine2
cd thinkcentre-watchdog-machine2

# Copy files (the * glob skips dotfiles, so copy .env.example explicitly)
cp /path/to/original/* /path/to/original/.env.example .

# Create unique .env
cp .env.example .env

# Edit .env for machine 2
nano .env
# Change: HA_ENTITY=switch.machine2_power
# Change: TARGET_URL to machine 2's service URL

Then run:

docker compose up -d
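
The per-machine setup above can also be scripted. The sketch below is one possible approach, not something shipped with the repository: `new_instance` is a hypothetical helper that copies the project (including dotfiles) and writes a .env with the two per-machine values replaced. It assumes .env.example contains `HA_ENTITY=` and `TARGET_URL=` lines and that the new values contain no `|` or `&` characters.

```shell
# Sketch only: new_instance is a hypothetical helper.
# Usage: new_instance SRC_DIR DEST_DIR HA_ENTITY TARGET_URL
new_instance() {
    src=$1
    dest=$2
    entity=$3
    url=$4

    mkdir -p "$dest" || return 1
    # "$src"/. copies hidden files such as .env.example as well.
    cp -R "$src"/. "$dest"/ || return 1

    # Start from the template and override the two per-machine values.
    # Assumes the values contain no "|" or "&" (both are special to sed here).
    sed -e "s|^HA_ENTITY=.*|HA_ENTITY=${entity}|" \
        -e "s|^TARGET_URL=.*|TARGET_URL=${url}|" \
        "$dest/.env.example" > "$dest/.env"
}
```

For example, `new_instance /path/to/original thinkcentre-watchdog-machine2 switch.machine2_power http://machine2-service:8080` prepares the directory; HA_TOKEN and the other values still need to be filled in by hand.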

Using Namespace (Alternative)

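Or manage both machines from one directory by layering a second Compose file that defines a uniquely named service for machine 2. docker-compose.machine2.yml is not among the repository files listed above, so you need to create it yourself: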

docker compose -f docker-compose.yml -f docker-compose.machine2.yml up -d

Configuration Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `TARGET_URL` | `http://localhost:8080` | Service URL to monitor |
| `HA_URL` | `http://homeassistant:8123` | Home Assistant base URL |
| `HA_TOKEN` | (required) | Home Assistant long-lived access token |
| `HA_ENTITY` | `switch.thinkcentre_power` | Home Assistant switch entity ID |
| `GRACE_PERIOD` | `300` | Seconds to wait before power-cycling (5 minutes) |
| `CHECK_INTERVAL` | `30` | Seconds between health checks |
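
In a shell script, defaults like these are commonly applied with parameter expansion. The sketch below shows that pattern using the values from the table. It is an illustration, not a copy of thinkcenter_monitor.sh, and `require_token` is a hypothetical helper.

```shell
# Sketch only: shows one common way to apply the documented defaults.
# ":=" assigns the default when the variable is unset or empty.
: "${TARGET_URL:=http://localhost:8080}"
: "${HA_URL:=http://homeassistant:8123}"
: "${HA_ENTITY:=switch.thinkcentre_power}"
: "${GRACE_PERIOD:=300}"
: "${CHECK_INTERVAL:=30}"

# HA_TOKEN has no default; report whether it is usable.
require_token() {
    if [ -z "${HA_TOKEN:-}" ]; then
        echo "HA_TOKEN is required" >&2
        return 1
    fi
}
```

A script using this pattern would typically call `require_token || exit 1` before entering its monitoring loop.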

Troubleshooting

Container won't start

Check if HA_TOKEN is set:

docker compose config | grep HA_TOKEN

No logs appearing

Check the volume mount:

docker volume ls | grep thinkcenter_logs
docker volume inspect thinkcenter_logs

Power-cycle not triggering

Verify HA_TOKEN is valid (check Home Assistant logs)
Confirm HA_ENTITY exists in Home Assistant
Check network connectivity: docker compose exec thinkcenter-monitor curl -v http://homeassistant:8123

Service not responding correctly

Test the target URL directly:

docker compose exec thinkcenter-monitor curl -v http://your-service:8080

How It Works

Health Check: Every CHECK_INTERVAL seconds, the HTTP response code of TARGET_URL is checked
Grace Period: The first 502 error starts a GRACE_PERIOD window (5 minutes by default) for recovery
Recovery Detection: If the service returns any non-502 response during the grace period, the error state resets
Power Cycle: If the service is still returning 502 when the grace period expires, a power cycle is triggered:
- Send turn_off to HA switch entity
- Wait 10 seconds
- Send turn_on to HA switch entity
Logging: All events timestamped and logged to /var/log/thinkcenter_monitor.log
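
The check, grace-period and recovery rules above amount to a small state machine. The sketch below expresses them as a pure decision function plus a loop. All names (`probe`, `decide`, `monitor`), the `CURL` override and the 10-second request timeout are assumptions made for illustration, and the actual script may be organized differently. Following the rules as written, only a 502 counts as a failure, so a connection error (reported by curl as 000) is treated as non-502 here.

```shell
# Sketch only: names, the CURL override and the 10 s timeout are assumptions.
CURL="${CURL:-curl}"

# probe URL -> prints the HTTP status code (000 when there is no response)
probe() {
    "$CURL" -s -o /dev/null -m 10 -w '%{http_code}' "$1"
}

# decide STATUS FIRST_FAIL NOW GRACE
#   STATUS      HTTP status from the latest check
#   FIRST_FAIL  epoch seconds of the first 502 in the current streak, or 0
#   NOW         current epoch seconds
#   GRACE       grace period in seconds
# Prints one of: ok | start | wait | cycle
decide() {
    if [ "$1" != "502" ]; then
        echo ok          # any non-502 response clears the error
    elif [ "$2" -eq 0 ]; then
        echo start       # first 502: the grace period begins
    elif [ $(( $3 - $2 )) -ge "$4" ]; then
        echo cycle       # still failing once the grace period has elapsed
    else
        echo wait        # still inside the grace period
    fi
}

# Loop that applies decide() every CHECK_INTERVAL seconds.
# Resetting the streak after a cycle is an assumption of this sketch.
monitor() {
    first_fail=0
    while :; do
        now=$(date +%s)
        status=$(probe "$TARGET_URL")
        case "$(decide "$status" "$first_fail" "$now" "$GRACE_PERIOD")" in
            ok)    first_fail=0 ;;
            start) first_fail=$now ;;
            cycle) echo "$(date '+%Y-%m-%d %H:%M:%S') power cycle due"
                   first_fail=0 ;;
            wait)  : ;;
        esac
        sleep "$CHECK_INTERVAL"
    done
}
```

Keeping `decide` free of side effects makes the timing rules easy to check by hand, for example `decide 502 1000 1300 300` prints `cycle` because exactly 300 seconds have passed since the first failure.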

Resource Limits

CPU: 0.1 cores (limited to prevent resource hogging)
Memory: 64MB (minimal requirements for bash + curl)
Logging: JSON file driver, max 10MB per file, keeps 3 files (30MB total)

Debugging

Show the 50 most recent log lines:

docker compose logs --tail 50 thinkcenter-monitor

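To test the script locally (without Docker), load the variables from .env into the shell first, since the script takes its configuration from the environment:

set -a; . ./.env; set +a
bash thinkcenter_monitor.sh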

License

Monitoring solution for Thinkcentre machines.

Support

For issues or improvements, check the logs first and verify all environment variables are correctly set in your .env file.