Monitoring and Observability Setup with Grafana, Prometheus, and Alerts
Complete monitoring stack setup with metrics, logs, and alerts for production applications.
Implement full observability that detects issues before users notice them, with actionable dashboards and intelligent alerting.
Real use case
A SaaS platform serving [NUMBER] active users experiences database storage issues at 3am with no alerting in place. The on-call team only discovers the problem when customers start reporting issues at 8am, resulting in significant user churn and revenue loss. They need proactive monitoring to catch infrastructure problems before business impact.
Customize these fields first
Replace the placeholders with your own context before you run the prompt. That usually improves the first output more than adding more instructions later.
Prompt
Configure a complete observability stack for [PROJECT NAME], a [APPLICATION TYPE] application with [NUMBER] active users running on [DOCKER/KUBERNETES/VPS].

**Services to monitor:**
- [LIST: e.g. Node.js API, PostgreSQL, Redis, Nginx, Workers]
- Infrastructure: CPU, memory, disk, network

**1) Metrics (Prometheus + Grafana):**

**Prometheus:**
- prometheus.yml configuration with targets
- Scrape interval per service
- Required exporters: node_exporter, postgres_exporter, redis_exporter
- Application custom metrics (prom-client for Node.js):
  - `http_requests_total` (counter by route, method, status)
  - `http_request_duration_seconds` (histogram)
  - `active_connections` (gauge)
  - `business_events_total` (counter: signups, orders, payments)
  - `queue_size` (gauge per queue)
- Retention and storage sizing

**Grafana Dashboards:**
- Dashboard 1: Overview (uptime, request rate, error rate, latency p50/p95/p99)
- Dashboard 2: Infrastructure (CPU, RAM, disk, network per container)
- Dashboard 3: Database (connections, query duration, cache hit ratio, dead tuples)
- Dashboard 4: Business (revenue/hour, conversions, churn indicators)
- For each dashboard: exportable JSON with template variables

**2) Logs (Grafana Loki or ELK):**
- Structured log format (JSON)
- Correct log levels: ERROR (failures), WARN (degradation), INFO (events), DEBUG (dev)
- Correlation ID per request (trace ID)
- Log rotation and retention policy
- Query examples for common troubleshooting scenarios

**3) Alerts (Alertmanager):**
- Alert rules by severity:
  - **CRITICAL** (PagerDuty/SMS): downtime, error rate > 5%, disk > 95%
  - **WARNING** (Slack/Email): latency p95 > 2s, CPU > 80%, memory > 85%
  - **INFO** (Slack): deploy completed, backup finished, cron executed
- Routing: who receives which alert (on-call rotation)
- Silencing and inhibition rules
- Runbooks linked to each alert (what to do when triggered)

**4) Uptime Monitoring:**
- Standardized health check endpoints (/health, /ready)
- External ping (UptimeRobot/Better Uptime)
- Public status page for customers

**5) docker-compose for the monitoring stack:**
- Prometheus + Grafana + Loki + Alertmanager
- Persistent volumes for data
- Network configuration

Provide all configuration files and dashboard JSON exports.
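
Example (not part of the prompt text): the CRITICAL and WARNING tiers listed in the prompt's Alerts section can be sketched as a plain threshold check, which is a handy way to sanity-check the numbers before asking for real alert rules. This is a minimal illustration only. In an actual deployment these conditions would normally live in Prometheus alerting rules and be routed by Alertmanager. The `Snapshot` type, its field names, the `classify` function, and the `ROUTES` mapping are all hypothetical names introduced here. Only the threshold values and the channel names come from the prompt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    """Hypothetical point-in-time reading for one service (names are assumptions)."""
    up: bool                # False means the service is down
    error_rate: float       # fraction of failed requests, 0.0 to 1.0
    disk_used: float        # fraction of disk in use, 0.0 to 1.0
    latency_p95_s: float    # 95th percentile latency in seconds
    cpu_used: float         # fraction of CPU in use, 0.0 to 1.0
    memory_used: float      # fraction of memory in use, 0.0 to 1.0


# Channels per severity, taken from the prompt's Alerts section.
ROUTES = {
    "CRITICAL": ["PagerDuty", "SMS"],
    "WARNING": ["Slack", "Email"],
}


def classify(s: Snapshot) -> Optional[str]:
    """Return the highest matching severity, or None if nothing fires."""
    # CRITICAL: downtime, error rate > 5%, disk > 95%
    if not s.up or s.error_rate > 0.05 or s.disk_used > 0.95:
        return "CRITICAL"
    # WARNING: latency p95 > 2s, CPU > 80%, memory > 85%
    if s.latency_p95_s > 2.0 or s.cpu_used > 0.80 or s.memory_used > 0.85:
        return "WARNING"
    return None


if __name__ == "__main__":
    # A nearly full disk: the 3am scenario described in the use case.
    reading = Snapshot(up=True, error_rate=0.01, disk_used=0.97,
                       latency_p95_s=0.4, cpu_used=0.30, memory_used=0.50)
    severity = classify(reading)
    print(severity, ROUTES.get(severity, []))
```

Checking CRITICAL before WARNING means a reading that trips both tiers is reported once, at the higher severity. That is roughly the effect the prompt's "inhibition rules" bullet asks for.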
How to use this prompt
1. Replace the key placeholders first: PROJECT NAME, APPLICATION TYPE, NUMBER, DOCKER/KUBERNETES/VPS.
2. Replace any bracketed placeholders like [this] with your own context.
3. Add extra background information when you want more tailored results.
4. Combine multiple prompts in one conversation when you need a richer output.
5. Save your best-performing prompts so they are easy to reuse later.
Related prompts
Complete CI/CD Pipeline with GitHub Actions for Next.js Applications
Automated pipeline configuration with tests, build, preview deploys, and production deployment.
Best for
Automate the entire software delivery lifecycle with GitHub Actions, from push to production deployment, including tests, code analysis, and preview environments.
Docker Containerization and Docker Compose Orchestration for Production
Optimized Dockerfiles and docker-compose for development and production environments.
Best for
Create a containerized environment that ensures parity between development and production, with optimized builds, multi-stage builds, and security configurations.
Infrastructure as Code with Terraform for AWS/Hetzner
Automated cloud infrastructure provisioning with Terraform, reusable modules, and state management.
Best for
Automate provisioning of all infrastructure required for a production application, ensuring reproducibility, version control, and compliance.
Incident Response Playbook for Engineering Teams
Structured process for detection, response, communication, and postmortem for production incidents.
Best for
Establish a clear incident response process that minimizes detection and resolution time (MTTR), protects user experience, and generates learnings for the team.