Worker Nodes

Learn how to deploy and manage Plexus worker nodes across any infrastructure to process your evaluation tasks.

Overview

Plexus worker nodes are long-running daemon processes that handle evaluation tasks and other operations. You can run a worker on any machine with Python installed, whether in the cloud (AWS, Azure, GCP) or on your own premises.

Workers are managed using the Plexus CLI tool, which makes it easy to start, configure, and monitor worker processes across your infrastructure.

Starting a Worker

Use the plexus command worker command to start a worker process. Here's a basic example:

plexus command worker \
  --concurrency 4 \
  --queue celery \
  --loglevel INFO

--concurrency: Number of worker processes (default: 4)

--queue: Queue to process (default: celery)

--loglevel: Logging level (default: INFO)

Worker Specialization

Workers can be specialized to handle specific types of tasks using target patterns. This allows you to dedicate certain workers to particular workloads:

# Worker that only processes dataset-related tasks
plexus command worker \
  --target-patterns "datasets/*" \
  --concurrency 4

# Worker for GPU-intensive tasks
plexus command worker \
  --target-patterns "*/gpu-required" \
  --concurrency 2

# Worker handling multiple task types
plexus command worker \
  --target-patterns "datasets/*,training/*" \
  --concurrency 8

Target patterns use the format domain/subdomain and support wildcards. Some examples:

  • datasets/call-criteria - Only process call criteria dataset tasks
  • training/call-criteria - Only handle call criteria training tasks
  • */gpu-required - Process any tasks requiring GPU resources
  • datasets/* - Handle all dataset-related tasks
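The pattern semantics above can be sketched with ordinary shell glob matching. This is only an illustration of how domain/subdomain patterns and wildcards relate to task identifiers, not the actual Plexus implementation; the matches helper is hypothetical:

```shell
# Hypothetical helper illustrating how target patterns match task
# identifiers, using shell glob matching in a case statement.
# This sketches the pattern semantics, not Plexus internals.
matches() {
  # $1 = target pattern, $2 = task identifier
  case "$2" in
    $1) echo "yes" ;;
    *)  echo "no" ;;
  esac
}

matches "datasets/*" "datasets/call-criteria"    # yes
matches "*/gpu-required" "training/gpu-required" # yes
matches "datasets/*" "training/call-criteria"    # no
```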

Deployment Examples

Here are some common deployment scenarios:

AWS EC2

# Run in a screen session for persistence
screen -S plexus-worker
plexus command worker \
  --concurrency 8 \
  --loglevel INFO
# Ctrl+A, D to detach

Local Development

# Run with increased logging for debugging
plexus command worker \
  --concurrency 2 \
  --loglevel DEBUG

GPU Worker

# Dedicated GPU worker with specific targeting
plexus command worker \
  --concurrency 1 \
  --target-patterns "*/gpu-required" \
  --loglevel INFO

Best Practices

  • Use a process manager (like systemd, supervisor, or screen) to keep workers running
  • Set concurrency based on available CPU cores and memory
  • Use target patterns to optimize resource utilization
  • Monitor worker logs for errors and performance issues
  • Deploy workers close to your data sources when possible
  • Consider using auto-scaling groups in cloud environments
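As one concrete example of the process-manager advice above, a systemd unit for a worker might look like the following sketch. The unit name, user, and paths are placeholders; adjust them for your deployment:

```ini
# /etc/systemd/system/plexus-worker.service (example path)
[Unit]
Description=Plexus worker node
After=network.target

[Service]
# Placeholder user and working directory; adjust for your environment
User=plexus
WorkingDirectory=/opt/plexus
ExecStart=/usr/local/bin/plexus command worker --concurrency 4 --loglevel INFO
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Enabling the unit with systemctl enable --now plexus-worker starts the worker at boot and restarts it automatically if it exits.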

Additional Resources

For more information about worker deployment and management:

  • See the CLI documentation for detailed command reference
  • Check the built-in help with plexus command worker --help
  • View worker logs with --loglevel DEBUG for troubleshooting