Worker Nodes

Learn how to deploy and manage Plexus worker nodes across any infrastructure to process your evaluation tasks.

Overview

Plexus worker nodes are long-running daemon processes that handle evaluation tasks and other operations. You can run a worker on any machine with Python installed, whether in the cloud (AWS, Azure, GCP) or on your own premises.

Workers are managed using the Plexus CLI tool, which makes it easy to start, configure, and monitor worker processes across your infrastructure.

Starting a Worker

Use the plexus command worker command to start a worker process. Here's a basic example:

plexus command worker \
  --concurrency 4 \
  --queue celery \
  --loglevel INFO

--concurrency: Number of worker processes (default: 4)

--queue: Queue to process (default: celery)

--loglevel: Logging level (default: INFO)

Worker Specialization

Workers can be specialized to handle specific types of tasks using target patterns. This allows you to dedicate certain workers to particular workloads:

# Worker that only processes dataset-related tasks
plexus command worker \
  --target-patterns "datasets/*" \
  --concurrency 4

# Worker for GPU-intensive tasks
plexus command worker \
  --target-patterns "*/gpu-required" \
  --concurrency 2

# Worker handling multiple task types
plexus command worker \
  --target-patterns "datasets/*,training/*" \
  --concurrency 8

Target patterns use the format domain/subdomain and support wildcards. Some examples:

  • datasets/call-criteria - Only process call criteria dataset tasks
  • training/call-criteria - Only handle call criteria training tasks
  • */gpu-required - Process any tasks requiring GPU resources
  • datasets/* - Handle all dataset-related tasks
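The pattern semantics above can be sketched with ordinary shell glob matching. This is only an illustration of how domain/subdomain patterns and wildcards relate to task identifiers, not the actual Plexus implementation; the matches helper is hypothetical:

```shell
# Hypothetical helper illustrating how target patterns match task
# identifiers, using shell glob matching in a case statement.
# This sketches the pattern semantics, not Plexus internals.
matches() {
  # $1 = target pattern, $2 = task identifier
  case "$2" in
    $1) echo "yes" ;;
    *)  echo "no" ;;
  esac
}

matches "datasets/*" "datasets/call-criteria"    # yes
matches "*/gpu-required" "training/gpu-required" # yes
matches "datasets/*" "training/call-criteria"    # no
```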

Deployment Examples

Here are some common deployment scenarios:

AWS EC2

# Run in a screen session for persistence
screen -S plexus-worker
plexus command worker \
  --concurrency 8 \
  --loglevel INFO
# Ctrl+A, D to detach

Local Development

# Run with increased logging for debugging
plexus command worker \
  --concurrency 2 \
  --loglevel DEBUG

GPU Worker

# Dedicated GPU worker with specific targeting
plexus command worker \
  --concurrency 1 \
  --target-patterns "*/gpu-required" \
  --loglevel INFO

Best Practices

  • Use a process manager (like systemd, supervisor, or screen) to keep workers running
  • Set concurrency based on available CPU cores and memory
  • Use target patterns to optimize resource utilization
  • Monitor worker logs for errors and performance issues
  • Deploy workers close to your data sources when possible
  • Consider using auto-scaling groups in cloud environments
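As one concrete example of the process-manager advice above, a systemd unit for a worker might look like the following sketch. The unit name, user, and paths are placeholders; adjust them for your deployment:

```ini
# /etc/systemd/system/plexus-worker.service (example path)
[Unit]
Description=Plexus worker node
After=network.target

[Service]
# Placeholder user and working directory; adjust for your environment
User=plexus
WorkingDirectory=/opt/plexus
ExecStart=/usr/local/bin/plexus command worker --concurrency 4 --loglevel INFO
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Enabling the unit with systemctl enable --now plexus-worker starts the worker at boot and restarts it automatically if it exits.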

Additional Resources

For more information about worker deployment and management:

  • See the CLI documentation for detailed command reference
  • Check the built-in help with plexus command worker --help
  • View worker logs with --loglevel DEBUG for troubleshooting