Full Deployment Guide

This guide covers the complete end-to-end workflow for Pixell agents, from building and packaging them with PAK through deploying and operating them with the PAR runtime in production.

Deployment Overview

The Pixell deployment workflow consists of:

  • 🏗️ Development - Build agents with PAK
  • 📦 Packaging - Create APKG files
  • 🚀 Deployment - Deploy to PAR runtime
  • 🔧 Configuration - Production configuration
  • 📊 Monitoring - Health checks and metrics
  • 🔄 Scaling - Horizontal and vertical scaling

Prerequisites

Before starting the deployment process:

  • PAK installed - PAK Installation
  • PAR runtime - PAR Installation
  • Docker - For containerized deployments
  • Kubernetes - For orchestrated deployments (optional)
  • Cloud account - AWS, GCP, or Azure (for cloud deployment)

Step 1: Development and Testing

1.1 Create Your Agent

Start by creating a new agent project:

# Create a new agent
pixell init my-production-agent
cd my-production-agent

# Verify the project structure
ls -la

1.2 Develop Agent Logic

Implement your agent's core functionality:

# src/main.py
import os
import json
from typing import Dict, Any, List

class ProductionAgent:
    def __init__(self):
        self.name = "Production Agent"
        self.version = "1.0.0"
        self.capabilities = ["data_processing", "analysis", "reporting"]

    def process_request(self, request: Dict[str, Any]) -> Dict[str, Any]:
        """Process incoming requests"""
        try:
            # Validate request
            if not self._validate_request(request):
                return {"error": "Invalid request format"}

            # Process based on request type
            request_type = request.get("type")
            if request_type == "data_analysis":
                return self._analyze_data(request)
            elif request_type == "report_generation":
                return self._generate_report(request)
            else:
                return {"error": "Unsupported request type"}

        except Exception as e:
            return {"error": f"Processing failed: {str(e)}"}

    def _validate_request(self, request: Dict[str, Any]) -> bool:
        """Validate request structure"""
        required_fields = ["type", "data"]
        return all(field in request for field in required_fields)

    def _analyze_data(self, request: Dict[str, Any]) -> Dict[str, Any]:
        """Analyze data and return insights"""
        data = request.get("data", {})

        # Perform analysis
        analysis_result = {
            "summary": "Data analysis completed",
            "insights": ["Trend detected", "Anomaly found"],
            "confidence": 0.95,
            "timestamp": "2024-01-15T10:30:00Z"
        }

        return analysis_result

    def _generate_report(self, request: Dict[str, Any]) -> Dict[str, Any]:
        """Generate report based on request"""
        data = request.get("data", {})

        # Generate report
        report = {
            "title": "Analysis Report",
            "sections": ["Executive Summary", "Detailed Analysis", "Recommendations"],
            "generated_at": "2024-01-15T10:30:00Z",
            "status": "completed"
        }

        return report

# Global agent instance
agent = ProductionAgent()

def get_agent():
    """Get the agent instance"""
    return agent

1.3 Implement REST API

# src/rest/routes.py
from fastapi import APIRouter, HTTPException, BackgroundTasks
from pydantic import BaseModel
from typing import Dict, Any, Optional
from ..main import get_agent

router = APIRouter()

class ProcessRequest(BaseModel):
    type: str
    data: Dict[str, Any]
    priority: Optional[str] = "normal"

@router.post("/process")
async def process_request(request: ProcessRequest, background_tasks: BackgroundTasks):
    """Process agent requests"""
    try:
        agent = get_agent()
        result = agent.process_request(request.dict())

        # Log the request
        background_tasks.add_task(log_request, request.dict(), result)

        return {
            "success": True,
            "result": result,
            "agent": agent.name,
            "version": agent.version
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@router.get("/health")
async def health_check():
    """Health check endpoint"""
    return {
        "status": "healthy",
        "agent": "production-agent",
        "version": "1.0.0",
        "uptime": "00:05:23"
    }

@router.get("/capabilities")
async def get_capabilities():
    """Get agent capabilities"""
    agent = get_agent()
    return {
        "capabilities": agent.capabilities,
        "description": "Production-ready data processing agent"
    }

async def log_request(request: Dict[str, Any], result: Dict[str, Any]):
    """Log request for monitoring"""
    # Implement logging logic
    pass

1.4 Implement A2A Service

# src/a2a/service.py
import json
import time

from pixell_runtime.proto import agent_pb2, agent_pb2_grpc
from google.protobuf.timestamp_pb2 import Timestamp
from ..main import get_agent

class ProductionAgentService(agent_pb2_grpc.AgentServiceServicer):
    def __init__(self):
        self.agent = get_agent()

    def Health(self, request, context):
        """Health check for A2A communication"""
        return agent_pb2.HealthResponse(
            status="healthy",
            message="Production agent is running",
            timestamp=Timestamp(seconds=int(time.time()))
        )

    def DescribeCapabilities(self, request, context):
        """Describe agent capabilities"""
        return agent_pb2.CapabilitiesResponse(
            capabilities=self.agent.capabilities,
            description="Production-ready data processing agent",
            version="1.0.0"
        )

    def Invoke(self, request, context):
        """Handle A2A invoke requests"""
        try:
            # Parse the request
            message = request.message
            context_dict = dict(request.context) if request.context else {}

            # Process the request
            result = self.agent.process_request({
                "type": context_dict.get("type", "data_analysis"),
                "data": json.loads(message) if isinstance(message, str) else message
            })

            return agent_pb2.InvokeResponse(
                response=json.dumps(result),
                success=True,
                metadata={"agent": "production-agent", "version": "1.0.0"}
            )
        except Exception as e:
            return agent_pb2.InvokeResponse(
                response=f"Error: {str(e)}",
                success=False,
                metadata={"error": str(e)}
            )

1.5 Test Locally

Test your agent locally before packaging:

# Start development server
pixell run-dev

# Test REST API
curl -X POST http://localhost:8000/process \
  -H "Content-Type: application/json" \
  -d '{"type": "data_analysis", "data": {"dataset": "sales_2024"}}'

# Test health endpoint
curl http://localhost:8000/health

# Test capabilities
curl http://localhost:8000/capabilities
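
Beyond the HTTP checks above, you can also exercise the core logic directly with a small unit test. A minimal sketch using pytest (the test file name is illustrative; adjust the import path to your project layout):

# tests/test_agent.py
from src.main import ProductionAgent

def test_rejects_invalid_request():
    agent = ProductionAgent()
    # "data" is required, so validation should fail
    assert agent.process_request({"type": "data_analysis"}) == {"error": "Invalid request format"}

def test_analyzes_data():
    agent = ProductionAgent()
    result = agent.process_request({"type": "data_analysis", "data": {"dataset": "sales_2024"}})
    assert "insights" in result

def test_rejects_unsupported_type():
    agent = ProductionAgent()
    assert agent.process_request({"type": "unknown", "data": {}}) == {"error": "Unsupported request type"}

Run the tests with pytest tests/ before building the package.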

Step 2: Build and Package

2.1 Configure Agent Manifest

Update your agent.yaml for production:

# agent.yaml
name: production-agent
version: 1.0.0
description: Production-ready data processing agent
author: Your Team
entry_point: src/main.py

# Production configuration
environment: production
log_level: INFO

# Agent capabilities
capabilities:
  - data_processing
  - analysis
  - reporting

# API endpoints
endpoints:
  rest:
    - path: /process
      method: POST
      description: Process data analysis requests
    - path: /health
      method: GET
      description: Health check endpoint
    - path: /capabilities
      method: GET
      description: Get agent capabilities
  a2a:
    - service: ProductionAgentService
      methods: [Health, DescribeCapabilities, Invoke]
  ui:
    - path: /
      description: Agent dashboard

# Production settings
production:
  max_workers: 4
  timeout: 30
  memory_limit: 2GB
  auto_restart: true

# Security settings
security:
  authentication: required
  rate_limiting: enabled
  cors_origins:
    - https://yourdomain.com
    - https://api.yourdomain.com

2.2 Build APKG Package

Build your agent into a production-ready APKG:

# Validate configuration
pixell validate

# Build production package
pixell build --compress --output-dir ./dist/production

# Verify the package
pixell inspect ./dist/production/production-agent-1.0.0.apkg

2.3 Test APKG Locally

Test your APKG package locally:

# Run PAR with your APKG
pixell-runtime --package ./dist/production/production-agent-1.0.0.apkg

# Test the deployed agent
curl -X POST http://localhost:8080/process \
  -H "Content-Type: application/json" \
  -d '{"type": "data_analysis", "data": {"dataset": "sales_2024"}}'

Step 3: Production Configuration

3.1 Environment Configuration

Create production environment configuration:

# .env.production
# Runtime configuration
RUNTIME_MODE=three-surface
PORT=8080
ADMIN_PORT=9090
HOST=0.0.0.0

# Agent configuration
AGENT_PACKAGE_PATH=./production-agent-1.0.0.apkg
AGENT_MEMORY_LIMIT=2GB
AGENT_TIMEOUT=30

# Security
API_KEY=your-production-api-key
JWT_SECRET=your-jwt-secret
CORS_ORIGINS=https://yourdomain.com,https://api.yourdomain.com

# Logging
LOG_LEVEL=INFO
LOG_FORMAT=json
LOG_FILE=/var/log/par/production.log

# Performance
MAX_WORKERS=4
KEEPALIVE_TIMEOUT=5
MAX_CONNECTIONS=1000

# Monitoring
METRICS_ENABLED=true
HEALTH_CHECK_INTERVAL=30
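
If you run PAR directly on a host rather than in a container, one way to load this file is to export it into the shell before starting the runtime (a sketch; a process manager such as systemd can set these variables instead):

# Load the production environment and start PAR
set -a
source .env.production
set +a
pixell-runtime --config par-config.production.yaml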

3.2 Production Configuration File

Create a comprehensive production configuration:

# par-config.production.yaml
runtime:
  mode: three-surface
  host: 0.0.0.0
  port: 8080
  admin_port: 9090
  workers: 4
  timeout: 30

agent:
  package_path: ./production-agent-1.0.0.apkg
  memory_limit: 2GB
  timeout: 30
  auto_restart: true

logging:
  level: INFO
  format: json
  file: /var/log/par/production.log
  max_size: 100MB
  backup_count: 5
  rotation: true

security:
  api_key: ${PAR_API_KEY}
  jwt_secret: ${PAR_JWT_SECRET}
  cors_origins:
    - https://yourdomain.com
    - https://api.yourdomain.com
  rate_limiting:
    enabled: true
    requests_per_minute: 1000
  ssl:
    enabled: true
    cert_path: /etc/ssl/certs/par.crt
    key_path: /etc/ssl/private/par.key

performance:
  max_workers: 4
  timeout: 30
  memory_limit: 2GB
  keepalive_timeout: 5
  max_connections: 1000

monitoring:
  metrics: true
  health_check_interval: 30
  status_endpoint: /health
  metrics_endpoint: /metrics
  alerting:
    enabled: true
    webhook_url: ${ALERT_WEBHOOK_URL}
    thresholds:
      cpu_usage: 80
      memory_usage: 90
      response_time: 5.0

storage:
  type: s3
  bucket: your-par-bucket
  region: us-west-2
  access_key: ${AWS_ACCESS_KEY}
  secret_key: ${AWS_SECRET_KEY}

Step 4: Container Deployment

4.1 Create Dockerfile

Create a production-ready Dockerfile:

# Dockerfile
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    g++ \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY . .

# Create non-root user
RUN useradd -m -u 1000 par && chown -R par:par /app
USER par

# Create log directory
RUN mkdir -p /app/logs

# Expose ports
EXPOSE 8080 8081 9090

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1

# Start PAR
CMD ["pixell-runtime", "--config", "par-config.production.yaml"]
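
To keep the image small and avoid baking local artifacts or secrets into it, you may also want a .dockerignore alongside the Dockerfile (a minimal sketch; adjust the entries to your project):

# .dockerignore
.git
__pycache__/
*.pyc
dist/
logs/
.env*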

4.2 Build Docker Image

# Build production image
docker build -t production-agent:1.0.0 .

# Tag for registry
docker tag production-agent:1.0.0 your-registry.com/production-agent:1.0.0

# Push to registry
docker push your-registry.com/production-agent:1.0.0

4.3 Deploy with Docker Compose

# docker-compose.production.yml
version: '3.8'

services:
  production-agent:
    image: your-registry.com/production-agent:1.0.0
    container_name: production-agent
    ports:
      - "8080:8080"
      - "8081:8081"
      - "9090:9090"
    volumes:
      - ./logs:/app/logs
      - ./data:/app/data
    environment:
      - RUNTIME_MODE=three-surface
      - PORT=8080
      - ADMIN_PORT=9090
      - LOG_LEVEL=INFO
      - API_KEY=${PAR_API_KEY}
      - JWT_SECRET=${PAR_JWT_SECRET}
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '1.0'
        reservations:
          memory: 1G
          cpus: '0.5'

  # Optional: Add reverse proxy
  nginx:
    image: nginx:alpine
    container_name: production-agent-nginx
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/nginx/ssl
    depends_on:
      - production-agent
    restart: unless-stopped

  # Optional: Add monitoring
  prometheus:
    image: prom/prometheus:latest
    container_name: production-agent-prometheus
    ports:
      - "9091:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: production-agent-grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana-storage:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    restart: unless-stopped

volumes:
  grafana-storage:
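
The compose file mounts ./nginx.conf into the nginx container but does not define it. A minimal reverse-proxy configuration might look like the following (the server name and certificate file names are placeholders for your domain and certificates):

# nginx.conf
events {}

http {
  upstream par_backend {
    server production-agent:8080;
  }

  server {
    listen 443 ssl;
    server_name yourdomain.com;

    ssl_certificate     /etc/nginx/ssl/par.crt;
    ssl_certificate_key /etc/nginx/ssl/par.key;

    location / {
      proxy_pass http://par_backend;
      proxy_set_header Host $host;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
  }
}

Bring the stack up with:

docker compose -f docker-compose.production.yml up -d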

Step 5: Kubernetes Deployment

5.1 Create Kubernetes Manifests

# k8s-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production-agents
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: production-agent-config
  namespace: production-agents
data:
  par-config.yaml: |
    runtime:
      mode: three-surface
      host: 0.0.0.0
      port: 8080
      admin_port: 9090
    logging:
      level: INFO
      format: json
    performance:
      max_workers: 4
      timeout: 30
---
apiVersion: v1
kind: Secret
metadata:
  name: production-agent-secrets
  namespace: production-agents
type: Opaque
data:
  api-key: <base64-encoded-api-key>
  jwt-secret: <base64-encoded-jwt-secret>
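
The api-key and jwt-secret values must be base64-encoded before they go into the Secret manifest. You can encode them yourself, or let kubectl create the Secret directly (replace the placeholder values with your own):

# Produce base64 values for the manifest
echo -n 'your-production-api-key' | base64

# Or create the Secret without editing the manifest
kubectl create secret generic production-agent-secrets \
  --namespace production-agents \
  --from-literal=api-key='your-production-api-key' \
  --from-literal=jwt-secret='your-jwt-secret'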

5.2 Deployment Manifest

# k8s-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-agent
  namespace: production-agents
spec:
  replicas: 3
  selector:
    matchLabels:
      app: production-agent
  template:
    metadata:
      labels:
        app: production-agent
    spec:
      containers:
        - name: production-agent
          image: your-registry.com/production-agent:1.0.0
          ports:
            - containerPort: 8080
              name: http
            - containerPort: 8081
              name: grpc
            - containerPort: 9090
              name: admin
          env:
            - name: RUNTIME_MODE
              value: "three-surface"
            - name: PORT
              value: "8080"
            - name: API_KEY
              valueFrom:
                secretKeyRef:
                  name: production-agent-secrets
                  key: api-key
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: production-agent-secrets
                  key: jwt-secret
          volumeMounts:
            - name: config
              mountPath: /app/config
            - name: logs
              mountPath: /app/logs
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
      volumes:
        - name: config
          configMap:
            name: production-agent-config
        - name: logs
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: production-agent-service
  namespace: production-agents
spec:
  selector:
    app: production-agent
  ports:
    - name: http
      port: 80
      targetPort: 8080
    - name: grpc
      port: 8081
      targetPort: 8081
    - name: admin
      port: 9090
      targetPort: 9090
  type: LoadBalancer

5.3 Deploy to Kubernetes

# Create namespace and config
kubectl apply -f k8s-namespace.yaml

# Deploy the agent
kubectl apply -f k8s-deployment.yaml

# Check deployment status
kubectl get pods -n production-agents
kubectl get services -n production-agents

# Get external IP
kubectl get service production-agent-service -n production-agents
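
To confirm the rollout completed and to watch the agent start up, you can also run:

# Wait for all replicas to become available
kubectl rollout status deployment/production-agent -n production-agents

# Follow logs from the running pods
kubectl logs -f deployment/production-agent -n production-agents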

Step 6: Cloud Deployment

6.1 AWS ECS Deployment

# ecs-task-definition.json
{
  "family": "production-agent",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "1024",
  "memory": "2048",
  "executionRoleArn": "arn:aws:iam::account:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::account:role/productionAgentTaskRole",
  "containerDefinitions": [
    {
      "name": "production-agent",
      "image": "your-account.dkr.ecr.region.amazonaws.com/production-agent:1.0.0",
      "portMappings": [
        { "containerPort": 8080, "protocol": "tcp" },
        { "containerPort": 8081, "protocol": "tcp" },
        { "containerPort": 9090, "protocol": "tcp" }
      ],
      "environment": [
        { "name": "RUNTIME_MODE", "value": "three-surface" },
        { "name": "PORT", "value": "8080" }
      ],
      "secrets": [
        {
          "name": "API_KEY",
          "valueFrom": "arn:aws:secretsmanager:region:account:secret:par/api-key"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/production-agent",
          "awslogs-region": "us-west-2",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ]
}
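
Registering the task definition and creating the service is typically done with the AWS CLI. A sketch (the cluster name, subnet, and security group IDs are placeholders for your environment):

# Register the task definition
aws ecs register-task-definition --cli-input-json file://ecs-task-definition.json

# Create the Fargate service
aws ecs create-service \
  --cluster production-cluster \
  --service-name production-agent \
  --task-definition production-agent \
  --desired-count 2 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-xxxxxxxx],securityGroups=[sg-xxxxxxxx],assignPublicIp=ENABLED}"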

6.2 Google Cloud Run

# cloud-run.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: production-agent
  annotations:
    run.googleapis.com/ingress: all
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/maxScale: "10"
        run.googleapis.com/cpu: "2"
        run.googleapis.com/memory: "4Gi"
    spec:
      containers:
        - image: gcr.io/your-project/production-agent:1.0.0
          ports:
            - containerPort: 8080
          env:
            - name: RUNTIME_MODE
              value: "three-surface"
            - name: PORT
              value: "8080"
          resources:
            limits:
              cpu: "2"
              memory: "4Gi"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
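
You can apply this manifest with the gcloud CLI; the region and project below are placeholders:

# Deploy the service from the YAML manifest
gcloud run services replace cloud-run.yaml --region us-central1 --project your-project

# Or deploy the image directly without a manifest
gcloud run deploy production-agent \
  --image gcr.io/your-project/production-agent:1.0.0 \
  --region us-central1 \
  --port 8080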

Step 7: Monitoring and Observability

7.1 Health Checks

# Basic health check
curl http://your-agent-url/health

# Detailed status
curl http://your-agent-url/status

# Metrics endpoint
curl http://your-agent-url/metrics

7.2 Prometheus Configuration

# prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'production-agent'
    static_configs:
      - targets: ['production-agent-service:9090']
    metrics_path: /metrics
    scrape_interval: 10s

7.3 Grafana Dashboard

{
  "dashboard": {
    "title": "Production Agent Dashboard",
    "panels": [
      {
        "title": "Request Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(par_requests_total[5m])",
            "legendFormat": "Requests/sec"
          }
        ]
      },
      {
        "title": "Response Time",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, rate(par_request_duration_seconds_bucket[5m]))",
            "legendFormat": "95th percentile"
          }
        ]
      },
      {
        "title": "Error Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(par_errors_total[5m])",
            "legendFormat": "Errors/sec"
          }
        ]
      }
    ]
  }
}

Step 8: Scaling and Optimization

8.1 Horizontal Scaling

# k8s-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: production-agent-hpa
  namespace: production-agents
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: production-agent
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
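
Apply the autoscaler and verify that it is tracking the deployment (the HPA relies on the Kubernetes metrics server being installed in the cluster):

# Create the autoscaler
kubectl apply -f k8s-hpa.yaml

# Check current replicas and utilization against the targets
kubectl get hpa production-agent-hpa -n production-agents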

8.2 Performance Optimization

# Performance tuning
performance:
  max_workers: 8
  timeout: 30
  memory_limit: 4GB
  keepalive_timeout: 5
  max_connections: 2000
  connection_pooling: true
  caching:
    enabled: true
    ttl: 300
    max_size: 1000

Troubleshooting

Common Issues

1. Agent Not Starting

Symptoms: Agent fails to start or crashes

Solutions:

# Check logs
kubectl logs -n production-agents deployment/production-agent

# Check resource limits
kubectl describe pod -n production-agents production-agent-pod

# Verify configuration
kubectl get configmap -n production-agents production-agent-config -o yaml

2. High Memory Usage

Symptoms: Agent consuming too much memory

Solutions:

# Increase memory limits
resources:
  limits:
    memory: "4Gi"
  requests:
    memory: "2Gi"

3. Slow Response Times

Symptoms: High response times

Solutions:

# Optimize performance settings
performance:
  max_workers: 8
  timeout: 30
  connection_pooling: true
  caching:
    enabled: true

Best Practices

1. Security

  • Use strong API keys and JWT secrets
  • Enable SSL/TLS encryption
  • Implement rate limiting
  • Regular security updates

2. Monitoring

  • Set up comprehensive health checks
  • Monitor key metrics (CPU, memory, response time)
  • Implement alerting for critical issues
  • Regular log analysis

3. Performance

  • Optimize resource allocation
  • Use connection pooling
  • Implement caching where appropriate
  • Regular performance testing

4. Reliability

  • Implement proper error handling
  • Use circuit breakers for external dependencies (see the sketch after this list)
  • Regular backups and disaster recovery
  • Blue-green deployments for zero downtime
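
A circuit breaker can be as simple as a wrapper that stops calling a failing dependency for a cooldown period. A minimal sketch in Python (the thresholds and the call_external_service function are illustrative, not part of PAR):

# circuit_breaker.py - minimal illustrative circuit breaker
import time

class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 30.0):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open before retrying
        self.failures = 0
        self.opened_at = 0.0

    def call(self, func, *args, **kwargs):
        # While the breaker is open, fail fast until the cooldown expires
        if self.failures >= self.failure_threshold:
            if time.time() - self.opened_at < self.reset_timeout:
                raise RuntimeError("Circuit open: skipping call to dependency")
            self.failures = 0  # half-open: allow one trial call

        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.time()
            raise
        else:
            self.failures = 0
            return result

# Usage (hypothetical): breaker = CircuitBreaker(); breaker.call(call_external_service, payload)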

Next Steps

After successful deployment:

  1. Monitoring - Set up comprehensive monitoring
  2. Scaling - Scale your deployment
  3. Security - Implement security best practices
  4. Best Practices - Follow production best practices

Ready to monitor your deployment? Check out Monitoring to set up comprehensive monitoring for your production agents!