The Agent Governance SDK can be integrated with any LLM provider through manual tracking or custom wrapper implementations. This guide shows how to add monitoring for LLM providers the SDK does not support directly.
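
For the simplest cases you do not need a wrapper at all: call the SDK's tracking methods manually around your existing LLM call. A minimal sketch, where yourLLMClient stands in for your own provider client:
import { AgentMonitor } from '@agent-governance/node';

const monitor = new AgentMonitor({
  apiKey: 'your-api-key',
  organizationId: 'your-org-id'
});

async function askModel(prompt, sessionId, userId = null) {
  // Record the incoming user message
  monitor.trackUserMessage('custom-agent', sessionId, prompt, userId);

  const startTime = Date.now();
  const response = await yourLLMClient.generateText({ prompt });

  // Record the agent response with basic latency metadata
  monitor.trackAgentResponse('custom-agent', sessionId, response.text, {
    llmLatency: Date.now() - startTime
  });

  return response;
}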

When to Use Custom Integration

Use custom integration when:
  • Unsupported Providers: Working with LLMs not directly supported (Cohere, PaLM, local models)
  • Custom Infrastructure: Using internally hosted or modified LLM implementations
  • Specialized Workflows: Complex multi-model or agent-to-agent communication patterns
  • Legacy Systems: Integrating with existing agent implementations

Basic Custom Integration Pattern

Wrapper Class Approach

Create a wrapper class that implements monitoring around your LLM client:
import { AgentMonitor } from '@agent-governance/node';

class CustomLLMMonitor {
  constructor(llmClient, monitor, agentId, options = {}) {
    this.llmClient = llmClient;
    this.monitor = monitor;
    this.agentId = agentId;
    this.options = options;
    this.startedSessions = new Set();
  }

  async generateResponse(prompt, sessionId, userId = null) {
    const startTime = Date.now();

    try {
      // Track conversation start on the first message of each session
      if (this.options.trackConversationStart !== false && !this.startedSessions.has(sessionId)) {
        this.monitor.trackConversationStart(this.agentId, sessionId, userId);
        this.startedSessions.add(sessionId);
      }

      // Track user message
      this.monitor.trackUserMessage(this.agentId, sessionId, prompt, userId);

      // Call your custom LLM
      const response = await this.llmClient.generateText({
        prompt: prompt,
        maxTokens: this.options.maxTokens || 1000,
        temperature: this.options.temperature || 0.7
      });

      const endTime = Date.now();
      const latency = endTime - startTime;

      // Track agent response with metadata
      this.monitor.trackAgentResponse(this.agentId, sessionId, response.text, {
        llmLatency: latency,
        tokensUsed: {
          input: response.usage?.promptTokens || 0,
          output: response.usage?.completionTokens || 0,
          total: response.usage?.totalTokens || 0
        },
        cost: this.calculateCost(response.usage),
        model: this.options.model || 'custom-model'
      });

      return response;

    } catch (error) {
      // Track error
      this.monitor.trackError(this.agentId, sessionId, error, {
        errorType: error.constructor.name,
        severity: 'high',
        recoverable: false
      });
      throw error;
    }
  }

  calculateCost(usage) {
    // Implement cost calculation for your provider
    if (!usage || !this.options.pricing) return 0;

    const inputCost = (usage.promptTokens / 1000000) * this.options.pricing.input;
    const outputCost = (usage.completionTokens / 1000000) * this.options.pricing.output;

    return inputCost + outputCost;
  }
}

// Usage
const monitor = new AgentMonitor({
  apiKey: 'your-api-key',
  organizationId: 'your-org-id'
});

const customLLM = new CustomLLMMonitor(yourLLMClient, monitor, 'custom-agent', {
  model: 'your-custom-model-v1',
  pricing: { input: 0.5, output: 1.5 }, // per million tokens
  trackConversationStart: true
});

const response = await customLLM.generateResponse(
  'Help me with my banking question',
  'session-123',
  'user-456'
);

Proxy Pattern Integration

For more advanced integration, implement a proxy that intercepts LLM calls:
class LLMProxy {
  constructor(originalClient, monitor, agentId) {
    this.originalClient = originalClient;
    this.monitor = monitor;
    this.agentId = agentId;

    return new Proxy(originalClient, {
      get: (target, prop) => {
        if (typeof target[prop] === 'function' && this.shouldMonitor(prop)) {
          return this.createMonitoredMethod(target, prop);
        }
        return target[prop];
      }
    });
  }

  shouldMonitor(methodName) {
    const monitoredMethods = [
      'generate', 'chat', 'complete', 'inference',
      'predict', 'query', 'ask'
    ];
    return monitoredMethods.includes(methodName);
  }

  createMonitoredMethod(target, methodName) {
    return async (...args) => {
      const sessionId = this.extractSessionId(args) || `session-${Date.now()}`;
      const userMessage = this.extractUserMessage(args);
      const startTime = Date.now();

      try {
        if (userMessage) {
          this.monitor.trackUserMessage(this.agentId, sessionId, userMessage);
        }

        const result = await target[methodName].apply(target, args);

        const responseText = this.extractResponseText(result);
        if (responseText) {
          this.monitor.trackAgentResponse(this.agentId, sessionId, responseText, {
            llmLatency: Date.now() - startTime,
            method: methodName,
            tokensUsed: this.extractTokenUsage(result)
          });
        }

        return result;

      } catch (error) {
        this.monitor.trackError(this.agentId, sessionId, error);
        throw error;
      }
    };
  }

  extractSessionId(args) {
    return args.find(arg => arg?.sessionId)?.sessionId;
  }

  extractUserMessage(args) {
    const direct = args.find(arg => typeof arg === 'string');
    if (direct) return direct;

    const params = args.find(arg => arg?.prompt || arg?.message);
    return params?.prompt || params?.message;
  }

  extractResponseText(result) {
    if (typeof result === 'string') return result;
    return result?.text || result?.content || result?.response || null;
  }

  extractTokenUsage(result) {
    return result?.usage || result?.tokens || { input: 0, output: 0, total: 0 };
  }
}

// Usage
const monitoredLLM = new LLMProxy(yourLLMClient, monitor, 'custom-agent');
const response = await monitoredLLM.generate('Hello world');
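
The extractor helpers look for sessionId, prompt, or message properties on any object argument, so passing a params object lets the proxy attribute calls to a real session. A sketch, assuming your client also exposes a chat method:
const chatResponse = await monitoredLLM.chat({
  message: 'What fees apply to my account?',
  sessionId: 'session-123'  // picked up by extractSessionId
});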

Specific Provider Examples

Cohere Integration

Wrap the Cohere client's generate method so every call is tracked automatically:
import { CohereClient } from 'cohere-ai';
import { AgentMonitor } from '@agent-governance/node';

class CohereAgentMonitor extends AgentMonitor {
  wrapCohere(cohere, agentId, options = {}) {
    const monitor = this;

    return new Proxy(cohere, {
      get(target, prop) {
        if (prop === 'generate') {
          return async function(params) {
            const sessionId = options.sessionId || `session-${Date.now()}`;
            const startTime = Date.now();

            try {
              monitor.trackConversationStart(agentId, sessionId, options.userId);
              monitor.trackUserMessage(agentId, sessionId, params.prompt, options.userId);

              const response = await target.generate.call(target, params);

              monitor.trackAgentResponse(agentId, sessionId, response.generations[0].text, {
                llmLatency: Date.now() - startTime,
                tokensUsed: {
                  input: response.meta?.tokens?.input_tokens || 0,
                  output: response.meta?.tokens?.output_tokens || 0,
                  total: (response.meta?.tokens?.input_tokens || 0) + (response.meta?.tokens?.output_tokens || 0)
                },
                model: params.model || 'command',
                provider: 'cohere'
              });

              return response;
            } catch (error) {
              monitor.trackError(agentId, sessionId, error);
              throw error;
            }
          };
        }
        return target[prop];
      }
    });
  }
}
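
Usage follows the same pattern as the wrapper class. A sketch, assuming the cohere-ai client is constructed with your Cohere API token:
const monitor = new CohereAgentMonitor({
  apiKey: 'your-api-key',
  organizationId: 'your-org-id'
});

const cohere = new CohereClient({ token: 'your-cohere-api-key' });

const monitoredCohere = monitor.wrapCohere(cohere, 'cohere-agent', {
  sessionId: 'session-123',
  userId: 'user-456'
});

const response = await monitoredCohere.generate({
  prompt: 'Help me with my banking question',
  model: 'command'
});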

Local Model Integration

For self-hosted models, wrap your inference call and record hardware and timing metadata alongside the response:
class LocalModelMonitor {
  constructor(monitor, agentId, modelConfig) {
    this.monitor = monitor;
    this.agentId = agentId;
    this.modelConfig = modelConfig;
  }

  async runInference(input, sessionId, options = {}) {
    const startTime = Date.now();

    try {
      this.monitor.trackUserMessage(this.agentId, sessionId, input);

      const result = await this.callLocalModel(input);

      this.monitor.trackAgentResponse(this.agentId, sessionId, result.text, {
        llmLatency: Date.now() - startTime,
        model: this.modelConfig.modelName,
        provider: 'local',
        gpuUtilization: result.gpuStats?.utilization,
        memoryUsed: result.memoryStats?.used,
        inferenceTime: result.processingTime
      });

      return result;

    } catch (error) {
      this.monitor.trackError(this.agentId, sessionId, error, {
        errorType: 'LocalModelError',
        modelPath: this.modelConfig.modelPath,
        severity: 'high'
      });
      throw error;
    }
  }

  async callLocalModel(input) {
    // Implement your local model calling logic
    return {
      text: 'Generated response from local model',
      processingTime: 150,
      gpuStats: { utilization: 0.85 },
      memoryStats: { used: '2.1GB' }
    };
  }
}
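
A usage sketch, where the model name and path are placeholders for your own local deployment:
const localModel = new LocalModelMonitor(monitor, 'local-agent', {
  modelName: 'your-local-model',
  modelPath: '/path/to/your/model'
});

const result = await localModel.runInference(
  'Summarize my recent transactions',
  'session-123'
);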

Multi-Model Workflows

When a single request passes through several models, track each step as a tool call and the final output as the agent response:
class MultiModelWorkflow {
  constructor(monitor, agentId) {
    this.monitor = monitor;
    this.agentId = agentId;
    this.models = {
      classifier: new ClassificationModel(),
      generator: new GenerationModel(),
      validator: new ValidationModel()
    };
  }

  async processRequest(userMessage, sessionId) {
    try {
      this.monitor.trackConversationStart(this.agentId, sessionId);
      this.monitor.trackUserMessage(this.agentId, sessionId, userMessage);

      const classification = await this.classifyIntent(userMessage, sessionId);
      const response = await this.generateResponse(userMessage, classification, sessionId);
      const validatedResponse = await this.validateResponse(response, sessionId);

      this.monitor.trackAgentResponse(this.agentId, sessionId, validatedResponse.text, {
        workflow: 'multi_model',
        steps: ['classification', 'generation', 'validation'],
        finalScore: validatedResponse.qualityScore
      });

      return validatedResponse;

    } catch (error) {
      this.monitor.trackError(this.agentId, sessionId, error);
      throw error;
    }
  }

  async classifyIntent(message, sessionId) {
    const result = await this.models.classifier.predict(message);
    this.monitor.trackToolCall(
      this.agentId,
      sessionId,
      'intent_classification',
      { message },
      { intent: result.intent, confidence: result.confidence },
      result.processingTime
    );
    return result;
  }

  async generateResponse(message, classification, sessionId) {
    const result = await this.models.generator.generate({ prompt: message, intent: classification.intent });
    this.monitor.trackToolCall(
      this.agentId,
      sessionId,
      'response_generation',
      { prompt: message, intent: classification.intent },
      { response: result.text, quality: result.qualityScore },
      result.processingTime
    );
    return result;
  }

  async validateResponse(response, sessionId) {
    const validation = await this.models.validator.validate(response.text);
    this.monitor.trackToolCall(
      this.agentId,
      sessionId,
      'response_validation',
      { response: response.text },
      { isValid: validation.isValid, issues: validation.issues },
      validation.processingTime
    );
    if (!validation.isValid) {
      throw new Error(`Response validation failed: ${validation.issues.join(', ')}`);
    }
    // Return the generated response so the caller can use its text and quality score
    return response;
  }
}
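
A single request through this workflow produces a conversation start, a user message, one tool call per step, and a final agent response:
const workflow = new MultiModelWorkflow(monitor, 'workflow-agent');

const answer = await workflow.processRequest(
  'I want to dispute a charge on my card',
  'session-789'
);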

Agent-to-Agent Communication

When agents exchange messages with each other, record both the outbound and inbound sides of each exchange:
class MultiAgentMonitor {
  constructor(monitor) {
    this.monitor = monitor;
    this.agents = new Map();
  }

  registerAgent(agentId, agentConfig) {
    this.agents.set(agentId, agentConfig);
  }

  async facilitateAgentCommunication(fromAgentId, toAgentId, message, sessionId) {
    try {
      this.monitor.track(fromAgentId, {
        sessionId,
        interactionType: 'agent_communication',
        metadata: { direction: 'outbound', targetAgent: toAgentId, messageType: 'agent_to_agent', content: message }
      });
      this.monitor.track(toAgentId, {
        sessionId,
        interactionType: 'agent_communication',
        metadata: { direction: 'inbound', sourceAgent: fromAgentId, messageType: 'agent_to_agent', content: message }
      });
      const response = await this.processAgentMessage(toAgentId, message, sessionId);
      return response;
    } catch (error) {
      this.monitor.trackError(fromAgentId, sessionId, error, { errorType: 'AgentCommunicationError', targetAgent: toAgentId });
      throw error;
    }
  }

  async processAgentMessage(agentId, message, sessionId) {
    // Route the message to the target agent's own handler here; this example returns a placeholder
    const agentConfig = this.agents.get(agentId);
    return `Response from ${agentId}`;
  }
}
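
A usage sketch; the agent configuration objects are placeholders for whatever metadata your agents need:
const multiAgent = new MultiAgentMonitor(monitor);

multiAgent.registerAgent('triage-agent', { role: 'triage' });
multiAgent.registerAgent('billing-agent', { role: 'billing' });

const reply = await multiAgent.facilitateAgentCommunication(
  'triage-agent',
  'billing-agent',
  'Customer is asking about a duplicate charge',
  'session-123'
);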

Best Practices for Custom Integration

Testing Custom Integrations

Mock your LLM client and use an in-memory test double for the monitor so you can assert on exactly which events were tracked:
describe('Custom LLM Integration', () => {
  let monitor, customLLM, mockLLMClient;

  beforeEach(() => {
    monitor = new MockAgentMonitor();
    mockLLMClient = {
      generateText: jest.fn().mockResolvedValue({
        text: 'Mock response',
        usage: { promptTokens: 10, completionTokens: 20, totalTokens: 30 }
      })
    };
    customLLM = new CustomLLMMonitor(mockLLMClient, monitor, 'test-agent');
  });

  test('should track successful generation', async () => {
    await customLLM.generateResponse('Hello', 'session-123');
    const events = monitor.getEvents();
    expect(events).toHaveLength(3);
    expect(events[1].event.interactionType).toBe('user_message');
    expect(events[2].event.interactionType).toBe('agent_response');
  });

  test('should track errors', async () => {
    mockLLMClient.generateText.mockRejectedValue(new Error('API Error'));
    await expect(customLLM.generateResponse('Hello', 'session-123')).rejects.toThrow();
    const errorEvents = monitor.getEventsByType('error');
    expect(errorEvents).toHaveLength(1);
  });
});
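
The tests above assume a MockAgentMonitor test double that records tracking calls in memory instead of sending them to the API. One possible minimal sketch, matching only the methods and event shape used in this guide:
class MockAgentMonitor {
  constructor() {
    this.events = [];
  }

  record(agentId, interactionType, data = {}) {
    this.events.push({ agentId, event: { interactionType, ...data } });
  }

  trackConversationStart(agentId, sessionId, userId = null) {
    this.record(agentId, 'conversation_start', { sessionId, userId });
  }

  trackUserMessage(agentId, sessionId, content, userId = null) {
    this.record(agentId, 'user_message', { sessionId, content, userId });
  }

  trackAgentResponse(agentId, sessionId, content, metadata = {}) {
    this.record(agentId, 'agent_response', { sessionId, content, metadata });
  }

  trackError(agentId, sessionId, error, metadata = {}) {
    this.record(agentId, 'error', { sessionId, message: error.message, metadata });
  }

  getEvents() {
    return this.events;
  }

  getEventsByType(interactionType) {
    return this.events.filter(e => e.event.interactionType === interactionType);
  }
}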

Next Steps