Quickstart Guides

Go from zero to a running workload in under 5 minutes. Each quickstart includes copy-paste code and step-by-step instructions.

Deploy Your First AI Model on HarchOS

This tutorial walks you through deploying an AI model as a production inference endpoint on the HarchOS sovereign AI mesh. You will install the SDK, register a model, deploy it with carbon-aware scheduling, and run your first inference request.

5 minutes · TypeScript SDK · H100 GPU

Step 1: Install the HarchOS SDK

Install the HarchOS TypeScript SDK using npm. This gives you access to all platform APIs, including compute, data, and monitoring.

bash
npm install @harchos/sdk

Step 2: Initialize the Client

Create a HarchOS client with your API key and configuration. Set sovereignty to "strict" to ensure your data stays within Moroccan jurisdiction. Enable carbon-aware scheduling so requests are routed to hubs with lower carbon intensity.

typescript
import { HarchOS } from '@harchos/sdk';

const client = await HarchOS.create({
  apiKey: process.env.HARCHOS_API_KEY,
  region: 'morocco',
  sovereignty: 'strict',
  carbonAware: true,
});
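
The client reads your API key from the HARCHOS_API_KEY environment variable, as shown in the code above, so export it in your shell before running the script (replace the placeholder with your actual key):

bash
export HARCHOS_API_KEY="your-api-key"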

Step 3: Register Your Model

Upload and register your model artifact. HarchOS supports PyTorch, TensorFlow, and ONNX formats. The platform automatically validates the model, generates metadata, and prepares it for deployment.

typescript
const model = await client.models.register({
  name: 'my-llama-model',
  framework: 'pytorch',
  artifact: './model.pt',
  sovereignty: 'strict',
});

console.log(`Model registered: ${model.id}`);
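
Since registration triggers validation and metadata generation, you may want to check the result before deploying. A minimal sketch, assuming hypothetical status and metadata fields on the returned model (this guide does not document the actual response shape):

typescript
// Hypothetical: `status` and `metadata` are assumed fields for illustration;
// consult the SDK reference for the actual response shape.
if (model.status !== 'validated') {
  throw new Error(`Validation failed for model ${model.id}`);
}
console.log(`Detected framework: ${model.metadata?.framework}`);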

Step 4: Deploy to an Inference Endpoint

Deploy your model as a scalable inference endpoint. Specify GPU type, count, and scheduling strategy. HarchOS will automatically select the optimal hub based on carbon intensity, latency requirements, and resource availability.

typescript
const endpoint = await client.models.deploy(model.id, {
  gpu: 'H100',
  count: 2,
  hub: 'auto',           // Let THINK choose the optimal hub
  schedule: 'carbon-optimal',
  autoScale: {
    min: 1,
    max: 8,
    targetLatency: '100ms',
  },
});

console.log(`Endpoint live at: ${endpoint.url}`);
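
If the deploy call returns before the endpoint is fully provisioned, you can poll until it is ready. A minimal sketch, assuming a hypothetical client.endpoints.get call and status field (neither is confirmed by this guide; the deploy call above may already block until ready):

typescript
// Hypothetical: `client.endpoints.get` and its `status` field are assumptions.
let info = await client.endpoints.get(endpoint.id);
while (info.status !== 'ready') {
  await new Promise((resolve) => setTimeout(resolve, 5_000)); // poll every 5s
  info = await client.endpoints.get(endpoint.id);
}
console.log(`Endpoint ready at ${info.url}`);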

Step 5: Run Inference

Send inference requests to your deployed endpoint. The SDK handles load balancing, retry logic, and automatic failover between hubs. Monitor response times and energy consumption in real time.

typescript
const result = await client.inference.run(endpoint.id, {
  input: 'Explain the HarchOS SENSE-THINK-ACT architecture',
  maxTokens: 512,
  temperature: 0.7,
});

console.log(result.output);
console.log(`Latency: ${result.latencyMs}ms`);
console.log(`Energy: ${result.energyWh}Wh (source: ${result.energySource})`);
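
The SDK retries and fails over between hubs for you, but transient errors can still surface to your code. As a belt-and-braces sketch using only the run call shown above (the retry count is illustrative, not an SDK recommendation):

typescript
// Simple retry wrapper around the documented `client.inference.run` call.
async function runWithRetry(prompt: string, attempts = 3) {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await client.inference.run(endpoint.id, {
        input: prompt,
        maxTokens: 512,
        temperature: 0.7,
      });
    } catch (err) {
      lastError = err;
      console.warn(`Attempt ${i + 1} of ${attempts} failed, retrying...`);
    }
  }
  throw lastError;
}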

You did it!

Your first AI model is now running on the HarchOS sovereign AI mesh. It is deployed with carbon-aware scheduling, meaning HarchOS automatically routes inference requests to the hub with the lowest carbon intensity while maintaining your latency requirements. You can monitor your endpoint in the HarchOS dashboard or via the monitoring API.
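
If you prefer to script monitoring rather than use the dashboard, a sketch along these lines may work (client.monitoring.metrics, its parameters, and the returned fields are all assumptions; this guide does not document the monitoring API's exact shape):

typescript
// Hypothetical: the monitoring method name and metric fields are assumptions.
const metrics = await client.monitoring.metrics(endpoint.id, { window: '1h' });
console.log(`p95 latency: ${metrics.p95LatencyMs}ms`);
console.log(`Total energy: ${metrics.totalEnergyWh}Wh`);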