Quickstart Guides
Go from zero to a running workload in under 5 minutes. Each quickstart includes copy-paste code and step-by-step instructions.
Tutorial
Deploy Your First AI Model on HarchOS
This tutorial walks you through deploying an AI model as a production inference endpoint on the HarchOS sovereign AI mesh. You will install the SDK, register a model, deploy it with carbon-aware scheduling, and run your first inference request.
Install the HarchOS SDK
Install the HarchOS TypeScript SDK using npm. This gives you access to all platform APIs including compute, data, and monitoring.
npm install @harchos/sdk
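The client in the next step reads your API key from the HARCHOS_API_KEY environment variable. Set it in your shell before running the examples (the value below is a placeholder; use the key from your HarchOS account):

```shell
export HARCHOS_API_KEY="your-api-key-here"
```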
Initialize the Client
Create a HarchOS client with your API key and configuration. Set sovereignty to "strict" to keep your data within Moroccan jurisdiction, and enable carbon-aware scheduling for energy-efficient placement.
import { HarchOS } from '@harchos/sdk';
const client = await HarchOS.create({
  apiKey: process.env.HARCHOS_API_KEY,
  region: 'morocco',
  sovereignty: 'strict',
  carbonAware: true,
});

Register Your Model
Upload and register your model artifact. HarchOS supports PyTorch, TensorFlow, and ONNX formats. The platform automatically validates the model, generates metadata, and prepares it for deployment.
const model = await client.models.register({
  name: 'my-llama-model',
  framework: 'pytorch',
  artifact: './model.pt',
  sovereignty: 'strict',
});
console.log(`Model registered: ${model.id}`);

Deploy to an Inference Endpoint
Deploy your model as a scalable inference endpoint. Specify GPU type, count, and scheduling strategy. HarchOS will automatically select the optimal hub based on carbon intensity, latency requirements, and resource availability.
const endpoint = await client.models.deploy(model.id, {
  gpu: 'H100',
  count: 2,
  hub: 'auto', // Let THINK choose the optimal hub
  schedule: 'carbon-optimal',
  autoScale: {
    min: 1,
    max: 8,
    targetLatency: '100ms',
  },
});
console.log(`Endpoint live at: ${endpoint.url}`);

Run Inference
Send inference requests to your deployed endpoint. The SDK handles load balancing, retry logic, and automatic failover between hubs. Monitor response times and energy consumption in real time.
const result = await client.inference.run(endpoint.id, {
  input: 'Explain the HarchOS SENSE-THINK-ACT architecture',
  maxTokens: 512,
  temperature: 0.7,
});
console.log(result.output);
console.log(`Latency: ${result.latencyMs}ms`);
console.log(`Energy: ${result.energyWh}Wh (source: ${result.energySource})`);

You did it!
Your first AI model is now running on the HarchOS sovereign AI mesh. It is deployed with carbon-aware scheduling, meaning HarchOS automatically routes inference requests to the hub with the lowest carbon intensity while maintaining your latency requirements. You can monitor your endpoint in the HarchOS dashboard or via the monitoring API.
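To build intuition for what carbon-aware routing does, here is a minimal, self-contained sketch of the selection policy described above: among hubs that meet the latency target, choose the one with the lowest grid carbon intensity, falling back to the fastest hub if none qualifies. All names, fields, and numbers here are hypothetical illustrations, not part of the HarchOS API.

```typescript
// Hypothetical hub record for illustration only.
interface Hub {
  name: string;
  latencyMs: number;       // measured round-trip latency to this hub
  carbonIntensity: number; // grid carbon intensity, gCO2/kWh
}

// Among hubs meeting the latency target, pick the lowest-carbon one;
// if none qualifies, fall back to the lowest-latency hub.
function pickHub(hubs: Hub[], targetLatencyMs: number): Hub {
  const eligible = hubs.filter((h) => h.latencyMs <= targetLatencyMs);
  const pool = eligible.length > 0 ? eligible : hubs;
  const greener = eligible.length > 0;
  return pool.reduce((best, h) =>
    greener
      ? (h.carbonIntensity < best.carbonIntensity ? h : best)
      : (h.latencyMs < best.latencyMs ? h : best)
  );
}

const hubs: Hub[] = [
  { name: 'hub-a', latencyMs: 40, carbonIntensity: 320 },
  { name: 'hub-b', latencyMs: 85, carbonIntensity: 90 },
  { name: 'hub-c', latencyMs: 140, carbonIntensity: 15 },
];

// hub-c is the greenest but misses the 100 ms target, so hub-b wins.
console.log(pickHub(hubs, 100).name);
```

In the real platform this decision is made for you whenever you deploy with `hub: 'auto'` and `schedule: 'carbon-optimal'`; the sketch only shows the trade-off being resolved.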
Continue Learning