OctoAI TypeScript SDK at a glance

If you need help with any specifics of the OctoAI TypeScript SDK, please see the TypeScript SDK Reference.

The OctoAI TypeScript SDK is intended to help you use OctoAI endpoints. At its simplest, it allows you to run inferences against an endpoint by providing an object with the necessary inputs.

TypeScript
import { Client } from "@octoai/client";

// The token can be provided to the client here.
const client = new Client();

// Provide the endpoint URL, along with the inputs needed to run an inference.
const result = await client.infer("yourEndpointUrl", { key: "value" });

// The client also supports inference streams for LLMs...
const stream = await client.inferStream("yourEndpointUrl", { key: "value" });

// ...and server-side asynchronous inferences.
const future = await client.inferAsync("yourEndpointUrl", { key: "value" });
let ready = await client.isFutureReady(future);
while (!ready) {
  ready = await client.isFutureReady(future);
  await new Promise((resolve) => setTimeout(resolve, 1000));
}
const outputs = await client.getFutureResult(future);

// Health checks are included as well.
if ((await client.healthCheck("healthCheckEndpointUrl")) === 200) {
  // Run some inferences
}
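
Streams let you surface tokens as they arrive instead of waiting for the full completion. The loop below is a minimal sketch that assumes the stream returned by inferStream above is an async iterable; the exact chunk shape varies by endpoint, so consult the TypeScript SDK Reference for the precise event format.

TypeScript
// A hedged sketch of consuming the stream above — assumes the stream
// is async-iterable; the chunk format depends on the endpoint.
for await (const chunk of stream) {
  process.stdout.write(String(chunk));
}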

Example: Whisper Speech Recognition

Whisper is a speech recognition model that converts audio to text. As with Stable Diffusion, we'll use base64 encoding, turning an mp3 or wav file into a base64 string before sending it to the endpoint.

TypeScript
import { Client } from "@octoai/client";
import { readFileSync } from "fs";

const whisperEndpoint = "https://whisper-demo-kk0powt97tmb.octoai.run/predict";
const whisperHealthCheck = "https://whisper-demo-kk0powt97tmb.octoai.run/healthcheck";

// This instantiation approach reads your OCTOAI_TOKEN from an environment variable.
// If you have not set it as an environment variable, you can pass the token directly:
// const OCTOAI_TOKEN = "API token here from following the token creation guide";
// const client = new Client(OCTOAI_TOKEN);
const client = new Client();

// First, we need to convert an audio file to base64.
const audio = readFileSync("./octo_poem.wav", {
    encoding: "base64",
});

// These are the inputs we will send to the endpoint, including the audio base64 string.
const inputs = {
    language: "en",
    task: "transcribe",
    audio: audio,
};

async function wavToText() {
    if (await client.healthCheck(whisperHealthCheck) === 200) {
        const outputs: any = await client.infer(whisperEndpoint, inputs);
        console.log(outputs.transcription);
    }
}

wavToText();

With this particular test file, the following is printed to the console:

Once upon a time, an AI system was asked to come up with a poem for an octopus. 
It said something along the lines of, Octopus, you are very wise.

Whisper Outputs

The outputs variable above contains JSON in roughly the following format.

JSON
{
  prediction_time_ms: 626.42526,
  response: {
    segments: [ [Object] ],
    word_segments: [ [Object] ]
  },
  transcription: ' Once upon a time, an AI system was asked to come up with a poem for an octopus. It said something along the lines of, Octopus, you are very wise.'
}
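
The [Object] entries are Node's console truncating nested objects, not the actual payload; to inspect the full structure, stringify the response:

TypeScript
// Print the full nested response instead of Node's truncated view.
console.log(JSON.stringify(outputs, null, 2));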

Each segment is an object that looks something like:

JSON
{
  start: 5.553,
  end: 8.66,
  text: 'It said something along the lines of, Octopus, you are very wise.',
  words: [
    {
      word: 'It',
      start: 5.553,
      end: 5.633,
      score: 0.945,
      speaker: null
    },
    {
      word: 'said',
      start: 5.653,
      end: 5.814,
      score: 0.328,
      speaker: null
    },
    // etc...
  ],
  speaker: null
}

Each word_segment is an object that looks something like:

JSON
{ word: 'Once', start: 0.783, end: 0.903, score: 0.883, speaker: null }
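
If you would rather not type the response as any, you can model this shape yourself. The interfaces below are a sketch inferred from the sample JSON above — they are illustrative, not types exported by @octoai/client, and field optionality is an assumption.

TypeScript
// Illustrative types inferred from the sample output above.
interface WhisperWord {
  word: string;
  start: number;
  end: number;
  score: number;
  speaker: string | null;
}

interface WhisperSegment {
  start: number;
  end: number;
  text: string;
  words: WhisperWord[];
  speaker: string | null;
}

interface WhisperOutputs {
  prediction_time_ms: number;
  response: {
    segments: WhisperSegment[];
    word_segments: WhisperWord[];
  };
  transcription: string;
}

With these in place, a cast (for example, as unknown as WhisperOutputs) gives you editor completion on fields like transcription instead of working with any.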

TypeScript SDK asynchronous inference

The asynchronous inference API addresses longer-running inferences so you can respond to clients faster. Inference data is stored for 24 hours and then deleted. The TypeScript SDK makes this API simple to use: it manages your headers and authentication, and provides helper methods for handling the responses received from the server.

TypeScript
// Client instantiation, inputs, and endpoint URLs are identical to the previous Whisper guide.
// The setup is the same for any other endpoint you use this feature with.
import { Client } from "@octoai/client";
import * as fs from "fs";

const whisperEndpoint = "https://whisper-demo-kk0powt97tmb.octoai.run/predict";
const whisperHealthCheck = "https://whisper-demo-kk0powt97tmb.octoai.run/healthcheck";

// This instantiation approach reads your OCTOAI_TOKEN from an environment variable.
// If you have not set it as an environment variable, you can pass the token directly:
// const OCTOAI_TOKEN = "API token here from following the token creation guide";
// const client = new Client(OCTOAI_TOKEN);
const client = new Client();

// First, we need to convert an audio file to base64.
const audio = fs.readFileSync("./octo_poem.wav", {
  encoding: "base64",
});

// These are the inputs we will send to the endpoint, including the audio base64 string.
const inputs = {
  language: "en",
  task: "transcribe",
  audio: audio,
};

The pattern of creating a future from the same URL and inputs is identical regardless of the endpoint you're using; custom endpoints support this feature as well.

TypeScript
async function wavToText() {
    if (await client.healthCheck(whisperHealthCheck) === 200) {
        const future = await client.inferAsync(whisperEndpoint, inputs);
        let ready = await client.isFutureReady(future);
        while (!ready) {
            ready = await client.isFutureReady(future);
            await new Promise((resolve) => setTimeout(resolve, 1000));
        }
        const outputs = await client.getFutureResult(future);
        console.log(outputs.transcription);
    }
}

wavToText();

As with the previous example, the console shows:

Once upon a time, an AI system was asked to come up with a poem for an octopus.
It said something along the lines of, Octopus, you are very wise.

This API lets you keep track of many futures running inferences on the server side. By combining client.inferAsync, client.isFutureReady, and client.getFutureResult, any endpoint can be used asynchronously: collect your futures, poll until each is ready, then surface the results.
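
As a minimal sketch of that pattern — reusing client and whisperEndpoint from above, with a hypothetical inputsList array holding one input object per audio file — you might fan out several futures and poll them together:

TypeScript
// Sketch: fan out several async inferences, then poll until each completes.
// inputsList is hypothetical; each element matches the inputs object above.
async function transcribeMany(inputsList: any[]) {
  // Kick off all inferences up front; each call returns a future immediately.
  const futures = await Promise.all(
    inputsList.map((inputs) => client.inferAsync(whisperEndpoint, inputs))
  );

  // Poll each future until the server reports it is ready.
  const results = [];
  for (const future of futures) {
    while (!(await client.isFutureReady(future))) {
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
    results.push(await client.getFutureResult(future));
  }
  return results;
}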