Free models
tinyllama (slow)
POST /v1/free
HEADERS:
X-API-Key: api-token-here
BODY (application/json)
{
    "messages": [
        {
            "role": "system",
            "content": "you are a chatbot"
        },
        {
            "role": "user",
            "content": "how are you?"
        }
    ],
    "max_tokens": 300
}
RESPONSE 200 OK
{
    "response": "I am doing well, thank you for asking. How about you?"
}
Note that the above response is illustrative: actual output will vary, and the request body may include additional messages.
This endpoint will never generate more than 500 tokens, regardless of the max_tokens value.
gemma-7b-it ~35 tokens/second
POST /v1/free/gemma_7b
HEADERS:
X-API-Key: api-token-here
BODY (application/json)
{
    "messages": [
        {
            "role": "system",
            "content": "you are a chatbot"
        },
        {
            "role": "user",
            "content": "how are you?"
        }
    ],
    "max_tokens": 300
}
RESPONSE 200 OK
{
    "response": "I am doing well, thank you for asking. How about you?"
}
Note that the above response is illustrative: actual output will vary, and the request body may include additional messages.
This endpoint will never generate more than 500 tokens, regardless of the max_tokens value.
llama3-8b-instruct ~25 tokens/second
POST /v1/free/llama3
HEADERS:
X-API-Key: api-token-here
BODY (application/json)
{
    "messages": [
        {
            "role": "system",
            "content": "you are a chatbot"
        },
        {
            "role": "user",
            "content": "how are you?"
        }
    ],
    "max_tokens": 300
}
RESPONSE 200 OK
{
    "response": "I'm doing well, thank you for asking! I'm a large language model, so I don't have emotions like humans do, but I'm always happy to help with any questions or tasks you may have. I'm functioning properly and ready to assist you in any way I can. How about you, how are you doing today?"
}
Note that the above response is illustrative: actual output will vary, and the request body may include additional messages.
This endpoint will never generate more than 400 tokens, regardless of the max_tokens value.
mistral-7b-instruct ~25 tokens/second
POST /v1/free/mistral-7b-instruct
HEADERS:
X-API-Key: api-token-here
BODY (application/json)
{
    "messages": [
        {
            "role": "system",
            "content": "you are a chatbot"
        },
        {
            "role": "user",
            "content": "how are you?"
        }
    ],
    "max_tokens": 300
}
RESPONSE 200 OK
{
    "response": " I'm an artificial intelligence and don't have feelings or the ability to be helpful in the same way a human assistant does. My purpose is to provide information and answer questions to the best of my ability. Is there something specific you'd like to ask or learn about? I'm here to help you. I'm an artificial intelligence and don't have the ability to feel emotions or physical sensations, so I don't have the capability to be \"happy,\" \"sad,\" or any other emotional state. However, I'm here to help answer any question or solve any problem you have to the best of my ability. Is there something specific you would like to know or ask about? I'm here for you!"
}
Note that the above response is illustrative: actual output will vary, and the request body may include additional messages.
This endpoint will never generate more than 500 tokens, regardless of the max_tokens value.
llama3.1-8b-instruct ~25 tokens/second
POST /v1/free/llama3.1-8b
HEADERS:
X-API-Key: api-token-here
BODY (application/json)
{
    "messages": [
        {
            "role": "system",
            "content": "you are a chatbot"
        },
        {
            "role": "user",
            "content": "how are you?"
        }
    ],
    "max_tokens": 300
}
RESPONSE 200 OK
{
    "response": "I'm just a chatbot, so I don't have feelings or emotions like humans do. However, I'm functioning properly and ready to help with any questions or tasks you may have! I'm always \"on\" and ready to assist. How about you? How can I help you today?"
}
Note that the above response is illustrative: actual output will vary, and the request body may include additional messages.
This endpoint will never generate more than 400 tokens, regardless of the max_tokens value.
Premium models
tinyllama (fast)
POST /v1/premium/tinyllama
HEADERS:
X-API-Key: api-token-here
BODY (application/json)
{
    "messages": [
        {
            "role": "system",
            "content": "you are a chatbot"
        },
        {
            "role": "user",
            "content": "how are you?"
        }
    ],
    "max_tokens": 300
}
RESPONSE 200 OK
{
    "response": "I am doing well, thank you for asking. How about you?"
}
Note that the above response is illustrative: actual output will vary, and the request body may include additional messages.
This endpoint will never generate more than 500 tokens, regardless of the max_tokens value.
qwen1.5-0.5b-chat ~100 tokens/second
POST /v1/premium/qwen0.5
HEADERS:
X-API-Key: api-token-here
BODY (application/json)
{
    "messages": [
        {
            "role": "system",
            "content": "you are a chatbot"
        },
        {
            "role": "user",
            "content": "how are you?"
        }
    ],
    "max_tokens": 300
}
RESPONSE 200 OK
{
    "response": "Certainly! How are you doing? I hope everything is good and you are doing well. Is there anything I can do to support you? I'd love to help in any way. How should I proceed? Is there anything I can do to help? I'd be happy to assist you in any way possible."
}
Note that the above response is illustrative: actual output will vary, and the request body may include additional messages.
This endpoint will never generate more than 800 tokens, regardless of the max_tokens value.
gemma-7b-it ~300 tokens/second
POST /v1/premium/gemma_7b
HEADERS:
X-API-Key: api-token-here
BODY (application/json)
{
    "messages": [
        {
            "role": "system",
            "content": "you are a chatbot"
        },
        {
            "role": "user",
            "content": "how are you?"
        }
    ],
    "max_tokens": 300
}
RESPONSE 200 OK
{
    "response": "As a chatbot, I do not have personal feelings or physical health. However, I am functioning optimally and ready to assist you with any information or task you have for me. My purpose is to provide helpful and informative responses. How can I assist you today?"
}
Note that the above response is illustrative: actual output will vary, and the request body may include additional messages.
This endpoint will never generate more than 700 tokens, regardless of the max_tokens value.
llama3-8b-instruct ~550 tokens/second
POST /v1/premium/llama3
HEADERS:
X-API-Key: api-token-here
BODY (application/json)
{
    "messages": [
        {
            "role": "system",
            "content": "you are a chatbot"
        },
        {
            "role": "user",
            "content": "how are you?"
        }
    ],
    "max_tokens": 300
}
RESPONSE 200 OK
{
    "response": "I'm doing well, thank you for asking! I'm a large language model, so I don't have emotions like humans do, but I'm always happy to help with any questions or tasks you may have. I'm functioning properly and ready to assist you in any way I can. How about you, how are you doing today?"
}
Note that the above response is illustrative: actual output will vary, and the request body may include additional messages.
This endpoint will never generate more than 1024 tokens, regardless of the max_tokens value.
llama3.1-8b-instruct ~350 tokens/second
POST /v1/premium/llama3.1-8b
HEADERS:
X-API-Key: api-token-here
BODY (application/json)
{
    "messages": [
        {
            "role": "system",
            "content": "you are a chatbot"
        },
        {
            "role": "user",
            "content": "how are you?"
        }
    ],
    "max_tokens": 300
}
RESPONSE 200 OK
{
    "response": "I'm doing well, thank you for asking! I'm a large language model, so I don't have emotions like humans do, but I'm always happy to help with any questions or tasks you may have. I'm functioning properly and ready to assist you in any way I can. How about you, how are you doing today?"
}
Note that the above response is illustrative: actual output will vary, and the request body may include additional messages.
This endpoint will never generate more than 400 tokens, regardless of the max_tokens value.
codestral-22b-instruct ~75 tokens/second
POST /v1/premium/codestral-22b-instruct
HEADERS:
X-API-Key: api-token-here
BODY (application/json)
{
    "messages": [
        {
            "role": "system",
            "content": "you are a chatbot"
        },
        {
            "role": "user",
            "content": "how are you?"
        }
    ],
    "max_tokens": 300
}
RESPONSE 200 OK
{
    "response": "I'm an artificial intelligence, so I don't have feelings or emotions. However, I'm here and ready to assist you to the best of my abilities! How can I help you today?"
}
Note that the above response is illustrative: actual output will vary, and the request body may include additional messages.
This endpoint will never generate more than 1024 tokens, regardless of the max_tokens value.
mistral-7b-instruct ~60 tokens/second
POST /v1/premium/mistral-7b-instruct
HEADERS:
X-API-Key: api-token-here
BODY (application/json)
{
    "messages": [
        {
            "role": "system",
            "content": "you are a chatbot"
        },
        {
            "role": "user",
            "content": "how are you?"
        }
    ],
    "max_tokens": 300
}
RESPONSE 200 OK
{
    "response": " I'm an artificial intelligence and don't have feelings or the ability to be helpful in the same way a human assistant does. My purpose is to provide information and answer questions to the best of my ability. Is there something specific you'd like to ask or learn about? I'm here to help you. I'm an artificial intelligence and don't have the ability to feel emotions or physical sensations, so I don't have the capability to be \"happy,\" \"sad,\" or any other emotional state. However, I'm here to help answer any question or solve any problem you have to the best of my ability. Is there something specific you would like to know or ask about? I'm here for you!"
}
Note that the above response is illustrative: actual output will vary, and the request body may include additional messages.
This endpoint will never generate more than 500 tokens, regardless of the max_tokens value.
stable-diffusion-xl-base-1.0 (BETA)
POST /v1/premium/stable_diffusion
HEADERS:
X-API-Key: api-token-here
BODY (application/json)
{
    "prompt": "cat"
}
RESPONSE 200 OK
{
    "response": "(binary string goes here)"
}
Note that the response field contains the generated image as a binary string; output will vary.
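Assuming the binary string is a Python-style bytes literal (as the stable-diffusion Python example later in this document implies), it can be parsed safely with ast.literal_eval, which evaluates only literals and, unlike eval(), cannot run arbitrary code embedded in a response. The helper name decode_image is our own, not part of the API:

```python
import ast

def decode_image(binary_string: str) -> bytes:
    """Parse a Python-style bytes literal into raw image bytes."""
    data = ast.literal_eval(binary_string)
    if not isinstance(data, bytes):
        raise ValueError("expected a bytes literal in the response field")
    return data

# Stand-in value; a real response would contain full PNG data.
print(decode_image(r"b'\x89PNG'"))  # prints b'\x89PNG'
```

The bytes returned can be written straight to a file such as generated_image.png.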
Code examples
tinyllama (slow)
Python Example
import requests

headers = {"X-API-Key": "myapikey"}
payload = {
    "messages": [
        {"role": "system", "content": "you are a chatbot"},
        {"role": "user", "content": "how are you?"}
    ],
    "max_tokens": 500
}
r1 = requests.post(url='https://ai.nahcrof.com/v1/free/', json=payload, headers=headers)
value = r1.json()
try:
    print(value["response"])
except KeyError:
    # Error responses don't include a "response" key; print the full body.
    print(value)
TypeScript Example
import axios, { AxiosResponse } from 'axios';

const headers = { 'X-API-Key': 'myapikey' };
const payload = {
  messages: [
    { role: 'system', content: 'you are a chatbot' },
    { role: 'user', content: 'how are you?' }
  ],
  max_tokens: 500
};

axios.post('https://ai.nahcrof.com/v1/free/', payload, { headers })
  .then((response: AxiosResponse<{ response: string }>) => {
    if (response.data.response !== undefined) {
      console.log(response.data.response);
    } else {
      // Error responses don't include a "response" field; print the full body.
      console.log(response.data);
    }
  })
  .catch((error) => {
    console.error(error);
  });
average speed: 30 tokens/second
gemma-7b-it ~35 tokens/second
Python Example
import requests

headers = {"X-API-Key": "myapikey"}
payload = {
    "messages": [
        {"role": "system", "content": "you are a chatbot"},
        {"role": "user", "content": "how are you?"}
    ],
    "max_tokens": 500
}
r1 = requests.post(url='https://ai.nahcrof.com/v1/free/gemma_7b', json=payload, headers=headers)
value = r1.json()
try:
    print(value["response"])
except KeyError:
    # Error responses don't include a "response" key; print the full body.
    print(value)
TypeScript Example
import axios, { AxiosResponse } from 'axios';

const headers = { 'X-API-Key': 'myapikey' };
const payload = {
  messages: [
    { role: 'system', content: 'you are a chatbot' },
    { role: 'user', content: 'how are you?' }
  ],
  max_tokens: 500
};

axios.post('https://ai.nahcrof.com/v1/free/gemma_7b', payload, { headers })
  .then((response: AxiosResponse<{ response: string }>) => {
    if (response.data.response !== undefined) {
      console.log(response.data.response);
    } else {
      // Error responses don't include a "response" field; print the full body.
      console.log(response.data);
    }
  })
  .catch((error) => {
    console.error(error);
  });
average speed: 35 tokens/second
llama3-8b-instruct ~25 tokens/second
Python Example
import requests

headers = {"X-API-Key": "myapikey"}
payload = {
    "messages": [
        {"role": "system", "content": "you are a chatbot"},
        {"role": "user", "content": "how are you?"}
    ],
    "max_tokens": 500
}
r1 = requests.post(url='https://ai.nahcrof.com/v1/free/llama3', json=payload, headers=headers)
value = r1.json()
try:
    print(value["response"])
except KeyError:
    # Error responses don't include a "response" key; print the full body.
    print(value)
TypeScript Example
import axios, { AxiosResponse } from 'axios';

const headers = { 'X-API-Key': 'myapikey' };
const payload = {
  messages: [
    { role: 'system', content: 'you are a chatbot' },
    { role: 'user', content: 'how are you?' }
  ],
  max_tokens: 500
};

axios.post('https://ai.nahcrof.com/v1/free/llama3', payload, { headers })
  .then((response: AxiosResponse<{ response: string }>) => {
    if (response.data.response !== undefined) {
      console.log(response.data.response);
    } else {
      // Error responses don't include a "response" field; print the full body.
      console.log(response.data);
    }
  })
  .catch((error) => {
    console.error(error);
  });
average speed: 25 tokens/second
llama3.1-8b-instruct ~25 tokens/second
Python Example
import requests

headers = {"X-API-Key": "myapikey"}
payload = {
    "messages": [
        {"role": "system", "content": "you are a chatbot"},
        {"role": "user", "content": "how are you?"}
    ],
    "max_tokens": 500
}
r1 = requests.post(url='https://ai.nahcrof.com/v1/free/llama3.1-8b', json=payload, headers=headers)
value = r1.json()
try:
    print(value["response"])
except KeyError:
    # Error responses don't include a "response" key; print the full body.
    print(value)
TypeScript Example
import axios, { AxiosResponse } from 'axios';

const headers = { 'X-API-Key': 'myapikey' };
const payload = {
  messages: [
    { role: 'system', content: 'you are a chatbot' },
    { role: 'user', content: 'how are you?' }
  ],
  max_tokens: 500
};

axios.post('https://ai.nahcrof.com/v1/free/llama3.1-8b', payload, { headers })
  .then((response: AxiosResponse<{ response: string }>) => {
    if (response.data.response !== undefined) {
      console.log(response.data.response);
    } else {
      // Error responses don't include a "response" field; print the full body.
      console.log(response.data);
    }
  })
  .catch((error) => {
    console.error(error);
  });
average speed: 25 tokens/second
mistral-7b-instruct ~25 tokens/second
Python Example
import requests

headers = {"X-API-Key": "myapikey"}
payload = {
    "messages": [
        {"role": "system", "content": "you are a chatbot"},
        {"role": "user", "content": "how are you?"}
    ],
    "max_tokens": 500
}
r1 = requests.post(url='https://ai.nahcrof.com/v1/free/mistral-7b-instruct', json=payload, headers=headers)
value = r1.json()
try:
    print(value["response"])
except KeyError:
    # Error responses don't include a "response" key; print the full body.
    print(value)
TypeScript Example
import axios, { AxiosResponse } from 'axios';

const headers = { 'X-API-Key': 'myapikey' };
const payload = {
  messages: [
    { role: 'system', content: 'you are a chatbot' },
    { role: 'user', content: 'how are you?' }
  ],
  max_tokens: 500
};

axios.post('https://ai.nahcrof.com/v1/free/mistral-7b-instruct', payload, { headers })
  .then((response: AxiosResponse<{ response: string }>) => {
    if (response.data.response !== undefined) {
      console.log(response.data.response);
    } else {
      // Error responses don't include a "response" field; print the full body.
      console.log(response.data);
    }
  })
  .catch((error) => {
    console.error(error);
  });
average speed: 25 tokens/second
tinyllama (fast)
Python Example
import requests

headers = {"X-API-Key": "myapikey"}
payload = {
    "messages": [
        {"role": "system", "content": "you are a chatbot"},
        {"role": "user", "content": "how are you?"}
    ],
    "max_tokens": 500
}
r1 = requests.post(url='https://ai.nahcrof.com/v1/premium/tinyllama', json=payload, headers=headers)
value = r1.json()
try:
    print(value["response"])
except KeyError:
    # Error responses don't include a "response" key; print the full body.
    print(value)
TypeScript Example
import axios, { AxiosResponse } from 'axios';

const headers = { 'X-API-Key': 'myapikey' };
const payload = {
  messages: [
    { role: 'system', content: 'you are a chatbot' },
    { role: 'user', content: 'how are you?' }
  ],
  max_tokens: 500
};

axios.post('https://ai.nahcrof.com/v1/premium/tinyllama', payload, { headers })
  .then((response: AxiosResponse<{ response: string }>) => {
    if (response.data.response !== undefined) {
      console.log(response.data.response);
    } else {
      // Error responses don't include a "response" field; print the full body.
      console.log(response.data);
    }
  })
  .catch((error) => {
    console.error(error);
  });
average speed: 80 tokens/second
qwen1.5-0.5b-chat ~100 tokens/second
Python Example
import requests

headers = {"X-API-Key": "myapikey"}
payload = {
    "messages": [
        {"role": "system", "content": "you are a chatbot"},
        {"role": "user", "content": "how are you?"}
    ],
    "max_tokens": 500
}
r1 = requests.post(url='https://ai.nahcrof.com/v1/premium/qwen0.5', json=payload, headers=headers)
value = r1.json()
try:
    print(value["response"])
except KeyError:
    # Error responses don't include a "response" key; print the full body.
    print(value)
TypeScript Example
import axios, { AxiosResponse } from 'axios';

const headers = { 'X-API-Key': 'myapikey' };
const payload = {
  messages: [
    { role: 'system', content: 'you are a chatbot' },
    { role: 'user', content: 'how are you?' }
  ],
  max_tokens: 500
};

axios.post('https://ai.nahcrof.com/v1/premium/qwen0.5', payload, { headers })
  .then((response: AxiosResponse<{ response: string }>) => {
    if (response.data.response !== undefined) {
      console.log(response.data.response);
    } else {
      // Error responses don't include a "response" field; print the full body.
      console.log(response.data);
    }
  })
  .catch((error) => {
    console.error(error);
  });
average speed: 100 tokens/second
mistral-7b-instruct ~60 tokens/second
Python Example
import requests

headers = {"X-API-Key": "myapikey"}
payload = {
    "messages": [
        {"role": "system", "content": "you are a chatbot"},
        {"role": "user", "content": "how are you?"}
    ],
    "max_tokens": 500
}
r1 = requests.post(url='https://ai.nahcrof.com/v1/premium/mistral-7b-instruct', json=payload, headers=headers)
value = r1.json()
try:
    print(value["response"])
except KeyError:
    # Error responses don't include a "response" key; print the full body.
    print(value)
TypeScript Example
import axios, { AxiosResponse } from 'axios';

const headers = { 'X-API-Key': 'myapikey' };
const payload = {
  messages: [
    { role: 'system', content: 'you are a chatbot' },
    { role: 'user', content: 'how are you?' }
  ],
  max_tokens: 500
};

axios.post('https://ai.nahcrof.com/v1/premium/mistral-7b-instruct', payload, { headers })
  .then((response: AxiosResponse<{ response: string }>) => {
    if (response.data.response !== undefined) {
      console.log(response.data.response);
    } else {
      // Error responses don't include a "response" field; print the full body.
      console.log(response.data);
    }
  })
  .catch((error) => {
    console.error(error);
  });
average speed: 60 tokens/second
gemma-7b-it ~300 tokens/second
Python Example
import requests

headers = {"X-API-Key": "myapikey"}
payload = {
    "messages": [
        {"role": "system", "content": "you are a chatbot"},
        {"role": "user", "content": "how are you?"}
    ],
    "max_tokens": 500
}
r1 = requests.post(url='https://ai.nahcrof.com/v1/premium/gemma_7b', json=payload, headers=headers)
value = r1.json()
try:
    print(value["response"])
except KeyError:
    # Error responses don't include a "response" key; print the full body.
    print(value)
TypeScript Example
import axios, { AxiosResponse } from 'axios';

const headers = { 'X-API-Key': 'myapikey' };
const payload = {
  messages: [
    { role: 'system', content: 'you are a chatbot' },
    { role: 'user', content: 'how are you?' }
  ],
  max_tokens: 500
};

axios.post('https://ai.nahcrof.com/v1/premium/gemma_7b', payload, { headers })
  .then((response: AxiosResponse<{ response: string }>) => {
    if (response.data.response !== undefined) {
      console.log(response.data.response);
    } else {
      // Error responses don't include a "response" field; print the full body.
      console.log(response.data);
    }
  })
  .catch((error) => {
    console.error(error);
  });
average speed: 300 tokens/second
llama3-8b-instruct ~550 tokens/second
Python Example
import requests

headers = {"X-API-Key": "myapikey"}
payload = {
    "messages": [
        {"role": "system", "content": "you are a chatbot"},
        {"role": "user", "content": "how are you?"}
    ],
    "max_tokens": 500
}
r1 = requests.post(url='https://ai.nahcrof.com/v1/premium/llama3', json=payload, headers=headers)
value = r1.json()
try:
    print(value["response"])
except KeyError:
    # Error responses don't include a "response" key; print the full body.
    print(value)
TypeScript Example
import axios, { AxiosResponse } from 'axios';

const headers = { 'X-API-Key': 'myapikey' };
const payload = {
  messages: [
    { role: 'system', content: 'you are a chatbot' },
    { role: 'user', content: 'how are you?' }
  ],
  max_tokens: 500
};

axios.post('https://ai.nahcrof.com/v1/premium/llama3', payload, { headers })
  .then((response: AxiosResponse<{ response: string }>) => {
    if (response.data.response !== undefined) {
      console.log(response.data.response);
    } else {
      // Error responses don't include a "response" field; print the full body.
      console.log(response.data);
    }
  })
  .catch((error) => {
    console.error(error);
  });
average speed: 550 tokens/second
llama3.1-8b-instruct ~350 tokens/second
Python Example
import requests

headers = {"X-API-Key": "myapikey"}
payload = {
    "messages": [
        {"role": "system", "content": "you are a chatbot"},
        {"role": "user", "content": "how are you?"}
    ],
    "max_tokens": 500
}
r1 = requests.post(url='https://ai.nahcrof.com/v1/premium/llama3.1-8b', json=payload, headers=headers)
value = r1.json()
try:
    print(value["response"])
except KeyError:
    # Error responses don't include a "response" key; print the full body.
    print(value)
TypeScript Example
import axios, { AxiosResponse } from 'axios';

const headers = { 'X-API-Key': 'myapikey' };
const payload = {
  messages: [
    { role: 'system', content: 'you are a chatbot' },
    { role: 'user', content: 'how are you?' }
  ],
  max_tokens: 500
};

axios.post('https://ai.nahcrof.com/v1/premium/llama3.1-8b', payload, { headers })
  .then((response: AxiosResponse<{ response: string }>) => {
    if (response.data.response !== undefined) {
      console.log(response.data.response);
    } else {
      // Error responses don't include a "response" field; print the full body.
      console.log(response.data);
    }
  })
  .catch((error) => {
    console.error(error);
  });
average speed: 350 tokens/second
codestral-22b-instruct ~75 tokens/second
Python Example
import requests

headers = {"X-API-Key": "myapikey"}
payload = {
    "messages": [
        {"role": "system", "content": "you are a chatbot"},
        {"role": "user", "content": "how are you?"}
    ],
    "max_tokens": 500
}
r1 = requests.post(url='https://ai.nahcrof.com/v1/premium/codestral-22b-instruct', json=payload, headers=headers)
value = r1.json()
try:
    print(value["response"])
except KeyError:
    # Error responses don't include a "response" key; print the full body.
    print(value)
TypeScript Example
import axios, { AxiosResponse } from 'axios';

const headers = { 'X-API-Key': 'myapikey' };
const payload = {
  messages: [
    { role: 'system', content: 'you are a chatbot' },
    { role: 'user', content: 'how are you?' }
  ],
  max_tokens: 500
};

axios.post('https://ai.nahcrof.com/v1/premium/codestral-22b-instruct', payload, { headers })
  .then((response: AxiosResponse<{ response: string }>) => {
    if (response.data.response !== undefined) {
      console.log(response.data.response);
    } else {
      // Error responses don't include a "response" field; print the full body.
      console.log(response.data);
    }
  })
  .catch((error) => {
    console.error(error);
  });
average speed: 75 tokens/second
stable-diffusion-xl-base-1.0 (BETA)
Python Example
import ast

import requests

headers = {"X-API-Key": "myapikey"}
payload = {
    "prompt": "cat"
}
r1 = requests.post(url='https://ai.nahcrof.com/v1/premium/stable_diffusion', json=payload, headers=headers)
value = r1.json()
try:
    # The "response" field holds a Python-style bytes literal; parse it with
    # ast.literal_eval rather than eval, which could execute arbitrary code.
    with open("generated_image.png", "wb") as f:
        f.write(ast.literal_eval(value["response"]))
except KeyError:
    # Error responses don't include a "response" key; print the full body.
    print(value)
# Other languages will need to parse the bytes-literal format themselves.
llama3-70b-instruct ~140 tokens/second (business users only)
Python Example
import requests

headers = {"X-API-Key": "myapikey"}
payload = {
    "messages": [
        {"role": "system", "content": "you are a chatbot"},
        {"role": "user", "content": "how are you?"}
    ],
    "max_tokens": 500
}
r1 = requests.post(url='https://ai.nahcrof.com/v1/business/llama3_70b', json=payload, headers=headers)
value = r1.json()
try:
    print(value["response"])
except KeyError:
    # Error responses don't include a "response" key; print the full body.
    print(value)
TypeScript Example
import axios, { AxiosResponse } from 'axios';

const headers = { 'X-API-Key': 'myapikey' };
const payload = {
  messages: [
    { role: 'system', content: 'you are a chatbot' },
    { role: 'user', content: 'how are you?' }
  ],
  max_tokens: 500
};

axios.post('https://ai.nahcrof.com/v1/business/llama3_70b', payload, { headers })
  .then((response: AxiosResponse<{ response: string }>) => {
    if (response.data.response !== undefined) {
      console.log(response.data.response);
    } else {
      // Error responses don't include a "response" field; print the full body.
      console.log(response.data);
    }
  })
  .catch((error) => {
    console.error(error);
  });
average speed: 140 tokens/second
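All of the chat endpoints above accept the same request shape, so the per-model snippets can be collapsed into one small wrapper. This is a sketch, not part of the API: the helper names endpoint and chat are our own, and the 60-second timeout is an arbitrary choice.

```python
def endpoint(path: str) -> str:
    """Build the full URL for an endpoint path such as 'v1/free/llama3'."""
    return "https://ai.nahcrof.com/" + path.lstrip("/")

def chat(path: str, messages: list, api_key: str, max_tokens: int = 300) -> str:
    """POST a chat request and return the generated text."""
    # Imported here so endpoint() stays usable without requests installed.
    import requests

    r = requests.post(
        endpoint(path),
        json={"messages": messages, "max_tokens": max_tokens},
        headers={"X-API-Key": api_key},
        timeout=60,
    )
    r.raise_for_status()
    body = r.json()
    if "response" not in body:
        # Error bodies don't include a "response" key; surface them loudly.
        raise RuntimeError(f"unexpected response body: {body}")
    return body["response"]

# Usage (requires a valid key):
# print(chat("v1/free/llama3", [{"role": "user", "content": "how are you?"}], "myapikey"))
```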