OpenLLM#
- pydantic model llama_index.llms.openllm.OpenLLM#
OpenLLM LLM.
Runs an OpenLLM-supported model locally (see OpenLLMAPI below for connecting to a remote server); a usage sketch follows this class reference.
Show JSON schema
{ "title": "OpenLLM", "description": "OpenLLM LLM.", "type": "object", "properties": { "callback_manager": { "title": "Callback Manager" }, "system_prompt": { "title": "System Prompt", "description": "System prompt for LLM calls.", "type": "string" }, "messages_to_prompt": { "title": "Messages To Prompt" }, "completion_to_prompt": { "title": "Completion To Prompt" }, "output_parser": { "title": "Output Parser" }, "pydantic_program_mode": { "default": "default", "allOf": [ { "$ref": "#/definitions/PydanticProgramMode" } ] }, "query_wrapper_prompt": { "title": "Query Wrapper Prompt" }, "model_id": { "title": "Model Id", "description": "Given Model ID from HuggingFace Hub. This can be either a pretrained ID or local path. This is synonymous to HuggingFace's '.from_pretrained' first argument", "type": "string" }, "model_version": { "title": "Model Version", "description": "Optional model version to save the model as.", "type": "string" }, "model_tag": { "title": "Model Tag", "description": "Optional tag to save to BentoML store.", "type": "string" }, "prompt_template": { "title": "Prompt Template", "description": "Optional prompt template to pass for this LLM.", "type": "string" }, "backend": { "title": "Backend", "description": "Optional backend to pass for this LLM. By default, it will use vLLM if vLLM is available in local system. Otherwise, it will fallback to PyTorch.", "enum": [ "vllm", "pt" ], "type": "string" }, "quantize": { "title": "Quantize", "description": "Optional quantization methods to use with this LLM. See OpenLLM's --quantize options from `openllm start` for more information.", "enum": [ "awq", "gptq", "int8", "int4", "squeezellm" ], "type": "string" }, "serialization": { "title": "Serialization", "description": "Optional serialization methods for this LLM to be save as. Default to 'safetensors', but will fallback to PyTorch pickle `.bin` on some models.", "enum": [ "safetensors", "legacy" ], "type": "string" }, "trust_remote_code": { "title": "Trust Remote Code", "description": "Optional flag to trust remote code. This is synonymous to Transformers' `trust_remote_code`. Default to False.", "type": "boolean" }, "class_name": { "title": "Class Name", "type": "string", "default": "OpenLLM" } }, "required": [ "model_id", "serialization", "trust_remote_code" ], "definitions": { "PydanticProgramMode": { "title": "PydanticProgramMode", "description": "Pydantic program mode.", "enum": [ "default", "openai", "llm", "guidance", "lm-format-enforcer" ], "type": "string" } } }
- Config
arbitrary_types_allowed: bool = True
- Fields
- Validators
_validate_callback_manager
»callback_manager
set_completion_to_prompt
»completion_to_prompt
set_messages_to_prompt
»messages_to_prompt
- field backend: Optional[Literal['vllm', 'pt']] = None#
Optional backend to use for this LLM. By default, vLLM is used if it is available on the local system; otherwise, it falls back to PyTorch.
- field model_id: str [Required]#
Model ID from the HuggingFace Hub. This can be either a pretrained ID or a local path, and is synonymous with the first argument of HuggingFace's '.from_pretrained'.
- field model_tag: Optional[str] = None#
Optional tag to save to the BentoML store.
- field model_version: Optional[str] = None#
Optional model version to save the model as.
- field prompt_template: Optional[str] = None#
Optional prompt template to use with this LLM.
- field quantize: Optional[Literal['awq', 'gptq', 'int8', 'int4', 'squeezellm']] = None#
Optional quantization method to use with this LLM. See the --quantize option of openllm start for more information.
- field serialization: Literal['safetensors', 'legacy'] [Required]#
Serialization format to save this LLM as. Defaults to 'safetensors', but falls back to PyTorch pickle .bin for some models.
- field trust_remote_code: bool [Required]#
Optional flag to trust remote code. This is synonymous with Transformers' trust_remote_code. Defaults to False.
- async achat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#
Async chat endpoint for LLM.
- async acomplete(*args: Any, **kwargs: Any) → Any#
Async completion endpoint for LLM.
- astream_chat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#
Async streaming chat endpoint for LLM.
- astream_complete(*args: Any, **kwargs: Any) → Any#
Async streaming completion endpoint for LLM.
- chat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#
Chat endpoint for LLM.
- classmethod class_name() → str#
Get the class name, used as a unique ID in serialization.
This provides a key that makes serialization robust against actual class name changes.
- complete(*args: Any, **kwargs: Any) → Any#
Completion endpoint for LLM.
- stream_chat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#
Streaming chat endpoint for LLM.
- stream_complete(*args: Any, **kwargs: Any) → Any#
Streaming completion endpoint for LLM.
- property metadata: LLMMetadata#
LLM metadata.
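Example (a minimal local-inference sketch, not part of the generated reference above): it assumes the `openllm` package is installed alongside `llama-index`, and the model ID, prompts, and settings shown are illustrative only. The `ChatMessage` import path may differ between llama-index versions.

```python
from llama_index.llms.openllm import OpenLLM
from llama_index.llms import ChatMessage  # import path may differ by llama-index version

# Load a model in-process. The model ID is illustrative; any OpenLLM-supported
# HuggingFace ID or local path should work.
llm = OpenLLM(
    model_id="HuggingFaceH4/zephyr-7b-beta",
    # backend="vllm",   # optional; by default vLLM is chosen if available, else PyTorch ("pt")
    # quantize="int8",  # optional; see the --quantize option of `openllm start`
    trust_remote_code=False,
)

# Text completion (returns a CompletionResponse; printing shows its text)
print(llm.complete("Explain retrieval-augmented generation in one sentence."))

# Streaming completion
for chunk in llm.stream_complete("List three uses of vector embeddings:"):
    print(chunk.delta, end="", flush=True)

# Chat-style usage
messages = [
    ChatMessage(role="system", content="You are a concise assistant."),
    ChatMessage(role="user", content="What is OpenLLM?"),
]
print(llm.chat(messages))
```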
- pydantic model llama_index.llms.openllm.OpenLLMAPI#
OpenLLM Client interface. This is useful when interacting with a remote OpenLLM server; a usage sketch follows this class reference.
Show JSON schema
{ "title": "OpenLLMAPI", "description": "OpenLLM Client interface. This is useful when interacting with a remote OpenLLM server.", "type": "object", "properties": { "callback_manager": { "title": "Callback Manager" }, "system_prompt": { "title": "System Prompt", "description": "System prompt for LLM calls.", "type": "string" }, "messages_to_prompt": { "title": "Messages To Prompt" }, "completion_to_prompt": { "title": "Completion To Prompt" }, "output_parser": { "title": "Output Parser" }, "pydantic_program_mode": { "default": "default", "allOf": [ { "$ref": "#/definitions/PydanticProgramMode" } ] }, "query_wrapper_prompt": { "title": "Query Wrapper Prompt" }, "address": { "title": "Address", "description": "OpenLLM server address. This could either be set here or via OPENLLM_ENDPOINT", "type": "string" }, "timeout": { "title": "Timeout", "description": "Timeout for sending requests.", "type": "integer" }, "max_retries": { "title": "Max Retries", "description": "Maximum number of retries.", "type": "integer" }, "api_version": { "title": "Api Version", "description": "OpenLLM Server API version.", "enum": [ "v1" ], "type": "string" }, "class_name": { "title": "Class Name", "type": "string", "default": "OpenLLM_Client" } }, "required": [ "timeout", "max_retries", "api_version" ], "definitions": { "PydanticProgramMode": { "title": "PydanticProgramMode", "description": "Pydantic program mode.", "enum": [ "default", "openai", "llm", "guidance", "lm-format-enforcer" ], "type": "string" } } }
- Config
arbitrary_types_allowed: bool = True
- Fields
- Validators
_validate_callback_manager
»callback_manager
set_completion_to_prompt
»completion_to_prompt
set_messages_to_prompt
»messages_to_prompt
- field address: Optional[str] = None#
OpenLLM server address. This can be set here or via the OPENLLM_ENDPOINT environment variable.
- field api_version: Literal['v1'] [Required]#
OpenLLM Server API version.
- field max_retries: int [Required]#
Maximum number of retries.
- field timeout: int [Required]#
Timeout for sending requests.
- async achat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#
Async chat endpoint for LLM.
- async acomplete(*args: Any, **kwargs: Any) → Any#
Async completion endpoint for LLM.
- astream_chat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#
Async streaming chat endpoint for LLM.
- astream_complete(*args: Any, **kwargs: Any) → Any#
Async streaming completion endpoint for LLM.
- chat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#
Chat endpoint for LLM.
- classmethod class_name() → str#
Get the class name, used as a unique ID in serialization.
This provides a key that makes serialization robust against actual class name changes.
- complete(*args: Any, **kwargs: Any) → Any#
Completion endpoint for LLM.
- stream_chat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#
Streaming chat endpoint for LLM.
- stream_complete(*args: Any, **kwargs: Any) → Any#
Streaming completion endpoint for LLM.
- property metadata: LLMMetadata#
LLM metadata.
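Example (a minimal remote-client sketch, not part of the generated reference above): it assumes an OpenLLM server is already running and reachable, for instance one launched with `openllm start <model>`; the address and prompts are illustrative, and the OPENLLM_ENDPOINT environment variable can be used instead of passing `address`.

```python
import asyncio

from llama_index.llms.openllm import OpenLLMAPI
from llama_index.llms import ChatMessage  # import path may differ by llama-index version

# Point the client at a running OpenLLM server; if `address` is omitted,
# the OPENLLM_ENDPOINT environment variable is used instead.
remote_llm = OpenLLMAPI(address="http://localhost:3000")

# Synchronous chat against the remote server
messages = [ChatMessage(role="user", content="Summarize what OpenLLM does.")]
print(remote_llm.chat(messages))

# Async streaming chat
async def stream_reply() -> None:
    gen = await remote_llm.astream_chat(messages)
    async for chunk in gen:
        print(chunk.delta, end="", flush=True)

asyncio.run(stream_reply())
```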