TiDB Graph Store¶
In [ ]:
%pip install llama-index-llms-openai
%pip install llama-index-graph-stores-tidb
%pip install llama-index-embeddings-openai
%pip install llama-index-llms-azure-openai
In [ ]:
# For OpenAI
import logging
import os
import sys

from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

os.environ["OPENAI_API_KEY"] = "sk-xxxxxxx"

logging.basicConfig(stream=sys.stdout, level=logging.INFO)

# define LLM
llm = OpenAI(temperature=0, model="gpt-3.5-turbo")
Settings.llm = llm
Settings.chunk_size = 512
In [ ]:
# For Azure OpenAI
import logging
import os
import sys

import openai
from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.azure_openai import AzureOpenAI

logging.basicConfig(
    stream=sys.stdout, level=logging.INFO
)  # logging.DEBUG for more verbose output
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

# NOTE: this cell uses the legacy openai<1.0 module-level configuration
openai.api_type = "azure"
openai.api_base = "https://<foo-bar>.openai.azure.com"
openai.api_version = "2022-12-01"
os.environ["OPENAI_API_KEY"] = "<your-openai-key>"
openai.api_key = os.getenv("OPENAI_API_KEY")

llm = AzureOpenAI(
    deployment_name="<foo-bar-deployment>",
    temperature=0,
    openai_api_version=openai.api_version,
    model_kwargs={
        "api_key": openai.api_key,
        "api_base": openai.api_base,
        "api_type": openai.api_type,
        "api_version": openai.api_version,
    },
)

# You need to deploy your own embedding model as well as your own chat completion model
embedding_llm = OpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="<foo-bar-deployment>",
    api_key=openai.api_key,
    api_base=openai.api_base,
    api_type=openai.api_type,
    api_version=openai.api_version,
)

Settings.llm = llm
Settings.embed_model = embedding_llm
Settings.chunk_size = 512
Using Knowledge Graph with TiDB¶
Prepare a TiDB cluster¶
- TiDB Cloud [Recommended], a fully managed TiDB service that frees you from the complexity of database operations.
- TiUP, use `tiup playground` to create a local TiDB cluster for testing.
Get TiDB connection string¶
TiDBGraphStore uses pymysql as the database driver, so the connection string must use the mysql+pymysql:// scheme, for example: mysql+pymysql://user:password@host:4000/dbname.
If you are using a TiDB Cloud serverless cluster with a public endpoint, a TLS connection is required, so the connection string should look like mysql+pymysql://user:password@host:4000/dbname?ssl_verify_cert=true&ssl_verify_identity=true.
Replace user, password, host, and dbname with your own values.
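If your password contains characters that are special in URLs (such as @ or /), it is safer to URL-encode the credentials when assembling the connection string. A minimal sketch using only the standard library; all credential values are placeholders:
In [ ]:
from urllib.parse import quote_plus

# Placeholder credentials -- replace with your own TiDB values
user = "user"
password = quote_plus("p@ssword!")  # URL-encode special characters
host = "host"
dbname = "dbname"

# The TLS query parameters are needed for TiDB Cloud serverless public endpoints
connection_string = (
    f"mysql+pymysql://{user}:{password}@{host}:4000/{dbname}"
    "?ssl_verify_cert=true&ssl_verify_identity=true"
)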
Initialize TiDBGraphStore¶
In [ ]:
from llama_index.graph_stores.tidb import TiDBGraphStore
graph_store = TiDBGraphStore(
db_connection_string="mysql+pymysql://user:password@host:4000/dbname"
)
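The store can also be exercised directly, independent of any index. upsert_triplet and get are part of LlamaIndex's common GraphStore interface, so a quick connectivity check might look like the sketch below; the triplet values are made up for illustration:
In [ ]:
# Write one triplet directly and read it back -- a quick smoke test
graph_store.upsert_triplet("Interleaf", "made", "software")
print(graph_store.get("Interleaf"))  # e.g. [["made", "software"]]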
Instantiate TiDB KG Indexes¶
In [ ]:
from llama_index.core import (
KnowledgeGraphIndex,
SimpleDirectoryReader,
StorageContext,
)
documents = SimpleDirectoryReader(
"../../../examples/data/paul_graham/"
).load_data()
In [ ]:
storage_context = StorageContext.from_defaults(graph_store=graph_store)
# NOTE: can take a while!
index = KnowledgeGraphIndex.from_documents(
documents=documents,
storage_context=storage_context,
max_triplets_per_chunk=2,
)
Querying the Knowledge Graph¶
In [ ]:
query_engine = index.as_query_engine(
include_text=False, response_mode="tree_summarize"
)
response = query_engine.query(
"Tell me more about Interleaf",
)
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
WARNING:llama_index.core.indices.knowledge_graph.retrievers:Index was not constructed with embeddings, skipping embedding usage...
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
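The WARNING above is expected here: the index was built without triplet embeddings, so embedding-based retrieval is skipped. If you want hybrid (keyword plus embedding) retrieval over the graph, one option is to rebuild the index with include_embeddings=True; a minimal sketch, where the similarity_top_k value is an arbitrary choice for illustration:
In [ ]:
# Build the index with triplet embeddings so hybrid retrieval can use them
index = KnowledgeGraphIndex.from_documents(
    documents=documents,
    storage_context=storage_context,
    max_triplets_per_chunk=2,
    include_embeddings=True,
)

query_engine = index.as_query_engine(
    include_text=False,
    response_mode="tree_summarize",
    embedding_mode="hybrid",  # combine keyword and embedding retrieval
    similarity_top_k=5,  # arbitrary value for illustration
)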
In [ ]:
from IPython.display import Markdown, display
display(Markdown(f"<b>{response}</b>"))
Interleaf was a software company that developed a scripting language and was known for its software products. It was inspired by Emacs and faced challenges due to Moore's law. Over time, Interleaf's prominence declined.
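Optionally, you can inspect the extracted graph visually. get_networkx_graph is a method on KnowledgeGraphIndex; the sketch below assumes the optional networkx and pyvis packages are installed, and the output filename is arbitrary:
In [ ]:
from pyvis.network import Network

# Pull the stored triplets into a NetworkX graph and render it with pyvis
g = index.get_networkx_graph()
net = Network(notebook=True, cdn_resources="in_line", directed=True)
net.from_nx(g)
net.show("tidb_graph.html")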