January 30, 2024

How to Enrich Text Data Using OpenAI’s Chat API

By Bradley Nielsen

Humans easily read and understand text (like this post), whereas machines have long struggled to process natural language. Such textual data includes customer feedback forms, social media posts, support tickets, business documents, product descriptions, chat logs, and many other forms of written human language.

Until recent developments in the research community, traditional natural language processing focused on text classification, topic modeling, or simple sentiment analysis. With the advent of large language models, the sophistication with which machines can both parse and generate natural language has increased dramatically.

Large Language Models are the latest advancement in deep learning: models that learn from vast amounts of text data to understand, predict, and generate human-like language based on the input they receive. Some common business use cases for large language models are:

Summarization: remove extraneous details and extract key points from a block of text

Standardization: rewrite text to a consistent tone and style

Translation: convert from one language to another

Question answering: provide answers to customers' questions

Evaluation: analyze text to provide advice, ideas, or recommendations

Sentiment analysis: parse ambiguous language to extract the emotional tone

In this blog, we will show how engineers can incorporate large language models into these workflows. We will focus on ChatGPT, a large language model offering from OpenAI, though much of this also applies to other LLMs. Specifically, we'll show how to connect your code directly to ChatGPT using the Chat Completions API.

OpenAI’s Chat API

Most people's first impressions of ChatGPT come from its chat UI. While ChatGPT's elegant user experience has helped fuel its explosive growth, it is impractical for most software engineering tasks. For programmatic access, OpenAI provides a REST API and client libraries in multiple programming languages.

The OpenAI Chat API charges per token read and generated. Tokens are generally words, subwords, and punctuation; as a rule of thumb, there are approximately 100 tokens for every 75 words. More detail is available on OpenAI's website.

The API charges per 1,000 tokens, and up-to-date pricing can be found here. Note that input (tokens you send to the model) and output (tokens the model returns to you) are priced differently.
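If you want to estimate token counts yourself before making a call, OpenAI's open-source tiktoken tokenizer can encode text locally. A minimal sketch (assumes tiktoken is installed via pip install tiktoken):

import tiktoken

# Load the tokenizer that corresponds to the model family
encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")

text = "Hey woke up this morning to water hitting my face."
tokens = encoding.encode(text)
print(f"Words: {len(text.split())}, Tokens: {len(tokens)}")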

Summarization Tutorial using Chat Completions API

Prerequisites

To run this example, your computer will need Python 3.7+ installed. You will also need to create an account with OpenAI.

Additionally, you will need to install the OpenAI Python library using pip:

pip install --upgrade openai

Example Code

In this example, we will develop a Python application for a fictional property management company. The app will take emails from tenants regarding maintenance issues and create a short summary for their ticket management system.

from openai import OpenAI

# Insert your own API key
client = OpenAI(api_key="{API KEY}")

# Input text you want to analyze. Can be from databases, event streams, or API calls.
input_text = """Hey woke up this morning to water hitting my face.
We had a bad storm last night and I guess a few shingles blew off the roof.
I have a bucket under the leak and there is already an inch of water in it.
Luckily I caught it before it leaked too much.
I don't want a water bed, if you catch my drift.
"""

# The system message (instruction) sets the behavior of the assistant.
instruction = """Summarize the tenant's problem in a few words.
Output JSON in format: {"summary": "Short summary text"}"""

# Specify the model (LLM) we want to use
model = "gpt-3.5-turbo-0125"

# Call the API
completion = client.chat.completions.create(
    model=model,
    response_format={"type": "json_object"},
    seed=12345,
    max_tokens=50,
    messages=[
        {"role": "system", "content": instruction},
        {"role": "user", "content": input_text},
    ],
)

# Response
print(f"Response: {completion.choices[0].message.content}")
print(f"Finish reason: {completion.choices[0].finish_reason}")
print(f"Model: {completion.model}")
print(f"Prompt tokens: {completion.usage.prompt_tokens}")
print(f"Completion tokens: {completion.usage.completion_tokens}")
print(f"Total tokens: {completion.usage.total_tokens}")
print(f"Fingerprint: {completion.system_fingerprint}")
print("Request complete")

Let’s break down what the code is doing line by line.

# Insert your own API key
client = OpenAI(api_key="{API KEY}")

Your API keys can be generated in your account settings. Remember to follow best practices with your API keys.
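For instance, one widely used practice is to keep the key out of source code entirely. A sketch of that approach:

from openai import OpenAI

# Set OPENAI_API_KEY in your shell first, e.g. export OPENAI_API_KEY="sk-..."
# The client falls back to that environment variable when api_key is omitted.
client = OpenAI()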

# Input text you want to analyze. Can be from databases, event streams, or API calls.
input_text = """Hey woke up this morning to water hitting my face.
We had a bad storm last night and I guess a few shingles blew off the roof.
I have a bucket under the leak and there is already an inch of water in it.
Luckily I caught it before it leaked too much.
I don't want a water bed, if you catch my drift.
"""

For the tutorial, we are hard-coding our input. In a real application, we'd retrieve our input from a database table, event stream, or API call.
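As a hypothetical sketch of that pattern, the loop below pulls unprocessed tenant emails from a SQLite table (the maintenance.db file and the tenant_emails schema are invented for illustration):

import sqlite3

conn = sqlite3.connect("maintenance.db")
rows = conn.execute("SELECT id, body FROM tenant_emails WHERE summarized = 0")
for email_id, body in rows:
    input_text = body  # each email body becomes the user message for one API call
    print(email_id, input_text[:40])
conn.close()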

# The system message (instruction) sets the behavior of the assistant.
instruction = """Summarize the tenant's problem in a few words.
Output JSON in format: {"summary": "Short summary text"}"""

This instruction (prompt) tells ChatGPT how to handle the input.

# Specify the model (LLM) we want to use
model = "gpt-3.5-turbo-0125"

The API gives you fine-grained control over which model to use. This page details the available models. It is a best practice to pin a specific model version, since new versions can introduce breaking changes and should be tested before adoption. The 0125 snapshot is the most recent GPT-3.5 Turbo version as of this writing.

completion = client.chat.completions.create

This method is a wrapper around the underlying chat completions REST API.

response_format={"type": "json_object"},

JSON mode is a relatively new feature of the Chat API. It ensures that the returned result is syntactically valid JSON. Note that JSON mode does not guarantee that the response matches a specific JSON schema, only that it will parse without error.
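Because the schema is not guaranteed, it is still worth parsing the result and validating the shape you expect. A small sketch, reusing the completion object from the example above:

import json

payload = json.loads(completion.choices[0].message.content)  # guaranteed to parse
summary = payload.get("summary")
if summary is None:
    raise ValueError(f"Unexpected JSON shape from model: {payload}")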

seed=12345,

Seeds are used to make outputs repeatable: the same input should produce the same output. This is useful for reproducing errors and ensuring idempotent runs. The number used for the seed does not matter, so long as it stays consistent between calls. I recommend reading the documentation for this parameter, as determinism is best-effort and there are several caveats to its use.

max_tokens=50,

Setting a cap on the number of output tokens helps manage costs and acts as a simple form of quality control.
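One way to detect when the cap was hit, sketched below: finish_reason is "length" when the output was truncated and "stop" when the model finished naturally.

# Warn when the response was cut off by the max_tokens cap
if completion.choices[0].finish_reason == "length":
    print("Warning: response truncated; consider raising max_tokens")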

messages=[
    {"role": "system", "content": instruction},
    {"role": "user", "content": input_text},
],

The instruction and input are passed in via the messages parameter. The messages parameter orders input as a sequential conversation between a user role and an assistant role. The system role contains high-level instructions to the model. 

The user role carries the model input, and the assistant role carries the model's response. You can provide your own assistant responses if you want to give examples to the model. (See the few-shot learning section below.)

{
    "id": "chatcmpl-7LJv9I6UrcP1YPlA5JJA9ccKWkkU7",
    "choices": [{
        "finish_reason": "stop",
        "index": 0,
        "message": {
            "content": "{\"summary\": \"Roof leak due to storm, shingles blown off, water dripping into the house\"}",
            "role": "assistant",
            "function_call": null,
            "tool_calls": null}}
    ],
    "created": 1700092155,
    "model": "gpt-3.5-turbo-1106",
    "object": "chat.completion",
    "system_fingerprint": "fp_eeff13170a",
    "usage": {
        "completion_tokens": 22,
        "prompt_tokens": 108,
        "total_tokens": 130}
}

This is an example of a response from the API. Most of the attributes are self-explanatory.  

The "finish_reason" is the reason why the model stopped generating tokens. A detailed explanation of the main reasons can be found here.

"system_fingerprint" identifies the backend configuration that served the request; if it changes between calls, the seed parameter can no longer guarantee reproducible outputs.

By default, the API will return one message in the response. The API will return multiple messages if you set the "n" parameter in the request. This is useful in scenarios where the user can select their preferred response.
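A sketch of requesting three candidate summaries in one call, reusing the client, model, instruction, and input from the example above:

completion = client.chat.completions.create(
    model=model,
    response_format={"type": "json_object"},
    max_tokens=50,
    n=3,  # ask for three independent completions
    messages=[
        {"role": "system", "content": instruction},
        {"role": "user", "content": input_text},
    ],
)

# Each candidate arrives as its own entry in choices
for choice in completion.choices:
    print(choice.index, choice.message.content)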

Improvements

The example demonstrates how you can utilize OpenAI's powerful models with only a few lines of code. While the results are often good out of the box, there are several techniques and features you can use to enhance your application.

Prompt Engineering

How you word your prompts can greatly impact how the model responds. With current LLMs, this can be more art than science. phData provides a helpful guide on how to write better instructions for the model.

A better prompt for the above example would be:

The user will describe a problem reported by a tenant in an apartment building. You will concisely summarize the problem in JSON format. The summary should tell the repair person the core problem at a glance. Ignore non-maintenance-related comments. Return the summary in this JSON format: {"summary": "Short summary text"}

Longer prompts will generally produce better results at the expense of more tokens (remember all tokens are charged by the API).

Few-Shot Learning

Sometimes, the LLM will need help understanding your instructions. To assist it, you can provide a few examples along with your instructions. This is known as few-shot learning.

Here is what a request with few-shot examples might look like:

completion = client.chat.completions.create(
    model=model,
    response_format={"type": "json_object"},
    seed=12345,
    max_tokens=50,
    temperature=0.7,  # controls randomness; lower values are more deterministic
    messages=[
        {"role": "system", "content": instruction},
        {"role": "user", "content": "There is a puddle of water by my dishwasher."},
        {"role": "assistant", "content": """{"summary": "Dishwasher is leaking."}"""},
        {"role": "user", "content": "Not sure what is going on, but the temperature in my apartment is 95."},
        {"role": "assistant", "content": """{"summary": "Temperature too hot."}"""},
        {"role": "user", "content": "There is a mysterious brown spot on my ceiling. Is it leaking?"},
        {"role": "assistant", "content": """{"summary": "Possible roof leak."}"""},
        {"role": "user", "content": input_text},
    ],
)

You can see that we provided three example user/assistant exchanges to the model before appending the real user input. As with longer prompts, you get better results at the expense of more billed tokens.

Fine Tuning

If three examples can provide better responses, what about 200? 1000? Prepending 1000 examples to every prompt would not be practical. Instead, OpenAI offers a service known as fine-tuning. The result of fine-tuning is a custom model with your data embedded in it. The downside is that OpenAI charges a higher fee per token for using fine-tuned models.
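At a high level, fine-tuning means uploading a JSONL file of example conversations and starting a training job. A sketch of that flow (the training.jsonl file is a hypothetical example; each line holds one conversation in the same messages format used above):

# Upload the training data, then start the fine-tuning job
training_file = client.files.create(
    file=open("training.jsonl", "rb"),
    purpose="fine-tune",
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)  # poll the job until it completes, then call your custom model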

Functions

The newer ChatGPT models can invoke custom functions in your code. At a high level: 

  • Define a set of function descriptions that are passed to the model.

  • The model chooses which (if any) functions to invoke, along with a list of parameter values.

  • Your code will execute the functions and return the results as the “tool” role.

  • The model will then summarize the results.

This is advanced functionality and is still being refined by OpenAI. The documentation can be found here.
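Here is a sketch of the flow described above; the get_open_tickets function and its parameter schema are invented for illustration.

# Describe the function so the model knows when and how to call it
tools = [{
    "type": "function",
    "function": {
        "name": "get_open_tickets",
        "description": "Look up open maintenance tickets for an apartment unit",
        "parameters": {
            "type": "object",
            "properties": {"unit_number": {"type": "string"}},
            "required": ["unit_number"],
        },
    },
}]

completion = client.chat.completions.create(
    model=model,
    tools=tools,
    messages=[{"role": "user", "content": "Any open tickets for unit 4B?"}],
)

# If the model decided to call a function, the details arrive in tool_calls
tool_call = completion.choices[0].message.tool_calls[0]
print(tool_call.function.name)       # e.g., "get_open_tickets"
print(tool_call.function.arguments)  # JSON string of parameter values
# Your code then runs the real function and sends the result back in a
# message with role "tool" (and the matching tool_call_id) so the model
# can compose its final answer.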

Retrieval-Augmented Generation (RAG)

Put simply, RAG is the ability of a model to query a data store, retrieve a result, and then incorporate that result into its response.  For more information, see our RAG guide.
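A minimal sketch of that loop, assuming a tiny in-memory document store (the documents list is invented; a real system would use a vector database):

import numpy as np

documents = [
    "Lease section 8: tenants must report leaks within 24 hours.",
    "Lease section 12: quiet hours are 10pm to 7am.",
]

def embed(text):
    # OpenAI embeddings are unit-length vectors, so a dot product is cosine similarity
    response = client.embeddings.create(model="text-embedding-ada-002", input=text)
    return np.array(response.data[0].embedding)

question = "How quickly do I need to report a roof leak?"
doc_vectors = [embed(doc) for doc in documents]
query_vector = embed(question)

# Retrieve the most similar document and inject it as context
scores = [float(query_vector @ vec) for vec in doc_vectors]
context = documents[int(np.argmax(scores))]

completion = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "system", "content": f"Answer using this context: {context}"},
        {"role": "user", "content": question},
    ],
)
print(completion.choices[0].message.content)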

Conclusion

In this blog, we’ve demonstrated how easy it is to incorporate OpenAI’s models into your own code.  We also looked at several features and techniques that can be used to further extend this functionality.

These are exciting times for software and data engineers. Capabilities previously thought to be decades away are now available with a simple API call. The sudden appearance of these technologies has left many organizations scrambling to find the right strategy. 

phData is ready to assist.

We have the expertise to ensure your ChatGPT setup is optimized and powerful, driving your organization forward.
