Modelling: Text Generation
On this page, we provide an overview of text generation, present case studies illustrating real-world applications of text generation in the banking industry, summarise the literature review conducted, and describe our implementation in the project.
Text generation is a process where an Artificial Intelligence (AI) system produces written content that closely mimics natural human language. The goal is to generate coherent, meaningful, and contextually appropriate text based on input prompts or conditions. This technology is essential across many fields, including natural language processing, content creation, and customer service (Awan, 2023).
Text generation relies on advanced language models, such as GPT (Generative Pre-trained Transformer) and Google's PaLM, trained on extensive text datasets. These models use deep learning techniques and neural networks to understand sentence structure and predict the most probable words or phrases based on input prompts.
During the generation process, the AI model takes a seed input (e.g., a sentence or keyword) and uses its learned knowledge to produce new text. The process continues until a desired length or stopping condition is reached.
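The loop described above can be sketched in plain Python. The `predict_next` function here is a toy stand-in for a real language model; in practice it would be a neural network returning the most probable next token:

```python
def generate(predict_next, seed_tokens, max_length=10, stop_token="<eos>"):
    """Autoregressive generation: repeatedly append the model's most
    probable next token until a stopping condition is reached."""
    tokens = list(seed_tokens)
    while len(tokens) < max_length:          # desired-length condition
        next_token = predict_next(tokens)    # model predicts the continuation
        if next_token == stop_token:         # model-signalled stopping condition
            break
        tokens.append(next_token)
    return tokens

def toy_model(tokens):
    """Stand-in 'model': a fixed lookup from the last token to the next one."""
    continuation = {"quick": "brown", "brown": "fox", "fox": "<eos>"}
    return continuation.get(tokens[-1], "<eos>")

print(generate(toy_model, ["the", "quick"]))  # ['the', 'quick', 'brown', 'fox']
```

A real model replaces the lookup table with a learned probability distribution over its vocabulary, but the surrounding loop is the same.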
Text generation offers several key benefits:
- Efficiency: Automates content creation, reducing manual writing time and effort.
- Personalisation: Fine-tuned models generate personalised content based on user preferences and historical data, enabling tailored recommendations and customised responses.
- Language Accessibility: Enhances translation services and speech synthesis, making information accessible to individuals with reading or language difficulties.
Text generation is employed in numerous practical applications, including:
- Content Creation: AI can quickly generate articles, blog posts, and product descriptions, producing quality content at scale.
- Chatbots and Virtual Assistants: AI-driven chatbots use text generation for conversational interactions, offering personalised assistance and information.
- Language Translation: Models improve translation services by generating accurate translations in real time.
- Summarisation: AI can generate concise summaries of research papers, news articles, or books, identifying key points for quick insights.
Text generation technology is transforming the traditional banking experience by enhancing customer service and improving data-driven decision-making (Admin & Surovtseva, 2024). Some key use cases demonstrating the potential value of text generation in banking include:
- Advanced Chatbots for 24/7 Support
AI-powered chatbots can provide round-the-clock assistance, reducing customer wait times and enabling instant support. These chatbots offer guidance on tasks such as recommending financial services, checking account balances, and completing transactions.
For example, Wells Fargo's generative AI virtual assistant, Fargo, launched in March 2023, has managed 20 million interactions and is projected to handle 100 million interactions annually. The assistant leverages Google's PaLM 2 large language model (LLM) to answer everyday banking queries and perform tasks such as analysing spending patterns, checking credit scores, paying bills, and providing transaction details.
- Personalised Financial Advice
Text generation can aid frontline staff by providing client-specific suggestions based on real-time customer interactions and data. This enables financial advisors to offer tailored experiences and recommendations to customers.
For instance, Morgan Stanley has introduced an AI assistant based on OpenAI's GPT-4, giving its 16,000 financial advisors instant access to approximately 100,000 research reports and documents. This AI assistant helps advisors quickly find and synthesise answers to finance and investment queries, enabling highly personalised instant insights.
- Market and Investment Strategy Analysis
Text generation can assist in analysing market trends, economic indicators, and investment opportunities, generating personalised investment recommendations. It can also simulate and test different market scenarios to inform trading strategies.
While interest in applying generative AI across these functions is increasing, banks are still exploring its potential for generating market and investment strategies. According to Jason Napier, Head of European Banks Research at UBS, the full potential of AI in banking is still nascent, with more significant deployments expected in the future.
- Fraud Detection and Risk Assessment
Generative AI can identify anomalies indicating potential fraud, such as unusual spending patterns, and create synthetic scenarios to train fraud detection models. By analysing historical data patterns, it can help banks assess and mitigate risks.
For example, Mastercard recently launched a generative AI model to help banks detect suspicious transactions more effectively on its network. The technology can improve banks' fraud detection rates by 20%, with potential increases of up to 300% in some cases. The model leverages the 125 billion transactions that pass through Mastercard's card network annually as training data.
The literature review aims to assess different text generation models and identify the one best suited to producing insights, comparisons, and suggestions quickly and efficiently.
Three text generation models were considered for this assessment: GPT-2, Gemma-2b-it, and Mixtral-8x7B-Instruct-v0.1. The evaluation focused on their output coherence and inference time.
- GPT-2: Developed by OpenAI, GPT-2 generates human-like text based on prompts. Its versatility allows it to produce text across a variety of genres, making it useful for creative writing, chatbots, and data augmentation. Pre-training on a large corpus of internet text gives GPT-2 a strong grasp of grammar and semantics. However, it may also generate biased or inaccurate text due to its training data, and controlling the output may require careful tuning of parameters such as maximum length (Singh, 2023).
- Gemma-2b-it: Gemma-2b-it is a lightweight, decoder-only model from Google's Gemma family. It performs well in tasks such as question answering, summarisation, and reasoning. This model's compact size and versatility make it practical for deployment in various environments and fine-tuning for specific use cases. Like other language models, Gemma-2b-it may have limitations such as biases and challenges in handling nuanced language or providing complete factual accuracy (HuggingFace, 2024).
- Mixtral-8x7B-Instruct-v0.1: A large language model built on a Sparse Mixture of Experts (SMoE) architecture, which excels at generating high-quality text across a variety of tasks. It has demonstrated superior performance compared to many predecessors such as Llama-2-70B. However, Mixtral lacks built-in moderation mechanisms, which can pose challenges in controlling output for sensitive or inappropriate content (Unreal Speech, 2024).
To assess the models' ability to recognise trends and generate logical outputs, a specific prompt was used:
"You are an analyst from GXS Bank. Help me describe what you see in terms of trend with this JSON format: {"Positive Insights": , "Negative Insights": , "Topic Insights": }. Do not put extra words like 'Based on...'. Output STRICTLY in JSON ONLY. Do not talk about null data. The following is the overall data acquired from our banking application: {'Jan 2024': 4, 'Feb 2024': 3.6, 'Mar 2024': 3.2, 'Apr 2024': 2, 'May 2024': 3.8, 'Jun 2024': 4.5, 'July 2024': 4.5, 'Aug 2024': 4.2, 'Sep 2024': 3, 'Oct 2024': 4, 'Nov 2024': 3.7, 'Dec 2024': 4.1}."
Details of the output are here.
- Output Coherence: Mixtral-8x7B-Instruct-v0.1 provided the most coherent response, accurately identifying trends and generating output in the requested JSON format. GPT-2 simply repeated the prompt without meaningful trend recognition, while Gemma-2b-it struggled to discern numerical trends.
- Inference Time: Mixtral-8x7B-Instruct-v0.1 was the fastest, taking 14 seconds thanks to its cloud-based inference. GPT-2, executed locally, was also relatively fast at 30 seconds. Gemma-2b-it took significantly longer (16 minutes and 32 seconds), likely due to the lack of GPU support.
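Inference times like these can be measured with a simple wall-clock timer. The sketch below is one way to do it; the exact timing method used in the comparison is not documented, and the `generate_fn` here is a stand-in for a model call:

```python
import time

def timed_generate(generate_fn, prompt):
    """Run a model's generate function and report wall-clock inference time."""
    start = time.perf_counter()
    output = generate_fn(prompt)
    elapsed = time.perf_counter() - start
    return output, elapsed

# Example with a stand-in generate function:
output, seconds = timed_generate(lambda p: p.upper(), "describe the trend")
print(f"inference took {seconds:.2f}s")
```

For cloud-hosted models, the measured time includes network latency as well as generation time, so local and remote figures are not directly comparable.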
Thus, Mixtral-8x7B-Instruct-v0.1 emerged as the best choice for our product, offering a good balance of performance and speed through its cloud-based inference capabilities.
The H2OGPTE client is utilised within the project for text generation (details here). We first initialise the H2OGPTE class with our API key:
self.client = H2OGPTE(address='https://h2ogpte.genai.h2o.ai', api_key=api_key)
We then input our prompt to generate insights from the text generation model:
session_id = self.client.create_chat_session()
with self.client.connect(session_id) as session:
    response = session.query(prompt, timeout=70, rag_config={"rag_type": "llm_only", "llm": ..., "max_new_tokens": ...})
Adjust the configuration to suit your needs. In our case, we used the Mixtral LLM with 1024 max new tokens to control the output length.
The results are then compiled and used within our data pipeline.
Based on the type of query we want, we curate different prompts for the model (details here).
self.insights_output = "{\"Positive Insights\": <Paragraph>, \"Negative Insights\": <Paragraph>, \"Topic Insights\": <Paragraph>}. "
self.comparison_output = "{\"Better Topics\": {<Topic>: <Paragraph on why>, <Topic>: <Paragraph on why>, ...}, \"Worse Topics\": {<Topic>: <Paragraph on why>, <Topic>: <Paragraph on why>, ...}}. "
self.suggestions_output = "{<Topic>: <Suggestion>, <Topic>: <Suggestion>,...}. "
self.main_data_prompt = lambda data: f"The following is the overall data acquired from our banking application: {data}."
self.topic_data_prompt = lambda bank, topic, data: f"This is the ratings for the {topic} of the {bank} application: {data}."
self.insights_prompt = lambda format: f"You are an analyst from GXS Bank. Help me describe what you see in terms of trend with this JSON format: {format}"
self.comparison_prompt = lambda format: f"You are an analyst from GXS Bank. Help me compare performance of topics with the other bank using this JSON format: {format}"
self.suggestions_prompt = lambda format: f"You are an analyst from GXS Bank. Based on the poor/negative topics picked up, suggest and recommend solutions to them using this JSON format: {format}"
self.rules_prompt = "Do not put extra words like 'Based on...'. Output STRICTLY in JSON ONLY. Do not talk about null data."
By mixing and matching these components based on the query type (Insights, Comparisons, Suggestions), we maintain a fixed prompt structure for the model.
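Assembled this way, a full insights prompt for the overall data looks like the example prompt shown in the literature review above. A sketch using plain variables in place of the class attributes (the concatenation order follows that example):

```python
# Stand-ins for the class attributes defined above
insights_output = ('{"Positive Insights": <Paragraph>, "Negative Insights": <Paragraph>, '
                   '"Topic Insights": <Paragraph>}. ')
main_data_prompt = lambda data: f"The following is the overall data acquired from our banking application: {data}."
insights_prompt = lambda fmt: f"You are an analyst from GXS Bank. Help me describe what you see in terms of trend with this JSON format: {fmt}"
rules_prompt = "Do not put extra words like 'Based on...'. Output STRICTLY in JSON ONLY. Do not talk about null data."

ratings = {"Jan 2024": 4, "Feb 2024": 3.6, "Mar 2024": 3.2}
# Task and output format first, then the rules, then the data itself
prompt = f"{insights_prompt(insights_output)}{rules_prompt} {main_data_prompt(ratings)}"
print(prompt)
```

Swapping `insights_prompt`/`insights_output` for the comparison or suggestions variants yields the other two query types with the same rules and data components.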
Admin, & Surovtseva, A. (2024, March 15). Generative AI in Banking — Use Cases & Challenges — ITRex. ITRex.
Awan, A. A. (2023, May 24). What is Text Generation? DataCamp.
Hugging Face. (2024). google/gemma-2b-it.
Singh, A. (2023, September 29). How to Explore Text Generation with GPT-2? Analytics Vidhya.
Unreal Speech. (2024, January 15). Mixtral-8x7B-Instruct: Comprehensive Guide.