Prompt:
Any input that we provide to the generative model to produce the required output is called a prompt. It is often a series of instructions.
The general structure of the prompt:
The image indicates the difference between naive prompting on the left and a somewhat improved prompt on the right.
Though simple prompting can usually get you answers, it is always better to use a well-structured prompt to get the most accurate and efficient results.
In a well-structured prompt, we usually have these 4 elements:
Instruction:
It tells the model what we really want it to do.
Context:
It gives the background of the situation described in the instruction, or provides the proper direction for the generative model.
Input data:
This is any additional statistics or information that we have regarding our instruction. This input data can sometimes act as informational context, or it can serve an analytical purpose, as in the example below.
Output indicator:
It can be part of the instruction or it can be given separately. It indicates the form in which we want the output: code, text, an image, etc. Apart from that, we can also specify how precise we want the output to be, as below.
Though this is the general structure of an effective prompt, we can add or remove elements based on our needs and our level of satisfaction with the generated output.
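As a rough illustration of how these four elements can come together (the wording below is a made-up example, not a prescribed template), a structured prompt might be assembled like this:

```python
# Illustrative only: the four elements of a structured prompt, assembled into one string.
instruction = "Summarize the customer review below in two sentences."
context = "The summary will be read by the product team, so focus on actionable feedback."
input_data = "Review: 'Setup took an hour, and the manual skips the calibration step entirely.'"
output_indicator = "Return the summary as two plain-text bullet points."

prompt = f"{instruction}\n{context}\n{input_data}\n{output_indicator}"
print(prompt)
```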
Prompt Engineering:
As discussed above, it is often important to structure the prompt effectively to get accurate results. The process of designing effective prompts to attain appropriate results is called prompt engineering.
Before prompt engineering:
After prompt engineering:
Prompt engineering is often a step-by-step process:
- Determine the goal of the project and put that goal into the structure of an instruction.
- Once the initial goal is defined and given the form of an instruction, feed it to the generative model and determine which of the factors you require are satisfied and which are not.
- Based on this analysis, refine and restructure the prompt as below.
After going through this rigorous cycle of refining prompts, the optimal structure should look like the one below.
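To make the cycle concrete, here is a hypothetical before/after pair for an imagined product-announcement task:

```python
# A vague first attempt versus one possible refined version after a few iterations.
initial_prompt = "Write something about our new app."

refined_prompt = (
    "Write a 100-word product announcement for our new expense-tracking app. "   # instruction
    "The audience is small-business owners who currently use spreadsheets. "     # context
    "Mention the receipt-scanning feature. "                                     # input data
    "Use a friendly tone and end with a call to action."                         # output indicator
)
```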
There are four major aspects/practices:
The first is clarity. It is always important to keep the prompt clear, and to do so we follow the steps mentioned below:
Example:
Context is another important aspect, as discussed above.
Example:
Precision helps tailor the prompt to our specific target need.
Example:
Precision and context may look the same, but each has its own way of optimizing the prompt. Context provides the direction and background for our goal or instruction, whereas precision concentrates on how clearly we state our need and how well we have fine-tuned the context to that need.
One more effective approach is to use role-play as a strategy:
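For example, with a chat-style API the role can be assigned in a system message. The snippet below is only a sketch using the OpenAI Python client; the persona and model name are illustrative choices:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Role-play: a system message assigns the model a persona before the actual task.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model choice
    messages=[
        {"role": "system",
         "content": "You are a senior Python interviewer. Ask probing follow-up "
                    "questions and point out weaknesses in the candidate's answers."},
        {"role": "user",
         "content": "Candidate answer: 'A list and a tuple are basically the same thing.'"},
    ],
)
print(response.choices[0].message.content)
```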
These are some of the prompt engineering tools that are available in the market:
- IBM watsonx
- Prompt Lab
- PromptPerfect
- Dust
- Spellbook
- GitHub resources
- OpenAI Playground, etc.
Text-to-text prompting techniques:
Zero-shot prompting technique:
This is a scenario where the model generates an accurate, expected response without needing any examples or refinement via one-shot or few-shot techniques (discussed later). This is the ideal state that all LLMs aim to achieve.
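A zero-shot request is simply the task stated directly, with no worked examples. A minimal sketch using the OpenAI Python client (the model name is only an example choice):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Zero-shot: the task is stated directly, with no examples of inputs and outputs.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model choice
    messages=[{
        "role": "user",
        "content": "Classify the sentiment of this review as positive, negative, "
                   "or neutral: 'The battery barely lasts two hours.'",
    }],
)
print(response.choices[0].message.content)
```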
User feedback loop:
This is a scenario where the user iteratively refines their prompts until they are satisfied with the model's response.
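A minimal sketch of such a loop, where the conversation history grows with each round of feedback (again using the OpenAI Python client as an example):

```python
from openai import OpenAI

client = OpenAI()
messages = [{"role": "user", "content": input("Initial prompt: ")}]

while True:
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # example model choice
        messages=messages,
    ).choices[0].message.content
    print(reply)
    messages.append({"role": "assistant", "content": reply})

    feedback = input("Refinement (or 'done' to stop): ")
    if feedback.strip().lower() == "done":
        break
    messages.append({"role": "user", "content": feedback})
```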
One-shot/Few-shot:
This is a scenario where the user provides one or more example prompts and responses for a similar task, then asks the model to perform a new task, anticipating a response of similar quality to the examples provided. Usually, these situations are not desirable, but they arise when the task at hand is new and complex for the generative model.
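In a chat-style API, the worked examples can be supplied as earlier turns of the conversation. A sketch (the task and examples are made up for illustration):

```python
from openai import OpenAI

client = OpenAI()

# Few-shot: two worked examples precede the new task, which follows the same pattern.
messages = [
    {"role": "user", "content": "Rewrite formally: 'gimme the report asap'"},
    {"role": "assistant", "content": "Please send me the report as soon as possible."},
    {"role": "user", "content": "Rewrite formally: 'can't make it today, sry'"},
    {"role": "assistant", "content": "I apologize, but I will not be able to attend today."},
    {"role": "user", "content": "Rewrite formally: 'need more time on this'"},
]
response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)
```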
Interview pattern approach:
This is an approach where the user prompts the model to act as an interviewer, asking a set of questions regarding the task; based on all these exchanges, the model then gives the final response that the user is looking for.
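A hypothetical prompt for this pattern might read as follows; it is then sent to the model like any of the earlier examples:

```python
# Interview pattern: the model is told to gather requirements before answering.
interview_prompt = (
    "You are going to help me write a job description. Before writing anything, "
    "interview me: ask one question at a time about the role, the required skills, "
    "and the company, until you have enough information. Only then produce the "
    "final job description."
)
```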
Chain-of-thought prompting:
This is a prompt-based learning approach similar to the one-shot/few-shot approach. It is used to teach the model a reasoning-based concept of how to solve a particular task, with a clear step-by-step explanation, and then eventually asking a similar question to extract a similarly reasoned answer.
The major difference between few-shot prompting and chain-of-thought prompting is that in the few-shot approach we do not include any reasoning; we simply provide questions and their corresponding answers.
According to researchers, certain trigger phrases (such as "Let's think step by step") are more likely to produce correct answers even with zero-shot prompting on the above kinds of problems.
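A sketch of a chain-of-thought style prompt, where the worked example spells out its reasoning before the answer (the arithmetic examples are the classic ones used in the chain-of-thought literature; the model name is an example choice):

```python
from openai import OpenAI

client = OpenAI()

# Chain-of-thought: the worked example includes its reasoning, not just the final answer.
cot_prompt = """Q: A cafeteria had 23 apples. They used 20 for lunch and bought 6 more. How many apples do they have?
A: They started with 23 apples. They used 20, leaving 23 - 20 = 3. They bought 6 more, so 3 + 6 = 9. The answer is 9.

Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each. How many tennis balls does he have now?
A:"""
# For a zero-shot variant, a trigger phrase such as "Let's think step by step."
# can be appended to the question instead of providing a worked example.

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model choice
    messages=[{"role": "user", "content": cot_prompt}],
)
print(response.choices[0].message.content)
```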
Problems with chain-of-thought:
Naive greedy decoding in Chain of Thought (CoT) prompting can lead to suboptimal reasoning paths, as it selects the highest probability token at each step without considering long-term coherence. This approach often misses better overall solutions, propagates early errors, and cannot explore alternative paths, resulting in inconsistent and less reliable outputs for complex, multi-step reasoning tasks.
Self-consistency prompting:
This is an approach in which the model is queried with the same prompt multiple times; the answers are evaluated against each other, and the most consistent answer is chosen.
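A minimal sketch of self-consistency: sample the same prompt several times at a non-zero temperature and keep the answer that appears most often (the final-answer extraction here is deliberately crude):

```python
from collections import Counter
from openai import OpenAI

client = OpenAI()

question = ("A train travels 60 km in 40 minutes. How far does it travel in 2 hours "
            "at the same speed? Put only the final number of km on the last line.")

answers = []
for _ in range(5):
    reply = client.chat.completions.create(
        model="gpt-4o-mini",   # example model choice
        messages=[{"role": "user", "content": question}],
        temperature=0.8,       # diversity between samples is the whole point
    ).choices[0].message.content
    answers.append(reply.strip().splitlines()[-1])  # crude final-answer extraction

best_answer, votes = Counter(answers).most_common(1)[0]
print(f"Most consistent answer: {best_answer!r} ({votes}/5 samples agreed)")
```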
Tree-of-thought prompting:
The working of tree-of-thought is as follows:
First, the model is asked to generate an initial intermediate thought. Then we ask the model to evaluate that thought in terms of its ability to reach the final result. Based on this evaluation, we choose the most promising thought, extend it, and repeat the process. If at any stage it reaches a dead end, it backtracks and tries branching through the next possible path.
In the research paper, the authors explain various ways in which this method can be explored.
For the generation of thoughts, we can choose either a sampling approach or a proposing approach. In the sampling approach, we ask the model for one thought and repeat this 'n' times independently; in the proposing approach, we ask the model to propose 'n' candidate thoughts for the next step in a single pass. Sampling is suitable when the thoughts are long (a paragraph), while proposing is suitable for short thoughts such as words or lines.
We also have the option of a vote-based choice of thoughts or a value-based approach. In the value-based approach, if we have 4 thoughts for the first state, we assign a value to each of them, as numbers or a fixed set of labels, and these values remain fixed even if we have to backtrack and continue later. The vote-based approach is more dynamic: the candidate thoughts are voted on each time.
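A compact breadth-first sketch of this idea is below. It uses a small llm() helper around the OpenAI client, a propose-style generation prompt, and a value-style scoring prompt; the prompts, breadth, and depth here are illustrative choices, not the paper's exact settings:

```python
from openai import OpenAI

client = OpenAI()

def llm(prompt: str) -> str:
    """One call to a chat model; the model name is an example choice."""
    return client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content

def tree_of_thought(problem: str, depth: int = 3, n_candidates: int = 3, keep: int = 2) -> str:
    """Breadth-first search over partial chains of thoughts (simplified sketch)."""
    frontier = [""]  # each entry is a partial chain of thoughts, one step per line
    for _ in range(depth):
        candidates = []
        for partial in frontier:
            for _ in range(n_candidates):
                # Propose the next intermediate thought for this partial chain.
                step = llm(f"Problem: {problem}\nSteps so far:\n{partial}"
                           "Propose the single next step (one short line).")
                candidates.append(partial + step.strip() + "\n")
        scored = []
        for cand in candidates:
            # Value-based evaluation: ask the model how promising this path is.
            raw = llm(f"Problem: {problem}\nSteps so far:\n{cand}"
                      "On a scale of 1-10, how promising is this path? Answer with a number only.")
            try:
                score = float(raw.strip().split()[0])
            except ValueError:
                score = 0.0
            scored.append((score, cand))
        # Keep only the most promising paths and extend them in the next round.
        frontier = [c for _, c in sorted(scored, key=lambda s: s[0], reverse=True)[:keep]]
    return frontier[0]

print(tree_of_thought("Use the numbers 4, 9, 10, 13 with +, -, *, / to make 24."))
```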
The authors also evaluated the approach on a game called "Game of 24" and claim a far higher success rate than all other approaches.
They have also tried this approach on many other tasks and report the performance on each of them in their paper.
Prompt hacks:
These are some creative techniques that help you enhance the capability of the prompts you write. There is no standard approach here; it is all about how you use your creativity. Here are a few examples.
Another example could be to use an LLM to write prompts for other image-generation models, enhancing the rough prompt idea we have.
General prompt:
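As a sketch of this hack (the idea, wording, and model name are illustrative), we can ask a text model to expand a rough idea into a detailed image prompt:

```python
from openai import OpenAI

client = OpenAI()

idea = "a cozy reading nook"  # the rough idea we want turned into a detailed image prompt

detailed_image_prompt = client.chat.completions.create(
    model="gpt-4o-mini",  # example model choice
    messages=[{
        "role": "user",
        "content": f"Write a single detailed prompt for an image-generation model "
                   f"depicting {idea}. Include style, lighting, and composition details.",
    }],
).choices[0].message.content
print(detailed_image_prompt)
```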
Text-to-image prompting techniques:
Style Modifiers:
These are descriptors used to influence the artistic style or visual attributes of the image. They are certain sets of words that influence the appearance of the image.
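For example, appending descriptors such as "oil painting", "watercolor", "isometric 3D render", or "in the style of film noir" typically shifts the look of the image without changing the subject.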
Quality boosters:
Using certain keywords can boost the quality of the image, enhancing its readability and overall quality.
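For example, keywords such as "highly detailed", "sharp focus", or "4k" are commonly appended to push the model toward crisper, higher-fidelity output.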
Repetition:
This is a method where the crucial words/aspects that you want your image to have are repeated in the prompt multiple times, increasing the impact of those words and pushing the image-generation model to produce the right image.
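For example, a prompt like "a field of flowers, flowers everywhere, surrounded by flowers" repeats the key concept to strengthen its influence on the result.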
Weighted terms:
In this, we include numbers along with the words in the prompt to indicate the relative impact of those words on the generated image.
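For example, some interfaces (common Stable Diffusion front ends, among others) accept a weight syntax along the lines of "a castle on a hill, (dramatic sunset:1.4), (fog:0.6)", where values above 1 amplify a term and values below 1 soften it; the exact syntax varies from tool to tool.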
Fix deformed generation:
These are sets of phrases that help us avoid deformities or anomalies that may impact the quality of the image. Deformities could be distortions of body parts, pixelation issues, etc. These can be avoided by using certain negative prompts, as below.
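For example, a negative prompt such as "deformed hands, extra fingers, blurry, low resolution" is commonly supplied to suppress these artifacts.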
However, this approach can inadvertently cause the model to focus on the very features it's meant to avoid, as it processes these negative terms directly. To overcome this, alternative strategies like positive reinforcement and specific, detailed descriptions of desired attributes are recommended. These methods emphasize desired qualities without directly mentioning the undesired ones, steering the model toward generating more accurate and visually appealing images. For instances where negative prompting is necessary, it should be subtly integrated within a positively framed prompt to minimize focusing on undesired features.