You Probably Don’t Need Fine-Tuning

Jumping on the fine-tuning bandwagon often feels like the natural next step when a prompt hits a corner case. With the rise of SaaS tools that simplify fine-tuning, it’s tempting—but it’s often unnecessary. Instead of investing time in dataset generation, cleaning, and splitting, focus on refining prompts. Use larger models or plugins designed to optimize prompt writing; better prompt engineering can often outperform the results of fine-tuning. Or consider hybrid approaches: combining LLMs with classical ML techniques like K-means clustering or decision trees can extend and enhance model outputs effectively. ...
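As a rough illustration of such a hybrid approach (not taken from the post itself), one might cluster off-the-shelf embeddings with K-means and then ask an LLM to label each cluster, rather than fine-tuning a classifier. The model name and sample texts below are placeholders:

```python
# Hypothetical sketch: cluster texts on embeddings, then let an LLM name each cluster.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

texts = [
    "Refund not processed after 10 days",
    "App crashes when uploading a photo",
    "How do I change my billing address?",
    "The app freezes on the login screen",
]

# Embed with a generic pre-trained encoder (no fine-tuning involved).
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(texts)

# Group similar texts; an LLM prompt can then summarize or label each group.
kmeans = KMeans(n_clusters=2, random_state=42, n_init=10).fit(embeddings)
for text, label in zip(texts, kmeans.labels_):
    print(label, text)
```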

January 2, 2025 · 1 min · Shubham Singh

Enhancing Retrieval with Hypothetical Document Embeddings (HyDE) in RAG Systems

Find the paper here. Retrieval-Augmented Generation (RAG) offers a promising approach when building knowledge bases on top of large language models (LLMs). Envision creating a chatbot capable of querying a collection of textbooks: a standard pre-trained LLM doesn’t inherently possess this capability, and this is where RAG comes into play. RAG works by splitting your corpus into more manageable segments or documents; ideally, these segments should fit within the context window of the language model in use. A user query is translated into a vector embedding, and a measure like cosine similarity matches relevant documents to the query. The language model then synthesizes a response from the actual user query and the matched documents (typically the top_k documents). ...
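A minimal sketch of that retrieval step, assuming a generic embedding model and toy chunks (none of this is from the post): embed the chunks and the query, rank by cosine similarity, and hand the top_k chunks plus the query to the LLM.

```python
# Minimal RAG retrieval sketch: rank chunks by cosine similarity to the query.
import numpy as np
from sentence_transformers import SentenceTransformer

chunks = [
    "Photosynthesis converts light energy into chemical energy.",
    "The mitochondrion is the powerhouse of the cell.",
    "Newton's second law states that F = ma.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
chunk_vecs = model.encode(chunks, normalize_embeddings=True)

query = "What does a mitochondrion do?"
query_vec = model.encode([query], normalize_embeddings=True)[0]

# On normalized vectors, cosine similarity is just a dot product.
scores = chunk_vecs @ query_vec
top_k = 2
best = np.argsort(scores)[::-1][:top_k]

context = "\n".join(chunks[i] for i in best)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # this prompt would then be sent to the LLM of choice
```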

October 5, 2023 · 3 min · Shubham Singh

Lost In The Middle - LLM

I’ve been working with large language models from across the spectrum, including OpenAI GPT, Anthropic’s models, LLaMA, etc., for quite some time. For most of this journey, I leaned towards selecting larger models (with longer context windows) and cramming in as much context as possible, in hopes of getting better inferences and domain-specific reasoning. However, after the release of GPT-3.5-Turbo-16K, I realized that the capabilities of these LLMs do not necessarily scale with the context window. Interestingly, the same model might perform worse as the context length increases. ...

July 28, 2023 · 2 min · Shubham Singh

Teaching Tasks to Language Models Using Zero-Shot, One-Shot & Few-Shot

In this blog, we will discuss three prompt engineering techniques used when working with large language models: zero-shot, one-shot, and few-shot learning. These methods aim to enable models to learn and perform new tasks quickly with little or no training data. Zero-shot learning refers to the ability of a language model to perform a task without having seen any examples of that specific task during training. This is particularly useful when labeled data for a given task is scarce. Large language models like GPT-3 have shown remarkable zero-shot capabilities by leveraging their vast knowledge and understanding of language. ...
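As a quick illustration (the task and reviews below are made up, not from the post), zero-shot and few-shot prompts differ only in whether labeled examples precede the input:

```python
# Illustrative prompt templates for zero-shot vs. few-shot sentiment classification.
task = "Classify the sentiment of the review as Positive or Negative."

# Zero-shot: only the instruction and the input, no examples.
zero_shot = f"""{task}
Review: "The battery dies within an hour."
Sentiment:"""

# Few-shot: the same instruction, plus a handful of labeled examples first.
few_shot = f"""{task}
Review: "Absolutely loved the camera quality." Sentiment: Positive
Review: "Shipping took forever and the box was damaged." Sentiment: Negative
Review: "The battery dies within an hour."
Sentiment:"""

print(zero_shot)
print(few_shot)
```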

March 5, 2023 · 2 min · Shubham Singh