Posts

Recent Trends with Text Embeddings: Decoder-Only LLMs

Introduction: Fine-tuned BERT-style bidirectional encoders, such as E5, have long dominated text embedding models, delivering state-of-the-art performance for many sequence-level tasks at an accessible cost. However, many of the top-performing models on the MTEB leaderboard, a text embedding benchmark covering various tasks (retrieval, clustering, classification, etc.), are now decoder-only language models at the ~7B scale. These models leverage larger context windows and pretraining on web-scale data. This post reviews the papers of the following models:...

Beyond Chain-of-Thought: Evolving Prompts for Enhanced LLM Reasoning

Introduction: The Evolving Chain-of-Thought. In this post I focus on peer-reviewed papers from the last year or two. Most of these papers focus on techniques that aim to improve upon the commonly used chain-of-thought (CoT) approach, especially for tasks requiring complex reasoning. It’s not an exhaustive list by any means, but rather my summaries of papers I’ve been reading recently. It’s worth noting that many of these papers use weaker LLMs than today’s top models....

Paper Deep Dive: The E5 Embeddings, Part 1

Recently I evaluated the E5 embeddings (there are also small and large versions). I got impressive results, so I thought I should read the paper behind them. I’m writing my summary here for myself and anyone else who might find it useful. Title: Text Embeddings by Weakly-Supervised Contrastive Pre-training. This image was generated by DALL-E. The following text is human generated for the most part :) Idea: This is a nice data paper....

This Blog

Back to Blogging: After a hiatus, I’ve decided to give blogging another go. My previous blog was set up using Jekyll and hosted on GitHub. Since I blogged infrequently, I didn’t feel the need to explore other platforms. However, a recent shift in my attitude towards social media, specifically Twitter, and the ease of writing with GenAI rekindled my interest in blogging. I wanted a more aesthetic and user-friendly experience, which led me to move my blog to Hugo, using the PaperMod theme....

First Impression of GPT4

Introduction: GPT4 was just released and it’s amazing. I use ChatGPT daily for all sorts of tasks, especially coding (in addition to Copilot etc.). I feel I now do more code reviews and apply more fixes than actual coding. I’ve seen posts on Twitter about the future of NLP and what problems are still worth tackling outside of large LMs. This is an interesting topic. I don’t have the answer. I do have some thoughts, but instead of rambling I will show an example in a domain GPT4 knows well....