Shafiq Joty
Senior Director, Research. Shafiq (raihanjoty.github.io) directs the NLP group's work on large language models (LLMs) and generative AI. Some of his group's recent projects include SFR-RAG, SFR-Judge, SFR-RAG-Agent, and xGen. He is also a tenured Associate Professor (currently on leave) in the School of Computer Science and Engineering (SCSE) at NTU. He was a founding manager of the Salesforce Research Asia (Singapore) lab. His research has contributed to 35+ patents and more than 170 papers in top-tier NLP and ML conferences and journals. He served as a PC chair of SIGDIAL 2023, on the best paper award committees of ICLR 2023 and NAACL 2022, and as a (senior) area chair for major NLP and ML conferences.
VIBEPASS, a new benchmark, reveals a fundamental weakness in modern AI coding assistants: even with near-perfect scores on code generation tasks, frontier models falter when it comes to finding and fixing subtle bugs…
AI agents that rely on web search are vulnerable to “well poisoning” attacks, where adversaries publish fabricated but authoritative-sounding content designed to be retrieved during search. Think “AI Slop” for agents. Our research…
What is FINDAP? The FINDAP framework is a cutting-edge approach to fine-tuning large language models (LLMs) specifically for the finance industry. While LLMs like ChatGPT or Bard are excellent general-purpose tools, specialized domains…
AI is rapidly transforming industries, helping businesses enhance customer experiences, improve efficiency, and make smarter decisions. But an essential question arises: How can we ensure that AI is creating accurate and grounded answers?…
The SFR-Embedding-Mistral marks a significant advancement in text-embedding models, building upon the solid foundations of E5-mistral-7b-instruct and Mistral-7B-v0.1.
As the development and deployment of large language models (LLMs) accelerates, evaluating model outputs has become increasingly important. The established method of evaluating responses typically involves recruiting and training human evaluators, having them…
Retrieval Augmented Generation (RAG) has not only become one of the most heavily invested areas of research in generative AI but has also attracted considerable popularity and commercialization opportunities. RAG is typically applied…
TL;DR: With CodeChain, a pretrained large language model (LLM) can solve challenging coding problems by integrating modularity in generation samples and self-improve by employing a chain of self-revisions on representative sub-modules. CodeChain can…