For the last couple of years we've been using a RAG proxy for one of our features, but the YC founders behind it are now pivoting to a different startup.
What their API provided was pretty basic, so I'm sure other platforms offer this:
- act as an OpenAI-style proxy, taking in a turn-based conversation history+ new user prompt
- augment the context in the system message with relevant content from a vector database
- query the LLM with our API key, and return the result so we can use it in-app.
I'd rather not spend much time spinning up a RAG pipeline, if I can just redirect to another endpoint. What are people using?