Community

28 February 2024

[Insights from MLOps Now] Project Pluto — Leveraging LLMs & LLMOps in financial media

Project Pluto automates content generation for its financial news outlet with GPT-4, a dedicated LLMOps platform, and automation pipelines

In the first session of our 2024 MLOps Now series, we brought together fintech founders and industry leaders in banking & capital markets to explore the potential of AI in fintech and financial services. During the session, Kibeom Kim, co-founder and CTO of Project Pluto, shared how his team leverages large language models and LLMOps to scale profitable content generation.

About Project Pluto

Project Pluto is an AI-powered finance media startup co-founded by Hyun Hong and Kibeom Kim with a mission to build an AI automation platform for market intelligence.

Hyun is a finance influencer and YouTuber with 310K subscribers who shares stories on the U.S. markets, from stocks to hedge funds. Before founding Project Pluto, she was an investment banker, trader, and hedge fund analyst on Wall Street, with experience at Citigroup, J.P. Morgan, McKinsey & Company, and more. Kibeom was a software engineer at Google, where he worked on milestone product teams like Chrome for Android and TensorFlow APIs.

How Project Pluto leverages LLMs

In the past year, the team has been exploring the use of LLMs with products like FinanceGPT and StockGPT. Project Pluto’s Wallstreet Now, an automated media outlet covering the U.S. market, has over 100K monthly active users. Its LLM-generated content is read by analysts and traders at top investment banks, asset managers, and the like. Wallstreet Now is challenging the $1T financial media market by cutting the time and cost of content generation to near zero.

Project Pluto has been using OpenAI’s GPT models and Google Gemini and has cut the cost of generating a single article to $1. Rather than fine-tuning open-source models or building a purpose-specific small LLM (sLLM), the team opted for heavy prompting and human-in-the-loop feedback. Using mostly GPT-4, Project Pluto set a strong baseline for extracting facts from existing articles, generating journalist-like text, and double-checking the output article.
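
As a rough illustration, here is a minimal sketch of what one such prompting step, fact extraction, might look like with the OpenAI Python SDK. The prompt wording, model string, and function name are illustrative assumptions, not Project Pluto’s actual implementation.

```python
# A minimal sketch of prompt-based fact extraction with the OpenAI Python SDK.
# The prompt wording, model string, and function name are illustrative
# assumptions, not Project Pluto's actual implementation.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def extract_facts(article_text: str) -> str:
    """Ask the model to return only verifiable facts from a source article."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,  # keep extraction deterministic
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a financial news analyst. Extract only verifiable "
                    "facts (figures, dates, named entities, events) from the "
                    "article. Exclude the author's opinions and stylistic "
                    "phrasing."
                ),
            },
            {"role": "user", "content": article_text},
        ],
    )
    return response.choices[0].message.content
```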

The team’s LLM pipeline consists of three modules.

  1. Article analysis — The module extracts only the pure facts delivered in the article and stores them in the database. It separates the facts from the journalist’s nuanced expressions and opinions while respecting the copyright of the original author.
  2. Article generation — This module writes the new article based on the facts identified in the analysis stage. It collects the necessary information and data and employs common prompting techniques like chain-of-thought and few-shot prompting (Kibeom’s example prompt appears after this list).
  3. Article review — In this human-in-the-loop stage, a person with a financial background re-verifies and scores the accuracy and quality of the AI-generated articles. The team also samples inaccurate articles to refine the prompts, and an AI grading system scores the articles automatically. This process, in which humans evaluate articles and the model is tuned according to those evaluation standards, repeats in a loop. Project Pluto has been able to enhance the efficiency of its service and reduce costs through this continuous improvement and grading process.
Here is the example prompt Kibeom shared:

```
I want to write Market Research Report. Check the below RULE and correct my report.

RULE
1. This is Market Research Report.
2. Do not change the price in the report.
3. Never change the price in the report.
```
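
The grade-and-refine loop in the review stage can be sketched roughly as below. All function names, scores, and thresholds are hypothetical placeholders for the components Kibeom described, not Project Pluto’s actual code.

```python
# A schematic sketch of the grade-and-refine loop in the review stage.
# ai_grade, human_review, and refine_prompts are hypothetical placeholders
# for the components described above, not Project Pluto's actual code.
import random


def ai_grade(article: str) -> float:
    """Placeholder: an LLM grading prompt would return a 0-1 quality score."""
    return 0.9


def human_review(article: str) -> tuple[float, str]:
    """Placeholder: a financial expert re-verifies accuracy and leaves notes."""
    return 0.8, "verified figures against the source article"


def refine_prompts(article: str, notes: str) -> None:
    """Placeholder: fold reviewer findings back into the generation prompts."""
    print(f"refining prompts with reviewer notes: {notes}")


def review_loop(articles: list[str], threshold: float = 0.7,
                sample_rate: float = 0.1) -> list[tuple[str, float]]:
    graded = [(a, ai_grade(a)) for a in articles]
    # Low-scoring articles go to a human reviewer; a random sample is also
    # audited so the AI grader itself stays calibrated against human judgment.
    flagged = {a for a, score in graded if score < threshold}
    k = min(len(articles), max(1, int(len(articles) * sample_rate)))
    flagged |= set(random.sample(articles, k))
    for article in flagged:
        score, notes = human_review(article)
        if score < threshold:
            refine_prompts(article, notes)
    return graded
```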

How Project Pluto leverages LLMOps

The competitive advantage of the Project Pluto team comes from its internal LLMOps platform and the prompting pipelines it set up to automate these multi-stakeholder, process-intensive tasks without sacrificing accuracy. As the product and its content scaled, the team soon faced several problems.

  1. Increase in communication costs and development cycles — Splitting the work between developers writing code and financial experts performing prompt engineering was inefficient.
  2. Long feedback loop for improving generated content — Without a systematic workflow and criteria for testing and applying feedback, developers often ran tests only reactively, and accuracy degraded.
  3. Increasing technical difficulty and dependencies — The LLM app grew increasingly complex with multiple external dependencies such as LLM APIs, vector retrieval, and Google Search.

The team built Gaia, an internal LLMOps platform that (1) is easy enough to be used by non-developers, (2) automates common LLM workflows like RAG, and (3) supports multiple integrations and APIs. Here are some of the key features laid out by Kibeom during the event and previously shared in Project Pluto’s blog post.

  1. Workflow automation — Compose steps that connect with APIs like GPT, vector DBs, RDBMSs, and Google Search, and chain them conditionally, in series, or in parallel (see the sketch after this list). The drag-and-drop interface is simple enough for financial experts with no development experience.
  2. Experiments & tracking — Run batch experiments and log all metadata, from evaluation metrics to results. This works as the “VESSL Run of LLMs,” where the team can version, reproduce, track, and share all experiments and workflows.
  3. Deployment & monitoring — Deploy workflows into production and monitor runs. Here, you can review the prediction history with detailed metadata and system metrics like latency and token usage.
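
As a rough sketch of the step composition described in the first feature, the snippet below chains placeholder steps in series, in parallel, and under a condition in plain Python. Gaia itself exposes this through its drag-and-drop interface, and none of the names here are its actual API.

```python
# A hypothetical illustration of composing workflow steps in series, in
# parallel, and under a condition. Gaia exposes this via a drag-and-drop
# UI; none of the names below are its actual API.
from concurrent.futures import ThreadPoolExecutor


def vector_search(query: str) -> list[str]:
    """Placeholder: fetch related context from a vector DB."""
    return [f"vector-db context for {query!r}"]


def google_search(query: str) -> list[str]:
    """Placeholder: fetch fresh results from a web search API."""
    return [f"search result for {query!r}"]


def gpt_step(context: list[str]) -> str:
    """Placeholder: call an LLM to draft an article from the context."""
    return f"draft article based on {len(context)} context snippets"


def run_workflow(query: str, needs_fresh_data: bool) -> str:
    # Parallel branch: query the vector DB and, conditionally, web search.
    with ThreadPoolExecutor() as pool:
        vector_future = pool.submit(vector_search, query)
        search_future = pool.submit(google_search, query) if needs_fresh_data else None
        context = vector_future.result()
        if search_future is not None:
            context += search_future.result()
    # Series: the retrieved context feeds the generation step.
    return gpt_step(context)


print(run_workflow("NVDA Q4 earnings", needs_fresh_data=True))
```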

Project Pluto is a prime example of how emerging companies are using frontier models like OpenAI’s GPT and Google Gemini to quickly test and iterate on their LLM products while following best practices and investing in operational infrastructure. In upcoming sessions of MLOps Now, we hope to bring in more founders and builders in this domain to discuss how they are leveraging MLOps, LLMOps, and AI infrastructure to sharpen their products’ competitive edge.

Yong Hee, Growth Manager
