[{"content":"This week I decided to try out Tauric Research\u0026rsquo;s TradingAgents after reading their arXiv paper. As someone interested in both AI and trading, the idea of a multi-agent LLM-powered trading framework was too cool to pass up.\nThe architecture of Tauric Research\u0026rsquo;s TradingAgents framework is designed to facilitate multi-agent trading using large language models (LLMs). Here\u0026rsquo;s a breakdown of the key components:\nData Ingestion: The framework starts by ingesting market data, which is crucial for making informed trading decisions. This data is processed and fed into the system in real time.\nAgent Coordination: Multiple agents work in tandem, each specializing in different aspects of trading. These agents communicate and coordinate to optimize trading strategies.\nModel Selection: The framework allows for flexible model selection, enabling users to choose from various LLMs like GPT-4o and o1. This flexibility is key to adapting to different market conditions and trading goals.\nDecision Making: The agents use the insights from the LLMs to make trading decisions. This involves analyzing market trends, predicting future movements, and executing trades.\nFeedback Loop: After executing trades, the system evaluates the outcomes and adjusts strategies accordingly. This feedback loop is essential for continuous improvement and adaptation to changing market dynamics.\nThis architecture not only supports complex trading strategies but also allows for experimentation and integration with custom models, as I plan to do with my own intraday trading system.\nSetup Experience It took me a bit to get everything set up in my CLI (mostly just Python environment stuff and getting the right API keys), but once I hooked it up with my OpenAI API key, it was actually really easy to use. 
The docs are pretty clear, and the CLI lets you pick tickers, models, and other options interactively.\n# Clone the repo git clone https://github.com/TauricResearch/TradingAgents.git cd TradingAgents # (Recommended) Create a virtual environment conda create -n tradingagents python=3.13 conda activate tradingagents # Install dependencies pip install -r requirements.txt # Set your API keys export FINNHUB_API_KEY=your_finnhub_key export OPENAI_API_KEY=your_openai_key # Run the CLI python -m cli.main My First Test: SPY with GPT-4o and o1 For my first test, I used SPY as the ticker. I set GPT-4o for the basic models and o1 for the final model. The whole run cost me only about 5 cents in API usage, which is honestly super reasonable for a full multi-agent LLM pipeline.\nUnderstanding the TradingAgents Architecture Thoughts and Next Steps What I want to do next is pair TradingAgents with my own intraday trading model and custom indicators. My idea is to use TradingAgents as a separate signal: if it says \u0026ldquo;buy\u0026rdquo; or \u0026ldquo;sell\u0026rdquo; for the day, I would only take trades in that direction with my own system. 
I think this could be a really interesting way to combine LLM-based reasoning with my more traditional quant models.\nPros:\nEasy to set up after initial environment config Super flexible with model selection Cheap to run even with GPT-4o and o1 Open source and well-documented Cons:\nStill early days for LLM trading agents (lots to experiment with) Needs working API keys and some comfort with Python Links Tauric Research TradingAgents GitHub arXiv Paper: TradingAgents: Multi-Agents LLM Financial Trading Framework ","permalink":"https://adv-andrew.github.io/andrewvu.io/posts/tauric-tradingagents-experience/","summary":"\u003cp\u003eThis week I decided to try out \u003ca href=\"https://github.com/TauricResearch/TradingAgents\"\u003eTauric Research\u0026rsquo;s TradingAgents\u003c/a\u003e after reading their \u003ca href=\"https://arxiv.org/pdf/2412.20138\"\u003earXiv paper\u003c/a\u003e. As someone interested in both AI and trading, the idea of a multi-agent LLM-powered trading framework was too cool to pass up.\u003c/p\u003e\n\u003cp\u003e\u003cimg alt=\"Screenshot of TradingAgents CLI running SPY test\" loading=\"lazy\" src=\"https://github.com/TauricResearch/TradingAgents/blob/main/assets/schema.png?raw=true\"\u003e\u003c/p\u003e\n\u003cp\u003eThe architecture of Tauric Research\u0026rsquo;s TradingAgents framework is designed to facilitate multi-agent trading using large language models (LLMs). Here\u0026rsquo;s a breakdown of the key components:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\n\u003cp\u003e\u003cstrong\u003eData Ingestion\u003c/strong\u003e: The framework starts by ingesting market data, which is crucial for making informed trading decisions. This data is processed and fed into the system in real time.\u003c/p\u003e","title":"Trying Tauric Research TradingAgents"},{"content":"So this weekend I went down a pretty wild rabbit hole. 
It all started when my friend Jeffrey Zang showed me this insane hackathon project he built at Waterloo called \u0026ldquo;Opus\u0026rdquo; (check it out here).\nWatching his demo was honestly mind-blowing - like, an AI that can actually use your computer the way you would? That got me way too curious for my own good.\nWhat Even Are Computer Using Agents? Before I get into my testing adventure, let me explain what Computer Using Agents (CUAs) actually are, because this stuff is genuinely cool:\nBasically, they\u0026rsquo;re AI systems that can see your screen, understand what\u0026rsquo;s happening, and then click, type, and navigate just like a human would. No special APIs needed - they literally just look at pixels and figure out what to do.\nHow Opus Works (From What I Could Tell) Jeffrey\u0026rsquo;s project uses Microsoft\u0026rsquo;s Opus architecture, which is pretty clever:\nComponent What It Does Why It\u0026rsquo;s Cool Vision-Language Models \u0026ldquo;Sees\u0026rdquo; and understands your screen Like having AI eyes that know what buttons are Action Planning Breaks down complex tasks into steps \u0026ldquo;To send an email, first open Gmail, then\u0026hellip;\u0026rdquo; Execution Engine Actually performs the clicks/typing The hands that do the work Feedback Loops Learns from mistakes \u0026ldquo;Oops, wrong button, let me try again\u0026rdquo; Down the Rabbit Hole: Microsoft UFO2 After seeing Jeffrey\u0026rsquo;s demo, I had to try this myself. 
That\u0026rsquo;s when I found Microsoft\u0026rsquo;s latest research project: UFO2 (GitHub link).\nThis thing just came out in April 2025 and it\u0026rsquo;s supposed to be state-of-the-art for Computer Using Agents.\nSetting Up UFO2 (The Fun Part) Here\u0026rsquo;s how I got it running on my Windows machine:\n# Clone the repository git clone https://github.com/microsoft/UFO.git cd UFO # Install dependencies pip install -r requirements.txt # Set up your OpenAI API key export OPENAI_API_KEY=\u0026#34;your-api-key-here\u0026#34; # Run UFO2 python main.py Welcome to use UFO🛸, A UI-focused Agent for Windows OS Interaction. _ _ _____ ___ | | | || ___| / _ \\ | | | || |_ | | | | | |_| || _| | |_| | \\___/ |_| \\___/ Please enter your request to be completed🛸: The Reality Check: Testing UFO2 I decided to test it with what seemed like a super simple task:\nMy Test Prompt: \u0026ldquo;Open File Explorer, navigate to the Documents folder, right-click in an empty space, create a new text file named \u0026rsquo;todo.txt\u0026rsquo;, and open it in Notepad\u0026rdquo;\nThe Results Were\u0026hellip; Interesting Metric Result My Reaction Cost $1.12 (GPT-4o Turbo) \u0026ldquo;Wait, that much for one task??\u0026rdquo; Time 8 minutes \u0026ldquo;I could do this in 10 seconds\u0026hellip;\u0026rdquo; Success Rate Failed multiple times \u0026ldquo;This is harder than it looks\u0026rdquo; Attempts ~6-7 tries \u0026ldquo;At least it\u0026rsquo;s persistent?\u0026rdquo; What Went Wrong (And Why) The system kept getting stuck in these weird loops:\n# Pseudo-code of what I think was happening while task_not_complete: take_screenshot() analyze_image() # Expensive API call plan_action() # Another expensive call execute_action() if action_failed: backtrack() # More API calls try_again() # Even more calls Problems I noticed:\nClicked on wrong folders constantly Got confused by similar-looking UI elements Kept trying the same failed action multiple times Used TONS of tokens for each 
screenshot analysis Breaking Down Why UFO2 Is So Expensive The architecture is honestly pretty brilliant, but it\u0026rsquo;s also super token-heavy:\nUFO2\u0026rsquo;s Processing Pipeline graph TD A[Screenshot] --\u0026gt; B[Vision Model Analysis] B --\u0026gt; C[Action Planning] C --\u0026gt; D[Execute Action] D --\u0026gt; E[New Screenshot] E --\u0026gt; B B --\u0026gt; F[Token Cost: ~500-1000] C --\u0026gt; G[Token Cost: ~300-500] Step Token Usage Why It\u0026rsquo;s Expensive Screen Analysis 500-1000 tokens Each screenshot needs detailed vision processing Action Planning 300-500 tokens Complex reasoning about next steps Error Recovery 200-400 tokens When it messes up (which is often) Context Maintenance 100-300 tokens Remembering what it\u0026rsquo;s supposed to be doing Total per action: ~1000-2000 tokens My simple task: Probably used 15,000+ tokens total\nCurrent State: The Good and The Frustrating What\u0026rsquo;s Cool About CUAs Right Now Actually works (eventually) - like, it really can use your computer No API required - works with any app or website Surprisingly good at complex reasoning - it understands context way better than I expected Getting better fast - the technology is improving rapidly What\u0026rsquo;s\u0026hellip; Not So Great Problem Impact Example Super Expensive $1+ for simple tasks My todo.txt task cost more than a coffee Really Slow Minutes for 10-second tasks Faster to just do it myself Unreliable Fails frequently Spent more time fixing than working Token Heavy Costs scale quickly 5 tasks = lunch money What I Haven\u0026rsquo;t Tried Yet: OpenAI\u0026rsquo;s Operator I keep hearing about OpenAI\u0026rsquo;s Operator system, which supposedly is way more optimized than UFO2. From what I\u0026rsquo;ve read, it might be better because:\nBetter integration between vision and action models More efficient token usage (hopefully?) 
Trained specifically on computer interaction tasks Probably has better error recovery Code snippet for when I get access:\n# What I imagine Operator\u0026#39;s API might look like from openai import Operator agent = Operator() result = agent.execute_task(\u0026#34;Create a new text file called todo.txt\u0026#34;) print(f\u0026#34;Task completed in {result.time_taken} seconds for ${result.cost}\u0026#34;) My Vision: The Ultimate Computer Assistant Here\u0026rsquo;s what I think the endgame looks like - and honestly, this is what got me so excited about this tech:\nThe Dream: OS-Level Overlay Agent Imagine having a chat interface that\u0026rsquo;s always available on your computer, where you can just ask:\n💬 \u0026#34;Install that Minecraft mod I bookmarked\u0026#34; → Downloads Forge → Installs the mod → Moves it to %appdata%\\.minecraft\\mods\\ → Done in under 30 seconds 💬 \u0026#34;Attach my resume to that email draft\u0026#34; → Finds your latest resume → Attaches it to Gmail → Maybe even updates your contact info 💬 \u0026#34;Set up a study group meeting for next Tuesday\u0026#34; → Checks everyone\u0026#39;s calendars → Finds a free room → Sends calendar invites Technical Requirements for This to Work Requirement Current Status What We Need Speed 8 min for simple tasks \u0026lt; 30 seconds Cost $1+ per task \u0026lt; $0.10 per task Reliability ~60% success rate \u0026gt; 95% success rate Context Forgets easily Remembers your preferences What Needs to Happen for CUAs to Be Actually Useful 1. 
Efficiency Improvements # Current approach (expensive) def current_cua_approach(): while not task_complete: screenshot = capture_screen() # Large image analysis = expensive_vision_model(screenshot) # $$$$ action = plan_with_llm(analysis) # More $$ execute(action) # Better approach (cheaper) def optimized_approach(): lightweight_model = load_specialized_ui_model() # Smaller, faster cached_state = get_cached_screen_state() # Reuse analysis action = lightweight_model.predict(cached_state) # Cheap! execute(action) 2. Specialized Models Instead of using general-purpose vision models, we probably need:\nUI-specific vision models trained on screenshots Action prediction models that understand common computer tasks Local processing for privacy and speed Caching systems to avoid re-analyzing similar screens Why This Matters (Beyond Just Being Cool) As a college student thinking about career stuff, this space seems huge:\nIndustry Opportunities Role What You\u0026rsquo;d Work On Skills Needed ML Engineer Training better CUA models Python, PyTorch, Computer Vision Software Engineer Building CUA frameworks System design, APIs, Performance optimization Product Manager CUA applications Understanding user workflows, UX design Research Scientist Next-gen architectures Deep learning, HCI research The company that figures out fast, cheap, reliable CUAs is going to be massive.\nReal-World Applications Customer Service: CUAs handling support tickets Testing: Automated software QA Data Entry: Eliminating repetitive tasks Accessibility: Helping people with disabilities use computers Personal Productivity: Everyone having an AI assistant My Takeaways After This Weekend CUAs are real and they work - but they\u0026rsquo;re not ready for everyday use yet The technology is advancing fast - this stuff will probably be mainstream in 2-3 years There\u0026rsquo;s a huge opportunity for optimization and improvement The cost/speed problem is solvable - just needs better engineering What I Want to 
Explore Next todo_list = [ \u0026#34;Try OpenAI\u0026#39;s Operator when I get access\u0026#34;, \u0026#34;Build a simple CUA for specific tasks (maybe just browser automation)\u0026#34;, \u0026#34;Look into browser-use and other open-source alternatives\u0026#34;, \u0026#34;Maybe contribute to UFO2\u0026#39;s GitHub repo?\u0026#34;, \u0026#34;Write more about this as the tech evolves\u0026#34; ] Conclusion: Why I\u0026rsquo;m Excited About This Space Look, I know I\u0026rsquo;m just a college student who spent a weekend playing with cool AI tech. But honestly? This feels like one of those moments where you can see the future coming.\nComputer Using Agents are going to change how we interact with technology. Yeah, they\u0026rsquo;re expensive and slow right now, but so were the first smartphones. The potential is massive.\nFor anyone reading this (especially recruiters 👀), this is a space worth paying attention to. The intersection of computer vision, AI reasoning, and system automation is going to create entirely new categories of software.\nPlus, it\u0026rsquo;s just really fun to watch an AI try to use Windows and get confused by the same UI quirks that annoy all of us.\nWant to chat about CUAs, AI automation, or just cool tech in general? I\u0026rsquo;m always down to discuss this stuff - drop me a line!\nTags: #AI #ComputerUsingAgents #CollegeTech #Automation #MachineLearning #StudentProjects #MicrosoftUFO #AIResearch\nP.S. - Shoutout to Jeffrey for showing me Opus and starting this whole exploration. College hackathons really do lead to the coolest discoveries.\n","permalink":"https://adv-andrew.github.io/andrewvu.io/posts/my-first-post/","summary":"\u003cp\u003eSo this weekend I went down a pretty wild rabbit hole. 
It all started when my friend \u003cstrong\u003eJeffrey Zang\u003c/strong\u003e showed me this insane hackathon project he built at Waterloo called \u003cstrong\u003e\u0026ldquo;Opus\u0026rdquo;\u003c/strong\u003e (\u003ca href=\"https://github.com/jeffrey-zang/opus\"\u003echeck it out here\u003c/a\u003e).\u003c/p\u003e\n\u003cp\u003eWatching his demo was honestly mind-blowing - like, an AI that can actually \u003cem\u003euse\u003c/em\u003e your computer the way you would? That got me way too curious for my own good.\u003c/p\u003e\n\u003cp\u003e\u003cimg alt=\"Jeffrey\u0026rsquo;s Opus demo in action\" loading=\"lazy\" src=\"https://media.discordapp.net/attachments/1265179839040847884/1389829516192383066/image.png?ex=68660b1a\u0026is=6864b99a\u0026hm=11bff423ead69fd49dc4c16374bad59aceb55648cf6eff716db2143028fdc4bc\u0026=\u0026format=webp\u0026quality=lossless\u0026width=1450\u0026height=873\"\u003e\u003c/p\u003e","title":"My thoughts on AI CUAs (Computer Using Agents)"}]