Day 11 of 100 Days Agentic Engineer Challenge: Browser Use — AI Agent
It’s the next day of my journey, and I’m starting to focus fully on AI agents and research different solutions. There are many open-source implementations available on GitHub. One of them is Browser Use, but more about that in a moment. First, a summary of my daily tasks.
My daily task routine
1. Physical activity: The idea is to repeat a minimum exercise every day, which for now is 20 push-ups. Later I will increase the number of repetitions and add other exercises.
2. Seven hours of sleep: I was able to sleep 7 hours and did not work overnight. The idea is to go to sleep at 11 pm at the latest.
3. AI Agents: frameworks, platforms and tools
4. AI Assistant: in queue
5. PAIC: in queue
6. Data Science: in queue
If you want to know what all these tasks are about, read the introduction to the 100 Days Agentic Engineer Challenge.
Browser Use — AI Agent and other ideas
Browser Use allows your AI agents to interact with and control a browser. It gives an AI model a way to navigate pages, extract content and interact with web elements, which makes it possible to automate everyday browser tasks. Here are a few ideas described in its GitHub repo: https://github.com/browser-use/browser-use
- Write a letter in Google Docs to my Papa, thanking him for everything, and save the document as a PDF.
- Read my CV & find ML jobs, save them to a file, and then start applying for them in new tabs, if you need help, ask me.
- Find flights on kayak.com from Zurich to Beijing from 25.12.2024 to 02.02.2025.
- Look up models with a license of cc-by-sa-4.0 and sort by most likes on Hugging Face, save top 5 to file.
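To give a feeling for how one of these ideas could be handed to an agent, here is a minimal sketch based on the flight search prompt above. It assumes the `browser-use` and `langchain-openai` packages are installed and an OpenAI API key is set in the environment. The `Agent(task=..., llm=...)` call followed by `await agent.run()` matches the quickstart as I understand it, but the API may differ between versions, so treat it as an assumption. The `build_flight_task` helper and the `gpt-4o` model choice are my own, not something from the repo.

```python
import asyncio


def build_flight_task(origin: str, destination: str, depart: str, ret: str) -> str:
    """Hypothetical helper: builds the natural-language task given to the agent."""
    return (
        f"Find flights on kayak.com from {origin} to {destination} "
        f"from {depart} to {ret}."
    )


async def main() -> None:
    # Imported here so the helper above works even without these packages.
    # Assumption: this import layout matches your installed browser-use version.
    from browser_use import Agent
    from langchain_openai import ChatOpenAI

    agent = Agent(
        task=build_flight_task("Zurich", "Beijing", "25.12.2024", "02.02.2025"),
        llm=ChatOpenAI(model="gpt-4o"),  # assumed model; any supported chat model should do
    )
    history = await agent.run()  # the agent drives the browser step by step
    print(history)


# To try it out, call: asyncio.run(main())
```

The task is plain text, so swapping in any of the other ideas from the list only means changing the string passed as `task`.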
More examples
In the GitHub repo you can find 4 videos presenting the cases mentioned above, and there is also a nice video about it on the WorldofAI channel.
More usage examples
See the examples folder on GitHub for more usage examples. Here are the most interesting ones:
- amazon_search.py: A script showcasing how to perform Amazon searches.
- bedrock_claude.py: Shows how to use Browser Use with Claude through Amazon Bedrock. (Updated 3 days ago)
- check_appointment.py: Enables parallelized agents, which is useful for tasks requiring concurrent processes.
- clipboard.py: Demonstrates copy/paste functionality from the clipboard. (Updated 2 weeks ago)
- deepseek.py: Example of running the agent with a DeepSeek model. (Updated 2 days ago)
- find_and_apply_to_jobs.py: Provides an example of a job applier, showcasing automation in job applications.
- notification.py: Notifies users when a task is completed. (Updated 2 weeks ago)
- ollama.py: Example of integrating Ollama, so the agent can run on a locally hosted model. (Updated 4 days ago)
- post-twitter.py: Introduces post and reply functionality for Twitter/X, ideal for social media automation.
- real_browser.py: Example of connecting AI to a real browser for enhanced interactivity. (Updated 2 weeks ago)
- result_processing.py: Highlights robust file upload and existing browser connection functionality.
- save_trace.py: Allows saving traces with browser context information, useful for debugging or analytics.
- scrolling_page.py: Clarifies tasks for scrolling down a page, relevant for dynamic web pages.
- web_voyager_agent.py: Sets a default square window size for a browsing agent, potentially optimizing rendering.
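The list above mentions parallelized agents, so here is a rough sketch of how several tasks could run concurrently with `asyncio.gather`. This is not the code from the examples folder, only my own illustration. The `run_in_parallel` and `run_browser_agent` helpers are hypothetical names, and the `Agent` usage rests on the same assumptions about the `browser-use` and `langchain-openai` packages as any basic Browser Use script.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_in_parallel(
    tasks: List[str],
    runner: Callable[[str], Awaitable[object]],
) -> List[object]:
    """Run one coroutine per task concurrently; results keep the input order."""
    return await asyncio.gather(*(runner(task) for task in tasks))


async def run_browser_agent(task: str) -> object:
    # Assumption: Agent(task=..., llm=...) and agent.run() as in the quickstart.
    from browser_use import Agent
    from langchain_openai import ChatOpenAI

    agent = Agent(task=task, llm=ChatOpenAI(model="gpt-4o"))
    return await agent.run()


# Example call (needs the packages above and an API key):
# asyncio.run(run_in_parallel(
#     ["Search Amazon for a laptop stand", "Find ML jobs in Zurich"],
#     run_browser_agent,
# ))
```

Keeping the runner as a parameter makes it possible to try the concurrency part with a dummy coroutine first, before any real browser is opened.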