GC-201b · Module 2

API & Browser Automation


API integration in Gemini CLI works through two channels. The built-in web_fetch tool handles simple HTTP requests: fetching API documentation, checking endpoint responses, and downloading data. More sophisticated interactions, such as authenticated requests, complex payloads, and multi-step flows, go through MCP servers, which expose typed, validated tool interfaces. The Fetch MCP server adds HTTP method support, header management, and response parsing.
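To make the difference between the two channels concrete, the sketch below builds the kind of request that goes beyond a plain GET: a POST with a bearer token and a JSON payload. It uses only Python's standard library and constructs the request without sending it. The base URL, the `/v1/orders` path, the token, and the payload fields are all hypothetical placeholders, not part of any real API.

```python
import json
import urllib.request


def build_authenticated_request(
    base_url: str, token: str, payload: dict
) -> urllib.request.Request:
    """Build (but do not send) a POST request with a bearer token and JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/orders",  # hypothetical endpoint path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


if __name__ == "__main__":
    # All values here are illustrative assumptions.
    req = build_authenticated_request(
        "https://api.example.com", "TEST_TOKEN", {"sku": "A-1", "qty": 2}
    )
    print(req.get_method(), req.full_url)
    print(req.get_header("Authorization"))
```

A request like this needs a method, headers, and a structured body to be set together, which is the part an MCP tool with a typed input schema is meant to take over.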

Browser automation extends Gemini CLI into UI testing and web scraping. The Puppeteer and Playwright MCP servers let Gemini launch browsers, navigate pages, interact with elements, take screenshots, and extract data. A typical prompt: "Navigate to our staging environment, log in with the test account, check that the dashboard loads, and screenshot any errors." This turns Gemini CLI into a lightweight QA automation tool.

{
  "mcpServers": {
    "fetch": {
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    },
    "puppeteer": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-puppeteer"],
      "includeTools": ["puppeteer_navigate", "puppeteer_screenshot", "puppeteer_click", "puppeteer_fill", "puppeteer_evaluate"]
    }
  }
}
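The includeTools entry in the configuration above acts as an allowlist: only the named tools from that server are made available. The sketch below illustrates that idea in plain Python under stated assumptions. It is not Gemini CLI's implementation, and the server names, package names, and tool names are hypothetical.

```python
import json

# Hypothetical settings in the same shape as the configuration above.
SETTINGS_JSON = """
{
  "mcpServers": {
    "example": {
      "command": "npx",
      "args": ["-y", "example-mcp-server"],
      "includeTools": ["read_item", "list_items"]
    },
    "unfiltered": {
      "command": "npx",
      "args": ["-y", "another-example-server"]
    }
  }
}
"""


def exposed_tools(settings: dict, server_name: str, advertised: list[str]) -> list[str]:
    """Return the advertised tools that survive the server's includeTools allowlist.

    If the server entry has no includeTools key, every advertised tool passes.
    """
    server = settings["mcpServers"][server_name]
    include = server.get("includeTools")
    if include is None:
        return list(advertised)
    allow = set(include)
    # Preserve the order in which the server advertised its tools.
    return [name for name in advertised if name in allow]


if __name__ == "__main__":
    settings = json.loads(SETTINGS_JSON)
    advertised = ["read_item", "list_items", "delete_item"]  # hypothetical tool list
    print(exposed_tools(settings, "example", advertised))
    print(exposed_tools(settings, "unfiltered", advertised))
```

One consequence worth noting: a name in the allowlist that does not match any advertised tool simply exposes nothing, so the entries need to match the tool names exactly as the server reports them.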

Simple API access: Use the built-in web_fetch tool. No configuration needed. Works for GET requests, public APIs, and documentation fetching.
Authenticated API access: Use the Fetch MCP server or a custom MCP server. Supports headers, authentication tokens, POST/PUT/DELETE methods, and structured payloads.
Browser automation: Use Puppeteer or Playwright MCP servers. For UI testing, screenshot capture, form interaction, and JavaScript execution in a real browser context.
Custom API integration: Build a custom MCP server for your team's internal APIs. Expose typed tools that map to your specific endpoints and workflows.
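As a rough illustration of what "typed tools" means for a custom integration, the sketch below validates a tool's arguments against a declared input type before calling a handler. It uses only the standard library and does not use an MCP SDK; a real server would register the tool through one. The tool name `get_order_status`, its fields, the `ord_` prefix rule, and the placeholder result are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GetOrderStatusInput:
    """Declared input shape for a hypothetical internal-API tool."""
    order_id: str
    include_history: bool = False

    def __post_init__(self) -> None:
        # Reject malformed arguments before any internal API is touched.
        if not isinstance(self.order_id, str) or not self.order_id.startswith("ord_"):
            raise ValueError("order_id must be a string starting with 'ord_'")
        if not isinstance(self.include_history, bool):
            raise ValueError("include_history must be a boolean")


def get_order_status(args: GetOrderStatusInput) -> dict:
    # Placeholder: a real handler would call the team's internal API here.
    result = {"order_id": args.order_id, "status": "unknown"}
    if args.include_history:
        result["history"] = []
    return result


# Maps each tool name to its input type and handler.
TOOLS = {"get_order_status": (GetOrderStatusInput, get_order_status)}


def call_tool(name: str, arguments: dict) -> dict:
    """Validate arguments against the tool's input type, then run the handler."""
    if name not in TOOLS:
        return {"ok": False, "error": f"unknown tool: {name}"}
    input_type, handler = TOOLS[name]
    try:
        parsed = input_type(**arguments)  # TypeError on missing or extra fields
    except (TypeError, ValueError) as exc:
        return {"ok": False, "error": str(exc)}
    return {"ok": True, "result": handler(parsed)}


if __name__ == "__main__":
    print(call_tool("get_order_status", {"order_id": "ord_123"}))
    print(call_tool("get_order_status", {"order_id": "123"}))
```

Keeping validation in the input type means the model gets a clear error for a bad call instead of a confusing failure from the internal API.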