Crawl4AI
Crawl4AI: Open-source LLM-friendly web crawler and scraping tool.
Configuring a Crawl4AI MCP Tool Instance
On the XpertAI platform, you can directly configure the built-in Crawl4AI MCP service as an SSE-type MCP tool instance for use by agents and workflows.
You can also find Crawl4AI in the MCP tool template marketplace and create it with one click.
Configuration steps are as follows:
Configuration Example
When adding an MCP tool on the XpertAI platform, configure it as:
type: sse
url: "http://crawl4ai:11235/mcp/sse"
name: crawl4ai
Usage
Once configured, agents or workflows can directly call the capabilities provided by the `crawl4ai` MCP tool (e.g., `md`, `html`, `screenshot`, `pdf`, `crawl`, `ask`) during orchestration.
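Before saving the tool instance, it can help to sanity-check the three fields. The following is a minimal sketch using only the Python standard library; the dict layout and the `check_mcp_config` helper are illustrative assumptions, not part of XpertAI or Crawl4AI.

```python
from urllib.parse import urlparse

# Mirrors the configuration example above. The dict layout is only an
# illustration, not XpertAI's internal schema.
CRAWL4AI_TOOL = {
    "type": "sse",
    "url": "http://crawl4ai:11235/mcp/sse",
    "name": "crawl4ai",
}


def check_mcp_config(config: dict) -> list[str]:
    """Return a list of problems found in an SSE-type MCP tool config."""
    problems = []
    if config.get("type") != "sse":
        problems.append("type must be 'sse'")
    if not config.get("name"):
        problems.append("name is required")
    parsed = urlparse(config.get("url", ""))
    if parsed.scheme not in ("http", "https"):
        problems.append("url must use http or https")
    if not parsed.hostname:
        problems.append("url must include a host")
    return problems


if __name__ == "__main__":
    print(check_mcp_config(CRAWL4AI_TOOL) or "config looks well-formed")
```

The check is purely syntactic: it does not confirm that the `crawl4ai` host is reachable from the platform.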
Use Cases
Use Case 1: Automated Web Content Collection Agent ("Summary Bot")
Objective: Enable users to submit a webpage link through an intelligent agent to automatically retrieve a Markdown text summary of the page.
Key Steps
- User inputs a link → The agent triggers the MCP's `md` tool to convert the target webpage into Markdown text.
- The agent receives and displays the summary, and can further extract key content based on user needs.
Summary Illustration
User: Please summarize the main content of https://example.com.
Agent → Calls MCP tool `md` to retrieve content in Markdown format
Agent → Returns the summary to the user
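The flow above can be sketched as two small steps: pull the link out of the user's message, then build the MCP `tools/call` request for `md`. This is a rough illustration rather than how XpertAI implements it. The helper names are hypothetical, and the `url` argument name is an assumption; check the Crawl4AI documentation for the tool's actual parameters.

```python
import json
import re
from typing import Optional

URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_url(message: str) -> Optional[str]:
    """Return the first http(s) URL in a user message, or None."""
    match = URL_PATTERN.search(message)
    if match is None:
        return None
    # Drop sentence punctuation that often trails a pasted link.
    return match.group(0).rstrip(".,;:!?)")


def build_md_call(url: str, request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` JSON-RPC request for the `md` tool."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "md",
            # Assumption: the tool accepts a `url` argument.
            "arguments": {"url": url},
        },
    }
    return json.dumps(request)


if __name__ == "__main__":
    link = extract_url("Please summarize the main content of https://example.com.")
    if link is not None:
        print(build_md_call(link))
```

On the platform, the agent runtime handles this transport for you; the sketch only shows what a single `md` invocation amounts to at the protocol level.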
Use Case 2: Multimedia Content Capture and Analysis Workflow ("Report Generator")
Objective: After a user inputs a target URL, the agent automatically captures a webpage screenshot and PDF, and generates a final report on the XpertAI platform.
Implementation Steps
- User inputs a link → The agent sequentially calls:
  - `screenshot`: Generates a webpage screenshot
  - `pdf`: Exports the webpage as a PDF
  - Optional: `ask` or `html` → Extracts structured text content
- The agent compiles the screenshot, PDF, and text into a shareable report.
Workflow Illustration
User: Please capture the https://example.com page and generate a report.
→ Agent calls MCP:
1. Calls `screenshot` to obtain a page screenshot
2. Calls `pdf` to obtain a page PDF
3. Optional: Calls `ask`/`html` to retrieve structured text or Markdown
Agent: Below are the page screenshot and PDF file, with the extracted summary as follows... (displays content)
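The sequential calls in this workflow can be sketched as a loop over the planned tools, with the actual MCP connection injected as a function. Everything here is a hypothetical illustration: `ToolCaller`, `plan_report_calls`, `build_report`, the report layout, and the `url` argument name are assumptions, and `html` stands in for the optional text step.

```python
from typing import Callable, Dict, List

# Hypothetical signature: (tool_name, arguments) -> tool result.
ToolCaller = Callable[[str, Dict[str, str]], object]


def plan_report_calls(include_text: bool = True) -> List[str]:
    """Return the tool names in the order the workflow calls them."""
    tools = ["screenshot", "pdf"]
    if include_text:
        tools.append("html")  # optional text-extraction step
    return tools


def build_report(url: str, call_tool: ToolCaller, include_text: bool = True) -> dict:
    """Call each planned tool in turn and collect the results into one report."""
    report = {"url": url, "artifacts": {}, "errors": {}}
    for tool in plan_report_calls(include_text):
        try:
            report["artifacts"][tool] = call_tool(tool, {"url": url})
        except Exception as exc:
            # Record the failure and continue, so one failed tool
            # does not discard the artifacts already collected.
            report["errors"][tool] = str(exc)
    return report


if __name__ == "__main__":
    # Stub caller standing in for the real MCP connection.
    def fake_call(tool: str, arguments: Dict[str, str]) -> str:
        return f"<{tool} of {arguments['url']}>"

    print(build_report("https://example.com", fake_call))
```

Collecting errors per tool instead of aborting is a design choice for this sketch: a report with a screenshot but no PDF is usually still worth showing to the user.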
Cross-Scenario Common Configuration Notes
MCP Tool List
| Tool Name | Function Description |
| --- | --- |
| `md` | Generates page content in Markdown |
| `html` | Extracts preprocessed HTML |
| `screenshot` | Captures a full-page screenshot (PNG) |
| `pdf` | Exports the page as a PDF document |
| `execute_js` | Executes custom JavaScript in the page context |
| `crawl` | Crawls multiple URLs |
| `ask` | Queries the indexed library context |

For more tool parameters and usage, visit:
https://docs.crawl4ai.com/core/docker-deployment/#mcp-model-context-protocol-support