
Get Web Crawler Node

The Get Web Crawler Node allows you to extract data from web pages by providing a URL. This node crawls the specified webpage and retrieves its content, making it easy to gather information from websites for analysis, processing, or display in your workflows.

Get Web Crawler node


Basic Usage

A typical workflow combines a Text node (supplying the URL), the Get Web Crawler node (fetching the page), and a Display Text node (showing the extracted content).


Inputs

Input (Web URL)

  • Type: Text containing a valid web page URL
  • Mandatory: Yes
  • Works best with: Text Input, Text node, Form Node

Provide the complete URL of the webpage you want to crawl (e.g., https://www.example.com or https://www.example.com/page).
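Because the node requires a complete URL, it can help to see what "complete" means in practice. The node's internal validation is not documented, but a minimal sketch of an equivalent check, using only Python's standard library, might look like this (the function name `is_valid_web_url` is illustrative, not part of the product):

```python
from urllib.parse import urlparse

def is_valid_web_url(url: str) -> bool:
    """Return True if the string is a complete http(s) URL.

    A complete URL needs a scheme (http/https) and a host;
    "www.example.com" alone is missing the scheme and fails.
    """
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

# is_valid_web_url("https://www.example.com/page")  -> True
# is_valid_web_url("www.example.com")               -> False (no scheme)
```

If a URL fails a check like this, add the `https://` prefix before passing it to the node.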


Outputs

Output

  • Type: Text containing the extracted web page content
  • Works best with: Display Text, AI General Prompt, Text to Speech, Document Download

The extracted content from the specified webpage, including text, headings, and other readable content.
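The node's exact extraction logic is not public, but conceptually it resembles stripping markup while keeping readable text. A rough sketch of that idea with Python's built-in `html.parser` (the class and function names here are hypothetical, for illustration only):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect readable text, skipping <script> and <style> blocks."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # >0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())

def extract_text(html: str) -> str:
    """Return the readable text of an HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.chunks)
```

Running `extract_text` on a page yields headings, paragraphs, and navigation labels as plain text, which is the kind of output the node produces.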


Example Workflow

Website Content Extractor

Scenario: Extract and display the content from the LearningFlow website homepage.

Get Web Crawler Example

Steps to Create the Flow:

  1. Start with the Start Node.

  2. Add and connect a Text node with the website URL.

    • Example URL: https://www.learningflow.ai/
    • Connect its output to the Input (Web URL)
  3. Add and connect the Get Web Crawler node:

    • Connect the Text node's output to the Input (Web URL)
    • The node will automatically crawl and extract content when executed
  4. Add Display Text to show the extracted content.

    • Connect the Output to Display Text
    • Users will see the webpage content in text format
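The steps above wire three nodes into a fetch-and-display pipeline. As a mental model only (the product's internals are not documented), the flow behaves roughly like this sketch; `get_web_crawler` and `display_text` are illustrative names, and the `opener` parameter is added purely so the fetch step can be stubbed out:

```python
from urllib.request import urlopen, Request

def get_web_crawler(url: str, opener=None) -> str:
    """Mimic the crawler node: take a URL, return the page content as text."""
    if opener is None:
        def opener(u):
            req = Request(u, headers={"User-Agent": "Mozilla/5.0"})
            return urlopen(req, timeout=10).read().decode("utf-8", errors="replace")
    return opener(url)

def display_text(text: str) -> None:
    """Mimic the Display Text node: show the content to the user."""
    print(text)

# Wiring that mirrors the flow: Text node -> Get Web Crawler -> Display Text
# display_text(get_web_crawler("https://www.learningflow.ai/"))
```

In the actual product no code is written; the connections between nodes perform this wiring.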

Preview:

The flow extracts content from the LearningFlow website, including:

LearningFlow.ai

Resources
Close Resources Open Resources

Featured Tools
...

The extracted content includes all readable text from the webpage, such as headings, paragraphs, navigation items, and other text elements.

Result:

Users can:

  • Extract text content from any publicly accessible webpage
  • Gather information from websites without manual copying
  • Use the extracted content for further analysis or processing
  • Access website data programmatically