Back to Autogpt

Exa Websets

docs/integrations/block-integrations/exa/websets.md

0.6.4420.7 KB
Original Source

Exa Websets

<!-- MANUAL: file_description -->

Blocks for creating and managing Exa websets for continuous web monitoring.

<!-- END MANUAL -->

Exa Cancel Webset

What it is

Cancel all operations being performed on a Webset

How it works

<!-- MANUAL: how_it_works -->

This block cancels all running operations (searches, enrichments) on a webset. The webset transitions to an idle state and any in-progress operations are stopped.

The block is useful for stopping long-running operations that are no longer needed or when you need to modify the webset configuration. Items already processed before cancellation are retained.

<!-- END MANUAL -->

Inputs

InputDescriptionTypeRequired
webset_idThe ID or external ID of the Webset to cancelstrYes

Outputs

OutputDescriptionType
errorError message if the operation failedstr
webset_idThe unique identifier for the websetstr
statusThe status of the webset after cancellationstr
external_idThe external identifier for the websetstr
successWhether the cancellation was successfulstr

Possible use case

<!-- MANUAL: use_case -->

Resource Management: Stop expensive operations on websets that are no longer needed.

Configuration Updates: Cancel operations before making changes to webset settings.

Error Recovery: Stop problematic operations and restart with corrected parameters.

<!-- END MANUAL -->

Exa Create Or Find Webset

What it is

Create a new webset or return existing one by external_id (idempotent operation)

How it works

<!-- MANUAL: how_it_works -->

This block implements idempotent webset creation using an external ID. If a webset with the given external_id already exists, it returns that webset. Otherwise, it creates a new one.

This pattern prevents duplicate websets when workflows retry or run multiple times. The block indicates whether the webset was newly created or already existed.

<!-- END MANUAL -->

Inputs

InputDescriptionTypeRequired
external_idExternal identifier for this webset - used to find existing or create newstrYes
search_querySearch query (optional - only needed if creating new webset)strNo
search_countNumber of items to find in initial searchintNo
metadataKey-value pairs to associate with the websetDict[str, Any]No

Outputs

OutputDescriptionType
errorError message if the operation failedstr
websetThe webset (existing or newly created)Webset
was_createdTrue if webset was newly created, False if it already existedbool

Possible use case

<!-- MANUAL: use_case -->

Idempotent Workflows: Safely re-run workflows without creating duplicate websets.

External System Integration: Map websets to IDs from your own systems for easy reference.

Retry-Safe Operations: Handle workflow retries gracefully by reusing existing websets.

<!-- END MANUAL -->

Exa Create Webset

What it is

Create a new Exa Webset for persistent web search collections with optional waiting for initial results

How it works

<!-- MANUAL: how_it_works -->

This block creates a new Exa Webset, a persistent collection that stores web search results. You define a search query, entity type, and optional criteria that items must meet. The webset continuously evaluates potential matches against your criteria.

The block supports advanced features like scoped searches (searching within specific imports or other websets), enrichments for extracting structured data, and relationship-based "hop" searches. You can wait for initial results or return immediately for asynchronous processing.

<!-- END MANUAL -->

Inputs

InputDescriptionTypeRequired
search_queryYour search query. Use this to describe what you are looking for. Any URL provided will be crawled and used as context for the search.strYes
search_countNumber of items the search will attempt to find. The actual number of items found may be less than this number depending on the search complexity.intNo
search_entity_typeEntity type: 'company', 'person', 'article', 'research_paper', or 'custom'. If not provided, we automatically detect the entity from the query."company" | "person" | "article" | "research_paper" | "custom" | "auto"No
search_entity_descriptionDescription for custom entity type (required when search_entity_type is 'custom')strNo
search_criteriaList of criteria descriptions that every item will be evaluated against. If not provided, we automatically detect the criteria from the query.List[str]No
search_exclude_sourcesList of source IDs (imports or websets) to exclude from search resultsList[str]No
search_exclude_typesList of source types corresponding to exclude sources ('import' or 'webset')List["import" | "webset"]No
search_scope_sourcesList of source IDs (imports or websets) to limit search scope toList[str]No
search_scope_typesList of source types corresponding to scope sources ('import' or 'webset')List["import" | "webset"]No
search_scope_relationshipsList of relationship definitions for hop searches (optional, one per scope source)List[str]No
search_scope_relationship_limitsList of limits on the number of related entities to find (optional, one per scope relationship)List[int]No
import_sourcesList of source IDs to import fromList[str]No
import_typesList of source types corresponding to import sources ('import' or 'webset')List["import" | "webset"]No
enrichment_descriptionsList of enrichment task descriptions to perform on each webset itemList[str]No
enrichment_formatsList of formats for enrichment responses ('text', 'date', 'number', 'options', 'email', 'phone'). If not specified, we automatically select the best format.List["text" | "date" | "number" | "options" | "email" | "phone"]No
enrichment_optionsList of option lists for enrichments with 'options' format. Each inner list contains the option labels.List[List[str]]No
enrichment_metadataList of metadata dictionaries for enrichmentsList[Dict[str, Any]]No
external_idExternal identifier for the webset. You can use this to reference the webset by your own internal identifiers.strNo
metadataKey-value pairs to associate with this websetDict[str, Any]No
wait_for_initial_resultsWait for the initial search to complete before returning. This ensures you get results immediately.boolNo
polling_timeoutMaximum time to wait for completion in seconds (only used if wait_for_initial_results is True)intNo

Outputs

OutputDescriptionType
errorError message if the operation failedstr
websetThe created webset with full detailsWebset
initial_item_countNumber of items found in the initial search (only if wait_for_initial_results was True)int
completion_timeTime taken to complete the initial search in seconds (only if wait_for_initial_results was True)float

Possible use case

<!-- MANUAL: use_case -->

Lead Generation: Create websets to find companies or people matching specific criteria for sales outreach.

Competitive Intelligence: Build persistent collections tracking competitors, market entrants, or industry news.

Research Databases: Compile curated collections of articles, papers, or resources on specific topics.

<!-- END MANUAL -->

Exa Delete Webset

What it is

Delete a Webset and all its items

How it works

<!-- MANUAL: how_it_works -->

This block permanently deletes a webset and all of its items, searches, enrichments, and monitors. The operation cannot be undone.

Use this to clean up websets that are no longer needed or to remove test data. The block accepts either the Exa-generated ID or your custom external_id.

<!-- END MANUAL -->

Inputs

InputDescriptionTypeRequired
webset_idThe ID or external ID of the Webset to deletestrYes

Outputs

OutputDescriptionType
errorError message if the operation failedstr
webset_idThe unique identifier for the deleted websetstr
external_idThe external identifier for the deleted websetstr
statusThe status of the deleted websetstr
successWhether the deletion was successfulstr

Possible use case

<!-- MANUAL: use_case -->

Cleanup Operations: Remove completed or abandoned websets to maintain organization.

Data Management: Delete websets containing outdated or irrelevant data.

Cost Control: Remove unused websets to prevent unnecessary storage costs.

<!-- END MANUAL -->

Exa Get Webset

What it is

Retrieve a Webset by ID or external ID

How it works

<!-- MANUAL: how_it_works -->

This block retrieves detailed information about a specific webset including its status, configured searches, enrichments, and monitors.

The block returns the webset's current state, metadata, and timestamps. Use this to check webset configuration or monitor status before performing operations.

<!-- END MANUAL -->

Inputs

InputDescriptionTypeRequired
webset_idThe ID or external ID of the Webset to retrievestrYes

Outputs

OutputDescriptionType
errorError message if the operation failedstr
webset_idThe unique identifier for the websetstr
statusThe status of the websetstr
external_idThe external identifier for the websetstr
searchesThe searches performed on the websetList[Dict[str, Any]]
enrichmentsThe enrichments applied to the websetList[Dict[str, Any]]
monitorsThe monitors for the websetList[Dict[str, Any]]
metadataKey-value pairs associated with the websetDict[str, Any]
created_atThe date and time the webset was createdstr
updated_atThe date and time the webset was last updatedstr

Possible use case

<!-- MANUAL: use_case -->

Configuration Review: Retrieve webset details to verify settings before making changes.

Status Checking: Check the current status and configuration of a webset.

Workflow Integration: Fetch webset information for use in downstream workflow steps.

<!-- END MANUAL -->

Exa List Websets

What it is

List all Websets with pagination support

How it works

<!-- MANUAL: how_it_works -->

This block retrieves a paginated list of all your websets. Results include basic webset information and can be paginated through using cursor tokens.

Use this to discover existing websets, find specific websets by browsing, or build management interfaces for your webset collections.

<!-- END MANUAL -->

Inputs

InputDescriptionTypeRequired
triggerTrigger for the webset, value is ignored!AnyNo
cursorCursor for pagination through resultsstrNo
limitNumber of websets to return (1-100)intNo

Outputs

OutputDescriptionType
errorError message if the operation failedstr
websetsList of websetsList[Webset]
has_moreWhether there are more results to paginate throughbool
next_cursorCursor for the next page of resultsstr

Possible use case

<!-- MANUAL: use_case -->

Inventory Management: List all websets to understand your current data collections.

Bulk Operations: Iterate through websets to perform batch updates or cleanup.

Dashboard Building: Retrieve webset listings for management dashboards or reporting.

<!-- END MANUAL -->

Exa Preview Webset

What it is

Preview how a search query will be interpreted before creating a webset. Helps understand entity detection, criteria generation, and available enrichments.

How it works

<!-- MANUAL: how_it_works -->

This block analyzes your search query and shows how Exa will interpret it before you create a webset. It reveals the detected entity type, generated criteria, and available enrichment columns.

Use this to refine your query and understand what results to expect. The block also provides suggestions for improving your query for better results.

<!-- END MANUAL -->

Inputs

InputDescriptionTypeRequired
queryYour search query to preview. Use this to see how Exa will interpret your search before creating a webset.strYes
entity_typeEntity type to force: 'company', 'person', 'article', 'research_paper', or 'custom'. If not provided, Exa will auto-detect."company" | "person" | "article" | "research_paper" | "custom" | "auto"No
entity_descriptionDescription for custom entity type (required when entity_type is 'custom')strNo

Outputs

OutputDescriptionType
errorError message if the operation failedstr
previewFull preview response with search and enrichment detailsPreviewWebsetModel
entity_typeThe detected or specified entity typestr
entity_descriptionDescription of the entity typestr
criteriaGenerated search criteria that will be usedList[PreviewCriterionModel]
enrichment_columnsAvailable enrichment columns that can be extractedList[PreviewEnrichmentModel]
interpretationHuman-readable interpretation of how the query will be processedstr
suggestionsSuggestions for improving the queryList[str]

Possible use case

<!-- MANUAL: use_case -->

Query Optimization: Test and refine search queries before committing to webset creation.

Entity Validation: Verify that Exa correctly detects the entity type for your use case.

Enrichment Planning: Discover available enrichment columns to plan data extraction.

<!-- END MANUAL -->

Exa Update Webset

What it is

Update metadata for an existing Webset

How it works

<!-- MANUAL: how_it_works -->

This block updates the metadata associated with an existing webset. Metadata is stored as key-value pairs and can be used to organize, tag, or annotate websets.

Setting metadata to null clears all existing metadata. This operation does not affect the webset's items, searches, or enrichments.

<!-- END MANUAL -->

Inputs

InputDescriptionTypeRequired
webset_idThe ID or external ID of the Webset to updatestrYes
metadataKey-value pairs to associate with this webset (set to null to clear)Dict[str, Any]No

Outputs

OutputDescriptionType
errorError message if the operation failedstr
webset_idThe unique identifier for the websetstr
statusThe status of the websetstr
external_idThe external identifier for the websetstr
metadataUpdated metadata for the websetDict[str, Any]
updated_atThe date and time the webset was updatedstr

Possible use case

<!-- MANUAL: use_case -->

Tagging Systems: Add tags or labels to websets for organization and filtering.

Project Association: Link websets to specific projects or campaigns via metadata.

Workflow State: Store workflow-related state or flags in webset metadata.

<!-- END MANUAL -->

Exa Webset Ready Check

What it is

Check if webset is ready for next operation - enables conditional workflow branching

How it works

<!-- MANUAL: how_it_works -->

This block checks if a webset is idle (no running operations) and optionally has a minimum number of items. It returns a boolean ready status along with a recommendation for the next action.

Use this block for conditional workflow branching to decide whether to proceed with processing, wait for more results, or add additional searches.

<!-- END MANUAL -->

Inputs

InputDescriptionTypeRequired
webset_idThe ID or external ID of the Webset to checkstrYes
min_itemsMinimum number of items required to be 'ready'intNo

Outputs

OutputDescriptionType
errorError message if the operation failedstr
is_readyTrue if webset is idle AND has minimum itemsbool
statusCurrent webset statusstr
item_countNumber of items in websetint
has_searchesWhether webset has any searches configuredbool
has_enrichmentsWhether webset has any enrichmentsbool
recommendationSuggested next action (ready_to_process, waiting_for_results, needs_search, etc.)str

Possible use case

<!-- MANUAL: use_case -->

Workflow Gating: Only proceed with data processing when webset has enough items.

Conditional Branching: Route workflow based on webset readiness for different scenarios.

Polling Logic: Implement smart polling by checking readiness before fetching items.

<!-- END MANUAL -->

Exa Webset Status

What it is

Get a quick status overview of a webset

How it works

<!-- MANUAL: how_it_works -->

This block returns a lightweight status overview of a webset without retrieving full item data. It includes counts for items, searches, enrichments, and monitors along with the current processing status.

Use this for quick status checks and monitoring without the overhead of retrieving complete webset details or items.

<!-- END MANUAL -->

Inputs

InputDescriptionTypeRequired
webset_idThe ID or external ID of the WebsetstrYes

Outputs

OutputDescriptionType
errorError message if the operation failedstr
webset_idThe webset identifierstr
statusCurrent status (idle, running, paused, etc.)str
item_countTotal number of items in the websetint
search_countNumber of searches performedint
enrichment_countNumber of enrichments configuredint
monitor_countNumber of monitors configuredint
last_updatedWhen the webset was last updatedstr
is_processingWhether any operations are currently runningbool

Possible use case

<!-- MANUAL: use_case -->

Status Dashboards: Display webset status in monitoring dashboards.

Health Checks: Verify websets are active and processing as expected.

Lightweight Polling: Check status without fetching full webset data.

<!-- END MANUAL -->

Exa Webset Summary

What it is

Get a comprehensive summary of a webset with samples and statistics

How it works

<!-- MANUAL: how_it_works -->

This block generates a comprehensive summary of a webset including statistics, sample items, and detailed breakdowns of searches and enrichments. It provides an overview useful for reporting and analysis.

You can control what to include in the summary such as sample items, search details, and enrichment details to balance comprehensiveness with response size.

<!-- END MANUAL -->

Inputs

InputDescriptionTypeRequired
webset_idThe ID or external ID of the WebsetstrYes
include_sample_itemsInclude sample items in the summaryboolNo
sample_sizeNumber of sample items to includeintNo
include_search_detailsInclude details about searchesboolNo
include_enrichment_detailsInclude details about enrichmentsboolNo

Outputs

OutputDescriptionType
errorError message if the operation failedstr
webset_idThe webset identifierstr
statusCurrent statusstr
entity_typeType of entities in the websetstr
total_itemsTotal number of itemsint
sample_itemsSample items from the websetList[Dict[str, Any]]
search_summarySummary of searches performedSearchSummaryModel
enrichment_summarySummary of enrichments appliedEnrichmentSummaryModel
monitor_summarySummary of monitors configuredMonitorSummaryModel
statisticsVarious statistics about the websetWebsetStatisticsModel
created_atWhen the webset was createdstr
updated_atWhen the webset was last updatedstr

Possible use case

<!-- MANUAL: use_case -->

Executive Reporting: Generate summaries of webset collections for stakeholder reports.

Quality Review: Review sample items and statistics to assess webset quality.

Progress Tracking: Monitor webset growth and activity through periodic summaries.

<!-- END MANUAL -->