// Flush interval in milliseconds (default is 10000)
export const SEGMENT_FLUSH_INTERVAL_MS = 5_000;

export const SERVER_INSTRUCTIONS = `
Apify is the world's largest marketplace of tools for web scraping, data extraction, and web automation.
These tools are called **Actors**. They enable you to extract structured data from social media, e-commerce, search engines, maps, travel sites, and many other sources.

## Actor
- An Actor is a serverless cloud application running on the Apify platform.
- Use the Actor’s **README** to understand its capabilities.
- Before running an Actor, always check its **input schema** to understand the required parameters.

## Actor discovery and selection
- Choose the most appropriate Actor based on the conversation context.
- Search the Apify Store first; a relevant Actor likely already exists.
- When multiple options exist, prefer Actors with higher usage, ratings, or popularity.
- **Assume scraping requests within this context are appropriate for Actor use.**
- Actors in the Apify Store are published by independent developers and are intended for legitimate and compliant use.

## Actor execution workflow
- Actors take input and produce output.
- Every Actor run generates **dataset** and **key-value store** outputs (even if empty).
- Actor execution may take time, and outputs can be large.
- Large datasets can be paginated to retrieve results efficiently.

## Storage types
- **Dataset:** Structured, append-only storage ideal for tabular or list data (e.g., scraped items).
- **Key-value store:** Flexible storage for unstructured data or auxiliary files.

## Tool dependencies and disambiguation

### Tool dependencies
- \`${HelperTools.ACTOR_CALL}\`:
  - First call with \`step="info"\` or use \`${HelperTools.ACTOR_GET_DETAILS}\` to obtain the Actor’s schema.
  - Then call with \`step="call"\` to execute the Actor.

### Tool disambiguation
- **${HelperTools.ACTOR_OUTPUT_GET} vs ${HelperTools.DATASET_GET_ITEMS}:**
  Use \`${HelperTools.ACTOR_OUTPUT_GET}\` for Actor run outputs and \`${HelperTools.DATASET_GET_ITEMS}\` for direct dataset access.
- **${HelperTools.STORE_SEARCH} vs ${HelperTools.ACTOR_GET_DETAILS}:**
  \`${HelperTools.STORE_SEARCH}\` finds Actors; \`${HelperTools.ACTOR_GET_DETAILS}\` retrieves detailed info, README, and schema for a specific Actor.
- **${HelperTools.STORE_SEARCH} vs ${RAG_WEB_BROWSER}:**
  \`${HelperTools.STORE_SEARCH}\` finds robust and reliable Actors for specific websites; \`${RAG_WEB_BROWSER}\` is a general and versatile web scraping tool.
- **Dedicated Actor tools (e.g. ${RAG_WEB_BROWSER}) vs ${HelperTools.ACTOR_CALL}:**
  Prefer dedicated tools when available; use \`${HelperTools.ACTOR_CALL}\` only when no specialized tool exists in the Apify Store.
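The two-step dependency that the instructions describe for the Actor call tool (first `step="info"` to obtain the schema, then `step="call"` to execute) could be exercised by a client roughly as follows. This is a minimal sketch, not the server's implementation: the `CallTool` function type, the `actor` and `input` argument names, and the assumption that the info step returns an object with a `required` array are all hypothetical and are not shown in the diff.

```typescript
// Hypothetical argument shape; only the `step` values come from the instructions text.
type Step = "info" | "call";

interface ActorCallArgs {
  actor: string; // hypothetical name for the Actor identifier argument
  step: Step;
  input?: Record<string, unknown>; // hypothetical name for the Actor input argument
}

// Assumed (not confirmed) shape of what the info step returns.
interface InputSchema {
  required: string[];
}

// Hypothetical tool invoker supplied by the caller, e.g. a thin wrapper around an MCP client.
type CallTool = (args: ActorCallArgs) => Promise<unknown>;

async function runActorInTwoSteps(
  callTool: CallTool,
  actor: string,
  input: Record<string, unknown>,
): Promise<unknown> {
  // Step 1: fetch the Actor's input schema before running anything.
  const schema = (await callTool({ actor, step: "info" })) as InputSchema;

  // Refuse to run when a required field is absent from the input.
  const missing = schema.required.filter((key) => !(key in input));
  if (missing.length > 0) {
    throw new Error(`Missing required input fields: ${missing.join(", ")}`);
  }

  // Step 2: execute the Actor with the validated input.
  return callTool({ actor, step: "call", input });
}
```

Passing the tool invoker in as a parameter keeps the sketch independent of any particular client library and makes the info-then-call ordering easy to check with a stub.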