🕷️ ScrapeGraphAI: You Only Scrape Once


🚀 Looking for an even faster and simpler way to scrape at scale (only 5 lines of code)? Check out the enhanced version at ScrapeGraphAI.com! 🚀


English | 中文 | 日本語 | 한국어 | Русский | Türkçe | Deutsch | Español | français | Português


ScrapeGraphAI is a Python web scraping library that uses LLMs and direct graph logic to build scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.).

Just say which information you want to extract and the library will do it for you!


🚀 Integrations

ScrapeGraphAI integrates seamlessly with popular frameworks and tools to extend your scraping capabilities. Whether you are building with Python or Node.js, using LLM frameworks, or working with no-code platforms, our integration options have you covered.

You can find more information at the following link

Integrations:

🚀 Quick install

The reference page for Scrapegraph-ai is available on the official PyPI page: pypi.

bash
pip install scrapegraphai

# IMPORTANT (needed for fetching website content)
playwright install

Note: it is recommended to install the library in a virtual environment to avoid conflicts with other libraries 🐱

💻 Usage

Several standard scraping pipelines are available for extracting information from a website (or a local file).

The most common one is SmartScraperGraph, which extracts information from a single page given a user prompt and a source URL.

python
import json

from scrapegraphai.graphs import SmartScraperGraph

# Define the configuration for the scraping pipeline
graph_config = {
    "llm": {
        "model": "ollama/llama3.2",
        "model_tokens": 8192,
        "format": "json",
    },
    "verbose": True,
    "headless": False,
}

# Create the SmartScraperGraph instance
smart_scraper_graph = SmartScraperGraph(
    prompt="Extract useful information from the webpage, including a description of what the company does, founders and social media links",
    source="https://scrapegraphai.com/",
    config=graph_config,
)

# Run the pipeline
result = smart_scraper_graph.run()

# Pretty-print the extracted dictionary
print(json.dumps(result, indent=4))

[!NOTE] For OpenAI and other models, you only need to change the LLM config!

python
graph_config = {
    "llm": {
        "api_key": "YOUR_OPENAI_API_KEY",
        "model": "openai/gpt-4o-mini",
    },
    "verbose": True,
    "headless": False,
}

The output will be a dictionary like the following:

python
{
    "description": "ScrapeGraphAI transforms websites into clean, organized data for AI agents and data analytics. It offers an AI-powered API for effortless and cost-effective data extraction.",
    "founders": [
        {
            "name": "",
            "role": "Founder & Technical Lead",
            "linkedin": "https://www.linkedin.com/in/perinim/"
        },
        {
            "name": "Marco Vinciguerra",
            "role": "Founder & Software Engineer",
            "linkedin": "https://www.linkedin.com/in/marco-vinciguerra-7ba365242/"
        },
        {
            "name": "Lorenzo Padoan",
            "role": "Founder & Product Engineer",
            "linkedin": "https://www.linkedin.com/in/lorenzo-padoan-4521a2154/"
        }
    ],
    "social_media_links": {
        "linkedin": "https://www.linkedin.com/company/101881123",
        "twitter": "https://x.com/scrapegraphai",
        "github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai"
    }
}
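
Because the result is a plain dictionary, it can be consumed with ordinary Python. The following is a minimal sketch, assuming the output has the shape shown above; the actual keys depend on the prompt and the model, so the `summarize` helper and its defensive `.get` calls are illustrative assumptions, not part of the library.

```python
def summarize(result: dict) -> list[str]:
    """Flatten a SmartScraperGraph-style result into printable lines."""
    lines = [f"Description: {result.get('description', 'n/a')}"]
    for founder in result.get("founders", []):
        # The model may leave a field empty, so fall back to a placeholder.
        name = founder.get("name") or "(unknown)"
        lines.append(f"Founder: {name}, {founder.get('role', 'n/a')}")
    for network, url in result.get("social_media_links", {}).items():
        lines.append(f"{network}: {url}")
    return lines


# Abbreviated sample shaped like the output above (not real scraper output).
sample = {
    "description": "ScrapeGraphAI transforms websites into clean, organized data.",
    "founders": [
        {"name": "", "role": "Founder & Technical Lead"},
        {"name": "Marco Vinciguerra", "role": "Founder & Software Engineer"},
    ],
    "social_media_links": {"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai"},
}

for line in summarize(sample):
    print(line)
```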

Other pipelines can extract information from multiple pages, generate Python scripts, or even generate audio files.

| Pipeline name | Description |
| --- | --- |
| SmartScraperGraph | Single-page scraper that only needs a user prompt and an input source. |
| SearchGraph | Multi-page scraper that extracts information from the top n results of a search engine. |
| SpeechGraph | Single-page scraper that extracts information from a website and generates an audio file. |
| ScriptCreatorGraph | Single-page scraper that extracts information from a website and generates a Python script. |
| SmartScraperMultiGraph | Multi-page scraper that extracts information from multiple pages given a single prompt and a list of sources. |
| ScriptCreatorMultiGraph | Multi-page scraper that generates a Python script for extracting information from multiple pages and sources. |

Each of these graphs has a multi version, which makes it possible to call the LLM in parallel.
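
As a sketch of a multi version, `SmartScraperMultiGraph` takes one prompt and a list of sources instead of a single URL. The example assumes the same `graph_config` shape used in the SmartScraperGraph example and a locally available Ollama model. The `scrape_many` helper is hypothetical, and the library import is deferred into the function so that defining it triggers no browser or LLM calls.

```python
def scrape_many(prompt: str, urls: list[str], config: dict) -> dict:
    """Run one prompt against several pages with SmartScraperMultiGraph."""
    # Imported lazily: needs `pip install scrapegraphai` and `playwright install`.
    from scrapegraphai.graphs import SmartScraperMultiGraph

    graph = SmartScraperMultiGraph(prompt=prompt, source=urls, config=config)
    return graph.run()


# Same configuration shape as the single-page example.
graph_config = {
    "llm": {"model": "ollama/llama3.2", "model_tokens": 8192, "format": "json"},
    "verbose": True,
    "headless": False,
}

# Example call (commented out because it needs a running model and network access):
# result = scrape_many(
#     "Summarize what each page is about",
#     ["https://scrapegraphai.com/", "https://github.com/ScrapeGraphAI/Scrapegraph-ai"],
#     graph_config,
# )
```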

Different LLMs can be used through APIs such as OpenAI, Groq, Azure, and Gemini, or as local models through Ollama.

To use a local model, install Ollama and download the model with the ollama pull command.
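
Switching providers only changes the `llm` entry of the configuration. Here is a minimal sketch, assuming the `provider/model` naming seen in the examples above. `build_graph_config` is a hypothetical helper, not a library function, and the Ollama-specific settings simply mirror the local example in this README.

```python
from typing import Optional


def build_graph_config(model: str, api_key: Optional[str] = None) -> dict:
    """Return a graph config for a 'provider/model' string such as 'openai/gpt-4o-mini'."""
    llm: dict = {"model": model}
    if model.startswith("ollama/"):
        # Local models served by Ollama need no API key; these extra
        # settings are copied from the local example in this README.
        llm.update({"model_tokens": 8192, "format": "json"})
    else:
        if not api_key:
            raise ValueError(f"an API key is required for {model!r}")
        llm["api_key"] = api_key
    return {"llm": llm, "verbose": True, "headless": False}


local_config = build_graph_config("ollama/llama3.2")
hosted_config = build_graph_config("openai/gpt-4o-mini", api_key="YOUR_OPENAI_API_KEY")
print(local_config["llm"])
print(hosted_config["llm"])
```

Either dictionary can then be passed as the `config` argument of a graph; for the local one, the model has to be downloaded first with `ollama pull`.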

📖 Documentation

The documentation for ScrapeGraphAI can be found here. Also check out the Docusaurus here.

🤝 Contributing

Feel free to contribute, and join our Discord server to discuss improvements and share suggestions!

Please see the contributing guidelines.

🔗 ScrapeGraph API & SDKs

If you are looking for a quick way to integrate ScrapeGraph into your system, check out our powerful API here!

We provide SDKs for both Python and Node.js, which makes integration into your projects easy. Check them out below.

| SDK | Language | GitHub link |
| --- | --- | --- |
| Python SDK | Python | scrapegraph-py |
| Node.js SDK | Node.js | scrapegraph-js |

The official API documentation can be found here.

🔥 Benchmark

According to the Firecrawl benchmark, ScrapeGraph is the best fetcher on the market!

📈 Telemetry

We collect anonymous usage metrics to improve the quality of the package and the user experience. The data helps us prioritize improvements and ensure compatibility. To opt out, set the environment variable SCRAPEGRAPHAI_TELEMETRY_ENABLED=false. For more details, see the documentation here.
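
One way to opt out from inside a script is to set the variable before the library is imported. This is a sketch only: it assumes the library reads the setting from the process environment at import time or later, which is the usual pattern but is not spelled out here.

```python
import os

# Opt out of anonymous usage metrics for this process only.
# Set this before importing scrapegraphai so the library can see it.
os.environ["SCRAPEGRAPHAI_TELEMETRY_ENABLED"] = "false"

print(os.environ["SCRAPEGRAPHAI_TELEMETRY_ENABLED"])
```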

❤️ Contributors

🎓 Citations

If you have used our library for research purposes, please cite us with the following reference:

text
  @misc{scrapegraph-ai,
    author = {Lorenzo Padoan, Marco Vinciguerra},
    title = {Scrapegraph-ai},
    year = {2024},
    url = {https://github.com/VinciGit00/Scrapegraph-ai},
    note = {A Python library for scraping leveraging large language models}
  }

Authors

Contact info
Marco Vinciguerra
Lorenzo Padoan

📜 License

ScrapeGraphAI is distributed under the MIT License. See the LICENSE file for details.

Acknowledgements

  • We thank everyone who has contributed to the project, and the open-source community.
  • ScrapeGraphAI is meant to be used for data exploration and research purposes only. We are not responsible for any misuse of the library.

Made with ❤️ by ScrapeGraph AI
