2026-02-16 14:02:42 +09:00

---
name: ck-gkg
description: >
Queries and explores Google Knowledge Graph for entity information and relationships.
Activate when user says 'look up entity', 'knowledge graph search', 'get entity details',
'find information about a person or organization', or 'Google Knowledge Graph'.
Accepts entity names, types, and relationship queries.
---
## Overview
Interfaces with the Google Knowledge Graph Search API to retrieve structured entity data including descriptions, types, images, and related entities for people, places, organizations, and concepts.
## When to Use
- Retrieving structured factual data about a named entity (person, org, place, concept)
- Enriching application data with authoritative entity metadata
- Verifying entity existence and canonical identifiers (MID)
- Building entity-linked features (auto-complete, entity cards, disambiguation)
- Cross-referencing entities across data sources using Knowledge Graph IDs
## Don't Use When
- Task needs real-time news or current events (Knowledge Graph has limited freshness)
- Searching for web pages rather than entities (use a web search API)
- Querying proprietary or internal data — KG only covers public knowledge
- Deep relationship traversal across many hops (use a graph database instead)
## Steps / Instructions
### 1. Set Up API Access
Required: Google Cloud project with Knowledge Graph Search API enabled.
```bash
# Environment variable (never hardcode)
export GOOGLE_API_KEY="your-key-here"
```
Enable the API in the Google Cloud Console:
```
APIs & Services → Library → Knowledge Graph Search API → Enable
```
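
A missing environment variable otherwise surfaces as a bare `KeyError` deep inside the request code. The sketch below fails early with a clearer message; `require_api_key` is a hypothetical helper name, not part of any Google library.

```python
import os


def require_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from the environment or raise a descriptive error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before calling the API.")
    return key
```

Calling this once at startup keeps configuration errors separate from HTTP errors.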
### 2. Basic Entity Search
```python
import os

import requests

KG_SEARCH_URL = "https://kgsearch.googleapis.com/v1/entities:search"


def search_knowledge_graph(query: str, types: list[str] | None = None, limit: int = 5) -> dict:
    """Search the Knowledge Graph and return the decoded JSON response."""
    params = {
        "query": query,
        "key": os.environ["GOOGLE_API_KEY"],
        "limit": limit,
    }
    if types:
        # requests repeats the parameter for each list item (types=A&types=B)
        params["types"] = types  # e.g., ["Person", "Organization"]
    # A timeout keeps a stalled connection from hanging the caller
    response = requests.get(KG_SEARCH_URL, params=params, timeout=10)
    response.raise_for_status()
    return response.json()


# Usage
results = search_knowledge_graph("Python programming language", types=["SoftwareApplication"])
```
### 3. Parse the Response
```python
def extract_entities(kg_response: dict) -> list[dict]:
    """Flatten each search result into a plain dict of commonly used fields."""
    entities = []
    for item in kg_response.get("itemListElement", []):
        result = item.get("result", {})
        entities.append({
            "name": result.get("name"),
            "description": result.get("description"),
            # detailedDescription and image are optional nested objects
            "detailed_description": result.get("detailedDescription", {}).get("articleBody"),
            "types": result.get("@type", []),
            "mid": result.get("@id"),  # entity ID, e.g. "kg:/m/05z1_"
            "url": result.get("url"),
            "image": result.get("image", {}).get("url"),
            "score": item.get("resultScore"),
        })
    return entities
```
### 4. Entity Type Filters
Common KG entity types:
```
Person — individuals, public figures
Organization — companies, nonprofits, governments
Place — cities, countries, landmarks
Event — historical or recurring events
CreativeWork — books, films, music, software
Thing — generic fallback type
```
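
A result often carries several types at once (for example `Thing` alongside `Person`), so filtering after parsing can be useful in addition to the `types` request parameter. The sketch below assumes entity dicts shaped like the output of `extract_entities` in step 3; `filter_by_type` is a hypothetical helper.

```python
def filter_by_type(entities: list[dict], wanted: set[str]) -> list[dict]:
    """Keep entities whose "types" field shares at least one value with wanted."""
    matches = []
    for entity in entities:
        types = entity.get("types") or []
        if isinstance(types, str):
            # Tolerate a single type given as a bare string
            types = [types]
        if wanted.intersection(types):
            matches.append(entity)
    return matches


people = filter_by_type(
    [{"name": "A", "types": ["Thing", "Person"]}, {"name": "B", "types": ["Place"]}],
    {"Person", "Organization"},
)
print([e["name"] for e in people])  # prints ['A']
```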
### 5. Handle Multi-Entity Disambiguation
```python
def disambiguate_entity(name: str, context_hint: str) -> dict | None:
    results = search_knowledge_graph(name, limit=10)
    entities = extract_entities(results)
    # Return the first entity whose short description contains the hint
    for entity in entities:
        desc = (entity.get("description") or "").lower()
        if context_hint.lower() in desc:
            return entity
    # Fall back to the first (highest-scored) result
    return entities[0] if entities else None
```
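
Substring matching only succeeds when the whole hint appears verbatim in the short description. One alternative, sketched here as a swapped-in technique rather than the method above, ranks candidates by how many hint words appear in either description field. `rank_by_context` is a hypothetical helper operating on the dicts from step 3.

```python
import re


def rank_by_context(entities: list[dict], context_hint: str) -> list[dict]:
    """Sort entities by word overlap with the hint; ties keep their original order."""
    hint_tokens = set(re.findall(r"\w+", context_hint.lower()))

    def overlap(entity: dict) -> int:
        parts = [entity.get("description"), entity.get("detailed_description")]
        text = " ".join(p for p in parts if p)
        return len(hint_tokens & set(re.findall(r"\w+", text.lower())))

    # sorted() is stable, so equally scored entities stay in API order
    return sorted(entities, key=overlap, reverse=True)
```

Because ties preserve the incoming order, a hint that matches nothing leaves the API's own ranking untouched.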
### 6. Rate Limits and Caching
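- Free quota: up to 100,000 read calls per day per project
- Cache responses for repeated entity lookups
- Use the MID as the cache key when storing entities, since it gives a stable entity reference; a query-keyed cache like the one below only deduplicates repeated searches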
```python
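import json
from functools import lru_cache


@lru_cache(maxsize=500)
def cached_entity_lookup(query: str) -> str:
    """Return the top result for a query as a JSON string (immutable, so safe to cache)."""
    results = search_knowledge_graph(query, limit=1)
    # json.dumps yields text that json.loads can parse back, unlike str(dict)
    return json.dumps(results)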
```
## Notes
- Restrict the API key to the Knowledge Graph Search API in the Google Cloud Console
- MID identifiers are stable — use them for entity cross-referencing
- Knowledge Graph data draws largely on public sources such as Wikipedia and Wikidata; verify critical facts independently
- Response scores are relative within a single query, not absolute quality metrics
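
Since scores are only comparable within one response, it can help to express each as a fraction of that response's top score before displaying or thresholding it. The sketch below assumes entity dicts with a `"score"` key as produced in step 3; `relative_scores` and the `relative_score` field are hypothetical names.

```python
def relative_scores(entities: list[dict]) -> list[dict]:
    """Return copies of the entities with score expressed relative to the top score."""
    scores = [e.get("score") or 0.0 for e in entities]
    top = max(scores, default=0.0)
    return [
        dict(e, relative_score=(s / top if top > 0 else 0.0))
        for e, s in zip(entities, scores)
    ]
```

The ratios are meaningful only among results of the same query and should not be compared across queries.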