This site is not affiliated with GDELT (gdeltproject.org), but it provides an interface to some of GDELT's data services, specifically its DOC, GEO and Television APIs. These services are related to, but should not be confused with, the GDELT Events databases with which the name "GDELT" is most closely associated by some.
Using this site you can do things like:
- search in English for global content in any or all languages on a particular topic
- search for specific content, e.g. content published in the past hour from Japan, or in Japanese, or referencing a Japanese location
- find content based on features and text in its imagery
- investigate and compare media trends over time
- Search term - any word or phrase in English (GDELT matches these against English translations of non-English content). Phrases should be in "double quotes". Spaces outside quotes are interpreted as AND. Boolean operators are supported, so you can search for '(x OR y)' or 'x -y' (NOT y). There are also two powerful special search functions that can help to find the right content:
- near: articles that feature two words within n words of each other - e.g. near5:"climate emergency" (example)
- repeat: articles that feature a word a minimum of n times (max 5) - e.g. repeat3:"climate" (example)
- Image tags - images within content are processed using deep learning algorithms to identify features and text they contain. Search for available tags in the dialogue box (example)
- Themes - content is interpreted and linked to 'themes' reflecting its subject matter, which you can search by. Themes are based on GDELT's Global Knowledge Graph (GKG).
- Source countries - countries of content origin. Choose up to ~7 - current implementation interprets these as (country1 OR country2)
- Source languages - languages of the content. Choose up to ~7 - currently interpreted as (language1 OR language2)
- Domain - define web domain of content - e.g. 'bbc.co.uk' or top-level domain '.gov' (US government content). Currently supports up to 5 domains, or 7 with _Exclude subdomains_ checked.
- Date range - define any window for content dated since 1 January 2017. 'Recent' field must be clear to use this. CONTENT view will only return matching articles published in the most recent 3 months of your specified window.
- Recent - set to return the most recent content in terms of minutes/hours/days/weeks/months/years. Use format '12h', '5d', '3w', '6m', '1y' (1y max) or just a number for minutes.
- GT - A Google Translate widget to translate article headlines in ART LIST and ART GALLERY modes. To use it, scroll to the bottom of the results window and select the target language. Translations of abbreviated headlines aren't always very good, but should still give an idea of the topic.
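
As a rough illustration of how the search options above might combine into a single query string, here is a minimal Python sketch. It is not this site's actual code. The operator names `sourcecountry`, `sourcelang` and `domain` are assumptions based on GDELT's DOC API conventions, the helper names are hypothetical, and the values are illustrative only. Treating a month as 30 days and a year as 365 days in the 'Recent' parser is also an assumption.

```python
import re
from datetime import timedelta


def or_group(operator, values):
    """Join values as operator:value terms; wrap several in (a OR b)."""
    terms = [f"{operator}:{v}" for v in values]
    if not terms:
        return ""
    if len(terms) == 1:
        return terms[0]
    return "(" + " OR ".join(terms) + ")"


def build_query(term, countries=(), languages=(), domains=()):
    """Combine a search term with optional filters; spaces act as AND."""
    parts = [
        term,
        or_group("sourcecountry", countries),  # assumed operator name
        or_group("sourcelang", languages),     # assumed operator name
        or_group("domain", domains),           # assumed operator name
    ]
    return " ".join(p for p in parts if p)


def parse_recent(value):
    """Convert a 'Recent' value such as '12h' or '90' to a timedelta."""
    match = re.fullmatch(r"(\d+)([hdwmy]?)", value.strip().lower())
    if not match:
        raise ValueError(f"unrecognised Recent value: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    # No suffix means minutes; months and years are approximated
    # as 30 and 365 days (an assumption, not documented behaviour).
    minutes_per_unit = {"": 1, "h": 60, "d": 1440,
                        "w": 10080, "m": 43200, "y": 525600}
    delta = timedelta(minutes=amount * minutes_per_unit[unit])
    if delta > timedelta(days=365):
        raise ValueError("Recent values are limited to one year")
    return delta


query = build_query('near5:"climate emergency"',
                    countries=["japan"],
                    languages=["japanese", "english"])
print(query)
print(parse_recent("12h"))
```

Several values for one filter are grouped with OR, while the separate filters are joined by spaces, which the search term rules above treat as AND.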
- CONTENT - This tab offers the various modes to access the content matching the search (example1)
- TIMELINE - This offers the modes available to view volumetric trends for the query over time (example1, example2).
- GEO - Geographical tools to investigate media published in the past 7 days. Some modes (e.g. default POINT DATA example) report not the origins of content, but countries and place names that are referenced. Some of the modes are image-specific, and will only work for image tag searches. Others work only with search terms and not image tags.
- TV - This explores GDELT's Television Explorer API, a collaboration with the Internet Archive to make searchable the "closed captioning" text streams accompanying TV news - i.e. the digitised speech that is broadcast - for selected networks.
- This tool works by remembering the data for TIMELINE queries that you Add, and comparing them when you click View. See the guide.
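
The "Add" then "View" behaviour described above can be pictured with a small Python sketch. How the site actually stores or compares timeline data is not described here, so the class name `TimelineComparer`, its methods and the sample figures are all hypothetical.

```python
class TimelineComparer:
    """Remembers labelled timeline series and lines them up by date."""

    def __init__(self):
        self._series = {}  # label -> {date string: value}

    def add(self, label, points):
        """Store one query's timeline as (date, value) pairs ("Add")."""
        self._series[label] = dict(points)

    def view(self):
        """Return (date, {label: value or None}) rows by date ("View")."""
        dates = sorted({d for s in self._series.values() for d in s})
        return [
            (d, {label: s.get(d) for label, s in self._series.items()})
            for d in dates
        ]


# Made-up sample data, for illustration only.
comparer = TimelineComparer()
comparer.add("climate", [("2024-01-01", 1.2), ("2024-01-02", 1.5)])
comparer.add("energy", [("2024-01-02", 0.8)])
for date, values in comparer.view():
    print(date, values)
```

Dates missing from one series show as `None`, so queries covering different windows can still be viewed side by side.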