This newly released API provides streaming access to a whole host of GDELT’s coolest features.

In this section I am going to walk through a bunch of the API search parameters that will allow us to generate our own interactive trelliscopes to help us understand what is going on in the world.

To do this, I am going to walk through the majority of the parameters of the workhorse function get_data_ft_v2_api which accesses data from GDELT and builds various interactive visualizations depending on the type of results you want.

I encourage everyone who wants to fully understand what the V2 API can do and how my function works to take a look at the FT V2 API documentation from GDELT and at the function documentation in gdeltr2, which you can open by running ?gdeltr2::get_data_ft_v2_api().

Now it is time to load the package.
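The loading step might look like the sketch below. It assumes gdeltr2 was installed from GitHub beforehand; the repository name abresler/gdeltr2 and the remotes installer are assumptions on my part, since the install command is not shown here. The package is only attached when it is actually available.

```r
# Assumption: gdeltr2 is installed from GitHub, e.g. with
#   remotes::install_github("abresler/gdeltr2")
# The install line stays commented out so nothing is installed unprompted.

has_gdeltr2 <- requireNamespace("gdeltr2", quietly = TRUE)

if (has_gdeltr2) {
  library(gdeltr2)
} else {
  message("gdeltr2 is not installed; install it before running the examples.")
}
```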

This parameter lets you search GDELT’s entire Full Text API, going back up to 3 months, for ANY term your heart desires.

This can be a person, place, thing, name, or phrase, literally ANYTHING you can imagine. When you consider that GDELT monitors millions of websites from around the world every second, you quickly realize that you can use this to find some truly obscure information.

GDELT indexes and sucks in all the text from every page it processes. The V2 Full Text API lets users search through this text and returns any matches it finds.

If you want to find exact terms you need to use a quoted string.

If you search GDELT for "Brooklyn Nets" (an R string with no inner quotes), it will go through every article and return anything that contains the words Brooklyn AND Nets. The words don’t have to be next to each other to return a positive match.

For example, an article that discussed a fundraiser in Brooklyn that nets a large donation from an unknown benefactor would be returned as a match. This is not something I want to see if I am only looking for articles about my favorite NBA team, the Brooklyn Nets. To find only those matches I want to search '"Brooklyn Nets"'.

In some cases this won’t matter too much, but depending on the terms you are looking for it may.

You can also combine phrases: '"Brooklyn Nets" playoffs' searches for “Brooklyn Nets” AND the word playoffs, and '"New York City" "City Council"' looks for “New York City” AND “City Council”.

Now that we have covered what this parameter is, let’s build a vector of terms to use for our analysis.
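As a rough illustration of the quoted versus unquoted behaviour described in this section, the sketch below builds a terms vector from the phrases mentioned in the text and then mimics the two kinds of matching in base R. The helper functions are hypothetical stand-ins, not GDELT’s actual search engine, and they assume case-insensitive substring matching.

```r
# Terms for the `terms` argument. Inner double quotes mark exact phrases.
my_terms <- c(
  '"Brooklyn Nets"',
  '"Brooklyn Nets" playoffs',
  '"New York City" "City Council"'
)

# Toy stand-ins for the two matching behaviours (not GDELT's real engine).
# Assumption: matching is case-insensitive and works on substrings.
matches_all_words <- function(text, words) {
  all(vapply(
    words,
    function(w) grepl(tolower(w), tolower(text), fixed = TRUE),
    logical(1)
  ))
}

matches_phrase <- function(text, phrase) {
  grepl(tolower(phrase), tolower(text), fixed = TRUE)
}

article <- "A fundraiser in Brooklyn that nets a large donation from an unknown benefactor"

matches_all_words(article, c("Brooklyn", "Nets"))  # TRUE: both words appear somewhere
matches_phrase(article, "Brooklyn Nets")           # FALSE: the words are not adjacent
```

The unquoted search picks up the fundraiser article, while the quoted phrase does not, which is why the inner quotes matter for names like the Brooklyn Nets.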

Indexing and parsing web domains is the lifeblood of GDELT. If there is a website you read that publishes information, the odds are it is tracked by GDELT.

The V2 Full Text API lets you provide a set of web domains and, if they are indexed by GDELT, it will return the articles it indexed over the user-specified time period.

This parameter makes GDELT an easy way to continuously keep track of the websites you enjoy reading.

Each second GDELT scours millions of websites from around the world for new content. When a new article is discovered, its contents get processed through GDELT’s APIs.

The APIs do a variety of things, from extracting numeric references, people, and places to generating sentiment scores. They even extract media content and process the videos and photos through all of Google Cloud Vision’s APIs, which can do everything from identifying what is in a photo to predicting the sentiment of any person in the photo. They even attempt to identify whether the photo contains a brand!

Remember to give the full domain including its suffix, i.e. don’t search espn if you are looking for espn.com.

Now let’s build a vector of some of the domains I enjoy, drawn from the various things I like to monitor in the world.
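A domains vector for this parameter could be set up along these lines. Apart from espn.com, which is mentioned in the text, the domains are hypothetical picks, and the suffix check is just a convenience I added to catch entries like espn on its own.

```r
# Hypothetical picks; swap in the sites you actually read.
my_domains <- c("espn.com", "nytimes.com", "bloomberg.com")

# Flag entries that lack a suffix such as ".com" (e.g. "espn" on its own).
missing_suffix <- my_domains[!grepl("\\.[a-z]{2,}$", my_domains, ignore.case = TRUE)]

if (length(missing_suffix) > 0) {
  warning("These look incomplete: ", paste(missing_suffix, collapse = ", "))
}
```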

This is one of my favorite features!!

GDELT has created a list of nearly 21,000 themes it actively seeks to tag whenever it processes an article.

Some of the themes are taken from other organizations like the World Bank, while others were developed internally. The themes cover just about every imaginable topic in world affairs. They range from obscurities like dandruff to outlandish topics like perverted actions to serious topics like drug overdoses.

They track economic events, political events, scientific events, and corporate actions, just to name a few categories!

Every time GDELT processes new content it performs Natural Language Processing (NLP) on the text.

As part of this, it uses the processed text to assign any number of themes to the article.

For example, if GDELT reads in an article and processes the text to find the phrase “white stuff falling to the ground from the man’s head from a dry scalp”, it would assign the GKG theme corresponding to dandruff from above.

One article can have many themes as well.

Given the sheer volume and power of these GKG themes, I encourage everyone to spend some time perusing them to identify those that may be of interest in your everyday quest for staying on top of world affairs.

You can interactively explore the active GKG themes as of this post here, or explore them in R.

When entering a theme you must be exact.

If you were looking for articles about dandruff you must enter the exact code "TAX_DISEASE_DANDRUFF". The codes are not case sensitive, so "tax_disease_dandruff" or "Tax_Disease_Dandruff" would also work.

In this example, in addition to the themes I specifically define, I am going to include 3 additional random themes.

You will see the set.seed function used to ensure that your 3 random themes are the same as mine.

Here is how we define them:
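A themes vector with a few random additions could be put together roughly as follows. Only TAX_DISEASE_DANDRUFF comes from the text; the small catalogue is a placeholder I made up to stand in for the full list of roughly 21,000 GKG themes, and the seed value is arbitrary.

```r
# Themes chosen on purpose. Only TAX_DISEASE_DANDRUFF comes from the text.
chosen_themes <- c("TAX_DISEASE_DANDRUFF")

# Placeholder catalogue standing in for the full GKG theme list (assumed codes).
theme_catalogue <- c(
  "ECON_STOCKMARKET", "ENV_CLIMATECHANGE", "TERROR",
  "EDUCATION", "ELECTION", "TAX_DISEASE_DANDRUFF"
)

set.seed(1234)  # fixes the random draw so it can be reproduced
random_themes <- sample(setdiff(theme_catalogue, chosen_themes), 3)

# Codes are not case sensitive, so normalising to upper case is harmless.
my_themes <- toupper(c(chosen_themes, random_themes))
```

Drawing from setdiff() keeps the random picks from repeating a theme that was already chosen by hand.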

This parameter enables you to use GDELT to search for text extracted from media content.

This is a great way to keep track of brands and find pictures containing certain text that may be of interest to you.

One of the many really cool things Google Cloud Vision does is extract text from media content.

Since GDELT uses Google Cloud Vision for every website it monitors, it takes each piece of media content it finds and uses Google Cloud Vision’s OCR API to extract any text it finds.

GDELT’s V2 Full Text API will then let you search through the results for any matching user-defined text!

Consider spelling, abbreviations and logo names. Google’s OCR technology isn’t perfect so there may be some inaccuracy.

Here we are going to define a vector of some random things I want to look for in photos as part of our analysis.
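An OCR search vector might be defined like this. The entries are hypothetical and simply show the idea of listing a full name alongside an abbreviation and a shorter logo-style form, since the extracted text may not match one exact spelling.

```r
# Hypothetical text to look for in images. Variants are listed side by side
# because OCR output will not always match a single exact spelling.
my_ocr <- c("Brooklyn Nets", "BKN", "Nets")

# Tidy up stray whitespace and drop accidental repeats.
my_ocr <- unique(trimws(my_ocr))
```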

Similar to the GKG themes, there are a number of imagetags: 8,977 as of the time of this post.

I encourage you to spend a bit of time perusing them to help you identify imagetags that may be of interest to you and, as before, you can explore the active tags interactively here or in R.

Use the specific code without quotes. These are also not case sensitive. The tags also won’t always be accurate.

Similar to the GKG themes and imagetags, there are a number of imagewebtags: 21,097 as of the time of this post.

It is both fun and useful to spend some time looking at the multitude of people, places, things, concepts, teams, brands, companies, locations and ideas that Google has learned and actively looks for.

You can explore the active tags interactively here or in R.

As is the case with OCR, GKG themes, and imagetags, don’t use quotes, and give the exact term in whichever case you desire. I encourage you to monitor these over time, as Google is constantly learning new things that you can search for. Also remember that Google isn’t always accurate in its predictions.
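The imagetag and imagewebtag vectors could be sketched as below. The values are hypothetical examples rather than tags confirmed against the published lists, and the small helper is my own addition that relies on the point that case does not matter.

```r
# Hypothetical values; real tags should be taken from the published tag lists.
my_image_tags <- c("Skyscraper", "Basketball")
my_image_web  <- c("Brooklyn Nets", "New York City")

# Case does not matter, so entries that differ only by case are redundant.
drop_case_duplicates <- function(x) x[!duplicated(tolower(x))]

my_image_tags <- drop_case_duplicates(my_image_tags)
my_image_web  <- drop_case_duplicates(my_image_web)
```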

This parameter defines the maximum number of results you want returned for any API call.

The default and maximum allowed per API call is 250, which at times may mean you aren’t getting the full set of results.

There is a way to circumvent this a bit by defining specific dates instead of time-spans for the time horizon, but that is much more complicated and outside the scope of this tutorial. Any advanced user who needs to know more about this can reach out to me directly.

For our example we can leave this parameter empty as we want to use the default of 250 results.
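One simple way to notice that the 250 result limit may have cut a search short is to compare the number of rows returned against the cap. The sketch below uses toy data frames in place of real API output, and the helper name is hypothetical.

```r
max_records <- 250  # default and maximum per call, as stated in the text

# TRUE when a result set is as large as the cap, so rows may have been cut off.
hit_record_cap <- function(results, cap = max_records) {
  nrow(results) >= cap
}

# Toy data frames standing in for API results.
full_page  <- data.frame(id = seq_len(250))
short_page <- data.frame(id = seq_len(40))

hit_record_cap(full_page)   # TRUE
hit_record_cap(short_page)  # FALSE
```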

As I mentioned before, GDELT’s V2 Full Text API gives you access to information over continuous 3 month periods.

You can ask the API for information from as far back as 12 weeks to as recent as 1 minute ago.

In gdeltr2 you should define time-spans; if you don’t, it reverts to the default of anything in the last 24 hours.

When defining a timespan it must be given as a string containing either minutes, hours, days, or weeks, e.g. "24 hours", "97 minutes", "17 days", or "12 weeks".

For our initial example let’s define the timespan as anything published within the last 5 days.
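The timespan for the example, together with a quick format check, might look like this. The validator is a hypothetical helper of my own and assumes the plural unit names used in the examples are the accepted spelling; it checks the shape of the string only, not the 12 week limit.

```r
my_timespan <- "5 days"

# Checks the "<number> <unit>" shape described in the text.
# Assumption: units are written in the plural, as in the examples given.
is_valid_timespan <- function(x) {
  grepl("^[0-9]+ (minutes|hours|days|weeks)$", x)
}

is_valid_timespan(c("24 hours", "97 minutes", "17 days", "12 weeks"))  # all TRUE
is_valid_timespan("3 months")                                          # FALSE
```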

The final step before we can run our search and create the interactive trelliscopes is to go over a few other parameters you should be aware of.

This parameter lets you isolate your search to a specific country or countries.

If you don’t enter anything it will search the whole world; this is the default.

If you wish to isolate a country you must specify the exact country code or codes you want to isolate.

The available country codes can also be explored in R.

Again, since we are using the default parameter we don’t need to enter anything.

This is an advanced feature that lets the user define some parameters that modify the interactive trelliscope.

They must be passed as a list containing any of: rows, columns, or path.

Rows defines the number of rows for the trelliscope. We will use 1.

Columns defines the number of columns we want to use; in this case we will use 2.

Finally, path allows you to save and publish your trelliscope if you have access to folders related to a website. In this tutorial we aren’t going to do that, so we will set that parameter to NULL.

If we were to enter nothing and exclude my_trelliscope_parameters from our function call, it would default to these same values.
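Putting the three settings described in this section together, the list passed as trelliscope_parameters would look something like the sketch below, using the values the text gives (1 row, 2 columns, no path).

```r
my_trelliscope_parameters <- list(
  rows = 1,     # one row of panels per page
  columns = 2,  # two columns of panels per page
  path = NULL   # NULL: the trelliscope is not saved or published anywhere
)
```

Because list() keeps NULL entries, the path element is still present in the list even though it carries no value.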

Creating the Trelliscopes

We are now ready to use the get_data_ft_v2_api function and create our interactive trelliscopes!

The function supports a number of different types of output via its modes parameter.

These options include basic image panels with links to the article, interactive visualization of the amount of activity for the specified search parameter and even various forms of wordclouds for each search parameter!

By default gdeltr2 will create an object in your environment whose name starts with the word trelliscope. This makes it easy for you to explore the trelliscope once it has been created.
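To find those objects after a run, base R’s ls() with a pattern is enough. In the sketch below a placeholder object is created first so the example runs without calling the API; its name is only an illustration of the naming convention.

```r
# Stand-in object so the example runs without calling the API.
trelliscopeImage <- "placeholder"

# List every object in the current environment whose name starts with "trelliscope".
trelliscope_objects <- ls(pattern = "^trelliscope")
trelliscope_objects
```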

Image Panels

Parameter: modes = "ArtList"

This is the default mode and my personal go-to for how I interact with GDELT on a daily basis. To create an image panel trelliscope, all we do is pass along each of the parameters we defined, and in no time you should have an interactive trelliscope with clickable links for your view into the world. Here is how to do it:

    get_data_ft_v2_api(
      terms = my_terms,
      domains = my_domains,
      images_web_tag = my_image_web,
      images_tag = my_image_tags,
      images_ocr = my_ocr,
      gkg_themes = my_themes,
      modes = c("ArtList"),
      timespans = my_timespan,
      trelliscope_parameters = my_trelliscope_parameters
    )

Timeline Volume

Parameter: modes = "TimelineVolInfo"

This mode creates an interactive chart with the date on the x axis and the volume score on the y axis. The volume score is essentially how much your specified parameter was mentioned. In these trelliscopes, clicking on the line brings up a tooltip containing clickable links for your specified search parameter. When using this mode I highly recommend changing the timespan to 12 weeks to give you the maximum view of how often your parameter was discussed. Here is how to create this:

    get_data_ft_v2_api(
      terms = my_terms,
      domains = my_domains,
      images_web_tag = my_image_web,
      images_tag = my_image_tags,
      images_ocr = my_ocr,
      gkg_themes = my_themes,
      modes = c("TimelineVolInfo"),
      timespans = "12 weeks",
      trelliscope_parameters = my_trelliscope_parameters
    )

Wordclouds

Parameter: modes = c("WordCloudEnglish", "WordCloudTheme", "WordCloudImageTags", "WordCloudImageWebTags")

These modes allow you to create various wordclouds for each specified search parameter. WordCloudEnglish returns a wordcloud of the most commonly used English words for the specified search parameter. WordCloudTheme returns a wordcloud of the most frequently identified GKG themes. WordCloudImageTags returns the most commonly found imagetags, and WordCloudImageWebTags returns the most commonly found imagewebtags. These wordclouds can be a great way to quickly understand what is going on with regard to a specific search parameter.

Here is how we create a trelliscope that displays each of these wordcloud types. For this analysis we are going to slightly modify the panels to show only 1 column and 1 row, and we will also expand the time frame to 2 weeks.

    get_data_ft_v2_api(
      terms = my_terms,
      domains = my_domains,
      images_web_tag = my_image_web,
      images_tag = my_image_tags,
      images_ocr = my_ocr,
      gkg_themes = my_themes,
      modes = c("WordCloudEnglish", "WordCloudTheme", "WordCloudImageTags", "WordCloudImageWebTags"),
      timespans = "2 weeks",
      trelliscope_parameters = list(rows = 1, columns = 1, path = NULL)
    )

Interacting with the Trelliscope

After you execute the code you will notice the trelliscope in your viewer pane. To best interact with the trelliscope I like to open it in my browser of choice. To do that, click the button to the right of the broom button. If you always want to open interactive content in your browser you can run the following code:

    options(viewer = NULL)

Navigating Interactive Trelliscopes

Now that we’ve created a bunch of interactive trelliscopes, I want to take a moment to help you better understand how to use them. These trelliscopes will help you quickly navigate through vast amounts of information, in large part due to their interactive filter capabilities.

Shortcuts

You can use the f key to call up all the filters. You can modify the grid by pressing g. You can modify the labels by pressing l. You can change the sorting parameters by pressing s.

Instead of clicking the left and right arrows you can use the left or right keys, or if you are on a mobile device or tablet and it is a trelliscope that lives on the web you can swipe left or right.

Filtering

Searching by Text

Searching by click