News in a word cloud

This tutorial was originally published on DataCareer.

In this tutorial we will retrieve the latest news and visualise it in a word cloud, using Python 3.

NewsAPI.org is an easy-to-use API for getting news from over 30,000 sources all over the world. The API is free for all non-commercial projects (including open-source) and for commercial projects that are still in development. You do need to register to get an 'API key', which takes only a few seconds at: https://newsapi.org/register.

Let's start with importing the required packages for this tutorial.
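A minimal sketch of this step, assuming the first half of the tutorial only needs the third-party requests package (used further down to call the API). The availability check and the `requests_available` flag are additions for convenience, not something the API requires.

```python
import importlib.util

# requests is a third-party package; install it with: pip install requests
if importlib.util.find_spec("requests") is None:
    requests_available = False
    print("requests is not installed; run: pip install requests")
else:
    import requests

    requests_available = True
    print("Using requests", requests.__version__)
```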

After registering at NewsAPI.org, you can find your API key at: https://newsapi.org/docs/authentication. The following key is a dummy, so please replace it with your own.
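One way to set this up is sketched below. The value `YOUR_API_KEY_HERE` is a dummy placeholder, and the environment variable name `NEWSAPI_KEY` is an assumption of this sketch, chosen so the real key does not have to be written into the script.

```python
import os

# Dummy placeholder, not a working key: replace it with your own,
# or set the (assumed) NEWSAPI_KEY environment variable instead.
api_key = os.environ.get("NEWSAPI_KEY", "YOUR_API_KEY_HERE")
```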

NewsAPI offers three endpoints:

  1. '/v2/top-headlines', for the most important headlines per country and category
  2. '/v2/everything', for all the news articles from over 30,000 sources
  3. '/v2/sources', for information on the various sources

We will use the 'everything' endpoint to get news about 'Big Data'.
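The request for this endpoint could be described as follows. This is a sketch: `q`, `language`, `pageSize` and `apiKey` are query parameters of the 'everything' endpoint, but the particular values for `language` and `pageSize` are assumptions made here, and the key is the dummy placeholder.

```python
api_key = "YOUR_API_KEY_HERE"  # dummy placeholder, replace with your own key

# The 'everything' endpoint searches all articles.
url = "https://newsapi.org/v2/everything"

params = {
    "q": "big data",    # the search term
    "language": "en",   # assumption: English articles only
    "pageSize": 20,     # assumption: number of articles per request
    "apiKey": api_key,  # the key is passed as a query parameter
}
```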

Now we can retrieve the news with the requests package.
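A minimal sketch of the call, assuming the response is JSON. The helper name `fetch_news` and its optional `session` argument are hypothetical; the argument is only there so the function can be tried with a stand-in object when no network or key is available.

```python
def fetch_news(url, params, session=None, timeout=10):
    """Send a GET request and return the decoded JSON body as a dict."""
    if session is None:
        # Third-party package, imported here so that a stand-in session
        # can be used without requests being installed.
        import requests

        session = requests.Session()
    response = session.get(url, params=params, timeout=timeout)
    response.raise_for_status()  # raise an exception on a 4xx/5xx status
    return response.json()


# Usage (needs network access and a valid key):
# news = fetch_news("https://newsapi.org/v2/everything",
#                   {"q": "big data", "apiKey": "YOUR_API_KEY_HERE"})
```

Calling `raise_for_status()` before parsing makes a rejected key or a malformed query fail loudly instead of returning an error payload that looks like data.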

The articles are most likely stored under the key 'articles', so let's print the first entry.
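As a sketch, with a made-up dictionary standing in for the parsed JSON response (the headlines and URLs below are invented placeholders, not real results):

```python
# Invented stand-in for the parsed JSON response of the API.
news = {
    "status": "ok",
    "totalResults": 2,
    "articles": [
        {
            "source": {"id": None, "name": "Example Source"},
            "title": "Example headline one",
            "url": "https://example.com/1",
        },
        {
            "source": {"id": None, "name": "Another Source"},
            "title": "Example headline two",
            "url": "https://example.com/2",
        },
    ],
}

print(list(news.keys()))  # inspect the top-level keys first
first_article = news["articles"][0]
print(first_article)
```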

That seems right. Let's loop over all the articles and print just their titles.
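The loop could look like this. The `articles` list is an invented sample; in the tutorial it would be the list stored under 'articles' in the response.

```python
# Invented sample; normally this is the 'articles' list from the response.
articles = [
    {"title": "Example headline one"},
    {"title": "Example headline two"},
    {"title": "Example headline three"},
]

titles = []
for article in articles:
    title = article["title"]
    titles.append(title)  # keep the titles for later use
    print(title)
```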

Pretty easy, right? Feel free to try out some other queries and the different endpoints. You can find the documentation at https://newsapi.org/docs and, if you're logged in, all the example queries already include your own API key.

Now that we have the headlines for 'Big Data', let's do something fun with them. We can visualise them with the wordcloud package. You can install the package via pip or conda-forge. Then import the wordcloud and matplotlib packages.
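A sketch of the install and import step. The `plotting_available` flag is a hypothetical name added here so that a missing package produces a hint rather than a traceback.

```python
# Install first, using either of:
#   pip install wordcloud matplotlib
#   conda install -c conda-forge wordcloud matplotlib
try:
    from wordcloud import WordCloud
    import matplotlib.pyplot as plt

    plotting_available = True
except ImportError:
    plotting_available = False
    print("wordcloud and/or matplotlib are missing; install them first")
```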

Now put all the headlines together in one string:
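For example, with an invented sample list of articles. Skipping entries without a title is a defensive assumption of this sketch.

```python
# Invented sample; normally this is the 'articles' list from the response.
articles = [
    {"title": "Example headline about big data"},
    {"title": None},  # defensive: an entry without a usable title
    {"title": "Another headline about data"},
]

# Join all non-empty titles into one space-separated string.
text = " ".join(article["title"] for article in articles if article.get("title"))
print(text)
```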

Now that we have all the headlines together in one variable, we can use it to generate the word cloud with the following code:
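A minimal sketch, assuming the wordcloud and matplotlib packages are installed. The function names `build_wordcloud` and `show_wordcloud`, the image size and the white background are choices made for this example, not requirements.

```python
def build_wordcloud(text, width=800, height=400):
    """Create a word cloud object from one string of text."""
    from wordcloud import WordCloud  # third-party, imported when needed

    cloud = WordCloud(width=width, height=height, background_color="white")
    return cloud.generate(text)  # counts the words and lays them out


def show_wordcloud(cloud):
    """Display a generated word cloud with matplotlib."""
    import matplotlib.pyplot as plt  # third-party, imported when needed

    plt.figure(figsize=(10, 5))
    plt.imshow(cloud, interpolation="bilinear")
    plt.axis("off")  # hide the axes, only the image matters
    plt.show()


# Usage, with 'text' being the string of joined headlines:
# show_wordcloud(build_wordcloud(text))
```

Words that occur more often in the headlines are drawn larger, so the search term itself will usually dominate the picture.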

What other cool things can you do with the NewsAPI and the wordcloud package?