Twitter Sentiment Analysis using OpenAI and Power BI

This article describes an experiment that uses the OpenAI API to predict the sentiment of recent tweets on a specific topic, along with the gender suggested by each author's name, and shows the result in a Power BI dashboard.

What is OpenAI?

The Open AI model is trained on a dataset of 3.6 billion Tweets. The training process takes about 4 days on 8 GPUs. After training, the model can accurately predict the sentiment of a tweet with 85% accuracy. The model can also be fine-tuned to accurately predict the sentiment of tweets from a specific Twitter user with 90% accuracy.

How does it work?

You input some text as a prompt, and the API will return a text completion that attempts to match whatever instructions or context you gave it.

You can think of this as a very advanced autocomplete — the model processes your text prompt and tries to predict what’s most likely to come next.

This video explains in more detail how OpenAI works.

In our case, we use the prompt “Decide whether a Tweet’s sentiment is positive, neutral, or negative. Tweet:” to extract the sentiment, and the prompt “Extract the gender and decide whether a name’s gender is male, female, or unknown. Name:” to extract the gender from the user name.
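To make the prompt idea more concrete, here is a minimal sketch of how the sentiment prompt could be assembled and how the free-text answer could be reduced to a single label before it reaches Power BI. The helper names `build_sentiment_prompt` and `normalize_label`, and the fallback label "unknown", are my own assumptions for illustration and are not part of the script in this article.

```python
# Hypothetical helpers for illustration only; they are not in the original script.
SENTIMENT_INSTRUCTION = (
    "Decide whether a Tweet's sentiment is positive, neutral, or negative. Tweet"
)
ALLOWED_SENTIMENTS = ("positive", "neutral", "negative")


def build_sentiment_prompt(tweet_text: str) -> str:
    """Join the instruction, the tweet, and the answer cue into one prompt."""
    return f"{SENTIMENT_INSTRUCTION}:\n\n{tweet_text}\n\nSentiment:"


def normalize_label(completion_text: str,
                    allowed=ALLOWED_SENTIMENTS,
                    default: str = "unknown") -> str:
    """Map a free-text completion such as ' Negative.' to one allowed label."""
    cleaned = completion_text.strip().lower()
    for label in allowed:
        if cleaned.startswith(label):
            return label
    # Assumption: anything unexpected is reported as "unknown"
    return default


print(build_sentiment_prompt("I love this"))
print(normalize_label(" Negative."))
```

Normalizing the answer this way should make it easier to group the results in a visual, because the model may return the same label with different capitalization or punctuation.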

How does the experiment work?

The Python script fetches the recent tweets about a topic, then analyzes the sentiment of each tweet's text and the gender suggested by each author's name. After that, the result is saved in an Excel file. It's also possible to run the Python code directly from Power BI, although I don't recommend it because it can get slow. Follow the instructions here.
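Since every tweet triggers two API calls, one way to save some time might be to reuse the answer when the same text shows up again, for example an author name that appears on several tweets. The sketch below is only an idea and is not part of the original script: `cached_classifier` is a hypothetical wrapper, and `fake_gender_lookup` is a stand-in for the real API call so the example runs without any credentials.

```python
from functools import lru_cache


def cached_classifier(classify):
    """Wrap a one-argument classifier so each distinct text is classified once."""
    @lru_cache(maxsize=None)
    def wrapper(text: str) -> str:
        return classify(text)
    return wrapper


# Stand-in for the real API call (assumption: it always answers "unknown")
calls = []


def fake_gender_lookup(name: str) -> str:
    calls.append(name)  # record every real lookup
    return "unknown"


lookup = cached_classifier(fake_gender_lookup)
for name in ["Alex", "Sam", "Alex"]:
    lookup(name)

print(len(calls))  # 2, because the second "Alex" is served from the cache
```

A side effect of caching is that a repeated name always gets the same answer, which would not be guaranteed otherwise when the temperature is above zero.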

Before executing the Python script, you must create accounts on the Twitter Developer platform and on OpenAI to obtain the “BEARER_TOKEN” and the “OPEN AI KEY”, respectively.
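If you prefer not to paste the keys into the script itself, a common alternative is to read them from environment variables. This is a minimal sketch under that assumption: the helper `read_credential` and the variable names `TWITTER_BEARER_TOKEN` and `OPENAI_API_KEY` are hypothetical choices of mine, not something the script in this article requires.

```python
import os


def read_credential(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Possible usage in the script (variable names are assumptions):
# MY_BEARER_TOKEN = read_credential("TWITTER_BEARER_TOKEN")
# openai.api_key = read_credential("OPENAI_API_KEY")
```

This keeps the secrets out of the file you share or upload to Google Colab.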

The Python code follows:


# Twitter sentiment analysis using Open AI and Power BI
# Author: Lawrence Teixeira
# Date: 2022-10-09

# Requirements
# pip install tweepy==4.0
# pip install "openai<1.0"   (openai.Completion, used below, was removed in openai 1.0)

# Import the packages
import pandas as pd
import tweepy
import openai

# Connect to Twitter API
MY_BEARER_TOKEN = "YOU HAVE TO INSERT HERE YOUR TWITTER BEARER TOKEN"

# create your client
client = tweepy.Client(bearer_token=MY_BEARER_TOKEN)

# Functions to extract sentiment and gender with the OpenAI API
# For more examples of how to use OpenAI, click [here](https://beta.openai.com/examples/).

openai.api_key = "YOU HAVE TO INSERT HERE YOUR OPEN AI KEY"

def Generate_OpenAI_Sentiment(question_type, openai_response):
    # Prompt layout: instruction, blank line, tweet text, blank line, answer cue
    response = openai.Completion.create(
      engine="text-davinci-002",
      prompt=question_type + ":\n\n" + str(openai_response) + "\n\nSentiment:",
      temperature=0.7,
      max_tokens=100,
      top_p=1,
      frequency_penalty=0.5,
      presence_penalty=0
    )
    # Remove the leading newlines and spaces the completion usually starts with
    return response['choices'][0]['text'].strip()

def Generate_OpenAI_Gender(question_type, openai_response):
    # Prompt layout: instruction, blank line, user name
    response = openai.Completion.create(
      engine="text-davinci-002",
      prompt=question_type + ":\n\n" + str(openai_response),
      temperature=0.7,
      max_tokens=100,
      top_p=1,
      frequency_penalty=0.5,
      presence_penalty=0
    )
    # Remove the leading newlines and spaces the completion usually starts with
    return response['choices'][0]['text'].strip()

# Search query for tweets. Here you can put whatever you want.
# If you want to know more about the Twitter query parameters, click [here](https://developer.twitter.com/en/docs/twitter-api/tweets/search/api-reference/get-tweets-search-recent/).
query = "#UkraineWarNews lang:en"

# Optionally, set a start and end time for fetching tweets
#start_time = "2022-10-07T00:00:00Z"
#end_time   = "2022-10-08T00:00:00Z"

# get tweets from the API
tweets = client.search_recent_tweets(query=query,
                                    #start_time=start_time,
                                    #end_time=end_time,
                                     tweet_fields = ["created_at", "text", "source"],
                                     user_fields = ["name", "username", "location", "verified", "description"],
                                     max_results = 100,
                                     expansions='author_id'
                                     )

## Collect one record per tweet to build the data frame
tweet_info_ls = []
# includes['users'] lists each author only once, so it does not line up
# one-to-one with tweets.data; look each author up by ID instead
users_by_id = {user.id: user for user in tweets.includes.get('users', [])}
# iterate over each tweet and its author's details
for tweet in tweets.data or []:
    user = users_by_id[tweet.author_id]
    tweet_info = {
        'created_at': tweet.created_at,
        'text': tweet.text,
        'source': tweet.source,
        'name': user.name,
        'username': user.username,
        'location': user.location,
        'verified': user.verified,
        'description': user.description,
        'Sentiment': Generate_OpenAI_Sentiment("Decide whether a Tweet's sentiment is positive, neutral, or negative. Tweet", tweet.text),
        'Gender': Generate_OpenAI_Gender("Extract the gender and decide whether a name's gender is male, female, or unknown. Name", user.name),
        # keep only the topic part of the query (drops the trailing "lang:en")
        'Query': query.rsplit(' ', 2)[0]
    }
    tweet_info_ls.append(tweet_info)
# create dataframe from the extracted records
tweets_df = pd.DataFrame(tweet_info_ls)

# remove the timezone format
tweets_df['created_at'] = tweets_df['created_at'].dt.tz_localize(None)

# If you use Google Colab, save the result as an Excel file in Google Drive
#tweets_df.to_excel("drive/MyDrive/datasets/Resulados_twitter.xlsx")

# If you want to load the result directly in Power BI
print(tweets_df)

Once you execute this Python code and refresh the Power BI report, you will see the analysis result. In my case, I chose UkraineWarNews. It's interesting to see in the Power BI dashboard that 78% of the tweets are negative and 16% are positive, and that 33% of the authors are classified as male versus 5% as female. You can interact with this report by clicking on the visuals.
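If you want to check percentages like these outside Power BI, the shares can be computed from the list of labels in a few lines. This is just a sketch: `label_shares` is a hypothetical helper, and the sample labels are made up for the example and are not the results of my run.

```python
from collections import Counter


def label_shares(labels):
    """Return the percentage share of each label, rounded to one decimal."""
    counts = Counter(label.strip().lower() for label in labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Made-up labels for illustration only
sample = ["negative", "Negative", "positive", "neutral"]
print(label_shares(sample))  # {'negative': 50.0, 'positive': 25.0, 'neutral': 25.0}
```

With the data frame from the script, the same function could be called on the values of the 'Sentiment' or 'Gender' column.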

Click here, to see this report in full-screen mode.

Important: This experiment gets only the last 100 tweets to analyze, and gender is defined only by the spelling of the name and not by the sexual orientation of each individual.

You can download the Power BI report here, and the Google Colab version of the Python code here.

There are many possibilities for using this solution in the real world. OpenAI has many other examples, such as keyword extraction, text summarization, grammar correction, a restaurant review creator, and much more. You can access all the examples here. If you have questions about the solution, feel free to comment in the box below.

That's it for today.

Author: Lawrence Teixeira
